
Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment

Co‐occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of var...

Bibliographic Details
Published in: Journal of the American Society for Information Science and Technology 2006-10, Vol.57 (12), p.1616-1628
Main Authors: Leydesdorff, Loet; Vaughan, Liwen
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3
cites cdi_FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3
container_end_page 1628
container_issue 12
container_start_page 1616
container_title Journal of the American Society for Information Science and Technology
container_volume 57
creator Leydesdorff, Loet
Vaughan, Liwen
description Co‐occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This article discusses the difference between a symmetrical cocitation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical cocitation matrix but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co‐occurrence matrices to the Web environment, in which the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed by using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory.
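The distinction drawn in the description can be illustrated with a minimal sketch. It assumes a small, invented binary occurrence matrix (rows are citing documents, columns are cited authors); the data and the helper names `cooccurrence`, `cosine`, and `proximity` are hypothetical and do not come from the article. The sketch derives the symmetrical cocitation matrix from the asymmetrical matrix, and computes the cosine on the columns of the asymmetrical matrix, which is where the description says a similarity measure can be applied.

```python
from math import sqrt

# Hypothetical toy data: rows are citing documents, columns are cited authors.
# A 1 means the document cites the author. Values are illustrative only.
occurrence = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]


def column(matrix, j):
    """Return column j of a list-of-rows matrix."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cooccurrence(matrix):
    """Symmetrical co-occurrence counts: entry (i, j) counts rows containing both i and j."""
    n = len(matrix[0])
    return [[sum(row[i] * row[j] for row in matrix) for j in range(n)] for i in range(n)]


def proximity(matrix):
    """Cosine similarity between every pair of columns of the asymmetrical matrix."""
    n = len(matrix[0])
    cols = [column(matrix, j) for j in range(n)]
    return [[cosine(cols[i], cols[j]) for j in range(n)] for i in range(n)]


cocitation = cooccurrence(occurrence)  # symmetrical matrix of raw co-occurrence counts
similarity = proximity(occurrence)     # proximity matrix derived from the asymmetrical matrix

for row in similarity:
    print([round(x, 3) for x in row])
```

In this sketch the cosine is never computed on the rows or columns of `cocitation` itself; the proximity matrix comes directly from `occurrence`. The Pearson correlation could be substituted for `cosine` in the same position.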
doi_str_mv 10.1002/asi.20335
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_57680705</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>35099386</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3</originalsourceid><addsrcrecordid>eNqFkV1rFDEUhgdRsFYv_AdBUPBi2nxMJol3y1K3hVKFKnsZzmTOaNaZZE1mtf33Zru1giBCyAc8zwsnb1W9ZPSEUcpPIfsTToWQj6ojJgWvuTb08cNd86fVs5w3lDImGT2qNstYR-d2KWFwSCaYk3eYCYSezF_RJwLb7egdzD6GTHwoa4hpunuT7Pxee0fObmYMvQ9fyGK5IHPcu2SNHcHww6cYJgzz8-rJAGPGF_fncfX5_dmn5Xl9-WF1sVxc1q5pmaxNS3sETlGaHsD1ipdtYJ12DXQKXGcalAqFooNWHXcCG6MbbmBoqeqhF8fVm0PuNsXvO8yznXx2OI4QMO6ylarVVFH5X1BIaozQbQFf_QVu4i6FMoTlovysbg0r0NsD5FLMOeFgt8lPkG4to3bfjS3d2LtuCvv6PhCyg3FIEJzPfwTNhGobWrjTA_fTj3j770C7uL74nVwfDJ9nvHkwIH2zrRJK2vXVyurzq_XHlRD2WvwCLJytIA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>231538691</pqid></control><display><type>article</type><title>Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment</title><source>EBSCOhost Business Source Ultimate</source><source>Library &amp; Information Science Abstracts (LISA)</source><source>Wiley-Blackwell Read &amp; Publish Collection</source><creator>Leydesdorff, Loet ; Vaughan, Liwen</creator><creatorcontrib>Leydesdorff, Loet ; Vaughan, Liwen</creatorcontrib><description>Co‐occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. 
This article discusses the difference between a symmetrical cocitation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical cocitation matrix but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co‐occurrence matrices to the Web environment, in which the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed by using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory.</description><identifier>ISSN: 1532-2882</identifier><identifier>ISSN: 2330-1635</identifier><identifier>EISSN: 1532-2890</identifier><identifier>EISSN: 2330-1643</identifier><identifier>DOI: 10.1002/asi.20335</identifier><language>eng</language><publisher>Hoboken: Wiley Subscription Services, Inc., A Wiley Company</publisher><subject>Bibliometrics. Scientometrics ; Bibliometrics. Scientometrics. Evaluation ; Citation analysis ; Citations ; Cocitation ; Correlation ; Correlation analysis ; Correlation coefficients ; Data acquisition ; Data collection ; Exact sciences and technology ; Google Scholar ; Graph theory ; Information and communication sciences ; Information retrieval ; Information science. Documentation ; Library and information science. 
General aspects ; Mathematical analysis ; Matrix ; Matrix methods ; Multivariate analysis ; Network analysis ; Periodicals ; Sciences and techniques of general use ; Scientific papers ; Search engines ; Searching ; Social networks ; Software ; Statistical analysis ; Studies ; Webs ; World Wide Web</subject><ispartof>Journal of the American Society for Information Science and Technology, 2006-10, Vol.57 (12), p.1616-1628</ispartof><rights>Copyright © 2006 Wiley Periodicals, Inc., A Wiley Company</rights><rights>2007 INIST-CNRS</rights><rights>Copyright Wiley Periodicals Inc. Oct 2006</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3</citedby><cites>FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,27905,27906,34116,34117</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=18137640$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Leydesdorff, Loet</creatorcontrib><creatorcontrib>Vaughan, Liwen</creatorcontrib><title>Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment</title><title>Journal of the American Society for Information Science and Technology</title><addtitle>J. Am. Soc. Inf. Sci</addtitle><description>Co‐occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. 
This article discusses the difference between a symmetrical cocitation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical cocitation matrix but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co‐occurrence matrices to the Web environment, in which the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed by using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory.</description><subject>Bibliometrics. Scientometrics</subject><subject>Bibliometrics. Scientometrics. Evaluation</subject><subject>Citation analysis</subject><subject>Citations</subject><subject>Cocitation</subject><subject>Correlation</subject><subject>Correlation analysis</subject><subject>Correlation coefficients</subject><subject>Data acquisition</subject><subject>Data collection</subject><subject>Exact sciences and technology</subject><subject>Google Scholar</subject><subject>Graph theory</subject><subject>Information and communication sciences</subject><subject>Information retrieval</subject><subject>Information science. Documentation</subject><subject>Library and information science. 
General aspects</subject><subject>Mathematical analysis</subject><subject>Matrix</subject><subject>Matrix methods</subject><subject>Multivariate analysis</subject><subject>Network analysis</subject><subject>Periodicals</subject><subject>Sciences and techniques of general use</subject><subject>Scientific papers</subject><subject>Search engines</subject><subject>Searching</subject><subject>Social networks</subject><subject>Software</subject><subject>Statistical analysis</subject><subject>Studies</subject><subject>Webs</subject><subject>World Wide Web</subject><issn>1532-2882</issn><issn>2330-1635</issn><issn>1532-2890</issn><issn>2330-1643</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2006</creationdate><recordtype>article</recordtype><sourceid>F2A</sourceid><recordid>eNqFkV1rFDEUhgdRsFYv_AdBUPBi2nxMJol3y1K3hVKFKnsZzmTOaNaZZE1mtf33Zru1giBCyAc8zwsnb1W9ZPSEUcpPIfsTToWQj6ojJgWvuTb08cNd86fVs5w3lDImGT2qNstYR-d2KWFwSCaYk3eYCYSezF_RJwLb7egdzD6GTHwoa4hpunuT7Pxee0fObmYMvQ9fyGK5IHPcu2SNHcHww6cYJgzz8-rJAGPGF_fncfX5_dmn5Xl9-WF1sVxc1q5pmaxNS3sETlGaHsD1ipdtYJ12DXQKXGcalAqFooNWHXcCG6MbbmBoqeqhF8fVm0PuNsXvO8yznXx2OI4QMO6ylarVVFH5X1BIaozQbQFf_QVu4i6FMoTlovysbg0r0NsD5FLMOeFgt8lPkG4to3bfjS3d2LtuCvv6PhCyg3FIEJzPfwTNhGobWrjTA_fTj3j770C7uL74nVwfDJ9nvHkwIH2zrRJK2vXVyurzq_XHlRD2WvwCLJytIA</recordid><startdate>200610</startdate><enddate>200610</enddate><creator>Leydesdorff, Loet</creator><creator>Vaughan, Liwen</creator><general>Wiley Subscription Services, Inc., A Wiley Company</general><general>Wiley</general><general>Wiley Periodicals Inc</general><scope>BSCLL</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>200610</creationdate><title>Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment</title><author>Leydesdorff, 
Loet ; Vaughan, Liwen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Bibliometrics. Scientometrics</topic><topic>Bibliometrics. Scientometrics. Evaluation</topic><topic>Citation analysis</topic><topic>Citations</topic><topic>Cocitation</topic><topic>Correlation</topic><topic>Correlation analysis</topic><topic>Correlation coefficients</topic><topic>Data acquisition</topic><topic>Data collection</topic><topic>Exact sciences and technology</topic><topic>Google Scholar</topic><topic>Graph theory</topic><topic>Information and communication sciences</topic><topic>Information retrieval</topic><topic>Information science. Documentation</topic><topic>Library and information science. General aspects</topic><topic>Mathematical analysis</topic><topic>Matrix</topic><topic>Matrix methods</topic><topic>Multivariate analysis</topic><topic>Network analysis</topic><topic>Periodicals</topic><topic>Sciences and techniques of general use</topic><topic>Scientific papers</topic><topic>Search engines</topic><topic>Searching</topic><topic>Social networks</topic><topic>Software</topic><topic>Statistical analysis</topic><topic>Studies</topic><topic>Webs</topic><topic>World Wide Web</topic><toplevel>online_resources</toplevel><creatorcontrib>Leydesdorff, Loet</creatorcontrib><creatorcontrib>Vaughan, Liwen</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science 
Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of the American Society for Information Science and Technology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Leydesdorff, Loet</au><au>Vaughan, Liwen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment</atitle><jtitle>Journal of the American Society for Information Science and Technology</jtitle><addtitle>J. Am. Soc. Inf. Sci</addtitle><date>2006-10</date><risdate>2006</risdate><volume>57</volume><issue>12</issue><spage>1616</spage><epage>1628</epage><pages>1616-1628</pages><issn>1532-2882</issn><issn>2330-1635</issn><eissn>1532-2890</eissn><eissn>2330-1643</eissn><abstract>Co‐occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This article discusses the difference between a symmetrical cocitation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical cocitation matrix but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. 
The study then extends the application of co‐occurrence matrices to the Web environment, in which the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed by using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory.</abstract><cop>Hoboken</cop><pub>Wiley Subscription Services, Inc., A Wiley Company</pub><doi>10.1002/asi.20335</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1532-2882
ispartof Journal of the American Society for Information Science and Technology, 2006-10, Vol.57 (12), p.1616-1628
issn 1532-2882
2330-1635
1532-2890
2330-1643
language eng
recordid cdi_proquest_miscellaneous_57680705
source EBSCOhost Business Source Ultimate; Library & Information Science Abstracts (LISA); Wiley-Blackwell Read & Publish Collection
subjects Bibliometrics. Scientometrics
Bibliometrics. Scientometrics. Evaluation
Citation analysis
Citations
Cocitation
Correlation
Correlation analysis
Correlation coefficients
Data acquisition
Data collection
Exact sciences and technology
Google Scholar
Graph theory
Information and communication sciences
Information retrieval
Information science. Documentation
Library and information science. General aspects
Mathematical analysis
Matrix
Matrix methods
Multivariate analysis
Network analysis
Periodicals
Sciences and techniques of general use
Scientific papers
Search engines
Searching
Social networks
Software
Statistical analysis
Studies
Webs
World Wide Web
title Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T18%3A49%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Co-occurrence%20matrices%20and%20their%20applications%20in%20information%20science:%20Extending%20ACA%20to%20the%20Web%20environment&rft.jtitle=Journal%20of%20the%20American%20Society%20for%20Information%20Science%20and%20Technology&rft.au=Leydesdorff,%20Loet&rft.date=2006-10&rft.volume=57&rft.issue=12&rft.spage=1616&rft.epage=1628&rft.pages=1616-1628&rft.issn=1532-2882&rft.eissn=1532-2890&rft_id=info:doi/10.1002/asi.20335&rft_dat=%3Cproquest_cross%3E35099386%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=231538691&rft_id=info:pmid/&rfr_iscdi=true