Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment
Published in: | Journal of the American Society for Information Science and Technology 2006-10, Vol.57 (12), p.1616-1628 |
---|---|
Main Authors: | Leydesdorff, Loet; Vaughan, Liwen |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3 |
---|---|
cites | cdi_FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3 |
container_end_page | 1628 |
container_issue | 12 |
container_start_page | 1616 |
container_title | Journal of the American Society for Information Science and Technology |
container_volume | 57 |
creator | Leydesdorff, Loet; Vaughan, Liwen |
description | Co‐occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This article discusses the difference between a symmetrical cocitation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical cocitation matrix but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co‐occurrence matrices to the Web environment, in which the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed by using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory. |
doi_str_mv | 10.1002/asi.20335 |
format | article |
publisher | Hoboken: Wiley Subscription Services, Inc., A Wiley Company |
rights | Copyright © 2006 Wiley Periodicals, Inc., A Wiley Company; 2007 INIST-CNRS; Copyright Wiley Periodicals Inc. Oct 2006 |
addtitle | J. Am. Soc. Inf. Sci |
tpages | 13 |
oa | free_for_read |
fulltext | fulltext |
identifier | ISSN: 1532-2882 |
ispartof | Journal of the American Society for Information Science and Technology, 2006-10, Vol.57 (12), p.1616-1628 |
issn | ISSN: 1532-2882, 2330-1635; EISSN: 1532-2890, 2330-1643 |
language | eng |
recordid | cdi_proquest_miscellaneous_57680705 |
source | EBSCOhost Business Source Ultimate; Library & Information Science Abstracts (LISA); Wiley-Blackwell Read & Publish Collection |
subjects | Bibliometrics. Scientometrics; Bibliometrics. Scientometrics. Evaluation; Citation analysis; Citations; Cocitation; Correlation; Correlation analysis; Correlation coefficients; Data acquisition; Data collection; Exact sciences and technology; Google Scholar; Graph theory; Information and communication sciences; Information retrieval; Information science. Documentation; Library and information science. General aspects; Mathematical analysis; Matrix; Matrix methods; Multivariate analysis; Network analysis; Periodicals; Sciences and techniques of general use; Scientific papers; Search engines; Searching; Social networks; Software; Statistical analysis; Studies; Webs; World Wide Web |
title | Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T18%3A49%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Co-occurrence%20matrices%20and%20their%20applications%20in%20information%20science:%20Extending%20ACA%20to%20the%20Web%20environment&rft.jtitle=Journal%20of%20the%20American%20Society%20for%20Information%20Science%20and%20Technology&rft.au=Leydesdorff,%20Loet&rft.date=2006-10&rft.volume=57&rft.issue=12&rft.spage=1616&rft.epage=1628&rft.pages=1616-1628&rft.issn=1532-2882&rft.eissn=1532-2890&rft_id=info:doi/10.1002/asi.20335&rft_dat=%3Cproquest_cross%3E35099386%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c4615-960dea20e59daacd72acdf1b8c4ab7acb94e57e370f87b2c3e498429af607dad3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=231538691&rft_id=info:pmid/&rfr_iscdi=true |
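
The distinction drawn in the description, between the asymmetrical citation matrix and the symmetrical cocitation matrix derived from it, can be illustrated with a small sketch. The data, author labels, and function names below are hypothetical and are not taken from the article; the sketch only assumes that rows are citing documents and columns are cited authors. It derives the cocitation counts from the citation matrix and computes the cosine between the columns of the citation matrix, which is where the description says a similarity measure can be applied.

```python
from math import sqrt

# Hypothetical toy data (not from the article): rows are citing documents,
# columns are cited authors; 1 means the document cites that author.
authors = ["A", "B", "C"]
citation_matrix = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
]


def column(matrix, j):
    """Return column j of a row-major matrix as a list."""
    return [row[j] for row in matrix]


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def cocitation(matrix):
    """Symmetrical cocitation matrix: entry (i, j) counts the documents
    that cite both author i and author j (the diagonal holds each
    author's own citation count)."""
    n = len(matrix[0])
    cols = [column(matrix, j) for j in range(n)]
    return [[dot(cols[i], cols[j]) for j in range(n)] for i in range(n)]


def cosine(x, y):
    """Cosine between two vectors; 0.0 if either vector is all zeros."""
    norm = sqrt(dot(x, x)) * sqrt(dot(y, y))
    return dot(x, y) / norm if norm else 0.0


def cosine_from_citations(matrix):
    """Proximity matrix: cosine between the author columns of the
    asymmetrical citation matrix."""
    n = len(matrix[0])
    cols = [column(matrix, j) for j in range(n)]
    return [[cosine(cols[i], cols[j]) for j in range(n)] for i in range(n)]


cocit = cocitation(citation_matrix)
proximity = cosine_from_citations(citation_matrix)

for name, row in zip(authors, proximity):
    print(name, [round(value, 3) for value in row])
```

In this sketch the cocitation matrix already holds co-occurrence counts between authors, whereas the cosine is computed once, from the document-by-author columns. Applying `cosine` again to the rows of `cocit` would be the practice the description advises against.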