
Disambiguating Authors by Pairwise Classification

Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP, ACM, and CiteSeerX. Despite many studies, the problem remains unresolved and is becoming more serious, especially with the increasing popularity of Web 2.0. This paper...

Bibliographic Details
Published in: Tsinghua science and technology 2010-12, Vol.15 (6), p.668-677
Main Author: 林泉 王波 杜圆 王雪至 李玉华 陈松灿
Format: Article
Language: English
container_end_page 677
container_issue 6
container_start_page 668
container_title Tsinghua science and technology
container_volume 15
creator 林泉 王波 杜圆 王雪至 李玉华 陈松灿
description Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP, ACM, and CiteSeerX. Despite many studies, the problem remains unresolved and is becoming more serious, especially with the increasing popularity of Web 2.0. This paper addresses the problem in the academic researcher social network ArnetMiner using a supervised method that exploits all side information, including co-authors, organization, paper citations, title similarity, the author's homepage, web constraints, and user feedback. The method automatically determines the number of distinct persons k. Tests on the researcher social network with up to 100 different names show that the method significantly outperforms a baseline that uses an unsupervised attribute-augmented graph clustering algorithm.
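The general idea of pairwise classification described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names (`same_author`, `cluster_papers`), the feature weights, the threshold, and the sample records are all hypothetical, and hand-set weights stand in for a trained classifier. It also assumes that k is obtained by counting the connected groups of papers linked by positive pairwise decisions; the abstract does not state how the paper determines k.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of tokens."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def same_author(p, q, threshold=0.5):
    """Decide whether two papers with the same author name share one person.

    Hypothetical hand-set weights stand in for a supervised classifier
    trained on labeled pairs; the features are a small subset of the
    side information listed in the abstract.
    """
    shared_coauthor = len(set(p["coauthors"]) & set(q["coauthors"])) > 0
    same_org = p["org"] == q["org"]
    title_sim = jaccard(p["title"].lower().split(), q["title"].lower().split())
    score = 0.6 * shared_coauthor + 0.3 * same_org + 0.1 * title_sim
    return score >= threshold


def cluster_papers(papers):
    """Merge papers linked by positive pairwise decisions (union-find)."""
    parent = list(range(len(papers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(papers)), 2):
        if same_author(papers[i], papers[j]):
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(papers))]


# Made-up records for one ambiguous author name.
papers = [
    {"coauthors": ["A. Smith"], "org": "Univ X", "title": "Graph clustering methods"},
    {"coauthors": ["A. Smith", "B. Lee"], "org": "Univ Y", "title": "Name disambiguation"},
    {"coauthors": ["C. Wu"], "org": "Univ Z", "title": "Protein folding"},
    {"coauthors": ["C. Wu", "D. Kim"], "org": "Univ Z", "title": "Protein folding dynamics"},
]

labels = cluster_papers(papers)
k = len(set(labels))  # number of distinct persons falls out of the grouping
```

Because the grouping is derived from pairwise decisions, k does not need to be supplied in advance, which is the property the abstract highlights.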
doi_str_mv 10.1016/S1007-0214(10)70114-0
format article
publisher Elsevier Ltd
rights 2010 Tsinghua University Press
toplevel peer_reviewed
sourcetype Aggregation Database
tpages 10
cqvip_id 37274390
els_id S1007021410701140
identifier ISSN: 1007-0214
ispartof Tsinghua science and technology, 2010-12, Vol.15 (6), p.668-677
issn 1007-0214
1878-7606
language eng
recordid cdi_proquest_miscellaneous_855703963
source IEEE Xplore All Journals
subjects Algorithms
Ambiguity
arnetminer
Bibliographies
Classification
disambiguating
Graphs
Names
Networks
On-line systems
pairwise classification
Similarity
Academic research
Development and utilization
User feedback
Telecommunications equipment manufacturers
Social networks
title Disambiguating Authors by Pairwise Classification
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-30T21%3A01%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Disambiguating%20Authors%20by%20Pairwise%20Classification&rft.jtitle=Tsinghua%20science%20and%20technology&rft.au=%E6%9E%97%E6%B3%89%20%E7%8E%8B%E6%B3%A2%20%E6%9D%9C%E5%9C%86%20%E7%8E%8B%E9%9B%AA%E8%87%B3%20%E6%9D%8E%E7%8E%89%E5%8D%8E%20%E9%99%88%E6%9D%BE%E7%81%BF&rft.date=2010-12&rft.volume=15&rft.issue=6&rft.spage=668&rft.epage=677&rft.pages=668-677&rft.issn=1007-0214&rft.eissn=1878-7606&rft_id=info:doi/10.1016/S1007-0214(10)70114-0&rft_dat=%3Cproquest_cross%3E855703963%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c283t-fdbfdceb35e962a62aeefd181a57921695cbd03e6ef949f60995a85ea2e8b05d3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=855703963&rft_id=info:pmid/&rft_cqvip_id=37274390&rfr_iscdi=true