THINKER - Entity Linking System for Turkish Language
Entity linking is one of the problems to be handled in order to process natural language and to enrich the existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been propos...
Published in: | IEEE transactions on knowledge and data engineering 2018-02, Vol.30 (2), p.367-380 |
---|---|
Main Authors: | Kalender, Murat ; Korkmaz, Emin Erkan |
Format: | Article |
Language: | English |
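Subjects: | Algorithms ; deep neural networks ; Electronic publishing ; embeddings ; Encyclopedias ; entity disambiguation ; Entity linking ; Genetic algorithms ; Internet ; Knowledge base ; Knowledge based systems ; Language ; Machine learning ; Neural networks ; Unstructured data |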
cited_by | cdi_FETCH-LOGICAL-c293t-7b96aae998423cfca543c3b298edb092f8a1eac44e57b4e087720f47ddf2094b3 |
---|---|
cites | cdi_FETCH-LOGICAL-c293t-7b96aae998423cfca543c3b298edb092f8a1eac44e57b4e087720f47ddf2094b3 |
container_end_page | 380 |
container_issue | 2 |
container_start_page | 367 |
container_title | IEEE transactions on knowledge and data engineering |
container_volume | 30 |
creator | Kalender, Murat ; Korkmaz, Emin Erkan |
description | Entity linking is one of the problems to be handled in order to process natural language and to enrich existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been proposed for linking entity mentions in various languages, there is currently no publicly available entity linking system specific to the Turkish language. This paper presents a novel entity linking system, THINKER, for linking Turkish content with entities defined in the Turkish dictionary (tdk.gov.tr) or Turkish Wikipedia (tr.wikipedia.org). Specifically, we first propose a novel machine-learning-based entity detection algorithm for the Turkish language. Then, we propose a collective disambiguation algorithm that utilizes a set of metrics for the linking task and is optimized using a genetic algorithm. The effectiveness of THINKER is validated empirically over generated data sets. The experimental results show that THINKER outperformed the state-of-the-art cross-lingual and multilingual entity linking systems in the literature. High entity linking performance (74.81 percent F1 score) is achieved by extending previous methods with some features specific to the Turkish language and by developing a novel method that can learn better representations of entity embeddings. |
doi_str_mv | 10.1109/TKDE.2017.2761743 |
format | article |
fullrecord | <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_ieee_primary_8063926</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8063926</ieee_id><sourcerecordid>1986441086</sourcerecordid><originalsourceid>FETCH-LOGICAL-c293t-7b96aae998423cfca543c3b298edb092f8a1eac44e57b4e087720f47ddf2094b3</originalsourceid><addsrcrecordid>eNo9kLFOwzAQhi0EEqXwAIglEnOKz77E9ohKSqtGIEGYLSe1S1qaFDsZ-vakasV0N3z_f6ePkHugEwCqnorlSzZhFMSEiRQE8gsygiSRMQMFl8NOEWLkKK7JTQgbSqkUEkYEi_nibZl9RHGUNV3dHaK8brZ1s44-D6Gzu8i1Pip6v63Dd5SbZt2btb0lV878BHt3nmPyNcuK6TzO318X0-c8rpjiXSxKlRpjlZLIeOUqkyCveMmUtKuSKuakAWsqRJuIEu3wkWDUoVitHKMKSz4mj6fevW9_exs6vWl73wwnNSiZIgKV6UDBiap8G4K3Tu99vTP-oIHqoxx9lKOPcvRZzpB5OGVqa-0_L2nKFUv5H9gVXhw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1986441086</pqid></control><display><type>article</type><title>THINKER - Entity Linking System for Turkish Language</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Kalender, Murat ; Korkmaz, Emin Erkan</creator><creatorcontrib>Kalender, Murat ; Korkmaz, Emin Erkan</creatorcontrib><description>Entity linking is one of the problems to be handled in order to process natural language and to enrich the existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been proposed for linking entity mentions in various languages, there is currently no publicly available entity linking system specific to the Turkish language. This paper presents a novel entity linking system-THINKER - for linking Turkish content with entities defined in the Turkish dictionary (tdk.gov.tr) or Turkish Wikipedia (tr.wikipedia.org). Specifically, we first propose a novel machine learning based entity detection algorithm for the Turkish language. 
Then, we propose a collective disambiguation algorithm which utilizes a set of metrics for the linking task and, which is optimized using a genetic algorithm. The effectiveness of THINKER is validated empirically over generated data sets. The experimental results show that THINKER outperformed the state-of-the-art cross-lingual and multilingual entity linking systems in the literature. High entity linking performance (74.81 percent F1 score) is achieved by extending previous methods with some features specific to Turkish language and by developing a novel method that can learn better representations of entity embeddings.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2017.2761743</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; deep neural networks ; Electronic publishing ; embeddings ; Encyclopedias ; entity disambiguation ; Entity linking ; Genetic algorithms ; Internet ; Knowledge base ; Knowledge based systems ; Language ; Machine learning ; Neural networks ; Unstructured data</subject><ispartof>IEEE transactions on knowledge and data engineering, 2018-02, Vol.30 (2), p.367-380</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c293t-7b96aae998423cfca543c3b298edb092f8a1eac44e57b4e087720f47ddf2094b3</citedby><cites>FETCH-LOGICAL-c293t-7b96aae998423cfca543c3b298edb092f8a1eac44e57b4e087720f47ddf2094b3</cites><orcidid>0000-0002-9088-2582 ; 0000-0002-7842-7667</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8063926$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Kalender, Murat</creatorcontrib><creatorcontrib>Korkmaz, Emin Erkan</creatorcontrib><title>THINKER - Entity Linking System for Turkish Language</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>Entity linking is one of the problems to be handled in order to process natural language and to enrich the existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been proposed for linking entity mentions in various languages, there is currently no publicly available entity linking system specific to the Turkish language. This paper presents a novel entity linking system-THINKER - for linking Turkish content with entities defined in the Turkish dictionary (tdk.gov.tr) or Turkish Wikipedia (tr.wikipedia.org). Specifically, we first propose a novel machine learning based entity detection algorithm for the Turkish language. Then, we propose a collective disambiguation algorithm which utilizes a set of metrics for the linking task and, which is optimized using a genetic algorithm. The effectiveness of THINKER is validated empirically over generated data sets. 
The experimental results show that THINKER outperformed the state-of-the-art cross-lingual and multilingual entity linking systems in the literature. High entity linking performance (74.81 percent F1 score) is achieved by extending previous methods with some features specific to Turkish language and by developing a novel method that can learn better representations of entity embeddings.</description><subject>Algorithms</subject><subject>deep neural networks</subject><subject>Electronic publishing</subject><subject>embeddings</subject><subject>Encyclopedias</subject><subject>entity disambiguation</subject><subject>Entity linking</subject><subject>Genetic algorithms</subject><subject>Internet</subject><subject>Knowledge base</subject><subject>Knowledge based systems</subject><subject>Language</subject><subject>Machine learning</subject><subject>Neural networks</subject><subject>Unstructured data</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNo9kLFOwzAQhi0EEqXwAIglEnOKz77E9ohKSqtGIEGYLSe1S1qaFDsZ-vakasV0N3z_f6ePkHugEwCqnorlSzZhFMSEiRQE8gsygiSRMQMFl8NOEWLkKK7JTQgbSqkUEkYEi_nibZl9RHGUNV3dHaK8brZ1s44-D6Gzu8i1Pip6v63Dd5SbZt2btb0lV878BHt3nmPyNcuK6TzO318X0-c8rpjiXSxKlRpjlZLIeOUqkyCveMmUtKuSKuakAWsqRJuIEu3wkWDUoVitHKMKSz4mj6fevW9_exs6vWl73wwnNSiZIgKV6UDBiap8G4K3Tu99vTP-oIHqoxx9lKOPcvRZzpB5OGVqa-0_L2nKFUv5H9gVXhw</recordid><startdate>20180201</startdate><enddate>20180201</enddate><creator>Kalender, Murat</creator><creator>Korkmaz, Emin Erkan</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-9088-2582</orcidid><orcidid>https://orcid.org/0000-0002-7842-7667</orcidid></search><sort><creationdate>20180201</creationdate><title>THINKER - Entity Linking System for Turkish Language</title><author>Kalender, Murat ; Korkmaz, Emin Erkan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c293t-7b96aae998423cfca543c3b298edb092f8a1eac44e57b4e087720f47ddf2094b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithms</topic><topic>deep neural networks</topic><topic>Electronic publishing</topic><topic>embeddings</topic><topic>Encyclopedias</topic><topic>entity disambiguation</topic><topic>Entity linking</topic><topic>Genetic algorithms</topic><topic>Internet</topic><topic>Knowledge base</topic><topic>Knowledge based systems</topic><topic>Language</topic><topic>Machine learning</topic><topic>Neural networks</topic><topic>Unstructured data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kalender, Murat</creatorcontrib><creatorcontrib>Korkmaz, Emin Erkan</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Explore</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and 
Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kalender, Murat</au><au>Korkmaz, Emin Erkan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>THINKER - Entity Linking System for Turkish Language</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2018-02-01</date><risdate>2018</risdate><volume>30</volume><issue>2</issue><spage>367</spage><epage>380</epage><pages>367-380</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>Entity linking is one of the problems to be handled in order to process natural language and to enrich the existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been proposed for linking entity mentions in various languages, there is currently no publicly available entity linking system specific to the Turkish language. This paper presents a novel entity linking system-THINKER - for linking Turkish content with entities defined in the Turkish dictionary (tdk.gov.tr) or Turkish Wikipedia (tr.wikipedia.org). Specifically, we first propose a novel machine learning based entity detection algorithm for the Turkish language. Then, we propose a collective disambiguation algorithm which utilizes a set of metrics for the linking task and, which is optimized using a genetic algorithm. The effectiveness of THINKER is validated empirically over generated data sets. The experimental results show that THINKER outperformed the state-of-the-art cross-lingual and multilingual entity linking systems in the literature. 
High entity linking performance (74.81 percent F1 score) is achieved by extending previous methods with some features specific to Turkish language and by developing a novel method that can learn better representations of entity embeddings.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TKDE.2017.2761743</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-9088-2582</orcidid><orcidid>https://orcid.org/0000-0002-7842-7667</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1041-4347 |
ispartof | IEEE transactions on knowledge and data engineering, 2018-02, Vol.30 (2), p.367-380 |
issn | 1041-4347 1558-2191 |
language | eng |
recordid | cdi_ieee_primary_8063926 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Algorithms ; deep neural networks ; Electronic publishing ; embeddings ; Encyclopedias ; entity disambiguation ; Entity linking ; Genetic algorithms ; Internet ; Knowledge base ; Knowledge based systems ; Language ; Machine learning ; Neural networks ; Unstructured data |
title | THINKER - Entity Linking System for Turkish Language |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T10%3A21%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=THINKER%20-%20Entity%20Linking%20System%20for%20Turkish%20Language&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Kalender,%20Murat&rft.date=2018-02-01&rft.volume=30&rft.issue=2&rft.spage=367&rft.epage=380&rft.pages=367-380&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2017.2761743&rft_dat=%3Cproquest_ieee_%3E1986441086%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c293t-7b96aae998423cfca543c3b298edb092f8a1eac44e57b4e087720f47ddf2094b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1986441086&rft_id=info:pmid/&rft_ieee_id=8063926&rfr_iscdi=true |
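
The description in this record reports entity linking quality as an F1 score (74.81 percent). The record does not state the paper's exact evaluation protocol, so the following is only a minimal sketch of the standard micro-averaged precision, recall, and F1 computation over linked mentions. The function name `micro_f1`, the `(mention, entity_id)` pair representation, and the toy identifiers such as `wiki:Ankara` and `tdk:elma` are illustrative assumptions, not taken from the THINKER paper.

```python
def micro_f1(gold, predicted):
    """Return (precision, recall, f1) over sets of (mention, entity_id) pairs.

    A predicted link counts as correct only if both the mention and the
    entity it is linked to match a gold annotation exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy annotations (identifiers are made up for illustration).
gold = {
    ("Ankara", "wiki:Ankara"),
    ("Boğaz", "wiki:İstanbul_Boğazı"),
    ("elma", "tdk:elma"),
    ("Van", "wiki:Van_Gölü"),
}
predicted = {
    ("Ankara", "wiki:Ankara"),   # correct link
    ("Boğaz", "tdk:boğaz"),      # mention found, wrong entity
    ("elma", "tdk:elma"),        # correct link
}

precision, recall, f1 = micro_f1(gold, predicted)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

In this toy case two of three predicted links are correct and two of four gold links are recovered, so precision is 2/3, recall is 1/2, and F1 is 4/7. A wrongly disambiguated mention lowers both precision and recall, which is why a disambiguation step matters for the reported score.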