
THINKER - Entity Linking System for Turkish Language

Entity linking is one of the problems to be handled in order to process natural language and to enrich the existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been propos...

Bibliographic Details
Published in: IEEE transactions on knowledge and data engineering 2018-02, Vol.30 (2), p.367-380
Main Authors: Kalender, Murat; Korkmaz, Emin Erkan
Format: Article
Language: English
container_end_page 380
container_issue 2
container_start_page 367
container_title IEEE transactions on knowledge and data engineering
container_volume 30
creator Kalender, Murat
Korkmaz, Emin Erkan
description Entity linking is one of the problems that must be handled in order to process natural language and to enrich existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been proposed for linking entity mentions in various languages, there is currently no publicly available entity linking system specific to the Turkish language. This paper presents a novel entity linking system, THINKER, for linking Turkish content with entities defined in the Turkish dictionary (tdk.gov.tr) or Turkish Wikipedia (tr.wikipedia.org). Specifically, we first propose a novel machine-learning-based entity detection algorithm for the Turkish language. Then, we propose a collective disambiguation algorithm which utilizes a set of metrics for the linking task and which is optimized using a genetic algorithm. The effectiveness of THINKER is validated empirically over generated data sets. The experimental results show that THINKER outperformed the state-of-the-art cross-lingual and multilingual entity linking systems in the literature. High entity linking performance (74.81 percent F1 score) is achieved by extending previous methods with some features specific to the Turkish language and by developing a novel method that can learn better representations of entity embeddings.
doi_str_mv 10.1109/TKDE.2017.2761743
format article
identifier ISSN: 1041-4347
ispartof IEEE transactions on knowledge and data engineering, 2018-02, Vol.30 (2), p.367-380
issn 1041-4347
1558-2191
language eng
recordid cdi_ieee_primary_8063926
source IEEE Electronic Library (IEL) Journals
subjects Algorithms
deep neural networks
Electronic publishing
embeddings
Encyclopedias
entity disambiguation
Entity linking
Genetic algorithms
Internet
Knowledge base
Knowledge based systems
Language
Machine learning
Neural networks
Unstructured data
title THINKER - Entity Linking System for Turkish Language
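The description reports entity linking performance as an F1 score (74.81 percent). As a minimal sketch of how such a score is commonly computed, the Python below takes the harmonic mean of precision and recall over exact (mention, entity) matches. This is not the authors' evaluation code: the function name, the link representation, and the sample identifiers are all hypothetical, and the paper may use a different matching criterion.

```python
from typing import Set, Tuple

# A link pairs a surface mention with a knowledge base identifier.
# The "tdk:" and "tr.wikipedia:" prefixes are made up for illustration.
Link = Tuple[str, str]


def f1_score(gold: Set[Link], predicted: Set[Link]) -> float:
    """Return F1 over exact (mention, entity) matches."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed, not from the paper): three gold links, two predictions,
# one of which is correct.
gold = {
    ("Ankara", "tr.wikipedia:Ankara"),
    ("elma", "tdk:elma"),
    ("İstanbul", "tr.wikipedia:İstanbul"),
}
predicted = {
    ("Ankara", "tr.wikipedia:Ankara"),
    ("elma", "tr.wikipedia:Elma"),
}
score = f1_score(gold, predicted)
print(f"F1 = {score:.2f}")
```

With one correct link out of two predictions and three gold links, precision is 1/2 and recall is 1/3, so the sketch yields an F1 of 0.40.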