MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction

Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowle...

Bibliographic Details
Published in: arXiv.org 2024-01
Main Authors: Mullick, Ankan, Ghosh, Akash, G Sai Chaitanya, Ghui, Samir, Nayak, Tapas, Lee, Seung-Cheol, Bhattacharjee, Satadeep, Goyal, Pawan
Format: Article
Language: English
Subjects:
container_title arXiv.org
creator Mullick, Ankan
Ghosh, Akash
G Sai Chaitanya
Ghui, Samir
Nayak, Tapas
Lee, Seung-Cheol
Bhattacharjee, Satadeep
Goyal, Pawan
description Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet (\(entity1, relation, entity2\)). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.
doi_str_mv 10.48550/arxiv.2401.09839
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2916499763</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2916499763</sourcerecordid><originalsourceid>FETCH-LOGICAL-a523-fdad0db892b89fd1ae0b76f84ddf57099891149e19eedd3165c3b86b58933f1b3</originalsourceid><addsrcrecordid>eNotjctOwzAURC0kJKrSD2BniXWKH3Fis6uqABXlodJ95cTXlUuwwXHasuTPiQqL0czmnEHoipJpLoUgNzoe3X7KckKnREmuztCIcU4zmTN2gSZdtyOEsKJkQvAR-nnS6a1xq-oWL2EPUW-d3-LX4HyCiJ8hHUJ873AKeNan8KET4Monl76x9gavoNXJBY-rY4q6OU0bIh6kEJ1u8aAG3wB-9OHQgtlCVusO8Dz4LsX-BFyic6vbDib_PUbru2o9f8iWL_eL-WyZacF4Zo02xNRSsSHWUA2kLgsrc2OsKIlSUlGaK6AKwBhOC9HwWha1kIpzS2s-Rtd_2s8Yvnro0mYX-uiHxw1TtMiVKgvOfwH_AWNW</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2916499763</pqid></control><display><type>article</type><title>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</title><source>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</source><creator>Mullick, Ankan ; Ghosh, Akash ; G Sai Chaitanya ; Ghui, Samir ; Nayak, Tapas ; Lee, Seung-Cheol ; Bhattacharjee, Satadeep ; Goyal, Pawan</creator><creatorcontrib>Mullick, Ankan ; Ghosh, Akash ; G Sai Chaitanya ; Ghui, Samir ; Nayak, Tapas ; Lee, Seung-Cheol ; Bhattacharjee, Satadeep ; Goyal, Pawan</creatorcontrib><description>Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. 
In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet (\(entity1, relation, entity2\)). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2401.09839</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Construction materials ; Electric potential ; Encoders-Decoders ; Knowledge bases (artificial intelligence) ; Voltage</subject><ispartof>arXiv.org, 2024-01</ispartof><rights>2024. This work is published under http://creativecommons.org/publicdomain/zero/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2916499763?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,27925,37012,44590</link.rule.ids></links><search><creatorcontrib>Mullick, Ankan</creatorcontrib><creatorcontrib>Ghosh, Akash</creatorcontrib><creatorcontrib>G Sai Chaitanya</creatorcontrib><creatorcontrib>Ghui, Samir</creatorcontrib><creatorcontrib>Nayak, Tapas</creatorcontrib><creatorcontrib>Lee, Seung-Cheol</creatorcontrib><creatorcontrib>Bhattacharjee, Satadeep</creatorcontrib><creatorcontrib>Goyal, Pawan</creatorcontrib><title>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</title><title>arXiv.org</title><description>Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet (\(entity1, relation, entity2\)). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). 
The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.</description><subject>Construction materials</subject><subject>Electric potential</subject><subject>Encoders-Decoders</subject><subject>Knowledge bases (artificial intelligence)</subject><subject>Voltage</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNotjctOwzAURC0kJKrSD2BniXWKH3Fis6uqABXlodJ95cTXlUuwwXHasuTPiQqL0czmnEHoipJpLoUgNzoe3X7KckKnREmuztCIcU4zmTN2gSZdtyOEsKJkQvAR-nnS6a1xq-oWL2EPUW-d3-LX4HyCiJ8hHUJ873AKeNan8KET4Monl76x9gavoNXJBY-rY4q6OU0bIh6kEJ1u8aAG3wB-9OHQgtlCVusO8Dz4LsX-BFyic6vbDib_PUbru2o9f8iWL_eL-WyZacF4Zo02xNRSsSHWUA2kLgsrc2OsKIlSUlGaK6AKwBhOC9HwWha1kIpzS2s-Rtd_2s8Yvnro0mYX-uiHxw1TtMiVKgvOfwH_AWNW</recordid><startdate>20240118</startdate><enddate>20240118</enddate><creator>Mullick, Ankan</creator><creator>Ghosh, Akash</creator><creator>G Sai Chaitanya</creator><creator>Ghui, Samir</creator><creator>Nayak, Tapas</creator><creator>Lee, Seung-Cheol</creator><creator>Bhattacharjee, Satadeep</creator><creator>Goyal, Pawan</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240118</creationdate><title>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</title><author>Mullick, Ankan ; Ghosh, Akash ; G Sai Chaitanya ; Ghui, Samir ; Nayak, Tapas ; Lee, 
Seung-Cheol ; Bhattacharjee, Satadeep ; Goyal, Pawan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a523-fdad0db892b89fd1ae0b76f84ddf57099891149e19eedd3165c3b86b58933f1b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Construction materials</topic><topic>Electric potential</topic><topic>Encoders-Decoders</topic><topic>Knowledge bases (artificial intelligence)</topic><topic>Voltage</topic><toplevel>online_resources</toplevel><creatorcontrib>Mullick, Ankan</creatorcontrib><creatorcontrib>Ghosh, Akash</creatorcontrib><creatorcontrib>G Sai Chaitanya</creatorcontrib><creatorcontrib>Ghui, Samir</creatorcontrib><creatorcontrib>Nayak, Tapas</creatorcontrib><creatorcontrib>Lee, Seung-Cheol</creatorcontrib><creatorcontrib>Bhattacharjee, Satadeep</creatorcontrib><creatorcontrib>Goyal, Pawan</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering 
collection</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mullick, Ankan</au><au>Ghosh, Akash</au><au>G Sai Chaitanya</au><au>Ghui, Samir</au><au>Nayak, Tapas</au><au>Lee, Seung-Cheol</au><au>Bhattacharjee, Satadeep</au><au>Goyal, Pawan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</atitle><jtitle>arXiv.org</jtitle><date>2024-01-18</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet (\(entity1, relation, entity2\)). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2401.09839</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-01
issn 2331-8422
language eng
recordid cdi_proquest_journals_2916499763
source Publicly Available Content Database (Proquest) (PQ_SDU_P3)
subjects Construction materials
Electric potential
Encoders-Decoders
Knowledge bases (artificial intelligence)
Voltage
title MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T14%3A12%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MatSciRE:%20Leveraging%20Pointer%20Networks%20to%20Automate%20Entity%20and%20Relation%20Extraction%20for%20Material%20Science%20Knowledge-base%20Construction&rft.jtitle=arXiv.org&rft.au=Mullick,%20Ankan&rft.date=2024-01-18&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2401.09839&rft_dat=%3Cproquest%3E2916499763%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a523-fdad0db892b89fd1ae0b76f84ddf57099891149e19eedd3165c3b86b58933f1b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2916499763&rft_id=info:pmid/&rfr_iscdi=true