MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction
Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowle...
Published in: | arXiv.org 2024-01 |
---|---|
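Main Authors: | Mullick, Ankan; Ghosh, Akash; G Sai Chaitanya; Ghui, Samir; Nayak, Tapas; Lee, Seung-Cheol; Bhattacharjee, Satadeep; Goyal, Pawan |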
Format: | Article |
Language: | English |
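Subjects: | Construction materials; Electric potential; Encoders-Decoders; Knowledge bases (artificial intelligence); Voltage |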
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Mullick, Ankan; Ghosh, Akash; G Sai Chaitanya; Ghui, Samir; Nayak, Tapas; Lee, Seung-Cheol; Bhattacharjee, Satadeep; Goyal, Pawan |
description | Material science literature is a rich source of factual information about various categories of entities (such as materials and compositions) and the relations between them, such as conductivity and voltage. Automatically extracting this information to build a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework that jointly extracts entities and relations from material science articles as triplets (\(entity1, relation, entity2\)). Specifically, we target battery materials and identify five relations: conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieves a higher F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. MatSciRE extracts material information from material science literature in the form of entity-relation triplets. |
doi_str_mv | 10.48550/arxiv.2401.09839 |
format | article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2916499763</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2916499763</sourcerecordid><originalsourceid>FETCH-LOGICAL-a523-fdad0db892b89fd1ae0b76f84ddf57099891149e19eedd3165c3b86b58933f1b3</originalsourceid><addsrcrecordid>eNotjctOwzAURC0kJKrSD2BniXWKH3Fis6uqABXlodJ95cTXlUuwwXHasuTPiQqL0czmnEHoipJpLoUgNzoe3X7KckKnREmuztCIcU4zmTN2gSZdtyOEsKJkQvAR-nnS6a1xq-oWL2EPUW-d3-LX4HyCiJ8hHUJ873AKeNan8KET4Monl76x9gavoNXJBY-rY4q6OU0bIh6kEJ1u8aAG3wB-9OHQgtlCVusO8Dz4LsX-BFyic6vbDib_PUbru2o9f8iWL_eL-WyZacF4Zo02xNRSsSHWUA2kLgsrc2OsKIlSUlGaK6AKwBhOC9HwWha1kIpzS2s-Rtd_2s8Yvnro0mYX-uiHxw1TtMiVKgvOfwH_AWNW</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2916499763</pqid></control><display><type>article</type><title>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</title><source>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</source><creator>Mullick, Ankan ; Ghosh, Akash ; G Sai Chaitanya ; Ghui, Samir ; Nayak, Tapas ; Lee, Seung-Cheol ; Bhattacharjee, Satadeep ; Goyal, Pawan</creator><creatorcontrib>Mullick, Ankan ; Ghosh, Akash ; G Sai Chaitanya ; Ghui, Samir ; Nayak, Tapas ; Lee, Seung-Cheol ; Bhattacharjee, Satadeep ; Goyal, Pawan</creatorcontrib><description>Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. 
In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet (\(entity1, relation, entity2\)). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2401.09839</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Construction materials ; Electric potential ; Encoders-Decoders ; Knowledge bases (artificial intelligence) ; Voltage</subject><ispartof>arXiv.org, 2024-01</ispartof><rights>2024. This work is published under http://creativecommons.org/publicdomain/zero/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2916499763?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,27925,37012,44590</link.rule.ids></links><search><creatorcontrib>Mullick, Ankan</creatorcontrib><creatorcontrib>Ghosh, Akash</creatorcontrib><creatorcontrib>G Sai Chaitanya</creatorcontrib><creatorcontrib>Ghui, Samir</creatorcontrib><creatorcontrib>Nayak, Tapas</creatorcontrib><creatorcontrib>Lee, Seung-Cheol</creatorcontrib><creatorcontrib>Bhattacharjee, Satadeep</creatorcontrib><creatorcontrib>Goyal, Pawan</creatorcontrib><title>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</title><title>arXiv.org</title><description>Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet (\(entity1, relation, entity2\)). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). 
The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.</description><subject>Construction materials</subject><subject>Electric potential</subject><subject>Encoders-Decoders</subject><subject>Knowledge bases (artificial intelligence)</subject><subject>Voltage</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNotjctOwzAURC0kJKrSD2BniXWKH3Fis6uqABXlodJ95cTXlUuwwXHasuTPiQqL0czmnEHoipJpLoUgNzoe3X7KckKnREmuztCIcU4zmTN2gSZdtyOEsKJkQvAR-nnS6a1xq-oWL2EPUW-d3-LX4HyCiJ8hHUJ873AKeNan8KET4Monl76x9gavoNXJBY-rY4q6OU0bIh6kEJ1u8aAG3wB-9OHQgtlCVusO8Dz4LsX-BFyic6vbDib_PUbru2o9f8iWL_eL-WyZacF4Zo02xNRSsSHWUA2kLgsrc2OsKIlSUlGaK6AKwBhOC9HwWha1kIpzS2s-Rtd_2s8Yvnro0mYX-uiHxw1TtMiVKgvOfwH_AWNW</recordid><startdate>20240118</startdate><enddate>20240118</enddate><creator>Mullick, Ankan</creator><creator>Ghosh, Akash</creator><creator>G Sai Chaitanya</creator><creator>Ghui, Samir</creator><creator>Nayak, Tapas</creator><creator>Lee, Seung-Cheol</creator><creator>Bhattacharjee, Satadeep</creator><creator>Goyal, Pawan</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240118</creationdate><title>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</title><author>Mullick, Ankan ; Ghosh, Akash ; G Sai Chaitanya ; Ghui, Samir ; Nayak, Tapas ; Lee, 
Seung-Cheol ; Bhattacharjee, Satadeep ; Goyal, Pawan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a523-fdad0db892b89fd1ae0b76f84ddf57099891149e19eedd3165c3b86b58933f1b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Construction materials</topic><topic>Electric potential</topic><topic>Encoders-Decoders</topic><topic>Knowledge bases (artificial intelligence)</topic><topic>Voltage</topic><toplevel>online_resources</toplevel><creatorcontrib>Mullick, Ankan</creatorcontrib><creatorcontrib>Ghosh, Akash</creatorcontrib><creatorcontrib>G Sai Chaitanya</creatorcontrib><creatorcontrib>Ghui, Samir</creatorcontrib><creatorcontrib>Nayak, Tapas</creatorcontrib><creatorcontrib>Lee, Seung-Cheol</creatorcontrib><creatorcontrib>Bhattacharjee, Satadeep</creatorcontrib><creatorcontrib>Goyal, Pawan</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering 
collection</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mullick, Ankan</au><au>Ghosh, Akash</au><au>G Sai Chaitanya</au><au>Ghui, Samir</au><au>Nayak, Tapas</au><au>Lee, Seung-Cheol</au><au>Bhattacharjee, Satadeep</au><au>Goyal, Pawan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction</atitle><jtitle>arXiv.org</jtitle><date>2024-01-18</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet (\(entity1, relation, entity2\)). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2401.09839</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-01 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2916499763 |
source | Publicly Available Content Database (Proquest) (PQ_SDU_P3) |
subjects | Construction materials; Electric potential; Encoders-Decoders; Knowledge bases (artificial intelligence); Voltage |
title | MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T14%3A12%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MatSciRE:%20Leveraging%20Pointer%20Networks%20to%20Automate%20Entity%20and%20Relation%20Extraction%20for%20Material%20Science%20Knowledge-base%20Construction&rft.jtitle=arXiv.org&rft.au=Mullick,%20Ankan&rft.date=2024-01-18&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2401.09839&rft_dat=%3Cproquest%3E2916499763%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a523-fdad0db892b89fd1ae0b76f84ddf57099891149e19eedd3165c3b86b58933f1b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2916499763&rft_id=info:pmid/&rfr_iscdi=true |