Automated pipeline for superalloy data by text mining

Bibliographic Details
Published in: npj Computational Materials, 2022-01, Vol. 8 (1), p. 1-12, Article 9
Main Authors: Wang, Weiren; Jiang, Xue; Tian, Shaohan; Liu, Pei; Dang, Depeng; Su, Yanjing; Lookman, Turab; Xie, Jianxin
Format: Article
Language: English
Description
Summary: Data provides a foundation for machine learning, which has accelerated data-driven materials design. The scientific literature contains a large amount of high-quality, reliable data, but automatically extracting data from it remains a challenge. We propose a natural language processing pipeline that captures both chemical composition and property data, enabling analysis and prediction of superalloys. Within 3 h, 2531 records with both composition and property data are extracted from 14,425 articles, covering γ′ solvus temperature, density, and solidus and liquidus temperatures. A data-driven model for γ′ solvus temperature is built to predict unexplored Co-based superalloys with high γ′ solvus temperatures within a relative error of 0.81%. We test the predictions via synthesis and characterization of three alloys. A web-based toolkit is provided as an online open-source platform and is expected to serve as the basis for a general method to search for targeted materials using data extracted from the literature.
ISSN: 2057-3960
DOI: 10.1038/s41524-021-00687-2
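
Illustration: the summary describes extracting records that pair a chemical composition with a property value, and reports model accuracy as a relative error. The sketch below is not the authors' pipeline. It is a minimal rule-based stand-in, assuming compositions are written in the common "Co-9Al-9W" style and temperatures as a number followed by °C or K. The regular expressions, the function names `extract_record` and `relative_error`, and the example sentence and numbers are all hypothetical and chosen only for illustration.

```python
import re

# Assumed notation: a base element followed by "-<amount><Element>" terms,
# for example "Co-9Al-9W". Real articles use many other notations.
COMPOSITION = re.compile(r"\b([A-Z][a-z]?)((?:-\d+(?:\.\d+)?[A-Z][a-z]?)+)")
TERM = re.compile(r"-(\d+(?:\.\d+)?)([A-Z][a-z]?)")
# Assumed notation: a number followed by a temperature unit.
TEMPERATURE = re.compile(r"(\d+(?:\.\d+)?)\s*(°C|K)")


def extract_record(sentence):
    """Return (base element, {element: amount}, (value, unit)), or None
    when the sentence lacks either a composition or a temperature."""
    comp = COMPOSITION.search(sentence)
    temp = TEMPERATURE.search(sentence)
    if comp is None or temp is None:
        return None
    additions = {el: float(amount) for amount, el in TERM.findall(comp.group(2))}
    return comp.group(1), additions, (float(temp.group(1)), temp.group(2))


def relative_error(predicted, measured):
    """Absolute relative error of a prediction, in percent."""
    return abs(predicted - measured) / abs(measured) * 100.0


# Made-up sentence and values, not data from the article.
sentence = "The γ′ solvus temperature of Co-9Al-9W is 1000 °C."
record = extract_record(sentence)
print(record)
print(round(relative_error(1008.1, 1000.0), 2))
```

A record is kept only when both a composition and a property value are found in the same sentence, which mirrors the summary's emphasis on records "with both composition and property". A real pipeline would also need to handle tables, unit conversion, and values that are spread across several sentences.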