Automated pipeline for superalloy data by text mining
Published in: | npj Computational Materials 2022-01, Vol.8 (1), p.1-12, Article 9 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Summary: | Data provides a foundation for machine learning, which has accelerated data-driven materials design. The scientific literature contains a large amount of high-quality, reliable data, and automatically extracting data from the literature continues to be a challenge. We propose a natural language processing pipeline to capture both chemical composition and property data that allows analysis and prediction of superalloys. Within 3 h, 2531 records with both composition and property are extracted from 14,425 articles, covering γ′ solvus temperature, density, solidus, and liquidus temperatures. A data-driven model for γ′ solvus temperature is built to predict unexplored Co-based superalloys with high γ′ solvus temperatures within a relative error of 0.81%. We test the predictions via synthesis and characterization of three alloys. A web-based toolkit as an online open-source platform is provided and expected to serve as the basis for a general method to search for targeted materials using data extracted from the literature. |
---|---|
ISSN: | 2057-3960 |
DOI: | 10.1038/s41524-021-00687-2 |