Loading…

Leveraging language representation for materials exploration and discovery

Data-driven approaches to materials exploration and discovery are building momentum due to emerging advances in machine learning. However, parsimonious representations of crystals for navigating the vast materials search space remain limited. To address this limitation, we introduce a materials disc...

Full description

Saved in:
Bibliographic Details
Published in:npj computational materials 2024-03, Vol.10 (1), p.58-14, Article 58
Main Authors: Qu, Jiaxing, Xie, Yuxuan Richard, Ciesielski, Kamil M., Porter, Claire E., Toberer, Eric S., Ertekin, Elif
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Data-driven approaches to materials exploration and discovery are building momentum due to emerging advances in machine learning. However, parsimonious representations of crystals for navigating the vast materials search space remain limited. To address this limitation, we introduce a materials discovery framework that utilizes natural language embeddings from language models as representations of compositional and structural features. The contextual knowledge encoded in these language representations conveys information about material properties and structures, enabling both similarity analysis to recall relevant candidates based on a query material and multi-task learning to share information across related properties. Applying this framework to thermoelectrics, we demonstrate diversified recommendations of prototype crystal structures and identify under-studied material spaces. Validation through first-principles calculations and experiments confirms the potential of the recommended materials as high-performance thermoelectrics. Language-based frameworks offer versatile and adaptable embedding structures for effective materials exploration and discovery, applicable across diverse material systems.
ISSN:2057-3960
2057-3960
DOI:10.1038/s41524-024-01231-8