Learning conditional policies for crystal design using offline reinforcement learning

Bibliographic Details
Published in: Digital Discovery, 2024-04, Vol. 3 (4), p. 769-785
Main Authors: Govindarajan, Prashant; Miret, Santiago; Rector-Brooks, Jarrid; Phielipp, Mariano; Rajendran, Janarthanan; Chandar, Sarath
Format: Article
Language: English
Description
Summary: Navigating the exponentially large chemical space in search of desirable materials is an extremely challenging task in materials discovery. Recent developments in generative and geometric deep learning have shown promising results in molecule and materials discovery but often lack evaluation with high-accuracy computational methods. This work aims to design novel and stable crystalline materials conditioned on a desired band gap. To achieve conditional generation, we (1) formulate crystal design as a sequential decision-making problem, create relevant trajectories from high-quality materials data, and use conservative Q-learning to learn a conditional policy from these trajectories, guided by a reward function that incorporates constraints on energetic and electronic properties obtained directly from density functional theory (DFT) calculations; (2) evaluate the materials generated by the policy using DFT calculations for both energy and band gap; and (3) compare our results to relevant baselines, including behavioral cloning and unconditioned policy learning. Our experiments show that conditioned policies achieve targeted crystal design and demonstrate the capability to perform crystal discovery evaluated with accurate and computationally expensive DFT calculations. Conservative Q-learning for band-gap-conditioned crystal design with DFT evaluations: the model is trained on trajectories constructed from crystals in the Materials Project. Results indicate promising performance for lower band gap targets.
ISSN: 2635-098X
DOI: 10.1039/d4dd00024b
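
The summary names conservative Q-learning as the offline learning method. As a rough illustration of that general technique, and not the article's actual implementation, the sketch below computes a discrete-action conservative Q-learning loss: a standard temporal-difference term plus a penalty that pushes down Q-values across all actions while pushing up the Q-value of the action recorded in the dataset. The function names, the argument layout, and the values of `gamma` and `alpha` are assumptions chosen for illustration; the record does not state the article's state encoding, action space, or reward.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_loss(q_values, actions, rewards, next_q_values, dones,
             gamma=0.99, alpha=1.0):
    """Hypothetical discrete-action CQL loss over a batch of transitions.

    q_values[i]      : Q(s_i, a) for every action a (list of floats)
    actions[i]       : index of the action taken in the offline data
    rewards[i]       : reward observed for that transition
    next_q_values[i] : Q(s'_i, a) for every action a
    dones[i]         : True if the transition ended the trajectory
    gamma, alpha     : assumed discount factor and penalty weight
    """
    td_loss = 0.0
    penalty = 0.0
    for q_row, a, r, next_row, done in zip(
            q_values, actions, rewards, next_q_values, dones):
        q_taken = q_row[a]
        # Bellman target; no bootstrapping past a terminal step.
        target = r + gamma * (0.0 if done else max(next_row))
        td_loss += 0.5 * (q_taken - target) ** 2
        # Conservative term: logsumexp over all actions minus the
        # Q-value of the action actually present in the dataset.
        penalty += logsumexp(q_row) - q_taken
    return (td_loss + alpha * penalty) / len(actions)
```

In a band-gap-conditioned setting such as the one the summary describes, the target band gap would presumably be part of the state passed to the Q-function, but that detail is an assumption here and is not shown in the sketch.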