Loading…

Design Editing for Offline Model-based Optimization

Offline model-based optimization (MBO) aims to maximize a black-box objective function using only an offline dataset of designs and scores. These tasks span various domains, such as robotics, material design, and protein and molecular engineering. A common approach involves training a surrogate mode...

Full description

Saved in:

Bibliographic Details
Published in:	arXiv.org 2024-08
Main Authors:	Ye Yuan, Zhang, Youyuan, Chen, Can, Wu, Haolun, Li, Zixuan, Li, Jianmo, Clark, James J, Liu, Xue
Format:	Article
Language:	English
Subjects:	Datasets Design optimization Editing Optimization Synthetic data
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Offline model-based optimization (MBO) aims to maximize a black-box objective function using only an offline dataset of designs and scores. These tasks span various domains, such as robotics, material design, and protein and molecular engineering. A common approach involves training a surrogate model using existing designs and their corresponding scores, and then generating new designs through gradient-based updates with respect to the surrogate model. This method suffers from the out-of-distribution issue, where the surrogate model may erroneously predict high scores for unseen designs. To address this challenge, we introduce a novel method, Design Editing for Offline Model-based Optimization} (DEMO), which leverages a diffusion prior to calibrate overly optimized designs. DEMO first generates pseudo design candidates by performing gradient ascent with respect to a surrogate model. Then, an editing process refines these pseudo design candidates by introducing noise and subsequently denoising them with a diffusion prior trained on the offline dataset, ensuring they align with the distribution of valid designs. We provide a theoretical proof that the difference between the final optimized designs generated by DEMO and the prior distribution of the offline dataset is controlled by the noise injected during the editing process. Empirical evaluations on seven offline MBO tasks show that DEMO outperforms various baseline methods, achieving the highest mean rank of 2.1 and a median rank of 1.
ISSN:	2331-8422