Abstract
Bayesian optimization of materials and molecular properties
Authors: Todorović, Milica
Conference name: School on Machine Learning for Molecules and Materials Research
Publication year: 2025
Publication's open availability at the time of reporting: No Open Access
Publication channel's open availability: No Open Access publication channel
Web address: https://members.cecam.org/storage/workshop_files/ML4MMR-1749026739.pdf
The arrival of materials science data infrastructures in the past decade has ushered in the era of data-driven materials science based on artificial intelligence (AI) algorithms, which has facilitated breakthroughs in materials optimization and design. Of particular interest are active learning algorithms, in which datasets are collected on the fly during the search for optimal solutions. We encoded such a probabilistic algorithm into the Bayesian Optimization Structure Search (BOSS) Python tool for materials optimization [1]. BOSS builds N-dimensional surrogate models of materials’ energy or property landscapes and uses them to infer global optima, enabling targeted materials engineering. The models are refined iteratively by sequentially sampling density-functional theory (DFT) data points with high information content, which yields compact and informative datasets. We have used this approach to study molecular surface adsorbates [2], thin film growth [3], solid-solid interfaces [4] and molecular conformers [5], and even to optimize experimental outcomes [6]. This tutorial will introduce the concepts of active learning and the key choices in Bayesian optimization, before focusing on its implementation in materials simulations and the quality monitoring needed to reach optimal solutions.
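
The active learning loop described above can be illustrated with a minimal sketch. It is not the BOSS implementation and does not use the BOSS API: it assumes a one-dimensional search space, a Gaussian process surrogate with a fixed squared-exponential kernel (the hyperparameters are not fitted), a lower-confidence-bound acquisition rule, and a hypothetical analytic function `toy_energy` standing in for an expensive DFT calculation. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

LENGTH_SCALE = 0.2    # assumed kernel length scale (fixed, not fitted)
PRIOR_VARIANCE = 1.0  # assumed prior variance of the surrogate
JITTER = 1e-4         # small diagonal term for numerical stability


def rbf_kernel(a, b):
    """Squared-exponential kernel between two 1D arrays of points."""
    diff = a[:, None] - b[None, :]
    return PRIOR_VARIANCE * np.exp(-0.5 * (diff / LENGTH_SCALE) ** 2)


def gp_posterior(x_train, y_train, x_query):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_tt = rbf_kernel(x_train, x_train) + JITTER * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = np.clip(PRIOR_VARIANCE - np.sum(v ** 2, axis=0), 0.0, None)
    return mean, np.sqrt(var)


def toy_energy(x):
    """Hypothetical stand-in for an expensive DFT evaluation."""
    return np.sin(3.0 * x) + 0.5 * (x - 0.6) ** 2


def bayes_opt(objective, bounds=(0.0, 2.0), n_init=3, n_iter=20,
              kappa=2.0, seed=0):
    """Sequentially sample the objective where the surrogate is promising."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(bounds[0], bounds[1], 401)  # candidate points
    x = rng.uniform(bounds[0], bounds[1], size=n_init)
    y = objective(x)
    for _ in range(n_iter):
        # Standardize the data so the zero-mean, unit-variance prior fits.
        y_std = y.std()
        if y_std == 0.0:
            y_std = 1.0
        y_scaled = (y - y.mean()) / y_std
        mean, std = gp_posterior(x, y_scaled, grid)
        # Lower confidence bound: low predicted value or high uncertainty.
        acquisition = mean - kappa * std
        x_next = grid[np.argmin(acquisition)]
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    return x, y


samples_x, samples_y = bayes_opt(toy_energy)
best = np.argmin(samples_y)
print(f"best x = {samples_x[best]:.3f}, best y = {samples_y[best]:.3f}, "
      f"evaluations = {len(samples_x)}")
```

The `kappa` parameter sets the balance between exploring uncertain regions and refining known minima, which is one of the key choices the tutorial refers to. In a realistic setting the kernel hyperparameters would typically be fitted to the data and the acquisition function optimized continuously rather than on a fixed grid.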