Identifying general reaction conditions by bandit optimization

Jason Y. Wang, Jason M. Stevens, Stavros K. Kariofillis, Mai Jan Tom, Dung L. Golden, Jun Li, Jose E. Tabora, Marvin Parasram, Benjamin J. Shields, David N. Primer, Bo Hao, David Del Valle, Stacey DiSomma, Ariel Furman, G. Greg Zipp, Sergey Melnikov, James Paulson, Abigail G. Doyle

Research output: Contribution to journal › Article › peer-review


Reaction conditions that are generally applicable to a wide variety of substrates are highly desired, especially in the pharmaceutical and chemical industries [1–6]. Although many approaches are available to evaluate the general applicability of developed conditions, a universal approach to efficiently discover these conditions during optimizations is rare. Here we report the design, implementation and application of reinforcement learning bandit optimization models [7–10] to identify generally applicable conditions by efficient condition sampling and evaluation of experimental feedback. Performance benchmarking on existing datasets statistically showed high accuracies for identifying general conditions, with up to 31% improvement over baselines that mimic state-of-the-art optimization approaches. A palladium-catalysed imidazole C–H arylation reaction, an aniline amide coupling reaction and a phenol alkylation reaction were investigated experimentally to evaluate use cases and functionalities of the bandit optimization model in practice. In all three cases, the reaction conditions that were most generally applicable yet not well studied for the respective reaction were identified after surveying less than 15% of the expert-designed reaction space.
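The bandit framing in the abstract can be illustrated with a minimal sketch. It treats each candidate reaction condition as an arm and each experiment as one pull whose reward is the observed yield on a randomly chosen substrate. It uses the standard UCB1 rule as a stand-in; the abstract does not say which bandit algorithms the authors used. The function names, the uniform substrate sampling, and the lookup table of yields are all hypothetical.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Pick the next arm (reaction condition) with the standard UCB1 rule."""
    # Try every condition once before relying on confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus an exploration bonus that shrinks with more pulls.
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(yield_table, budget, seed=0):
    """Simulate an optimization campaign on a lookup table of yields.

    yield_table[c][s] is a hypothetical yield in [0, 1] for condition c on
    substrate s; a real campaign would run the experiment instead.
    """
    rng = random.Random(seed)
    n_conditions = len(yield_table)
    counts = [0] * n_conditions   # experiments run per condition
    means = [0.0] * n_conditions  # running average yield per condition
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, means, t)
        # Assumption: substrates are drawn uniformly at random, so the average
        # reward of a condition estimates its generality across substrates.
        substrate = rng.randrange(len(yield_table[arm]))
        reward = yield_table[arm][substrate]
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    best = max(range(n_conditions), key=means.__getitem__)
    return best, counts, means


if __name__ == "__main__":
    # Made-up yields: condition 0 excels on one substrate only,
    # condition 1 is moderate but consistent across all substrates.
    table = [
        [0.90, 0.10, 0.10, 0.20],
        [0.60, 0.70, 0.65, 0.70],
        [0.30, 0.20, 0.40, 0.10],
    ]
    best, counts, means = run_bandit(table, budget=200)
    print("most general condition:", best)
    print("experiments per condition:", counts)
```

Because the reward is averaged over substrates, the sketch favours a condition that performs consistently over one that excels on a single substrate, which is the sense of "general" used in the abstract.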

Original language: English (US)
Pages (from-to): 1025-1033
Number of pages: 9
Issue number: 8001
State: Published - Feb 29 2024

ASJC Scopus subject areas

  • General

