TY - JOUR
T1 - Machine learning enhanced spectroscopic analysis
T2 - towards autonomous chemical mixture characterization for rapid process optimization
AU - Angulo, Andrea
AU - Yang, Lankun
AU - Aydil, Eray S.
AU - Modestino, Miguel A.
N1 - Funding Information:
The authors acknowledge the financial support provided by the National Science Foundation (Grant # CBET-1943972) and from NYU, Tandon School of Engineering, through the MAM and ESA startup funds. In addition, the collaboration between MAM and ESA is enabled by the Center for Decarbonizing Chemical Manufacturing Using Electrification (DC-MUSE), formed with the help of a generous grant from the Sloan Foundation (Grant # 201-16807) and a center planning grant from the National Science Foundation (Grant # EEC-1936709).
Publisher Copyright:
© 2022 The Author(s). Published by the Royal Society of Chemistry.
PY - 2022/2/1
Y1 - 2022/2/1
N2 - Autonomous chemical process development and optimization methods use algorithms to explore the operating parameter space based on feedback from experimentally determined exit stream compositions. Measuring the compositions of multicomponent streams is challenging, requiring multiple analytical techniques to differentiate between similar chemical components in the mixture and determine their concentration. Herein, we describe a universal analytical methodology based on multitarget regression machine learning (ML) models to rapidly determine chemical mixtures' compositions from Fourier transform infrared (FTIR) absorption spectra. Specifically, we used simulated FTIR spectra for up to 6 components in water and tested seven different ML algorithms to develop the methodology. All algorithms resulted in regression models with mean absolute errors (MAE) between 0 and 0.27 wt%. We validated the methodology with experimental data obtained on mixtures prepared using a network of programmable pumps in line with an FTIR transmission flow cell. ML models were trained using experimental data and evaluated for mixtures of up to 4 components with similar chemical structures, including alcohols (i.e., glycerol, isopropanol, and 1-butanol) and nitriles (i.e., acrylonitrile, adiponitrile, and propionitrile). Linear regression models predicted concentrations with coefficients of determination, R², between 0.955 and 0.986, while artificial neural network models showed slightly lower accuracy, with R² between 0.854 and 0.977. These R² values correspond to MAEs of 0.28–0.52 wt% for mixtures with component concentrations between 4 and 10 wt%. Thus, we demonstrate that ML models can accurately determine the compositions of multicomponent mixtures of similar species, enhancing spectroscopic chemical quantification for use in autonomous, fast process development and optimization.
AB - Autonomous chemical process development and optimization methods use algorithms to explore the operating parameter space based on feedback from experimentally determined exit stream compositions. Measuring the compositions of multicomponent streams is challenging, requiring multiple analytical techniques to differentiate between similar chemical components in the mixture and determine their concentration. Herein, we describe a universal analytical methodology based on multitarget regression machine learning (ML) models to rapidly determine chemical mixtures' compositions from Fourier transform infrared (FTIR) absorption spectra. Specifically, we used simulated FTIR spectra for up to 6 components in water and tested seven different ML algorithms to develop the methodology. All algorithms resulted in regression models with mean absolute errors (MAE) between 0 and 0.27 wt%. We validated the methodology with experimental data obtained on mixtures prepared using a network of programmable pumps in line with an FTIR transmission flow cell. ML models were trained using experimental data and evaluated for mixtures of up to 4 components with similar chemical structures, including alcohols (i.e., glycerol, isopropanol, and 1-butanol) and nitriles (i.e., acrylonitrile, adiponitrile, and propionitrile). Linear regression models predicted concentrations with coefficients of determination, R², between 0.955 and 0.986, while artificial neural network models showed slightly lower accuracy, with R² between 0.854 and 0.977. These R² values correspond to MAEs of 0.28–0.52 wt% for mixtures with component concentrations between 4 and 10 wt%. Thus, we demonstrate that ML models can accurately determine the compositions of multicomponent mixtures of similar species, enhancing spectroscopic chemical quantification for use in autonomous, fast process development and optimization.
UR - http://www.scopus.com/inward/record.url?scp=85133375487&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133375487&partnerID=8YFLogxK
U2 - 10.1039/d1dd00027f
DO - 10.1039/d1dd00027f
M3 - Article
AN - SCOPUS:85133375487
SN - 2635-098X
VL - 1
SP - 35
EP - 44
JO - Digital Discovery
JF - Digital Discovery
IS - 1
ER -