Abstract
Natural language processing (NLP) and neural networks trained to produce word embeddings were investigated as a more efficient way to extract useful information on catalytic polymerizations. Thousands of abstracts on metallocene-catalyzed polymerizations were retrieved through journal application programming interfaces (APIs). These abstracts were then used to train a group of related models that produce word embeddings with the word2vec algorithm, which turns vocabulary into high-dimensional vectors through unsupervised training. These vectors can then be used to show relationships between chemicals, suggest catalyst and activator combinations, interpret acronyms, and categorize chemical compounds by reagent class. We hypothesize that one can determine which areas of metallocene catalysis are understudied by comparing the predicted catalyst and activator combinations with those found in existing abstracts, thereby guiding research toward major breakthroughs as the scientific literature continues to grow.
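As a rough illustration of how embedding vectors might be used to suggest combinations and flag understudied ones, the sketch below ranks candidate activators by cosine similarity to a catalyst term and then lists predicted pairs that are absent from a set of reported pairs. It is a minimal sketch under stated assumptions, not the method used in the study: the three-dimensional vectors, the term list, and the helper names `cosine`, `rank_partners`, and `unreported_pairs` are all hypothetical, and real word2vec vectors would be learned from the abstracts rather than written by hand.

```python
import math

# Hypothetical toy embeddings (assumption): real word2vec vectors are learned
# from text and usually have a hundred or more dimensions.
EMBEDDINGS = {
    "zirconocene": [0.9, 0.1, 0.2],
    "hafnocene": [0.8, 0.2, 0.1],
    "mao": [0.7, 0.3, 0.6],
    "borate": [0.6, 0.1, 0.7],
    "toluene": [0.1, 0.9, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_partners(term, candidates, embeddings):
    """Return candidates sorted by similarity to `term`, most similar first."""
    return sorted(
        candidates,
        key=lambda c: cosine(embeddings[term], embeddings[c]),
        reverse=True,
    )


def unreported_pairs(predicted, reported):
    """Return predicted (catalyst, activator) pairs not present in `reported`."""
    return [pair for pair in predicted if pair not in reported]


# Usage: rank candidate activators for one catalyst term.
ranking = rank_partners("zirconocene", ["toluene", "borate", "mao"], EMBEDDINGS)

# Usage: compare predicted pairs against pairs already seen in abstracts
# (both lists are made-up placeholders).
predicted = [("zirconocene", "mao"), ("hafnocene", "borate")]
reported = {("zirconocene", "mao")}
gaps = unreported_pairs(predicted, reported)
```

Cosine similarity is used here because it is a common way to compare word vectors; the abstract does not state which similarity measure the models rely on.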
Original language | English (US) |
---|---|
Article number | 107026 |
Journal | Computers and Chemical Engineering |
Volume | 141 |
State | Published - Oct 4 2020 |
Keywords
- Machine learning
- Metallocene catalysis
- Natural language
- Polymerization
- Word embeddings
ASJC Scopus subject areas
- General Chemical Engineering
- Computer Science Applications