A Semiautomated Structure-Based Method to Predict Substrates of Enzymes via Molecular Docking: A Case Study with Candida antarctica Lipase B

Zhiqiang Yao, Lujia Zhang, Bei Gao, Dongbing Cui, Fengqing Wang, Xiao He, John Z.H. Zhang, Dongzhi Wei

Research output: Contribution to journal › Article › peer-review


The discovery of unique substrates is important for developing potential applications of enzymes. However, the experimental procedures for substrate identification are laborious, time-consuming, and expensive. Although in silico structure-based approaches show great promise, recent extensive studies have shown that these approaches remain a formidable challenge for current biocomputational methodologies. Here we present an open-source, extensible, and flexible software platform for predicting enzyme substrates called THEMIS, which performs in silico virtual screening for potential catalytic targets of an enzyme on the basis of the enzyme's catalysis mechanism. On the basis of a generalized transition state theory of enzyme catalysis, we introduce a modified docking procedure called "mechanism-based restricted docking" (MBRD) for novel substrate recognition from molecular docking. Comprising a series of utilities written in C/Python, THEMIS automatically executes parallel-computing MBRD tasks and evaluates the results with various molecular mechanics (MM) criteria such as energy, distance, angle, and dihedral angle to help identify desired substrates. Exhaustive sampling and statistical measures were used to improve the robustness and reproducibility of the method. We used Candida antarctica lipase B (CALB) as a test system to demonstrate the effectiveness of our computational prediction of (non)substrates. A novel MM score function for CALB substrate identification derived from the near-attack conformation was used to evaluate the possibility of chemical transformation. A highly positive rate of 93.4% was achieved from a CALB substrate library with 61 known substrates and 35 nonsubstrates, and the screening rate has reached 10³ compounds/day (96 CPU cores, 100 samples/compound). The performance shows that the present method is perhaps the first reported scheme to meet the requirement for practical applicability to enzyme studies.
An additional study was performed to validate the universality of our method. In this verification we employed two distinct enzymes, nitrilase Nit6803 and SDR Gox2181, where the correct rates of both enzymes exceeded 90%. The source code used will be released under the GNU General Public License (GPLv3) and will be free to download. We believe that the present method will provide new insights into enzyme research and accelerate the development of novel enzyme applications.
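
To make the geometric criteria named in the abstract (distance, angle, dihedral angle) concrete, the following is a minimal Python sketch of how such quantities can be computed from atomic coordinates and combined over many sampled poses. It is not taken from THEMIS: the function names, the choice of atoms (a nucleophile, a carbonyl carbon, and a carbonyl oxygen), and the threshold values are all hypothetical placeholders chosen for illustration, and the real MM score function described in the paper also includes energy terms not shown here.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def distance(p, q):
    """Euclidean distance between two points (same unit as the inputs)."""
    return _norm(_sub(p, q))

def angle(p, q, r):
    """Angle p-q-r in degrees, with q as the vertex."""
    u, v = _sub(p, q), _sub(r, q)
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees (atan2 form)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _norm(b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def is_candidate_pose(nuc, carbon, oxygen,
                      max_dist=3.5, angle_window=(90.0, 120.0)):
    """Hypothetical geometric filter for one docked pose.

    The thresholds are illustrative placeholders, NOT values from the paper.
    """
    d = distance(nuc, carbon)
    a = angle(nuc, carbon, oxygen)
    return d <= max_dist and angle_window[0] <= a <= angle_window[1]

def pass_fraction(poses, **thresholds):
    """Fraction of sampled poses (nuc, carbon, oxygen) passing the filter."""
    if not poses:
        return 0.0
    hits = sum(1 for nuc, c, o in poses
               if is_candidate_pose(nuc, c, o, **thresholds))
    return hits / len(poses)
```

Reporting a fraction over many samples, rather than judging a single docked pose, mirrors the abstract's emphasis on exhaustive sampling and statistical measures; how THEMIS actually aggregates its samples is not specified in this record.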

Original language: English (US)
Pages (from-to): 1979-1994
Number of pages: 16
Journal: Journal of Chemical Information and Modeling
Issue number: 10
State: Published - Oct 24 2016

ASJC Scopus subject areas

  • General Chemistry
  • General Chemical Engineering
  • Computer Science Applications
  • Library and Information Sciences

