Structure-based drug design depends critically on the accuracy of molecular docking scoring functions, and there is significant interest in advancing scoring functions with machine learning approaches. In this work, by judiciously expanding the training set, exploring new features related to explicit mediating water molecules and to ligand conformation stability, and applying extreme gradient boosting (XGBoost) with Vina parametrization, we have improved the robustness and applicability of machine-learning scoring functions. The new scoring function, vinaXGB, not only performs consistently among the top when compared with classical scoring functions on the CASF-2016 benchmark but also achieves significantly better prediction accuracy on different types of structures that mimic real docking applications.
ASJC Scopus subject areas
- Chemical Engineering(all)
- Computer Science Applications
- Library and Information Sciences