Determining Optimal Coarse-Grained Representation for Biomolecules Using Internal Cluster Validation Indexes

Zhenliang Wu, Yuwei Zhang, John Zenghui Zhang, Kelin Xia, Fei Xia

Research output: Contribution to journal › Article › peer-review

Abstract

Developing ultracoarse-grained models for large biomolecules requires determining the optimal number of coarse-grained (CG) sites needed to represent the target. In this work, we propose using statistical internal cluster validation indexes to determine the optimal number of CG sites, where the sites themselves are optimized with the essential dynamics coarse-graining method. The calculated curves of the Calinski-Harabasz and Silhouette Coefficient indexes exhibit extrema at similar CG numbers. The calculated ratios of the optimal CG numbers to the residue numbers of fine-grained models are in the range from 4 to 2. A comparison of the stability of the index results indicates that the Calinski-Harabasz index is the better choice for determining the optimal CG representation in coarse-graining.
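As a rough illustration of the idea in the abstract, the sketch below scans candidate cluster counts and picks the one at which each index peaks. It is not the authors' procedure: plain k-means on synthetic 2D points stands in for the essential dynamics coarse-graining optimization, and all function names (`kmeans`, `calinski_harabasz`, `silhouette`, `best_k`) are hypothetical helpers written for this example.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-first initialisation.
    Assumption: a stand-in for the ED-CG site optimization, not that method."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous center if a cluster empties
                centers[j] = centroid(members)
    return labels

def groups(points, labels):
    out = {}
    for p, lab in zip(points, labels):
        out.setdefault(lab, []).append(p)
    return list(out.values())

def calinski_harabasz(points, labels):
    """CH = [B / (k - 1)] / [W / (n - k)], between- over within-cluster dispersion."""
    clusters = groups(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(c) * sq_dist(centroid(c), overall) for c in clusters)
    within = 0.0
    for c in clusters:
        mu = centroid(c)
        within += sum(sq_dist(p, mu) for p in c)
    return (between / (k - 1)) / (within / (n - k))

def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points; singletons contribute 0."""
    clusters = groups(points, labels)
    total = 0.0
    for i, own in enumerate(clusters):
        if len(own) == 1:
            continue
        for p in own:
            a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for j, other in enumerate(clusters) if j != i)
            total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, index, k_values):
    """Return the k that maximises the given index, plus all scores."""
    scores = {k: index(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores

# Synthetic data (assumption): three well-separated 2D blobs of 30 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
          for cx, cy in blob_centers for _ in range(30)]

k_ch, ch_scores = best_k(points, calinski_harabasz, range(2, 7))
k_sc, sc_scores = best_k(points, silhouette, range(2, 7))
print("CH index peaks at k =", k_ch)
print("Silhouette peaks at k =", k_sc)
```

On real systems the two curves need not agree this cleanly, which is why the abstract compares how stable each index is.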

Original language: English (US)
Pages (from-to): 14-20
Number of pages: 7
Journal: Journal of Computational Chemistry
Volume: 41
Issue number: 1
State: Published - Jan 5 2020

Keywords

  • CH index
  • SC index
  • coarse-graining
  • internal cluster validation index
  • optimal CG sites

ASJC Scopus subject areas

  • General Chemistry
  • Computational Mathematics
