TY - GEN

T1 - Maximum likelihood quantization of genomic features using dynamic programming

AU - Song, Mingzhou

AU - Haralick, Robert M.

AU - Boissinot, Stéphane

PY - 2007

Y1 - 2007

N2 - Dynamic programming is introduced to quantize a continuous random variable into a discrete random variable. Quantization is often useful before statistical analysis or reconstruction of large network models among multiple random variables. The quantization, through dynamic programming, finds the optimal discrete representation of the original probability density function of a random variable by maximizing the likelihood of the observed data. This algorithm is highly applicable to the study of genomic features such as the recombination rate across the chromosomes and the statistical properties of non-coding elements such as LINE1. In particular, the recombination rate obtained by quantization is studied for LINE1 elements that are also grouped by length using quantization. The exact and density-preserving quantization approach provides a superior alternative to the inexact and distance-based k-means clustering algorithm for discretization of a single variable.

AB - Dynamic programming is introduced to quantize a continuous random variable into a discrete random variable. Quantization is often useful before statistical analysis or reconstruction of large network models among multiple random variables. The quantization, through dynamic programming, finds the optimal discrete representation of the original probability density function of a random variable by maximizing the likelihood of the observed data. This algorithm is highly applicable to the study of genomic features such as the recombination rate across the chromosomes and the statistical properties of non-coding elements such as LINE1. In particular, the recombination rate obtained by quantization is studied for LINE1 elements that are also grouped by length using quantization. The exact and density-preserving quantization approach provides a superior alternative to the inexact and distance-based k-means clustering algorithm for discretization of a single variable.

UR - http://www.scopus.com/inward/record.url?scp=47349124623&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=47349124623&partnerID=8YFLogxK

U2 - 10.1109/ICMLA.2007.72

DO - 10.1109/ICMLA.2007.72

M3 - Conference contribution

AN - SCOPUS:47349124623

SN - 0769530699

SN - 9780769530697

T3 - Proceedings - 6th International Conference on Machine Learning and Applications, ICMLA 2007

SP - 547

EP - 553

BT - Proceedings - 6th International Conference on Machine Learning and Applications, ICMLA 2007

T2 - 6th International Conference on Machine Learning and Applications, ICMLA 2007

Y2 - 13 December 2007 through 15 December 2007

ER -