TY - GEN
T1 - Maximum likelihood quantization of genomic features using dynamic programming
AU - Song, Mingzhou
AU - Haralick, Robert M.
AU - Boissinot, Stéphane
PY - 2007
Y1 - 2007
N2 - Dynamic programming is introduced to quantize a continuous random variable into a discrete random variable. Quantization is often useful before statistical analysis or reconstruction of large network models among multiple random variables. Through dynamic programming, the quantization finds the optimal discrete representation of the original probability density function of a random variable by maximizing the likelihood of the observed data. The algorithm is well suited to studying genomic features such as the recombination rate across chromosomes and the statistical properties of non-coding elements such as LINE1. In particular, the recombination rate obtained by quantization is studied for LINE1 elements that are themselves grouped by length using quantization. The exact, density-preserving quantization approach provides an alternative superior to the inexact, distance-based k-means clustering algorithm for discretizing a single variable.
AB - Dynamic programming is introduced to quantize a continuous random variable into a discrete random variable. Quantization is often useful before statistical analysis or reconstruction of large network models among multiple random variables. Through dynamic programming, the quantization finds the optimal discrete representation of the original probability density function of a random variable by maximizing the likelihood of the observed data. The algorithm is well suited to studying genomic features such as the recombination rate across chromosomes and the statistical properties of non-coding elements such as LINE1. In particular, the recombination rate obtained by quantization is studied for LINE1 elements that are themselves grouped by length using quantization. The exact, density-preserving quantization approach provides an alternative superior to the inexact, distance-based k-means clustering algorithm for discretizing a single variable.
UR - http://www.scopus.com/inward/record.url?scp=47349124623&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47349124623&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2007.72
DO - 10.1109/ICMLA.2007.72
M3 - Conference contribution
AN - SCOPUS:47349124623
SN - 0769530699
SN - 9780769530697
T3 - Proceedings - 6th International Conference on Machine Learning and Applications, ICMLA 2007
SP - 547
EP - 553
BT - Proceedings - 6th International Conference on Machine Learning and Applications, ICMLA 2007
T2 - 6th International Conference on Machine Learning and Applications, ICMLA 2007
Y2 - 13 December 2007 through 15 December 2007
ER -