TY - JOUR
T1 - A versatile statistical analysis algorithm to detect genome copy number variation
AU - Daruwala, Raoul Sam
AU - Rudra, Archisman
AU - Ostrer, Harry
AU - Lucito, Robert
AU - Wigler, Michael
AU - Mishra, Bud
PY - 2004/11/16
Y1 - 2004/11/16
N2 - We have developed a versatile statistical analysis algorithm for the detection of genomic aberrations in human cancer cell lines. The algorithm analyzes genomic data obtained from a variety of array technologies, such as oligonucleotide array, bacterial artificial chromosome array, or array-based comparative genomic hybridization, that operate by hybridizing with genomic material obtained from cancer and normal cells and allow detection of regions of the genome with altered copy number. The number of probes (i.e., resolution), the amount of uncharacterized noise per probe, and the severity of chromosomal aberrations per chromosomal region may vary with the underlying technology, biological sample, and sample preparation. Constrained by these uncertainties, our algorithm aims at robustness by using a priorless maximum a posteriori estimator and at efficiency by a dynamic programming implementation. We illustrate these characteristics of our algorithm by applying it to data obtained from representational oligonucleotide microarray analysis and array-based comparative genomic hybridization technology as well as to synthetic data obtained from an artificial model whose properties can be varied computationally. The algorithm can combine data from multiple sources and thus facilitate the discovery of genes and markers important in cancer, as well as the discovery of loci important in inherited genetic disease.
KW - Array-based comparative genomic hybridization
KW - Copy-number fluctuations
KW - Maximum a posteriori estimator
UR - http://www.scopus.com/inward/record.url?scp=9244233813&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=9244233813&partnerID=8YFLogxK
U2 - 10.1073/pnas.0407247101
DO - 10.1073/pnas.0407247101
M3 - Article
C2 - 15534219
AN - SCOPUS:9244233813
SN - 0027-8424
VL - 101
SP - 16292
EP - 16297
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 46
ER -