Abstract
Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
Original language | English (US) |
---|---|
Pages (from-to) | 827-838 |
Number of pages | 12 |
Journal | Nature Biotechnology |
Volume | 28 |
Issue number | 8 |
DOIs | 10.1038/nbt.1665 |
State | Published - Aug 2010 |
ASJC Scopus subject areas
- Biotechnology
- Bioengineering
- Applied Microbiology and Biotechnology
- Molecular Medicine
- Biomedical Engineering
In: Nature Biotechnology, Vol. 28, No. 8, 08.2010, p. 827-838.
Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - The Microarray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models
AU - Shi, Leming
AU - Campbell, Gregory
AU - Jones, Wendell D.
AU - Campagne, Fabien
AU - Wen, Zhining
AU - Walker, Stephen J.
AU - Su, Zhenqiang
AU - Chu, Tzu Ming
AU - Goodsaid, Federico M.
AU - Pusztai, Lajos
AU - Shaughnessy, John D.
AU - Oberthuer, André
AU - Thomas, Russell S.
AU - Paules, Richard S.
AU - Fielden, Mark
AU - Barlogie, Bart
AU - Chen, Weijie
AU - Du, Pan
AU - Fischer, Matthias
AU - Furlanello, Cesare
AU - Gallas, Brandon D.
AU - Ge, Xijin
AU - Megherbi, Dalila B.
AU - Symmans, W. Fraser
AU - Wang, May D.
AU - Zhang, John
AU - Bitter, Hans
AU - Brors, Benedikt
AU - Bushel, Pierre R.
AU - Bylesjo, Max
AU - Chen, Minjun
AU - Cheng, Jie
AU - Cheng, Jing
AU - Chou, Jeff
AU - Davison, Timothy S.
AU - Delorenzi, Mauro
AU - Deng, Youping
AU - Devanarayan, Viswanath
AU - Dix, David J.
AU - Dopazo, Joaquin
AU - Dorff, Kevin C.
AU - Elloumi, Fathi
AU - Fan, Jianqing
AU - Fan, Shicai
AU - Fan, Xiaohui
AU - Fang, Hong
AU - Gonzaludo, Nina
AU - Hess, Kenneth R.
AU - Hong, Huixiao
AU - Huan, Jun
AU - Irizarry, Rafael A.
AU - Judson, Richard
AU - Juraeva, Dilafruz
AU - Lababidi, Samir
AU - Lambert, Christophe G.
AU - Li, Li
AU - Li, Yanen
AU - Li, Zhen
AU - Lin, Simon M.
AU - Liu, Guozhen
AU - Lobenhofer, Edward K.
AU - Luo, Jun
AU - Luo, Wen
AU - McCall, Matthew N.
AU - Nikolsky, Yuri
AU - Pennello, Gene A.
AU - Perkins, Roger G.
AU - Philip, Reena
AU - Popovici, Vlad
AU - Price, Nathan D.
AU - Qian, Feng
AU - Scherer, Andreas
AU - Shi, Tieliu
AU - Shi, Weiwei
AU - Sung, Jaeyun
AU - Thierry-Mieg, Danielle
AU - Thierry-Mieg, Jean
AU - Thodima, Venkata
AU - Trygg, Johan
AU - Vishnuvajjala, Lakshmi
AU - Wang, Sue Jane
AU - Wu, Jianping
AU - Wu, Yichao
AU - Xie, Qian
AU - Yousef, Waleed A.
AU - Zhang, Liang
AU - Zhang, Xuegong
AU - Zhong, Sheng
AU - Zhou, Yiming
AU - Zhu, Sheng
AU - Arasappan, Dhivya
AU - Bao, Wenjun
AU - Lucas, Anne Bergstrom
AU - Berthold, Frank
AU - Brennan, Richard J.
AU - Buness, Andreas
AU - Catalano, Jennifer G.
AU - Chang, Chang
AU - Chen, Rong
AU - Cheng, Yiyu
AU - Cui, Jian
AU - Czika, Wendy
AU - Demichelis, Francesca
AU - Deng, Xutao
AU - Dosymbekov, Damir
AU - Eils, Roland
AU - Feng, Yang
AU - Fostel, Jennifer
AU - Fulmer-Smentek, Stephanie
AU - Fuscoe, James C.
AU - Gatto, Laurent
AU - Ge, Weigong
AU - Goldstein, Darlene R.
AU - Guo, Li
AU - Halbert, Donald N.
AU - Han, Jing
AU - Harris, Stephen C.
AU - Hatzis, Christos
AU - Herman, Damir
AU - Huang, Jianping
AU - Jensen, Roderick V.
AU - Jiang, Rui
AU - Johnson, Charles D.
AU - Jurman, Giuseppe
AU - Kahlert, Yvonne
AU - Khuder, Sadik A.
AU - Kohl, Matthias
AU - Li, Jianying
AU - Lee, Li
AU - Li, Menglong
AU - Li, Quan Zhen
AU - Li, Shao
AU - Li, Zhiguang
AU - Liu, Jie
AU - Liu, Ying
AU - Liu, Zhichao
AU - Meng, Lu
AU - Madera, Manuel
AU - Martinez-Murillo, Francisco
AU - Medina, Ignacio
AU - Meehan, Joseph
AU - Miclaus, Kelci
AU - Moffitt, Richard A.
AU - Montaner, David
AU - Mukherjee, Piali
AU - Mulligan, George J.
AU - Neville, Padraic
AU - Nikolskaya, Tatiana
AU - Ning, Baitang
AU - Page, Grier P.
AU - Parker, Joel
AU - Parry, R. Mitchell
AU - Peng, Xuejun
AU - Peterson, Ron L.
AU - Phan, John H.
AU - Quanz, Brian
AU - Ren, Yi
AU - Riccadonna, Samantha
AU - Roter, Alan H.
AU - Samuelson, Frank W.
AU - Schumacher, Martin M.
AU - Shambaugh, Joseph D.
AU - Shi, Qiang
AU - Shippy, Richard
AU - Si, Shengzhu
AU - Smalter, Aaron
AU - Sotiriou, Christos
AU - Soukup, Mat
AU - Staedtler, Frank
AU - Steiner, Guido
AU - Stokes, Todd H.
AU - Sun, Qinglan
AU - Tan, Pei Yi
AU - Tang, Rong
AU - Tezak, Zivana
AU - Thorn, Brett
AU - Tsyganova, Marina
AU - Turpaz, Yaron
AU - Vega, Silvia C.
AU - Visintainer, Roberto
AU - Von Frese, Juergen
AU - Wang, Charles
AU - Wang, Eric
AU - Wang, Junwei
AU - Wang, Wei
AU - Westermann, Frank
AU - Willey, James C.
AU - Woods, Matthew
AU - Wu, Shujian
AU - Xiao, Nianqing
AU - Xu, Joshua
AU - Xu, Lei
AU - Yang, Lun
AU - Zeng, Xiao
AU - Zhang, Jialu
AU - Zheng, Li
AU - Zhang, Min
AU - Zhao, Chen
AU - Puri, Raj K.
AU - Scherf, Uwe
AU - Tong, Weida
AU - Wolfinger, Russell D.
N1 - Funding Information: The MAQC-II project was funded in part by the FDA’s Office of Critical Path Programs (to L.S.). Participants from the National Institutes of Health (NIH) were supported by the Intramural Research Program of NIH, Bethesda, Maryland or the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, North Carolina. J.F. was supported by the Division of Intramural Research of the NIEHS under contract HHSN273200700046U. Participants from the Johns Hopkins University were supported by grants from the NIH (1R01GM083084-01 and 1R01RR021967-01A2 to R.A.I. and T32GM074906 to M.M.). Participants from the Weill Medical College of Cornell University were partially supported by the Biomedical Informatics Core of the Institutional Clinical and Translational Science Award RFA-RM-07-002. F.C. acknowledges resources from The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine and from the David A. Cofrin Center for Biomedical Information at Weill Cornell. The data set from The Hamner Institutes for Health Sciences was supported by a grant from the American Chemistry Council’s Long Range Research Initiative. The breast cancer data set was generated with support of grants from NIH (R-01 to L.P.), The Breast Cancer Research Foundation (to L.P. and W.F.S.) and the Faculty Incentive Funds of the University of Texas MD Anderson Cancer Center (to W.F.S.). The data set from the University of Arkansas for Medical Sciences was supported by National Cancer Institute (NCI) PO1 grant CA55819-01A1, NCI R33 Grant CA97513-01, Donna D. and Donald M. Lambert Lebow Fund to Cure Myeloma and Nancy and Steven Grand Foundation. We are grateful to the individuals whose gene expression data were used in this study. All MAQC-II participants freely donated their time and reagents for the completion and analyses of the MAQC-II project. The MAQC-II consortium also thanks R. O’Neill for his encouragement and coordination among FDA Centers on the formation of the RBWG. The MAQC-II consortium gratefully dedicates this work in memory of R.F. Wagner who enthusiastically worked on the MAQC-II project and inspired many of us until he unexpectedly passed away in June 2008.
PY - 2010/8
Y1 - 2010/8
AB - Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
UR - http://www.scopus.com/inward/record.url?scp=78650735473&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650735473&partnerID=8YFLogxK
U2 - 10.1038/nbt.1665
DO - 10.1038/nbt.1665
M3 - Article
C2 - 20676074
AN - SCOPUS:78650735473
SN - 1087-0156
VL - 28
SP - 827
EP - 838
JO - Nature Biotechnology
JF - Nature Biotechnology
IS - 8
ER -