TY - JOUR
T1 - In Silico Evaluation of Predicted Regulatory Interactions in Arabidopsis thaliana
AU - Nero, Damion
AU - Katari, Manpreet S.
AU - Kelfer, Jonathan
AU - Tranchina, Daniel
AU - Coruzzi, Gloria M.
N1 - Funding Information:
We would like to thank Dr. Wei Hu and Dr. Hong Ma for their contribution of the MIF1 microarray data. We would also like to thank Dr. Gabriel Krouk for his review of the manuscript and contribution to the design of Figures 1 and 3. This work was funded by NIH NIGMS Grant GM032877 to GC that includes a minority supplement to DN, an NSF Arabidopsis 2010 Genome Grant (IOB 0519985) to GC, and an NSF Grant DBI 0445666 to GC & MK.
PY - 2009/12/21
Y1 - 2009/12/21
AB - Background: Prediction of transcriptional regulatory mechanisms in Arabidopsis has become increasingly critical with the explosion of genomic data now available for both gene expression and gene sequence composition. We have shown in previous work [1] that a combination of correlation measurements and cis-regulatory element (CRE) detection methods is effective in predicting targets for candidate transcription factors in specific case studies, which were validated. However, to date there has been no quantitative assessment of which correlation measures or CRE detection methods, used alone or in combination, are most effective in predicting TF→target relationships on a genome-wide scale. Results: We tested several widely used methods, based on correlation (Pearson and Spearman rank correlation) and CRE detection (≥1 CRE or CRE over-representation), to determine which of these methods, individually or in combination, is the most effective by various measures for making regulatory predictions. To predict the regulatory targets of a transcription factor (TF) of interest, we applied these methods to microarray expression data for genes that were regulated over treatment and control conditions in wild-type (WT) plants. Because the chosen data sets included identical experimental conditions applied to TF over-expressor or T-DNA knockout plants, we were able to test the TF→target predictions made using microarray data from WT plants against microarray data from mutant/transgenic plants. For each method, or combination of methods, we computed sensitivity, specificity, positive and negative predictive value, and the F-measure of balance between sensitivity and positive predictive value (precision).
This analysis revealed that ≥1 CRE and Spearman correlation (used alone or in combination) were the most balanced CRE detection and correlation methods, respectively, with regard to their power to accurately predict regulatory-target interactions. Conclusion: These findings provide an approach and guidance for researchers interested in predicting transcriptional regulatory mechanisms using microarray data that they generate (or microarray data that is publicly available) combined with CRE detection in promoter sequence data.
UR - http://www.scopus.com/inward/record.url?scp=74049162004&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74049162004&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-10-435
DO - 10.1186/1471-2105-10-435
M3 - Article
C2 - 20025756
AN - SCOPUS:74049162004
SN - 1471-2105
VL - 10
JO - BMC bioinformatics
JF - BMC bioinformatics
M1 - 435
ER -