TY - JOUR
T1 - BigFoot
T2 - Bayesian alignment and phylogenetic footprinting with MCMC
AU - Satija, Rahul
AU - Novák, Ádám
AU - Miklós, István
AU - Lyngsø, Rune
AU - Hein, Jotun
N1 - Funding Information:
This research was supported by the BBSRC grant BB/C509566/1 and by the EU grant MTKD-CT-2006-042794. IM was also supported by a Bolyai postdoctoral fellowship and the OTKA grant F61730. RS is funded by the Rhodes Trust, UK.
PY - 2009
Y1 - 2009
N2 - Background. We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden Markov model algorithms to analyze up to four sequences. Results. We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the -globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion. BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/.
AB - Background. We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden Markov model algorithms to analyze up to four sequences. Results. We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the -globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion. BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/.
UR - http://www.scopus.com/inward/record.url?scp=70349205853&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349205853&partnerID=8YFLogxK
U2 - 10.1186/1471-2148-9-217
DO - 10.1186/1471-2148-9-217
M3 - Article
C2 - 19715598
AN - SCOPUS:70349205853
SN - 1471-2148
VL - 9
JO - BMC Evolutionary Biology
JF - BMC Evolutionary Biology
IS - 1
M1 - 217
ER -