TY - GEN
T1 - Highly scalable Ab initio genomic motif identification
AU - Marchand, Benoît
AU - Bajic, Vladimir B.
AU - Kaushik, Dinesh K.
PY - 2011
Y1 - 2011
N2 - We present results of scaling an ab initio motif family identification system, Dragon Motif Finder (DMF), to 65,536 processor cores of IBM Blue Gene/P. DMF seeks groups of mutually similar polynucleotide patterns within a set of genomic sequences and builds various motif families from them. Such information is of relevance to many problems in life sciences. Prior attempts to scale such ab initio motif-finding algorithms achieved limited success. We solve the scalability issues using a combination of mixed-mode MPI-OpenMP parallel programming, master-slave work assignment, multi-level workload distribution, multi-level MPI collectives, and serial optimizations. While the scalability of our algorithm was excellent (94% parallel efficiency on 65,536 cores relative to 256 cores on a modest-size problem), the final speedup with respect to the original serial code exceeded 250,000 when serial optimizations are included. This enabled us to carry out many large-scale ab initio motif-finding simulations in a few hours while the original serial code would have needed decades of execution time.
AB - We present results of scaling an ab initio motif family identification system, Dragon Motif Finder (DMF), to 65,536 processor cores of IBM Blue Gene/P. DMF seeks groups of mutually similar polynucleotide patterns within a set of genomic sequences and builds various motif families from them. Such information is of relevance to many problems in life sciences. Prior attempts to scale such ab initio motif-finding algorithms achieved limited success. We solve the scalability issues using a combination of mixed-mode MPI-OpenMP parallel programming, master-slave work assignment, multi-level workload distribution, multi-level MPI collectives, and serial optimizations. While the scalability of our algorithm was excellent (94% parallel efficiency on 65,536 cores relative to 256 cores on a modest-size problem), the final speedup with respect to the original serial code exceeded 250,000 when serial optimizations are included. This enabled us to carry out many large-scale ab initio motif-finding simulations in a few hours while the original serial code would have needed decades of execution time.
KW - Data-flow parallel processing
KW - Master-slave MPI parallel processing
KW - Mixed-mode MPI-OpenMP parallel processing
KW - Multi-level MPI collective operations
KW - Multi-level workload distribution
UR - http://www.scopus.com/inward/record.url?scp=83155173350&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=83155173350&partnerID=8YFLogxK
U2 - 10.1145/2063384.2063459
DO - 10.1145/2063384.2063459
M3 - Conference contribution
AN - SCOPUS:83155173350
SN - 9781450307710
T3 - Proceedings of 2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
BT - Proceedings of 2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
T2 - 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC11
Y2 - 12 November 2011 through 18 November 2011
ER -