TY - GEN
T1 - A new parallel kernel-independent fast multipole method
AU - Ying, Lexing
AU - Biros, George
AU - Zorin, Denis
AU - Langston, Harper
PY - 2003
Y1 - 2003
N2 - We present a new adaptive fast multipole algorithm and its parallel implementation. The algorithm is kernel-independent in the sense that the evaluation of pairwise interactions does not rely on any analytic expansions, but only utilizes kernel evaluations. The new method provides the enabling technology for many important problems in computational science and engineering. Examples include viscous flows, fracture mechanics and screened Coulombic interactions. Our MPI-based parallel implementation logically separates the computation and communication phases to avoid synchronization in the upward and downward computation passes, and thus allows us to fully exploit computation and communication overlapping. We measure isogranular and fixed-size scalability for a variety of kernels on the Pittsburgh Supercomputing Center's TCS-1 Alphaserver on up to 3000 processors. We have solved viscous flow problems with up to 2.1 billion unknowns and we have achieved 1.6 Tflops/s peak performance and 1.13 Tflops/s sustained performance.
KW - Fast multipole methods
KW - N-body problems
KW - adaptive algorithms
KW - boundary integral equations
KW - massively parallel computing
KW - viscous flows
UR - http://www.scopus.com/inward/record.url?scp=84877033732&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84877033732&partnerID=8YFLogxK
U2 - 10.1145/1048935.1050165
DO - 10.1145/1048935.1050165
M3 - Conference contribution
AN - SCOPUS:84877033732
SN - 1581136951
SN - 9781581136951
T3 - Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, SC 2003
BT - Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, SC 2003
T2 - 2003 ACM/IEEE Conference on Supercomputing, SC 2003
Y2 - 15 November 2003 through 21 November 2003
ER -