TY - JOUR

T1 - Parallel scalable adjoint-based adaptive solution of variable-viscosity Stokes flow problems

AU - Burstedde, Carsten

AU - Ghattas, Omar

AU - Stadler, Georg

AU - Tu, Tiankai

AU - Wilcox, Lucas C.

N1 - Funding Information:
This work was partially supported by NSF (Grants OCI-0749334, DMS-0724746, CNS-0619838, CCF-0427985), DOE SC’s SciDAC program (Grant DE-FC02-06ER25782), DOE NNSA’s PSAAP program (cooperative agreement DE-FC52-08NA28615), and AFOSR’s Computational Math program (Grant FA9550-07-1-0480). We acknowledge many helpful discussions with the hypre developers and with George Biros and Serge Prudhomme. We thank TACC for their outstanding support, in particular Bill Barth, Karl Schulz, and Victor Eijkhout. We also thank Marc Spiegelman for referring us to the benchmark example used in Section 5.2. Finally, we dedicate this paper to Professor J. Tinsley Oden, who provided inspiration and encouragement of this work, on the occasion of his 70th birthday.

PY - 2009/5/1

Y1 - 2009/5/1

N2 - We present a framework for parallel adaptive solution of variable-viscosity Stokes flow problems. We focus on data structures, algorithms, and solvers that can scale to thousands of processor cores. The problem is discretized by octree-based finite elements with explicit enforcement of continuity constraints at hanging nodes. The parallel octree structure allows for fast neighbor-finding and facilitates local coarsening and refinement of the mesh. Mesh adaptivity is driven by a posteriori error indicators, including adjoint-based goal-oriented techniques. Dynamic load-balancing is achieved by dynamically partitioning a Morton-ordered space-filling curve. The Stokes system is solved iteratively using the minimum residual method (MINRES), preconditioned by a Schur-complement-based approximate inverse that employs algebraic multigrid V-cycle approximations of the inverses of the Poisson-like operators. We demonstrate the effectiveness of this framework on several testbed problems with up to 6 orders of magnitude variation in viscosity and up to 1.7 billion unknowns, on up to 4096 cores. The results indicate that the overhead due to all AMR components is less than 3% of the overall solve time, the solver exhibits very good algorithmic and parallel implementation scalability, the solver is insensitive to the magnitude of viscosity variation, and adjoint-based adaptivity results in over two orders of magnitude reduction in number of unknowns and up to an order of magnitude improvement in runtime relative to a uniform mesh, for the same level of error.

KW - Adaptive mesh refinement

KW - Adjoint error estimation

KW - Algebraic multigrid

KW - Octree algorithms

KW - Parallel computing

KW - Stokes equations

UR - http://www.scopus.com/inward/record.url?scp=63249129443&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=63249129443&partnerID=8YFLogxK

U2 - 10.1016/j.cma.2008.12.015

DO - 10.1016/j.cma.2008.12.015

M3 - Article

AN - SCOPUS:63249129443

SN - 0045-7825

VL - 198

SP - 1691

EP - 1700

JO - Computer Methods in Applied Mechanics and Engineering

JF - Computer Methods in Applied Mechanics and Engineering

IS - 21-26

ER -