Distributed Multigrid Neural Solvers on Megavoxel Domains

Aditya Balu, Sergio Botelho, Biswajit Khara, Vinay Rao, Soumik Sarkar, Chinmay Hegde, Adarsh Krishnamurthy, Santi Adavani, Baskar Ganapathysubramanian

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

    Abstract

    We consider the distributed training of large-scale neural networks that serve as PDE (partial differential equation) solvers producing full-field outputs. We specifically consider neural solvers for the generalized 3D Poisson equation over megavoxel domains. A scalable framework is presented that integrates two distinct advances. First, we accelerate training a large model via a method analogous to the multigrid technique used in numerical linear algebra. Here, the network is trained using a hierarchy of increasing-resolution inputs in sequence, analogous to the V, W, F, and Half-V cycles used in multigrid approaches. Second, in conjunction with the multigrid approach, we implement a distributed deep learning framework that significantly reduces the time to solve. We show scalability of this approach on both GPU (Azure VMs on Cloud) and CPU clusters (PSC Bridges2). This approach is deployed to train a generalized 3D Poisson solver that scales well to predict full-field output solutions up to a resolution of 512×512×512 for a high-dimensional family of inputs. This strategy opens up the possibility of fast and scalable training of neural PDE solvers on heterogeneous clusters.
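The resolution hierarchy described in the abstract can be pictured as a training schedule. The sketch below is only an illustration and is not taken from the paper: it assumes each level halves the resolution of the one above it, that the same network weights can be trained at every resolution (for example, a fully convolutional model), and that a V cycle runs fine to coarse to fine while a Half-V cycle runs coarse to fine only. The function names, the number of levels, and the `train_at_resolution` callback are hypothetical. The W and F cycles are omitted because their exact level ordering is not stated in the abstract.

```python
from typing import Callable, List


def level_resolutions(finest: int, num_levels: int) -> List[int]:
    """Per-axis resolutions from coarsest to finest.

    Assumption: each coarser level halves the resolution of the level above.
    """
    return [finest >> k for k in reversed(range(num_levels))]


def half_v_schedule(finest: int, num_levels: int) -> List[int]:
    """Half-V (assumed): train coarse-to-fine, ending at the finest level."""
    return level_resolutions(finest, num_levels)


def v_schedule(finest: int, num_levels: int) -> List[int]:
    """V (assumed): start at the finest level, descend to the coarsest, ascend back."""
    ascending = level_resolutions(finest, num_levels)
    return ascending[::-1] + ascending[1:]


def train_multigrid(schedule: List[int],
                    train_at_resolution: Callable[[int], None]) -> None:
    """Visit each resolution in order, reusing the same model at every stage.

    `train_at_resolution` is a hypothetical callback that runs some training
    epochs on inputs of shape (res, res, res).
    """
    for res in schedule:
        train_at_resolution(res)


if __name__ == "__main__":
    print("Half-V:", half_v_schedule(512, 4))
    print("V:     ", v_schedule(512, 4))
```

In a setup like this, most of the cheap early epochs would be spent on coarse grids, with the expensive 512×512×512 stage used only to refine an already reasonable model; whether that matches the authors' exact procedure would need to be checked against the full paper.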

    Original language: English (US)
    Title of host publication: Proceedings of SC 2021
    Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond
    Publisher: IEEE Computer Society
    ISBN (Electronic): 9781450384421
    DOIs
    State: Published - Nov 14 2021
    Event: 33rd International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond, SC 2021 - Virtual, Online, United States
    Duration: Nov 14 2021 to Nov 19 2021

    Publication series

    Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
    ISSN (Print): 2167-4329
    ISSN (Electronic): 2167-4337

    Conference

    Conference: 33rd International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond, SC 2021
    Country/Territory: United States
    City: Virtual, Online
    Period: 11/14/21 to 11/19/21

    Keywords

    • Distributed training
    • Multigrid
    • Neural PDE solvers
    • Physics aware neural networks

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Computer Science Applications
    • Hardware and Architecture
    • Software
