Abstract: A hierarchical, sparse Gaussian process for in situ inference in expensive physics simulations

High-fidelity physics simulations, such as the Energy Exascale Earth System Model (E3SM), are generating ever-increasing quantities of data. In the near future, it will be infeasible to store the full simulation output after a run completes. To facilitate the training of complex machine learning models in the future, there is a pressing need for {\it in situ} inference algorithms. In this work, we focus on spatio-temporal inference models that are (i) highly scalable, (ii) distributed, (iii) able to capture small-scale, high-resolution structures, and (iv) able to capture long-range global structure. One possible approach is to leverage algorithms for fast approximate Gaussian process regression, such as the sparse variational GP (SVGP). We present a Hierarchical SVGP (HSVGP) model in which local models capture small-scale structures {\it in situ}, while large-scale structures are captured and communicated by a global SVGP. Using both synthetic data and E3SM model output, we demonstrate that the HSVGP model outperforms its simpler non-hierarchical counterpart in terms of predictive accuracy and smoothness at partition boundaries, while remaining both highly scalable and easily distributed.