Lulu Kang (UMASS): Optimal Kernel Learning for Gaussian Process Models with High-Dimensional Input
Gaussian process (GP) regression is a popular surrogate modeling tool for computer simulations in engineering and scientific domains. However, it often struggles with high computational costs and low prediction accuracy when the simulation involves a large number of input variables. For some simulation models, the response variables may only be significantly influenced by a small subset of the input variables, referred to as the "active variables". We propose an optimal kernel learning approach to identify these active variables, thereby overcoming GP model limitations and enhancing system understanding. Our method approximates the original GP model's covariance function through a convex combination of kernel functions, each utilizing low-dimensional subsets of input variables. Inspired by the Fedorov-Wynn algorithm from optimal design literature, we develop an optimal kernel learning algorithm to determine this approximation. We incorporate the effect heredity principle to ensure sparsity in active variable selection. Through multiple examples, we demonstrate that our proposed method outperforms alternative approaches in both correctly identifying active input variables and improving prediction accuracy. This approach offers researchers a powerful tool to simplify complex models and focus on the most influential factors in their simulations.
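To make the core idea concrete, here is a minimal sketch of a covariance built as a convex combination of kernels, each restricted to a low-dimensional subset of the inputs. The function names, the squared-exponential base kernel, and the particular subsets and weights are illustrative assumptions, not the speaker's implementation or algorithm:

```python
import numpy as np

def rbf_kernel(X1, X2, active, lengthscale=1.0):
    """Squared-exponential kernel that uses only the 'active' input columns."""
    A, B = X1[:, active], X2[:, active]
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * lengthscale ** 2))

def combined_kernel(X1, X2, subsets, weights):
    """Convex combination sum_i w_i * k_i(x, x'), with each k_i acting
    on a low-dimensional subset of the input variables."""
    K = np.zeros((X1.shape[0], X2.shape[0]))
    for subset, w in zip(subsets, weights):
        K += w * rbf_kernel(X1, X2, subset)
    return K

# Hypothetical example: 10-dimensional input where only x0 and x3 are active.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 10))
subsets = [[0], [3], [0, 3]]           # candidate low-dimensional subsets
weights = np.array([0.5, 0.3, 0.2])    # convex weights: nonnegative, sum to 1
K = combined_kernel(X, X, subsets, weights)
```

Because the weights are nonnegative and each component kernel is positive semidefinite, the combination remains a valid covariance function; the learning algorithm's job is to choose which subsets receive nonzero weight.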