Konstantinos Spiliopoulos: Normalization effects on deep neural networks and deep learning for scientific problems
Abstract
We study the effect of normalization on the layers of deep neural networks. A given layer $i$ with $N_{i}$ hidden units is normalized by $1/N_{i}^{\gamma_{i}}$ with $\gamma_{i}\in[1/2,1]$. We study how the choice of the $\gamma_{i}$ affects the statistical behavior of the neural network’s output (such as its variance) as well as the test accuracy and generalization properties of the architecture. We find that, in terms of the variance of the neural network’s output and test accuracy, the best choice is to set the $\gamma_{i}$ equal to one, which is the mean-field scaling. This is particularly true for the outer layer: the neural network’s behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers. The mathematical analysis rests on an asymptotic expansion of the neural network’s output. An important practical consequence of the analysis is that it provides a systematic and mathematically informed way to choose the learning rate hyperparameters. Such a choice guarantees that the neural network behaves in a statistically robust way as the numbers of hidden units $N_i$ grow. Time permitting, I will discuss applications of these ideas to the design of deep learning algorithms for scientific problems, including solving high-dimensional partial differential equations (PDEs), closure of PDE models, and reinforcement learning, with applications to financial engineering, turbulence and more.
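As a rough illustration of the $1/N^{\gamma}$ normalization described in the abstract, the sketch below estimates the variance of a network's output for two values of $\gamma$. It is not the speaker's code or analysis. It assumes a single hidden layer, a tanh activation, standard Gaussian hidden weights and uniform outer weights, and it looks only at random initialization with no training. The function names are hypothetical.

```python
import math
import random


def shallow_output(x, n_hidden, gamma, rng):
    """Output of a one-hidden-layer network scaled by 1 / n_hidden**gamma.

    Assumed setup (not from the abstract): tanh activation, N(0, 1) hidden
    weights, Uniform(-1, 1) outer weights, scalar input x.
    """
    total = 0.0
    for _ in range(n_hidden):
        w = rng.gauss(0.0, 1.0)      # hidden-layer weight
        c = rng.uniform(-1.0, 1.0)   # outer-layer weight
        total += c * math.tanh(w * x)
    return total / n_hidden ** gamma


def output_variance(n_hidden, gamma, n_trials=500, x=1.0, seed=0):
    """Sample variance of the output over independent random initializations."""
    rng = random.Random(seed)
    samples = [shallow_output(x, n_hidden, gamma, rng) for _ in range(n_trials)]
    mean = sum(samples) / n_trials
    return sum((s - mean) ** 2 for s in samples) / (n_trials - 1)


if __name__ == "__main__":
    for gamma in (0.5, 1.0):
        print(f"gamma={gamma}: variance ~ {output_variance(100, gamma):.6f}")
```

Because both calls use the same seed, they draw the same weights, so the two variances differ by exactly the factor $N^{-1}$ (up to floating-point rounding). This is consistent with the smaller output variance under mean-field scaling, but only at initialization; the talk's results concern the trained network.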
Short Bio:
Konstantinos Spiliopoulos is the Director of Statistics and a Professor of Mathematics and Statistics at Boston University (BU). He is a member of the Hariri Institute for Computing and of the Center for Information and Systems Engineering at BU. From 2009 to 2012, he was a Prager Assistant Professor at the Division of Applied Mathematics at Brown University; he earned his PhD in Mathematical Statistics at the University of Maryland in 2009. He has been at BU since 2012. Currently, Professor Spiliopoulos works on mathematical and computational methods for machine learning algorithms, stochastic online algorithms and deep learning for scientific problems, accelerated Monte Carlo methods for sampling and optimization, agent-based modeling, and applications to financial engineering, biological systems, and opinion and social dynamics. He received the Stochastics and Dynamics Best Paper Award in 2023, the Simons Fellow in Mathematics Award in 2020, the NSF CAREER Award in 2016, the Hariri Institute Junior Fellowship Award in 2013, the Seymour Goldberg Paper Award in 2008 and the Eygenidio Foundation Award in 2004. His work has been continuously funded by the NSF, the Simons Foundation and the DoD.