Coarse Gradient Descent Method and Quantization of Deep Neural Networks

Event Category:
Reading Seminar on Mathematics of Machine Learning
Jack Xin
University of California, Irvine

Quantization is an effective approach to accelerating deep neural networks by restricting their weights and activation functions to low precision. However, the training objective (loss function) becomes discontinuous, so the standard gradient either vanishes or does not exist. We discuss a notion of coarse gradient (also known as the straight-through estimator) that acts on smooth proxies of discontinuous functions and, with proper design, leads to subtle descent of the loss function in training as well as satisfactory generalization accuracy. We perform convergence analysis on simplified models and experiments on image classification, some in conjunction with a feature-affinity assisted multi-level knowledge distillation to extract an efficient student network from a larger teacher network on label-free data.
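
As a rough illustration of the coarse-gradient idea in the abstract, the sketch below trains binary weights on a toy least-squares problem. Everything in it is an assumption made for illustration and is not taken from the talk: the sign quantizer, the identity function as the smooth proxy, the least-squares loss, and the function names `quantize`, `loss`, `coarse_grad`, and `train`. The forward pass uses the quantized weights, while the update treats the quantizer as the identity in the backward pass and applies the resulting gradient to the underlying float weights.

```python
import numpy as np

def quantize(w):
    """Binarize weights to {-1, +1}; zero is mapped to +1 (assumed quantizer)."""
    return np.where(w >= 0, 1.0, -1.0)

def loss(w_q, X, y):
    """Mean squared error of the linear model that uses quantized weights."""
    r = X @ w_q - y
    return 0.5 * np.mean(r ** 2)

def coarse_grad(w, X, y):
    """Coarse gradient: differentiate the loss w.r.t. the quantized weights
    and use the result as a stand-in for the gradient w.r.t. the float weights.
    This replaces the quantizer's derivative (zero almost everywhere) by that
    of the identity proxy."""
    r = X @ quantize(w) - y
    return X.T @ r / len(y)

def train(X, y, w0, steps=50, lr=0.1):
    """Run coarse gradient descent on the float weights, starting from w0."""
    w = np.array(w0, dtype=float)  # copy, so the caller's w0 is not modified
    for _ in range(steps):
        w -= lr * coarse_grad(w, X, y)
    return w

if __name__ == "__main__":
    X = np.eye(3)
    y = np.array([1.0, -1.0, 1.0])   # generated by binary weights (1, -1, 1)
    w0 = np.array([-0.5, 0.5, 0.2])
    w = train(X, y, w0)
    print("quantized weights:", quantize(w))
    print("loss before:", loss(quantize(w0), X, y))
    print("loss after: ", loss(quantize(w), X, y))
```

In this toy setting a small perturbation of the float weights does not change the quantized weights, so the standard gradient of the loss with respect to them is zero, yet the coarse gradient still moves them toward the correct signs. Whether such descent holds for a given model depends on the choice of proxy, which is the design question the abstract raises.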

Friday, September 22, 2023 - 12:00pm