Data-Driven Priors for Trustworthy Machine Learning

Statistics and Data Science Seminar Series
Tim Rudner

Machine learning models, while effective in controlled environments, can fail catastrophically when exposed to unexpected conditions upon deployment. This lack of robustness, well-documented even in state-of-the-art models, can lead to severe harm in high-stakes, safety-critical application domains such as healthcare. This shortcoming raises two central questions: When do machine learning models fail, and how can we develop machine learning models we can trust?

In this talk, I will approach these questions from a probabilistic perspective, stepping through ways to address deficiencies in trustworthiness that arise in model construction and training. First, I will demonstrate how a probabilistic approach to model construction can reveal, and help mitigate, failures in neural network training. Then, I will show how to improve the trustworthiness of neural networks with data-driven, domain-informed prior distributions over model parameters. Throughout this talk, I will highlight carefully designed evaluation procedures for assessing the trustworthiness of machine learning models in safety-critical settings.

Wednesday, March 27, 2024 - 1:30am
LGRT 1681