UMass Amherst Professor Markos Katsoulakis and colleagues at Johns Hopkins University and the University of Delaware recently reported in “Science Advances” that they have developed a mathematical and computational framework that provides more trustworthy models and algorithms for probabilistic artificial intelligence-based modeling. The framework also incorporates a systematic assessment of its predictive capabilities.

Katsoulakis, a professor in the department of mathematics and statistics, explains, “In data science, learning mathematical models is predominantly and justifiably driven by the abundance of available data in numerous application domains. However, for many complex physics, chemistry and engineering problems, data is often sparse, expensive and sourced from both experiments and computations with various levels of noise and accuracy.”

In their new paper, the authors show how their approach works in predicting materials that increase the efficiency of the oxygen reduction reaction, a chemical reaction that is a performance bottleneck in fuel cells. Traditional modeling and experiments – often based on trial and error – are commonly used to test chemical catalysts to find those that work faster or more efficiently in fuel cells, they note. But their new framework can narrow down this “hit or miss” guessing approach by identifying the most influential components of the modeled system.

They expect the approach will also apply to many other types of modeling with the same common themes – where available data are fairly sparse, for example, but some expert knowledge is at hand – including problems in energy storage, renewable energy, high-throughput experimentation and biomaterials design.

But how much one can trust predictions made by such models – in fact by any AI algorithm – and whether they are reliable enough for design and decision-making tasks with important real-world consequences, is a significant mathematical challenge, Katsoulakis says.

He adds, “The first element of our proposed framework is the use of probabilistic graphical models – one of the established mathematical tools of artificial intelligence – to allow us to integrate available data, physics-based models and causality, and expert knowledge into a single, structured probabilistic model with many components and inputs.”

To address such questions of trust and reliability, he notes that “the second and complementary element of the proposed methodology is the development of tailored uncertainty quantification methods for probabilistic AI.” These methods systematically assess model predictive uncertainty by quantifying the contributions of each data input and modeling decision, which leads to correctable and, eventually, more trustworthy models.

These uncertainty quantification methods can determine which additional data, though potentially expensive to generate, would have the biggest impact on improving a model. This approach relies on earlier work by Katsoulakis and his collaborators, who have been developing information theory-based tools to assess the predictive performance of complex computational models and deep learning for multi-scale systems.