AMHERST, Mass. – Seventy years ago, science fiction writer Isaac Asimov imagined a world where robots would serve humans in countless ways, and he equipped them with built-in safeguards now known as Asimov’s Three Laws of Robotics, to prevent them, among other goals, from ever harming a person.
Guaranteeing safe and fair machine behavior is still an issue today, says machine learning researcher and lead author Philip Thomas at the University of Massachusetts Amherst. “When someone applies a machine learning algorithm, it’s hard to control its behavior,” he points out. This risks undesirable outcomes from algorithms that direct everything from self-driving vehicles to insulin pumps to criminal sentencing, say he and co-authors.
Writing in Science this week, Thomas and his colleagues Yuriy Brun, Andrew Barto and graduate student Stephen Giguere at UMass Amherst, Bruno Castro da Silva at the Federal University of Rio Grande do Sul, Brazil, and Emma Brunskill at Stanford University introduce a new framework for designing machine learning algorithms that make it easier for users to specify safety and fairness constraints.
“We call algorithms created with our new framework ‘Seldonian’ after Asimov’s character Hari Seldon,” Thomas explains. “If I use a Seldonian algorithm for diabetes treatment, I can specify that undesirable behavior means dangerously low blood sugar, or hypoglycemia. I can say to the machine, ‘while you’re trying to improve the controller in the insulin pump, don’t make changes that would increase the frequency of hypoglycemia.’ Most algorithms don’t give you a way to put this type of constraint on behavior; it wasn’t included in early designs.”
“But making it easier to ensure fairness and avoid harm is becoming increasingly important as machine learning algorithms impact our lives more and more,” he says.
However, “a recent paper listed 21 different definitions of fairness in machine learning. It’s important that we allow the user to select the definition that is appropriate for their intended application,” he adds. “The interface that comes with a Seldonian algorithm allows the user to do just this: to define what ‘undesirable behavior’ means for their application.”
Asimov’s Foundation series, in which Seldon appears, is set in the same universe as his Robot series. Thomas explains, “Everything has fallen apart, the galactic empire is collapsing, partly because the Three Laws of Robotics require certainty. With that level of safety required, robots are paralyzed with indecision because they cannot act with certainty and guarantee that no human will be harmed by their actions.”
Seldon proposes fixing this by reasoning probabilistically about safety. “That’s a good fit to what we’re doing,” Thomas says. The new approach he and colleagues provide allows for probabilistic constraints and requires the algorithm to provide ways for the user to tell it what to constrain. He says, “The framework is a tool for the machine learning researcher. It guides them toward creating algorithms that are easier for users to apply responsibly to real-world problems.”
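As a rough illustration of what a probabilistic constraint of this kind might look like in code, the Python sketch below picks a candidate solution on one half of the data and returns it only if a high-confidence bound on a user-supplied measure of undesirable behavior stays at or below zero on the other half. Everything in it is an assumption made for illustration and is not taken from the article: the function names (`hoeffding_upper_bound`, `passes_safety_test`, `select_or_refuse`), the half-and-half data split, the use of Hoeffding's inequality for the bound, and the toy data.

```python
import math
from typing import Callable, Optional, Sequence, TypeVar

C = TypeVar("C")  # type of a candidate solution (hypothetical)
P = TypeVar("P")  # type of a single data point (hypothetical)


def hoeffding_upper_bound(values: Sequence[float], delta: float,
                          value_range: float) -> float:
    """Upper bound on the true mean that holds with probability >= 1 - delta,
    assuming i.i.d. values confined to an interval of width value_range."""
    n = len(values)
    mean = sum(values) / n
    return mean + value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def passes_safety_test(candidate: C, safety_data: Sequence[P],
                       g: Callable[[C, P], float], delta: float,
                       value_range: float) -> bool:
    """The user-supplied g scores undesirable behavior per data point;
    an expected value of g at or below zero counts as acceptable."""
    g_values = [g(candidate, point) for point in safety_data]
    return hoeffding_upper_bound(g_values, delta, value_range) <= 0.0


def select_or_refuse(candidates: Sequence[C], data: Sequence[P],
                     objective: Callable[[C, Sequence[P]], float],
                     g: Callable[[C, P], float], delta: float,
                     value_range: float = 1.0) -> Optional[C]:
    """Choose the best candidate on the first half of the data, then return
    it only if it passes the safety test on the held-out second half."""
    split = len(data) // 2
    selection_data, safety_data = data[:split], data[split:]
    best = max(candidates, key=lambda c: objective(c, selection_data))
    if passes_safety_test(best, safety_data, g, delta, value_range):
        return best
    return None  # refuse to return a solution rather than risk harm


if __name__ == "__main__":
    # Toy data: each point records whether the undesirable event occurred
    # (1) or not (0) under candidate 0 and under candidate 1.
    data = [(0, 1)] * 2000
    tolerance = 0.1  # hypothetical tolerated rate of the undesirable event

    def g(candidate: int, point: tuple) -> float:
        return point[candidate] - tolerance

    def prefer_larger(candidate: int, _data: Sequence[tuple]) -> float:
        return float(candidate)  # stand-in objective that favors candidate 1

    print(select_or_refuse([0, 1], data, prefer_larger, g, delta=0.05))  # None
    print(select_or_refuse([0], data, prefer_larger, g, delta=0.05))     # 0
```

In this sketch the user controls two things: the function `g`, which states what counts as undesirable, and `delta`, the probability with which the bound is allowed to fail. Returning `None` when the test fails is one possible design choice; it means the procedure may decline to give an answer when the data cannot support the requested guarantee.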
To test the new framework, the researchers created a Seldonian algorithm with constraints and applied it to predicting grade point averages in a data set of 43,000 students in Brazil. It successfully avoided several types of undesirable gender bias. In another test, they showed how an algorithm could improve the controller in an insulin pump while guaranteeing that it would not increase the frequency of hypoglycemia.
Thomas says, “We believe there’s massive room for improvement in this area. Even with our algorithms made of simple components, we obtained impressive results. We hope that machine learning researchers will go on to develop new and more sophisticated algorithms using our framework, which can be used responsibly for applications where machine learning used to be considered too risky. It’s a call to other researchers to conduct research in this space.”