"I see machine learning and cognitive computing as the future of big data science.”
- Sridhar Mahadevan
They are funded by a new four-year, $1.2 million National Science Foundation grant to computer scientist Sridhar Mahadevan, lead principal investigator at UMass Amherst’s College of Information and Computer Sciences. His co-investigators are Mario Parente, an expert in analysis of hyperspectral images at UMass Amherst, and Darby Dyar of Mount Holyoke, a specialist in planetary chemistry and geology who serves on the scientific mission team for the Mars rover.
As Mahadevan explains, NASA’s Curiosity rover, a car-sized robot, has been exploring a crater on Mars since August 2012 and sending back a steady stream of specialized camera images and data on the chemical composition of rocks and dust for analysis. The data range from one-dimensional spectra of rock samples to three-dimensional hyperspectral images of the Martian surface.
He advises Ph.D. students Thomas Boucher, CJ Carey, Steve Giguere, Ian Gemp, Francisco Garcia and Ishan Durugkar in the Autonomous Learning Laboratory, who are exploring machine learning methods to show, for the first time, that deep learning approaches provide a practical and useful tool for handling large scientific data sets.
Scientists glean knowledge about Mars’ rock and dust from data generated by a process called laser-induced breakdown spectroscopy (LIBS). At Mount Holyoke, Dyar directs a laboratory that uses the same laser-plus-spectrometer instrument as the one on the rover. The laser blasts Martian rocks, heating the rock surface to high temperatures, and the instrument sends back information on the light frequencies the surface emits, which is then used to identify the rocks’ chemical composition.
Martian imaging analyses are directed by Parente of UMass Amherst’s department of electrical and computer engineering, who uses hyperspectral cameras that send back images of large areas of the planet’s surface. Unlike traditional cameras, a hyperspectral instrument divides the light spectrum into many more bands than are visible to humans. For example, a hyperspectral map of the Amazon rainforest might identify hundreds of different tree species by analyzing images over many frequencies. The hyperspectral camera aboard Curiosity can observe in both the visible range and longer wavelengths, over the full infrared wavelength range. The ability to detect light in these ranges allows scientists to identify a broad range of minerals on the Martian surface.
Deep learning approaches have already proven amazingly effective in speech and visual recognition software, Mahadevan notes. The question now is whether these methods can be applied to analyze vast amounts of scientific data created by radio and optical telescopes in astronomy, for example, as well as spectroscope and microscope data in medicine and countless other areas.
“With this grant, we’re exploring how well deep learning will work for such analyses. We know that deep learning is now almost as good as humans at recognizing different objects. Our study will test the ability not at recognizing objects on Earth, but understanding planetary geochemistry from the Martian rock tests. The hope is that it will be good at doing this new thing, that in four years we can show that deep learning can have a much better success rate than other previous methods of differentiation.”
As Mahadevan explains, deep learning was designed partly to meet the challenge of huge new data sets. “One characteristic of deep learning is that the more data you give it, the happier it gets,” he says. “It learns the way we do, by recognizing and remembering patterns. You give it millions of experiences and it will learn. Think of a spam filter. You show it a few thousand junk emails as well as normal emails, and ask the program to learn to separate good emails from spam.”
Machine learning is not only faster than a human at solving certain problems, it can discern detail that humans cannot see, the artificial intelligence expert adds. Deep learning software can be trained to recognize chemical patterns much more accurately than human-based methods, for example.
Mahadevan adds, “This used to be prohibitively expensive when computers were slow and memory was expensive. But now machines are tens of thousands of times faster than they used to be and memory is very much cheaper, so it’s feasible to mine enormous amounts of data this way. Of course,” he cautions, “it’s not perfect. It can make mistakes. This scientific project addresses one of the things we need to know. Nobody has actually shown that deep learning techniques can deliver what we hope for from these scientific data sets. Somebody has to explore the question.”
“I’ve been waiting 30 years for this and now it’s here,” he notes. “We always thought artificial intelligence was a dream, but it’s happening. I see machine learning and cognitive computing as the future of big data science.”
UMass Amherst News Office