
Close up view of hands on Tesla Model Y steering wheel with autopilot screen showing road traffic. Credit: Getty Images

New All-Silicon Computer Vision Hardware by UMass Researchers Advances In-Sensor Visual Processing Technology

The experimental analog hardware could perform better than state-of-the-art computer vision methods

Researchers at the University of Massachusetts Amherst have pushed forward the development of computer vision with new, silicon-based hardware that can both capture and process visual data in the analog domain. Their work, described in the journal Nature Communications, could ultimately benefit large-scale, data-intensive and latency-sensitive computer vision tasks.

Image
Recognizing human motions in sophisticated environments is a classic computer vision challenge. Xu and colleagues found that their analog technology performed the task with 90% accuracy, outperforming its digital counterparts.

“This is very powerful retinomorphic hardware,” says Guangyu Xu, associate professor of electrical and computer engineering and adjunct associate professor of biomedical engineering at UMass Amherst. “The idea of fusing the sensing unit and the processing unit at the device level, instead of physically separating them apart, is very similar to the way that human eyes are processing the visual world.” 

Existing computer vision systems often exchange redundant data between physically separated sensing and computing units. “For instance, the camera on your cell phone captures every single pixel of data in the field of view,” says Xu. However, that image contains far more information than the system needs to identify an object or its movement. Transmitting and processing this extra information delays interpretation of the captured scene, a problem for tasks that are both time-sensitive and data-intensive.

“Our technology is trying to cut this latency between the moment you sense the physical world and the moment you identify what you are capturing,” he says. 
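To get a feel for the scale of that redundancy, consider a rough back-of-envelope estimate. The numbers in this short Python sketch are illustrative assumptions, not figures from the study:

    # Rough, illustrative estimate of how much raw data a conventional
    # camera pipeline moves versus the task-relevant information it needs.
    # All numbers here are assumptions for illustration, not from the paper.

    width, height = 1920, 1080      # assumed frame resolution
    bytes_per_pixel = 3             # 8-bit RGB
    fps = 30                        # assumed frame rate

    raw_rate = width * height * bytes_per_pixel * fps   # bytes per second
    print(f"Raw sensor readout: {raw_rate / 1e6:.0f} MB/s")

    # For a task such as "which of four motions is happening?", the useful
    # output is just a few class scores per frame.
    task_rate = 4 * 4 * fps         # four 4-byte scores per frame
    print(f"Task-relevant output: {task_rate} B/s")
    print(f"Redundancy factor: roughly {raw_rate / task_rate:,.0f}x")

Under these assumptions, the sensor reads out hundreds of megabytes per second while the answer the system actually needs amounts to a few hundred bytes per second, which is why moving every pixel off the sensor becomes the bottleneck.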


Xu and his team created two integrated arrays of gate-tunable silicon photodetectors, or in-sensor visual processing arrays. Both arrays produce bipolar analog output and operate at low power: one captures dynamic visual information, such as event-driven light changes, while the other extracts the spatial features of static images to identify objects.
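The general principle can be emulated in a few lines of Python. The sketch below is a simplified illustration, not the authors' device model: it assumes each photodetector's gate-programmable responsivity acts as a weight on the incident light, so that summing the resulting photocurrents computes a weighted sum, one step of a convolution, directly in the analog domain.

    import numpy as np

    # Simplified emulation of in-sensor analog processing (an illustration,
    # not the authors' device model). Each gate-tunable photodetector is
    # assumed to output a photocurrent I_i = R_i * P_i, where P_i is the
    # incident light and the responsivity R_i is set by a gate voltage and
    # can be positive or negative (bipolar output).

    rng = np.random.default_rng(0)
    patch = rng.uniform(0.0, 1.0, size=(3, 3))   # 3x3 patch of light intensities

    # Responsivities programmed to an edge-detecting kernel; the negative
    # entries rely on the devices' sign-switchable (bipolar) response.
    kernel = np.array([[-1.0, -1.0, -1.0],
                       [-1.0,  8.0, -1.0],
                       [-1.0, -1.0, -1.0]])

    # Summing the per-pixel photocurrents on a shared wire performs the
    # weighted sum "for free", with no digitization or data transfer.
    output_current = np.sum(kernel * patch)
    print(f"Analog output current (a.u.): {output_current:+.3f}")

    # The dynamic array can be thought of analogously, but acting on changes
    # in light: a pixel produces a positive or negative signal only where
    # brightness rose or fell between two instants (event-driven sensing).
    frame_t0 = patch
    frame_t1 = patch.copy()
    frame_t1[1, 1] += 0.5                  # a moving object brightens one pixel
    events = np.sign(frame_t1 - frame_t0)  # +1 / 0 / -1 event map
    print(events)

In the physical devices, this summation happens in the photocurrent domain before any digitization, which is what removes the sensor-to-processor bottleneck described above.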

Scaling up these silicon arrays holds promise for retinomorphic computing and intelligent sensing. When asked to classify human motions (walking, boxing, waving and clapping) in sophisticated environments, the new analog technology was accurate 90% of the time, while digital counterparts were 77.5% to 85% accurate. For static images, the technology classified handwritten digits with 95% accuracy, outperforming methods without in-sensor computing capabilities (90%).

Image
Made of silicon, these in-sensor visual processing arrays can both capture and process visual data in the analog domain, as opposed to conventional systems that often physically separate these functions.

A unique feature of these arrays is that they are made of silicon, the same material used in computer chips, in contrast to prior in-sensor visual processors, which are mostly made of nanomaterials. As such, the arrays are compatible with complementary metal–oxide–semiconductor (CMOS) technology, the most common process for building integrated circuits in devices ranging from computers to memory chips. This compatibility makes them well suited for large-scale computer vision tasks that execute many operations simultaneously, a property known as high parallelism.

“Our all-silicon technology lends itself to CMOS integration, mass production and large-scale array operation with low variabilities, so I think that’s a major leap in this field,” says Xu.

Xu gives concrete examples of potential applications for this work. First is self-driving vehicles: “You always have to, in real time, process what is surrounding your vehicle and how fast they move,” he says. Any reduction in processing time increases the safety of autonomous vehicles.

Another area that stands to benefit is bioimaging, where current technology can capture far more data than is actually needed. “We can perhaps compress the amount of data and give out the same biological insight for the scientists,” he says.

This research was supported by the U.S. National Science Foundation. 
