By reconfiguring neural networks in artificial intelligence (AI) devices, a multi-institute team that included Penn State researchers enabled AI systems to continually learn and adapt to new data and tasks in ways that were not possible or practical before. The approach combines a novel algorithm with perovskite nickelate, a material so sensitive to subatomic changes that they can alter its electrical properties, to overcome two of the main limitations of AI: catastrophic forgetting and high energy consumption.
The team published their results Feb. 4 in Science.
Neural networks, the brains of AI systems, are modeled after the human brain’s network of neurons and the synapses that connect them. These connections enable AI devices to learn complicated processes after being fed large datasets that train the AI to properly assess and complete a specific type of task. While this allows for advanced, brain-like functionality, these systems are limited in their ability to be retrained if the tasks evolve over time.
“There is a burning AI problem called catastrophic forgetting, where when you deploy an AI system with one stationary data task and the task changes over time, the AI starts failing to perform the new task well because it was initially built for that first task,” said Abhronil Sengupta, assistant professor of electrical engineering and co-author on the paper.
According to Sengupta, it is catastrophic because the AI completely forgets previously learned representations, resulting in poor performance in continual learning environments.
Sengupta gave the example of an AI model designed to learn and recognize different species of birds.
“Let’s say the model learnt the different birds it sees in winter. However, as the seasons change the birds we see change too,” Sengupta said. “For example, in summer, many birds migrate from the south as the weather warms up. When our model tries to learn these new species, it interferes with the model’s ability to accurately predict the different species. Even if you retrain the model with new data, it will fail — in all seasons. That is where the problem lies. The way that dynamic networks solve the issue is to reconfigure the neural network based on the new data.”
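The forgetting effect Sengupta describes can be reproduced in miniature. The sketch below is a toy illustration only, not the team's model: a one-weight linear model is trained on a "winter" task and then on a conflicting "summer" task, and its error on the winter data is measured before and after. The data, the function names `mse` and `train`, and the learning-rate settings are all assumptions made for the example.

```python
def mse(w, data):
    """Mean squared error of the one-weight model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w, data, lr=0.1, epochs=200):
    """Plain gradient descent on the mean squared error, starting from w."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Hypothetical data: the two "seasons" demand different weights (2 and -1).
xs = [-1.0, -0.5, 0.5, 1.0]
winter = [(x, 2.0 * x) for x in xs]
summer = [(x, -1.0 * x) for x in xs]

w = train(0.0, winter)                # learn the winter task
winter_error_before = mse(w, winter)  # close to zero

w = train(w, summer)                  # keep training the same weight on summer
winter_error_after = mse(w, winter)   # large: the winter solution was overwritten

print(f"winter error before: {winter_error_before:.6f}")
print(f"winter error after:  {winter_error_after:.3f}")
```

Because both tasks share the single weight, fitting the second task necessarily overwrites what was learned for the first. That is the situation a network that can grow new capacity is meant to avoid.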
Reconfiguring the neural networks allows for dynamic systems that can be retrained and evolve to keep pace with changing tasks, according to Sengupta. These sorts of dynamic networks exist, but only in software, which runs on traditional hardware and therefore consumes massive amounts of energy.
“If you look at current AI systems, the average power consumption that ends up taking place is almost six orders of magnitude higher than what takes place in the brain, because current systems have a functional mismatch between the software and the underlying hardware implementation, and that results in the energy consumption problem we have,” Sengupta said, explaining that one needs to completely redesign the current hardware stack and look at devices that mimic brain-like components through their intrinsic physics. “[With our system], the energy consumption is much lower because we are building devices that can directly implement the algorithm functionality in hardware.”
To create this new system, the team members at Purdue University, who were responsible for the materials science and device aspects of the project, used electric pulses at different voltages to change the concentration of hydrogen ions in perovskite nickelate-based devices. The material is very sensitive to hydrogen, so the concentration change reconfigures the makeup of the devices' artificial neurons and synapses. Penn State researchers combined this hardware development with their algorithm to create grow-when-required networks. The approach is based on the idea that when new data are introduced, the neural network should grow and adapt.
The algorithm's grow-when-required networks use incoming information and task requests to guide how the device reconfigures itself. This on-the-fly adaptation makes it possible for the AI to continually learn.
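The grow-and-adapt idea can be sketched in software as follows. This is a heavily simplified illustration loosely inspired by grow-when-required networks, not the team's algorithm or its hardware mapping. The class name `GrowingNetwork`, the activity threshold, the learning rate and the sample data are all assumptions, and the sketch omits parts of a full grow-when-required network, such as connections between nodes. The network keeps a set of prototype nodes. An input that no existing node matches well triggers growth, while a well-matched input only nudges the closest node.

```python
import math


class GrowingNetwork:
    """Toy growing prototype network (illustrative only)."""

    def __init__(self, activity_threshold=0.6, learning_rate=0.2):
        self.activity_threshold = activity_threshold  # assumed value
        self.learning_rate = learning_rate            # assumed value
        self.nodes = []  # each node is a prototype stored as a list of floats

    def _best_match(self, x):
        """Return (index, distance) of the prototype closest to x."""
        distances = [math.dist(node, x) for node in self.nodes]
        best = min(range(len(distances)), key=distances.__getitem__)
        return best, distances[best]

    def observe(self, x):
        """Process one input and return True if the network grew."""
        if not self.nodes:
            self.nodes.append(list(x))
            return True
        best, distance = self._best_match(x)
        activity = math.exp(-distance)  # 1.0 means a perfect match
        if activity < self.activity_threshold:
            # Poor match: grow a new node rather than overwrite an old one.
            self.nodes.append(list(x))
            return True
        # Good match: move the closest node slightly toward the input.
        node = self.nodes[best]
        for i, xi in enumerate(x):
            node[i] += self.learning_rate * (xi - node[i])
        return False


# Hypothetical two-dimensional "features" for winter and summer birds.
winter = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
summer = [(5.0, 5.0), (5.1, 5.0)]

net = GrowingNetwork()
for sample in winter:
    net.observe(sample)
nodes_after_winter = len(net.nodes)

for sample in summer:
    net.observe(sample)
nodes_after_summer = len(net.nodes)

print(nodes_after_winter, nodes_after_summer)
```

In this toy run the summer samples cause a new node to be added while the winter node stays where it was, so the earlier data remain represented instead of being overwritten.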
“Continual or lifelong learning has been elusive in AI and the issue of catastrophic forgetting is a fundamental one as we try to mimic human-like intelligence,” said A.N.M. Nafiul Islam, doctoral candidate in electrical engineering and co-first author on the paper. “Our work is extremely exciting as it not only tackles the issue of incremental learning but also leverages the unique reconfigurability of the device platform to increase efficiency without loss of performance.”
Other authors of the paper were Hai-Tian Zhang, Tae Joon Park, Qi Wang, Sandip Mondal, Haoming Yu and Shriram Ramanathan of Purdue University; Dat S. J. Tran of Santa Clara University; Suvo Banik of the University of Illinois; Hua Zhou of Argonne National Laboratory; Sukriti Manna and Subramanian K.R.S. Sankaranarayanan of Argonne National Laboratory and the University of Illinois; Shaobo Cheng and Yimei Zhu of Brookhaven National Laboratory; Sampath Gamage and Yohannes Abate of the University of Georgia; Nan Jiang of the University of Illinois at Chicago; and Christof Teuscher of Portland State University.