Engineers have developed a neuro-inspired hardware-software co-design approach that could make neural network training more energy-efficient and faster. Their work could one day make it possible to train neural networks on low-power devices such as smartphones, laptops and embedded devices.
Training neural networks to perform tasks such as recognizing objects, navigating self-driving cars or playing games eats up a lot of computing power and time. Large computers with hundreds to thousands of processors are typically required to learn these tasks, and training can take anywhere from weeks to months.
That's because doing these computations involves transferring data back and forth between two separate units, the memory and the processor, and this consumes most of the energy and time during neural network training.
To address this problem, the research group teamed up with Adesto Technologies to develop hardware and algorithms that allow these computations to be performed directly in the memory unit, eliminating the need to repeatedly shuffle data.
The hardware component is a highly energy-efficient type of non-volatile memory technology: a 512-kilobit subquantum Conductive Bridging RAM (CBRAM) array. It consumes 10 to 100 times less energy than today's leading memory technologies.
The device is based on Adesto's CBRAM memory technology, which has primarily been used as a digital storage device with only '0' and '1' states. The lab demonstrated, however, that it can be programmed to have multiple analog states, emulating biological synapses in the human brain. This so-called synaptic device can be used to do in-memory computing for neural network training.
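The difference between a digital cell and an analog synaptic cell can be sketched in a few lines of Python. This is a conceptual illustration, not Adesto's device model; the number of levels and the quantization rule are assumptions made for the example.

```python
# Conceptual sketch (illustrative assumption, not the actual device model):
# a digital memory cell stores only 0 or 1, while a synaptic cell can be
# programmed to one of several evenly spaced analog conductance states.

def program_digital(value):
    """A digital cell collapses any value to 0 or 1."""
    return 1 if value >= 0.5 else 0

def program_analog(value, levels=8):
    """Quantize a weight in [0, 1] to the nearest of `levels` analog states."""
    step = 1.0 / (levels - 1)
    return round(value / step) * step

program_digital(0.62)   # -> 1: all nuance is lost
program_analog(0.62)    # snaps to the nearest of 8 evenly spaced levels
```

With multiple analog states per cell, a single memory element can hold a graded connection strength, which is what lets the array emulate a synapse rather than a bit.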
The team developed algorithms that could be easily mapped onto this synaptic device array. The algorithms provided even more energy and time savings during neural network training.
The approach uses a type of energy-efficient neural network, called a spiking neural network, for implementing unsupervised learning in the hardware.
Neural networks are a series of connected layers of artificial neurons, where the output of one layer provides the input to the next. The strength of the connections between these layers is represented by values called "weights." Training a neural network consists of updating these weights.
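The layered structure can be made concrete with a minimal sketch in Python (an illustration, not the team's code): each layer is a matrix of weights, and the output of one layer feeds the next.

```python
# Minimal sketch of a layered network: "weights" are the connection
# strengths between successive layers of neurons.

def forward(x, weights):
    """Propagate input x through successive weight matrices."""
    for w in weights:
        # the output of one layer is the input to the next
        x = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    return x

# One hidden layer: 3 inputs -> 2 hidden neurons -> 1 output
w1 = [[0.5, -0.2, 0.1],
      [0.3,  0.8, -0.5]]
w2 = [[1.0, -1.0]]
y = forward([1.0, 2.0, 3.0], [w1, w2])
```

Training consists of nudging the numbers in `w1` and `w2` until the outputs match the desired behavior.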
Conventional neural networks spend a lot of energy to continuously update every single one of these weights. But in spiking neural networks, only weights that are tied to spiking neurons get updated. This means fewer updates, which means less computing power and time.
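The savings come from the update being event-driven. Here is a simplified sketch of that idea (an assumed simplification, not the paper's learning rule): weights are touched only when their input neuron actually fired.

```python
# Sketch of event-driven weight updates: a weight is changed only when
# its presynaptic (input) neuron spiked on this time step.

def update_weights(weights, spiked, lr=0.1):
    """Increment only the weights whose input neuron spiked; count updates."""
    updates = 0
    for j, did_spike in enumerate(spiked):
        if not did_spike:
            continue            # silent neuron: its weights stay untouched
        for row in weights:
            row[j] += lr
            updates += 1
    return updates

w = [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
n = update_weights(w, [True, False, True])
# only the columns for the two spiking neurons changed: 4 updates, not 6
```

In a conventional network, all six weights would be updated every step; here the silent neuron's column is skipped entirely.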
The network also does what's called unsupervised learning, which means it can essentially train itself. For example, if the network is shown a series of handwritten numerical digits, it will figure out how to distinguish between zeros, ones, twos, etc. A benefit is that the network does not need to be trained on labeled examples, meaning it does not need to be told that it's seeing a zero, one or two, which is useful for autonomous applications like navigation.
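The flavor of label-free learning can be illustrated with a simple competitive-learning sketch in Python. This is an assumed stand-in for the idea, not the team's actual method: each input is matched to its nearest "prototype," and that prototype is nudged toward the input, with no labels involved at any point.

```python
# Illustrative sketch of unsupervised learning (assumed competitive-learning
# rule, not the paper's algorithm): inputs organize themselves around
# prototypes without ever being told what they are.

def assign_and_update(prototypes, x, lr=0.5):
    """Move the closest prototype toward input x; return its index."""
    dists = [sum((p - xi) ** 2 for p, xi in zip(proto, x))
             for proto in prototypes]
    winner = dists.index(min(dists))
    prototypes[winner] = [p + lr * (xi - p)
                          for p, xi in zip(prototypes[winner], x)]
    return winner

protos = [[0.0, 0.0], [1.0, 1.0]]
winner = assign_and_update(protos, [0.9, 1.1])  # closest to the second prototype
```

After enough inputs, each prototype drifts toward one natural cluster in the data, which is how the network can tell digit classes apart without ever seeing a label.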
To make training even faster and more energy-efficient, the lab developed a new algorithm that they dubbed "soft-pruning" to implement with the unsupervised spiking neural network. Soft-pruning is a method that finds weights that have already matured during training and then sets them to a constant non-zero value. This stops them from getting updated for the remainder of the training, which minimizes computing power.
Soft-pruning differs from conventional pruning methods because it is implemented during training, rather than after. It can also lead to higher accuracy when a neural network puts its training to the test. Normally in pruning, redundant or unimportant weights are completely removed. The downside is that the more weights you prune, the less accurately the network performs during testing. But soft-pruning just keeps these weights in a low energy setting, so they're still around to help the network perform with higher accuracy.
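The contrast can be sketched in a few lines of Python. The maturity threshold, the clamp value and the update rule below are assumptions made for illustration, not the paper's parameters: mature weights are clamped to a fixed non-zero value and excluded from further updates, whereas conventional "hard" pruning would zero them out entirely.

```python
# Illustrative sketch of soft-pruning (threshold and clamp value are
# assumed, not the paper's parameters): mature weights are frozen at a
# constant non-zero value instead of being removed.

def soft_prune(weights, mature_threshold=0.8, clamp_value=0.5):
    """Clamp mature weights to a constant; return the indices now frozen."""
    frozen = []
    for i, w in enumerate(weights):
        if abs(w) >= mature_threshold:
            weights[i] = clamp_value if w > 0 else -clamp_value
            frozen.append(i)
    return frozen

def train_step(weights, grads, frozen, lr=0.1):
    """Apply updates, skipping frozen (soft-pruned) weights."""
    for i, g in enumerate(grads):
        if i in frozen:
            continue            # no update: this is where computation is saved
        weights[i] -= lr * g

w = [0.9, 0.1, -1.2, 0.3]
frozen = soft_prune(w)          # the first and third weights are mature
train_step(w, [1.0, 1.0, 1.0, 1.0], frozen)
```

The frozen weights stop consuming update computation, but because they keep a non-zero value they still contribute to the network's output at test time, unlike hard-pruned weights, which are gone for good.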
The team implemented the neuro-inspired unsupervised spiking neural network and the soft-pruning algorithm on the subquantum CBRAM synaptic device array. They then trained the network to classify handwritten digits from the MNIST database.
In tests, the network classified digits with 93 percent accuracy even when up to 75 percent of the weights were soft-pruned. In comparison, the network performed with less than 90 percent accuracy when only 40 percent of the weights were pruned using conventional pruning methods.
In terms of energy savings, the team estimates that their neuro-inspired hardware-software co-design approach can eventually cut energy use during neural network training by two to three orders of magnitude compared to the state of the art.