Gradient Descent in Recurrent Neural Networks with Model-Free Multiplexed Gradient Descent: Toward Temporal On-Chip Neuromorphic Learning
Author(s)
Ryan O'Loughlin, Nicholas Skuda, Bakhrom Oripov, Sonia Buckley, Adam McCaughan
Abstract
The brain implements recurrent neural networks (RNNs) efficiently; modern computing hardware does not. Although specialized neuromorphic hardware is well suited to recurrent implementations in the inference phase, it does not lend itself to the primary training method for RNNs, backpropagation through time (BPTT), for on-chip learning. To resolve this mismatch between RNNs and hardware, we propose applying Multiplexed Gradient Descent (MGD) for model-free learning in recurrent systems, using simple operations realizable on a variety of neuromorphic hardware platforms. MGD does not require analytic differentiation, weight transport, diverse computations, or even a hardware model. Nevertheless, it is guaranteed to converge to the true gradient of the loss landscape for any differentiable system, and it can do so with multiplexed perturbations. Moreover, considerable time savings are available by minimizing the resolution of gradient estimation required for gradient descent. The only operations required are those canonically associated with a forward pass and cost calculation, along with an additional mechanism for global broadcast, local parameter-adjacent storage, and pseudorandom value generation (or, alternatively, a way to feed in orthogonal binary vectors). There are a number of ways to instantiate these requirements, and on this basis we provide a list of candidate hardware for implementation. In this paper, we demonstrate that MGD converges to the true gradient, in agreement with both finite differences and backpropagation, for a linearly recurrent neural network, independent of the learning task and network initialization. We then demonstrate that gradient descent can be performed with efficient gradient approximations on real-world time-series and sequence-modeling tasks. The separability of representations learned by this method is analyzed visually and quantitatively. Finally, the prospects for implementation on neuromorphic hardware are discussed, given the modest operational requirements for training a recurrent neural network with MGD in situ, thereby accessing the full neuromorphic advantage of on-chip learning in recurrent neural networks.
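The abstract describes MGD only at a high level, so the following is a minimal sketch of how multiplexed, model-free gradient estimation could look for a small linear RNN. The ±1 perturbation scheme, the toy running-mean task, and the function names (rnn_cost, unpack, mgd_step) are illustrative assumptions, not the authors' implementation; the sketch only exercises the ingredients named in the abstract: a forward pass and cost calculation, a globally broadcast scalar cost change, parameter-local correlation with pseudorandom perturbations, and averaging that approaches the true gradient.

```python
import numpy as np

# Hypothetical illustration of MGD-style, model-free gradient descent on a
# tiny linear RNN. The perturbation scheme (random +/-1 vectors), amplitude,
# and averaging below are assumptions, not the paper's exact method.

rng = np.random.default_rng(0)

def unpack(params, n_in, n_hidden):
    """Split the flat parameter vector into input, recurrent, and output weights."""
    a = n_hidden * n_in
    b = a + n_hidden * n_hidden
    return (params[:a].reshape(n_hidden, n_in),
            params[a:b].reshape(n_hidden, n_hidden),
            params[b:].reshape(1, n_hidden))

def rnn_cost(params, inputs, targets, n_hidden):
    """Forward pass of a linear RNN plus mean-squared-error cost."""
    W_in, W_rec, W_out = unpack(params, inputs.shape[1], n_hidden)
    h = np.zeros(n_hidden)
    cost = 0.0
    for x, y in zip(inputs, targets):
        h = W_rec @ h + W_in @ x              # linear recurrence
        cost += np.mean((W_out @ h - y) ** 2)
    return cost / len(inputs)

def mgd_step(params, cost_fn, eps=1e-3, lr=1e-2, n_perturb=32):
    """One descent step built from multiplexed perturbations.

    Each iteration perturbs *all* parameters at once with a pseudorandom
    +/-1 vector, observes only the scalar change in cost (the global
    broadcast), and each parameter accumulates delta_cost times its own
    perturbation sign (its local, parameter-adjacent storage). Averaging
    over perturbations converges toward the true gradient.
    """
    base_cost = cost_fn(params)
    grad_est = np.zeros_like(params)
    for _ in range(n_perturb):
        delta = rng.choice([-1.0, 1.0], size=params.shape)
        delta_cost = cost_fn(params + eps * delta) - base_cost
        grad_est += (delta_cost / eps) * delta   # local correlation
    grad_est /= n_perturb
    return params - lr * grad_est, base_cost

# Toy task: predict the running mean of a random input sequence.
n_in, n_hidden, T = 2, 8, 50
inputs = rng.normal(size=(T, n_in))
targets = np.cumsum(inputs.mean(axis=1)) / np.arange(1, T + 1)

n_params = n_hidden * n_in + n_hidden * n_hidden + n_hidden
params = 0.1 * rng.normal(size=n_params)
cost_fn = lambda p: rnn_cost(p, inputs, targets, n_hidden)

for step in range(200):
    params, cost = mgd_step(params, cost_fn)
    if step % 50 == 0:
        print(f"step {step:3d}  cost {cost:.4f}")
```

The same loop structure is hardware-friendly in the sense the abstract emphasizes: the only model-dependent operation is the forward pass inside cost_fn, while the update itself needs just a broadcast scalar, a local perturbation memory, and a source of pseudorandom (or orthogonal binary) perturbation values.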
O'Loughlin, R., Skuda, N., Oripov, B., Buckley, S. and McCaughan, A. (2025), Gradient Descent in Recurrent Neural Networks with Model-Free Multiplexed Gradient Descent: Toward Temporal On-Chip Neuromorphic Learning, International Conference on Neuromorphic Systems, Seattle, WA, US (Accessed June 16, 2025)