Review

Memristor-based spiking neural networks: cooperative development of neural network architecture/algorithms and memristors

  • Peng Huihui ,
  • Gan Lin , * ,
  • Guo Xin , *
  • School of Materials Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
*E-mails: (Lin Gan),
(Xin Guo)

Received date: 2023-12-05

  Accepted date: 2024-04-02

  Online published: 2024-04-06

Abstract

Inspired by the structure and working principles of the human brain, spiking neural networks (SNNs) have emerged as the latest generation of artificial neural networks, attracting broad attention for their remarkably low-energy pulse-based transmission and powerful capability for large-scale parallel computation. Research on artificial neural networks is gradually shifting from software simulation to hardware implementation, a process fraught with challenges. In particular, memristors are highly anticipated hardware candidates owing to their fast programming speed, low power consumption, and compatibility with complementary metal-oxide-semiconductor (CMOS) technology. In this review, we start from the basic principles of SNNs, introduce memristor-based technologies for the hardware implementation of SNNs, and discuss the feasibility of integrating customized algorithm optimization to promote efficient and energy-saving SNN hardware systems. Finally, based on the existing memristor technology, we summarize the current problems and challenges in this field.

Cite this article

Peng Huihui, Gan Lin, Guo Xin. Memristor-based spiking neural networks: cooperative development of neural network architecture/algorithms and memristors[J]. Chip, 2024, 3(2): 100093. DOI: 10.1016/j.chip.2024.100093

INTRODUCTION

The human brain, a highly interconnected complex network built from an enormous number of neurons and synapses, is one of the most powerful biological information-processing organs on Earth. In this network, neurons act as the receptors and processors of information, and synapses carry the information exchanged between neurons. This structure not only endows the human brain with excellent logical reasoning, abstract thinking, and creativity1 but also keeps its power consumption down to about 20 W even during highly complex parallel processing of multiple information streams2. In recent years, the field of neural networks, which simulates the structure of neurons and synaptic connections in the human brain, has undergone tremendous progress and transformation. On the one hand, the computational capability of silicon-based chips continues to rise3, albeit the growth rate predicted by "Moore's Law" has slowed down. On the other hand, the sustained innovation of deep-learning theories and architectures4 offers ever more opportunities for the widespread application of neural networks in various fields, including but not limited to pattern recognition5, computer vision6-8, and natural language processing9-11. In particular, OpenAI's ChatGPT12 and Sora13 have led technological innovations in natural language generation and video content generation, respectively. However, this rapid progress is accompanied by a rapidly increasing demand for computational resources and energy14,15. How to balance the performance of neural networks against their computational and energy requirements has therefore become a major challenge. Unlike the continuous signals in traditional deep learning, spiking neural networks (SNNs) adopt pulse signals as the form of data transmission.
Instead of transmitting each discrete pulse in real time, a neuron decides whether to pass a pulse on to other neurons based on the cumulative effect of multiple incoming pulses, a mechanism known as the event-triggered mode. This mode endows SNNs with high efficiency and low energy consumption, which has been confirmed in various fields16-19.
The hardware implementation of SNNs is key to further reducing their energy consumption. SNNs have achieved substantial success in software; however, tremendous challenges remain in hardware implementation. Nowadays, much mature neural-network hardware based on traditional complementary metal-oxide-semiconductor (CMOS) devices has been developed, including central processing units (CPUs)20, graphics processing units (GPUs)21, field-programmable gate arrays (FPGAs)22, application-specific integrated circuits (ASICs)23, etc. CPUs offer fast computing but poor parallel-processing capability, and the power required by GPUs is too large for deployment at the terminal. Therefore, alternative technologies have been developed for acceleration, for example flexible, reprogrammable, low-power FPGAs, or highly optimized application-specific custom accelerators. The latter usually appear as ASICs, which provide greater design flexibility and higher energy efficiency, especially for inference in artificial intelligence. Large-scale SNN hardware based on FPGAs24,25 and ASICs (such as TrueNorth26) has also been realized and remarkable achievements have been made. However, limited by their von Neumann architecture, these platforms have not yet fully exploited the energy-consumption advantage of SNNs. In contrast, emerging technologies compatible with "in-memory computing", such as phase-change memory, resistive memory (RRAM), magnetic memory, and ferroelectric memory, show better potential for the hardware implementation of SNNs. Among them, RRAM (i.e., memristors) has become a strong candidate for emulating the key units of SNNs (neurons and synapses) owing to its scaling potential (<10 nm)27, fast switching speed (sub-nanosecond)28, low programming current (nanoampere)29 and high endurance (>10^12 cycles)30.
In modern computers based on the von Neumann architecture, the computation and storage of data are physically separated, so data must be shuttled back and forth between the computing unit and the storage unit31. This inevitably introduces latency and energy overheads32; for memory-intensive neural-network computing in particular, bandwidth limitations and energy consumption are the main challenges33. In the human brain, by contrast, computation and storage occur in the same physical space, an architecture referred to as "in-memory computing"34. This is exactly what makes two-terminal memristor arrays exciting as network synapses: memristor arrays used for SNN hardware meet the requirements of in-memory computing directly through Ohm's law and Kirchhoff's current law35, which reduces latency and energy consumption.
There have already been many reviews on memristors for neural networks, covering synaptic function implementation, integrated network training and operation, and neuromorphic computing applications36-39. However, memristors have some inherent non-ideal characteristics, such as a limited number of conductance states, non-linear conductance modulation, and stochasticity, among others. The optimization of algorithms/architectures around these inherent characteristics has not yet been fully explored, especially for the memristor-based hardware implementation of SNNs. Therefore, this paper starts from the basic principles of SNNs, then introduces memristor-based technologies for the hardware implementation of SNNs, and subsequently discusses the challenges of hardware implementation from both the algorithmic and hardware perspectives. Finally, we focus on algorithm optimization for memristors so as to provide a reference for the implementation of high-performance memristor-based SNNs. Based on the existing memristor technology, the current problems and challenges in this field are also summarized.

SPIKING NEURAL NETWORKS

Maass classified neural networks into three generations40. The first generation, exemplified by perceptrons, generates a single numerical output through threshold operations41. Building on this, the second generation introduced continuous non-linear activation functions, such as the sigmoid function, rendering the output continuous42. This evolution, supported by the well-known backpropagation (BP) algorithm, culminated in today's deep neural networks (DNNs)43. The principal distinction of the third generation from its predecessors lies in the mode of information transmission, where pulse signals replace real values44. In composition, SNNs are similar to traditional artificial neural networks (ANNs): neurons are interconnected via synapses to build specific structures. In information transmission, however, SNNs operate in a completely different way. In traditional ANNs, signals from the preceding layer are multiplied by the corresponding synaptic weights and summed before being conveyed to a neuron; the neuron then processes this signal through an activation function and transmits the result to the next layer. In SNNs, pulse signals are likewise multiplied by synaptic weights and summed before being transmitted to a neuron, but the result is not passed directly to an activation function. Instead, the neuron accumulates the effects of these signals over time until a threshold is reached, whereupon a pulse is emitted to the next layer, as depicted in Fig. 1a44-47. This implies that, unlike traditional neural networks, SNNs do not need to activate all synaptic weights and neurons at each time step; they rely on the cumulative effect of multiple pulse signals to determine whether to continue transmitting, which significantly reduces energy consumption and enhances efficiency.
Fig. 1. Schematic of the SNN model. a, Classical SNN structure, comprising a post-neuron driven by input pre-neurons. Reprinted with permission from ref.44. © 2019 Nature Publishing Group. b, SNN flow chart. c, Dynamics of LIF spiking neurons. d, The spike-timing-dependent plasticity (STDP) rule. Abbreviations: LIF, Leaky Integrate-and-Fire; SNNs, spiking neural networks.

Neuron

A multitude of spiking neuron models have been proposed with various degrees of biomimicry, for instance, the highly biomimetic Hodgkin-Huxley (HH) model, which meticulously delineates the variations of voltage and current on the neuronal cell membrane. This model takes into account the passive electrical properties of the membrane and the dependence of sodium-channel and potassium-channel conductances on membrane potential and time48. Despite matching the electrophysiological behavior of biological neurons well, the high computational cost of the HH model makes it unsuitable for large-scale networks49. The Izhikevich model simplifies the HH model: it tracks the membrane potential of the neuron together with a recovery variable that represents the activation of the potassium current and the inactivation of the sodium current, thereby providing negative feedback to the membrane potential50. The Izhikevich model therefore operates at low computational complexity while retaining the various firing modes of biological neurons, such as periodic firing and burst firing. Although these biomimetic neuron models perform well, it is worth noting that not every application requires such high-fidelity but power-hungry biomimicry; the degree of biomimicry and the computational cost must be balanced according to the needs of the specific application. The most commonly adopted SNN neuron model is therefore the Leaky Integrate-and-Fire (LIF) model51, which captures the basic dynamic behavior of neurons in a concise and efficient manner: the membrane potential of the neuron changes with the input current, and once it reaches a threshold the neuron generates a pulse, after which the membrane potential falls back to the resting value. The LIF model has been widely successful in neural network research45,52, 53, 54, 55.
The excellent balance it strikes between biological fidelity and energy efficiency makes it well suited to large-scale neural network applications. The basic membrane-potential dynamics of this model can be described by the following formulas:
$I_{\mathrm{in}}(t) R_{\mathrm{m}}=U_{\mathrm{mem}}(t)+\tau_{\mathrm{m}} \frac{d U_{\mathrm{mem}}(t)}{d t}$
$U_{\mathrm{mem}}(t)=I_{\mathrm{in}}(t) R_{\mathrm{m}}+\left[U_{0}-I_{\mathrm{in}}(t) R_{\mathrm{m}}\right] e^{-\frac{t}{\tau_{\mathrm{m}}}}$
where Iin(t) denotes the input current, Rm is the membrane resistance, Umem(t) represents the membrane potential, τm is the membrane time constant, t is the time, and U0 is the initial membrane potential. Umem(t) accumulates the effect of incoming signals according to the formula above until it reaches a threshold; the neuron then fires a pulse and Umem(t) resets to the resting state (red line in Fig. 1b). Note that Umem(t) also decays continuously in the form of e^(−t/τm) (blue line in Fig. 1b), so the effect of the accumulation depends on the intervals between signals. The dynamics of a single LIF neuron are depicted in Fig. 1c. There is also a simplified variant of LIF, the Integrate-and-Fire (IF) neuron model, in which the temporal decay of Umem(t) is removed, so Umem(t) remains constant when no signal is received. Beyond the LIF model and its IF simplification, there are more complex spike response model (SRM) neurons. After firing a pulse, these neurons enter a brief refractory period, during which they do not respond to any input. Moreover, in a neural network composed of SRM neurons, the activation of one neuron briefly inhibits the activation of its neighboring neurons, a phenomenon known as lateral inhibition. Typically, the refractory period lasts longer than the lateral inhibition, ensuring that a neuron has sufficient recovery time after firing. SRM neurons can thus mimic the adaptation of biological neurons to continuous stimuli, avoid information overload through the refractory period, and support competitive learning through lateral inhibition.
SRM neurons with refractory periods and lateral inhibition are of particular importance for unsupervised local training methods such as spike-timing-dependent plasticity (STDP), giving neurons higher adaptability and flexibility when dealing with complex tasks56. In addition, there are adaptive-threshold neurons that dynamically adjust their threshold according to the input data, such as the Double Exponential Adaptive Threshold neuron. Such neurons can improve the convergence speed and accuracy of the network and enhance its memory capacity by taking both long-term and short-term input information into account. Moreover, research has shown that networks using adaptive-threshold neurons tolerate minor hardware variations well, demonstrating good robustness57.
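The LIF dynamics described above can be sketched in a few lines of code. The following is a minimal illustration of the discretized update rule (with U0 taken as the resting potential); all parameter names and values are illustrative assumptions, not taken from any cited work.

```python
# Minimal LIF neuron sketch: tau_m * dU/dt = -(U - U_rest) + I_in(t) * R_m,
# discretized with a simple Euler step. Parameter values are arbitrary.

def simulate_lif(currents, r_m=1.0, tau_m=10.0, u_rest=0.0,
                 threshold=1.0, dt=1.0):
    """Return (membrane trace, spike train) for a sequence of input currents."""
    u = u_rest
    trace, spikes = [], []
    for i_in in currents:
        # Leaky integration: U decays toward rest while being driven by I_in * R_m
        u += (dt / tau_m) * (-(u - u_rest) + i_in * r_m)
        if u >= threshold:          # threshold crossed: fire a pulse ...
            spikes.append(1)
            u = u_rest              # ... and reset to the resting potential
        else:
            spikes.append(0)
        trace.append(u)
    return trace, spikes

# Constant suprathreshold input eventually drives the neuron to fire;
# zero input produces no spikes, showing the event-triggered behavior.
trace, spikes = simulate_lif([1.5] * 20)
```

Setting `i_in` to zero between pulses lets the decay term pull Umem(t) back toward rest, which is why the accumulation effect depends on the spike intervals, as noted above.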

Synapse

For synapses, the plasticity of their "weight" is the foundation of biological learning and memory58. The essence of neural network training is to adjust the synaptic weights by some method59. In SNNs, the pulse signal is discrete and the activation function is non-differentiable, so the BP algorithm cannot be used directly, as it is in ANNs, to update synaptic weights exactly. The training methods of SNNs can be roughly divided into two types: rule-based and optimization-based.
Rule-based learning methods are inspired by STDP and reward-modulated STDP (R-STDP) in biology. The basic principle of the STDP rule is that the strength of the synaptic connection between a pre-neuron and a post-neuron is strengthened or weakened depending on the order in which they fire. In this way, STDP can realize the self-organization and functional differentiation of neurons, that is, unsupervised learning, as shown in Fig. 1d60,61. Similarly, R-STDP uses STDP to regulate synaptic plasticity, but training proceeds under the guidance of external reward signals; on this basis, goal orientation and behavior selection by the neurons can be achieved, namely weakly supervised learning62,63. These two rule-based learning algorithms are efficient and adaptive. Peter U. Diehl et al. used the STDP rule to train SNNs and achieved 95% accuracy on the Modified National Institute of Standards and Technology database (MNIST) benchmark. Their network contains two layers (as shown in Fig. 2a): an input layer and a processing layer. The processing layer is composed of excitatory and inhibitory neurons; the latter implement lateral inhibition, and an adaptive threshold is introduced into the neurons to prevent a single neuron from dominating the response and thereby ensure differentiation of the neurons' receptive fields64. However, these methods ignore the complex dynamics and task information inside the network, so prediction accuracy tends to decline quickly as the number of layers increases.
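The pair-based STDP rule of Fig. 1d can be summarized in a short function: the sign and magnitude of the weight change depend on the spike-time difference between the post- and pre-neuron. The amplitudes and time constants below are illustrative assumptions, not values from the cited works.

```python
# Pair-based STDP sketch: dw depends on dt = t_post - t_pre.
# A causal pair (pre before post) potentiates; an anti-causal pair depresses.
import math

def stdp_dw(t_pre, t_post, a_plus=0.1, a_minus=0.12,
            tau_plus=20.0, tau_minus=20.0):
    """Weight update for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:    # pre fires before post -> long-term potentiation (LTP)
        return a_plus * math.exp(-dt / tau_plus)
    elif dt < 0:  # post fires before pre -> long-term depression (LTD)
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

assert stdp_dw(10, 15) > 0   # causal pair strengthens the synapse
assert stdp_dw(15, 10) < 0   # anti-causal pair weakens it
```

The exponential decay with |dt| means nearly coincident spikes produce the largest updates, which is what drives the self-organization described above.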
Fig. 2. Training of synaptic weights. a, Unsupervised STDP network structure. Reprinted with permission from ref.64. © 2015 Diehl and Cook. b, Schematic diagram of the structure of the BP algorithm based on STDP. Reprinted with permission from ref.65. © 2019 Elsevier B.V. c, Modification of the CNN structure according to the requirement of SNNs, before (left) and after (right). Reprinted with permission from ref.46. © 2014 Springer Science+Business Media New York. Abbreviations: BP, backpropagation; CNN, convolutional neural network; STDP, spike-timing-dependent plasticity.
Optimization-based methods minimize the error between the output and the expectation mathematically, introducing BP to optimize the network globally during training. Amirhossein Tavanaei et al. proposed a BP algorithm based on STDP, as shown in Fig. 2b. The core idea is to exploit the approximate correspondence between Rectified Linear Unit (ReLU) neurons and IF neurons to transform the traditional BP update rule into an STDP rule based on spike timing. This method retains the accuracy of BP while enjoying the biological plausibility and computational efficiency of STDP65. A threshold-dependent batch normalization method based on STDP backpropagation has also been proposed, achieving direct training of high-performance deep SNNs on ImageNet and efficient inference on neuromorphic hardware66. However, introducing rule-based local training methods (such as STDP) during training preserves biological plausibility but can lead to lower accuracy, limiting the competitiveness and practical application of SNNs. BP based entirely on gradient descent is more likely to achieve high-precision training67-69. Backpropagation through time (BPTT) has become an efficient supervised learning algorithm for training SNNs; it comprises forward propagation, BP, and weight update. Because the neurons of SNNs are dynamic, BP must be performed over the entire time window, propagating from the last time step to the first, the opposite of forward propagation. During BP, the pulse gradient and the membrane-potential gradient must be computed at each time step, and these gradients are used to update the weights of the SNN model to reduce the loss. Current mainstream accelerators still target ANN-family networks, but accelerators for training SNNs with BPTT are also being actively explored70.
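Computing the "pulse gradient" mentioned above runs into the non-differentiability of the hard threshold: its true derivative is zero almost everywhere. BPTT implementations therefore typically substitute a smooth surrogate derivative on the backward pass. The sketch below uses the derivative of a fast sigmoid, one common (assumed, generic) choice; the function names and the sharpness parameter k are illustrative.

```python
# Surrogate-gradient sketch for the spike non-linearity S = Θ(U - θ).

def spike_forward(u, theta=1.0):
    """Forward pass: hard threshold; dS/dU is 0 almost everywhere."""
    return 1.0 if u >= theta else 0.0

def spike_surrogate_grad(u, theta=1.0, k=10.0):
    """Backward pass: derivative of a fast sigmoid centered at the threshold,
    used in place of the true (zero/undefined) derivative during BPTT."""
    x = k * (u - theta)
    return k / (1.0 + abs(x)) ** 2
```

The surrogate is largest for membrane potentials near the threshold and falls off smoothly on either side, so gradient signal flows through neurons that were close to firing, which is what makes end-to-end gradient training of SNNs possible.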
In addition to directly training SNNs, another solution converts ANNs into SNNs (ANN2SNN). For example, Yongqiang Cao et al. proposed a method to convert convolutional neural networks (CNNs) into SNNs, in which the CNN structure is first modified according to the requirements of SNNs, such as eliminating negative inputs and biases, adopting the HalfRect activation function, and using linear subsampling; details are shown in Fig. 2c. After training the modified CNN with the BP algorithm, the trained CNN weights are mapped onto a similar SNN architecture46. In this way, the low power consumption and high efficiency of SNNs can be acquired at almost no cost in accuracy relative to the ANN counterpart71. However, most ANN2SNN methods are based on rate coding, which requires a long time window to accumulate the output pulses of neurons before accurate results are obtained. This inevitably increases the inference time and power consumption of SNNs and thus reduces their efficiency; in other words, the method is not suitable for online learning72. Online learning updates model parameters in real time to adapt to changes in the data distribution, which requires SNNs to be capable of continuous learning.
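The rate-coding assumption behind ANN2SNN, and its latency cost, can be demonstrated with a toy IF neuron: driven by a constant input z, it fires at a rate approximating ReLU(z) (clipped to [0, 1]), so trained ANN activations map onto SNN firing rates, but only after enough time steps T have elapsed. The function name and values are illustrative.

```python
# Rate-coding sketch: an IF neuron's firing rate over T steps approximates
# the ReLU activation of the converted ANN neuron.

def if_rate(z, T=1000, theta=1.0):
    """Firing rate of an IF neuron under constant input z over T steps."""
    u, n_spikes = 0.0, 0
    for _ in range(T):
        u += z                  # integrate the constant input
        if u >= theta:
            n_spikes += 1
            u -= theta          # reset by subtraction keeps the residue
    return n_spikes / T

r_pos = if_rate(0.3)    # close to ReLU(0.3) = 0.3 for large T
r_neg = if_rate(-0.5)   # negative input never crosses threshold: rate 0
```

Shrinking T makes the rate estimate coarser, which is exactly the accuracy/latency trade-off that makes long time windows necessary and online learning impractical for this approach.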
In the current field of machine learning, mainstream platforms such as TensorFlow, PyTorch, and Keras do not provide sufficient support for SNNs. Therefore, frameworks specifically designed for SNNs are needed to build, train, and deploy them more efficiently, and many researchers have explored this in depth. For example, SpikingJelly is a framework developed for the construction, training, and deployment of spike-based intelligent systems73. It provides efficient simulation and acceleration methods for SNNs and is built on PyTorch, so it can exploit the parallel computing capability of GPUs. SpikingJelly supports various spiking neuron and synapse models, from which researchers can choose according to their needs; it also provides a fast ANN-SNN conversion interface, preprocessing of neuromorphic datasets, and SNN analysis functions. snnTorch, likewise built on PyTorch, is a framework dedicated to gradient-based optimization of spiking neural networks74. It provides pre-designed spiking neuron models and is deeply integrated with PyTorch, allowing researchers to develop and train SNNs in a familiar environment; the downside is that, compared with some static-graph frameworks (such as TensorFlow), its training speed is slower. There is also Brian2, an upgraded version of Brian developed in Python75. Brian2 is an equation-oriented simulator that lets researchers define neuron and synapse models directly with mathematical equations, providing great flexibility in building neural networks; it can also automatically convert user-defined models into efficient low-level code, which is very useful for simulating large-scale networks.
All of the aforementioned frameworks provide researchers with more efficient ways to build, train, and deploy SNNs, laying a solid foundation for the future development of SNNs.

MEMRISTOR

Since the memristor theory proposed by Chua was linked to the resistive switching of TiO2 devices by HP Labs in 200876, this field has been widely explored. In addition to various binary oxides77, 78, 79, 80, 81, chalcogenides82,83, two-dimensional materials84,85, organic materials86,87, etc., have been reported in memristor research, and a relatively mature theory of memristors has been developed88. A typical memristor is a two-terminal resistive device (metal-insulator-metal) whose non-linear response exhibits a memory function. It is regarded as the fourth basic circuit element alongside the resistor, capacitor, and inductor, with the advantages of high speed, low power consumption, and high scalability89. In particular, memristors have important application prospects in information storage, logic operations, neural networks, and other fields90. Memristors exhibit different physical mechanisms depending on the choice of dielectric layer and electrodes, which will not be discussed in detail here. Generally, it is accepted that, under an electric field, the migration of electrons, ions, or other physical species induces local structural changes in the material, which in turn change the conductive state91. According to their ability to retain conductance, memristors can be divided into two categories: non-volatile and volatile.

Non-volatile

Taking the most classic resistive switching mechanism (the ionic effect) as an example, devices can be divided into valence change memory92 and electrochemical metallization memory93, in which oxygen anions (from the dielectric layer) or active metal ions (from the electrodes), respectively, migrate under an electric field to eventually form conductive filaments (CFs), as shown in Fig. 3a. The fundamental I-V curve is depicted in Fig. 3b: as the positive voltage increases, the device switches from a high-resistance state (HRS) to a low-resistance state (LRS); as the voltage is swept negative, the device switches back from the LRS to the HRS, and reading is always performed at a low positive voltage94. However, this ion-migration-based mechanism brings some problems: continuous write-erase operations cause endurance failures due to the destructive nature of programming, and the programmed conductance state drifts over time. Therefore, even if the on/off ratio of a device is very large, only a few stable conductance states are actually available. In neural networks, non-volatile memristors are mainly used to emulate synaptic plasticity by adjusting their multi-level conductance states37,95,96, which places higher requirements on the linearity, symmetry, endurance, and retention of the memristor programming curve. To meet these requirements, researchers keep optimizing the relevant performance through doping, interlayer insertion, annealing, and other methods. Interlayer insertion can effectively improve the linearity and symmetry of memristor conductance curves. Zongwei Wang et al. inserted a SiO2 layer that restricts oxygen-ion diffusion at the TiN/TaOx interface (the Scanning Transmission Electron Microscopy (STEM) image is shown in Fig. 4a) to achieve uniform conductance modulation.
This method effectively suppresses the growth/dissolution rate of the CFs, thereby achieving more uniform conductance modulation and improving linearity and symmetry (Fig. 4b-c)97. Lei Wu et al. found that the resistance evolution of Al-doped HfO2 devices is milder than that of undoped devices (Fig. 4d), that multi-value storage can be achieved by changing the compliance current or using pulse sequences, and that the retention of the multi-level conductance states exceeds 10^4 s at 85 °C98. Woo Sik Choi et al. inserted Al2O3 into InGaZnO memristors, achieving higher endurance and more stable transient characteristics. As shown in Fig. 4e, over 500 scanning cycles, the conductance fluctuation of S2 (with the Al2O3 interlayer) is much smaller than that of the unstable S1 (without the interlayer). Moreover, the high resistance of Al2O3 suppresses thermal fluctuations, so S2 shows more stable transient characteristics and smaller resistance-state changes (Fig. 4f)99. Interlayer insertion combined with doping can further improve the endurance and retention of memristor conductance. Yun-Lai Zhu et al. used interface doping (a combination of doping and interlayer insertion) to improve the performance of TiN/HfO2/Pt memristors. They prepared two optimized devices with different insertion configurations, TiN/HfO2/HfO2:Al/Pt and TiN/HfO2:Al/HfO2/Pt. The former showed improved endurance and retention, whereas the latter showed degraded performance, because different insertion positions lead to different CF formation/rupture mechanisms; the impact of the insertion position on CF formation/rupture must therefore be considered101. Annealing in an inert atmosphere can improve the uniformity of the ion or vacancy distribution in the device, further optimizing and stabilizing memristor performance. Yongyue Xiao et al. studied the effect of annealing in a N2 environment on the performance of Pt/HfO2/BiFeO3/HfO2/TiN memristors. After N2 annealing, the uniformity of the memristor (Fig. 4g) and the pulse linearity (Fig. 4i) were improved, and the endurance increased by a factor of 10^3 (Fig. 4h). The conductance modulation of oxide memristors can be attributed to the migration of oxygen vacancies to form CFs; annealing in N2 produces more oxygen vacancies in the oxide, resulting in better control of the memristor conductance100.
Fig. 3. Basic performance diagram of a non-volatile memristor. a, The principle of two different conductive filaments. b, Fundamental I–V curve of non-volatile memristors. Reprinted with permission from ref.94. © 2023 The Author(s).
Fig. 4. Performance optimization for non-volatile memristors. a, STEM image of a TiN/SiO2/TaOx/Pt device. Comparison of conductance curves for no intercalation (b), and intercalation (c). Reprinted with permission from ref.97. © 2016 Royal Society of Chemistry. d, Al-doped and non-doped I–V curves. Reprinted with permission from ref.98. © 2019 The Author(s). e, Measured endurance and f, transient characteristics with or without intercalation. Reprinted with permission from ref.99. © 2022 Elsevier Ltd. g, The cumulative probability distribution of set and reset voltage of both devices (N2 annealing is red). h, Endurance test (N2 annealing is red). i, The reproducible conductance modulation of N2 annealing device under amplitude-repeated potentiation and depression pulses. Reprinted with permission from ref.100. © 2022 Elsevier Ltd and Techna Group S.r.l. Abbreviation: STEM, Scanning Transmission Electron Microscopy.
In neural networks, crossbar arrays of non-volatile memristors are adopted to emulate synaptic network structures. These arrays not only store the weight parameters but also provide computational capability (storage and computation in one), physically implementing the main workload of network computation, vector-matrix multiplication (VMM), via Ohm's law and Kirchhoff's current law (this holds for both ANNs and SNNs). J. Joshua Yang, Qiangfei Xia et al. employed hafnium dioxide memristors to build a reconfigurable crossbar array (128 × 64), as shown in Fig. 5a (left); Fig. 5a (right) shows chips integrated with memristor arrays of different scales together with the related test equipment, as well as photomicrographs of four 1T1R cells. This crossbar array exhibits high yield (99.8%), high stability (no drift after 6.4 h), multiple conductance states (6 bits), and a good linear I-V relationship102. They used this array to demonstrate signal processing, image compression, and convolutional filtering103; further evaluation of the computational accuracy of the VMM produced by the array showed a classification accuracy of 89.9% on the MNIST handwritten digit set104. A network trained in situ with this array even showed competitive classification accuracy on standard datasets105. To scale the array up for more complex tasks, Qi Liu et al. proposed a fully integrated memristor-based computing-in-memory (CIM) chip with an array size of 158.8 kb. The chip innovatively adopts a sign-weighted 2T2R (SW-2T2R) structure to reduce the accumulated source-line current (ISL), which significantly mitigates the adverse effects of IR drop.
The chip also features a low-power interface design, achieving a flexible trade-off between system accuracy and power consumption through a resolution-adjustable LPAR-ADC; the layout of the SW-2T2R and LPAR-ADC units is depicted in Fig. 5b. They then employed this chip to implement a fully integrated multilayer perceptron (MLP) model (784-100-10), which achieved high recognition accuracy (94.4%), high inference speed (77 μs/image), and a peak energy efficiency of 78.4 TOPS/W on the MNIST database, showing the significant advantages of CIM chips based on large-scale memristor arrays106. More complex CNN memristor hardware was also implemented in the same year. Huaqiang Wu et al. adopted eight memristor arrays (2048 cells each) to build a five-layer CNN in hardware; the integrated printed circuit board (PCB) subsystem is shown in Fig. 5c, where each processing element (PE) chip carries 2048 memristors and on-chip decoder circuits. This hardware uses a hybrid training method to adapt to device defects and parallel convolution to eliminate the throughput gap between memristor-based convolution and fully connected (FC) VMM. On the MNIST dataset, the image recognition accuracy was further improved to 96%, and the energy efficiency was two orders of magnitude higher than that of an Nvidia Tesla V100 GPU107,109.
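The analog VMM operation described above can be illustrated in a few lines of code: each cell contributes a current I = G × V (Ohm's law), and the currents along each column sum into one output element (Kirchhoff's current law). The following sketch uses purely illustrative values and plain Python, not any specific chip's interface:

```python
# Analog vector-matrix multiplication on a memristor crossbar:
# each cell obeys Ohm's law (I = G * V) and the column currents sum
# by Kirchhoff's current law. Values are illustrative only.

def crossbar_vmm(voltages, conductances):
    """voltages: input vector (V); conductances: rows x cols matrix (S).
    Returns the column currents (A), i.e. the VMM result."""
    rows, cols = len(conductances), len(conductances[0])
    currents = [0.0] * cols
    for j in range(cols):                    # one output line per column
        for i in range(rows):                # cells along the column
            currents[j] += voltages[i] * conductances[i][j]  # I = G*V, summed
    return currents

v = [0.1, 0.2, 0.3]              # read voltages applied to the rows
g = [[1e-6, 2e-6],               # conductance-encoded weights (siemens)
     [3e-6, 4e-6],
     [5e-6, 6e-6]]
print(crossbar_vmm(v, g))        # column currents encode the matrix product
```

In real arrays the output currents are digitized by ADCs (e.g., the LPAR-ADC above), and negative weights are typically encoded as the difference of two conductances, as in the SW-2T2R scheme.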
Fig. 5. Non-volatile memristor arrays. a, A photo of a probe card touching a 128 × 64 1T1R array. A photo (2 cm × 2 cm) showing two 1T1R memristor crossbar dies with various array sizes and test devices. A photomicrograph of four cells in a 1T1R array (10-μm scale bar) and its structure (from left to right). Reprinted with permission from ref.103. © 2017 Macmillan Publishers Limited, part of Springer Nature. b, A photomicrograph of a CIM chip based on a memristor array (158.8 kb), including the layout of SW-2T2R units and resolution-adjustable LPAR-ADC units. Reprinted with permission from ref.106. © 2020 IEEE. c, A photograph of an integrated PCB subsystem, including an image of part of a PE chip with 2048 memristors and on-chip decoder circuits. Below is a schematic of the 1T1R connections on the PE chip. Reprinted with permission from ref.107. © 2020 The Author(s), under exclusive licence to Springer Nature Limited. d, SEM images of the 1-kb crossbar arrays (CAs). Reprinted with permission from ref.108. © 2024 The Author(s). Abbreviations: CIM, computing-in-memory; PCB, printed circuit board; PE, processing element; SEM, scanning electron microscopy.
It is worth noting that the aforementioned memristor arrays all adopt the 1T1R structure, which is similar to that of dynamic random-access memory. Its operating principle and usage are intuitive, and it effectively isolates the mutual influence of currents between different cells. However, the inclusion of transistors increases the complexity and limits the scalability of the system. Therefore, transistor-less metal oxide memristor crossbar arrays (0T1R) have also been actively explored. M. Prezioso et al. realized a transistor-less crossbar array based on Al2O3/TiO2-x-stacked metal oxide memristors for the first time and used it to implement a simple integrated neural network (a single-layer perceptron); they demonstrated in-situ training with a coarse-grained delta-rule algorithm and performed pattern recognition on 3 × 3 pixel black-and-white images36. F. Merrikh Bayat et al. further demonstrated a fully integrated mixed-signal three-layer neural network in hardware, consisting of two 20 × 20 memristor crossbar arrays (0T1R) integrated with CMOS units in external circuits on a single circuit board110. However, the lack of a selection function for the cells in such arrays limits the network scale and reduces the computational accuracy. Compared with the above solutions, which prevent crosstalk either by adding a transistor to each memristor (1T1R) or by additional operations when no transistor is present, memristors with self-rectifying characteristics avoid crosstalk effectively without adding circuit consumption or complexity, giving them natural advantages in information storage and neural synapse emulation. Kanghyeok Jeon et al. recently integrated self-rectifying memristors into a 1-kb array. The scanning electron microscopy (SEM) image is shown in Fig. 5d.
They adopted this array to achieve full-hardware single-layer neural network training and inference, achieving 100% accuracy on the MNIST dataset. They further found that defects in the array (especially open-circuit defects) can significantly reduce the classification accuracy, because the cells at the defects cannot switch correctly, leading to incorrect array outputs. In contrast, the impact of the read range of non-filamentary memristors on accuracy is not significant. This may be ascribed to the fact that, although non-filamentary (interface-type) memristors have a smaller read range, they exhibit highly consistent operating characteristics and can reliably represent the network weights, so the output does not become unpredictable owing to inconsistent memristor behavior108.
With the advancement of technology, memristor arrays for supervised learning, as discussed above, have been demonstrated many times, and the field is gradually developing in other directions, for example, the in-situ unsupervised learning of self-organizing maps (SOMs). Rui Wang et al. experimentally implemented an in-situ SOM on memristors for the first time, based on Ta/TaOx/Pt 1T1R chips. In an SOM, the Euclidean distance between the input vector and a weight vector measures their similarity: the smaller the Euclidean distance, the more similar the two vectors. During training, for a given input, the most similar neuron (i.e., the one with the smallest Euclidean distance), called the best-matching unit (BMU), is found. Then, according to the neighborhood function, the weights of the BMU and of the neurons in its neighborhood are updated to move them closer to the current input vector. This process is repeated until the network reaches a stable state. By adding an extra row of 1T1R cells, they computed the Euclidean distance directly in hardware, so that the similarity between the input vector and the weight vectors could be evaluated without normalizing the weights. They further applied the memristor-based SOM to data clustering and image processing and solved the traveling salesman problem. The intrinsic physical properties of memristors and the large-scale parallelism of crossbar arrays make the memristor-based SOM advantageous in computation speed, throughput, and energy efficiency111. In addition, memristor arrays are also being applied to non-neural-network fields, such as medical image reconstruction. Han Zhao et al.
proposed and experimentally verified a system called the "memristive image reconstructor (MIR)". Based on memristor arrays, this system performs medical image reconstruction with excellent energy efficiency and computation speed, and shows high robustness to the non-ideal characteristics of memristors. The MIR system employs the discrete Fourier transform (DFT) and, by computing directly on the memristor array, greatly reduces energy-intensive data movement. The paper also proposes a quasi-analog mapping scheme, allowing the MIR system to map DFT matrix entries to memristor conductances more accurately, as well as a complex matrix transmission scheme, with which the real and imaginary parts of the DFT result are obtained directly in one step, improving the transmission efficiency. In magnetic resonance imaging and computed tomography reconstruction tasks, compared with an Nvidia Tesla V100 GPU, the energy efficiency was increased by 112 and 153 times, respectively, and the normalized image reconstruction speed by 36 and 79 times, demonstrating the huge application prospects of memristor arrays in medical image reconstruction112.
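The SOM training step described above (BMU search by Euclidean distance, then a neighborhood-weighted update) can be sketched in software; the in-array distance computation with an extra 1T1R row is a hardware detail not reproduced here, and all parameters are illustrative:

```python
# Software sketch of one SOM training step as described above:
# find the best-matching unit (smallest Euclidean distance), then
# pull the BMU and its neighbours toward the input vector.
import math

def som_step(weights, x, lr=0.5, radius=1.0):
    """weights: list of weight vectors, one per neuron on a 1-D map."""
    # 1. Best-matching unit = neuron with the smallest Euclidean distance.
    dists = [math.dist(w, x) for w in weights]
    bmu = dists.index(min(dists))
    # 2. Gaussian neighbourhood update centred on the BMU.
    for i, w in enumerate(weights):
        h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
        weights[i] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

w = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
bmu = som_step(w, [0.9, 1.1])
print(bmu, w)   # BMU is neuron 1; its weights move toward the input
```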

Volatile

Volatile memristors are increasingly being developed into spiking neurons for neural networks, such as diffusive memristors113 and Mott memristors114. As with non-volatile memristors, volatile memristors still need to be integrated with CMOS devices to achieve more complex neuron functions; however, the introduction of memristors significantly simplifies the neuron circuit. Zhongrui Wang et al. adopted diffusive memristors to successfully emulate the function of LIF neurons. The middle layer of such a memristor is composed of a dielectric material doped with silver (such as SiOxNy or SiOx). In this model, volatile memristors were employed to emulate ion channels: when a signal arrives, the membrane capacitance begins to charge, and if the charge reaches a threshold, the ion channel is activated, triggering a neural firing that discharges the membrane (Fig. 6a). In practical operation, the device fires after a certain number of pulses and resets to its initial state after each firing (Fig. 6b)115.
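The charge–fire–reset behavior described above maps onto the standard LIF abstraction, which can be sketched as follows (a behavioral model with illustrative parameters, not a circuit model of the diffusive memristor):

```python
# Minimal leaky integrate-and-fire (LIF) sketch of the behaviour
# described above: the membrane potential integrates the input with
# leakage, fires on crossing the threshold, and resets afterwards.
# Parameters are illustrative, not fitted to any device.

def lif_run(inputs, leak=0.9, threshold=1.0):
    v, spikes = 0.0, []
    for x in inputs:
        v = leak * v + x          # leaky integration of the input pulse
        if v >= threshold:        # threshold crossing -> fire
            spikes.append(1)
            v = 0.0               # reset to the initial (resting) state
        else:
            spikes.append(0)
    return spikes

print(lif_run([0.4] * 10))   # fires on every third input pulse
```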
Fig. 6. Volatile memristor-based spiking neurons. a, Illustration of an ion channel embedded in the cell membrane near the soma of a biological neuron. b, Experimental plot of the device's response to multiple voltage pulses. Reprinted with permission from ref.115. © 2018 Macmillan Publishers Limited, part of Springer Nature. c, Schematic of habituation in the SNS under repetition of an identical stimulus (left) and the fully memristor-based artificial SNS circuit (right). d, Habituation behavior of the system under pulses of a fixed frequency; the spike emission slows from fast to slow. Reprinted with permission from ref.116. © 2020 Wiley-VCH GmbH. e, Schematic of the switching mechanism of the device under different Icc. f, Typical I–V curves of the device under Icc of 100 μA (left) and 500 μA (right). Abbreviation: SNS, sensory nervous system.
To attenuate the response of LIF neurons to a repeated stimulus, volatile memristors have also been employed to construct LIF neurons that adapt to the environment; such neurons gradually reduce their firing frequency under repeated stimulation by the same signal. Zuheng Wu et al. reported a memristor-based artificial sensory nervous system with habituation characteristics. The system is based on an LIF neuron built with a Mott (Ag/SiO2:Ag/Au) memristor and a LixSiOy memristor synapse (TiN/LixSiOy/Pt), as shown in Fig. 6e (right), and can filter out irrelevant repetitive information and adapt to the external environment. This function mainly arises from the habituation tendency of the synapse under continuous stimulation, as shown in Fig. 6c (left), so that the neuron output exhibits a frequency-drop characteristic (Fig. 6d). The process is illustrated in Fig. 6e. In the initial state, the synapse is in an HRS (i), since the lithium silicate film is in a low-conductivity amorphous state. When a positive voltage is applied under a lower limiting current (100 μA), a CF composed of crystalline lithium silicate forms, switching the device from the HRS to an LRS (ii). Owing to insufficient Joule heat, the filament is difficult to melt; instead, a negative voltage is required to rupture it. At a higher limiting current (500 μA), however, a crystalline filament that increases the conductivity (iii) forms first and then tends to break under the Joule thermal effect, resulting in a decrease in conductivity (iv) and an accompanying decrease in response. The two situations correspond to the I–V curves in Fig. 6f116. This indirect way of adapting to the environment places higher demands on the synapse. Rui Yuan et al.
designed a high-efficiency neuromorphic detection system based on VO2 memristors, which implements adaptability directly in LIF neurons. The system contains two main modules: encoding and information processing. For encoding, they designed an asynchronous encoder that pulse-encodes the increments of analog signals based on the highly symmetrical volatile threshold-switching characteristics of VO2 memristors (Fig. 7a [top]); the encoded pulse sequence has high sparsity and minimal loss of information (Fig. 7a [bottom]). For neuronal signal processing, LIF neurons were constructed from VO2 memristors (Fig. 7b [top right]), and an adaptive version, the adaptive leaky integrate-and-fire (ALIF) neuron, was also designed (Fig. 7b [left]). Under signal stimulation, these ALIF neurons initially fire at a high frequency, which then decreases and stabilizes at a lower level (Fig. 7b [bottom right]). Both LIF and ALIF neurons were ultimately integrated into a long- and short-term memory spiking neural network (LSNN). The detection system based on the asynchronous encoder and the LSNN shows excellent computing power: only a small network is required to achieve quite high accuracy in physiological detection117. However, the adaptive function of this ALIF neuron derives from the peripheral CMOS circuit. In contrast, Haowei Wang et al. realized the adaptive function without changing the LIF neuron circuit structure: using an Ag/Ti/GaSe/Pt/Ti memristor with a threshold-switching function, they built an LIF neuron with adaptive behavior, whose firing frequency under a constant pulse sequence continuously and adaptively decreases to a stable level (Fig. 7c)118. Besides, Jianhui Zhao et al. fabricated a light-sensitive NdNiO3 volatile threshold-switching memristor that responds to variations in light intensity; its I–V curve is shown in Fig. 7d.
Based on this device, an LIF neuron with a light-intensity-dependent firing frequency has been fabricated, without the need for additional light sensors in the circuit119.
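The spike-frequency adaptation realized by these devices can be captured behaviorally by letting each spike raise the firing threshold, which then slowly relaxes; the following is a software sketch with illustrative parameters, not a model of the VO2 or GaSe devices themselves:

```python
# Sketch of spike-frequency adaptation (ALIF-like behaviour): each
# spike raises the firing threshold, which then decays back, so the
# firing rate drops under a constant stimulus. Illustrative parameters.

def alif_run(n_steps, inp=0.5, leak=0.8, th0=1.0, th_jump=0.6, th_decay=0.98):
    v, th, spikes = 0.0, th0, []
    for _ in range(n_steps):
        v = leak * v + inp                  # leaky integration
        th = th0 + (th - th0) * th_decay    # threshold relaxes toward th0
        if v >= th:
            spikes.append(1)
            v = 0.0                         # membrane reset
            th += th_jump                   # adaptation: harder to fire next time
        else:
            spikes.append(0)
    return spikes

s = alif_run(60)
print(sum(s[:30]), sum(s[30:]))   # fewer spikes later: the neuron adapts
```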
Fig. 7. Volatile memristor-based spiking neurons. a, Circuit schematic of memristor-based asynchronous spike encoder. b, Circuit schematic of the VO2 memristor-based ALIF neuron. Reprinted with permission from ref.117. © 2023 The Author(s). c, Spike frequency gradually decreases with time under different inputs. Reprinted with permission from ref.118. © 2023 IEEE. d, I–V characteristics for different light intensities. Reprinted with permission from ref.119. © 2023 Royal Society of Chemistry. Abbreviation: ALIF, Adaptive Leaky Integrate-and-Fire.
Apart from serving as neuron components, volatile memristors are also widely used as selectors in synaptic arrays to suppress sneak currents. For example, Moonkyu Song et al. inserted a single layer of defective graphene between the Ag electrode and the Al2O3 interlayer; the defect pores in the graphene restrict the migration of Ag cations and the size of the Ag CFs, which gives the selector self-compliance without additional CMOS devices120. Moreover, Qilin Hua et al. prepared threshold-switching selectors with high selectivity (greater than 10⁸), high conduction current (greater than 100 μA), and high endurance (10⁸ cycles) through rapid thermal processing; the selector maintains stable switching even at 200 °C121.
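The benefit of selector non-linearity can be illustrated numerically. In a passive array, half-selected cells along sneak paths see only a fraction (roughly one third, in the worst-case three-cell series path) of the read voltage, so a cell whose current grows exponentially with voltage conducts disproportionately little there. The device parameters below are hypothetical, chosen only to show the effect:

```python
# Why selector non-linearity suppresses sneak currents: compare a
# linear (ohmic) cell with an exponential (sinh-type) cell at the full
# read voltage V versus the ~V/3 seen by half-selected cells on the
# worst-case sneak path. All parameters are hypothetical.
import math

V = 0.6  # read voltage (V)

def i_linear(v, g=1e-6):
    return g * v                       # ohmic cell: I = G*V

def i_selector(v, i0=1e-9, v0=0.05):
    return i0 * math.sinh(v / v0)      # strongly non-linear cell

for name, dev in [("linear", i_linear), ("non-linear", i_selector)]:
    ratio = dev(V) / dev(V / 3)        # selected vs half-selected current
    print(f"{name:10s} I(V)/I(V/3) = {ratio:.1f}")
```

The linear cell gives a ratio of only 3, while the exponential cell suppresses half-selected currents by orders of magnitude, which is the selectivity figure quoted for selector devices.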

ALGORITHMIC OPTIMIZATION BASED ON MEMRISTORS

Constrained by the inherent characteristics of memristors and by current fabrication technology, the room for optimizing device performance through process improvements is limited. However, the application potential of memristors can still be fully explored by directly adopting non-ideal memristors or by designing algorithms and network structures tailored to their non-ideal characteristics.
Memristors with good symmetry and linear conductance modulation can improve the accuracy of ANN emulation122, but the conductance curves of real memristor devices often fail to meet these requirements. Some researchers have found that SNNs have a strong tolerance for device non-linearity (Fig. 8a). Algorithm simulations show that as long as the memristor has symmetrical long-term potentiation (LTP) and long-term depression (LTD) curves, or positive LTP and LTD non-linearity factors (positive and negative non-linearity factors are illustrated in Fig. 8b), the network maintains high accuracy; actually manufactured memristors can meet the requirement of positive LTP and LTD non-linearity factors. This means that SNNs can directly use asymmetric and non-linear memristors as synapses123. On the other hand, the synapses in an SNN often carry a large number of distinct weight values, whereas a memristor offers only a limited number of conductance states. Even the currently most outstanding memristor achieves 2¹¹ (2048) conductance states only with high-precision tuning124, and the complexity of this tuning and the post-programming reliability of the device cannot be ignored. Nitin Rathi et al. proposed a sparse-SNN-topology algorithm that not only reduces the network size but also adapts to the limited conductance states of memristors; its schematic is shown in Fig. 8c. The sparsity comes from pruning with a weight-dependent STDP model, which retains or deletes synaptic connections according to the spike correlation between pre- and post-synaptic neurons: connections with stronger weights are retained, and the others are pruned. To adapt to the limited conductance states of the memristor, the synaptic weights of the retained connections are quantized to the available conductance states. During training, the network is pruned and quantized at every training epoch to achieve a sparse SNN topology.
Training results show that the sparse SNN topology achieves the same classification accuracy as the FC SNN topology with lower energy consumption125. For non-volatile memristors, the HRS and LRS (binary states) are considered the most stable and least stochastic states; non-volatile memristors have therefore demonstrated the potential to develop into commercial memory126. To take full advantage of the excellent performance of memristors while avoiding the challenge of resistance-state instability in neuromorphic hardware, two-state non-volatile memristors are attractive options for implementing neural network hardware. Binary neural networks (BNNs), obtained by quantizing the weights of DNNs, have demonstrated great application potential127-132. In BNNs, network weights and neuron activations are limited to −1 or 1, which greatly reduces the number of required conductance states and is compatible with non-volatile memristors. In addition, by using bit operations (such as XNOR) instead of arithmetic operations, BNNs can significantly improve algorithmic efficiency. The BNN concept has also been introduced to develop hardware-friendly binary SNNs (BSNNs). During training, BSNNs use surrogate gradient methods to replace the derivative of the spiking activation and binarize the weights; the inputs/outputs and neuron activation values of the SNN are then all binary. This simplification significantly reduces the requirements on memristor performance and gives memristor hardware great practical potential for bit-operation inference, which is conducive to the integration of large-scale neuromorphic hardware. After training, BSNNs showed excellent recognition accuracy on the dynamic image datasets N-MNIST and DVS-CIFAR10, the dynamic gesture dataset DvsGesture, and the dynamic audio dataset N-TIDIGITS18133,134.
In the implementation of BSNNs, a memristor synapse array that performs XNOR operations can serve as the computing core; this method of adopting logical operations (such as XNOR) has been successfully demonstrated in many studies135,136. However, the binarization of weights sacrifices the precision of a single synapse, which inevitably increases both the network scale and the energy consumption.
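The XNOR replacement for multiply-accumulate works because, with −1 encoded as bit 0 and +1 as bit 1, the product of two {−1, +1} values is +1 exactly when the bits agree. A dot product then reduces to XNOR plus a popcount, as sketched below:

```python
# Binary dot product via XNOR + popcount, as used in BNN/BSNN
# inference: encode -1 as bit 0 and +1 as bit 1; the product of two
# {-1,+1} values is +1 exactly when the bits agree (XNOR).

def binary_dot(a_bits, w_bits, n):
    """a_bits, w_bits: n-bit integers encoding {-1,+1} vectors."""
    agree = bin(~(a_bits ^ w_bits) & ((1 << n) - 1)).count("1")  # XNOR + popcount
    return 2 * agree - n        # +1 per agreement, -1 per disagreement

# Check against the ordinary arithmetic dot product.
a = [+1, -1, +1, +1]
w = [-1, -1, +1, -1]
to_bits = lambda v: sum((x > 0) << i for i, x in enumerate(v))
print(binary_dot(to_bits(a), to_bits(w), len(a)))   # equals sum(ai*wi)
```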
Fig. 8. Memristor-based algorithm optimization. a, The final accuracy for 121 different cases of LTP and LTD. Reprinted with permission from ref.123. © 2021 Authors. S.A. b, The weight-update curves (different non-linearity factors). c, SNN topology with lateral inhibition. Reprinted with permission from ref.125. © 2019 IEEE. d, Diagram of the STELLAR architecture used in the memristor chip. Reprinted with permission from ref.137. © 2023 American Association for the Advancement of Science. e, Flowchart of the hybrid-training method (left) and diagram of the demonstration with hybrid training (right). Reprinted with permission from ref.107. © 2020 The Author(s), under exclusive licence to Springer Nature Limited. Abbreviations: LTD, long-term depression; LTP, long-term potentiation; SNNs, spike neural networks.
In addition to the limited number of conductance states, many other non-ideal characteristics of memristors can be mitigated by designing specific network topologies that reduce their impact on hardware networks. Wenbin Zhang et al. proposed the STELLAR architecture, which adapts to device non-idealities, and demonstrated it on a fully integrated memristor chip. While preserving learning capability, this architecture achieves about 75 times higher energy efficiency than digital accelerators, and it is designed as a universal method suitable for different neural network structures. The algorithm does not use a write-verification scheme but tunes the memristors without verification, letting the hardware adapt to device non-linearity and asymmetry. This is possible because the computational modules on the chip do not perform precise weight calculations; they only determine the update direction for the weight memristor crossbar array. For a two-layer network, the update direction depends only on the signs of the first-layer output, the second-layer output, and the error. Once the direction is determined, an identical SET or RESET voltage pulse is applied to the memristor cell. A configurable threshold is also introduced to adapt to different learning tasks: it filters out small error values, avoids overly frequent weight updates and oscillatory network states, and improves the convergence of the algorithm. This process is depicted in Fig. 8d. During the forward-inference phase, the vector–matrix multiplication is performed on the memristor crossbar array to obtain the output vectors of the network, where X denotes the input vector, Y1 and Y2 are the output vectors of the first and second layers, and W1 and W2 represent the conductance weight matrices of the first and second layers, respectively.
In the weight-update phase, the error vector E is first calculated from the target vector T and Y2, and the error sign SE is extracted subject to the configurable threshold. Combined with the sign vector SY1 of the first-layer output and the sign vector SY2 of the second-layer output, the weight update ΔW2 is applied in a row-parallel manner using cyclic parallel conductance tuning137. Various non-ideal characteristics of memristors can also be circumvented through hybrid training algorithms. As shown in Fig. 8e, the model is first trained in software (ex situ) to obtain high-precision weights, which are then quantized and mapped onto the memristor crossbar array. This step incurs accuracy loss, and the non-ideal characteristics of the memristors, such as randomness and drift, introduce mapping errors. Finally, to compensate for the errors introduced by quantization and device non-idealities, the last FC layer is trained in situ on part of the data: the memristor-based neural processing unit computes the output, which is compared with the label to obtain an error signal; this signal is fed into the error-BP module to generate, together with the learning rate, the pulse sequences used to adjust the memristor conductances of the FC layer until the system reaches the preset accuracy or number of iterations107. When mapping network weights onto large memristor arrays, latency and energy consumption must also be considered, since shared buses can become latency and energy bottlenecks in large neuromorphic hardware. The SpiNeMap algorithm138, which maps SNNs onto crossbar-based neuromorphic hardware, minimizes peak latency and energy consumption: compared with the best-performing SNN-mapping technique, SpiNeMap reduces the average energy consumption by 45% and the peak latency by 21%.
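The verify-free, sign-only update rule of the STELLAR scheme described above can be sketched for a single layer: only the signs of the inputs and of the thresholded error decide the direction of a fixed-amplitude conductance step (standing in for one SET or RESET pulse). This is a simplified software illustration, not the chip's actual circuit or its full two-layer rule:

```python
# Sketch of a sign-based, verify-free weight update: the update
# direction for each cell is sign(error) * sign(input), errors below a
# configurable threshold are ignored (suppressing oscillation), and a
# fixed conductance step stands in for one SET/RESET pulse.

def sign(x):
    return (x > 0) - (x < 0)

def sign_update(W, x, target, step=0.05, threshold=0.06):
    y = [sum(wi * xi for wi, xi in zip(row, x)) for row in W]  # crossbar VMM
    for j, (yj, tj) in enumerate(zip(y, target)):
        e = tj - yj
        if abs(e) < threshold:      # small errors filtered out
            continue
        for i in range(len(x)):
            # one fixed-amplitude pulse in the direction sign(e)*sign(x_i)
            W[j][i] += step * sign(e) * sign(x[i])
    return W

W = [[0.0, 0.0]]
for _ in range(20):                 # repeated pulses drive y toward the target
    sign_update(W, x=[1.0, -1.0], target=[0.5])
print(W)                            # W.x has converged near the target
```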
Some energy-saving reconfigurable architectures have also been proposed, such as RESPARC for deep-SNN memristor arrays. As shown in Fig. 9a, energy consumption and speedup were normalized for CNN and MLP topologies on three datasets (MNIST, SVHN, and CIFAR-10): for CNNs, the average energy consumption is reduced by 12 times with a 60-fold speedup, and for MLPs, the average energy consumption is reduced by 513 times with a 382-fold speedup139. Such computing architectures map weights onto memristor arrays with non-ideal characteristics more effectively, improving efficiency and practicality. In-situ learning algorithms, however, can avoid weight mapping altogether and also improve the network's tolerance to non-idealities. For example, Can Li et al. built an in-situ-learning multilayer neural network based on hafnium oxide memristor arrays (1T1R), which proves this point. They first applied a positive pulse to the bottom electrode of the memristor to initialize the array, and subsequently applied synchronized positive-voltage pulses to the top electrode of the memristor and the gate of the series transistor to change the conductance state. Since the limiting current of the memristor is controlled by the transistor gate voltage, this conductance-tuning scheme achieves linear and symmetric conductance modulation while reducing cycle-to-cycle and device-to-device variation. This in-situ-learning hardware network is trained by stochastic gradient descent and achieves a classification accuracy of 91.71% on the MNIST dataset, close to the defect-free simulation result.
Importantly, although the training algorithm assumes that all devices are pulse-programmable, 11% of the devices in the memristor array were actually unresponsive to pulses (the typical defect ratio observed in the experiment), which fully demonstrates that in-situ learning can adapt to hardware defects. The specific results are shown in Fig. 9b and c. Fig. 9b shows that the in-situ training process adapts to defects and provides higher defect tolerance than networks loaded with offline-trained weights: even with 50% of the devices defective, the network still achieves an accuracy of 60%. Fig. 9c compares the defect tolerance of in-situ-trained multilayer and single-layer networks, showing that multilayer networks are more conducive to defect tolerance. This may be because, as the network scale increases, in-situ training provides better adaptability to reduce the accuracy drop caused by defective cells105.
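The defect tolerance of in-situ training can be reproduced in a toy setting: freeze a fraction of the weights (emulating pulse-unresponsive cells) during gradient descent, and the remaining weights still drive the error down. The data and defect pattern below are synthetic and purely illustrative:

```python
# Toy illustration of in-situ training adapting to stuck devices:
# a mask freezes a fraction of the weights (pulse-unresponsive cells),
# yet gradient descent on the remaining weights still reduces the
# error, because the responsive cells partially compensate.
import random

random.seed(0)
n = 20
W_true = [1.0] * n                                 # synthetic target weights
mask = [0 if i % 5 == 0 else 1 for i in range(n)]  # 20% stuck cells

batch = []                                         # synthetic training data
for _ in range(50):
    x = [random.gauss(0, 1) for _ in range(n)]
    batch.append((x, sum(wi * xi for wi, xi in zip(W_true, x))))

def loss_and_grad(W):
    L, g = 0.0, [0.0] * n
    for x, t in batch:
        e = sum(wi * xi for wi, xi in zip(W, x)) - t
        L += e * e
        for i in range(n):
            g[i] += 2 * e * x[i]
    m = len(batch)
    return L / m, [gi / m for gi in g]

W = [0.0] * n
loss0, _ = loss_and_grad(W)
for _ in range(300):
    _, g = loss_and_grad(W)
    for i in range(n):
        W[i] -= 0.05 * g[i] * mask[i]              # stuck cells never update
loss1, _ = loss_and_grad(W)
print(loss0, "->", loss1)                          # loss drops despite defects
```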
Fig. 9. Optimization of memristor-based algorithm performance diagram. a, Comparison of energy savings and performance speedups obtained per classification for RESPARC over the CMOS baseline for various SNNs applications with CNN and MLP topologies. Reprinted with permission from ref.139. © 2017 ACM. b, The impact of non-responsive devices on the inference accuracy with in situ and ex situ training approaches. c, Comparison of defect tolerance between the multilayer network with and the single-layer network in situ training. Reprinted with permission from ref.105. © 2018 The Author(s). d, Performance improvement (accuracy and AUROC) brought by weight noise injection (WNI) under different conductance ranges and variation coefficients. Reprinted with permission from ref.140. © 2021 IEEE. Abbreviations: AUROC, area under the receiver operating characteristic curve; CMOS, complementary metal–oxide semiconductor; CNN, convolutional neural network; MLP, multilayer perceptron; SNNs, spike neural networks.
There are other methods to alleviate the performance degradation that the non-ideal characteristics of memristors bring to a network. One example is weight noise injection (WNI), which simulates the random fluctuations generated by memristors during conductance switching by adding noise, drawn from the measured device distribution, to the weights during training. In this way, the network adapts to the non-ideal characteristics of memristors during training and maintains high performance when mapped onto hardware. Fig. 9d shows the performance improvement (accuracy and area-under-the-receiver-operating-characteristic-curve values) of the WNI method on the MNIST and performance measurement system (PeMS) data under different conductance ranges and variation coefficients. When the variation on the MNIST data is 40% and the normalized distance between high and low conductance is 0.5, the network accuracy is improved by 10%. The improvement on the PeMS data is less significant (3.3%), which could be attributed to the limited room for improvement140. This non-ideal randomness can also be exploited. For example, Parami Wijesinghe et al. combined stochastic memristors with other circuit elements (such as capacitors and transistors) to construct spiking neurons with a stochastic activation function; combining these with a memristor array to form a circuit unit and stacking multiple such units yields a deep stochastic SNN. To support the stochastic activation function provided by the memristor, they proposed a supporting algorithm that determines the voltages and currents of each layer from its input and output signals, ensuring that every memristor neuron operates correctly while minimizing energy consumption.
The final results show that the image-classification accuracy of the all-memristor deep stochastic SNN is equivalent to that of an ANN while consuming about 6.4 times less energy than CMOS hardware; moreover, the introduction of such stochastic neurons improves the robustness and generalization ability of the neural network141.
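The WNI idea can be sketched as follows: during every training pass, each weight is perturbed by multiplicative Gaussian noise emulating conductance variation, so the learned solution remains accurate under fresh perturbations at deployment. A toy single-output example with illustrative parameters:

```python
# Sketch of weight-noise injection (WNI): during training, each weight
# is perturbed by multiplicative Gaussian noise emulating memristor
# conductance variation, so the learned weights tolerate the same kind
# of variation at inference time. Toy single-output illustration.
import random

random.seed(1)

def noisy(W, cv):
    """Perturb each weight with coefficient of variation `cv`."""
    return [w * (1 + random.gauss(0, cv)) for w in W]

def forward(W, x):
    return sum(wi * xi for wi, xi in zip(W, x))

x, t, W, cv = [1.0, 2.0], 3.0, [0.0, 0.0], 0.3
for _ in range(500):
    e = forward(noisy(W, cv), x) - t        # variation seen during training
    W = [wi - 0.02 * e * xi for wi, xi in zip(W, x)]

# At "deployment", evaluate under fresh noise draws.
errs = [abs(forward(noisy(W, cv), x) - t) for _ in range(200)]
print(sum(errs) / len(errs))                # modest average error despite variation
```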

CHALLENGE

Although SNNs have been theoretically demonstrated to possess computational capabilities equivalent to Turing machines142, their training remains challenging: the activation function is non-differentiable, which makes the BP algorithm inapplicable. Rule-based training methods such as STDP are applicable, but training a deep SNN with them remains a formidable challenge; in many reported STDP-based spiking networks, training is confined to a single layer143,144. In contrast, various optimization-based methods have enabled the use of BP for training SNNs, yielding commendable results. However, BP is not only biologically implausible but also requires extensive adjustment and optimization, which is not always feasible or efficient in practical applications and hardware implementations. Therefore, more effort is required to develop efficient and practical training methods that better exploit the potential of deep SNNs. This includes devising new learning rules and optimization strategies, as well as exploring how to integrate SNNs with other types of neural networks (such as CNNs or recurrent neural networks) to facilitate the application of SNNs in a broader range of tasks145.
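The surrogate-gradient workaround mentioned above keeps the hard threshold in the forward pass but substitutes a smooth function's derivative in the backward pass. Below is a minimal single-neuron sketch; the fast-sigmoid surrogate used here is one common choice, not tied to any specific work cited above:

```python
# Minimal surrogate-gradient sketch: the spike function (Heaviside of
# membrane potential minus threshold) has derivative 0 almost
# everywhere, so backpropagation substitutes a smooth surrogate
# derivative (here that of a fast sigmoid) in the backward pass.

def spike(v, threshold=1.0):
    return 1.0 if v >= threshold else 0.0          # forward: hard threshold

def surrogate_grad(v, threshold=1.0, slope=2.0):
    # d(spike)/dv replaced by a fast-sigmoid derivative, peaked at v = threshold
    u = v - threshold
    return 1.0 / (1.0 + slope * abs(u)) ** 2

# A few gradient steps through a single "neuron" y = spike(w*x):
w, x, target, lr = 0.4, 1.0, 1.0, 0.5
for _ in range(10):
    v = w * x
    y = spike(v)
    # chain rule with the surrogate in place of the true derivative
    grad_w = (y - target) * surrogate_grad(v) * x
    w -= lr * grad_w
print(w, spike(w * x))   # w has crossed the threshold; the neuron fires
```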
From the perspective of network accuracy, even the most advanced SNN training methods currently available (for example, variants of BP adapted to SNNs) cannot yet match the accuracy of ANNs. This can be attributed to the information loss that occurs when converting information into spikes within a finite time window, so SNNs are usually at a disadvantage in tasks designed for ANNs. Moreover, appropriate benchmark data for evaluating SNNs are still lacking. Ideally, SNNs should be evaluated on event-based datasets, where they can surpass ANNs by exploiting the spatiotemporal information encoded in event streams146. For example, in autonomous driving or robot-vision systems, event cameras provide visual inputs with high temporal resolution and high dynamic range, and SNNs can efficiently process these inputs to perform a variety of tasks that are difficult for traditional ANNs, such as feature detection, optical-flow calculation, scene reconstruction, image segmentation and object recognition. In addition, since event cameras only produce output when the brightness changes, their power consumption is much lower than that of traditional cameras, making them well suited to applications that must run for a long time or on battery power147. Finally, even when the same dataset is used, differences in frameworks148,149 and hardware platforms make it difficult to compare the pros and cons of different systems. Benchmark testing methods have been proposed33 to evaluate the quality of different networks and hardware systems, but further refinement is still needed.
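The information loss caused by squeezing analog values into spikes within a finite time window can be made concrete with a small experiment. Poisson-style rate coding is assumed here purely for illustration; other encodings trade off differently, but all shorten-window schemes quantize:

```python
import numpy as np

rng = np.random.default_rng(42)

def rate_encode(values, t_steps):
    """Poisson-style rate coding: intensities in [0, 1] become spike
    trains of length t_steps; each time step fires with probability
    equal to the intensity."""
    return (rng.random((t_steps, values.size)) < values).astype(float)

def decode(spikes):
    """Recover an intensity estimate as the mean firing rate."""
    return spikes.mean(axis=0)

values = np.linspace(0.0, 1.0, 11)   # stand-in for pixel intensities
errs = {}
for t in (10, 100, 1000):
    errs[t] = np.abs(decode(rate_encode(values, t)) - values).mean()
    print(f"T={t:4d} steps -> mean reconstruction error {errs[t]:.3f}")
```

The reconstruction error shrinks as the time window grows, so short windows (needed for low latency and low energy) inevitably discard information; event-based inputs side-step this by being spike-native in the first place.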
At the hardware-integration level, large-scale passive crossbar memristor synapse arrays face two major obstacles: sneak currents and IR drop. Sneak currents reduce the reading accuracy of the array, while the IR-drop problem arises because, when the wires are scaled down to the nanoscale, a larger writing voltage is needed to rewrite the conductance state of a device. Such a larger writing voltage brings many restrictions; for example, the V/2 or V/3 writing scheme prohibits a significant increase in the writing voltage89. The sneak-current problem can be mitigated by increasing the nonlinearity of the device unit itself or by integrating the memristor with a strongly non-linear element (such as a transistor)150,151. The IR-drop problem can be attenuated by adopting higher-resistance memristors to reduce the voltage drop resulting from the reduction of the electrode size (this limits the reading speed, so it must be weighed comprehensively). However, these remedies significantly increase cost and complexity. As alternatives, two-terminal metal-insulator-metal (MIM) structures, threshold selectors152,153 and memristors with self-rectifying effects deserve more attention154. At the same time, new design strategies and architectures need to be explored continuously to overcome the challenges in designing large-scale passive crossbar memristor synapse arrays.
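The scaling of the IR-drop problem can be illustrated with a crude back-of-the-envelope model; all resistance and voltage values below are assumptions chosen for illustration, not measured data:

```python
def worst_case_ir_drop(n_cells, r_wire, r_device, v_applied):
    """Rough estimate of the voltage reaching the last cell of a
    crossbar row when every cell is in its low-resistance state.
    Each wire segment (resistance r_wire) carries the summed current
    of all cells fed through it; cell currents are approximated with
    the full applied voltage, so the drop is slightly overestimated."""
    i_cell = v_applied / r_device          # per-cell current estimate
    v = v_applied
    for seg in range(n_cells):
        downstream = n_cells - seg         # cells supplied through this segment
        v -= downstream * i_cell * r_wire  # IR drop across one segment
    return v

# 1 ohm per wire segment, 1 Mohm devices (a deliberately high-resistance
# memristor, as suggested in the text); all values are illustrative.
for n in (32, 128, 512):
    v_last = worst_case_ir_drop(n, r_wire=1.0, r_device=1e6, v_applied=1.0)
    print(f"{n:4d}-cell row: last cell sees {v_last:.4f} V of 1.0 V")
```

Because the accumulated drop grows roughly with the square of the row length, it is negligible for small arrays but severe at scale; with 100 kOhm devices instead, this simple model already predicts that a 512-cell row loses more than the entire applied voltage along the wire, showing why higher-resistance memristors (despite the read-speed penalty noted above) relieve IR drop.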
At the device level, there are discrepancies between real and ideal devices, and among real devices themselves, all of which inevitably affect the network155. Among them, cycle-to-cycle variation within the same storage unit and device-to-device variation among storage units on the same chip are the biggest challenges limiting the development of this field156, although they also bring opportunities, such as exploiting the randomness for encryption157. Cycle-to-cycle variation is currently ascribed to the stochastic nature of the conductive-filament (CF) switching mechanism, so memristors based on non-CF switching mechanisms deserve more attention in the future. Device-to-device variation on the same chip, besides this intrinsic randomness, may also be ascribed to non-uniformity introduced during manufacturing, which calls for new processes and fabrication methods to reduce on-chip variation158.
In short, memristors promise broad application prospects in neuromorphic computing. They can emulate neurons and synapses and be assembled into complex neural network structures to achieve advanced functions such as in-situ learning and pattern classification. However, both the algorithms and the memristor hardware face their own challenges, and optimizing only one aspect cannot bring SNNs onto hardware or fully exploit their potential. Therefore, the development of customized algorithms that maximize performance based on the inherent attributes of memristors is highly desired. Only in this way can the unique properties of memristors be fully utilized on the basis of a deep understanding, thereby building more efficient and powerful neuromorphic computing systems.

CONCLUSION

As the latest generation of neural networks, SNNs work in a pulse-based, event-triggered operating mode that ensures their low energy consumption. Hardware implementation is undoubtedly one of the main pathways towards lower-energy SNNs, and it involves not only superior device structures but also well-matched algorithm models. Based on the current progress of neural networks and memristor devices, this review has analysed algorithm optimization for memristors in detail and summarized the challenges at both the algorithm and hardware levels. This key area involves, on the one hand, determining the performance requirements of memristors through algorithm simulation and, on the other hand, designing dedicated network topologies for memristors in turn. Algorithm development based on the inherent attributes of memristors will promote the further development of SNNs and achieve a higher level of performance. This is a challenging but promising field: SNNs are expected to enable more biologically plausible and efficient neuromorphic systems, thereby promoting cutting-edge research in neuroscience and computation.

MISCELLANEA

Declaration of competing interest The authors declare no competing interests.
1.
Bullmore E. & Sporns O. The economy of brain network organization. Nat. Rev. Neurosci. 13, 336-349 (2012). https://doi.org/10.1038/nrn3214.

2.
Cox D. D. & Dean T. Neural networks and neuroscience-inspired computer vision. Curr. Biol. 24, R921-R929 (2014). https://doi.org/10.1016/j.cub.2014.08.026.

3.
Leiserson C. E. et al. There's plenty of room at the top: what will drive computer performance after Moore's law? Science 368, eaam9744 (2020). https://doi.org/10.1126/science.aam9744.

4.
Alom M. Z. et al. A state-of-the-art survey on deep learning theory and architectures. Electronics 8, 292 (2019). https://doi.org/10.3390/electronics8030292.

5.
LeCun Y. et al. Backpropagation applied to handwritten zip code recognition. Neural. Comput. 1, 541-551 (1989). https://doi.org/10.1162/neco.1989.1.4.541.

6.
Krizhevsky A., Sutskever I. & Hinton G. E. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84-90 (2017). https://doi.org/10.1145/3065386.

7.
Mahardi, Wang I.-H., Lee K.-C. & Chang S.-L. Images classification of dogs and cats using fine-tuned VGG models. In 2020 IEEE Eurasia Conference on IOT, Communication and Engineering (ECICE), 230-233 (IEEE, 2020). https://doi.org/10.1109/ECICE50847.2020.9301918.

8.
He K., Zhang X., Ren S. & Sun J.Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR),770-778 (IEEE, 2016). https://doi.org/10.1109/CVPR.2016.90.

9.
Goldberg Y. Neural network methods for natural language processing. (Springer, 2017). https://doi.org/10.1007/978-3-031-02165-7.

10.
Hu B., Lu Z., Li H. & Chen Q. Convolutional neural network architectures for matching natural language sentences. Preprint at https://doi.org/10.48550/arXiv.1503.03244 (2015).

11.
Hinton G. et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29, 82-97 (2012). https://doi.org/10.1109/msp.2012.2205597.

12.
Wu T. et al. A brief overview of ChatGPT: the history, status quo and potential future development. IEEE/CAA J. Autom. Sin. 10, 1122-1136 (2023). https://doi.org/10.1109/jas.2023.123618.

13.
Video generation models as world simulators. OpenAI. Accessed February 15, 2024.

14.
Patterson D. et al. Carbon emissions and large neural network training. Preprint at https://doi.org/10.48550/arXiv.2104.10350 (2021).

15.
Horowitz M. 1.1 Computing's energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 10-14 (IEEE, 2014). https://doi.org/10.1109/ISSCC.2014.6757323.

16.
Yan Z., Zhou J. & Wong W.-F. Energy efficient ECG classification with spiking neural network. Biomed. Signal Process. Control 63, 102170 (2021). https://doi.org/10.1016/j.bspc.2020.102170.

17.
Sadovsky E., Jarina R. & Orjesek R. Image recognition using spiking neural networks. In 2021 31st International Conference Radioelektronika (RADIOELEKTRONIKA), 1-5 (IEEE, 2021). https://doi.org/10.1109/RADIOELEKTRONIKA52220.2021.9420192.

18.
Martinelli F., Dellaferrera G., Mainar P. & Cernak M. Spiking neural networks trained with backpropagation for low power neuromorphic implementation of voice activity detection. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8544-8548 (IEEE, 2020). https://doi.org/10.1109/ICASSP40776.2020.9053412.

19.
Foderaro G., Henriquez C. & Ferrari S. Indirect training of a spiking neural network for flight control via spike-timing-dependent synaptic plasticity. In 49th IEEE Conference on Decision and Control (CDC), 911-917 (IEEE, 2010). https://doi.org/10.1109/CDC.2010.5717260.

20.
Vanhoucke V., Senior A. & Mao M. Z. Improving the speed of neural networks on CPUs. In Deep Learning and Unsupervised Feature Learning Workshop (NIPS), 1-8 (MIT Press, 2011). https://andrewsenior.com/papers/VanhouckeNIPS11.pdf.

21.
Wang L. et al. Superneurons: dynamic GPU memory management for training deep neural networks. ACM Sigplan Not. 53, 41-53 (2018). https://doi.org/10.1145/3200691.3178491.

22.
Mittal S. A survey of FPGA-based accelerators for convolutional neural networks. Neural Comput. Appl. 32, 1109-1139 (2020). https://doi.org/10.1007/s00521-018-3761-1.

23.
Nurvitadhi E. et al. Accelerating recurrent neural networks in analytics servers: comparison of FPGA, CPU, GPU, and ASIC. In 2016 26th International Conference on Field Programmable Logic and Applications (FPL), 1-4 (IEEE, 2016). https://doi.org/10.1109/FPL.2016.7577314.

24.
Ju X., Fang B., Yan R., Xu X. & Tang H. An FPGA implementation of deep spiking neural networks for low-power and fast classification. Neural Comput. 32, 182-204 (2020). https://doi.org/10.1162/neco_a_01245.

25.
Maguire L. P. et al. Challenges for large-scale implementations of spiking neural networks on FPGAs. Neurocomputing 71, 13-29 (2007). https://doi.org/10.1016/j.neucom.2006.11.029.

26.
Merolla P. A. et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345, 668-673 (2014). https://doi.org/10.1126/science.1254642.

27.
Govoreanu B. et al. 10×10 nm2 Hf/HfOx crossbar resistive RAM with excellent performance, reliability and low-energy operation. In 2011 International Electron Devices Meeting, 31.6.1-31.6.4 (IEEE, 2011). https://doi.org/10.1109/IEDM.2011.6131652.

28.
Choi B. J. et al. High-speed and low-energy nitride memristors. Adv. Funct. Mater. 26, 5290-5296 (2016). https://doi.org/10.1002/adfm.201600680.

29.
Zhou J. et al. Very low-programming-current RRAM with self-rectifying characteristics. IEEE Electron Device Lett. 37, 404-407 (2016). https://doi.org/10.1109/led.2016.2530942.

30.
Lee M.-J. et al. A fast, high-endurance and scalable non-volatile memory device made from asymmetric Ta2O5-x/TaO2-x bilayer structures. Nat. Mater. 10, 625-630 (2011). https://doi.org/10.1038/nmat3070.

31.
Brink J. R. & Haden C. R. The computer and the brain. IEEE Ann. Hist. Comput. 11, 161-163 (1989). https://doi.org/10.1109/mahc.1989.10032.

32.
Jeon W. et al. Chapter Six - Deep Learning with GPUs. Adv. Comput. 122, 167-215 (2021). https://doi.org/10.1016/bs.adcom.2020.11.003.

33.
Capra M. et al. Hardware and software optimizations for accelerating deep neural networks: survey of current trends, challenges, and the road ahead. IEEE Access 8, 225134-225180 (2020). https://doi.org/10.1109/access.2020.3039858.

34.
Sebastian A., Le Gallo M., Khaddam-Aljameh R. & Eleftheriou E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529-544 (2020). https://doi.org/10.1038/s41565-020-0655-z.

35.
Ahn J., Yoo S., Mutlu O. & Choi K. PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture. ACM Sigarch Comput. Archit. News 43, 336-348 (2015). https://doi.org/10.1145/2872887.2750385.

36.
Prezioso M. et al. Training and operation of an integrated neuromorphic network based on metal-oxide memristors. Nature 521, 61-64 (2015). https://doi.org/10.1038/nature14441.

37.
Burr G. W. et al. Neuromorphic computing using non-volatile memory. Adv. Phys.: X 2, 89-124 (2016). https://doi.org/10.1080/23746149.2016.1259585.

38.
Zhu X. & Lu W. D. Optogenetics-inspired tunable synaptic functions in memristors. ACS Nano 12, 1242-1249 (2018). https://doi.org/10.1021/acsnano.7b07317.

39.
Krestinskaya O., James A. P. & Chua L. O. Neuromemristive circuits for edge computing: a review. IEEE Trans. Neural Netw. Learn. Syst. 31, 4-23 (2020). https://doi.org/10.1109/TNNLS.2019.2899262.

40.
Maass W. Networks of spiking neurons: the third generation of neural network models. Neural Netw. 10, 1659-1671 (1997). https://doi.org/10.1016/s0893-6080(97)00011-7.

41.
McCulloch W. S. & Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115-133 (1943). https://doi.org/10.1007/bf02478259.

42.
Nair V. & Hinton G. E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on International Conference on Machine Learning (ICML-10), 807-814 (Omnipress, 2010).

43.
Rumelhart D. E., Hinton G. E. & Williams R. J. Learning representations by back-propagating errors. Nature 323, 533-536 (1986). https://doi.org/10.1038/323533a0.

44.
Roy K., Jaiswal A.& Panda P. Towards spike-based machine intelligence with neuromorphic computing. Nature 575, 607-617 (2019). https://doi.org/10.1038/s41586-019-1677-2.

45.
Taherkhani A. et al. A review of learning in biologically plausible spiking neural networks. Neural Netw. 122, 253-272 (2020). https://doi.org/10.1016/j.neunet.2019.09.036.

46.
Cao Y., Chen Y. & Khosla D. Spiking deep convolutional neural networks for energy-efficient object recognition. Int. J. Comput. Vis. 113, 54-66 (2015). https://doi.org/10.1007/s11263-014-0788-3.

47.
Diehl P. U. et al. Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In 2015 International Joint Conference on Neural Networks (IJCNN), 1-8 (IEEE, 2015). https://doi.org/10.1109/IJCNN.2015.7280696.

48.
Hodgkin A. L. & Huxley A. F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500-544 (1952). https://doi.org/10.1113/jphysiol.1952.sp004764.

49.
Gerstner W. & Kistler W. M. Spiking Neuron Models: Single Neurons, Populations, Plasticity. (Cambridge University Press, 2002). https://doi.org/10.1017/CBO9780511815706.

50.
Izhikevich E. M. Simple model of spiking neurons. IEEE Trans. Neural Netw. 14, 1569-1572 (2003). https://doi.org/10.1109/tnn.2003.820440.

51.
Segee B. Methods in neuronal modeling: from ions to networks, 2nd Edition. Comput. Sci. Eng. 1, 81 (1999). https://doi.org/10.1109/mcise.1999.743629.

52.
Fang W. et al. Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2641-2651 (IEEE, 2021). https://doi.org/10.1109/Iccv48922.2021.00266.

53.
Duan Q. et al. Spiking neurons with spatiotemporal dynamics and gain modulation for monolithically integrated memristive neural networks. Nat. Commun. 11, 3399 (2020). https://doi.org/10.1038/s41467-020-17215-3.

54.
Yang R., Huang H.-M. & Guo X. Memristive synapses and neurons for bioinspired computing. Adv. Electron. Mater. 5, 1900287 (2019). https://doi.org/10.1002/aelm.201900287.

55.
Xu M. et al. Recent advances on neuromorphic devices based on chalcogenide phase-change materials. Adv. Funct. Mater. 30, 2003419 (2020). https://doi.org/10.1002/adfm.202003419.

56.
Huang J.-N., Wang T., Huang H.-M. & Guo X. Adaptive SRM neuron based on NbO memristive device for neuromorphic computing. Chip 1, 100015 (2022). https://doi.org/10.1016/j.chip.2022.100015.

57.
Shaban A., Bezugam S. S. & Suri M. An adaptive threshold neuron for recurrent spiking neural networks with nanodevice hardware implementation. Nat. Commun. 12, 4234 (2021). https://doi.org/10.1038/s41467-021-24427-8.

58.
Hochreiter S. & Schmidhuber J. Long short-term memory. Neural Comput. 9, 1735-1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735.

59.
Tang J. et al. Bridging biological and artificial neural networks with emerging neuromorphic devices: fundamentals, progress, and challenges. Adv. Mater. 31, 1902761 (2019). https://doi.org/10.1002/adma.201902761.

60.
Masquelier T., Guyonneau R. & Thorpe S. J. Competitive STDP-based spike pattern learning. Neural Comput. 21, 1259-1276 (2009). https://doi.org/10.1162/neco.2008.06-08-804.

61.
Kheradpisheh S. R., Ganjtabesh M., Thorpe S. J. & Masquelier T. STDP-based spiking deep convolutional neural networks for object recognition. Neural Netw. 99, 56-67 (2018). https://doi.org/10.1016/j.neunet.2017.12.005.

62.
Mozafari M., Kheradpisheh S. R., Masquelier T., Nowzari-Dalini A. & Ganjtabesh M. First-spike-based visual categorization using reward-modulated STDP. IEEE Trans. Neural Netw. Learn. Syst. 29, 6178-6190 (2018). https://doi.org/10.1109/tnnls.2018.2826721.

63.
Mozafari M., Ganjtabesh M., Nowzari-Dalini A., Thorpe S. J. & Masquelier T. Bio-inspired digit recognition using reward-modulated spike-timing-dependent plasticity in deep convolutional networks. Pattern Recognit. 94, 87-95 (2019). https://doi.org/10.1016/j.patcog.2019.05.015.

64.
Diehl P. U. & Cook M. Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Front. Comput. Neurosci. 9 (2015). https://doi.org/10.3389/fncom.2015.00099.

65.
Tavanaei A. & Maida A. BP-STDP: approximating backpropagation using spike timing dependent plasticity. Neurocomputing 330, 39-47 (2019). https://doi.org/10.1016/j.neucom.2018.11.014.

66.
Zheng H., Wu Y., Deng L., Hu Y. & Li G. Going deeper with directly-trained larger spiking neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, 11062-11070 (AAAI, 2021). https://doi.org/10.1609/aaai.v35i12.17320.

67.
Wu Y. et al. Direct training for spiking neural networks: faster, larger, better. In Proceedings of the AAAI Conference on Artificial Intelligence, 1311-1318 (AAAI, 2019). https://doi.org/10.1609/aaai.v33i01.33011311.

68.
Gu P., Xiao R., Pan G. & Tang H. STCA: spatio-temporal credit assignment with delayed feedback in deep spiking neural networks. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), 1366-1372 (Morgan Kaufmann, 2019). https://www.ijcai.org/Proceedings/2019/0189.pdf.

69.
Yan Y. et al. Graph-based spatio-temporal backpropagation for training spiking neural networks. In 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 1-4 (IEEE, 2021). https://doi.org/10.1109/AICAS51828.2021.9458461.

70.
Liang L. et al. H2Learn: high-efficiency learning accelerator for high-accuracy spiking neural networks. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 41, 4782-4796 (2022). https://doi.org/10.1109/tcad.2021.3138347.

71.
Han J., Wang Z., Shen J. & Tang H. Symmetric-threshold ReLU for fast and nearly lossless ANN-SNN conversion. Mach. Intell. Res. 20, 435-446 (2023). https://doi.org/10.1007/s11633-022-1388-2.

72.
Ding J., Yu Z., Tian Y. & Huang T. Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. Preprint at https://doi.org/10.48550/arXiv.2105.11654 (2021).

73.
Fang W. et al. SpikingJelly: an open-source machine learning infrastructure platform for spike-based intelligence. Sci. Adv. 9, eadi1480 (2023). https://doi.org/10.1126/sciadv.adi1480.

74.
Eshraghian J. K. et al. Training spiking neural networks using lessons from deep learning. Proc. IEEE 111, 1016-1054 (2023). https://doi.org/10.1109/jproc.2023.3308088.

75.
Stimberg M., Brette R. & Goodman D. F. Brian 2, an intuitive and efficient neural simulator. Elife 8, e47314 (2019). https://doi.org/10.7554/eLife.47314.

76.
Strukov D. B., Snider G. S., Stewart D. R. & Williams R. S. The missing memristor found. Nature 453, 80-83 (2008). https://doi.org/10.1038/nature06932.

77.
Su Y.-T. et al. A method to reduce forming voltage without degrading device performance in hafnium oxide-based 1T1R resistive random access memory. IEEE J. Electron Devices Soc. 6, 341-345 (2018). https://doi.org/10.1109/jeds.2018.2805285.

78.
Chen S.-X., Chang S.-P., Chang S.-J., Hsieh W.-K. & Lin C.-H. Highly stable ultrathin TiO2 based resistive random access memory with low operation voltage. ECS J. Solid State Sci. Technol. 7, Q3183-Q3188 (2018). https://doi.org/10.1149/2.0281807jss.

79.
Prakash A., Deleruyelle D., Song J., Bocquet M. & Hwang H. Resistance controllability and variability improvement in a TaOx-based resistive memory for multilevel storage application. Appl. Phys. Lett. 106, 233104 (2015). https://doi.org/10.1063/1.4922446.

80.
Simanjuntak F. M., Panda D., Wei K.-H. & Tseng T.-Y. Status and prospects of ZnO-based resistive switching memory devices. Nanoscale Res. Lett. 11, 368 (2016). https://doi.org/10.1186/s11671-016-1570-y.

81.
Banerjee W. et al. Occurrence of resistive switching and threshold switching in atomic layer deposited ultrathin (2 nm) aluminium oxide crossbar resistive random access memory. IEEE Electron. Device Lett. 36, 333-335 (2015). https://doi.org/10.1109/led.2015.2407361.

82.
Li Y. et al. Ultrafast synaptic events in a chalcogenide memristor. Sci. Rep. 3, 1619 (2013). https://doi.org/10.1038/srep01619.

83.
Li Y. et al. Activity-dependent synaptic plasticity of a chalcogenide electronic synapse for neuromorphic systems. Sci. Rep. 4, 4906 (2014). https://doi.org/10.1038/srep04906.

84.
Xia X. et al. 2D-Material-Based volatile and nonvolatile memristive devices for neuromorphic computing. ACS Mater. Lett. 5, 1109-1135 (2023). https://doi.org/10.1021/acsmaterialslett.2c01026.

85.
Cao G. et al. 2D material based synaptic devices for neuromorphic computing. Adv. Funct. Mater. 31, 2005443 (2020). https://doi.org/10.1002/adfm.202005443.

86.
van de Burgt Y., Melianas A., Keene S. T., Malliaras G. & Salleo A. Organic electronics for neuromorphic computing. Nat. Electron. 1, 386-397 (2018). https://doi.org/10.1038/s41928-018-0103-3.

87.
Yuan L., Liu S., Chen W., Fan F. & Liu G. Organic memory and memristors: from mechanisms, materials to devices. Adv. Electron. Mater. 7, 2100432 (2021). https://doi.org/10.1002/aelm.202100432.

88.
Waser R., Dittmann R., Staikov G. & Szot K. Redox-based resistive switching memories - nanoionic mechanisms, prospects, and challenges. Adv. Mater. 21, 2632-2663 (2009). https://doi.org/10.1002/adma.200900375.

89.
Zahoor F., Azni Zulkifli T. Z. & Khanday F. A. Resistive random access memory (RRAM): an overview of materials, switching mechanism, performance, multilevel cell (mlc) storage, modeling, and applications. Nanoscale Res. Lett. 15, 90 (2020). https://doi.org/10.1186/s11671-020-03299-9.

90.
Zidan M. A., Strachan J. P. & Lu W. D. The future of electronics based on memristive systems. Nat. Electron. 1, 22-29 (2018). https://doi.org/10.1038/s41928-017-0006-8.

91.
Schmitt R. et al. Accelerated ionic motion in amorphous memristor oxides for nonvolatile memories and neuromorphic computing. Adv. Funct. Mater. 29, 1804782 (2019). https://doi.org/10.1002/adfm.201804782.

92.
Kwon D.-H. et al. Atomic structure of conducting nanofilaments in TiO2 resistive switching memory. Nat. Nanotechnol. 5, 148-153 (2010). https://doi.org/10.1038/nnano.2009.456.

93.
Valov I., Waser R., Jameson J. R. & Kozicki M. N. Electrochemical metallization memories - fundamentals, applications, prospects. Nanotechnology 22, 254003 (2011). https://doi.org/10.1088/0957-4484/22/25/254003.

94.
Qin F., Zhang Y., Song H. W. & Lee S. Enhancing memristor fundamentals through instrumental characterization and understanding reliability issues. Mater. Adv. 4, 1850-1875 (2023). https://doi.org/10.1039/d3ma00069a.

95.
Chakraborty I., Jaiswal A., Saha A. K., Gupta S. K. & Roy K. Pathways to efficient neuromorphic computing with non-volatile memory technologies. Appl. Phys. Rev. 7, 021308 (2020). https://doi.org/10.1063/1.5113536.

96.
Patil A. R., Dongale T. D., Kamat R. K. & Rajpure K. Y. Binary metal oxidebased resistive switching memory devices: a status review. Mater. Today Commun. 34, 105356 (2023). https://doi.org/10.1016/j.mtcomm.2023.105356.

97.
Wang Z. et al. Engineering incremental resistive switching in TaOx based memristors for brain-inspired computing. Nanoscale 8, 14015-14022 (2016). https://doi.org/10.1039/c6nr00476h.

98.
Wu L., Liu H., Li J., Wang S. & Wang X. A multi-level memristor based on Al-doped HfO2 thin film. Nanoscale Res. Lett. 14, 177 (2019). https://doi.org/10.1186/s11671-019-3015-x.

99.
Choi W. S. et al. Influence of Al2O3 layer on InGaZnO memristor crossbar array for neuromorphic applications. Chaos Solit. Fractals 156, 111813 (2022). https://doi.org/10.1016/j.chaos.2022.111813.

100.
Xiao Y. et al. Improved artificial synapse performance of Pt/HfO2/BiFeO3/HfO2/TiN memristor through N2 annealing. Ceram. Int. 48, 34584-34589 (2022). https://doi.org/10.1016/j.ceramint.2022.08.045.

101.
Zhu Y.-L. et al. Uniform and robust TiN/HfO2/Pt memristor through interfacial Al-doping engineering. Appl. Surf. Sci. 550, 149274 (2021). https://doi.org/10.1016/j.apsusc.2021.149274.

102.
Li C. et al. Large memristor crossbars for analog computing. In 2018 IEEE International Symposium on Circuits and Systems (ISCAS), 1-4 (IEEE, 2018). https://doi.org/10.1109/ISCAS.2018.8351877.

103.
Li C. et al. Analogue signal and image processing with large memristor crossbars. Nat. Electron. 1, 52-59 (2018). https://doi.org/10.1038/s41928-017-0002-z.

104.
Hu M. et al. Memristor-based analog computation and neural network classification with a dot product engine. Adv. Mater. 30, 1705914 (2018). https://doi.org/10.1002/adma.201705914.

105.
Li C. et al. Efficient and self-adaptive in-situ learning in multilayer memristor neural networks. Nat. Commun. 9, 2385 (2018). https://doi.org/10.1038/s41467-018-04484-2.

106.
Liu Q. et al. 33.2 A fully integrated analog ReRAM based 78.4 TOPS/W compute-in-memory chip with fully parallel MAC computing. In 2020 IEEE International Solid-State Circuits Conference (ISSCC), 500-502 (IEEE, 2020). https://doi.org/10.1109/ISSCC19947.2020.9062953.

107.
Yao P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641-646 (2020). https://doi.org/10.1038/s41586-020-1942-4.

108.
Jeon K. et al. Purely self-rectifying memristor-based passive crossbar array for artificial neural network accelerators. Nat. Commun. 15, 129 (2024). https://doi.org/10.1038/s41467-023-44620-1.

109.
Ambrogio S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60-67 (2018). https://doi.org/10.1038/s41586-018-0180-5.

110.
Bayat F. M. et al. Implementation of multilayer perceptron network with highly uniform passive memristive crossbar circuits. Nat. Commun. 9, 2331 (2018). https://doi.org/10.1038/s41467-018-04482-4.

111.
Wang R. et al. Implementing in-situ self-organizing maps with memristor crossbar arrays for data mining and optimization. Nat. Commun. 13, 2289 (2022). https://doi.org/10.1038/s41467-022-29411-4.

112.
Zhao H. et al. Energy-efficient high-fidelity image reconstruction with memristor arrays for medical diagnosis. Nat. Commun. 14, 2276 (2023). https://doi.org/10.1038/s41467-023-38021-7.

113.
Ye F., Kiani F., Huang Y. & Xia Q. Diffusive memristors with uniform and tunable relaxation time for spike generation in event-based pattern recognition. Adv. Mater. 35, 2204778 (2023). https://doi.org/10.1002/adma.202204778.

114.
Kumar S., Williams R. S. & Wang Z. Third-order nanocircuit elements for neuromorphic engineering. Nature 585, 518-523 (2020). https://doi.org/10.1038/s41586-020-2735-5.

115.
Wang Z. et al. Fully memristive neural networks for pattern classification with unsupervised learning. Nat. Electron. 1, 137-145 (2018). https://doi.org/10.1038/s41928-018-0023-2.

116.
Wu Z. et al. A habituation sensory nervous system with memristors. Adv. Mater. 32, 2004398 (2020). https://doi.org/10.1002/adma.202004398.

117.
Yuan R. et al. A neuromorphic physiological signal processing system based on VO2 memristor for next-generation human-machine interface. Nat. Commun. 14, 3695 (2023). https://doi.org/10.1038/s41467-023-39430-4.

118.
Wang H., Xu Y., Yang R. & Miao X. A LIF neuron with adaptive firing frequency based on the GaSe memristor. IEEE Trans. Electron Devices 70, 4484-4487 (2023). https://doi.org/10.1109/ted.2023.3288508.

119.
Zhao J. et al. Memristors based on NdNiO3 nanocrystals film as sensory neurons for neuromorphic computing. Mater. Horiz. 10, 4521-4531 (2023). https://doi.org/10.1039/d3mh00835e.

120.
Song M. et al. Self-compliant threshold switching devices with high on/off ratio by control of quantized conductance in Ag filaments. Nano Lett. 23, 2952-2957 (2023). https://doi.org/10.1021/acs.nanolett.3c00327.

121.
Hua Q., Wu H., Gao B. & Qian H. Enhanced performance of Ag-filament threshold switching selector by rapid thermal processing. In International Symposium on VLSI Technology, Systems and Application (VLSI-TSA), 1-2 (IEEE, 2018). https://doi.org/10.1109/VLSI-TSA.2018.8403855.

122.
Agarwal S. et al. Resistive memory device requirements for a neural algorithm accelerator. In 2016 International Joint Conference on Neural Networks (IJCNN), 929-938 (IEEE, 2016). https://doi.org/10.1109/IJCNN.2016.7727298.

123.
Kim T. et al. Spiking neural network (SNN) with memristor synapses having non-linear weight update. Front. Comput. Neurosci. 15, 646125 (2021). https://doi.org/10.3389/fncom.2021.646125.

124.
Rao M. et al. Thousands of conductance levels in memristors integrated on CMOS. Nature 615, 823-829 (2023). https://doi.org/10.1038/s41586-023-05759-5.

125.
Rathi N., Panda P. & Roy K. STDP-based pruning of connections and weight quantization in spiking neural networks for energy-efficient recognition. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 38, 668-677 (2019). https://doi.org/10.1109/tcad.2018.2819366.

126.
Shen W. C. et al. High-K metal gate contact RRAM (CRRAM) in pure 28 nm CMOS logic process. In 2012 International Electron Devices Meeting, 31.6.1-31.6.4 (IEEE, 2012). https://doi.org/10.1109/IEDM.2012.6479146.

127.
Zhao R. et al. Accelerating binarized convolutional neural networks with software-programmable FPGAs. In 2017 International Symposium on Field Programmable Gate Arrays (FPGA), 15-24 (ACM, 2017). https://doi.org/10.1145/3020078.3021741.

128.
Hubara I., Courbariaux M., Soudry D., El-Yaniv R. & Bengio Y. Binarized neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS), 4114-4122 (MIT Press, 2016). https://proceedings.neurips.cc/paper_files/paper/2016/hash/d8330f857a17c53d217014ee776bfd50-Abstract.html.

129.
Nurvitadhi E. et al. Accelerating binarized neural networks: comparison of FPGA, CPU, GPU, and ASIC. In 2016 International Conference on Field-Programmable Technology (FPT), 77-84 (IEEE, 2016). https://doi.org/10.1109/FPT.2016.7929192.

130.
Liang S., Yin S., Liu L., Luk W. & Wei S. FP-BNN: binarized neural network on FPGA. Neurocomputing 275, 1072-1086 (2018). https://doi.org/10.1016/j.neucom.2017.09.046.

131.
Sun X. et al. XNOR-RRAM: a scalable and parallel resistive synaptic architecture for binary neural networks. In 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1423-1428 (IEEE, 2018). https://doi.org/10.23919/DATE.2018.8342235.

132.
Simons T. & Lee D.-J. A review of binarized neural networks. Electronics 8, 661 (2019). https://doi.org/10.3390/electronics8060661.

133.
Qiao G. C. et al. Direct training of hardware-friendly weight binarized spiking neural network with surrogate gradient learning towards spatio-temporal event-based dynamic data recognition. Neurocomputing 457, 203-213 (2021). https://doi.org/10.1016/j.neucom.2021.06.070.

134.
Nguyen V.-T., Zhang R. & Nakashima Y. XNOR-BSNN: in-memory computing model for deep binarized spiking neural network. In 2021 International Conference on High Performance Big Data and Intelligent Systems (HPBD&IS), 17-21 (IEEE, 2021). https://doi.org/10.1109/HPBDIS53214.2021.9658467.

135.
Abu Lebdeh M., Abunahla H., Mohammad B. & Al-Qutayri M. An efficient heterogeneous memristive XNOR for in-memory computing. IEEE Trans. Circuits Syst. I: Regul. Pap. 64, 2427-2437 (2017). https://doi.org/10.1109/tcsi.2017.2706299.

136.
Wang X.-Y. et al. High-density memristor-CMOS ternary logic family. IEEE Trans. Circuits Syst. I: Regul. Pap. 68, 264-274 (2021). https://doi.org/10.1109/tcsi.2020.3027693.

137.
Zhang W. et al. Edge learning using a fully integrated neuro-inspired memristor chip. Science 381, 1205-1211 (2023). https://doi.org/10.1126/science.ade3483.

138.
Balaji A. et al. Mapping spiking neural networks to neuromorphic hardware. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 28, 76-86 (2020). https://doi.org/10.1109/tvlsi.2019.2951493.

139.
Ankit A., Sengupta A., Panda P. & Roy K. RESPARC: a reconfigurable and energy-efficient architecture with memristive crossbars for deep spiking neural networks. In 2017 Design Automation Conference (DAC), 1-6 (ACM, 2017). https://doi.org/10.1145/3061639.3062311.

140.
Boquet G. et al. Offline training for memristor-based neural networks. In 2020 28th European Signal Processing Conference (EUSIPCO), 1547-1551 (IEEE, 2021). https://doi.org/10.23919/Eusipco47968.2020.9287574.

141.
Wijesinghe P., Ankit A., Sengupta A. & Roy K. An all-memristor deep spiking neural computing system: a step toward realizing the low-power stochastic brain. IEEE Trans. Emerg. Top. Comput. Intell. 2, 345-358 (2018). https://doi.org/10.1109/tetci.2018.2829924.

142.
Maass W. Lower bounds for the computational power of networks of spiking neurons. Neural Comput. 8, 1-40 (1996). https://doi.org/10.1162/neco.1996.8.1.1.

143.
Beyeler M., Dutt N. D. & Krichmar J. L. Categorization and decision-making in a neurobiologically plausible spiking network using a STDP-like learning rule. Neural Netw. 48, 109-124 (2013). https://doi.org/10.1016/j.neunet.2013.07.012.

144.
Tavanaei A., Masquelier T. & Maida A. S. Acquisition of visual features through probabilistic spike-timing-dependent plasticity. In 2016 International Joint Conference on Neural Networks (IJCNN), 307-314 (IEEE, 2016). https://doi.org/10.1109/IJCNN.2016.7727213.

145.
Tavanaei A., Ghodrati M., Kheradpisheh S. R., Masquelier T. & Maida A. Deep learning in spiking neural networks. Neural Netw. 111, 47-63 (2019). https://doi.org/10.1016/j.neunet.2018.12.002.

146.
Deng L. et al. Rethinking the performance comparison between SNNs and ANNs. Neural Netw. 121, 294-307 (2020). https://doi.org/10.1016/j.neunet.2019.09.005.

147.
Gallego G. et al. Event-based vision: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 44, 154-180 (2022). https://doi.org/10.1109/tpami.2020.3008413.

148.
Abadi M. et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems. Preprint at https://doi.org/10.48550/arXiv.1603.04467 (2016).

149.
Paszke A. et al. Automatic differentiation in PyTorch. In 31st Conference on Neural Information Processing Systems (NIPS), 1-4 (MIT Press, 2017).

150.
Sivan M. et al. All WSe2 1T1R resistive RAM cell for future monolithic 3D embedded memory integration. Nat. Commun. 10, 5201 (2019). https://doi.org/10.1038/s41467-019-13176-4.

151.
Merced-Grafals E. J., Dávila N., Ge N., Williams R. S. & Strachan J. P. Repeatable, accurate, and high speed multi-level programming of memristor 1T1R arrays for power efficient analog computing applications. Nanotechnology 27, 365202 (2016). https://doi.org/10.1088/0957-4484/27/36/365202.

152.
Song J., Woo J., Prakash A., Lee D. & Hwang H. Threshold selector with high selectivity and steep slope for cross-point memory array. IEEE Electron Device Lett. 36, 681-683 (2015). https://doi.org/10.1109/led.2015.2430332.

153.
Wang Y., Zhang Z., Li H. & Shi L. Realizing bidirectional threshold switching in Ag/Ta2O5/Pt diffusive devices for selector applications. J. Electron. Mater. 48, 517-525 (2018). https://doi.org/10.1007/s11664-018-6730-7.

154.
Kim K. M. et al. Low-power, self-rectifying, and forming-free memristor with an asymmetric programing voltage for a high-density crossbar application. Nano Lett. 16, 6724-6732 (2016). https://doi.org/10.1021/acs.nanolett.6b01781.

155.
Yang J. J., Strukov D. B. & Stewart D. R. Memristive devices for computing. Nat. Nanotechnol. 8, 13-24 (2013). https://doi.org/10.1038/nnano.2012.240.

156.
Grossi A. et al. Fundamental variability limits of filament-based RRAM. In 2016 IEEE International Electron Devices Meeting (IEDM), 4.7.1-4.7.4 (IEEE, 2016). https://doi.org/10.1109/IEDM.2016.7838348.

157.
Chen A. Utilizing the variability of resistive random access memory to implement reconfigurable physical unclonable functions. IEEE Electron Device Lett. 36, 138-140 (2015). https://doi.org/10.1109/led.2014.2385870.

158.
Pan F., Gao S., Chen C., Song C. & Zeng F. Recent progress in resistive random access memories: materials, switching mechanisms, and performance. Mater. Sci. Eng.: R: Rep. 83, 1-59 (2014). https://doi.org/10.1016/j.mser.2014.06.002.
