Regular Papers

A Resource-Efficient Weight Quantization and Mapping Method for Crossbar Arrays in ReRAM-Based Computing-in-Memory Systems

  • MINGYUAN MA 1
  • WEI JIANG 1
  • JUNTAO LIU 2
  • LI DU 1, 3, 4
  • ZHONGYUAN MA 1
  • YUAN DU 1, 3, 4
  • 1 Electronic Science and Engineering, Nanjing University, Nanjing 210023, China
  • 2 China Mobile Research Institute, Beijing 518048, China
  • 3 School of Electronic Science and Engineering, Nanjing University, Nanjing 210023, China
  • 4 Interdisciplinary Research Center for Future Intelligent Chips (Chip-X), Nanjing University, Suzhou 215163, China

Received: 2025-03-30

Revised: 2025-06-11

Accepted: 2025-07-02

Online published: 2025-12-24

Supported by

National Key Research and Development Program of China (Grant 2021YFA0717700)

Nanjing University-China Mobile Communications Group Company, Ltd. Joint Institute

Abstract

Resistive Random Access Memory (ReRAM) crossbar arrays have been adopted for computing-in-memory (CIM) applications owing to their high bit density, non-volatility, and ability to perform multiply-accumulate (MAC) operations efficiently. However, as crossbar sizes grow, large IR drops and increasingly complex control logic become significant challenges. In this paper, we propose a progressive weight pruning strategy based on gradient sensitivity analysis to remove redundant parameters and enhance overall sparsity. Building on this sparsity-enhanced structure, we further introduce two complementary weight quantization-and-mapping methods tailored to high-bit and low-bit quantization scenarios. The proposed method uses group quantization with clustering to merge higher-bit weights, and exploits differential properties with spectral clustering to merge lower-bit weights. Experimental results show notable savings in crossbar resources with minimal loss of precision. Moreover, we built a carrier-board and FPGA testing platform and deployed a neural network on a 32×32 ReRAM crossbar. The results show that the proposed algorithm saves 42% of crossbar units while recognition accuracy on the MNIST dataset remains within an acceptable range (declining from 91.5% to 88.3%).
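The abstract describes the method only at a high level. As a minimal, illustrative sketch of its three ingredients (gradient-sensitivity pruning, clustering-based group quantization of higher-bit weights, and spectral clustering of lower-bit weights), the Python snippet below shows one plausible realization. The function names, the choice of k-means for group quantization, the pairwise-difference similarity kernel, and all parameter values (prune_ratio, n_levels, n_groups) are assumptions made for illustration, not the authors' implementation.

```python
# Hedged sketch: function names, the k-means choice, the similarity kernel,
# and all parameters are illustrative assumptions, not the paper's exact method.
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering

def prune_by_gradient_sensitivity(weights, grads, prune_ratio=0.4):
    """Zero out the weights whose gradient-based sensitivity |w * g| is smallest."""
    sensitivity = np.abs(weights * grads)
    threshold = np.quantile(sensitivity, prune_ratio)
    mask = sensitivity >= threshold
    return weights * mask, mask

def quantize_high_bits(weights, n_levels=16):
    """Group-quantize the surviving (non-zero) weights with k-means so that
    weights sharing a centroid can be merged onto the same crossbar unit."""
    nz = weights != 0
    km = KMeans(n_clusters=n_levels, n_init=10, random_state=0)
    labels = km.fit_predict(weights[nz].reshape(-1, 1))
    quantized = np.zeros_like(weights)
    quantized[nz] = km.cluster_centers_[labels, 0]
    return quantized

def merge_low_bits(residuals, n_groups=8):
    """Merge low-bit residual weights by spectral clustering on a similarity
    graph built from pairwise differences, replacing each group by its mean."""
    r = residuals.reshape(-1, 1)
    affinity = np.exp(-np.square(r - r.T))  # similarity from weight differences
    labels = SpectralClustering(n_clusters=n_groups, affinity="precomputed",
                                random_state=0).fit_predict(affinity)
    merged = np.empty(r.shape[0])
    for k in np.unique(labels):
        merged[labels == k] = r[labels == k, 0].mean()
    return merged.reshape(residuals.shape)

# Example: prune and then group-quantize a small random weight matrix.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(32, 32))
    g = rng.normal(size=(32, 32))
    pruned, mask = prune_by_gradient_sensitivity(w, g, prune_ratio=0.42)
    q = quantize_high_bits(pruned, n_levels=16)
    print("sparsity:", 1 - mask.mean(), "levels:", len(np.unique(q[q != 0])))
```

In this sketch, pruning and high-bit merging operate directly on the weight matrix, while merge_low_bits would be applied to whatever low-bit residual the chosen bit-slicing scheme produces; that split is also an assumption, since the paper's exact mapping pipeline is not detailed in this excerpt.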

Cite this article

MINGYUAN MA, WEI JIANG, JUNTAO LIU, LI DU, ZHONGYUAN MA, YUAN DU. A Resource-Efficient Weight Quantization and Mapping Method for Crossbar Arrays in ReRAM-Based Computing-in-Memory Systems[J]. Integrated Circuits and Systems, 2025, 2(4): 233-242. DOI: 10.23919/ICS.2025.3597876
