A High-Resistance SOT Device Based Computing-in-Memory Macro With High Sensing Margin and Multi-Bit MAC Operations for AI Edge Inference

JUNZHAN LIU; JINYAO MI; YANG LIU; LIANG ZHANG; HE ZHANG; WANG KANG

doi:10.23919/ICS.2025.3567939

2025, Vol. 2, Issue 3: 102-109


Regular Papers


JUNZHAN LIU ¹,², JINYAO MI ¹,², YANG LIU ³, LIANG ZHANG ¹,², HE ZHANG ¹,², WANG KANG ¹,²


¹ National Key Laboratory of Spintronics, Hangzhou International Innovation Institute, Beihang University, Hangzhou 311115, China
² School of Integrated Circuit Science and Engineering, Beihang University, Beijing 100191, China
³ School of Electronic Information Engineering, Beihang University, Beijing 100191, China

WANG KANG (e-mail: wang.kang@buaa.edu.cn).

Junzhan Liu and Jinyao Mi contributed equally to this work.

WANG KANG (Senior Member, IEEE)

Received date: 2024-12-20

Revised date: 2025-03-03

Accepted date: 2025-04-30

Online published: 2025-10-22

Supported by

Beijing MSTC Program under Grant Z231100007423019

Beijing Natural Science Foundation under Grant L223004

Natural Science Foundation of China under Grant 62274008

Research Funding of Hangzhou International Innovation Institute of Beihang University under Grant 2024KQ157


Abstract

Computing-in-memory (CIM) offers a promising solution to the memory wall problem. Magnetoresistive random-access memory (MRAM) is a favored medium for CIM because of its non-volatility, high speed, low power, and technological maturity. However, MRAM has long suffered from an insufficient ratio between its high-resistance state (HRS) and low-resistance state (LRS), which degrades the accuracy of CIM results. In this paper, we propose a 5T2M bit-cell structure based on spin-orbit torque (SOT) devices that increases the high-to-low current ratio by modulating the sub-threshold operation region. In addition, combining this structure with high-resistance (MΩ-level) devices can significantly reduce the power consumption of the bit-cell array. We also design a compatible multi-bit implementation and macro architecture to support AI edge inference acceleration. The design was simulated in a 40-nm foundry process with a physically verified SOT-MTJ model. The results show that, for the same high-to-low resistance ratio, a 52.6× high-to-low current ratio can be achieved, along with a 38.6%-98% reduction in bit-cell array power.
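
As a rough illustration of why sub-threshold operation can turn a modest voltage difference into a large current ratio, the sketch below uses the textbook sub-threshold MOSFET relation, in which drain current scales as exp(V_GS / (n·V_T)). It is not the 5T2M circuit from the paper: the slope factor `n = 1.5`, the 300 K temperature, and both function names are assumptions made for illustration. Only the 52.6× target ratio is taken from the abstract.

```python
import math

BOLTZMANN = 1.380649e-23             # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temperature_k=300.0):
    """Thermal voltage V_T = kT/q in volts."""
    return BOLTZMANN * temperature_k / ELEMENTARY_CHARGE


def subthreshold_current_ratio(delta_vgs, n=1.5, temperature_k=300.0):
    """Current ratio between two sub-threshold bias points.

    Assumes I_D is proportional to exp(V_GS / (n * V_T)), so two gate
    voltages that differ by delta_vgs give currents that differ by
    exp(delta_vgs / (n * V_T)). The slope factor n is an assumed value.
    """
    return math.exp(delta_vgs / (n * thermal_voltage(temperature_k)))


def gate_swing_for_ratio(ratio, n=1.5, temperature_k=300.0):
    """Gate-voltage difference (V) needed for a given current ratio."""
    return n * thermal_voltage(temperature_k) * math.log(ratio)


if __name__ == "__main__":
    target = 52.6  # high-to-low current ratio quoted in the abstract
    swing = gate_swing_for_ratio(target)
    print(f"assumed n = 1.5, T = 300 K: gate swing = {swing * 1e3:.1f} mV")
    print(f"ratio recovered from that swing = {subthreshold_current_ratio(swing):.1f}")
```

Under these assumed values, a gate-voltage difference of roughly 0.15 V between the two states is enough for a ratio of this size. This is the general reason an exponential operating region helps sensing margin when the underlying resistance ratio is small.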

Keywords: Computing-in-memory; SOT-MRAM; HRS/LRS ratio; multi-bit; artificial intelligence

Cite this article

JUNZHAN LIU, JINYAO MI, YANG LIU, LIANG ZHANG, HE ZHANG, WANG KANG. A High-Resistance SOT Device Based Computing-in-Memory Macro With High Sensing Margin and Multi-Bit MAC Operations for AI Edge Inference[J]. Integrated Circuits and Systems, 2025, 2(3): 102-109. DOI: 10.23919/ICS.2025.3567939

