Regular Papers

The Decomposition and Combination Paradigms of Chiplet-Based Integrated Chips

  • FUPING LI
  • YING WANG
  • MEIXUAN LU
  • YUTONG ZHU
  • HAORAN WANG
  • ZHUN ZHAO
  • JUNPEI HUANG
  • XIAOTONG WEI
  • XIHAO LIANG
  • YUJIE WANG
  • HAOBO XU
  • HUAWEI LI
  • XIAOWEI LI
  • QI LIU
  • MING LIU
  • NINGHUI SUN
  • YINHE HAN
  • 1 SKLP, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • 2 Zhongguancun Laboratory, Beijing 100086, China
  • 3 Fudan University, Shanghai 200433, China
  • 4 University of Chinese Academy of Sciences, Beijing 100190, China

YING WANG, (Member, IEEE)

YUJIE WANG, (Member, IEEE)

HAOBO XU, (Member, IEEE)

HUAWEI LI, (Senior Member, IEEE)

XIAOWEI LI, (Senior Member, IEEE)

QI LIU, (Member, IEEE)

YINHE HAN, (Senior Member, IEEE)

Received date: 2024-04-24

Revised date: 2024-06-16

Accepted date: 2024-07-15

Online published: 2024-11-27

Supported by

National Natural Science Foundation of China (NSFC) under Grant 92373206

National Natural Science Foundation of China (NSFC) under Grant 62222411

National Natural Science Foundation of China (NSFC) under Grant 62025404

National Key Research and Development Program of China under Grant 2023YFB4404400

Abstract

As Moore’s Law wanes, conventional monolithic chip design faces hurdles such as growing die sizes and sharply rising costs. In this post-Moore era, the integrated chip has emerged as a pivotal technology, attracting substantial interest from both academia and industry. Compared with monolithic chips, chiplet-based integrated chips can significantly improve system scalability, reduce costs, and shorten design cycles. However, integrated chips introduce vast design spaces spanning chiplets, inter-chiplet connections, and packaging parameters, which makes the design process more complex. This paper introduces the Optimal Decomposition-Combination Theory, a novel methodology to guide the decomposition and combination processes in integrated chip design. It also examines existing integrated chip design methodologies in detail to show how the theory applies.

Cite this article

FUPING LI, YING WANG, MEIXUAN LU, YUTONG ZHU, HAORAN WANG, ZHUN ZHAO, JUNPEI HUANG, XIAOTONG WEI, XIHAO LIANG, YUJIE WANG, HAOBO XU, HUAWEI LI, XIAOWEI LI, QI LIU, MING LIU, NINGHUI SUN, YINHE HAN. The Decomposition and Combination Paradigms of Chiplet-Based Integrated Chips[J]. Integrated Circuits and Systems, 2024, 1(1): 18-30. DOI: 10.23919/ICS.2024.3451428

