An Adaptive Parallel Layer-Skipping Framework for Large Language Model Inference Speedup With Speculative Decoding
ZHE WEN, LIANG XU, MEIQI WANG
Integrated Circuits and Systems, 2025, Vol. 2, Issue 2: 58-66.