Co-Optimization for Large Language Models: Advances in Algorithm and Hardware
SHAOBO LUO, ALBERT YU, ZHIYUAN XIE, HONG HUANG, MINGQIANG HUANG, KAI LI, YUK KAN PUN, ZHIRU GUO, SHUWEI LI, YIMING ZHU, CHANGHAI MAN, HUIYUAN SUN, TUNG-HAN CHANG, ZIYI GUAN, QIYUAN ZHANG, TINGTING WANG, GUANQI PENG, WENJUN CHEN, YAN SUN, GENGXIN CHEN, MEI YAN, HAO YU
Precision medicine is transforming global healthcare by enabling personalized diagnostics, disease prediction, and tailored treatment strategies. While the integration of genomics and data science holds great potential to improve therapeutic outcomes, a critical challenge lies in translating gene sequencing data into actionable insights for in vitro diagnostics. This bottleneck is largely attributed to the limitations of edge-side intelligent processing and automation. Despite advances in gene sequencing technologies and bioinformatics tools, the workflow from sample collection to diagnostic report generation remains fragmented, inefficient, and lacking in intelligence. To address these challenges, we introduce an embodied LLM-based next-generation sequencing (NGS) sequencer on the edge for real-time, on-site smart genetic diagnostics. This instrument integrates a streamlined and comprehensive pipeline with deep learning networks for primary data analysis, machine learning for secondary data processing, and a large language model (LLM) optimized for tertiary data interpretation. The LLM is quantized and compressed, facilitating deployment on FPGA/GPU to accelerate diagnostic workflows. Experimental results show a 13.72% increase in throughput, a Q30 of 99.50%, and smart diagnostics on the edge at up to 75 tokens/s. This work enables immediate, on-site DNA analysis, improving the accessibility and efficiency of precision medicine, advancing diagnostic accuracy and automation, establishing a robust platform for AI-driven personalized medicine, and setting a new benchmark for future healthcare delivery.
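For readers unfamiliar with the Q30 metric quoted above: Q30 is conventionally the share of base calls whose Phred quality score is at least 30, i.e. an estimated error probability of at most 1 in 1000. The sketch below illustrates that definition only; it is not the authors' pipeline. It assumes Phred+33 ASCII-encoded quality strings as found in standard FASTQ files, and the function names `phred_scores` and `q30_fraction` are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into integer Phred scores.

    Assumes Phred+33 encoding (the common FASTQ convention).
    """
    return [ord(ch) - offset for ch in quality_string]


def q30_fraction(quality_strings, threshold=30, offset=33):
    """Return the fraction of bases with Phred score >= threshold."""
    total = 0
    passing = 0
    for qs in quality_strings:
        for q in phred_scores(qs, offset):
            total += 1
            if q >= threshold:
                passing += 1
    # Avoid division by zero when no bases are supplied.
    return passing / total if total else 0.0


# Illustrative input: 'I' decodes to Q40, '?' to Q30, '5' to Q20, '#' to Q2.
example_reads = ["II??", "55##"]
fraction = q30_fraction(example_reads)
```

Under this definition, a reported Q30 of 99.50% would mean that 99.50% of all called bases meet or exceed Q30.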