诊断学理论与实践 (Journal of Diagnostics Concepts & Practice), 2024, Vol. 23, Issue (05): 484-493. doi: 10.16150/j.1671-2870.2024.05.004

• Original Article •

Study on the recognition of early-stage Parkinson’s disease patients using functional near-infrared spectroscopy signals based on machine learning

YU Jin, WANG Jie, WANG Hujun, WANG Congxiao, LI Yingqi, FANG Boyan, WANG Yingpeng

  1. Beijing Rehabilitation Hospital, Capital Medical University, Beijing 100144, China
  • Received: 2023-12-04 Accepted: 2024-06-07 Published: 2024-10-25 Online: 2025-02-25
  • Corresponding author: WANG Yingpeng, E-mail: ypwang@ccmu.edu.cn
  • Funding:
    Beijing Rehabilitation Hospital intramural research project (2022-029)

Abstract:

Objective: To investigate the feasibility of diagnosing early-stage Parkinson’s disease (PD) by combining functional near-infrared spectroscopy (fNIRS) signals with machine learning algorithms. Methods: Sixty patients with PD diagnosed at Beijing Rehabilitation Hospital, Capital Medical University, between December 2021 and August 2023 and 60 healthy controls were consecutively enrolled. A 22-channel (CH) ETG-4000 near-infrared brain imaging system was used to record changes in oxyhemoglobin and deoxyhemoglobin concentrations in the prefrontal cortex. A general linear model was applied to calculate the degree of activation (β value) for each channel. Four machine learning diagnostic models were built: support vector machine (SVM), back-propagation (BP) neural network, random forest, and logistic regression. Their performance was evaluated by accuracy, sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve. In addition, SHapley Additive exPlanations (SHAP) analysis was used to improve the interpretability of the optimal model: SHAP values were calculated for each channel and aggregated across channels by weighted averaging, and the result was combined with the distribution of channels across brain regions to obtain the proportional contribution of each brain region to the classification task. Results: Across the four diagnostic models, accuracy ranged from 81% to 90%, sensitivity from 69% to 89%, specificity from 93% to 100%, and the area under the ROC curve from 0.90 to 0.98. The SVM model performed best overall, with an area under the ROC curve of 0.96, accuracy of 90%, sensitivity of 89%, and specificity of 93%. SHAP analysis showed that the four channels contributing most to the SVM model were CH08, CH05, CH01, and CH13, and that the right frontopolar cortex (FPC) accounted for the largest share of the total contribution (36.5%). Conclusions: The model based on fNIRS signals and the SVM algorithm shows diagnostic advantages in identifying patients with early-stage PD, with sensitivity (89%) and specificity (93%) exceeding those of most existing methods. Future research should focus on the fNIRS signal characteristics of the right frontopolar cortex and the dorsolateral prefrontal cortex to further improve the performance of the diagnostic model.
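The four evaluation measures named in the Methods can be illustrated with a short sketch. The abstract does not describe the authors' evaluation code, so everything below is an assumption for illustration: the function names are hypothetical, label 1 stands for PD and 0 for healthy control, the 0.5 decision threshold is arbitrary, and the labels and scores are toy values rather than study data. The area under the ROC curve is computed here through its rank interpretation (the probability that a randomly chosen patient receives a higher score than a randomly chosen control).

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp), treating label 1 as PD and 0 as healthy control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true-positive rate) and specificity (true-negative rate)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (patient, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores, invented for illustration only (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # arbitrary 0.5 threshold

metrics = diagnostic_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarises ranking quality across all thresholds, which is one reason a model can lead on one measure without leading on the other.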

Key words: Parkinson’s disease, Functional near-infrared spectroscopy, Machine learning, Diagnostic model
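The step from per-channel SHAP values to region-level contribution shares can be sketched as follows. The abstract does not specify the weighting scheme, so this is one plausible reading under stated assumptions, not the authors' procedure: channel importance is taken as the mean absolute SHAP value across subjects, and a region's share is the sum of its channels' importances divided by the total. The function names, the region labels, the channel-to-region mapping, and the SHAP values themselves are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def channel_importance(shap_values: Dict[str, List[float]]) -> Dict[str, float]:
    """Mean absolute SHAP value per channel across subjects (assumed importance measure)."""
    return {ch: sum(abs(v) for v in vals) / len(vals) for ch, vals in shap_values.items()}


def region_shares(importance: Dict[str, float], channel_to_region: Dict[str, str]) -> Dict[str, float]:
    """Sum channel importances within each region and normalise to shares of the total."""
    totals: Dict[str, float] = defaultdict(float)
    for ch, imp in importance.items():
        totals[channel_to_region[ch]] += imp
    grand_total = sum(totals.values())
    return {region: value / grand_total for region, value in totals.items()}


# Toy per-subject SHAP values, invented for illustration only (not study data).
shap_values = {
    "CH01": [0.2, -0.2, 0.2],
    "CH05": [0.3, 0.3, -0.3],
    "CH08": [-0.4, 0.4, 0.4],
    "CH13": [0.1, -0.1, 0.1],
}
# Hypothetical mapping; the real assignment depends on probe placement and registration.
channel_to_region = {
    "CH01": "region_A",
    "CH05": "region_B",
    "CH08": "region_B",
    "CH13": "region_A",
}

importance = channel_importance(shap_values)
shares = region_shares(importance, channel_to_region)
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking, shares)
```

Taking absolute values before averaging keeps positive and negative attributions from cancelling, so a channel that pushes predictions strongly in both directions still registers as influential.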
