外科理论与实践, 2025, Vol. 30, Issue (04): 316-324. doi: 10.16139/j.1007-9610.2025.04.05

• Original Article •

Construction and validation of a machine learning-based prediction model for very early recurrence after curative-intent resection for gallbladder cancer

TANG Zhenqi, LI Qi, LIU Hengchao, ZHANG Dong, GENG Zhimin

  1. Department of Hepatobiliary Surgery, the First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, Shaanxi, China
  • Received: 2025-05-25 Published: 2025-07-25 Online: 2025-10-23
  • Corresponding author: GENG Zhimin, E-mail: gengzhimin@mail.xjtu.edu.cn
  • Funding: National Natural Science Foundation of China (62076194); Key Research and Development Program of Shaanxi Province (2025SF-YBXM-386); Institutional Foundation of the First Affiliated Hospital of Xi’an Jiaotong University (2024-QN-015)

Abstract:

Objective To identify risk factors for very early recurrence (VER) after curative-intent resection in patients with gallbladder cancer (GBC) and to construct prediction models for postoperative VER using several machine learning (ML) algorithms. Methods The clinicopathological data of 329 patients with GBC who underwent curative-intent resection at our hospital between January 2016 and December 2020 were analyzed retrospectively. Risk factors for VER were identified, and prediction models based on the factors independently associated with VER were constructed with six ML algorithms [logistic regression (LR), support vector machine (SVM), naive Bayes (NB), random forest (RF), light gradient boosting machine (LGB), and extreme gradient boosting (XGB)]; the models were then validated and their performance compared. Results Among the 329 patients with GBC who underwent curative-intent resection, 162 (49.2%) experienced recurrence, including 69 (42.6%) with VER (<6 months) and 93 (57.4%) with non-VER (≥6 months). Survival analysis showed that median overall survival was significantly shorter in patients with VER than in those with non-VER (6 months vs. not reached, χ²=398.2, P<0.001). Univariate analysis showed that carcinoembryonic antigen (CEA), carbohydrate antigen (CA) 19-9, CA-125, tumor differentiation, pathological type, liver invasion, vascular invasion, perineural invasion, TNM stage, T stage, and N stage were risk factors for VER (P<0.05), whereas postoperative adjuvant chemotherapy was a protective factor (P<0.05). Multivariate analysis showed that CA-125, tumor differentiation, pathological type, vascular invasion, and N stage were independent risk factors for VER (P<0.05), whereas postoperative adjuvant chemotherapy was an independent protective factor (P<0.05). XGB achieved the best predictive performance in the validation set, with an area under the curve (AUC) of 0.841 and an accuracy (ACC) of 83.0%. Shapley additive explanations (SHAP) bar plots showed that tumor differentiation, N stage, pathological type, and CA-125 carried the highest predictive weights, each contributing positively to the predicted probability of VER. Conclusions CA-125, tumor differentiation, pathological type, vascular invasion, N stage, and postoperative adjuvant chemotherapy are independently associated with VER after curative-intent resection for GBC. ML-based prediction models built on these factors can, to some extent, identify patients at high risk of postoperative VER and provide a useful reference for VER surveillance in GBC.
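
The abstract reports an AUC and an accuracy for the validation set but does not describe how they were computed. As an illustration only, the following minimal Python sketch shows how these two metrics are commonly defined for a binary outcome such as VER. The function names, the 0.5 classification threshold, and the toy data are assumptions made for this example and are not taken from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Share of cases whose thresholded prediction matches the observed label."""
    correct = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a random negative case.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = VER, 0 = no VER.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(f"AUC = {roc_auc(y_true, y_prob):.3f}")   # AUC = 0.933
print(f"ACC = {accuracy(y_true, y_prob):.3f}")  # ACC = 0.750
```

AUC depends only on how the model ranks patients, whereas accuracy depends on the chosen probability threshold, so the two figures can diverge when the outcome classes are imbalanced.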

Key words: Gallbladder cancer (GBC), Very early recurrence (VER), Prognosis, Machine learning, Prediction model

CLC number: