Comparing supervised and semi-supervised machine learning approaches in NTCP modeling to predict complications in head and neck cancer patients

Bibliographic Details
Published in:	Clinical and Translational Radiation Oncology, 2023-11, Vol. 43, Article 100677
Main Authors:	Spiero, I., Schuit, E., Wijers, O.B., Hoebers, F.J.P., Langendijk, J.A., Leeuwenberg, A.M.
Format:	Article
Language:	English
Subjects:	Head and neck cancer; NTCP modeling; Original; Radiation-induced toxicity; Semi-supervised learning

Description
Summary:	• Supervised and semi-supervised machine learning approaches to predict toxicities in head and neck cancer patients were compared.
• Similar performance of the models was observed in terms of both discrimination and calibration.
• Varying the amount of data for development or the confidence threshold did not impact the similarity in performance.
• The (supervised and semi-supervised) models involving ridge regression outperformed the logistic regression models for the dysphagia outcomes.

Head and neck cancer (HNC) patients treated with radiotherapy often suffer from radiation-induced toxicities. Normal Tissue Complication Probability (NTCP) modeling can be used to estimate the probability of developing these toxicities based on patient, tumor, treatment and dose characteristics. Since the currently used NTCP models are developed using supervised methods that discard unlabeled patient data, we assessed whether adding unlabeled patient data through semi-supervised modeling would improve predictive performance.

The semi-supervised method of self-training was compared to supervised regression methods with and without prior multiple imputation by chained equations (MICE). The models were developed for the most common toxicity outcomes in HNC patients, xerostomia (dry mouth) and dysphagia (difficulty swallowing), measured at six months after treatment, in a development cohort of 750 HNC patients. The models were externally validated in a validation cohort of 395 HNC patients. Model performance was assessed by discrimination and calibration.

MICE and self-training did not improve discrimination or calibration at external validation compared to current regression models. In addition, the relative performance of the different models did not change when the amount of (labeled) data available for model development was decreased. Models using ridge regression outperformed the logistic models for the dysphagia outcome.

Since adding unlabeled patient data through self-training or MICE yielded no apparent gain, supervised regression models remain the preferred choice in current NTCP modeling for HNC patients.
ISSN:	2405-6308
DOI:	10.1016/j.ctro.2023.100677
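
The self-training procedure named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not describe; it only shows the general idea of fitting a model on labeled cases, pseudo-labeling the unlabeled cases whose predicted probability passes a confidence threshold, and refitting. The function names (`fit_logistic`, `predict_proba`, `self_train`), the plain gradient-descent logistic regression, the optional `l2` ridge penalty, the default threshold of 0.9, and the toy data are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, l2=0.0, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent (illustrative only).
    l2 > 0 adds a ridge penalty on the weights, not on the intercept."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, x):
    # Predicted probability of the positive class for one case.
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10, l2=0.0):
    """Self-training: repeatedly pseudo-label confident unlabeled cases and refit."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = fit_logistic(X_train, y_train, l2=l2)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            p = predict_proba(model, x)
            if max(p, 1.0 - p) >= threshold:
                confident.append((x, 1 if p >= 0.5 else 0))
            else:
                rest.append(x)
        if not confident:
            break  # nothing passes the confidence threshold; stop early
        for x, label in confident:
            X_train.append(x)
            y_train.append(label)
        pool = rest
        model = fit_logistic(X_train, y_train, l2=l2)
    # Return the model, the number of pseudo-labeled cases, and the leftover pool.
    return model, len(X_train) - len(X_lab), pool

# Toy one-feature data (invented): four labeled and five unlabeled cases.
X_lab = [[-2.0], [-1.5], [1.5], [2.0]]
y_lab = [0, 0, 1, 1]
X_unlab = [[-3.0], [-2.5], [2.5], [3.0], [0.1]]

model, n_pseudo, leftover = self_train(X_lab, y_lab, X_unlab, threshold=0.9)
print("pseudo-labeled:", n_pseudo, "left unlabeled:", leftover)
```

Raising `threshold` makes the procedure more conservative about which unlabeled cases it absorbs, and setting `l2` above zero gives a ridge-penalized variant of the same loop.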