High-performance prediction models for prostate cancer radiomics
Published in: | Informatics in medicine unlocked 2023, Vol.37, p.101161, Article 101161 |
---|---|
Main Authors: | , , , , , , , , , , , , , , , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | When researchers are faced with building machine learning (ML) radiomic models, the first choice they have to make is what model to use. Naturally, the goal is to use the model with the best performance. But what is the best model? It is well known in ML that modern techniques such as gradient boosting and deep learning have better capacity than traditional models to solve complex problems in high dimensions. Despite this, most radiomics researchers still do not focus on these models in their research. As access to large, high-quality datasets increases, these high-capacity ML models may become even more relevant. In this article, we use a large dataset of 949 prostate cancer patients to compare the performance of a few of the most promising ML models for tabular data: gradient-boosted decision trees (GBDTs), multilayer perceptrons (MLPs), convolutional neural networks (CNNs), and transformers. To this end, we predict nine different prostate cancer pathology outcomes of clinical interest. Our goal is to give a rough overview of how these models compare against one another in a typical radiomics setting. We also investigate whether multitask learning improves the performance of these models when multiple targets are available. Our results suggest that GBDTs perform well across all targets, and that multitask learning does not provide a consistent improvement.
• Machine learning models are trained to predict pathological prostate cancer variables.
• Four model types are compared: gradient boosting, MLPs, transformers, and CNNs.
• Multitask learning is compared against regular training for all models.
• Gradient boosting with CatBoost outperforms the deep learning models.
• Multitask training only improved the MLP and CNN models. |
---|---|
ISSN: | 2352-9148 |
DOI: | 10.1016/j.imu.2023.101161 |