Abstract
Purpose
Methods and Materials
Results
Conclusions
Introduction
Methods and Materials

Study cohort and participants
Seibold P, Webb A, Aguado-Barrera ME, Azria D, Bourgier C, Brengues M, et al. A prospective multicentre cohort study of patients undergoing radiotherapy for breast, lung or prostate cancer.
Endpoint definition
Variable selection, imputation, and preprocessing
Resampling
Modeling
Results
REQUITE breast cancer cohort | |
---|---|
Eligible patients | 2059 |
Location | Western Europe, United States |
Study design | Prospective cohort |
Recruitment year (range) | 2014-2016 |
Treatment year (range) | 2014-2016 |
Toxicity assessment scale | CTCAE v4.0 |
Toxicity assessment time points | Start-of-RT, End-of-RT |
Age (median, range) | 58 (23-90) |
Whole breast dose (Gy, median, range) | 50 (28.5-56) |
Whole breast fractions (median, range) | 25 (5-31) |
Hypofractionated regimen (proportion of patients) | 47.9% |
IMRT, simple field-in-field | 39.7% |
IMRT, complex modulated | 9.8% |
RT to axilla | 11.9% |
RT to supraclavicular fossa | 12.8% |
Boost | 67.8% |
BMI ≥25 | 54.0% |
Smoker (current or previous) | 42.7% |
Chemotherapy | 31.0% |
Diabetes | 6.1% |
Hypertension | 28.0% |
Cardiovascular disease | 6.9% |
Toxicity (end of treatment) | |
Ulceration | |
Grade 0 | 1868 (91.2%) |
Grade ≥1 | 181 (8.8%) |
Dermatitis | |
Grade 0 | 257 (12.5%) |
Grade 1 | 1288 (62.6%) |
Grade 2 | 462 (22.4%) |
Grade 3 | 28 (1.4%) |
Acute desquamation | |
Ulceration ≥G1 or dermatitis ≥G3 | 192 (9.3%) |
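
The last row of the table defines the binary endpoint (acute desquamation) as ulceration of grade 1 or higher, or dermatitis of grade 3 or higher. A minimal sketch of that rule follows. The function name and the example patients are hypothetical and are not taken from the REQUITE data; only the thresholds and the counts 192 and 2059 come from the table.

```python
def acute_desquamation(ulceration_grade: int, dermatitis_grade: int) -> bool:
    """Binary endpoint from the table: ulceration >= G1 or dermatitis >= G3."""
    return ulceration_grade >= 1 or dermatitis_grade >= 3


# Hypothetical patients as (ulceration grade, dermatitis grade) pairs.
patients = [(0, 1), (1, 0), (0, 3), (0, 2), (2, 3)]
labels = [acute_desquamation(u, d) for u, d in patients]

# Prevalence implied by the table: 192 events among 2059 eligible patients.
reported_rate = 192 / 2059
print(labels)
print(f"{reported_rate:.1%}")
```

The resulting prevalence of roughly 9% shows how imbalanced the two classes are, which is relevant when reading the sensitivity and specificity values reported for the classifiers.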
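Classifier | Training in ITD (n = 1029): Specificity (TNR) | Training in ITD (n = 1029): Sensitivity (TPR) | Training in ITD (n = 1029): AUC | Validation in VD (n = 1029): Specificity (TNR) | Validation in VD (n = 1029): Sensitivity (TPR) | Validation in VD (n = 1029): AUC | Rank |
---|---|---|---|---|---|---|---|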
(K = 1) NN | 0.908 | 0.167 | 0.548 | 0.923 | 0.292 | 0.607 | 9 |
(K = 3) NN | 0.975 | 0.094 | 0.601 | 0.979 | 0.125 | 0.627 | 8 |
(K = 5) NN | 0.985 | 0.042 | 0.624 | 0.989 | 0.063 | 0.651 | 6 |
(K = 7) NN | 0.996 | 0.031 | 0.648 | 0.998 | 0.052 | 0.644 | 7 |
(K = 9) NN | 0.999 | 0.031 | 0.660 | 0.999 | 0.042 | 0.665 | 5 |
ANN | 0.945 | 0.198 | 0.694 | 0.953 | 0.177 | 0.676 | 4 |
C4.5 | 0.985 | 0.083 | 0.575 | 0.979 | 0.125 | 0.496 | 12 |
LMT | 0.996 | 0.010 | 0.578 | 0.995 | 0.042 | 0.746 | 1 |
LR | 0.910 | 0.188 | 0.567 | 0.959 | 0.135 | 0.596 | 10 |
NB | 0.810 | 0.438 | 0.697 | 0.833 | 0.500 | 0.737 | 3 |
SVM | 0.966 | 0.156 | 0.561 | 0.976 | 0.146 | 0.561 | 11 |
RF | 0.998 | 0.021 | 0.725 | 0.999 | 0.010 | 0.742 | 2 |
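
The Rank column appears to order the classifiers by their validation AUC, highest first. The sketch below checks that reading against the values transcribed from the table. The helper name `rank_by_auc` is hypothetical, and the ranking rule is an inference from the table rather than a stated method.

```python
# Validation AUC per classifier, transcribed from the table above.
validation_auc = {
    "(K = 1) NN": 0.607, "(K = 3) NN": 0.627, "(K = 5) NN": 0.651,
    "(K = 7) NN": 0.644, "(K = 9) NN": 0.665, "ANN": 0.676,
    "C4.5": 0.496, "LMT": 0.746, "LR": 0.596,
    "NB": 0.737, "SVM": 0.561, "RF": 0.742,
}

# Rank column as reported in the table.
reported_rank = {
    "(K = 1) NN": 9, "(K = 3) NN": 8, "(K = 5) NN": 6,
    "(K = 7) NN": 7, "(K = 9) NN": 5, "ANN": 4,
    "C4.5": 12, "LMT": 1, "LR": 10,
    "NB": 3, "SVM": 11, "RF": 2,
}


def rank_by_auc(aucs: dict) -> dict:
    """Assign rank 1 to the highest AUC, rank 2 to the next, and so on."""
    ordered = sorted(aucs, key=aucs.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


ranks = rank_by_auc(validation_auc)
print(ranks == reported_rank)
```

Because AUC ignores the decision threshold, a classifier can rank well here while still showing very low sensitivity at its default threshold, as several rows of the table do.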

Model selection and feature filtering

Model's feature | MDI | Model's feature | MDI |
---|---|---|---|
other_lipid_lowering_drugs_duration_yrs | 0.52 | alcohol_current_consumption | 0.2 |
surgery_type | 0.41 | smoking_time_since_quitting_yrs | 0.2 |
radio_bolus | 0.4 | radio_imrt | 0.19 |
chemotherapy | 0.36 | radio_photon_boostdose_Gy | 0.19 |
boost | 0.35 | other_antihypertensive_drug | 0.19 |
radio_photon_dose_MV | 0.34 | household_members | 0.19 |
epirubicin_chemo_drug | 0.34 | radio_breast_fractions_dose_per_fraction_Gy | 0.19 |
blood_pressure | 0.33 | radio_elec_boost_field_y_cm | 0.19 |
Bra_band_size | 0.3 | radio_photon_2nd | 0.19 |
radio_treated_breast | 0.3 | bra_cup_size | 0.19 |
tumour_size_mm | 0.29 | radio_breast_fractions | 0.19 |
paclitaxel_chemo_drug | 0.29 | n_stage | 0.18 |
grade_invasive | 0.28 | hypertension_duration_yrs | 0.18 |
breast_separation | 0.28 | radio_supraclavicular_fossa | 0.18 |
smoking | 0.27 | education_profession | 0.18 |
radio_elec_energy_MeV | 0.27 | radio_axillary_levels | 0.18 |
BED_boost | 0.27 | hypertension | 0.18 |
docetaxel_chemo_drug | 0.27 | radio_photon_boost_fractions_per_week | 0.17 |
BED_Total | 0.27 | smoker | 0.17 |
radio_elec_boost_dose_Gy | 0.27 | depression | 0.17 |
On_tamoxifen | 0.26 | menopausal_status | 0.17 |
radio_heart_mean_dose_Gy | 0.26 | radio_boost_diameter_cm | 0.16 |
t_stage | 0.26 | 5-fluorouracil (5-FU)_chemo_drug | 0.16 |
radio_hot_spots_107 | 0.25 | radio_photon_boost_dose_per_fraction_Gy | 0.16 |
BED_Breast | 0.25 | antidepressant_duration_yrs | 0.16 |
tobacco_products_per_day | 0.25 | radio_breast_fractions_per_week | 0.15 |
age_at_radiotherapy_start_yrs | 0.25 | radio_boost_type | 0.15 |
radio_breast_ct_volume_cm3 | 0.25 | Carboplatin_chemo_drug | 0.15 |
hormone_replacement_therapy | 0.24 | radio_boost_sequence | 0.15 |
radio_photon_boost_volume_cm3 | 0.24 | radio_photon_boost_fractions | 0.15 |
antidepressant | 0.24 | household_income | 0.15 |
height_cm | 0.24 | methotrexate_chemo_drug | 0.15 |
radio_photon_2nd_energy_MV | 0.24 | other_lipid_lowering_drugs | 0.14 |
radio_ipsilateral_lung_mean_Gy | 0.24 | radio_photon_energy_MV or kV | 0.14 |
alcohol_previous_consumption | 0.24 | ace_inhibitor | 0.13 |
radio_photon_2nd_dose_fractions_per_week | 0.23 | analgesics_duration_yrs | 0.13 |
radio_skin_max_dose_Gy | 0.23 | radio_photon_2nd_dose_per_fraction_Gy | 0.13 |
histology | 0.23 | antidiabetic_duration_yrs | 0.13 |
monopause_age_yrs | 0.23 | depression_duration_yrs | 0.13 |
other_antihypertensive_drug_duration_yrs | 0.23 | on_statin_duration_yrs | 0.12 |
weight_at_cancer_diagnosis_kg | 0.23 | antidiabetic | 0.12 |
tobacco_product | 0.23 | diabetes | 0.11 |
cyclophosphamide_chemo_drug | 0.22 | ace_inhibitor_duration_yrs | 0.11 |
combined_chemo_drugs | 0.22 | on_statin | 0.11 |
boost_frac | 0.22 | doxorubicin_chemo_drug | 0.11 |
analgesics | 0.22 | history_of_heart_disease | 0.09 |
breast_cancer_family_history_1st_degree | 0.22 | radio_axillary_other | 0.09 |
smoking_duration_yrs | 0.21 | ethnicity | 0.09 |
radio_photon_boostdose_precise_Gy | 0.21 | radio_interrupted | 0.08 |
radio_elec_boost_field_x_cm | 0.21 | pegfilgrastim_chemo_drug | 0.07 |
radio_photon_2nd_fractions | 0.21 | history_of_heart_disease_duration_yrs | 0.06 |
radio_boost_fractions | 0.21 | radiotherapy_toxicity_family_history | 0.06 |
alcohol_intake | 0.21 | diabetes_duration_yrs | 0.05 |
radio_type_imrt | 0.21 | radio_interrupted_days | 0.05 |
radio_treatment_pos | 0.21 | trastuzumab_chemo_drug | 0.04 |
radio_breast_dose_Gy | 0.2 | other_collagen_vascular_disease | 0.03 |
rheumatoid arthritis_duration_yrs | 0.2 | rheumatoid arthritis | 0.02 |
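
MDI (mean decrease in impurity) credits a feature with the impurity reduction achieved by every tree split that uses it, weighted by the share of samples reaching that split. The sketch below computes that contribution for a single split using Gini impurity. It is an illustration under assumptions: the toy labels are hypothetical, Gini is assumed as the impurity measure, and the code does not reproduce the software or the scaling behind the values in the table.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right, n_total):
    """Weighted impurity decrease contributed by one split of `parent`."""
    n = len(parent)
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return (n / n_total) * (gini(parent) - children)


# Hypothetical root node: 10 patients, 1 = toxicity event, 0 = no event.
parent = [0] * 6 + [1] * 4
left = [0] * 5 + [1] * 1    # patients sent to the left child
right = [0] * 1 + [1] * 3   # patients sent to the right child

decrease = impurity_decrease(parent, left, right, n_total=len(parent))
print(round(decrease, 4))
```

A feature's MDI is then obtained by summing such contributions over all splits that use the feature and averaging across the trees of the forest.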
Discussion
Study limitations
Conclusion
Acknowledgments
Appendix. Supplementary materials
References
- Early and Locally Advanced Breast Cancer: Diagnosis and Management. National Institute for Health and Clinical Excellence, London, 2018
- Effect of radiotherapy after breast-conserving surgery on 10-year recurrence and 15-year breast cancer death: Meta-analysis of individual patient data for 10 801 women in 17 randomised trials. The Lancet. 2011; 378: 1707-1716
- Cancer Survival in England: Patients Diagnosed Between 2010 and 2014 and Followed Up to 2015. Office for National Statistics, London, 2016
- Priorities for Research on Cancer Survivorship. Department of Health, London, 2010
- Tolerance of normal tissue to therapeutic irradiation. Int J Radiat Oncol Biol Phys. 1991; 21: 109-122
- Patient-to-patient variability in the expression of radiation-induced normal tissue injury. Sem Radiat Oncol. 1994; 4: 68-80
- A longitudinal study of symptoms and self-care activities in women treated with primary radiotherapy for breast cancer. Cancer Nurs. 2005; 28: 210-218
- Postmastectomy radiation therapy and immediate autologous breast reconstruction: Integrating perspectives from surgical oncology, radiation oncology, and plastic and reconstructive surgery. J Surg Oncol. 2015; 111: 251-257
- Personal characteristics, therapy modalities and individual DNA repair capacity as predictive factors of acute skin toxicity in an unselected cohort of breast cancer patients receiving radiotherapy. Radiother Oncol. 2003; 69: 145-153
- Impact of radiation therapy on acute toxicity in breast conservation therapy for early breast cancer. Clin Oncol. 2004; 16: 12-16
- Hypofractionated radiotherapy after conservative surgery for breast cancer: Analysis of acute and late toxicity. Radiat Oncol. 2010; 5: 112
- The Cambridge Breast Intensity-Modulated Radiotherapy trial: Patient- and treatment-related factors that influence late toxicity. Clin Oncol. 2011; 23: 662-673
- Factors of influence on acute skin toxicity of breast cancer patients treated with standard three-dimensional conformal radiotherapy (3D-CRT) after breast conserving surgery (BCS). Radiat Oncol. 2012; 7: 217
- Common variants of eNOS and XRCC1 genes may predict acute skin toxicity in breast cancer patients receiving radiotherapy after breast conserving surgery. Radiother Oncol. 2012; 103: 199-205
- Smoking as an independent risk factor for severe skin reactions due to adjuvant radiotherapy for breast cancer. The Breast. 2013; 22: 634-638
- Standard or hypofractionated radiotherapy in the postoperative treatment of breast cancer: A retrospective analysis of acute skin toxicity and dose inhomogeneities. BMC Cancer. 2013; 13: 230
- Toxicity and cosmetic outcome of hypofractionated whole-breast radiotherapy: Predictive clinical and dosimetric factors. Radiat Oncol. 2014; 9: 97
- Factors modifying the risk for developing acute skin toxicity after whole-breast intensity modulated radiotherapy. BMC Cancer. 2014; 14: 711
- Pitfalls in prediction modeling for normal tissue toxicity in radiation therapy: An illustration with the individual radiation sensitivity and mammary carcinoma risk factor investigation cohorts. Int J Radiat Oncol Biol Phys. 2016; 95: 1466-1476
- Incorporating spatial dose metrics in machine learning-based normal tissue complication probability (NTCP) models of severe acute dysphagia resulting from head and neck radiotherapy. Clin Transl Radiat Oncol. 2018; 8: 27-39
- Machine learning on a genome-wide association study to predict late genitourinary toxicity after prostate radiation therapy. Int J Radiat Oncol Biol Phys. 2018; 101: 128-135
- Quantitative thermal imaging biomarkers to detect acute skin toxicity from breast radiation therapy using supervised machine learning. Int J Radiat Oncol Biol Phys. 2020; 106: 1071-1083
- Applying a machine learning approach to predict acute toxicities during radiation for breast cancer patients. Int J Radiat Oncol Biol Phys. 2018; 102: S59
- External validation of a predictive model for acute skin radiation toxicity in the REQUITE breast cohort. Front Oncol. 2020; 10: 575909
- Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. Br J Surg. 2015; 102: 148-158
- The REQUITE project: validating predictive models and biomarkers of radiotherapy toxicity to reduce side-effects and improve quality of life in cancer survivors. Clin Oncol (R Coll Radiol). 2014; 26: 739-742
- A prospective multicentre cohort study of patients undergoing radiotherapy for breast, lung or prostate cancer. Radiother Oncol. 2019; 138: 59-67
- Common Terminology Criteria for Adverse Events V4.0. National Cancer Institute: NIH, 2009
- Complexity of equivalence class and boundary value testing methods. Int J Comput Sci Inform Techn. 2009; 751: 80-101
- Techniques for dealing with missing values in classification. in: Liu X, Cohen P, Berthold M. International Symposium on Intelligent Data Analysis. Springer, 1997: 527-536
- A decision tree-based missing value imputation technique for data pre-processing. Proc Ninth Australasian Data Min Conf. 2011; 121: 41-50
- Improved use of continuous attributes in C4.5. J Artif Intell Res. 1996; 4: 77-90
- About feature scaling and normalization and the effect of standardization for machine learning algorithms. Political Leg Anthropology Rev. 2014; 30: 67-89
- Data mining for imbalanced datasets: An overview. Data Mining And Knowledge Discovery Handbook. Springer, Boston, MA, 2009: 875-886
- A survey of predictive modeling on imbalanced domains. ACM Comput Surv. 2016; 49: 1-50
- SMOTE: Synthetic minority over-sampling technique. J Artif Intell Res. 2002; 16: 321-357
- Targeted projection pursuit tool for gene expression visualisation. J Integrat Bioinform. 2006; 3: 264-273
- Estimation of prediction error by using K-fold cross-validation. Stat Comput. 2011; 21: 137-146
- Discrete Bayesian network classifiers: A survey. ACM Comput Surv. 2014; 47: 1-43
- The multilayer perceptron as an approximation to a Bayes optimal discriminant function. IEEE Trans Neural Net. 1990; 1: 296-298
- Cunningham P, Delany SJ. K-nearest neighbour classifiers. arXiv preprint arXiv:2004.04523. 2020
- Logistic model trees. Mach Learn. 2005; 59: 161-205
- Random decision forests. in: Proc Third Int Conf Doc Analysis and Recog. 1995: 278-282
- Imbalanced Classification with Python: Better Metrics, Balance Skewed Classes, Cost-Sensitive Learning. Machine Learning Mastery, 2020
- The foundations of cost-sensitive learning. in: Int Joint Conf Artif Intell. 2001: 973-978
- Understanding variable importances in forests of randomized trees. Adv Neur Inform Process Syst. 2013; 26: 431-439
- Weka: The Waikato environment for knowledge analysis. in: Proc New Zeal Comput Sci Res Student Conf. 1995: 57-64
- Decision tree analysis using weka. Machine Learning-Project II. 2012: 1-3
- Improvements to Platt's SMO algorithm for SVM classifier design. Neur Comput. 2001; 13: 637-649
- Potential use of HMG-CoA reductase inhibitors (statins) as radioprotective agents. Br Med Bull. 2011; 97: 17-26
- Data Preparation for Data Mining. Morgan Kaufmann, Burlington, MA, 1999
- More A. Survey of resampling techniques for improving classification performance in unbalanced datasets. arXiv preprint arXiv:1608.06048. 2016
- Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformat. 2007; 8: 25
Article Info
Footnotes
Sources of support: This research collaboration was formed by the UK Radiotherapy Machine Learning Network (RTML) funded through the Advanced Radiotherapy Challenge+ by the Science and Technology Facilities Council (STFC). The REQUITE study received funding from the European Union's 7th Framework Programme for research, technological development and demonstration under grant agreement no. 601826. The research was supported by the Quintin Hogg Trust research award no. 165435391. The workshops were hosted by the University of Manchester and the Health and Innovation Ecosystem at the University of Westminster. Dr Alison M. Dunning was supported by Cancer Research UK (C8197/A16565). Dr Sara Gutiérrez-Enríquez is supported by the ISCIII Miguel Servet II Program (CP16/00034). Dr Tim Rattay is currently an NIHR Clinical Lecturer (CL 2017-11-002). He was previously funded by a National Institute for Health Research (NIHR) Doctoral Research Fellowship (DRF 2014-07-079). This publication presents independent research funded by the NIHR. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. Dr Leila Shelley reports a grant from the Chief Scientist Office (CSO) Scotland (TCS/17/26 - CSO Award). Dr Petra Seibold is supported by the ERA-Net ERAPerMed / German Federal Ministry of Education and Research (BMBF) as well as the German Federal Office for Radiation Protection (BfS). Dr Elena Sperk was previously supported by the Ministry of Science and Arts of the State of Baden-Württemberg (2017-19) through the Brigitte-Schlieben-Lange-Programme. Dr Ana Vega is supported by Spanish Instituto de Salud Carlos III (ISCIII) funding, an initiative of the Spanish Ministry of Economy and Innovation partially supported by European Regional Development FEDER Funds (INT15/00070, INT16/00154, INT17/00133, INT20/00071, PI19/01424, PI16/00046, PI13/02030, PI10/00164), and through the Autonomous Government of Galicia (Consolidation and structuring program: IN607B). Prof Catharine West is supported by Cancer Research UK (C1094/A18504, C147/A25254) and by the NIHR Manchester Biomedical Research Centre.
Disclosures: Prof David Azria has been involved in the creation of the start-up NovaGray in 2015. Prof Dirk de Ruysscher: none related to the current manuscript. Outside the current manuscript: advisory board of AstraZeneca, Bristol-Myers-Squibb, Roche/Genentech, Merck/Pfizer, Celgene, Noxxon, Mologen; has received investigator-initiated grants from Bristol-Myers-Squibb, Boehringer Ingelheim and AstraZeneca. Dr Elena Sperk: none related to the current manuscript. Outside the current manuscript: general speakers bureau Zeiss Meditec, travel support Zeiss Meditec. The other authors have no conflicts of interest to disclose.
Data sharing statement: Research data are stored in an institutional repository and will be shared upon request to the corresponding author and the REQUITE consortium.
User License
Creative Commons Attribution (CC BY 4.0)