Comparative Analysis of Discrimination and Calibration Accuracy of Discrete Survival, Random Forests, and Neural Networks in Health-Related Survival Prediction Models

Contributors: Bere, Alphonce; Mulaudzi, Tshilidzi; Motsuku, Lactatia; Ramachela, Audrey Tshepho (author)
Date: 2025-09-05
Other record dates: 2025-11-07; 2025-11-07
Citation: Ramachela, A.T. 2025. Comparative Analysis of Discrimination and Calibration Accuracy of Discrete Survival, Random Forests, and Neural Networks in Health-Related Survival Prediction Models.
URI: https://univendspace.univen.ac.za/handle/11602/3028
Degree: MSc in Statistics
Department: Department of Mathematical and Computational Sciences
Institution: University of Venda
Type: Dissertation
Extent: 1 online resource (xii, 90 leaves)
Language: en
Keywords: Discrete-time survival analysis; UCTD; Statistical methods; Machine learning; Calibration; Discrimination

Abstract

Prediction models for survival analysis are commonly used in the biomedical sciences to understand the onset of certain diseases. Traditional statistical models have been used for many years; however, their limitations, including an inability to handle large data sets, have made way for machine learning methods, which have gained recognition for their ability to learn complex relationships in data. Existing literature nevertheless indicates that the predictive accuracy of machine learning and statistical models for survival analysis varies significantly across data sets. This variability underscores the need for further research using data sets with diverse characteristics, which is essential for developing generalizable insights into the conditions under which each method performs best.

In this research project, we compared the predictive performance of a traditional statistical method and machine learning algorithms in discrete-time survival analysis. The machine learning methods were discrete-time survival trees, discrete-time random survival forests, and discrete-time neural networks. The study uses calibration (measured by prediction error curves) to assess model fit and discrimination (measured by the concordance index and the area under the curve) to evaluate predictive accuracy. The methods were applied to three data sets: breast cancer, age at first alcohol intake, and CRASH-2.

The discrete-time neural network had the best predictive performance of all the models for breast cancer survival. The discrete-time random forest with Hellinger distance had the best overall predictive performance on the age at first alcohol intake data. The discrete-time survival model outperformed the other models in predicting the survival of bleeding trauma patients in the CRASH-2 data.
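
The abstract names the concordance index as one of its discrimination measures. As a rough illustration only, and not the dissertation's own implementation, the following is a minimal Python sketch of a Harrell-style concordance index for right-censored data. The function name `concordance_index` and its arguments are hypothetical, and pairs with tied event times are simply skipped, which is a simplifying assumption (ties are common in discrete-time data and are usually handled more carefully).

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell-style C-index sketch (hypothetical helper, not from the source).

    times  : observed time for each subject
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk score; higher means the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Assumption: skip tied times, and skip pairs where the earlier
        # subject was censored, because their ordering is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    times = [1, 2, 3, 4]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.1, 0.4, 0.7]
    print(concordance_index(times, events, risks))
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means every comparable pair is ranked in reverse.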