Title: Predictive modelling of student progression at the University of Venda using statistical and machine learning techniques
Author: Muthundinne, Phindulo Pretty
Contributors: Bere, Alphonce; Mulaudzi, Tshilidzi B.
Dates: 2025-09-05; 2025-09-15
Type: Dissertation
Degree: M.Sc. (Statistics)
Department: Department of Mathematical and Computational Science
Institution: University of Venda
Language: en
Extent: 1 online resource (xiv, 121 leaves): color illustrations
Keywords: Dropout; Graduation; Machine Learning; Survival Analysis; UCTD
URI: https://univendspace.univen.ac.za/handle/11602/2936
Citation: Muthundinne, P.P. 2025. Predictive modelling of student progression at the University of Venda using statistical and machine learning techniques. https://univendspace.univen.ac.za/handle/11602/2936

Abstract:

One of the challenges facing higher education is the steadily rising number of university dropouts. Survival analysis has long been used to study student dropout, and in developed countries machine learning methods have gained increasing attention for the same problem. The main motivation for this study is the lack of work that applies both discrete-time statistical methods and discrete-time machine learning methods to student academic outcomes. This study built a discrete-time competing risks model and discrete-time machine learning models for the time from first registration until graduation or dropout for students at the University of Venda. The two approaches were compared in terms of calibration and discrimination to determine which performs better. The methodology applied statistical methods (discrete-time survival models for a single risk and for competing risks) and machine learning models (classification trees for competing risks) using the R statistical software. For the competing risks models, time intervals 3 to 6 were considered, since graduation first becomes possible in the third year. The Brier score and the C-index were used to evaluate the models. Both the discrete cause-specific model and the decision tree for competing risks showed high discrimination ability for student progression. However, the decision tree appeared to be the better of the two, since its C-index is higher.

The results showed that male students are more likely to drop out and less likely to graduate than female students. Students with an average mark of 70 or above have 48.2% higher odds of graduating than those with an average below 50. Students in the Faculty of Human and Social Sciences (HSS) are less likely to drop out than those in the Faculty of Science, Engineering and Agriculture (FSEA). However, HSS students do not differ significantly from FSEA students in their odds of graduating (OR = 0.904, SE = 0.073, 95% CI 0.784 to 1.042, p = 0.165). The Faculty of Commerce, Management, and Law (FMCL) does not differ significantly from FSEA in either dropout (p = 0.766) or graduation (p = 0.072). Older students are more likely to drop out than younger ones. The study suggests that a decision tree model is more effective than standard approaches for analysing student dropout and academic results, and recommends its use for analysing academic outcomes. Interventions to reduce dropout rates and shorten the time from first registration to graduation should target the identified high-risk groups, such as male and older students.
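The figures reported in the abstract for the HSS versus FSEA graduation comparison (OR = 0.904, SE = 0.073) can be roughly cross-checked by hand. The sketch below is in Python rather than the R used in the thesis. It assumes the reported standard error is on the log-odds scale and that the interval and p-value come from a Wald test, neither of which the abstract states. The function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(odds_ratio, se_log_odds, z_crit=1.959964):
    """Return (lower, upper, p_value) for a Wald test on the log-odds scale.

    Assumption: se_log_odds is the standard error of log(odds_ratio),
    as is usual for logistic-type regression output.
    """
    beta = math.log(odds_ratio)  # coefficient on the log-odds scale
    lower = math.exp(beta - z_crit * se_log_odds)
    upper = math.exp(beta + z_crit * se_log_odds)
    z = beta / se_log_odds
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return lower, upper, p_value


# Values taken from the abstract (HSS vs FSEA, graduation).
lower, upper, p_value = wald_summary(0.904, 0.073)
print(f"95% CI: ({lower:.3f}, {upper:.3f}), p = {p_value:.3f}")

# An odds ratio converts to a percentage change in odds as (OR - 1) * 100.
percent_change = (0.904 - 1) * 100
print(f"Change in odds: {percent_change:.1f}%")
```

Under these assumptions the recomputed interval and p-value should land close to the reported 0.784 to 1.042 and 0.165. Small differences in the third decimal are expected because the published OR and SE are themselves rounded. The same conversion explains the statement that students averaging 70 or above have 48.2% higher odds of graduating, which corresponds to an odds ratio of about 1.482.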