Title: Enhancing PCE Prediction for Organic Solar Cells through the Integration of Supervised and Unsupervised Learning
Author: Mudau, Mulweli Raymond
Contributors: Maluta, N. E.; Dima, R. S.; Netshikweta, R.
Dates: 2026-06-17; 2026-06-17; 2026-05-19
Type: Dissertation
Degree: M.Sc. in e-Science
Department: Department of Mathematical and Computational Sciences
Institution: University of Venda
Language: en
Extent: 1 online resource (viii, 61 leaves): color illustrations
URI: https://univendspace.univen.ac.za/handle/11602/3203
Citation: Mudau, M.R. 2026. Enhancing PCE Prediction for Organic Solar Cells through the Integration of Supervised and Unsupervised Learning.
Keywords: Organic solar cells (OSCs); UCTD; Power conversion efficiency (PCE); Unsupervised Learning; Clustering; Supervised Learning; K-means; DBSCAN; Hierar XGBoost; Random Forest; Materials informatics

Abstract:
Machine learning (ML) has significantly advanced solar cell research, particularly in material optimization and discovery. However, many studies rely on supervised learning models that assume consistent predictive trends across materials, potentially overlooking complex correlations affecting power conversion efficiency (PCE). Unsupervised clustering techniques offer an alternative by uncovering hidden patterns in material properties, yet their application in organic solar cell (OSC) research remains limited. This study addresses this gap by integrating clustering techniques with supervised learning to enhance PCE predictions in OSCs. The research employed K-means, DBSCAN, and hierarchical clustering to categorize OSCs based on molecular descriptors, then incorporated cluster labels as additional features in supervised models including Linear Regression, Random Forest, XGBoost, and Support Vector Regressor. Despite weak inherent cluster structure indicated by clusterability tests, the integration of cluster labels consistently improved predictive performance across all configurations. XGBoost paired with hierarchical clustering achieved the most substantial enhancement, with R² reaching 0.9640 and MAE reducing from 0.2917 to 0.2859. The findings demonstrate that (1) unsupervised learning can identify meaningful structural patterns in OSC datasets, and (2) incorporating cluster labels as engineered features improves PCE prediction accuracy compared to traditional supervised approaches alone. Importantly, even statistically weak clusters provided valuable predictive signals, contributing to enhanced model performance and supporting accelerated discovery of high-efficiency OSC materials.
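
Illustrative sketch (not taken from the dissertation): the abstract describes clustering materials on their molecular descriptors and then passing the cluster labels to a supervised model as extra features. The standard-library Python below shows one way that feature-augmentation step could look, using K-means, one of the clustering methods the abstract names. The descriptor values are made up, the function names `kmeans` and `add_cluster_features` are hypothetical, and the one-hot encoding of labels is an assumption; the record does not say how the author encoded them.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct rows of the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def add_cluster_features(points, labels, k):
    """Append a one-hot encoding of each row's cluster label (an assumption)."""
    return [
        list(p) + [1.0 if lab == c else 0.0 for c in range(k)]
        for p, lab in zip(points, labels)
    ]


# Hypothetical descriptor rows: two made-up descriptors per material.
descriptors = [
    [0.10, 0.20], [0.15, 0.22], [0.12, 0.18],  # one tight group
    [0.90, 0.85], [0.95, 0.80], [0.88, 0.90],  # another tight group
]
centroids, labels = kmeans(descriptors, k=2)
augmented = add_cluster_features(descriptors, labels, k=2)
```

Each row of `augmented` holds the original descriptors followed by the cluster indicator columns, and could then be passed to any regressor (for example a random forest or gradient-boosted model) in place of the raw descriptors. To avoid leaking test information, the clustering would normally be fitted on the training rows only.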