Department of Mathematical and Computational Sciences
Browsing Department of Mathematical and Computational Sciences by Author "Garira, Winston"
Item Embargo
Assessing models for de-identification of Electronic Discharge Summary Using Machine Learning tools (2024-09-06)
Mudau, Tshilisanani; Garira, Winston; Netshikweta, Rendani
Background: De-identification is a technique that removes identifying information from clinical records in order to protect individual privacy. It reduces the chance that personal information which is collected, processed, distributed, and published can be used to identify the person. Including Machine Learning techniques in the de-identification process substantially improved it over the previous method.
Research Problem: The Electronic Discharge Summary (EDS) has evolved into a significantly improved way of providing discharge summaries, but it contains Protected Health Information (PHI), which poses a risk to patients' privacy. This makes de-identification mandatory. Several Machine Learning approaches to de-identifying data have been proposed recently. This study applies Machine Learning techniques to determine which model can best de-identify a data set.
Methods: The open source data set from Harvard Medical School was used. This data set contains 899 Electronic Health Records (EHR), 669 for training and 220 for testing. The Conditional Random Fields (CRF), Long Short Term Memory (LSTM) and Random Forest models were used, and the performance of each model was assessed.
Findings: To assess each model's performance, F-measure, Recall and Precision were compared at token level to determine which Machine Learning model performed best.
The Long Short Term Memory was found to outperform both Conditional Random Fields and Random Forest with micro average F-measure, Recall and Precision of 99%, and macro average F-measure of 77%, Recall of 73% and Precision of 90%.

Item Open Access
Mathematical modelling of transmission and control of malaria (2012-12-19)
Mulaudzi, Matodzi Stanley; Garira, Winston

Item Open Access
Modelling volatility, equity risk and extremal dependence of the BRICS Stock Markets (2022-07-15)
Mukhodobwane, Rosinah Mphedziseni; Sigauke, Caston; Chagwiza, Wilbert; Garira, Winston
With the use of empirical data of the BRICS (Brazil, Russia, India, China, and South Africa) stock markets, this thesis focuses on solving three main financial and investment issues involving returns volatility, risk and extremal dependence via robust statistical modelling. The first issue involves modelling financial returns volatility (when the true distribution is unknown) using the univariate GARCH model under the assumptions of seven error distributions. The findings, using two of the error distributions, show that the Chinese market has the highest volatility persistence, followed by the South African, Russian, Indian and Brazilian markets in that order. For risk modelling and analysis, the findings show that the Russian market has the highest risk level, followed by the South African, Chinese, Brazilian and Indian markets, respectively. For the extremal dependence modelling, using the bivariate point process and conditional multivariate extreme value (CMEV) models, the findings show varied levels of low extremal dependence structure whose outcomes are highly beneficial to investors, portfolio managers and other market participants who are interested in maximising their investment returns and financial gains.
However, it is observed that the point process was able to model many more extreme observations or exceedances that contribute to the likelihood estimation and it gives more information than the threshold excess method of the CMEV model.
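
As an illustration of the "volatility persistence" ranking mentioned in the abstract above, here is a minimal Python sketch. It assumes the common GARCH(1,1) form, sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2, in which persistence is usually measured as alpha + beta. The abstract does not state the model order or any fitted parameters, so the function names, market labels and parameter values below are hypothetical and are not results from the thesis.

```python
import math


def garch11_persistence(alpha: float, beta: float) -> float:
    """Return alpha + beta, the usual persistence measure of a GARCH(1,1) model."""
    return alpha + beta


def shock_half_life(alpha: float, beta: float) -> float:
    """Periods needed for a variance shock to decay to half its initial size.

    Only defined for a covariance-stationary model (0 < alpha + beta < 1).
    """
    persistence = garch11_persistence(alpha, beta)
    if not 0.0 < persistence < 1.0:
        raise ValueError("half-life requires 0 < alpha + beta < 1")
    return math.log(0.5) / math.log(persistence)


def unconditional_variance(omega: float, alpha: float, beta: float) -> float:
    """Long-run variance omega / (1 - alpha - beta) of a stationary GARCH(1,1)."""
    persistence = garch11_persistence(alpha, beta)
    if not persistence < 1.0:
        raise ValueError("unconditional variance requires alpha + beta < 1")
    return omega / (1.0 - persistence)


# Hypothetical (alpha, beta) pairs, chosen only for illustration.
hypothetical_markets = {
    "market_a": (0.08, 0.90),
    "market_b": (0.10, 0.80),
}

# Rank the hypothetical markets from highest to lowest persistence.
ranked = sorted(
    hypothetical_markets,
    key=lambda name: garch11_persistence(*hypothetical_markets[name]),
    reverse=True,
)

for name in ranked:
    alpha, beta = hypothetical_markets[name]
    print(name, garch11_persistence(alpha, beta), shock_half_life(alpha, beta))
```

A persistence value closer to 1 means that volatility shocks die out more slowly, which is the sense in which one market can be ranked above another.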