Department of Mathematical and Computational Sciences
Browsing Department of Mathematical and Computational Sciences by Author "Garira, Winston"
Now showing 1 - 3 of 3
Item Embargo
Assessing models for de-identification of Electronic Discharge Summary Using Machine Learning tools (2024-09-06)
Mudau, Tshilisanani; Garira, Winston; Netshikweta, Rendani
Background: De-identification removes identifying information from clinical records in order to protect individual privacy. It reduces the chance that personal information which is collected, processed, distributed, and published can be used to identify the person. Incorporating Machine Learning techniques into the de-identification process substantially improved on earlier methods.
Research Problem: The Electronic Discharge Summary (EDS) has become a significantly improved way of providing discharge summaries, but it contains Protected Health Information (PHI), which poses a risk to patients' privacy. This makes de-identification mandatory. Several Machine Learning approaches to de-identification have appeared recently. This study applies Machine Learning techniques to determine which model best de-identifies a data set.
Methods: The open source data set from Harvard Medical School was used. It contains 889 Electronic Health Records (EHR): 669 for training and 220 for testing. Conditional Random Fields (CRF), Long Short-Term Memory (LSTM) and Random Forest models were used, and the performance of each model was assessed.
Findings: The models were compared on token-level F-measure, Recall and Precision to determine which Machine Learning model performed best.
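The token-level metrics named above can be averaged in two ways, and the difference matters when most tokens are not PHI. The sketch below is illustrative only: the label names and the toy gold/predicted sequences are assumptions made for this example, not data or code from the thesis. It computes micro averages (pooling all tokens) and macro averages (the unweighted mean over labels), assuming the non-PHI label "O" is counted as a class.

```python
from collections import Counter

def token_level_scores(gold, pred):
    """Return (micro, macro), each a (precision, recall, F1) tuple."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was not the gold label
            fn[g] += 1  # gold label g was missed

    def prf(t, f_pos, f_neg):
        precision = t / (t + f_pos) if t + f_pos else 0.0
        recall = t / (t + f_neg) if t + f_neg else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        return precision, recall, f1

    # Micro: pool the counts over all labels, then compute one score.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: score each label separately, then take the unweighted mean.
    per_label = [prf(tp[l], fp[l], fn[l]) for l in labels]
    macro = tuple(sum(s[i] for s in per_label) / len(labels) for i in range(3))
    return micro, macro

# Hypothetical toy data: 96 non-PHI tokens, 2 NAME tokens, 2 DATE tokens.
gold = ["O"] * 96 + ["NAME", "NAME"] + ["DATE", "DATE"]
pred = ["O"] * 96 + ["NAME", "O"] + ["O", "O"]

micro, macro = token_level_scores(gold, pred)
print("micro P/R/F:", [round(x, 3) for x in micro])
print("macro P/R/F:", [round(x, 3) for x in macro])
```

When every token receives exactly one label and all labels are counted, micro precision, recall and F1 all equal token accuracy, so the frequent non-PHI class dominates them. Macro averages weight rare PHI categories equally, which is why they can be much lower than the micro figures.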
The Long Short-Term Memory model outperformed both Conditional Random Fields and Random Forest, with micro-average F-measure, Recall and Precision of 99%, and macro-average F-measure of 77%, Recall of 73% and Precision of 90%.

Item Open Access
Mathematical modelling of transmission and control of malaria (2012-12-19)
Mulaudzi, Matodzi Stanley; Garira, Winston
Malaria starts with Plasmodium sporozoite infection of the host liver, where development into blood-stage parasites occurs. A number of deterministic models are developed in this thesis. The release of modified mosquitoes aims to gradually displace the wild (natural) mosquito from the habitat. We discuss the suitability of this technique when applied to pre-domestically adapted mosquitoes that transmit Plasmodium falciparum malaria. The dynamics of the interaction of sporozoites, liver cells, merozoites and red blood cells, which causes the symptoms and pathology of the disease, is comprehensively studied. We then show how variability of host-parasite immunity is incorporated in the models, which are constructed to include liver and blood compartments by subdividing the host population into various mutually exclusive compartments. Increases in the egg, larval and pupal stages of mosquitoes increase the vector mosquito population and transmission of the disease, hence the suggestion that immature and adult mosquitoes be controlled extensively. The models, which take the form of nonlinear ordinary differential equations, are rigorously analysed using analytic and numerical techniques to determine important epidemiological thresholds, the stability of the steady states and the persistence of infection in the respective populations.
Conclusions are drawn from the results of the analysis of the malaria models developed.

Item Open Access
Modelling volatility, equity risk and extremal dependence of the BRICS Stock Markets (2022-07-15)
Mukhodobwane, Rosinah Mphedziseni; Sigauke, Caston; Chagwiza, Wilbert; Garira, Winston
Using empirical data from the BRICS (Brazil, Russia, India, China, and South Africa) stock markets, this thesis focuses on solving three main financial and investment issues involving returns volatility, risk and extremal dependence via robust statistical modelling. The first issue involves modelling financial returns volatility (when the true distribution is unknown) using the univariate GARCH model under the assumptions of seven error distributions. The findings, using two of the error distributions, show that the Chinese market has the highest volatility persistence, followed by the South African, Russian, Indian and Brazilian markets, in that order. For risk modelling and analysis, the findings show that the Russian market has the highest risk level, followed by the South African, Chinese, Brazilian and Indian markets, respectively. For the extremal dependence modelling, using the bivariate point process and conditional multivariate extreme value (CMEV) models, the findings show varied levels of low extremal dependence structure, outcomes that are highly beneficial to investors, portfolio managers and other market participants who are interested in maximising their investment returns and financial gains. However, it is observed that the point process was able to model many more extreme observations, or exceedances, that contribute to the likelihood estimation, and it gives more information than the threshold excess method of the CMEV model.
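To make "volatility persistence" concrete, the following sketch assumes the common GARCH(1,1) specification, in which the conditional variance follows sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1] and persistence is measured by alpha + beta. The abstract does not state the model order, and the parameter values and returns below are hypothetical, chosen only for illustration. They are not estimates from the thesis.

```python
import math

def garch11_variance_path(returns, omega, alpha, beta):
    """Conditional variances sigma2[t] for a GARCH(1,1) recursion.

    sigma2[0] is set to the unconditional variance omega / (1 - alpha - beta),
    which exists only when alpha + beta < 1.
    """
    persistence = alpha + beta
    if not 0 <= persistence < 1:
        raise ValueError("need 0 <= alpha + beta < 1 for a finite variance")
    sigma2 = [omega / (1 - persistence)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r ** 2 + beta * sigma2[-1])
    return sigma2

def shock_half_life(alpha, beta):
    """Periods needed for a variance shock to decay to half its size."""
    return math.log(0.5) / math.log(alpha + beta)

# Hypothetical parameters and daily percentage returns (illustration only).
omega, alpha, beta = 0.05, 0.08, 0.90
returns = [0.5, -3.0, 0.2, 0.1]

path = garch11_variance_path(returns, omega, alpha, beta)
print("persistence:", round(alpha + beta, 2))
print("variance path:", [round(v, 4) for v in path])
print("half-life (periods):", round(shock_half_life(alpha, beta), 1))
```

A persistence value closer to 1 means a shock to volatility fades more slowly, which is the sense in which one market can be ranked above another. The recursion itself does not depend on the error distribution, which affects only how the parameters are estimated.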