scholarly journals The Prediction of Essential Medicines Demand: A Machine Learning Approach Using Consumption Data in Rwanda

Processes ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 26
Author(s):  
Francois Mbonyinshuti ◽  
Joseph Nkurunziza ◽  
Japhet Niyobuhungiro ◽  
Egide Kayitare

Today’s global business trends are causing a significant and complex data revolution in the healthcare industry, culminating in the use of artificial intelligence and predictive modeling to improve health outcomes and performance. The dataset, which was referred to is based on consumption data from 2015 to 2019, included approximately 500 goods. Based on a series of data pre-processing activities, the top ten (10) essential medicines most used were chosen, namely cotrimoxazole 480 mg, amoxicillin 250 mg, paracetamol 500 mg, oral rehydration salts (O.R.S) sachet 20.5 g, chlorpheniramine 4 mg, nevirapine 200 mg, aminophylline 100 mg, artemether 20 mg + lumefantrine (AL) 120 mg, Cromoglycate ophthalmic. Our study concentrated on the application of machine learning (ML) to forecast future trends in the demand for essential drugs in Rwanda. The following models were created and applied: linear regression, artificial neural network, and random forest. The random forest was able to predict 10 selected medicines with an accuracy of 88 percent with the train set and 76 percent with the test set, and it can thus be used to forecast future demand based on past consumption data by inputting a month, year, district, and medicine name. According to our findings, the random Forest model performed well as a forecasting model for the demand for essential medicines. Finally, data-driven predictive modeling with machine learning (ML) could become the cornerstone of health supply chain planning and operational management.

2017 ◽  
Author(s):  
Aymen A. Elfiky ◽  
Maximilian J. Pany ◽  
Ravi B. Parikh ◽  
Ziad Obermeyer

ABSTRACTBackgroundCancer patients who die soon after starting chemotherapy incur costs of treatment without benefits. Accurately predicting mortality risk from chemotherapy is important, but few patient data-driven tools exist. We sought to create and validate a machine learning model predicting mortality for patients starting new chemotherapy.MethodsWe obtained electronic health records for patients treated at a large cancer center (26,946 patients; 51,774 new regimens) over 2004-14, linked to Social Security data for date of death. The model was derived using 2004-11 data, and performance measured on non-overlapping 2012-14 data.Findings30-day mortality from chemotherapy start was 2.1%. Common cancers included breast (21.1%), colorectal (19.3%), and lung (18.0%). Model predictions were accurate for all patients (AUC 0.94). Predictions for patients starting palliative chemotherapy (46.6% of regimens), for whom prognosis is particularly important, remained highly accurate (AUC 0.92). To illustrate model discrimination, we ranked patients initiating palliative chemotherapy by model-predicted mortality risk, and calculated observed mortality by risk decile. 30-day mortality in the highest-risk decile was 22.6%; in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies—even for clinical trial regimens that first appeared in years after the model was trained (AUC 0.94). The model also performed well for prediction of 180-day mortality (AUC 0.87; mortality 74.8% in the highest risk decile vs. 0.2% in the lowest). Predictions were more accurate than data from randomized trials of individual chemotherapies, or SEER estimates.InterpretationA machine learning algorithm accurately predicted short-term mortality in patients starting chemotherapy using EHR data. Further research is necessary to determine generalizability and the feasibility of applying this algorithm in clinical settings.


2021 ◽  
Author(s):  
Merlin James Rukshan Dennis

Distributed Denial of Service (DDoS) attack is a serious threat on today’s Internet. As the traffic across the Internet increases day by day, it is a challenge to distinguish between legitimate and malicious traffic. This thesis proposes two different approaches to build an efficient DDoS attack detection system in the Software Defined Networking environment. SDN is the latest networking approach which implements centralized controller, which is programmable. The central control and the programming capability of the controller are used in this thesis to implement the detection and mitigation mechanisms. In this thesis, two designed approaches, statistical approach and machine-learning approach, are proposed for the DDoS detection. The statistical approach implements entropy computation and flow statistics analysis. It uses the mean and standard deviation of destination entropy, new flow arrival rate, packets per flow and flow duration to compute various thresholds. These thresholds are then used to distinguish normal and attack traffic. The machine learning approach uses Random Forest classifier to detect the DDoS attack. We fine-tune the Random Forest algorithm to make it more accurate in DDoS detection. In particular, we introduce the weighted voting instead of the standard majority voting to improve the accuracy. Our result shows that the proposed machine-learning approach outperforms the statistical approach. Furthermore, it also outperforms other machine-learning approach found in the literature.


2021 ◽  
pp. 685-694
Author(s):  
José A. Gallardo ◽  
Miguel García-Torres ◽  
Francisco Gómez-Vela ◽  
Félix Morales ◽  
Federico Divina ◽  
...  

DYNA ◽  
2020 ◽  
Vol 87 (212) ◽  
pp. 63-72
Author(s):  
Jorge Iván Pérez Rave ◽  
Favián González Echavarría ◽  
Juan Carlos Correa Morales

The objective of this work is to develop a machine learning model for online pricing of apartments in a Colombian context. This article addresses three aspects: i) it compares the predictive capacity of linear regression, regression trees, random forest and bagging; ii) it studies the effect of a group of text attributes on the predictive capability of the models; and iii) it identifies the more stable-important attributes and interprets them from an inferential perspective to better understand the object of study. The sample consists of 15,177 observations of real estate. The methods of assembly (random forest and bagging) show predictive superiority with respect to others. The attributes derived from the text had a significant relationship with the property price (on a log scale). However, their contribution to the predictive capacity was almost nil, since four different attributes achieved highly accurate predictions and remained stable when the sample change.


Molecules ◽  
2019 ◽  
Vol 24 (21) ◽  
pp. 3837 ◽  
Author(s):  
Seong-Eun Park ◽  
Seung-Ho Seo ◽  
Eun-Ju Kim ◽  
Dae-Hun Park ◽  
Kyung-Mok Park ◽  
...  

The purpose of this study was to analyze metabolic differences of ginseng berries according to cultivation age and ripening stage using gas chromatography-mass spectrometry (GC-MS)-based metabolomics method. Ginseng berries were harvested every week during five different ripening stages of three-year-old and four-year-old ginseng. Using identified metabolites, a random forest machine learning approach was applied to obtain predictive models for the classification of cultivation age or ripening stage. Principal component analysis (PCA) score plot showed a clear separation by ripening stage, indicating that continuous metabolic changes occurred until the fifth ripening stage. Three-year-old ginseng berries had higher levels of valine, glutamic acid, and tryptophan, but lower levels of lactic acid and galactose than four-year-old ginseng berries at fully ripened stage. Metabolic pathways affected by different cultivation age were involved in amino acid metabolism pathways. A random forest machine learning approach extracted some important metabolites for predicting cultivation age or ripening stage with low error rate. This study demonstrates that different cultivation ages or ripening stages of ginseng berry can be successfully discriminated using a GC-MS-based metabolomic approach together with random forest analysis.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Peter Appiahene ◽  
Yaw Marfo Missah ◽  
Ussiph Najim

The financial crisis that hit Ghana from 2015 to 2018 has raised various issues with respect to the efficiency of banks and the safety of depositors’ in the banking industry. As part of measures to improve the banking sector and also restore customers’ confidence, efficiency and performance analysis in the banking industry has become a hot issue. This is because stakeholders have to detect the underlying causes of inefficiencies within the banking industry. Nonparametric methods such as Data Envelopment Analysis (DEA) have been suggested in the literature as a good measure of banks’ efficiency and performance. Machine learning algorithms have also been viewed as a good tool to estimate various nonparametric and nonlinear problems. This paper presents a combined DEA with three machine learning approaches in evaluating bank efficiency and performance using 444 Ghanaian bank branches, Decision Making Units (DMUs). The results were compared with the corresponding efficiency ratings obtained from the DEA. Finally, the prediction accuracies of the three machine learning algorithm models were compared. The results suggested that the decision tree (DT) and its C5.0 algorithm provided the best predictive model. It had 100% accuracy in predicting the 134 holdout sample dataset (30% banks) and a P value of 0.00. The DT was followed closely by random forest algorithm with a predictive accuracy of 98.5% and a P value of 0.00 and finally the neural network (86.6% accuracy) with a P value 0.66. The study concluded that banks in Ghana can use the result of this study to predict their respective efficiencies. All experiments were performed within a simulation environment and conducted in R studio using R codes.


Author(s):  
Amy Marie Campbell ◽  
Marie-Fanny Racault ◽  
Stephen Goult ◽  
Angus Laurenson

Oceanic and coastal ecosystems have undergone complex environmental changes in recent years, amid a context of climate change. These changes are also reflected in the dynamics of water-borne diseases as some of the causative agents of these illnesses are ubiquitous in the aquatic environment and their survival rates are impacted by changes in climatic conditions. Previous studies have established strong relationships between essential climate variables and the coastal distribution and seasonal dynamics of the bacteria Vibrio cholerae, pathogenic types of which are responsible for human cholera disease. In this study we provide a novel exploration of the potential of a machine learning approach to forecast environmental cholera risk in coastal India, home to more than 200 million inhabitants, utilising atmospheric, terrestrial and oceanic satellite-derived essential climate variables. A Random Forest classifier model is developed, trained and tested on a cholera outbreak dataset over the period 2010–2018 for districts along coastal India. The random forest classifier model has an Accuracy of 0.99, an F1 Score of 0.942 and a Sensitivity score of 0.895, meaning that 89.5% of outbreaks are correctly identified. Spatio-temporal patterns emerged in terms of the model’s performance based on seasons and coastal locations. Further analysis of the specific contribution of each Essential Climate Variable to the model outputs shows that chlorophyll-a concentration, sea surface salinity and land surface temperature are the strongest predictors of the cholera outbreaks in the dataset used. The study reveals promising potential of the use of random forest classifiers and remotely-sensed essential climate variables for the development of environmental cholera-risk applications. Further exploration of the present random forest model and associated essential climate variables is encouraged on cholera surveillance datasets in other coastal areas affected by the disease to determine the model’s transferability potential and applicative value for cholera forecasting systems.


Sign in / Sign up

Export Citation Format

Share Document