Diagnosis of Head and Neck Cancer in Developing Countries Using a Stacked Ensemble Model

2020 ◽  
Vol 5 (9) ◽  
pp. 1097-1101
Author(s):  
Folake Akinbohun ◽  
Ambrose Akinbohun ◽  
Adekunle Daniel ◽  
Oghenerukevwe Elohor Ojajuni

Head and neck cancers (HNC) arise when cells grow abnormally. The incidence of HNC is increasing owing to several factors. Patients often present late, which can result in loss of life (mortality), especially in Africa, where specialists are scarce. These challenges prompted the development of a stacked ensemble model for the diagnosis of HNC to facilitate prompt referral. The collected data consist of 1473 instances with 18 features. Information Gain was used to select important features, and three supervised learning algorithms were deployed as base learners: Decision Tree (C4.5), K-Nearest Neighbors and Naïve Bayes. The predictions of the base learners were combined and passed to meta learners: Logistic Model Tree (LMT). The results showed that the Information Gain method with stacked LMT achieved an accuracy of 95.11%. It was deduced that Information Gain with stacked MLR produced higher accuracy than the base learners' results. Hence, this stacked model can be used for the diagnosis of HNC in healthcare systems.
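
The feature selection step named in this abstract can be illustrated with a minimal sketch. It assumes categorical features and computes Information Gain as the entropy of the class labels minus the expected entropy after splitting on a feature. The feature names and toy records are hypothetical and are not taken from the study's dataset; the authors' actual tooling is not stated in the abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy records (hoarseness, smoker); not the study's data.
rows = [("yes", "no"), ("yes", "yes"), ("no", "no"), ("no", "yes")]
labels = ["cancer", "cancer", "benign", "benign"]
ranking = rank_features(rows, labels, ["hoarseness", "smoker"])
```

In this toy case the first feature separates the classes perfectly and ranks first, while the second carries no information about the label. A selection step would then keep only the top-ranked features before training the base learners.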

2020 ◽  
Vol 12 (17) ◽  
pp. 2742
Author(s):  
Ehsan Kamali Maskooni ◽  
Seyed Amir Naghibi ◽  
Hossein Hashemi ◽  
Ronny Berndtsson

Groundwater (GW) is being uncontrollably exploited in various parts of the world as a result of the huge demand for water supply driven by population growth and industrialization. Bearing in mind the importance of GW potential assessment in reaching sustainability, this study uses remote sensing (RS)-derived driving factors as input to advanced machine learning algorithms (MLAs), namely deep boosting and logistic model trees, to evaluate their efficiency. To do so, their results are compared with three benchmark MLAs: boosted regression trees, k-nearest neighbors, and random forest. For this purpose, we first assembled different topographical, hydrological, RS-based, and lithological driving factors such as altitude, slope degree, aspect, slope length, plan curvature, profile curvature, relative slope position (RSP), distance from rivers, river density, topographic wetness index, land use/land cover (LULC), normalized difference vegetation index (NDVI), distance from lineament, lineament density, and lithology. The GW spring indicator was divided into two sets, training (434 springs) and validation (186 springs), in a 70:30 proportion. The training dataset of the springs, together with the driving factors, was incorporated into the MLAs, and the outputs were validated by different indices such as accuracy, kappa, receiver operating characteristic (ROC) curve, specificity, and sensitivity. Based on the area under the ROC curve, the logistic model tree (87.813%) performed similarly to deep boosting (87.807%), followed by boosted regression trees (87.397%), random forest (86.466%), and k-nearest neighbors (76.708%). The findings confirm the strong performance of the logistic model tree and deep boosting algorithms in modelling GW potential. Thus, their application can be suggested for other areas to gain insight into GW-related barriers to sustainability. Further, the outcome of the logistic model tree algorithm shows the high impact of the RS-based factor NDVI, with a relative influence of 100, as well as the high influence of distance from river, altitude, and RSP, with relative influences of 46.07, 43.47, and 37.20, respectively, on GW potential.
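
Several of the validation indices listed in this abstract (accuracy, sensitivity, specificity, and kappa) can be derived from a binary confusion matrix. The sketch below is illustrative only: the function names and toy labels are assumptions, kappa is taken to mean Cohen's kappa, and both classes are assumed to be present in the data so that no denominator is zero.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def validation_indices(actual, predicted, positive=1):
    """Accuracy, sensitivity, specificity and Cohen's kappa from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # share of actual positives that were found
    specificity = tn / (tn + fp)  # share of actual negatives that were found
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical validation labels (1 = spring present, 0 = absent); not the study's data.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
indices = validation_indices(actual, predicted)
```

Kappa discounts the agreement that would occur by chance, which is why it can be much lower than raw accuracy on the same predictions.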


2020 ◽  
Vol 5 (4) ◽  
pp. 489-493
Author(s):  
Olatunbosun Olabode ◽  
Adebayo O. Adetunmbi ◽  
Folake Akinbohun ◽  
Ambrose Akinbohun

Head and neck cancers (HNC) arise when cells grow abnormally. The disturbing rate of morbidity and mortality among patients with HNC due to late presentation is increasing, especially in Africa (developing countries). Head and neck cancer needs to be diagnosed early when patients present so that prompt referral can be facilitated. The collected data consist of 1473 instances with 18 features. The dataset was divided into training and test data. Two supervised learning algorithms were deployed for the study: Decision Tree (C4.5) and k-Nearest Neighbors (KNN). The results showed that Decision Tree performed better, with an accuracy of 91.40%, while KNN had an accuracy of 88.24%. Hence, a machine learning algorithm such as Decision Tree can be used for the diagnosis of HNC in healthcare organisations.


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Matthijs Blankers ◽  
Louk F. M. van der Post ◽  
Jack J. M. Dekker

Background: Accurate prediction models for whether patients on the verge of a psychiatric crisis need hospitalization are lacking, and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize the accuracy and we explore individual predictors of hospitalization. Methods: Data from 2084 patients included in the longitudinal Amsterdam Study of Acute Psychiatry with at least one reported psychiatric crisis care contact were used. The target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients' socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared, and we also estimated the relative importance of each predictor variable. The best and worst performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis, and the five best performing algorithms were combined in an ensemble model using stacking. Results: All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors to be the worst performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a Net Reclassification Improvement analysis, Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%. GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%. Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression showed average performance among the tested algorithms. Although statistically significant, the differences between the machine learning algorithms were in most cases modest in magnitude. The results show that a predictive accuracy similar to that of the best performing model can be achieved when combining multiple algorithms in an ensemble model.
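
The AUC values compared in this abstract can be read as the probability that a randomly chosen hospitalized patient receives a higher predicted score than a randomly chosen non-hospitalized patient. The sketch below computes AUC directly from that pairwise definition, counting ties as half. The function name and the toy scores are assumptions for illustration and do not reproduce the study's data or software.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    labels use 1 for the positive class and 0 for the negative class;
    both classes must be present.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and outcomes (1 = hospitalized); not the study's data.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported range of 0.702 to 0.774. The pairwise loop is quadratic in the number of patients, so a rank-based formulation would be preferable for large cohorts.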


Author(s):  
Digvijay Kumar ◽  
Bavithra

Heart-related diseases, or cardiovascular diseases (CVDs), are among the main causes of death, not only in India but worldwide. There is therefore a need for a reliable, accurate, and feasible system to diagnose such diseases in time for proper treatment. This paper presents various models based on supervised learning algorithms, namely Logistic Regression, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Naive Bayes, Random Forest, and ensemble models, and analyzes their performance. The models use important features that are necessary for predicting CVDs (that is, whether or not a person has a CVD), which are discussed further in the paper.


Mathematics ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 779
Author(s):  
Ruriko Yoshida

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. We then discuss their application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
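
As a rough illustration of the idea, the sketch below runs a plain KNN majority vote using the tropical metric, d(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), as the distance. This distance is unchanged when a constant is added to every coordinate of a vector, which matches the quotient structure of the tropical projective torus. The function names and toy points are assumptions, and this is ordinary KNN with a swapped-in metric rather than the paper's own construction.

```python
from collections import Counter


def tropical_distance(u, v):
    """Tropical metric: the spread between the largest and smallest coordinate difference."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query in the tropical metric."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: tropical_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled points; not taken from the paper.
train_points = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (0, 5, 5), (0, 6, 5), (0, 5, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]
prediction = knn_predict(train_points, train_labels, (2, 2, 3), k=3)
```

The k nearest training points are exactly those inside the smallest tropical ball around the query that contains k of them, which is how the ball described in the abstract relates to the vote.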

