Detecting coffee leaf rust with UAV-based vegetation indices and decision tree machine learning models

Introduction: Heart disease is often associated with conditions such as clogged arteries caused by the accumulation of deposits, which can lead to chest pain and heart attack. Many people die of heart disease every year. Most countries have a shortage of cardiovascular specialists, so a significant percentage of cases are misdiagnosed. Predicting this disease is therefore an important problem. By applying machine learning models to a multidimensional dataset, this article aims to find the most efficient and accurate models for disease prediction.

Material and Methods: Several algorithms were used to predict heart disease, among which the supervised learners Decision Tree, Random Forest and KNN are the most frequently mentioned. The algorithms were applied to a dataset of 294 samples taken from the UCI repository, which contains heart disease features. To improve algorithm performance, these features were analyzed, and feature importance scores and cross-validation were taken into account.

Results: The algorithms were compared with one another based on the ROC curve and on criteria such as accuracy, precision, sensitivity and F1 score, evaluated for each model. The Decision Tree algorithm achieved an accuracy of 83% and an AUC ROC of 99%. The Logistic Regression algorithm, with an accuracy of 88% and an AUC ROC of 91%, performed better than the other algorithms. These techniques can therefore help physicians predict heart disease in patients and prescribe treatment correctly.

Conclusion: Machine learning techniques can be used in medicine to analyze data collections related to a disease and to predict it. The area under the ROC curve and the evaluation criteria of several machine learning classification algorithms were compared to determine the most appropriate classifier for predicting heart disease. In this evaluation, the Decision Tree and Logistic Regression models showed the better performance.
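
The evaluation criteria named in this abstract (accuracy, precision, sensitivity, F1 score and area under the ROC curve) can be illustrated with a minimal Python sketch. The confusion counts, labels and scores below are hypothetical values chosen for illustration; they are not taken from the study, and the function names are assumptions rather than the authors' code.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, sensitivity (recall) and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical confusion counts (illustration only).
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)

# Hypothetical labels (1 = disease) and predicted scores (illustration only).
auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.7, 0.3, 0.6, 0.2])

print(metrics)
print(round(auc, 3))
```

Reporting accuracy and AUC together is useful because, as the Decision Tree and Logistic Regression figures in the abstract suggest, the two criteria can rank models differently.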

Hybrid decision tree-based machine learning models for short-term water quality prediction

Chemosphere ◽

10.1016/j.chemosphere.2020.126169 ◽

2020 ◽

Vol 249 ◽

pp. 126169 ◽

Cited By ~ 13

Author(s):

Hongfang Lu ◽

Xin Ma

Keyword(s):

Machine Learning ◽

Water Quality ◽

Decision Tree ◽

Quality Prediction ◽

Learning Models ◽

Short Term ◽

Water Quality Prediction ◽

Machine Learning Models

EVALUATING INTONATIONAL FEATURES FOR EMOTION RECOGNITION FROM SPEECH

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213007003679 ◽

2007 ◽

Vol 16 (06) ◽

pp. 1001-1014 ◽

Cited By ~ 1

Author(s):

PANAGIOTIS ZERVAS ◽

IOSIF MPORAS ◽

NIKOS FAKOTAKIS ◽

GEORGE KOKKINAKIS

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Emotion Recognition ◽

Bayesian Learning ◽

Experimental Results ◽

Speech Signals ◽

Learning Approaches ◽

Learning Models ◽

C4.5 Decision Tree ◽

Machine Learning Models

This paper presents and discusses the problem of emotion recognition from speech signals using features that carry intonational information. In particular, parameters extracted from Fujisaki's model of intonation are presented and evaluated. Machine learning models were built using the C4.5 decision tree inducer, an instance-based learner and Bayesian learning. The datasets used to train the machine learning models were extracted from two emotional databases of acted speech. Experimental results showed the effectiveness of the Fujisaki model attributes: they enhanced the recognition process for most of the emotion categories and learning approaches, helping to separate the emotion categories.

Explainable Artificial Intelligence (XAI) to Enhance Trust Management in Intrusion Detection Systems Using Decision Tree Model

Complexity ◽

10.1155/2021/6634811 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Basim Mahbooba ◽

Mohan Timilsina ◽

Radhya Sahal ◽

Martin Serrano

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Intrusion Detection ◽

Decision Tree ◽

Trust Management ◽

Decision Tree Model ◽

Learning Models ◽

Tree Model ◽

Explainable Artificial Intelligence ◽

Machine Learning Models

Despite the growing popularity of machine learning models in cyber-security applications (e.g., intrusion detection systems (IDS)), most of these models are perceived as black boxes. eXplainable Artificial Intelligence (XAI) has become increasingly important for interpreting machine learning models and enhancing trust management, because it allows human experts to understand the underlying data evidence and causal reasoning. In an IDS, the critical role of trust management is to understand the impact of malicious data in order to detect any intrusion in the system. Previous studies focused mainly on the accuracy of various classification algorithms for trust in IDS, and they often provide little insight into the behavior and reasoning of these sophisticated algorithms. Therefore, in this paper, we address the XAI concept to enhance trust management by exploring the decision tree model in the area of IDS. We use simple decision tree algorithms that can be easily read and that even resemble a human approach to decision-making, splitting a choice into many small subchoices for IDS. We experimented with this approach by extracting rules from a widely used KDD benchmark dataset. We also compared the accuracy of the decision tree approach with other state-of-the-art algorithms.
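
The idea of reading a decision tree as a set of human-readable rules can be sketched in a few lines of Python. The toy tree below is hand-built for illustration: the feature names (`failed_logins`, `bytes_sent`), the thresholds and the labels are hypothetical, are not taken from the KDD dataset or from the paper, and the traversal is one plausible way to extract rules, not necessarily the authors' procedure.

```python
def extract_rules(node, conditions=()):
    """Walk a binary decision tree and return one IF-THEN rule per leaf.

    Internal nodes are dicts with "feature", "threshold", "left" and "right";
    leaves are dicts with a single "label" key.
    """
    if "label" in node:  # leaf: emit the accumulated path as one rule
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {node['label']}"]
    feature, threshold = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = extract_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


# Hypothetical toy tree (illustration only, not learned from data).
toy_tree = {
    "feature": "failed_logins",
    "threshold": 3,
    "left": {"label": "normal"},
    "right": {
        "feature": "bytes_sent",
        "threshold": 1000,
        "left": {"label": "normal"},
        "right": {"label": "attack"},
    },
}

rules = extract_rules(toy_tree)
for rule in rules:
    print(rule)
```

Each root-to-leaf path becomes one rule, which is what makes a shallow tree readable by a human analyst in the way the abstract describes.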

Drug Classification using Black-box models and Interpretability

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38203 ◽

2021 ◽

Vol 9 (9) ◽

pp. 1518-1529

Author(s):

Pooja Thakkar

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Learning Models ◽

Drug Classification ◽

Box Models ◽

Machine Learning Model ◽

Black Box Models ◽

Insight Into ◽

Machine Learning Models

Abstract: This study focuses on drug classification using machine learning models, and on interpretability using LIME and SHAP to gain a thorough understanding of those models. To do this, the researchers used machine learning models such as random forest, decision tree and logistic regression to classify drugs. Then, using LIME and SHAP, they examined whether these models were interpretable, which allowed them to better understand the results. The paper concludes that LIME and SHAP can be used to gain insight into a machine learning model and to determine which attribute is responsible for differences in the outcomes. According to the LIME and SHAP results, Random Forest and Decision Tree are the best models for drug classification, with Na to K and BP being the most significant features. Keywords: Machine Learning, Black-box models, LIME, SHAP, Decision Tree

Using Big Data-machine learning models for diabetes prediction and flight delays analytics

Journal Of Big Data ◽

10.1186/s40537-020-00355-0 ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Thérence Nibareke ◽

Jalal Laassiri

Keyword(s):

Machine Learning ◽

Big Data ◽

Linear Regression ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Smart Devices ◽

Learning Models ◽

Flight Delays ◽

Machine Learning Models

Introduction: Large volumes of data are now generated daily at a high rate. Data from health systems, social networks, finance, government, marketing and bank transactions, as well as from sensors and smart devices, are increasing, and the tools and models have to be optimized. In this paper we applied and compared machine learning algorithms (Linear Regression, Naive Bayes, Decision Tree) to predict diabetes. Furthermore, we performed analytics on flight delays. The main contribution of this paper is to give an overview of Big Data tools and machine learning models. We highlight some metrics that allow us to choose a more accurate model. We predicted diabetes using three machine learning models and then compared their performance. We also analyzed flight delays and produced a dashboard that can help managers of flight companies to have a 360° view of their flights and take strategic decisions.

Case description: We applied three machine learning algorithms to predict diabetes and compared their performance to see which model gives the best results. We performed analytics on flight datasets to support decision making and predict flight delays.

Discussion and evaluation: The experiment shows that Linear Regression, Naive Bayes and Decision Tree give the same accuracy (0.766), but Decision Tree outperforms the two other models with the greatest score (1) and the smallest error (0). For the flight delay analytics, the model could show, for example, the airport that recorded the most flight delays.

Conclusions: Several tools and machine learning models for big data analytics have been discussed in this paper. We concluded that, for the same datasets, the model used for prediction has to be chosen carefully. In future work, we will test different models in other fields (climate, banking, insurance).

Credit Risk Model Based on Central Bank Credit Registry Data

Journal of Risk and Financial Management ◽

10.3390/jrfm14030138 ◽

2021 ◽

Vol 14 (3) ◽

pp. 138

Author(s):

Fisnik Doko ◽

Slobodan Kalajdziski ◽

Igor Mishkovski

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Central Bank ◽

Credit Risk ◽

Commercial Banks ◽

Registry Data ◽

Learning Models ◽

Model Based ◽

Machine Learning Models

Data science and machine-learning techniques help banks to optimize enterprise operations, enhance risk analyses and gain competitive advantage. There is a vast amount of research on credit risk, but to our knowledge, none of it uses a credit registry as a data source to model the probability of default for individual clients. The goal of this paper is to evaluate different machine-learning models in order to create an accurate model for credit risk assessment, using data from the real credit registry dataset of the Central Bank of the Republic of North Macedonia. We strongly believe that the model developed in this research will be an additional source of valuable information for commercial banks, because it leverages historical data for the whole population of the country across all commercial banks. Thus, in this research, we compare five machine-learning models for classifying credit risk data: logistic regression, decision tree, random forest, support vector machines (SVM) and neural network. We evaluate the five models using different machine-learning metrics, and we propose a model, based on credit registry data from the central bank and with a detailed methodology, that can predict credit risk from the credit history of the population of the country. Our results show that the best accuracy is achieved by the decision tree on imbalanced data, with and without scaling, followed by random forest and linear regression.

Improving Potato Yield Prediction by Combining Cultivar Information and UAV Remote Sensing Data Using Machine Learning

Remote Sensing ◽

10.3390/rs13163322 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3322

Author(s):

Dan Li ◽

Yuxin Miao ◽

Sanjay K. Gupta ◽

Carl J. Rosen ◽

Fei Yuan ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Vegetation Indices ◽

Remote Sensing Data ◽

Growing Season ◽

Potato Yield ◽

Yield Variability ◽

Yield Prediction ◽

Learning Models ◽

Machine Learning Models

Accurate high-resolution yield maps are essential for identifying spatial yield variability patterns, determining key factors influencing yield variability, and providing site-specific management insights in precision agriculture. Cultivar differences can significantly influence potato (Solanum tuberosum L.) tuber yield prediction using remote sensing technologies. The objective of this study was to improve potato yield prediction using unmanned aerial vehicle (UAV) remote sensing by incorporating cultivar information with machine learning methods. Small plot experiments involving different cultivars and nitrogen (N) rates were conducted in 2018 and 2019. UAV-based multispectral images were collected throughout the growing season. Machine learning models, i.e., random forest regression (RFR) and support vector regression (SVR), were used to combine different vegetation indices with cultivar information. It was found that UAV-based spectral data from the early growing season at the tuber initiation stage (late June) were more strongly correlated with potato marketable yield than the spectral data from the later growing season at the tuber maturation stage. However, the best performing vegetation indices and the best timing for potato yield prediction varied with cultivar. The performance of the RFR and SVR models using only remote sensing data was unsatisfactory (R² = 0.48–0.51 for validation) but improved significantly when cultivar information was incorporated (R² = 0.75–0.79 for validation). It is concluded that combining high spatial-resolution UAV images and cultivar information using machine learning algorithms can significantly improve potato yield prediction compared with methods that do not use cultivar information. More studies are needed to improve potato yield prediction using more detailed cultivar information, soil and landscape variables, and management information, as well as more advanced machine learning models.
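
The validation R² values quoted in this abstract can be read against the standard coefficient of determination, 1 - SS_res / SS_tot. The sketch below assumes that definition (the abstract does not state which variant the authors used), and the observed and predicted yields are hypothetical numbers for illustration only.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Hypothetical plot yields and model predictions (illustration only).
observed_yield = [30.0, 40.0, 50.0, 60.0]
predicted_yield = [32.0, 38.0, 53.0, 57.0]

r2 = r_squared(observed_yield, predicted_yield)
print(round(r2, 3))
```

Under this definition, a rise from about 0.5 to about 0.75 means the share of yield variance left unexplained by the model falls from roughly one half to one quarter.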

Detection of Breast Cancer Using Machine Learning Algorithms

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit217141 ◽

2021 ◽

pp. 223-227

Author(s):

Vijaylaxmi Kochari

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Linear Regression ◽

Decision Tree ◽

Machine Learning Algorithms ◽

Training Dataset ◽

Proper Treatment ◽

Learning Models ◽

Initial Stage ◽

Machine Learning Models

Breast cancer is one of the most dangerous diseases and causes a high number of deaths every year. A dataset of features in CSV format is used to identify whether a digitized image is benign or malignant. Machine learning models such as Linear Regression, Decision Tree and Random Forest are trained on the training dataset and used for classification. The accuracy of these classifiers is compared to find the best model. This can help doctors provide proper treatment at an early stage and save patients' lives.
