Crop Disease Recognition using Machine Learning Algorithms

Classification is the task of observing the features of a new object and assigning it to a known class. A machine learning classification problem consists of known classes and a training set of pre-categorized examples. This work diagnoses groundnut diseases using established machine learning algorithms, namely simple logistic, decision tree, random forest, and multilayer perceptron, for accurate identification of groundnut diseases. Experiments are conducted using a 10-fold cross-validation strategy. The results indicate that these classification algorithms diagnose groundnut diseases with high accuracy. Simple logistic and multilayer perceptron outperform the other algorithms, achieving 96.37% and 95.80% disease classification accuracy, respectively. Random forest and decision tree provide fair accuracy in less time. These machine learning algorithms can also be applied to diagnosing other crop diseases.
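
As a rough illustration of the experimental setup described above, the sketch below runs 10-fold cross-validation over the four named classifiers with scikit-learn in Python. The feature files are hypothetical, and plain logistic regression is used as a stand-in for the "simple logistic" classifier; this is not the authors' actual pipeline.

```python
# Hedged sketch: 10-fold cross-validation over the four classifiers named in the abstract.
# The feature/label files are hypothetical placeholders; LogisticRegression approximates
# the "simple logistic" classifier used in the study.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

X = np.load("groundnut_features.npy")   # assumed feature matrix, one row per leaf sample
y = np.load("groundnut_labels.npy")     # assumed disease class per sample

models = {
    "simple logistic": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "multilayer perceptron": MLPClassifier(max_iter=1000, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name}: {scores.mean():.4f} +/- {scores.std():.4f}")
```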

2021 · Vol 99 (Supplement_3) · pp. 264-265
Author(s): Duy Ngoc Do, Guoyu Hu, Younes Miar

American mink (Neovison vison) is the major source of fur for fur industries worldwide, and Aleutian disease (AD) is causing severe financial losses to the mink industry. Different methods have been used to diagnose AD in mink, but a combination of several methods may be the most appropriate approach for the selection of AD-resilient mink. The iodine agglutination test (IAT) and counterimmunoelectrophoresis (CIEP) are commonly employed in the test-and-remove strategy, while enzyme-linked immunosorbent assay (ELISA) and packed-cell volume (PCV) methods are complementary. However, using multiple methods is expensive, which hinders the correct use of AD tests in selection. This research presents an assessment of AD classification based on machine learning algorithms. AD was tested on 1,830 individuals using these tests on an AD-positive mink farm (Canadian Centre for Fur Animal Research, NS, Canada). Classification accuracy for CIEP was evaluated based on sex information and IAT, ELISA, and PCV test results, implemented in seven machine learning classification algorithms (Random Forest, Artificial Neural Networks, C50Tree, Naive Bayes, Generalized Linear Models, Boost, and Linear Discriminant Analysis) using the caret package in R. The accuracy of prediction varied among the methods. Overall, Random Forest was the best-performing algorithm for the current dataset, with an accuracy of 0.89 on the training data and 0.94 on the testing data. Our work demonstrates the utility and relative ease of using machine learning algorithms to assess CIEP information and consequently reduce the cost of AD tests. However, further work requires the inclusion of production and reproduction information in the models and an extension of phenotypic collection to increase the accuracy of the current methods.
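
The study itself used R's caret package; as a hedged, language-neutral illustration, the Python sketch below trains one of the seven classifiers (random forest) to predict CIEP status from sex, IAT, ELISA, and PCV. The file name and column names are assumptions, not the study's actual field names.

```python
# Hedged sketch of the classification task described above, using scikit-learn rather than
# R/caret. Column names are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

mink = pd.read_csv("mink_ad_tests.csv")       # hypothetical file, one row per animal
X = pd.get_dummies(mink[["sex", "IAT", "ELISA", "PCV"]], columns=["sex"])
y = mink["CIEP"]                              # CIEP result to be predicted

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
rf = RandomForestClassifier(n_estimators=500, random_state=42).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, rf.predict(X_test)))
```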


Water · 2020 · Vol 12 (10) · pp. 2927
Author(s): Jiyeong Hong, Seoro Lee, Joo Hyun Bae, Jimin Lee, Woon Ji Park, et al.

Predicting dam inflow is necessary for effective water management. This study developed machine learning models to predict the amount of inflow into the Soyang River Dam in South Korea, using 40 years of weather and dam inflow data. A total of six algorithms were used: decision tree (DT), multilayer perceptron (MLP), random forest (RF), gradient boosting (GB), recurrent neural network–long short-term memory (RNN–LSTM), and convolutional neural network–LSTM (CNN–LSTM). Among these models, the multilayer perceptron showed the best results in predicting dam inflow, with a Nash–Sutcliffe efficiency (NSE) of 0.812, a root mean squared error (RMSE) of 77.218 m³/s, a mean absolute error (MAE) of 29.034 m³/s, a correlation coefficient (R) of 0.924, and a coefficient of determination (R²) of 0.817. However, when the dam inflow was below 100 m³/s, the ensemble models (random forest and gradient boosting) performed better than the MLP. Therefore, two combined machine learning (CombML) models (RF_MLP and GB_MLP) were developed that predict the dam inflow using the ensemble methods (RF and GB) at precipitation below 16 mm and the MLP at precipitation above 16 mm; 16 mm is the average daily precipitation at inflows of 100 m³/s or more. The verification results were NSE 0.857, RMSE 68.417 m³/s, MAE 18.063 m³/s, R 0.927, and R² 0.859 for RF_MLP, and NSE 0.829, RMSE 73.918 m³/s, MAE 18.093 m³/s, R 0.912, and R² 0.831 for GB_MLP, indicating that the combination of models predicts the dam inflow most accurately. The CombML results show that it is possible to predict inflow while accounting for flow characteristics such as flow regimes by combining several machine learning algorithms.
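
One plausible reading of the CombML construction (shown here for the RF_MLP variant) is to train and apply the ensemble model on low-precipitation days and the MLP on high-precipitation days. The sketch below illustrates that routing in Python with scikit-learn; the feature layout and hyperparameters are assumptions.

```python
# Hedged sketch of the CombML idea: route each day to the random forest when daily
# precipitation is below 16 mm and to the MLP otherwise. Inputs are assumed to be
# numpy arrays of weather features (X), observed inflow (y), and daily precipitation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

THRESHOLD_MM = 16.0   # average daily precipitation at inflows of 100 m³/s or more (from the study)

def fit_combml(X_train, y_train, precip_train):
    low = np.asarray(precip_train) < THRESHOLD_MM
    rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train[low], y_train[low])
    mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(
        X_train[~low], y_train[~low])
    return rf, mlp

def predict_combml(rf, mlp, X, precip):
    low = np.asarray(precip) < THRESHOLD_MM
    pred = np.empty(len(X))
    if low.any():
        pred[low] = rf.predict(X[low])
    if (~low).any():
        pred[~low] = mlp.predict(X[~low])
    return pred
```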


Electronics · 2021 · Vol 10 (14) · pp. 1677
Author(s): Ersin Elbasi, Ahmet E. Topcu, Shinu Mathew

COVID-19 is a community-acquired infection with symptoms that resemble those of influenza and bacterial pneumonia. Creating an infection control policy involving isolation, disinfection of surfaces, and identification of contagions is crucial in eradicating such pandemics. Incorporating social distancing could also help stop the spread of community-acquired infections like COVID-19. Social distancing entails maintaining certain distances between people and reducing the frequency of contact between them. Meanwhile, a significant increase in the development of different Internet of Things (IoT) devices has been seen, together with cyber-physical systems that connect with physical environments. Machine learning is strengthening current technologies by adding new approaches to quickly and correctly solve problems using this surge of available IoT devices. We propose a new approach using machine learning algorithms for monitoring the risk of COVID-19 in public areas. Features extracted from IoT sensors are used as input for several machine learning algorithms, such as decision tree, neural network, naïve Bayes classifier, support vector machine, and random forest, to predict the risks of the COVID-19 pandemic and calculate the risk probability of public places. This research aims to find vulnerable populations and reduce the impact of the disease on certain groups using machine learning models. We build a model to calculate and predict the risk factors of populated areas. This model generates automated alerts for security authorities in the case of any abnormal detection. Experimental results show high accuracy: 97.32% with random forest, 94.50% with decision tree, and 99.37% with the naïve Bayes classifier. These algorithms indicate great potential for crowd risk prediction in public areas.
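
As a hedged illustration of the classification step, the sketch below trains three of the named classifiers on a table of IoT-derived features. The file name, feature columns, and risk label are assumptions for illustration only, not the authors' actual data.

```python
# Hedged sketch: compare classifiers on features extracted from IoT sensors.
# The CSV layout and the "risk_level" label column are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

data = pd.read_csv("iot_risk_features.csv")   # hypothetical sensor-feature table
X = data.drop(columns=["risk_level"])
y = data["risk_level"]                        # e.g. low / medium / high crowd risk

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1, stratify=y)
for name, clf in [("random forest", RandomForestClassifier(random_state=1)),
                  ("decision tree", DecisionTreeClassifier(random_state=1)),
                  ("naive Bayes", GaussianNB())]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```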


Author(s): Jiarui Yin, Inikuro Afa Michael, Iduabo John Afa

Machine learning plays a key role in present-day crime detection, analysis, and prediction. The goal of this work is to propose methods for predicting crimes classified into different categories of severity. We implemented visualization and analysis of crime statistics from recent years in the city of Boston. We then carried out a comparative study between two supervised learning algorithms, decision tree and random forest, based on the accuracy and processing time of the models when making predictions from geographical and temporal information, splitting the data into training and test sets. The results show that, as expected, random forest achieves 1.54% higher accuracy than decision tree, although at the cost of at least 4.37 times the processing time. The study opens doors to the application of similar supervised methods in crime data analytics and other fields of data science.
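
A minimal sketch of the accuracy-versus-time comparison is given below, assuming a preprocessed Boston crime table; the file name and feature columns are illustrative, not the authors' actual ones.

```python
# Hedged sketch: compare decision tree and random forest on accuracy and wall-clock time.
# The dataset path and the geographical/temporal feature columns are assumptions.
import time
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

crimes = pd.read_csv("boston_crime.csv")      # hypothetical preprocessed dataset
X = crimes[["district_code", "hour", "day_of_week", "month", "lat", "long"]]
y = crimes["severity_class"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

for name, clf in [("decision tree", DecisionTreeClassifier(random_state=7)),
                  ("random forest", RandomForestClassifier(n_estimators=200, random_state=7))]:
    start = time.perf_counter()
    clf.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, clf.predict(X_te))
    print(f"{name}: accuracy={acc:.4f}, time={time.perf_counter() - start:.2f}s")
```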


2019 · Vol 9 (14) · pp. 2789
Author(s): Sadaf Malik, Nadia Kanwal, Mamoona Naveed Asghar, Mohammad Ali A. Sadiq, Irfan Karamat, et al.

Medical health systems have been concentrating on artificial intelligence techniques for speedy diagnosis. However, the recording of health data in a standard form still requires attention so that machine learning can be more accurate and reliable by considering multiple features. The aim of this study is to develop a general framework for recording diagnostic data in an international standard format to facilitate the prediction of disease diagnosis based on symptoms using machine learning algorithms. Efforts were made to ensure error-free data entry by developing a user-friendly interface. Furthermore, multiple machine learning algorithms, including Decision Tree, Random Forest, Naive Bayes, and Neural Network algorithms, were used to analyze patient data based on multiple features, including age, illness history, and clinical observations. These data were formatted according to structured hierarchies designed by medical experts, whereas diagnosis was made as per the ICD-10 coding developed by the American Academy of Ophthalmology. Furthermore, the system is designed to evolve through self-learning by adding new classifications for both diagnosis and symptoms. The classification results from tree-based methods demonstrated that the proposed framework performs satisfactorily, given a sufficient amount of data. Owing to the structured data arrangement, the prediction rate of the random forest and decision tree algorithms is more than 90%, compared with more complex methods such as neural networks and the naïve Bayes algorithm.
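
As a hedged sketch of how tree-based models can be trained on such structured, hierarchy-coded records, the pipeline below one-hot encodes categorical fields and cross-validates a random forest. The field names and the ICD-10 label column are assumptions, not the framework's actual schema.

```python
# Hedged sketch: encode structured categorical record fields and cross-validate a random forest.
# The CSV, the column names, and the "icd10_diagnosis" target are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

records = pd.read_csv("ophthalmic_records.csv")   # hypothetical structured patient records
categorical = ["illness_history", "clinical_observation", "symptom_code"]
X = records[categorical + ["age"]]
y = records["icd10_diagnosis"]

pipe = Pipeline([
    ("encode", ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
        remainder="passthrough")),
    ("model", RandomForestClassifier(n_estimators=300, random_state=0)),
])
print("mean CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```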


2021 · Vol 5 (1) · pp. 35
Author(s): Uttam Narendra Thakur, Radha Bhardwaj, Arnab Hazra

Disease diagnosis through breath analysis has attracted significant attention in recent years due to its noninvasive nature, rapid testing ability, and applicability to patients of all ages. More than 1000 volatile organic compounds (VOCs) exist in human breath, but only selected VOCs are associated with specific diseases. Selective identification of those disease-marker VOCs using an array of multiple sensors is highly desirable in the current scenario. Efficient sensors and suitable classification algorithms are essential for the selective and reliable detection of those disease markers in complex breath. In the current study, we fabricated a noble metal (Au, Pd and Pt) nanoparticle-functionalized MoS2 (Chalcogenides, Sigma Aldrich, St. Louis, MO, USA)-based sensor array for the selective identification of different VOCs. Four sensors, i.e., pure MoS2, Au/MoS2, Pd/MoS2, and Pt/MoS2, were tested under exposure to different VOCs, such as acetone, benzene, ethanol, xylene, 2-propenol, methanol and toluene, at 50 °C. Initially, principal component analysis (PCA) and linear discriminant analysis (LDA) were used to discriminate those seven VOCs. Compared to PCA, LDA discriminated well between the seven VOCs. Four machine learning algorithms, namely k-nearest neighbors (kNN), decision tree, random forest, and multinomial logistic regression, were then used to identify those VOCs. The classification accuracy for the seven VOCs using kNN, decision tree, random forest, and multinomial logistic regression was 97.14%, 92.43%, 84.1%, and 98.97%, respectively. These results confirm that multinomial logistic regression performed best among the four machine learning algorithms at discriminating and differentiating the multiple VOCs that generally exist in human breath.
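
A hedged sketch of the analysis chain (LDA for class separation, then multinomial logistic regression for classification) is given below; the sensor-response files and their layout are assumptions, not the study's actual data.

```python
# Hedged sketch: LDA projection of the sensor-array responses followed by multinomial
# logistic regression. The .npy files and their layout are hypothetical.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X = np.load("sensor_responses.npy")   # assumed: rows = exposures, cols = MoS2/Au/Pd/Pt sensor features
y = np.load("voc_labels.npy")         # assumed: one of the seven VOC labels per exposure

lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)       # 2-D projection used to inspect class separation

logreg = LogisticRegression(max_iter=5000)   # handles the multiclass (multinomial) case directly
print("CV accuracy:", cross_val_score(logreg, X, y, cv=5).mean())
```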


Author(s): A. V. Chevychelov, A. V. Burmistrov, K. Yu. Voyshhev

Today, most tools for detecting malware (trojans, spyware, adware, worms, viruses, and ransomware) are based on a signature approach, which is ineffective for detecting polymorphic malware and malware whose signatures have not been recorded in antivirus databases. This article explores opcode-based methods for detecting malware using machine learning algorithms. The study is carried out on a Microsoft dataset containing 21,653 examples of malicious code. The 20 most informative parameters are selected based on the Fisher criterion, and parameter selection methods and various classifiers (logistic decision tree, random forest, naive Bayes classifier, random tree) are compared, as a result of which an accuracy close to 100% is achieved.
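
A hedged sketch of the described pipeline is shown below: keep the 20 most informative opcode features and compare classifiers. The ANOVA F-score (f_classif) is used here as a stand-in for the Fisher criterion, and the opcode-count matrix layout is an assumption.

```python
# Hedged sketch: select the 20 most informative opcode features, then compare classifiers.
# SelectKBest with f_classif approximates the Fisher-criterion ranking; data files are hypothetical.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X = np.load("opcode_counts.npy")     # assumed: rows = samples, cols = opcode frequencies
y = np.load("malware_family.npy")    # assumed malware class labels

for name, clf in [("random forest", RandomForestClassifier(random_state=0)),
                  ("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("naive Bayes", GaussianNB())]:
    pipe = make_pipeline(SelectKBest(f_classif, k=20), clf)
    print(name, cross_val_score(pipe, X, y, cv=5).mean())
```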


2021 · Vol 2021 · pp. 1-8
Author(s): Luana Ibiapina Cordeiro Calíope Pinheiro, Maria Lúcia Duarte Pereira, Marcial Porto Fernandez, Francisco Mardônio Vieira Filho, Wilson Jorge Correia Pinto de Abreu, et al.

Dementia interferes with an individual's motor, behavioural, and intellectual functions, leaving them unable to perform instrumental activities of daily living. This study aims to identify the best-performing algorithm and the most relevant characteristics for categorising individuals with HIV/AIDS at high risk of dementia through the application of data mining. The principal component analysis (PCA) algorithm was used, and the following machine learning algorithms were compared: logistic regression, decision tree, neural network, KNN, and random forest. The database used for this study was built from data collected from 270 individuals infected with HIV/AIDS and followed up at the outpatient clinic of a reference hospital for infectious and parasitic diseases in the State of Ceará, Brazil, from January to April 2019. The performance of the algorithms was first analysed for the 104 characteristics available in the database; dimensionality reduction then improved the quality of the machine learning algorithms during testing, even while losing about 30% of the variation. When considering only 23 characteristics, the precision of the algorithms was 86% for random forest, 56% for logistic regression, 68% for decision tree, 60% for KNN, and 59% for neural network. The random forest algorithm proved to be more effective than the others, obtaining 84% precision and 86% accuracy.
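
As a hedged sketch of the dimensionality-reduction step, the pipeline below scales the (assumed numeric or already-encoded) clinical features, reduces them with PCA, and cross-validates three of the named classifiers; the file and column names are illustrative only.

```python
# Hedged sketch: PCA-based dimensionality reduction followed by the classifiers named above.
# The CSV, the feature encoding, and the "high_dementia_risk" label are hypothetical.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

data = pd.read_csv("hiv_dementia_features.csv")   # hypothetical: 270 rows, 104 feature columns
X = data.drop(columns=["high_dementia_risk"])
y = data["high_dementia_risk"]

for name, clf in [("random forest", RandomForestClassifier(random_state=0)),
                  ("logistic regression", LogisticRegression(max_iter=2000)),
                  ("KNN", KNeighborsClassifier())]:
    pipe = make_pipeline(StandardScaler(), PCA(n_components=23), clf)
    print(name, cross_val_score(pipe, X, y, cv=5).mean())
```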


2021 · Vol 75 (3) · pp. 83-93
Author(s): Zh. A. Buribayev, Zh. E. Amirgaliyeva, A.S. Ataniyazova, Z. M. Melis, et al.

The article considers the relevance of introducing intelligent weed detection systems in order to save herbicides and pesticides and to obtain environmentally friendly products. A brief review of related scientific work is carried out, describing methods for the identification, classification, and discrimination of weeds based on machine learning algorithms, convolutional neural networks, and deep learning algorithms. This research paper presents a program for detecting weeds on agricultural land using the K-Nearest Neighbors, Random Forest, and Decision Tree algorithms. The dataset is collected from four types of weeds: amaranthus, ambrosia, bindweed, and bromus. According to the results of the assessment, the weed detection accuracy of the K-Nearest Neighbors, Random Forest, and Decision Tree classifiers was 83.3%, 87.5%, and 80%, respectively. Quantitative results obtained on real data demonstrate that the proposed approach can provide good results in classifying low-resolution images of weeds.
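
A hedged sketch of the classification step is shown below: flatten low-resolution weed images into pixel vectors and compare the three named classifiers. The directory layout (one folder per species) and the 64x64 resolution are assumptions, not the authors' actual setup.

```python
# Hedged sketch: classify low-resolution weed images with kNN, random forest, and decision tree.
# The "weeds/" folder structure and image size are hypothetical.
from pathlib import Path
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = [], []
for species_dir in Path("weeds").iterdir():       # weeds/amaranthus, weeds/ambrosia, ...
    if not species_dir.is_dir():
        continue
    for img_path in species_dir.glob("*.jpg"):
        img = Image.open(img_path).convert("RGB").resize((64, 64))
        X.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
        y.append(species_dir.name)
X, y = np.array(X), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=3, stratify=y)
for name, clf in [("K-Nearest Neighbors", KNeighborsClassifier(n_neighbors=5)),
                  ("Random Forest", RandomForestClassifier(n_estimators=200, random_state=3)),
                  ("Decision Tree", DecisionTreeClassifier(random_state=3))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```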

