Predictive model construction for prediction of soil fertility using decision tree machine learning algorithm

2021 ◽  
Vol 8 (1) ◽  
pp. 30-35
Author(s):  
Jayalakshmi R ◽  
Savitha Devi M

The agriculture sector is recognized as the backbone of the Indian economy and plays a crucial role in the growth of the nation’s economy. It depends on weather and other environmental conditions. Factors on which agriculture relies include soil, climate, flooding, fertilizers, temperature, precipitation, crops, insecticides, and herbs. Soil fertility depends on these factors and is hence difficult to predict. However, the agriculture sector in India faces the severe problem of increasing crop productivity. Farmers lack essential knowledge of the nutrient content of the soil and of which crop is best suited to it, and they also lack efficient methods for predicting crops well in advance so that appropriate measures can be taken to improve crop productivity. This paper presents different supervised machine learning algorithms, namely decision tree, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM), to predict soil fertility based on the macro-nutrient and micro-nutrient status found in the dataset. The algorithms are applied to the training dataset and tested on the test dataset, and they are implemented in R. Their performance is analyzed using evaluation metrics such as mean absolute error, cross-validation, and accuracy. The results show that the decision tree produced the best accuracy of 99% with a very low mean square error (MSE).
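
The evaluation metrics named in this abstract (accuracy, mean absolute error, mean square error) can be illustrated with a minimal Python sketch. The paper itself reports an R implementation, so this is not the authors' code. The integer class encoding (0 = low, 1 = medium, 2 = high fertility) and the sample labels are assumptions made only for illustration.

```python
from typing import Sequence


def _check(y_true: Sequence[float], y_pred: Sequence[float]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that equal the true label."""
    _check(y_true, y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted values."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between true and predicted values."""
    _check(y_true, y_pred)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical fertility classes: 0 = low, 1 = medium, 2 = high.
true_labels = [0, 1, 2, 2, 1, 0, 1, 2]
predicted = [0, 1, 2, 1, 1, 0, 1, 2]

print(accuracy(true_labels, predicted))             # 0.875
print(mean_absolute_error(true_labels, predicted))  # 0.125
print(mean_squared_error(true_labels, predicted))   # 0.125
```

MAE and MSE treat the class codes as ordered numbers, which is only meaningful when the classes are ordinal, as assumed here.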

2019 ◽  
Vol 1 (1) ◽  
pp. 384-399 ◽  
Author(s):  
Thais de Toledo ◽  
Nunzio Torrisi

The Distributed Network Protocol (DNP3) is predominantly used by the electric utility industry and, consequently, in smart grids. The Peekaboo attack was created to compromise DNP3 traffic: a man-in-the-middle on a communication link can capture and drop selected encrypted DNP3 messages by using support vector machine learning algorithms. The communication networks of smart grids are an important part of their infrastructure, so it is of critical importance to keep this communication secure and reliable. The main contribution of this paper is to compare the use of machine learning techniques to classify messages of the same protocol exchanged in encrypted tunnels. The study considers four simulated cases of encrypted DNP3 traffic scenarios and four different supervised machine learning algorithms: decision tree, nearest-neighbor, support vector machine, and naive Bayes. The results obtained show that it is possible to extend a Peekaboo attack over multiple substations, using a decision tree learning algorithm, and to gather significant information from a system that communicates using encrypted DNP3 traffic.


The first step in diagnosing breast cancer is identifying the disease. Early detection is important for reducing the mortality rate due to breast cancer. Machine learning algorithms can be used to identify breast cancer. Supervised machine learning algorithms such as the Support Vector Machine (SVM) and the decision tree are widely used in classification problems of this kind. In this study, a machine learning model is proposed that employs the support vector machine and the decision tree. A dataset from the Kaggle data repository, consisting of 569 observations of malignant and benign cases, is used to develop the proposed model. Finally, the model is evaluated on the test set using accuracy, the confusion matrix, precision, and recall as performance metrics. The analysis showed that the SVM has better accuracy, fewer misclassifications, and better precision than the decision tree algorithm. The average accuracy of the SVM is 91.92% and that of the decision tree classification model is 87.12%.
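
As a rough illustration of the confusion matrix, precision, and recall used for evaluation here, the following Python sketch computes them for a binary problem. The label encoding (1 = malignant, 0 = benign) and the sample predictions are assumptions for demonstration, not data from the study.

```python
from typing import NamedTuple, Sequence


class Confusion(NamedTuple):
    """Counts of the four outcomes of a binary classifier."""
    tp: int  # positive predicted as positive
    fp: int  # negative predicted as positive
    fn: int  # positive predicted as negative
    tn: int  # negative predicted as negative


def confusion(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Confusion:
    """Tally the confusion-matrix cells for the chosen positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


def precision(c: Confusion) -> float:
    """Share of positive predictions that are correct."""
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Confusion) -> float:
    """Share of actual positives that were found."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


# Hypothetical labels: 1 = malignant, 0 = benign.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]

cells = confusion(actual, predicted)
print(cells)             # Confusion(tp=3, fp=1, fn=1, tn=3)
print(precision(cells))  # 0.75
print(recall(cells))     # 0.75
```

In a screening setting, recall on the malignant class is usually the figure to watch, since a false negative is a missed cancer.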


2021 ◽  
Vol 11 (15) ◽  
pp. 6728
Author(s):  
Muhammad Asfand Hafeez ◽  
Muhammad Rashid ◽  
Hassan Tariq ◽  
Zain Ul Abideen ◽  
Saud S. Alotaibi ◽  
...  

Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on the optimization of the decision tree have been proposed; however, this line of work is still evolving. This paper presents a novel and robust classifier based on decision tree and tabu search algorithms. To improve performance, our proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. For training the model, we used the clinical data of COVID-19 patients to predict whether a patient is suffering. The experimental results were obtained using our proposed classifier based on the built-in scikit-learn library in Python. An extensive performance comparison against conventional supervised machine learning algorithms is presented using Big O and statistical analysis. Moreover, a performance comparison to optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, the required execution time of 55.6 ms, and the area under the receiver operating characteristic (AUROC) of 0.95 for the proposed method reveal that the proposed classifier algorithm is convenient for large datasets.
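
The entropy that decision trees evaluate at their nodes can be sketched as follows. This shows only the standard Shannon entropy and information gain of a candidate split; it is not the tabu-search procedure proposed in the paper, and the sample labels are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the class labels at a tree node."""
    n = len(labels)
    if n == 0:
        return 0.0
    # abs() only normalizes the -0.0 that a pure node would otherwise yield.
    return abs(-sum((c / n) * math.log2(c / n) for c in Counter(labels).values()))


def information_gain(parent: Sequence[Hashable],
                     left: Sequence[Hashable],
                     right: Sequence[Hashable]) -> float:
    """Entropy reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


# Hypothetical node with two positive and two negative patients.
node = ["pos", "pos", "neg", "neg"]
print(entropy(node))                                           # 1.0
print(information_gain(node, ["pos", "pos"], ["neg", "neg"]))  # 1.0
print(information_gain(node, ["pos", "neg"], ["pos", "neg"]))  # 0.0
```

A split that separates the classes perfectly recovers the full bit of entropy, while a split that leaves both children as mixed as the parent gains nothing.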


Author(s):  
Christian Knaak ◽  
Moritz Kröger ◽  
Frederic Schulze ◽  
Peter Abels ◽  
Arnold Gillner

An effective process monitoring strategy is a requirement for meeting the challenges posed by increasingly complex products and manufacturing processes. To address these needs, this study investigates a comprehensive scheme based on classical machine learning methods, deep learning algorithms, and feature extraction and selection techniques. In a first step, a novel deep learning architecture based on convolutional neural networks (CNN) and gated recurrent units (GRU) is introduced to predict the local weld quality based on mid-wave infrared (MWIR) and near-infrared (NIR) image data. The developed technology is used to discover critical welding defects including lack of fusion (false friends), sagging and lack of penetration, and geometric deviations of the weld seam. Additional work is conducted to investigate the significance of various geometrical, statistical, and spatio-temporal features extracted from the keyhole and weld pool regions. Furthermore, the performance of the proposed deep learning architecture is compared to that of classical supervised machine learning algorithms, such as multi-layer perceptron (MLP), logistic regression (LogReg), support vector machines (SVM), decision trees (DT), random forest (RF) and k-Nearest Neighbors (kNN). Optimal hyperparameters for each algorithm are determined by an extensive grid search. Ultimately, the three best classification models are combined into an ensemble classifier that yields the highest detection rates and achieves the most robust estimation of welding defects among all classifiers studied, which is validated on previously unknown welding trials.
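
One common way to combine three classifiers into an ensemble is majority voting, sketched below in Python. The abstract does not state how the three best models were combined, so the voting rule, the toy threshold classifiers, and the two-element feature vectors are all assumptions for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A classifier here is any function mapping a feature vector to a label.
Classifier = Callable[[Sequence[float]], str]


def majority_vote(classifiers: Sequence[Classifier], features: Sequence[float]) -> str:
    """Return the label predicted by most classifiers.

    On a tie, the label that was voted first wins, because Counter.most_common
    keeps first-seen order among equal counts.
    """
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Three hypothetical stand-ins for trained models.
def model_a(x: Sequence[float]) -> str:
    return "defect" if x[0] > 0.5 else "ok"


def model_b(x: Sequence[float]) -> str:
    return "defect" if x[1] > 0.5 else "ok"


def model_c(x: Sequence[float]) -> str:
    return "defect" if x[0] + x[1] > 1.0 else "ok"


ensemble: List[Classifier] = [model_a, model_b, model_c]

print(majority_vote(ensemble, [0.9, 0.2]))  # defect (two of three vote defect)
print(majority_vote(ensemble, [0.6, 0.1]))  # ok (only model_a votes defect)
```

With an odd number of voters and two classes, a tie cannot occur, which is one practical reason to combine three models rather than two.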


2021 ◽  
Vol 11 (21) ◽  
pp. 10062
Author(s):  
Aimin Li ◽  
Meng Fan ◽  
Guangduo Qin ◽  
Youcheng Xu ◽  
Hailong Wang

Monitoring open water bodies accurately is important for assessing the role of ecosystem services in the context of human survival and climate change. There are many methods available for water body extraction based on remote sensing images, such as the normalized difference water index (NDWI), modified NDWI (MNDWI), and machine learning algorithms. Based on Landsat-8 remote sensing images, this study focuses on the effects of six machine learning algorithms and three threshold methods used to extract water bodies, evaluates the transfer performance of models applied to remote sensing images in different periods, and compares the differences among these models. The results are as follows. (1) Different algorithms require different numbers of samples to reach their optimal performance. The logistic regression algorithm requires a minimum of 110 samples. As the number of samples increases, the order of the optimal model is support vector machine, neural network, random forest, decision tree, and XGBoost. (2) The accuracy of each machine learning algorithm on the test set cannot represent its performance in a local area. (3) When these models are directly applied to remote sensing images in different periods, the area under curve (AUC) indicators of each machine learning algorithm for the three regions all show a significant decline, with decreases ranging from 0.33% to 66.52%, and the differences among the algorithms' performances in the three areas are obvious. Generally, the decision tree algorithm has good transfer performance among the machine learning algorithms, with AUC values of 0.790, 0.518, and 0.697 in the three areas, respectively, and an average of 0.668. The Otsu threshold algorithm is the best among the threshold methods, with AUC values of 0.970, 0.617, and 0.908 in the three regions, respectively, and an average AUC of 0.832.
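
The NDWI and Otsu thresholding mentioned above can be sketched in plain Python. The NDWI formula used is the common green/near-infrared definition, (Green - NIR) / (Green + NIR). The abstract does not specify band choices or the Otsu implementation, so the histogram bin count, the function names, and the four reflectance values are assumptions for illustration.

```python
from typing import List, Sequence


def ndwi(green: Sequence[float], nir: Sequence[float]) -> List[float]:
    """Per-pixel NDWI = (Green - NIR) / (Green + NIR); 0.0 where the sum is zero."""
    return [(g - n) / (g + n) if (g + n) != 0 else 0.0 for g, n in zip(green, nir)]


def otsu_threshold(values: Sequence[float], bins: int = 64) -> float:
    """Threshold that maximizes the between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(h * c for h, c in zip(hist, centers))

    best_var, best_t = -1.0, lo
    count_low, sum_low = 0, 0.0
    for i in range(bins - 1):
        count_low += hist[i]
        sum_low += hist[i] * centers[i]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Proportional to the between-class variance (constant factor dropped).
        between = count_low * count_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_var, best_t = between, lo + (i + 1) * width  # upper edge of bin i
    return best_t


# Hypothetical reflectances: two land pixels followed by two water pixels.
green_band = [0.08, 0.10, 0.30, 0.28]
nir_band = [0.30, 0.30, 0.10, 0.08]

index = ndwi(green_band, nir_band)
threshold = otsu_threshold(index)
water_mask = [v > threshold for v in index]
print(water_mask)  # [False, False, True, True]
```

Water absorbs strongly in the near infrared, so water pixels get positive NDWI values and land pixels negative ones, which gives Otsu's method a bimodal histogram to split.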


The advancement in cyber-attack technologies has ushered in various new attacks that are difficult to detect using traditional intrusion detection systems (IDS). Existing IDS are trained to detect known patterns, so newer attacks bypass them and go undetected. In this paper, a two-level framework is proposed that can be used to detect unknown new attacks using machine learning techniques. In the first level, the known classes of attacks are determined using supervised machine learning algorithms such as Support Vector Machine (SVM) and Neural Networks (NN). The second level uses unsupervised machine learning algorithms such as K-means. The experimentation is carried out with four models on the NSL-KDD dataset in an OpenStack cloud environment. The model with Support Vector Machine for supervised machine learning, Gradual Feature Reduction (GFR) for feature selection, and K-means as the unsupervised algorithm provided the optimum efficiency of 94.56%.
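
The K-means step used in the second level can be illustrated with a minimal Python sketch. It shows only the clustering algorithm itself (Lloyd's iteration with a deterministic start), not the paper's two-level pipeline or its GFR feature selection. The two-dimensional sample points stand in for traffic feature vectors and are purely hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iterations: int = 20) -> Tuple[List[Point], List[int]]:
    """Cluster `points` into `k` groups; returns (centroids, label per point).

    Initialization takes the first k points as centroids so the result is
    deterministic; real uses typically prefer a randomized start.
    """
    centroids: List[Point] = [tuple(p) for p in points[:k]]
    labels: List[int] = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated: List[Point] = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels


# Hypothetical feature vectors forming two well-separated groups.
samples = [(0.0, 0.0), (10.0, 10.0), (0.2, 0.1), (9.8, 10.1), (0.1, 0.3), (10.2, 9.9)]
centers, assignment = kmeans(samples, k=2)
print(assignment)  # [0, 1, 0, 1, 0, 1]
```

In an intrusion detection setting, points that fall far from every centroid are candidates for previously unseen attack types, though that scoring step is not shown here.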


PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0258788
Author(s):  
Sarra Ayouni ◽  
Fahima Hajjej ◽  
Mohamed Maddeh ◽  
Shaha Al-Otaibi

Educational research increasingly emphasizes the potential of student engagement and its impact on performance, retention, and persistence. This construct has been an important paradigm in the higher education field for many decades. However, evaluating and predicting a student’s engagement level in an online environment remains a challenge. The purpose of this study is to propose an intelligent predictive system that predicts the student’s engagement level and then provides students with feedback to enhance their motivation and dedication. Three categories of students are defined depending on their engagement level (Not Engaged, Passively Engaged, and Actively Engaged). We applied three different machine learning algorithms, namely Decision Tree, Support Vector Machine, and Artificial Neural Network, to students’ activities recorded in Learning Management System reports. The results demonstrate that machine learning algorithms can predict the student’s engagement level. In addition, according to the performance metrics of the different algorithms, the Artificial Neural Network has a higher accuracy rate (85%) than the Support Vector Machine (80%) and Decision Tree (75%) classification techniques. Based on these results, the intelligent predictive system sends feedback to the students and alerts the instructor once a student’s engagement level decreases. The instructor can identify the students’ difficulties during the course and motivate them through e-mail reminders, course messages, or scheduling an online meeting.


Author(s):  
Inssaf El Guabassi ◽  
Zakaria Bousalem ◽  
Rim Marah ◽  
Aimad Qazdar

In recent years, there has been a growing demand to predict the future with certainty, and predicting the right information in any area is becoming a necessity. One of the ways to predict the future with certainty is to determine the possible future. In this sense, machine learning is a way to analyze huge datasets to make strong predictions or decisions. The main objective of this research work is to build a predictive model for evaluating students’ performance. The contributions are threefold. The first is to apply several supervised machine learning algorithms (i.e., ANCOVA, Logistic Regression, Support Vector Regression, Log-linear Regression, Decision Tree Regression, Random Forest Regression, and Partial Least Squares Regression) to our education dataset. The second is to compare and evaluate the algorithms used to create a predictive model based on various evaluation metrics. The last is to determine the most important factors that influence the success or failure of the students. The experimental results showed that Log-linear Regression provides better predictions and that behavioral factors influence students’ performance.


2021 ◽  
Vol 2095 (1) ◽  
pp. 012058
Author(s):  
Xiaoyu Xian ◽  
Haichuan Tang ◽  
Yin Tian ◽  
Qi Liu ◽  
Yuming Fan

This paper addresses electric motor fault diagnosis using supervised machine learning classification. A total of 15 distinct fault types are classified, and multi-label strategies are used to classify concurrent faults. We explored, developed, and compared the performance of different types of binary (fault/non-fault), multi-class (fault type), and multi-label (single fault versus combination fault) classifiers. To evaluate the effectiveness of fault identification and classification, we used different supervised machine learning methods, including random forest classification, support vector machine, and neural network classification. Through experiments, we compared these methods over 4 classification regimes and summarized the most suitable machine learning algorithms for different aspects of health diagnosis in the traction motor domain.
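
The difference between the binary, multi-class, and multi-label formulations described here comes down to how the target is encoded. The Python sketch below shows one possible encoding of each. The three fault names are hypothetical placeholders, since the abstract does not list the 15 fault types, and the helper names are illustrative.

```python
from typing import List, Sequence, Set

# Hypothetical fault names; the abstract does not enumerate the 15 types.
FAULT_TYPES: List[str] = ["bearing", "rotor_bar", "stator_winding"]


def binary_target(faults: Set[str]) -> int:
    """Binary regime: 1 if any fault is present, else 0."""
    return 1 if faults else 0


def multiclass_target(faults: Set[str], fault_types: Sequence[str]) -> int:
    """Multi-class regime: 0 = healthy, i + 1 = the i-th fault type.

    A single class index cannot express concurrent faults, so more than
    one fault is rejected.
    """
    if not faults:
        return 0
    if len(faults) > 1:
        raise ValueError("multi-class encoding cannot represent concurrent faults")
    (fault,) = faults
    return list(fault_types).index(fault) + 1


def multilabel_target(faults: Set[str], fault_types: Sequence[str]) -> List[int]:
    """Multi-label regime: one 0/1 indicator per fault type."""
    return [1 if name in faults else 0 for name in fault_types]


observation = {"bearing", "stator_winding"}  # two concurrent faults
print(binary_target(observation))                   # 1
print(multilabel_target(observation, FAULT_TYPES))  # [1, 0, 1]
print(multiclass_target({"rotor_bar"}, FAULT_TYPES))  # 2
```

The multi-label vector is the only one of the three encodings that keeps concurrent faults distinguishable, which is why that strategy is needed for combination faults.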


Large-scale cyberattacks by encryption ransomware currently affect countries and businesses worldwide, with millions of dollars lost in ransom payments. This type of malware encrypts user files, exfiltrates them, and demands high ransoms in exchange for decryption keys. An attacker can use different types of ransomware to steal a victim's files. Examples of ransomware attacks include scareware, mobile ransomware, WannaCry, CryptoLocker, and zero-day ransomware attacks. A zero-day vulnerability is a software security flaw that is known to the software vendor but for which no patch is yet in place. Machine learning algorithms are already used to detect encryption ransomware. This work is based on the analysis of a large number of PE file samples (benign software and ransomware) and uses supervised machine learning algorithms to detect zero-day attacks. The work was carried out and evaluated on the Microsoft Windows operating system (the OS most attacked by encryption ransomware). We used four supervised learning algorithms: Random Forest Classifier, K-Nearest Neighbor, Support Vector Machine, and Logistic Regression. In tests, the random forest algorithm achieved 99.5% accuracy with almost no false positives.

