153 Creation of a feed composition database: Machine learning techniques for automated classification of corn grain products, preliminary results

Abstract: Feed composition tables are commonly used to design research projects and to formulate animal diets. Currently, the National Animal Nutrition Program aims to create a living database containing feed composition information using large datasets provided by commercial laboratories. Using large datasets should ensure representative nutritional values for feeds included in the database; however, managing large datasets requires computer code to manage and classify feeds correctly. Thus, the objective of this project was to develop 2 models based on supervised machine learning techniques for automated classification of corn grain product samples. The database used in the study contained 88,057 samples of corn grain products resulting from the screening procedure previously described by Tran et al. (2016). Two types of supervised machine learning models were developed: decision tree and random forest. Parameters included for feed classification were dry matter, crude protein, neutral detergent fiber, ash, fat, and starch. Models were trained and validated using 70 and 30% of the dataset, respectively. The decision tree and random forest correctly classified 98.3 and 98.8% of the validation dataset, respectively. For each corn grain product, the performance of the decision tree and random forest was: corn germ = 91 and 91%; corn germ meal = 97 and 95%; corn gluten feed, dry = 99 and 100%; corn gluten feed, wet = 100 and 100%; corn gluten meal = 99 and 100%; corn grain, dry = 99 and 99%; corn grain, high moisture = 100 and 100%; corn grain, steam-flaked = 34 and 53%; corn hominy feed = 83 and 88%; and corn screenings = 44 and 60%, respectively. In conclusion, the random forest was superior to the decision tree approach for classifying corn grain products. Further development is required to improve the performance of models for classifying steam-flaked corn grain and corn screenings.
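The evaluation described in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming a simple shuffled 70/30 split and per-class accuracy defined as the fraction of each true class that is predicted correctly. The function names, the seed, and the toy labels are hypothetical; the abstract does not say how the split was drawn or which software was used.

```python
import random
from collections import defaultdict


def split_70_30(samples, seed=0):
    """Shuffle a copy of the samples and split it 70/30 (hypothetical helper)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def per_class_accuracy(true_labels, predicted_labels):
    """Return, for each true class, the fraction of its samples predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


# Toy labels, invented for illustration only.
truth = ["corn germ", "corn germ", "corn screenings", "corn screenings"]
guess = ["corn germ", "corn germ", "corn germ", "corn screenings"]
scores = per_class_accuracy(truth, guess)
```

Reporting accuracy per class matters here because an overall figure can hide weak classes: the abstract reports 98.8% overall for the random forest but only 53% for steam-flaked corn grain.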

Classification of Agriculture Farm Machinery Using Machine Learning and Internet of Things

Symmetry, doi:10.3390/sym13030403, 2021, Vol 13 (3), pp. 403

Author(s): Muhammad Waleed, Tai-Won Um, Tariq Kamal, Syed Muhammad Usman

Keyword(s): Machine Learning, Random Forest, Decision Tree, Supervised Machine Learning, Machine Learning Techniques, Gradient Boosting, Support Vector, Farm Machinery, Learning Techniques

In this paper, we apply multi-class supervised machine learning techniques to classify agricultural farm machinery. The classification of farm machinery is important when performing automatic authentication of field activity in a remote setup. In the absence of a sound machine recognition system, there is every possibility of fraudulent activity taking place. To address this need, we classify the machinery using five machine learning techniques: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Gradient Boosting (GB). To train the models, we use the vibration and tilt of the machinery, recorded using accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator and cultivator. The preliminary analysis of the collected data revealed that the farm machinery (when in operation) showed large variations in vibration and tilt but similar means. Additionally, vibration-based and tilt-based classifications of farm machinery each show good accuracy when used alone (with vibration slightly better than tilt). However, the accuracies improve further when tilt and vibration are used together. Furthermore, all five machine learning algorithms used for classification have an accuracy of more than 82%, with random forest performing best. Gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while gradient boosting takes the most.
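To illustrate why combining tilt and vibration can help, here is a minimal k-nearest-neighbor sketch in plain Python. The readings, the function name `knn_predict`, and the choice of k = 3 are invented for illustration and are not taken from the paper; the toy data is built so that vibration alone cannot separate the rotavator from the cultivator.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (vibration, tilt) readings; not real sensor data.
points = [
    (1.0, 5.0), (1.2, 5.5), (0.9, 4.8),   # leveler
    (3.0, 2.0), (3.1, 2.2), (2.9, 1.9),   # rotavator
    (3.0, 8.0), (3.2, 8.1), (2.8, 7.9),   # cultivator
]
labels = ["leveler"] * 3 + ["rotavator"] * 3 + ["cultivator"] * 3

query = (3.0, 7.5)  # a reading that resembles the cultivator samples

# Using both features together.
both = knn_predict(points, labels, query)

# Using vibration only: keep just the first coordinate of every point.
vibration_only = knn_predict([(p[0],) for p in points], labels, (query[0],))
```

In this toy setup the two-feature classifier picks the cultivator, while the vibration-only classifier confuses it with the rotavator because their vibration readings overlap.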

Prediction of Autism Spectrum Disorder Using Supervised Machine Learning Algorithms

Asian Journal of Computer Science and Technology, doi:10.51983/ajcst-2019.8.3.2734, 2019, Vol 8 (3), pp. 15-18

Author(s): T. Lakshmi Praveena, N. V. Muthu Lakshmi

Keyword(s): Machine Learning, Neural Networks, Autism Spectrum Disorder, Random Forest, Decision Tree, Autism Spectrum, Early Years, Spectrum Disorder, Supervised Machine Learning, Treatment Mechanisms

Autism is a neurodevelopmental disorder that becomes visible in the early years. It is a wide-spectrum disorder, meaning that severity and symptoms can vary from person to person. The Centers for Disease Control reported that one in 68 was diagnosed with autism spectrum disorder, with numbers increasing every year. Detecting autism in adults is a cumbersome procedure because many symptoms can overlap with other mental health and motor impairment disorders, so misdiagnosis can in turn lead to a difficult life without proper diagnosis and effective treatment mechanisms. Machine learning is a powerful computational tool that supports different application domains by learning complex relationships or patterns from large datasets to draw accurate conclusions. Disease assessment through predictive analysis of health data, together with more appropriate treatment mechanisms, is now an active area of research. Supervised learning is an important branch of machine learning that uses a rule-based approach, examining empirical datasets to build accurate predictive models. In this paper, decision tree, random forest, SVM, and neural network algorithms are applied to autism spectrum data collected from the UCI repository, and the results of these algorithms on the autism dataset are presented. Analysis of these results will be useful for making the right decisions in predicting autism spectrum disorder (ASD) at early stages. Thus, early autism intervention using machine learning techniques opens up a new way for autistic individuals to develop the potential to lead a better life by improving their behavioural and emotional skills.

A Supervised Machine Learning Approach to Detect the On/Off State in Parkinson’s Disease Using Wearable Based Gait Signals

Diagnostics, doi:10.3390/diagnostics10060421, 2020, Vol 10 (6), pp. 421

Author(s): Satyabrata Aich, Jinyoung Youn, Sabyasachi Chakraborty, Pyari Mohan Pradhan, Jin-han Park, ...

Keyword(s): Machine Learning, Parkinson's Disease, Random Forest, Wearable Devices, Supervised Machine Learning, Machine Learning Techniques, Support Vector, Healthcare Applications, Reported Data

Fluctuations in motor symptoms are mostly observed in Parkinson’s disease (PD) patients. This characteristic is inevitable and can affect the patients' quality of life. However, it is difficult to collect precise data on the fluctuation characteristics using self-reported data from PD patients. Therefore, it is necessary to develop a suitable technology that can automatically detect the medication state, also termed the “On”/“Off” state, using wearable devices and that can also be used in the home environment. Recently, wearable devices, in combination with powerful machine learning techniques, have shown the potential to be used effectively in critical healthcare applications. In this study, an algorithm is proposed that can detect the medication state automatically using wearable gait signals. A combination of statistical features and spatiotemporal gait features is used as input to four different classifiers: random forest, support vector machine, K nearest neighbour, and Naïve Bayes. In total, 20 PD subjects with definite motor fluctuations were evaluated by comparing the performance of the proposed algorithm with each of the four classifiers. Random forest outperformed the other classifiers, with an accuracy of 96.72%, a recall of 97.35%, and a precision of 96.92%.
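The three figures reported above (accuracy, recall, precision) follow from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions. Treating "On" as the positive class is an assumption, since the abstract does not say which state was positive, and the function name and toy labels are hypothetical.

```python
def binary_metrics(true_labels, predicted_labels, positive="On"):
    """Compute accuracy, precision and recall for a two-class problem."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive and truth == positive:
            tp += 1  # true positive
        elif guess == positive:
            fp += 1  # false positive
        elif truth == positive:
            fn += 1  # false negative
        else:
            tn += 1  # true negative
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented for illustration only.
truth = ["On", "On", "On", "Off", "Off"]
guess = ["On", "On", "Off", "On", "Off"]
metrics = binary_metrics(truth, guess)
```

Recall answers "how many true On states were caught", while precision answers "how many predicted On states were right", so the two can diverge even when accuracy is high.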

Comparative Analysis of Machine Learning Techniques to Identify Churn for Telecom Data

International Journal of Engineering & Technology, doi:10.14419/ijet.v7i3.34.19210, 2018, Vol 7 (3.34), pp. 291

Author(s): M Malleswari, R.J Manira, Praveen Kumar, Murugan .

Keyword(s): Machine Learning, Big Data, Random Forest, Decision Tree, Apache Spark, Machine Learning Techniques, Churn Prediction, Learning Techniques, Boosted Tree, Customer Attrition

Big data analytics has been the focus of large-scale data processing. Machine learning and big data have a promising future in prediction. Churn prediction is one sub-domain of big data, and its advantage is preventing customer attrition, especially in telecom. Churn prediction is a day-to-day affair involving millions, so a solution that prevents customer attrition can save a lot. This paper proposes a comparison of three machine learning techniques, the Decision Tree, Random Forest, and Gradient Boosted Tree algorithms, using Apache Spark. Apache Spark is a data processing engine used in big data that provides in-memory processing for higher processing speed. The analysis is made by extracting the features of the data set and training the model. Scala is a programming language that combines object-oriented and functional programming, which makes it a powerful language. The analysis is implemented using Apache Spark, and modelling is done using Scala ML. The accuracy of the Decision Tree model was 86%, the Random Forest model 87%, and the Gradient Boosted Tree model 85%.

An Innovative Method for Predicting and Classifying Inadequate Accuracy in Heart Disease by Using Decision Tree with K-Nearest Neighbors Algorithm

Alinteri Journal of Agricultural Sciences, doi:10.47059/alinteri/v36i1/ajas21086, 2021, Vol 36 (1), pp. 609-615

Author(s): Mandhapati Rajesh, Dr. K. Malathi

Keyword(s): Machine Learning, Heart Disease, Decision Tree, Supervised Machine Learning, Machine Learning Techniques, Accuracy Rate, K Nearest Neighbors, Machine Learning Methods, Learning Techniques

Aim: To predict heart disease from the medical parameters of cardiac patients with a good accuracy rate, using machine learning methods such as an innovative Decision Tree (DT) algorithm. Materials and Methods: Supervised machine learning techniques, an innovative Decision Tree (N = 20) and K Nearest Neighbour (KNN) (N = 20), are run with five different datasets each time to record five samples. Results: The Decision Tree is used to predict heart disease from various medical conditions; the accuracy achieved is 98% for DT and 72.2% for KNN. The two algorithms, Decision Tree and KNN, are statistically insignificant (= .737) with the independent-samples t-test value (p < 0.005) at a confidence level of 95%. Conclusion: Prediction and classification of heart disease appear to be significantly better with DT than with KNN.

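Machine Learning Applied to the Prediction of Cardiometabolic Diseases Using Metabolic and Behavioral Health Risk Indicators (Aprendizado de Máquina Aplicado à Predição de Doenças Cardiometabólicas com Utilização de Indicadores Metabólicos e Comportamentais de Risco à Saúde)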

doi:10.14210/cotb.v12.p301-308, 2021

Author(s): Alan Lopes de Sousa Freitas, Ana Silvia Degasperi Ieker, Josiane Melchiori Pinheiro, Wilson Rinaldi, Heloise Manica Paris Teixeira

Keyword(s): Machine Learning, Risk Factors, Logistic Regression, Decision Tree, Causes Of Death, Supervised Machine Learning, Machine Learning Techniques, Cardiometabolic Diseases, Learning Techniques, Good Classification

Cardiometabolic diseases developed throughout a worker’s life, such as hypertension, diabetes, dyslipidemia and obesity, are among the main causes of death and are associated with modifiable and controllable risk factors. The general objective of this study was to apply supervised machine learning techniques and compare their performance in predicting the risk of developing cardiometabolic disease among employees of a teaching hospital in southern Brazil. We sought to map the characteristics of individuals who are more likely to develop cardiometabolic diseases. The machine learning models evaluated were Naive Bayes, Decision Tree, Random Forest, KNN, Logistic Regression and SVM. The results obtained in the experiments showed that some supervised machine learning models produce a good classification, depending on the attributes and hyperparameters used.

Machine Learning (Neuronal Net, Random Forest, and C5.0 single decision tree) based on pXRF data as a tool to date sediment layers of the Nile Delta

doi:10.5194/egusphere-egu21-15296, 2021

Author(s): Martin Seeliger, Marina Altmeyer, Andreas Ginau, Robert Schiestl, Jürgen Wunderlich

Keyword(s): Machine Learning, Random Forest, Decision Tree, Nile Delta, Sediment Cores, Machine Learning Algorithms, Machine Learning Techniques, Learning Approaches, Surrounding Areas, Sediment Layers

This paper presents the application of machine-learning techniques on pXRF data to establish a chronology for sediment cores around Tell Buto (Tell el-Fara'in) in the northwestern Nile Delta. As modern laboratories for dating techniques like OSL or 14C are rare in Egypt and sample export is restricted, we are facing a lack of opportunities to create a robust chronology, which is indispensable in modern Geoarchaeology.

Therefore, we present a new approach to transfer archaeological age information gained at the excavation at Buto to corings of the wider Buto area. Sediments of archaeological outcrops and pits with known age are measured using pXRF to create a geochemical “fingerprint” for several historic eras. Afterwards, these “fingerprints” are transferred to corings of the surrounding areas using machine-learning algorithms.

This paper presents 1) the application of three different machine-learning approaches (Neuronal Net, Random Forest, and C5.0 decision tree) to check if archaeological age information can be transferred to sediments far off the settlement mounds using pXRF data, 2) the comparison of all approaches and the evaluation if the easily anticipated decision tree and Random Forest show similar results as the “black-box system” Neuronal Net, and finally, 3) a case study that provides the results of Altmeyer et al. (in review) for Kom el-Gir, a further settlement mound little north of Buto, with a chronostratigraphic framework based on this approach.

Reference: Altmeyer, M., Seeliger, M., Ginau, A., Schiestl, R. & J. Wunderlich (in review): Reconstruction of former channel systems in the northwestern Nile Delta (Egypt) based on corings and electrical resistivity tomography (ERT). (Submitted to E & G Quaternary Science Journal).

Network Intrusion Detection System Using Random Forest and Decision Tree Machine Learning Techniques

First International Conference on Sustainable Technologies for Computational Intelligence - Advances in Intelligent Systems and Computing, doi:10.1007/978-981-15-0029-9_50, 2019, pp. 637-643

Author(s): T. Tulasi Bhavani, M. Kameswara Rao, A. Manohar Reddy

Keyword(s): Machine Learning, Random Forest, Intrusion Detection, Decision Tree, Intrusion Detection System, Detection System, Machine Learning Techniques, Network Intrusion Detection, Network Intrusion, Learning Techniques

A Comparative Study of Random Forest and Genetic Engineering Programming for the Prediction of Compressive Strength of High Strength Concrete (HSC)

Applied Sciences, doi:10.3390/app10207330, 2020, Vol 10 (20), pp. 7330, Cited By ~ 1

Author(s): Furqan Farooq, Muhammad Nasir Amin, Kaffayatullah Khan, Muhammad Rehan Sadiq, Muhammad Faisal Javed, ...

Keyword(s): Machine Learning, Compressive Strength, Random Forest, Decision Tree, High Strength Concrete, High Strength, Supervised Machine Learning, Fine Aggregate, Strength Concrete, Statistical Measures

Supervised machine learning is an emerging approach for predicting the mechanical properties of concrete. This study uses an ensemble random forest (RF) and a gene expression programming (GEP) algorithm to predict the compressive strength of high strength concrete. The parameters include cement content, coarse aggregate to fine aggregate ratio, water, and superplasticizer. Moreover, statistical measures such as MAE, RSE, and RRMSE are used to evaluate the performance of the models. The RF ensemble model performs best, as it builds on decision trees as weak base learners, and gives a strong coefficient of determination, R2 = 0.96, with fewer errors. The GEP algorithm shows good agreement between actual and predicted values and yields an empirical relation. An external statistical check is also applied to the RF and GEP models to validate the variables against the data points. Artificial neural networks (ANNs) and a decision tree (DT) are also applied to the given data sample and compared with the aforementioned models. Permutation feature analysis in Python is performed on the variables to identify the influential parameter. The machine learning algorithm reveals a strong correlation between targets and predictions, with low statistical error measures indicating the accuracy of the entire model.
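The error measures named in this abstract can be sketched as below. The formulas are commonly used textbook definitions, adopted here as assumptions: RSE as the ratio of squared error to the squared deviation from the mean, RRMSE as RMSE divided by the mean of the actual values, and R2 as 1 minus RSE. The paper's exact definitions may differ, and the function name and sample values are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RSE, RRMSE and R2 under the assumed textbook definitions."""
    n = len(actual)
    mean_actual = sum(actual) / n
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total_squares = sum((a - mean_actual) ** 2 for a in actual)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    rse = squared_error / total_squares               # relative squared error
    rrmse = math.sqrt(squared_error / n) / abs(mean_actual)  # RMSE scaled by the mean
    return {"MAE": mae, "RSE": rse, "RRMSE": rrmse, "R2": 1 - rse}


# Invented strength values, for illustration only.
actual = [2.0, 4.0, 6.0]
predicted = [2.0, 5.0, 6.0]
metrics = regression_metrics(actual, predicted)
```

Under these definitions a value of R2 close to 1 and small MAE, RSE and RRMSE all point the same way, which is how the abstract uses them.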

Big data and democratic speech: Predicting deliberative quality using machine learning techniques

Methodological Innovations, doi:10.1177/20597991211010416, 2021, Vol 14 (2), pp. 205979912110104

Author(s): Eleonore Fournier-Tombs, Michael K. MacKenzie

Keyword(s): Machine Learning, Speech Acts, Large Scale, Large Datasets, Supervised Machine Learning, Machine Learning Techniques, Computational Techniques, Learning Techniques, Campaign Speeches, Northern Territories

This article explores techniques for using supervised machine learning to study discourse quality in large datasets. We explain and illustrate the computational techniques that we have developed to facilitate a large-scale study of deliberative quality in Canada’s three northern territories: Yukon, Northwest Territories, and Nunavut. This larger study involves conducting comparative analyses of hundreds of thousands of parliamentary speech acts since the creation of Nunavut 20 years ago. Without computational techniques, we would be unable to conduct such an ambitious and comprehensive analysis of deliberative quality. The purpose of this article is to demonstrate the machine learning techniques that we have developed with the hope that they might be used and improved by other communications scholars who are interested in conducting textual analyses using large datasets. Other possible applications of these techniques might include analyses of campaign speeches, party platforms, legislation, judicial rulings, online comments, newspaper articles, and television or radio commentaries.
