A Review of Characterization Approaches for Smallholder Farmers: Towards Predictive Farm Typologies

Characterization of smallholder farmers has been conducted in various researches by using machine learning algorithms, participatory and expert-based methods. All approaches used end up with the development of some subgroups known as farm typologies. The main purpose of this paper is to highlight the main approaches used to characterize smallholder farmers, presenting the pros and cons of the approaches. By understanding the nature and key advantages of the reviewed approaches, the paper recommends a hybrid approach towards having predictive farm typologies. Search of relevant research articles published between 2007 and 2018 was done on ScienceDirect and Google Scholar. By using a generated search query, 20 research articles related to characterization of smallholder farmers were retained. Cluster-based algorithms appeared to be the mostly used in characterizing smallholder farmers. However, being highly unpredictable and inconsistent, use of clustering methods calls in for a discussion on how well the developed farm typologies can be used to predict future trends of the farmers. A thorough discussion is presented and recommends use of supervised models to validate unsupervised models. In order to achieve predictive farm typologies, three stages in characterization are recommended as tested in smallholder dairy farmers datasets: (a) develop farm types from a comparative analysis of more than two unsupervised learning algorithms by using training models, (b) assess the training models’ robustness in predicting farm types for a testing dataset, and (c) assess the predictive power of the developed farm types from each algorithm by predicting the trend of several response variables.

Download Full-text

Towards Near-Real-Time Intrusion Detection for IoT Devices using Supervised Learning and Apache Spark

Electronics ◽

10.3390/electronics9030444 ◽

2020 ◽

Vol 9 (3) ◽

pp. 444 ◽

Cited By ~ 1

Author(s):

Valerio Morfino ◽

Salvatore Rampone

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithms ◽

Hybrid Approach ◽

Cyber Attacks ◽

Machine Learning Algorithms ◽

Apache Spark ◽

Identification Accuracy ◽

Supervised Machine Learning ◽

Iot Devices

In the fields of Internet of Things (IoT) infrastructures, attack and anomaly detection are rising concerns. With the increased use of IoT infrastructure in every domain, threats and attacks in these infrastructures are also growing proportionally. In this paper the performances of several machine learning algorithms in identifying cyber-attacks (namely SYN-DOS attacks) to IoT systems are compared both in terms of application performances, and in training/application times. We use supervised machine learning algorithms included in the MLlib library of Apache Spark, a fast and general engine for big data processing. We show the implementation details and the performance of those algorithms on public datasets using a training set of up to 2 million instances. We adopt a Cloud environment, emphasizing the importance of the scalability and of the elasticity of use. Results show that all the Spark algorithms used result in a very good identification accuracy (>99%). Overall, one of them, Random Forest, achieves an accuracy of 1. We also report a very short training time (23.22 sec for Decision Tree with 2 million rows). The experiments also show a very low application time (0.13 sec for over than 600,000 instances for Random Forest) using Apache Spark in the Cloud. Furthermore, the explicit model generated by Random Forest is very easy-to-implement using high- or low-level programming languages. In light of the results obtained, both in terms of computation times and identification performance, a hybrid approach for the detection of SYN-DOS cyber-attacks on IoT devices is proposed: the application of an explicit Random Forest model, implemented directly on the IoT device, along with a second level analysis (training) performed in the Cloud.

Download Full-text

Machine Learning Algorithms for Characterization of EMG Signals

International Journal of Information and Electronics Engineering ◽

10.7763/ijiee.2014.v4.433 ◽

2014 ◽

Vol 4 (3) ◽

Cited By ~ 10

Author(s):

Bekir Karlık

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms

Download Full-text

Dominant Feature Selection and Machine Learning-Based Hybrid Approach to Analyze Android Ransomware

Security and Communication Networks ◽

10.1155/2021/7035233 ◽

2021 ◽

Vol 2021 ◽

pp. 1-22

Author(s):

Tanya Gera ◽

Jaiteg Singh ◽

Abolfazl Mehbodniya ◽

Julian L. Webber ◽

Mohammad Shabaz ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Learning Algorithms ◽

Hybrid Approach ◽

Machine Learning Algorithms ◽

Dynamic Monitoring ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Detection Techniques ◽

Dominant Feature

Ransomware is a special malware designed to extort money in return for unlocking the device and personal data files. Smartphone users store their personal as well as official data on these devices. Ransomware attackers found it bewitching for their financial benefits. The financial losses due to ransomware attacks are increasing rapidly. Recent studies witness that out of 87% reported cyber-attacks, 41% are due to ransomware attacks. The inability of application-signature-based solutions to detect unknown malware has inspired many researchers to build automated classification models using machine learning algorithms. Advanced malware is capable of delaying malicious actions on sensing the emulated environment and hence posing a challenge to dynamic monitoring of applications also. Existing hybrid approaches utilize a variety of features combination for detection and analysis. The rapidly changing nature and distribution strategies are possible reasons behind the deteriorated performance of primitive ransomware detection techniques. The limitations of existing studies include ambiguity in selecting the features set. Increasing the feature set may lead to freedom of adept attackers against learning algorithms. In this work, we intend to propose a hybrid approach to identify and mitigate Android ransomware. This study employs a novel dominant feature selection algorithm to extract the dominant feature set. The experimental results show that our proposed model can differentiate between clean and ransomware with improved precision. Our proposed hybrid solution confirms an accuracy of 99.85% with zero false positives while considering 60 prominent features. Further, it also justifies the feature selection algorithm used. The comparison of the proposed method with the existing frameworks indicates its better performance.

Download Full-text

Neuronal Communication Genetic Algorithm-Based Inductive Learning

Research Anthology on Multi-Industry Uses of Genetic Programming and Algorithms ◽

10.4018/978-1-7998-8048-6.ch013 ◽

2021 ◽

pp. 244-259

Author(s):

Abdiya Alaoui ◽

Zakaria Elberrichi

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

Predictive Accuracy ◽

Inductive Learning ◽

Learning Algorithms ◽

Hybrid Approach ◽

Machine Learning Algorithms ◽

Rule Based ◽

Neuronal Communication ◽

Robust Model

The development of powerful learning strategies in the medical domain constitutes a real challenge. Machine learning algorithms are used to extract high-level knowledge from medical datasets. Rule-based machine learning algorithms are easily interpreted by humans. To build a robust rule-based algorithm, a new hybrid metaheuristic was proposed for the classification of medical datasets. The hybrid approach uses neural communication and genetic algorithm-based inductive learning to build a robust model for disease prediction. The resulting classification models are characterized by good predictive accuracy and relatively small size. The results on 16 well-known medical datasets from the UCI machine learning repository shows the efficiency of the proposed approach compared to other states-of-the-art approaches.

Download Full-text

Characterization of machine learning algorithms for slippage estimation in planetary exploration rovers

Journal of Terramechanics ◽

10.1016/j.jterra.2018.12.001 ◽

2019 ◽

Vol 82 ◽

pp. 23-34 ◽

Cited By ~ 2

Author(s):

Ramon Gonzalez ◽

Samuel Chandler ◽

Dimi Apostolopoulos

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Planetary Exploration ◽

Machine Learning Algorithms

Download Full-text

HEART DISEASE PREDICTION SYSTEM USING MACHINE LEARNING ALGORITHM

Iraqi Journal of Information & Communications Technology ◽

10.31987/ijict.1.1.153 ◽

2021 ◽

Vol 1 (1) ◽

pp. 146-176

Author(s):

Israa Nadher ◽

Mohammad Ayache ◽

Hussein Kanaan

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Digital Data ◽

Disease Prediction ◽

Prediction System ◽

Testing Dataset ◽

Starting Point

Abstract—Information decision support systems are becomingmore in use as we are living in the era of digital data andrise of artificial intelligence. Heart disease as one of the mostknown and dangerous is getting very important attention, thisattention is translated into digital and prediction system thatdetects the presence of disease according to the available dataand information. Such systems faced a lot of problems since thefirst rise, but now with the deveolopment of machine learnigfield we are using them in developing new models to detect thepresence of this disease, in addition to algorithms data is veryimportant which also form the heart of the predicton systems,as we know prediction algorithms take decisions and thesedecisions must be based on facts, and these facts are extractedfrom data, as a result data is the starting point of every system.In this paper we propose a Heart Disease Prediction Systemusing Machine Learning Algorithms, in terms of data we usedCleveland dataset, this dataset is normalized then divided intothree scnearios in terms of traning and testing respectively,80%-20%, 50%-50%, 30%-70%. In each case of dataset ifit is normalized or not we will have these three scenarios.We used three machine learning algorithms for every scenarioof the mentioned before which are SVM, SMO and MLP, inthese algorithms we’ve used two different kernels to test theresults upon that. These two types of simulation are added tothe collection of scenarios mentioned above to become as thefollowing we have at the main level two types normalized andunnormalized dataset, then for each one we have three typesaccording to the amount of training and testing dataset, thenfor each of these scenarios we have two scenarios according tothe type of kernel to become 30 scenarios in total, our proposedsystem have shown a dominance in terms of accuracy over theother previous works.

Download Full-text

Prediction and Characterization of Heating Load Energy Performance of Residential Building Machine Learning Algorithms

10.1007/978-3-030-92038-8_5 ◽

2021 ◽

pp. 46-56

Author(s):

Aissa Boudjella ◽

Manal Y. Boudjella ◽

Mohamed E. Bellebna ◽

Nasreddine Aoumeur ◽

Samir Belhouari

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Residential Building ◽

Energy Performance ◽

Machine Learning Algorithms ◽

Heating Load

Download Full-text

Predicting 180-day mortality for women with ovarian cancer using machine learning and patient-reported outcome data.

Journal of Clinical Oncology ◽

10.1200/jco.2021.39.15_suppl.e13555 ◽

2021 ◽

Vol 39 (15_suppl) ◽

pp. e13555-e13555

Author(s):

Chris Sidey-Gibbons ◽

Charlotte C. Sun ◽

Cai Xu ◽

Amy Schneider ◽

Sheng-Chieh Lu ◽

...

Keyword(s):

Machine Learning ◽

Ovarian Cancer ◽

End Of Life ◽

Learning Algorithms ◽

The United States ◽

Patient Reported Outcome ◽

Machine Learning Algorithms ◽

Electronic Health Record Data ◽

Testing Dataset ◽

Patient Reported

e13555 Background: Contra to national guidelines, women with ovarian cancer often receive aggressive treatment until the end-of-life. We trained machine learning algorithms to predict mortality within 180 days for women with ovarian cancer. Methods: Data were collected data from a single academic cancer institution in the United States. Women with recurrent ovarian cancer completed biopsychosocial patient-reported outcome measures (PROMs) every 90 days. We randomly partitioned our dataset into training and testing samples with a 2:1 ratio. We used synthetic minority oversampling to reduce class imbalance in the training dataset. We fitted training data to six machine learning algorithms and combined their classifications on the testing dataset into a voting ensemble. We assessed the accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) for each algorithm. Results: We recruited 245 patients who completed 1319 assessments. The final voting ensemble performed well across all performance metrics (Accuracy = .79, Sensitivity = .71, Specificity = .80, AUROC = .76). The algorithm correctly identified 25 of the 35 women in the testing dataset who died within 180 days of assessment Conclusions: Machine learning algorithms trained using PROM data offer state-of-the-art performance in predicting whether a woman with ovarian cancer will reach the end-of-life within 180 days. We highlight the importance of PROM data in ML models of mortality. Our model exhibits substantial improvements in prediction sensitivity compared to other similar models trained using electronic health record data alone. This model could inform clinical decision making and improve the uptake of appropriate end-of-life care. Further research is warranted to expand on these findings in a larger, more diverse sample.

Download Full-text