Research on Random Forest Algorithm Based on Big Data in Parallel Load Forecasting

The paper propose a parallel load forecasting method based on random forest algorithm, through the analysis of historical load, temperature, wind speed and other data, the algorithm can shorten the load forecasting time and improve the processing capability of large data. This paper also designs and implements parallel load forecasting prototype system based on power user side large data of a Hadoop, including data cluster management, data management, prediction classification algorithm library and other functions. The experimental results show that the accuracy of parallel stochastic forest algorithm is obviously higher than decision tree, and the prediction accuracy on the different data sets is generally higher than decision tree, and it can better analyze and process large data.

Download Full-text

A Comparative Analysis of Machine/Deep Learning Models for Parking Space Availability Prediction

Sensors ◽

10.3390/s20010322 ◽

2020 ◽

Vol 20 (1) ◽

pp. 322 ◽

Cited By ~ 9

Author(s):

Faraz Malik Awan ◽

Yasir Saleem ◽

Roberto Minerva ◽

Noel Crespi

Keyword(s):

Deep Learning ◽

Comparative Analysis ◽

Random Forest ◽

Decision Tree ◽

Multilayer Perceptron ◽

Large Data ◽

Data Sets ◽

Application Domain ◽

Parking Space ◽

Data Set

Machine/Deep Learning (ML/DL) techniques have been applied to large data sets in order to extract relevant information and for making predictions. The performance and the outcomes of different ML/DL algorithms may vary depending upon the data sets being used, as well as on the suitability of algorithms to the data and the application domain under consideration. Hence, determining which ML/DL algorithm is most suitable for a specific application domain and its related data sets would be a key advantage. To respond to this need, a comparative analysis of well-known ML/DL techniques, including Multilayer Perceptron, K-Nearest Neighbors, Decision Tree, Random Forest, and Voting Classifier (or the Ensemble Learning Approach) for the prediction of parking space availability has been conducted. This comparison utilized Santander’s parking data set, initiated while working on the H2020 WISE-IoT project. The data set was used in order to evaluate the considered algorithms and to determine the one offering the best prediction. The results of this analysis show that, regardless of the data set size, the less complex algorithms like Decision Tree, Random Forest, and KNN outperform complex algorithms such as Multilayer Perceptron, in terms of higher prediction accuracy, while providing comparable information for the prediction of parking space availability. In addition, in this paper, we are providing Top-K parking space recommendations on the basis of distance between current position of vehicles and free parking spots.

Download Full-text

Accident Prediction Accuracy Assessment for Highway-Rail Grade Crossings Using Random Forest Algorithm Compared with Decision Tree

Reliability Engineering & System Safety ◽

10.1016/j.ress.2020.106931 ◽

2020 ◽

Vol 200 ◽

pp. 106931 ◽

Cited By ~ 10

Author(s):

Xiaoyi Zhou ◽

Pan Lu ◽

Zijian Zheng ◽

Denver Tolliver ◽

Amin Keramati

Keyword(s):

Random Forest ◽

Decision Tree ◽

Prediction Accuracy ◽

Accuracy Assessment ◽

Random Forest Algorithm ◽

Accident Prediction ◽

Grade Crossings

Download Full-text

Prediction of Breast Cancer using Decision tree and Random Forest Algorithm

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i2.226229 ◽

2018 ◽

Vol 6 (2) ◽

pp. 226-229

Author(s):

N.Sridevi . ◽

◽

S.Anitha . ◽

Keyword(s):

Breast Cancer ◽

Random Forest ◽

Decision Tree ◽

Random Forest Algorithm

Download Full-text

Development of a Safety Management System Tracking the Weight of Heavy Objects Carried by Construction Workers Using FSR Sensors

Applied Sciences ◽

10.3390/app11041378 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1378

Author(s):

Seung Hyun Lee ◽

Jaeho Son

Keyword(s):

Random Forest ◽

Decision Tree ◽

Safety Management ◽

Tracking System ◽

Construction Workers ◽

Gradient Boosting ◽

Prototype System ◽

Construction Site ◽

Average Accuracy ◽

Smart Safety

It has been pointed out that the act of carrying a heavy object that exceeds a certain weight by a worker at a construction site is a major factor that puts physical burden on the worker’s musculoskeletal system. However, due to the nature of the construction site, where there are a large number of workers simultaneously working in an irregular space, it is difficult to figure out the weight of the object carried by the worker in real time or keep track of the worker who carries the excess weight. This paper proposes a prototype system to track the weight of heavy objects carried by construction workers by developing smart safety shoes with FSR (Force Sensitive Resistor) sensors. The system consists of smart safety shoes with sensors attached, a mobile device for collecting initial sensing data, and a web-based server computer for storing, preprocessing and analyzing such data. The effectiveness and accuracy of the weight tracking system was verified through the experiments where a weight was lifted by each experimenter from +0 kg to +20 kg in 5 kg increments. The results of the experiment were analyzed by a newly developed machine learning based model, which adopts effective classification algorithms such as decision tree, random forest, gradient boosting algorithm (GBM), and light GBM. The average accuracy classifying the weight by each classification algorithm showed similar, but high accuracy in the following order: random forest (90.9%), light GBM (90.5%), decision tree (90.3%), and GBM (89%). Overall, the proposed weight tracking system has a significant 90.2% average accuracy in classifying how much weight each experimenter carries.

Download Full-text

A Short-Term Load Forecasting Model Based on Improved Random Forest Algorithm

2020 7th International Forum on Electrical Engineering and Automation (IFEEA) ◽

10.1109/ifeea51475.2020.00195 ◽

2020 ◽

Author(s):

Huang Yiling ◽

Huang Shaofeng

Keyword(s):

Random Forest ◽

Load Forecasting ◽

Forecasting Model ◽

Random Forest Algorithm ◽

Short Term ◽

Model Based ◽

Short Term Load Forecasting

Download Full-text

Data Mining: A Bagged Decision Tree Classifier Algorithm For Ids Intrusion Detection System Based Attacks Classification

Design Engineering ◽

10.17762/de.v2021i04.1800 ◽

2021 ◽

pp. 1826-1839

Author(s):

Sandeep Adhikari, Dr. Sunita Chaudhary

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Decision Tree ◽

Intrusion Detection System ◽

Detection System ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Decision Tree Classifier ◽

Tree Classifier

The exponential growth in the use of computers over networks, as well as the proliferation of applications that operate on different platforms, has drawn attention to network security. This paradigm takes advantage of security flaws in all operating systems that are both technically difficult and costly to fix. As a result, intrusion is used as a key to worldwide a computer resource's credibility, availability, and confidentiality. The Intrusion Detection System (IDS) is critical in detecting network anomalies and attacks. In this paper, the data mining principle is combined with IDS to efficiently and quickly identify important, secret data of interest to the user. The proposed algorithm addresses four issues: data classification, high levels of human interaction, lack of labeled data, and the effectiveness of distributed denial of service attacks. We're also working on a decision tree classifier that has a variety of parameters. The previous algorithm classified IDS up to 90% of the time and was not appropriate for large data sets. Our proposed algorithm was designed to accurately classify large data sets. Aside from that, we quantify a few more decision tree classifier parameters.

Download Full-text

A Detailed Study on Classification Algorithms in Big Data

Big Data Analytics for Sustainable Computing - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-9750-6.ch002 ◽

2020 ◽

pp. 30-46

Author(s):

Saranya N. ◽

Saravana Selvam

Keyword(s):

Big Data ◽

Random Forest ◽

Linear Regression ◽

Comprehensive Evaluation ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Classification Methods ◽

Computing Science ◽

Data Collections

After an era of managing data collection difficulties, these days the issue has turned into the problem of how to process these vast amounts of information. Scientists, as well as researchers, think that today, probably the most essential topic in computing science is Big Data. Big Data is used to clarify the huge volume of data that could exist in any structure. This makes it difficult for standard controlling approaches for mining the best possible data through such large data sets. Classification in Big Data is a procedure of summing up data sets dependent on various examples. There are distinctive classification frameworks which help us to classify data collections. A few methods that discussed in the chapter are Multi-Layer Perception Linear Regression, C4.5, CART, J48, SVM, ID3, Random Forest, and KNN. The target of this chapter is to provide a comprehensive evaluation of classification methods that are in effect commonly utilized.

Download Full-text

Sentiment Analysis of Social Media Users Using Naïve Bayes, Decision Tree, Random Forest Algorithm: A Case Study of Draft Law on the Elimination of Sexual Violence (RUU PKS)

2019 International Conference on Sustainable Engineering and Creative Computing (ICSECC) ◽

10.1109/icsecc.2019.8907228 ◽

2019 ◽

Author(s):

Khalisa Virra ◽

Rachmadita Andreswari ◽

Muhammad Azani Hasibuan

Keyword(s):

Social Media ◽

Random Forest ◽

Decision Tree ◽

Sexual Violence ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Random Forest Algorithm

Download Full-text

Sentiment Analysis on Youtube Social Media Using Decision Tree and Random Forest Algorithm: A Case Study

2020 International Conference on Data Science and Its Applications (ICoDSA) ◽

10.1109/icodsa50139.2020.9213078 ◽

2020 ◽

Author(s):

Mohammad Aufar ◽

Rachmadita Andreswari ◽

Dita Pramesti

Keyword(s):

Social Media ◽

Random Forest ◽

Decision Tree ◽

Sentiment Analysis ◽

Random Forest Algorithm

Download Full-text

Usage of Data Mining Techniques in Predicting the Heart Diseases Decision Tree & Random Forest Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.h7168.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 963-967

Keyword(s):

Data Mining ◽

Heart Disease ◽

Random Forest ◽

Early Diagnosis ◽

Decision Tree ◽

Heart Diseases ◽

Classification Algorithms ◽

Random Forest Algorithm ◽

Medical Field ◽

Data Mining Techniques

Nowadays, heart disease is the main cause of several deaths among all other diseases. Due to the lack of resources in the medical field, the prediction of heart diseases becomes a major problem. For early diagnosis and treatment, some classification algorithms such as Decision Tree and Random Forest Algorithm are used. The data mining techniques compare the accuracy of the algorithm and predict heart diseases. The main aim of this paper is to predict heart disease based on the dataset values. In this paper we are comparing the accuracy of above two algorithms. To implement these methods the following steps are used. In first phase, a dataset of 13 attributes is collected and it was applied on classification techniques using the Decision tree and Random Forest Algorithms. Finally, the accuracy is collected for both the algorithms. In this paper we observed that random forest is generating better results than decision tree in prediction of heart diseases.

Download Full-text