Prediction of Breast Cancer using Decision tree and Random Forest Algorithm

2018 ◽  
Vol 6 (2) ◽  
pp. 226-229
Author(s):  
N. Sridevi ◽  
S. Anitha
2019 ◽  
Vol 8 (4) ◽  
pp. 4879-4881

Breast cancer is one of the most dreaded diseases and a potential cause of death in women. Every year, the death rate due to breast cancer increases drastically. Classification, a data mining technique, is an effective way to categorize data, and it is especially useful in the medical field, where diagnosis and analysis are carried out with such techniques. The Wisconsin Breast Cancer dataset is used to compare SVM, Logistic Regression, Naïve Bayes and Random Forest. The main objective is to determine the efficiency of each algorithm by evaluating how correctly it classifies the data, measured by accuracy, and how much time it consumes. In the experiments performed, the Random Forest algorithm shows the highest accuracy (99.76%) and the lowest error rate. All experiments were executed in a simulated environment on the Anaconda Data Science Platform.
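The evaluation described above (accuracy, error rate, and time consumed per classifier) can be sketched as a small benchmark harness. This is a minimal illustration under stated assumptions, not the authors' code: the names `accuracy`, `benchmark`, `fit_majority` and `fit_threshold` are hypothetical, the two toy "classifiers" merely stand in for SVM, Logistic Regression, Naïve Bayes and Random Forest, and the data is a made-up toy set rather than the Wisconsin Breast Cancer dataset.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def benchmark(fit, X_train, y_train, X_test, y_test):
    """Train with `fit`, predict the test rows, and report accuracy,
    error rate, and elapsed wall-clock seconds (training plus prediction)."""
    start = time.perf_counter()
    predict = fit(X_train, y_train)  # `fit` returns a predict(row) callable
    y_pred = [predict(row) for row in X_test]
    elapsed = time.perf_counter() - start
    acc = accuracy(y_test, y_pred)
    return {"accuracy": acc, "error_rate": 1.0 - acc, "seconds": elapsed}


def fit_majority(X, y):
    """Hypothetical baseline: always predict the most common training label."""
    majority = max(set(y), key=y.count)
    return lambda row: majority


def fit_threshold(X, y):
    """Hypothetical rule: label 1 if the first feature exceeds its training mean."""
    cut = sum(row[0] for row in X) / len(X)
    return lambda row: 1 if row[0] > cut else 0


# Toy data (assumption): one feature, binary labels.
X_train, y_train = [[1.0], [2.0], [8.0], [9.0]], [0, 0, 1, 1]
X_test, y_test = [[1.5], [7.5], [3.0], [8.5]], [0, 1, 0, 1]

for name, fit in [("majority", fit_majority), ("threshold", fit_threshold)]:
    print(name, benchmark(fit, X_train, y_train, X_test, y_test))
```

Any real classifier could be plugged in by wrapping its training routine in a function that returns a per-row `predict` callable; the timing figure then covers both training and prediction, which is one plausible reading of "time consumption" in the abstract.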


Nowadays, heart disease is the main cause of death among all diseases. Because resources in the medical field are scarce, predicting heart disease is a major problem. Classification algorithms such as Decision Tree and Random Forest are used for early diagnosis and treatment. These data mining techniques are used to predict heart disease, and their accuracy is compared. The main aim of this paper is to predict heart disease from the dataset values and to compare the accuracy of the two algorithms. The method proceeds as follows. First, a dataset of 13 attributes is collected, and the Decision Tree and Random Forest classifiers are applied to it. Then the accuracy of each algorithm is recorded. We observe that Random Forest produces better results than Decision Tree in the prediction of heart disease.
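The difference between a single tree and a forest can be sketched in plain Python. The sketch below is deliberately simplified and is not the paper's implementation: each "tree" is a one-split decision stump rather than a full decision tree, the function names `fit_stump` and `fit_forest` are hypothetical, and the eight-row, two-attribute data set is invented (the paper uses 13 attributes). What it does show is the forest recipe: bootstrap sampling of rows, a random subset of features per tree, and a majority vote.

```python
import random
from collections import Counter


def fit_stump(X, y, features):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in features:
        for threshold in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda row: left_label if row[f] <= threshold else right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Bagged stumps: bootstrap rows, random feature subset, majority vote."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = max(1, int(d ** 0.5))  # features considered per tree
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(d), k)
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return lambda row: Counter(t(row) for t in trees).most_common(1)[0][0]


# Toy data (assumption): feature 0 is informative, feature 1 is noise.
X = [[1, 5], [2, 3], [3, 8], [4, 1], [6, 4], [7, 9], [8, 2], [9, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

single = fit_stump(X, y, features=[0, 1])
forest = fit_forest(X, y)
print("single stump:", [single(row) for row in X])
print("forest vote: ", [forest(row) for row in X])
```

On a toy set this small the comparison says nothing about real accuracy; the point is only the structure of the two models being compared.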


Author(s):  
H. Sahu ◽  
D. Haldar ◽  
A. Danodia ◽  
S. Kumar

<p><strong>Abstract.</strong> A study was conducted in Saharanpur District of Uttar Pradesh to assess the potential of Sentinel-1A SAR data in orchard crop classification. The objective of the study was to evaluate three classifiers on Sentinel-1A SAR data: the maximum likelihood classifier, the decision tree algorithm and the random forest algorithm. An attempt is made to classify orchard crops from Sentinel-1A SAR data using this approach. Here the rule-based classifiers, the decision tree algorithm and the random forest algorithm, are compared with the conventional maximum likelihood classifier. Statistical analysis of the classification shows that the distribution of the crop, forest, orchard, settlement and waterbody classes was 17.47<span class="thinspace"></span>%, 0.47<span class="thinspace"></span>%, 28.3<span class="thinspace"></span>%, 28.3<span class="thinspace"></span>% and 25.5<span class="thinspace"></span>% respectively for all the classification algorithms, but the root mean square error for the maximum likelihood classifier (1.278) is higher than for the decision tree algorithm (1.196) and the random forest algorithm (1.193). Of the three, the percentage of correct predictions in December 2017 is highest for the decision tree algorithm (73.4), followed by the random forest algorithm (72.5), and lowest for the maximum likelihood classifier (66.8). The accuracy for the orchard class is 0.81 for the maximum likelihood classifier, 0.80 for the decision tree algorithm and 0.78 for the random forest algorithm. Thus Sentinel-1A SAR data was effectively utilized for the classification of orchard crops.</p>
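The three kinds of figures quoted in this abstract (root mean square error, percentage of correct predictions, and per-class accuracy) can be computed as in the sketch below. The abstract does not say how the errors were defined, so this is an assumption: land-cover classes are encoded as integer codes, RMSE is taken over those codes, and per-class accuracy is the fraction of each true class that was labeled correctly. The function names and the eight-pixel sample are hypothetical.

```python
import math
from collections import defaultdict


def rmse(y_true, y_pred):
    """Root mean square error over integer class codes (assumed encoding)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def percent_correct(y_true, y_pred):
    """Percentage of samples whose predicted class equals the true class."""
    return 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """For each true class, the fraction of its samples labeled correctly."""
    totals, hits = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Made-up reference and predicted labels; codes 1..5 stand for
# crop, forest, orchard, settlement and waterbody (assumed encoding).
reference = [1, 1, 2, 3, 3, 3, 4, 5]
predicted = [1, 2, 2, 3, 3, 1, 4, 5]

print("RMSE:", rmse(reference, predicted))
print("percent correct:", percent_correct(reference, predicted))
print("per-class accuracy:", per_class_accuracy(reference, predicted))
```

Note that RMSE over class codes depends on how the classes are numbered, which is one reason the ranking by RMSE need not agree with the ranking by percentage correct.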


2018 ◽  
Vol 228 ◽  
pp. 01020 ◽  
Author(s):  
Qingqing Liu

The paper proposes a parallel load forecasting method based on the random forest algorithm. By analyzing historical load, temperature, wind speed and other data, the algorithm can shorten load forecasting time and improve the capability to process large data. The paper also designs and implements a prototype parallel load forecasting system on Hadoop for power user-side big data, including data cluster management, data management, a prediction and classification algorithm library, and other functions. The experimental results show that the accuracy of the parallel random forest algorithm is clearly higher than that of a decision tree, that its prediction accuracy on different data sets is generally higher than that of a decision tree, and that it can better analyze and process large data.
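A random forest parallelizes naturally because each tree is trained independently on its own bootstrap sample. The sketch below illustrates only that structure and makes several assumptions: the paper's system runs on Hadoop, whereas this uses a thread pool in a single process (CPython threads will not speed up CPU-bound training, so treat it as a stand-in for distributed workers); each "tree" is a one-split regression stump on a single feature; and the function names and the temperature and load numbers are invented.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fit_regression_stump(rows, targets):
    """One split on feature 0; predict the mean target on each side."""
    best = None  # (squared_error, threshold, left_mean, right_mean)
    for threshold in sorted({r[0] for r in rows}):
        left = [t for r, t in zip(rows, targets) if r[0] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[0] > threshold]
        left_mean = mean(left)
        right_mean = mean(right) if right else left_mean
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda r: left_mean if r[0] <= threshold else right_mean


def fit_one(job):
    """Worker task: draw a bootstrap sample and fit one stump on it."""
    rows, targets, seed = job
    rng = random.Random(seed)
    idx = [rng.randrange(len(rows)) for _ in rows]
    return fit_regression_stump([rows[i] for i in idx], [targets[i] for i in idx])


def fit_parallel_forest(rows, targets, n_trees=16, workers=4):
    """Fit the stumps concurrently; the forecast is the average of their outputs."""
    jobs = [(rows, targets, seed) for seed in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stumps = list(pool.map(fit_one, jobs))
    return lambda r: mean(s(r) for s in stumps)


# Invented history: [temperature in degrees C] -> load in MW.
history = [[18.0], [20.0], [22.0], [25.0], [28.0], [30.0], [33.0], [35.0]]
load = [410.0, 415.0, 430.0, 455.0, 500.0, 540.0, 600.0, 640.0]

forecast = fit_parallel_forest(history, load)
print("forecast at 29 degrees:", forecast([29.0]))
```

In a cluster setting the `fit_one` jobs would be distributed to separate machines and only the fitted trees collected back, which is the part of the design that lets training time fall as workers are added.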

