SENTIMENT ANALYSIS ON TWITTER OF PSBB EFFECT USING MACHINE LEARNING

2020 ◽  
Vol 17 (2) ◽  
pp. 143-150
Author(s):  
Irwansyah Saputra ◽  
Jose Andrean Halomoan ◽  
Adam Bagusmugi Raharjo ◽  
Cyra Rezky Ananda Syavira

A collection of tweets from Twitter users about PSBB can be used for sentiment analysis. The data obtained are processed using data mining techniques, including text mining, tokenization, transformation, classification, and stemming. Three algorithms are then compared to find the best accuracy: Decision Tree, K-NN, and the Naïve Bayes Classifier. The RapidMiner application is used to facilitate data processing. The best result in this study came from the Decision Tree algorithm, with an accuracy of 83.3%, precision of 79%, and recall of 87.17%.
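
The pipeline described above (tokenize, classify, then score by accuracy, precision, and recall) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the study itself used RapidMiner, the tiny English training set is hypothetical, and the function names (`tokenize`, `train_naive_bayes`, `predict`, `evaluate`) are invented for this sketch. Only the Naïve Bayes classifier is shown; stemming is omitted.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase a tweet and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(samples):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def predict(model, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def evaluate(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical toy tweets, not data from the study.
TRAIN = [
    ("psbb keeps us safe", "positive"),
    ("good policy thank you", "positive"),
    ("psbb ruins my income", "negative"),
    ("bad policy no income", "negative"),
]
model = train_naive_bayes(TRAIN)
```

Calling `predict(model, "thank you good psbb")` classifies a new tweet, and `evaluate` turns lists of true and predicted labels into the three metrics the abstract reports.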

2012 ◽  
Vol 532-533 ◽  
pp. 1685-1690 ◽  
Author(s):  
Zhi Kang Luo ◽  
Huai Ying Sun ◽  
De Wang

This paper presents an improved SPRINT algorithm. The original SPRINT is a scalable, parallelizable decision tree algorithm that is popular in the data mining and machine learning communities. To improve its efficiency, we first compute the information gain ratio of each candidate splitting attribute and select the best one. We then calculate the best splitting point for that attribute only. Because this avoids many split-point calculations for the other attributes, the improved algorithm effectively reduces computation.
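
The two-stage idea (rank attributes by gain ratio, then search for a split point only on the winner) might look roughly like the sketch below. It is not the authors' implementation. It assumes attributes take a small number of discrete numeric values so that gain ratio can be computed per value, and it assumes the split point is scored with the Gini index over midpoints between distinct values. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of hashable items."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

def best_attribute(rows, labels):
    """Stage 1: pick the attribute with the highest gain ratio."""
    return max(rows[0], key=lambda a: gain_ratio([r[a] for r in rows], labels))

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split_point(values, labels):
    """Stage 2: try midpoints between distinct values, keep the lowest weighted Gini."""
    distinct = sorted(set(values))
    best_point, best_score = None, math.inf
    for lo, hi in zip(distinct, distinct[1:]):
        point = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= point]
        right = [y for v, y in zip(values, labels) if v > point]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_point, best_score = point, score
    return best_point

# Hypothetical toy data: attribute "a" separates the classes, "b" does not.
rows = [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 1}, {"a": 2, "b": 2}]
labels = ["yes", "yes", "no", "no"]
chosen = best_attribute(rows, labels)
split = best_split_point([r[chosen] for r in rows], labels)
```

The saving comes from calling `best_split_point` once, on the chosen attribute, instead of once per attribute.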


2011 ◽  
Vol 267 ◽  
pp. 732-737 ◽  
Author(s):  
Ming Du ◽  
Shu Mei Wang ◽  
Gu Gong

Decision trees are an important learning method in machine learning and data mining. This paper discusses a method for choosing the best attribute based on information entropy. It analyzes the classification process and its characteristics, as well as knowledge discovery based on decision trees, in the context of applying decision trees to data mining. Through an example, the paper shows in detail the procedure for selecting the decision attribute. Finally, it points out development trends for decision trees.
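
For reference, entropy-based attribute selection is usually written as follows. The notation and the worked counts are illustrative (a commonly used textbook-style instance) and are not taken from the paper itself.

```latex
% Entropy of a sample set S whose k classes occur with proportions p_i
H(S) = -\sum_{i=1}^{k} p_i \log_2 p_i

% Information gain of attribute A, which partitions S into subsets S_v
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)

% Illustrative instance: S holds 9 positive and 5 negative samples
H(S) = -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.940

% A splits S into subsets with (2+, 3-), (4+, 0-), and (3+, 2-) samples
\mathrm{Gain}(S, A) \approx 0.940 - \left( \tfrac{5}{14}(0.971) + \tfrac{4}{14}(0) + \tfrac{5}{14}(0.971) \right) \approx 0.940 - 0.694 = 0.246
```

The attribute with the largest gain is chosen as the decision attribute at the current node.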


Decision tree algorithms, being accurate and comprehensible classifiers, have been among the most widely used classifiers in data mining and machine learning. However, like many other classification algorithms, decision tree algorithms focus on extracting patterns with high generality, and in the process they ignore some rare but useful and interesting patterns that may exist in small disjuncts of data. Such extraordinary patterns, with low support and high confidence, capture very specific but exceptional behavior present in the data. This paper proposes a novel Enhanced Decision Tree Algorithm for Discovering Intra and Inter-class Exceptions (EDTADE). Intra-class exceptions cover objects of unique interest within a class, whereas inter-class exceptions capture rare conditions that force us to shift the class of a few unusual objects. For instance, whales and bats are examples of intra-class exceptions, since they have unique characteristics within the class of mammals. Further, most birds are flying creatures, but rare birds such as the penguin and the ostrich fall into the category of non-flying birds. Here, the penguin and the ostrich are inter-class exceptions. In fact, without knowing about such exceptional patterns, our knowledge of a domain is incomplete. We have enhanced the decision tree algorithm by defining a framework for capturing intra and inter-class exceptions at the leaf nodes of a decision tree. The proposed algorithm (EDTADE) is applied to many datasets from the UCI Machine Learning Repository. The results show that EDTADE has been successful in discovering many intra and inter-class exceptions. Decision trees augmented with intra and inter-class exceptions are more accurate, more comprehensible, and more interesting, since they provide additional knowledge in the form of exceptional patterns that deviate from the general rules discovered for classification.
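
To make "low support and high confidence" concrete, the sketch below computes both measures for a rule over the records that reach one leaf and flags a rare but reliable rule as an exception candidate. It is not the EDTADE procedure, which the abstract does not spell out. The thresholds, the function names, and the bird records are all hypothetical.

```python
def rule_stats(rows, condition, target_class, class_key="class"):
    """Return (support, confidence) of the rule `condition -> target_class`.

    support    = share of all rows that match the condition AND the class
    confidence = share of rows matching the condition that have the class
    """
    matching = [r for r in rows if all(r.get(k) == v for k, v in condition.items())]
    hits = sum(r[class_key] == target_class for r in matching)
    support = hits / len(rows)
    confidence = hits / len(matching) if matching else 0.0
    return support, confidence

def is_exception(support, confidence, max_support=0.2, min_confidence=0.9):
    """Flag rules that are rare (low support) yet reliable (high confidence).

    The default thresholds are arbitrary placeholders for illustration.
    """
    return 0 < support <= max_support and confidence >= min_confidence

# Hypothetical records reaching a leaf whose general rule is "birds fly".
leaf_rows = [{"habitat": "forest", "class": "flying"} for _ in range(9)]
leaf_rows.append({"habitat": "antarctic", "class": "non-flying"})

general = rule_stats(leaf_rows, {"habitat": "forest"}, "flying")
rare = rule_stats(leaf_rows, {"habitat": "antarctic"}, "non-flying")
```

Here the general rule covers most of the leaf and is not flagged, while the rare rule covers a single record with full confidence and is, which mirrors the penguin and ostrich case in the abstract.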


2014 ◽  
Vol 538 ◽  
pp. 460-464
Author(s):  
Xue Li

Drawing on the inter-correlation and permeability among disciplines, the author applies information science to cognitive linguistics to provide a new perspective for the study of foreign languages. Using data mining, the paper analyzes the correlation between self-efficacy and four factors (anxiety, learning strategies, motivation, and learners’ past achievement) and explores the extent to which these factors affect self-efficacy in language learning. The analysis uses the C5.0 decision tree algorithm in SPSS Clementine. The results indicate that increased anxiety is bound to weaken learners’ motivation over time, and such learners evidently have low self-efficacy. Employing strategies is very important in foreign language learning: neglecting learning strategies may result in unplanned learning with unsatisfactory achievement despite greater effort, and self-efficacy in foreign language learning may be weakened accordingly. Learners’ past achievement is a reference dimension in measuring self-efficacy, with weaker influence.


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Win-Tsung Lo ◽  
Yue-Shan Chang ◽  
Ruey-Kai Sheu ◽  
Chun-Chieh Chiu ◽  
Shyan-Ming Yuan

Decision trees are among the best-known classification methods in data mining. Much research has focused on improving decision tree performance. However, those algorithms were developed for and run on traditional distributed systems, so without new technology their latency cannot be improved when processing the huge volumes of data generated by ubiquitous sensing nodes. To reduce data processing latency in large-scale data mining, this paper designs and implements a new parallelized decision tree algorithm (CUDT) on CUDA (Compute Unified Device Architecture), a GPGPU solution provided by NVIDIA. In the proposed system, the CPU is responsible for flow control while the GPU is responsible for computation. We conducted many experiments to evaluate the performance of CUDT and compared it with a traditional CPU version. The results show that CUDT is 5 to 55 times faster than Weka-j48 and achieves an 18-times speedup over SPRINT on large data sets.

