2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Win-Tsung Lo ◽  
Yue-Shan Chang ◽  
Ruey-Kai Sheu ◽  
Chun-Chieh Chiu ◽  
Shyan-Ming Yuan

The decision tree is one of the best-known classification methods in data mining. Much research has focused on improving decision tree performance, but the resulting algorithms were developed for and run on traditional distributed systems. Without new technology, their latency cannot be improved when processing the huge volumes of data generated by ubiquitous sensing nodes. To reduce data processing latency in large-scale data mining, in this paper we design and implement a new parallelized decision tree algorithm on CUDA (Compute Unified Device Architecture), a GPGPU solution provided by NVIDIA. In the proposed system, the CPU is responsible for flow control while the GPU is responsible for computation. We conducted many experiments to evaluate the performance of the proposed system, CUDT, and compared it with traditional CPU versions. The results show that CUDT is 5 to 55 times faster than Weka-j48 and achieves an 18-fold speedup over SPRINT on large data sets.
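As an illustration of the kind of computation that dominates decision tree training, the following minimal sketch finds the best binary threshold on one numeric attribute by sorting it and scanning with running class counts. It is written in plain Python for readability and is not the paper's GPU implementation; the function names `gini` and `best_numeric_split` are hypothetical, and the use of the Gini index as the split criterion is an assumption made here for the example.

```python
from collections import Counter

def gini(counts, total):
    """Gini impurity of a class-count mapping; 0.0 for an empty partition."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def best_numeric_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `x <= threshold`.

    Hypothetical helper: sorts one attribute, then scans it once while
    keeping running (prefix) class counts for the left partition.
    Returns (None, inf) when all values are equal and no split exists.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    right = Counter(labels)   # class counts still to the right of the cut
    left = Counter()          # class counts already to the left of the cut
    best = (None, float("inf"))
    for i in range(n - 1):
        value, label = pairs[i]
        left[label] += 1
        right[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # cannot place a threshold between equal values
        n_left = i + 1
        n_right = n - n_left
        score = (n_left * gini(left, n_left) + n_right * gini(right, n_right)) / n
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best
```

Each attribute can be scored independently of the others, and the running counts are prefix sums over the sorted list, which is why this step is a natural candidate for data-parallel hardware.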


Complexity ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Jincai Yang ◽  
Huichao Gu ◽  
Xingpeng Jiang ◽  
Qingyang Huang ◽  
Xiaohua Hu ◽  
...  

In the past 20 years, much progress has been made in the genetic analysis of osteoporosis. A number of genes and SNPs associated with osteoporosis have been found through GWAS. In this paper, we aim to identify suspected risk SNPs for osteoporosis with computational methods based on the known osteoporosis GWAS-associated SNPs. The process includes two steps. First, we determine whether the genes associated with the suspected risk SNPs are associated with osteoporosis by running a random walk algorithm on the PPI network of osteoporosis GWAS-associated genes and the genes associated with the suspected risk SNPs. Second, we classify the SNPs with positive results by their position and function features, using a simplified classification decision tree constructed by the ID3 decision tree algorithm with PEP (Pessimistic-Error Pruning), which addresses the overfitting problem of ID3. We verified the accuracy of the identification framework on a data set of GWAS-associated SNPs, and the results show that the method is feasible. It provides a more convenient way to identify suspected risk SNPs associated with osteoporosis.
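The abstract does not say which random walk variant is used. The sketch below assumes a random walk with restart, a common choice for ranking genes by proximity to known disease genes on an interaction network. The function name, the `restart` value, and the toy graph format (a dict of neighbour sets) are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Approximate steady-state visiting probabilities on an undirected graph.

    adjacency: dict mapping node -> set of neighbour nodes (toy input format).
    seeds: nodes where the walker restarts, e.g. known disease-associated genes.
    restart: probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: return its probability mass to the seeds.
                for v in nodes:
                    nxt[v] += (1 - restart) * p[u] * p0[v]
                continue
            share = (1 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p
```

Candidate genes with a high score lie close to the seed genes in the network; where to place the cutoff between "associated" and "not associated" is a separate decision that this sketch does not make.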


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Zhu Gu ◽  
Chaohu He

Since the reform and opening up, the economy of our country has developed rapidly, and people's living conditions have steadily improved. As a result, people have more time to pay attention to their health, which has promoted the rapid development of the sports and fitness industry in our country. However, the current state of member management in the industry has not kept pace with this development. To address this, this article uses a fuzzy decision tree algorithm to build a decision tree from the characteristics of customer data and to analyze the loss of existing customers, which is of strategic significance for improving the competitiveness of a club. The article selects the 7 most commonly used data sets from the UCI repository, in three different formats, as the initial experimental data for model training. It then conducts experiments on the data of members of a specific club, using these data as training samples to construct a fuzzy decision tree for churn analysis and to identify the main factors behind customer churn. Experiments show that the fuzzy decision tree ID3 algorithm based on mobile computing has its highest accuracy on the Iris data set, reaching 97.8%, and its lowest accuracy on the Wine data set, only 65.2%. The mobile computing-based fuzzy decision tree ID3 algorithm proposed in this paper obtained the highest correct rate (86.32%). This shows that, compared with traditional analysis methods, the fuzzy decision tree obtained for customer churn analysis offers high classification accuracy and is easy to understand, so that good classification accuracy can be achieved even when the tree is small.
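For readers unfamiliar with fuzzy ID3, the sketch below shows one common way to compute the split criterion: each example contributes to the class distribution in proportion to its membership degree rather than as a whole count. This is a generic illustration, not the algorithm evaluated in the article; the function names, the input format, and the use of the product to combine memberships are assumptions.

```python
import math

def fuzzy_entropy(memberships, labels):
    """Entropy in bits where each example counts by its membership degree in [0, 1]."""
    total = sum(memberships)
    if total == 0:
        return 0.0
    mass = {}
    for m, y in zip(memberships, labels):
        mass[y] = mass.get(y, 0.0) + m
    return -sum((w / total) * math.log2(w / total) for w in mass.values() if w > 0)

def fuzzy_information_gain(node_memberships, labels, term_memberships):
    """Gain of splitting a node on one attribute described by fuzzy terms.

    term_memberships: dict term -> list of membership degrees, one per example
    (hypothetical format, e.g. {"low": [...], "high": [...]}).
    """
    base = fuzzy_entropy(node_memberships, labels)
    # Membership of each example in each branch: product t-norm (an assumption).
    branch = {
        term: [n * t for n, t in zip(node_memberships, degrees)]
        for term, degrees in term_memberships.items()
    }
    total = sum(sum(ms) for ms in branch.values())
    if total == 0:
        return 0.0
    weighted = sum(sum(ms) / total * fuzzy_entropy(ms, labels) for ms in branch.values())
    return base - weighted
```

With crisp memberships (all degrees 0 or 1) this reduces to the ordinary ID3 information gain, which is a useful sanity check when experimenting with membership functions.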


2012 ◽  
Vol 268-270 ◽  
pp. 1730-1734
Author(s):  
Cheng Hua Wang ◽  
Lin Zhou ◽  
Feng Jiang ◽  
Hong Bo Zhao

Decision tree algorithms have been widely used in intrusion detection. In this paper, within the framework of granular computing (GrC), we propose a new decision tree algorithm called DTGAE and apply it to intrusion detection. First, by virtue of the GrC model using information tables, we propose a new information entropy model, which contains two basic notions: approximation entropy of granule (AEG) and GrC-based approximation entropy (GAE), where the latter is defined based on the former. In algorithm DTGAE, GAE is adopted as the heuristic information for the selection of splitting attributes. When calculating AEG and GAE, we not only utilize the concept of conditional entropy in Shannon's information theory, but also use the concept of approximation accuracy in rough sets. Second, we present a method of decision tree pre-pruning based on Düntsch's knowledge entropy. Finally, the KDDCUP99 data set is used to verify the effectiveness of our algorithm in intrusion detection.
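The abstract names two standard ingredients, Shannon conditional entropy and rough-set approximation accuracy, without giving the formulas for AEG and GAE. The sketch below computes only those two textbook quantities on an information table and makes no attempt to reconstruct AEG or GAE. The function names and the row format (a list of dicts) are hypothetical.

```python
import math
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into granules (equivalence classes) by attribute values."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())

def conditional_entropy(rows, cond_attrs, decision):
    """Shannon conditional entropy H(decision | cond_attrs) in bits."""
    n = len(rows)
    h = 0.0
    for block in partition(rows, cond_attrs):
        counts = defaultdict(int)
        for i in block:
            counts[rows[i][decision]] += 1
        for c in counts.values():
            # p(block, class) * log2 p(class | block)
            h -= (c / n) * math.log2(c / len(block))
    return h

def approximation_accuracy(rows, cond_attrs, target):
    """Rough-set accuracy |lower(target)| / |upper(target)| for a set of row indices.

    Returns 1.0 when the upper approximation is empty (a convention assumed here).
    """
    lower, upper = set(), set()
    for block in partition(rows, cond_attrs):
        if block <= target:
            lower |= block   # granule lies entirely inside the target set
        if block & target:
            upper |= block   # granule overlaps the target set
    return len(lower) / len(upper) if upper else 1.0
```

An attribute set that yields low conditional entropy and high approximation accuracy separates the decision classes well, which is the intuition behind using such measures as heuristics for choosing splitting attributes.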

