Big Data Classification Using Belief Decision Trees: Application to Intrusion Detection

Big Data classification has recently received a great deal of attention because of the main properties of Big Data: volume, variety, and velocity. The furthest-pair-based binary search tree (FPBST) shows great potential for Big Data classification. This work attempts to improve the performance of the FPBST in terms of computation time, space consumption, and accuracy. The major enhancement converts the resulting BST into a decision tree, which removes the need for the slow K-nearest neighbors (KNN) step and yields a smaller tree; this reduces memory usage, speeds up both the training and testing phases, and increases classification accuracy. The proposed decision trees calculate the probability of each class at each node using various methods; the testing phase then uses these probabilities to classify an unseen example. Experimental results on small, intermediate, and big machine learning datasets show the efficiency of the proposed methods relative to the FPBST in terms of space, speed, and accuracy, which suggests great potential for further enhancement of the proposed methods for practical use.
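The idea of a pivot-based binary tree that stores class probabilities at each node can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the greedy two-pass furthest-pair heuristic, the use of plain class frequencies as probabilities, the `min_size` stopping rule, and all names (`Node`, `approx_furthest_pair`, `build`, `classify`) are assumptions made for the example.

```python
import math
from collections import Counter


class Node:
    """One tree node: class probabilities plus, for internal nodes, two pivots."""

    def __init__(self, probs):
        self.probs = probs      # class label -> relative frequency at this node
        self.p1 = None          # pivot for the left child
        self.p2 = None          # pivot for the right child
        self.left = None
        self.right = None


def approx_furthest_pair(points):
    """Greedy two-pass heuristic (assumed here; not an exact furthest pair)."""
    p1 = max(points, key=lambda p: math.dist(points[0], p))
    p2 = max(points, key=lambda p: math.dist(p1, p))
    return p1, p2


def build(points, labels, min_size=2):
    """Recursively split the data around two distant pivots."""
    counts = Counter(labels)
    n = len(labels)
    node = Node({c: k / n for c, k in counts.items()})
    # Stop on pure or small nodes (an assumed stopping rule).
    if len(counts) == 1 or n <= min_size:
        return node
    p1, p2 = approx_furthest_pair(points)
    if p1 == p2:                # all points coincide; cannot split further
        return node
    left, right = [], []
    for x, c in zip(points, labels):
        side = left if math.dist(x, p1) <= math.dist(x, p2) else right
        side.append((x, c))
    node.p1, node.p2 = p1, p2
    node.left = build([x for x, _ in left], [c for _, c in left], min_size)
    node.right = build([x for x, _ in right], [c for _, c in right], min_size)
    return node


def classify(node, x):
    """Descend to a leaf by nearest pivot and return its most probable class."""
    while node.left is not None:
        nearer_p1 = math.dist(x, node.p1) <= math.dist(x, node.p2)
        node = node.left if nearer_p1 else node.right
    return max(node.probs, key=node.probs.get)
```

Calling `build` on a list of coordinate tuples and their labels returns the root node, and `classify(root, x)` walks the tree without any nearest-neighbor search at the leaf, which is the property the abstract highlights.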

Big data Classification based on Distributed Fuzzy Decision Trees

SSRN Electronic Journal ◽ 10.2139/ssrn.3576492 ◽ 2019

Author(s): Sunu Fathima T H ◽ Binsu C Kovoor ◽ Jaseena K U

Keyword(s): Big Data ◽ Decision Trees ◽ Data Classification ◽ Fuzzy Decision ◽ Big Data Classification ◽ Fuzzy Decision Trees

On the Usage of the Probability Integral Transform to Reduce the Complexity of Multi-Way Fuzzy Decision Trees in Big Data Classification Problems

2018 IEEE International Congress on Big Data (BigData Congress) ◽ 10.1109/bigdatacongress.2018.00011 ◽ 2018 ◽ Cited By ~ 2

Author(s): Mikel Elkano ◽ Mikel Uriz ◽ Humberto Bustince ◽ Mikel Galar

Keyword(s): Big Data ◽ Decision Trees ◽ Integral Transform ◽ Data Classification ◽ Fuzzy Decision ◽ Classification Problems ◽ Probability Integral Transform ◽ Probability Integral ◽ Big Data Classification ◽ Fuzzy Decision Trees

CNB-MRF: Adapting Correlative Naive Bayes Classifier and MapReduce Framework for Big Data Classification

International Review on Computers and Software (IRECOS) ◽ 10.15866/irecos.v11i11.10116 ◽ 2016 ◽ Vol 11 (11) ◽ pp. 1007 ◽ Cited By ~ 3

Author(s): Chitrakant Banchhor ◽ N. Srinivasu

Keyword(s): Big Data ◽ Naive Bayes ◽ Data Classification ◽ Naïve Bayes ◽ Naive Bayes Classifier ◽ Bayes Classifier ◽ Naïve Bayes Classifier ◽ Mapreduce Framework ◽ Big Data Classification

Folk Song Computer Big Data Classification and Analysis Research Based on National Style Characteristics

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1744/3/032117 ◽ 2021 ◽ Vol 1744 (3) ◽ pp. 032117

Author(s): Jin Yang

Keyword(s): Big Data ◽ Data Classification ◽ Folk Song ◽ Big Data Classification ◽ National Style

Application research of Sports Place System based on big data classification technology

2020 International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE) ◽ 10.1109/icbase51474.2020.00020 ◽ 2020

Author(s): Liu Hu ◽ Qingxuan Zeng

Keyword(s): Big Data ◽ Data Classification ◽ Application Research ◽ Big Data Classification

Research on complex attribute big data classification based on iterative fuzzy clustering algorithm

Web Intelligence ◽ 10.3233/web-210463 ◽ 2021 ◽ pp. 1-12

Author(s): Li Qian

Keyword(s): Big Data ◽ Fuzzy Clustering ◽ Classification Accuracy ◽ Clustering Algorithm ◽ Principal Component ◽ Data Classification ◽ Fisher Discriminant Analysis ◽ Fuzzy Clustering Algorithm ◽ Local Fisher Discriminant Analysis ◽ Big Data Classification

To overcome the low classification accuracy of traditional methods, this paper proposes a new classification method for complex attribute big data based on an iterative fuzzy clustering algorithm. First, principal component analysis and kernel local Fisher discriminant analysis are used to reduce the dimensionality of the complex attribute big data. Then, the Bloom Filter data structure is introduced to eliminate redundancy in the data after dimensionality reduction. Next, the complex attribute big data, with its redundancy eliminated, is classified in parallel by the iterative fuzzy clustering algorithm, which completes the classification. Finally, simulation results show that the accuracy, the normalized mutual information index, and the Richter’s index of the proposed method are close to 1, that the classification accuracy is high, and that the RDV value is low, which indicates that the proposed method has high classification effectiveness and fast convergence speed.
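The redundancy-elimination step mentioned above can be illustrated with a small Bloom filter. This is a hedged sketch of the general data structure, not the paper's procedure: the class and function names, the bit-array size, the number of hash functions, and the use of seeded SHA-256 digests as hash functions are all assumptions. A Bloom filter never misses an exact repeat, but it can report a false positive, so a distinct record may occasionally be dropped.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array probed by several seeded hash functions."""

    def __init__(self, n_bits=1 << 16, n_hashes=4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions from seeded SHA-256 digests of repr(item).
        data = repr(item).encode("utf-8")
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def deduplicate(records, n_bits=1 << 16, n_hashes=4):
    """Keep the first occurrence of each record the filter has not flagged."""
    seen = BloomFilter(n_bits, n_hashes)
    unique = []
    for record in records:
        if record in seen:
            continue            # exact repeat, or (rarely) a false positive
        seen.add(record)
        unique.append(record)
    return unique
```

In a pipeline like the one described, `deduplicate` would run on the reduced-dimension feature vectors (for example, tuples of floats) before clustering; the memory cost is fixed by `n_bits` regardless of how many records pass through.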
