Research on online evaluation method of MOOC teaching quality based on decision tree-based big data classification

Big Data classification has recently received a great deal of attention because of the defining properties of Big Data: volume, variety, and velocity. The furthest-pair-based binary search tree (FPBST) shows great potential for Big Data classification. This work attempts to improve the performance of the FPBST in terms of computation time, space consumed, and accuracy. The major enhancement converts the resultant BST into a decision tree, which removes the need for the slow K-nearest neighbors (KNN) step and yields a smaller tree; this reduces memory usage, speeds up both the training and testing phases, and increases classification accuracy. The proposed decision trees calculate the probabilities of each class at each node using various methods; the testing phase then uses these probabilities to classify an unseen example. Experimental results on small, intermediate, and big machine learning datasets show that the proposed methods are more efficient than the FPBST in terms of space, speed, and accuracy, which suggests great potential for further enhancing the proposed methods for use in practice.
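
The idea of storing class probabilities at each node of a furthest-pair tree can be illustrated with a minimal Python sketch. It is not the authors' implementation: the greedy furthest-pair heuristic, the relative-frequency probability estimate, and the names `furthest_pair`, `build`, and `predict` are assumptions made for illustration only.

```python
import math
from collections import Counter


def furthest_pair(points):
    """Assumed greedy approximation: hop from the first point to its furthest
    point p1, then from p1 to its furthest point p2."""
    p1 = max(points, key=lambda p: math.dist(points[0], p))
    p2 = max(points, key=lambda p: math.dist(p1, p))
    return p1, p2


def class_probs(labels):
    """Relative class frequencies (one assumed way to estimate probabilities)."""
    counts = Counter(labels)
    return {c: k / len(labels) for c, k in counts.items()}


def build(points, labels, min_size=2):
    """Recursively split the points around an approximate furthest pair.
    Every node stores the class probabilities of the examples it holds."""
    node = {"probs": class_probs(labels)}
    if len(points) <= min_size or len(set(labels)) == 1:
        return node  # small or pure node: stop splitting
    p1, p2 = furthest_pair(points)
    if p1 == p2:
        return node  # all points coincide, nothing to split
    left = [i for i, p in enumerate(points)
            if math.dist(p, p1) <= math.dist(p, p2)]
    left_set = set(left)
    right = [i for i in range(len(points)) if i not in left_set]
    node["p1"], node["p2"] = p1, p2
    node["left"] = build([points[i] for i in left],
                         [labels[i] for i in left], min_size)
    node["right"] = build([points[i] for i in right],
                          [labels[i] for i in right], min_size)
    return node


def predict(node, x):
    """Descend toward the closer pivot, then return the most probable class
    at the leaf. No nearest-neighbor search is performed."""
    while "p1" in node:
        closer_to_p1 = math.dist(x, node["p1"]) <= math.dist(x, node["p2"])
        node = node["left"] if closer_to_p1 else node["right"]
    return max(node["probs"], key=node["probs"].get)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["a", "a", "a", "b", "b", "b"]
tree = build(points, labels)
```

Classification in this sketch only follows pivot comparisons down the tree and reads the stored probabilities, which is the property the abstract credits for removing the KNN step.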

A Novel Hybrid Technique for Big Data Classification Using Decision Tree Learning

Communications in Computer and Information Science - Computational Intelligence, Communications, and Business Analytics ◽

10.1007/978-981-10-6427-2_10 ◽

2017 ◽

pp. 118-128 ◽

Cited By ~ 1

Author(s):

Khyati Ahlawat ◽

Amit Prakash Singh

Keyword(s):

Big Data ◽

Decision Tree ◽

Data Classification ◽

Hybrid Technique ◽

Decision Tree Learning ◽

Big Data Classification

LDT-MRF: Log decision tree and map reduce framework to clinical big data classification

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.5.9129 ◽

2017 ◽

Vol 7 (1.5) ◽

pp. 97

Author(s):

T. Surekha ◽

R. Siva Rama Prasad

Keyword(s):

Big Data ◽

Decision Tree ◽

Performance Metrics ◽

Data Classification ◽

Map Reduce ◽

Breast Cancer Dataset ◽

The Novel ◽

Novel Method ◽

Big Data Classification ◽

Sensitivity Specificity

Data is growing enormously as information technology develops, and classifying it is complex in terms of both time and information extraction. Moreover, big data classification involves uncertainties associated with unbalanced datasets. To overcome these issues, this paper introduces a novel method of big data classification. The method, Log Decision Tree and Map Reduce Framework (LDT-MRF), uses the Log Decision Tree (LDT) and the Map Reduce Framework (MRF) to perform data classification in parallel. A novel parameter termed Log-entropy is used to select the best feature attribute for classification, and the classification itself is performed by the LDT. Experimentation is carried out using three datasets: the Cleveland dataset, the Switzerland dataset, and the Breast Cancer dataset. The comparative analysis uses sensitivity, specificity, and accuracy as performance metrics to demonstrate the effectiveness of the proposed method. The sensitivity, specificity, and accuracy of the proposed method are 84.7596%, 74.633%, and 80.9088% respectively, which are higher than those of the existing methods of big data classification.
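
The three metrics named in the abstract have standard definitions for a binary classifier, which the following Python sketch computes from true and predicted labels. The function name `binary_metrics` and the toy label vectors are hypothetical; the abstract does not say how the authors computed their figures, and the values below are unrelated to the reported results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, and accuracy from paired labels.

    sensitivity = TP / (TP + FN)
    specificity = TN / (TN + FP)
    accuracy    = (TP + TN) / total
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        # Guard against empty classes to avoid division by zero.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters for the unbalanced datasets the abstract mentions, since accuracy alone can look high when one class dominates.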

Optimized Decision tree rules using divergence based grey wolf optimization for big data classification in health care

Evolutionary Intelligence ◽

10.1007/s12065-019-00267-w ◽

2019 ◽

Cited By ~ 2

Author(s):

Pravin S. Game ◽

Vinod Vaze ◽

M. Emmanuel

Keyword(s):

Health Care ◽

Big Data ◽

Decision Tree ◽

Data Classification ◽

Grey Wolf ◽

Grey Wolf Optimization ◽

Big Data Classification

CNB-MRF: Adapting Correlative Naive Bayes Classifier and MapReduce Framework for Big Data Classification

International Review on Computers and Software (IRECOS) ◽

10.15866/irecos.v11i11.10116 ◽

2016 ◽

Vol 11 (11) ◽

pp. 1007 ◽

Cited By ~ 3

Author(s):

Chitrakant Banchhor ◽

N. Srinivasu

Keyword(s):

Big Data ◽

Naive Bayes ◽

Data Classification ◽

Naïve Bayes ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Mapreduce Framework ◽

Big Data Classification

Folk Song Computer Big Data Classification and Analysis Research Based on National Style Characteristics

Journal of Physics Conference Series ◽

10.1088/1742-6596/1744/3/032117 ◽

2021 ◽

Vol 1744 (3) ◽

pp. 032117

Author(s):

Jin Yang

Keyword(s):

Big Data ◽

Data Classification ◽

Folk Song ◽

Big Data Classification ◽

National Style

Application research of Sports Place System based on big data classification technology

2020 International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE) ◽

10.1109/icbase51474.2020.00020 ◽

2020 ◽

Author(s):

Liu Hu ◽

Qingxuan Zeng

Keyword(s):

Big Data ◽

Data Classification ◽

Application Research ◽

Big Data Classification

Research on complex attribute big data classification based on iterative fuzzy clustering algorithm

Web Intelligence ◽

10.3233/web-210463 ◽

2021 ◽

pp. 1-12

Author(s):

Li Qian

Keyword(s):

Big Data ◽

Fuzzy Clustering ◽

Classification Accuracy ◽

Clustering Algorithm ◽

Principal Component ◽

Data Classification ◽

Fisher Discriminant Analysis ◽

Fuzzy Clustering Algorithm ◽

Local Fisher Discriminant Analysis ◽

Big Data Classification

To overcome the low classification accuracy of traditional methods, this paper proposes a new classification method for complex attribute big data based on an iterative fuzzy clustering algorithm. First, principal component analysis and kernel local Fisher discriminant analysis are used to reduce the dimensionality of the complex attribute big data. Then, the Bloom Filter data structure is introduced to eliminate redundancy in the data after dimensionality reduction. Next, the de-duplicated complex attribute big data is classified in parallel by the iterative fuzzy clustering algorithm, which completes the classification. Finally, the simulation results show that the accuracy, the normalized mutual information index, and the Richter’s index of the proposed method are close to 1, and that the RDV value is low, which indicates that the proposed method has high classification effectiveness and fast convergence speed.
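
The redundancy-elimination step can be pictured with a small Bloom filter in Python. This is a sketch under stated assumptions, not the paper's implementation: the class `BloomFilter`, the helper `drop_duplicates`, the SHA-256-based hashing, and the filter size are all hypothetical choices. Because a Bloom filter can report false positives, a record that was never seen may occasionally be dropped.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; uses one byte per bit for readability."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)

    def _positions(self, item):
        # Derive n_hashes positions by salting SHA-256 with a seed byte.
        data = repr(item).encode("utf-8")
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(self.bits[pos] for pos in self._positions(item))


def drop_duplicates(records, n_bits=1024, n_hashes=3):
    """Keep the first occurrence of each record, in input order."""
    seen = BloomFilter(n_bits, n_hashes)
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


# Toy records standing in for dimensionality-reduced feature vectors.
records = [(1.0, 2.0), (3.0, 4.0), (1.0, 2.0), (5.0, 6.0), (3.0, 4.0)]
deduplicated = drop_duplicates(records)
```

The appeal of this structure for large datasets is that membership is tracked in a fixed amount of memory regardless of how many records pass through, at the cost of the small false-positive rate noted above.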
