Research on the Construction of Big Data Classification System Based on Distributed Data Flow

2021 ◽  
Vol 2136 (1) ◽  
pp. 012057
Author(s):  
Han Zhou

Abstract With the widespread adoption of network services and database systems, enterprises and individuals use more and more data. Existing technology struggles to meet the analysis requirements of the big data era, so new technologies and methods for making reasonable use of big data must continue to be explored in practice. Starting from an understanding of current big data technology and the operating status of its systems, this paper designs algorithms according to a big data classification model and verifies the effectiveness of the resulting analysis model in practice.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Surendran Rajendran ◽  
Osamah Ibrahim Khalaf ◽  
Youseef Alotaibi ◽  
Saleh Alghamdi

Abstract In recent times, big data classification has become a hot research topic in various domains, such as healthcare, e-commerce, finance, etc. The inclusion of the feature selection process helps to improve the big data classification process and can be done by the use of metaheuristic optimization algorithms. This study focuses on the design of a big data classification model using chaotic pigeon inspired optimization (CPIO)-based feature selection with an optimal deep belief network (DBN) model. The proposed model is executed in the Hadoop MapReduce environment to manage big data. Initially, the CPIO algorithm is applied to select a useful subset of features. In addition, the Harris hawks optimization (HHO)-based DBN model is derived as a classifier to allocate appropriate class labels. The design of the HHO algorithm to tune the hyperparameters of the DBN model assists in boosting the classification performance. To examine the superiority of the presented technique, a series of simulations were performed, and the results were inspected under various dimensions. The resultant values highlighted the supremacy of the presented technique over the recent techniques.
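The abstract states that the model runs in a Hadoop MapReduce environment but gives no job details. As a rough illustration of the map, shuffle, and reduce pattern only, the following pure-Python sketch computes per-class feature means. It does not use the Hadoop API and does not implement CPIO, HHO, or the DBN; the function names `map_record`, `reduce_class`, and `run_job` and the record layout are assumptions made for this example.

```python
from collections import defaultdict


def map_record(record):
    """Map step: emit (class label, (feature vector, count of 1))."""
    features, label = record
    yield label, (list(features), 1)


def reduce_class(label, values):
    """Reduce step: sum vectors and counts for one label, then average."""
    total = [0.0] * len(values[0][0])
    count = 0
    for features, n in values:
        total = [t + f for t, f in zip(total, features)]
        count += n
    return label, [t / count for t in total]


def run_job(records):
    """Single-process stand-in for a MapReduce job (hypothetical helper)."""
    groups = defaultdict(list)  # the "shuffle": group mapped values by key
    for record in records:
        for key, value in map_record(record):
            groups[key].append(value)
    return dict(reduce_class(key, values) for key, values in groups.items())


if __name__ == "__main__":
    data = [([1.0, 2.0], "a"), ([3.0, 4.0], "a"), ([10.0, 0.0], "b")]
    print(run_job(data))
```

In a real cluster the grouping in `run_job` would be performed by the framework across nodes; here it is a dictionary, which keeps the sketch runnable on one machine.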


The typical Internet of Things (IoT) device gathers a huge amount of data within what is termed a big data framework, which transfers the collected data from the sensing layer to the information processing layer. Various big data classification methods have been adopted in industrial applications and smart cities, but accurately classifying the data in an IoT network remains a challenging task for the research community. Therefore, an effective big data classification model using a Spark-based architecture is proposed in this research. The big data classification is performed at the master node using the proposed Fractional Artificial Bee Colony-Chaotic Fruitfly Rider Optimization Algorithm (FABC-CFFRideNN). The concept of fictional computing is adopted by the rider optimization algorithm (ROA) to update the position of rider groups based on success rate, and the foraging behavior of fruit flies, along with the rider parameters, is used to enhance the performance of data classification with the proposed CFFRideNN classifier. Moreover, the proposed algorithm attained better performance on the metrics of accuracy, specificity, and sensitivity, with values of 95.382%, 95.81%, and 98.824%, respectively, for training percentage without node velocity.
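The three metrics reported above have standard definitions for a binary classifier. The sketch below shows how they are computed from confusion-matrix counts; the helper names `confusion_counts` and `binary_metrics` and the toy labels are assumptions for illustration, not the authors' evaluation code or data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (recall) and specificity from the counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that were rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(*confusion_counts(y_true, y_pred)))
```

For multi-class problems these quantities are usually computed per class (one class versus the rest) and then averaged; the abstract does not say which averaging was used.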


2019 ◽  
Vol 8 (4) ◽  
pp. 5370-5375

The growing use of Internet applications makes storing a massive volume of high-velocity data from different fields a challenging task. This has led to the evolution of big data, characterized by Volume, Velocity, and Variety (the 3V's). Extracting information from such voluminous data is a very complex task that is not possible with classical data mining techniques. Therefore, a big data mining technique is introduced by modifying the traditional data mining scheme with a novel neuro-fuzzy logic based approach named NFDDC. The proposed distributed data classification model works in three stages: first, reducing the data set dimension; second, data clustering; and third, data classification using the neuro-fuzzy method. The performance of the NFDDC system is analysed using two different datasets, namely medical data and e-commerce data. Additionally, a comparative analysis measures the accuracy of the existing CCSA algorithm against the proposed NFDDC algorithm, which reaches 90% accuracy in data classification.


2021 ◽  
pp. 1-12
Author(s):  
Li Qian

In order to overcome the low classification accuracy of traditional methods, this paper proposes a new classification method for complex attribute big data based on an iterative fuzzy clustering algorithm. First, principal component analysis and kernel local Fisher discriminant analysis are used to reduce the dimensionality of the complex attribute big data. Second, the Bloom Filter data structure is introduced to eliminate redundancy in the data after dimensionality reduction. Third, the de-duplicated complex attribute big data is classified in parallel by the iterative fuzzy clustering algorithm, which completes the classification. Finally, the simulation results show that the accuracy, the normalized mutual information index and the Richter’s index of the proposed method are close to 1 and that the RDV value is low, which indicates that the proposed method has high classification effectiveness and fast convergence speed.
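The abstract names a Bloom filter as the structure used to remove redundant records but does not describe its configuration. The following is a minimal sketch of that idea, assuming records are hashable tuples and using SHA-256 with a seed byte as the hash family; the class `BloomFilter`, the helper `drop_redundant`, and the default sizes are assumptions for this example, not the paper's implementation.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array probed by several seeded hash functions."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes  # must stay below 256 (one seed byte)
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def drop_redundant(records, num_bits=1 << 16, num_hashes=3):
    """Keep the first occurrence of each record, in input order."""
    seen = BloomFilter(num_bits, num_hashes)
    unique = []
    for record in records:
        if record in seen:
            continue  # duplicate, or (rarely) a false positive
        seen.add(record)
        unique.append(record)
    return unique


if __name__ == "__main__":
    rows = [(1, 2), (3, 4), (1, 2), (5, 6), (3, 4)]
    print(drop_redundant(rows))
```

A Bloom filter never misses a record it has already stored, but it can occasionally report an unseen record as seen, so a filter that is too small for the data will discard some distinct records. Sizing `num_bits` and `num_hashes` for the expected record count controls that rate.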

