Fast clustering algorithm of commodity association big data sparse network

The integration of Artificial Intelligence technology and school education had become a future trend, and became an important driving force for the development of education. With the advent of the era of big data, although the relationship between students’ learning status data was closer to nonlinear relationship, combined with the application analysis of artificial intelligence technology, it could be found that students’ living habits were closely related to their academic performance. In this paper, through the investigation and analysis of the living habits and learning conditions of more than 2000 students in the past 10 grades in Information College of Institute of Disaster Prevention, we used the hierarchical clustering algorithm to classify the nearly 180000 records collected, and used the big data visualization technology of Echarts + iView + GIS and the JavaScript development method to dynamically display the students’ life track and learning information based on the map, then apply Three Dimensional ArcGIS for JS API technology showed the network infrastructure of the campus. Finally, a training model was established based on the historical learning achievements, life trajectory, graduates’ salary, school infrastructure and other information combined with the artificial intelligence Back Propagation neural network algorithm. Through the analysis of the training resulted, it was found that the students’ academic performance was related to the reasonable laboratory study time, dormitory stay time, physical exercise time and social entertainment time. Finally, the system could intelligently predict students’ academic performance and give reasonable suggestions according to the established prediction model. The realization of this project could provide technical support for university educators.

Download Full-text

Improved K-Means Clustering Algorithm for Big Data Mining under Hadoop Parallel Framework

Journal of Grid Computing ◽

10.1007/s10723-019-09503-0 ◽

2019 ◽

Vol 18 (2) ◽

pp. 239-250 ◽

Cited By ~ 3

Author(s):

Weijia Lu

Keyword(s):

Data Mining ◽

Big Data ◽

Clustering Algorithm ◽

Big Data Mining

Download Full-text

Research on complex attribute big data classification based on iterative fuzzy clustering algorithm

Web Intelligence ◽

10.3233/web-210463 ◽

2021 ◽

pp. 1-12

Author(s):

Li Qian

Keyword(s):

Big Data ◽

Fuzzy Clustering ◽

Classification Accuracy ◽

Clustering Algorithm ◽

Principal Component ◽

Data Classification ◽

Fisher Discriminant Analysis ◽

Fuzzy Clustering Algorithm ◽

Local Fisher Discriminant Analysis ◽

Big Data Classification

In order to overcome the low classification accuracy of traditional methods, this paper proposes a new classification method of complex attribute big data based on iterative fuzzy clustering algorithm. Firstly, principal component analysis and kernel local Fisher discriminant analysis were used to reduce dimensionality of complex attribute big data. Then, the Bloom Filter data structure is introduced to eliminate the redundancy of the complex attribute big data after dimensionality reduction. Secondly, the redundant complex attribute big data is classified in parallel by iterative fuzzy clustering algorithm, so as to complete the complex attribute big data classification. Finally, the simulation results show that the accuracy, the normalized mutual information index and the Richter’s index of the proposed method are close to 1, the classification accuracy is high, and the RDV value is low, which indicates that the proposed method has high classification effectiveness and fast convergence speed.

Download Full-text

Research on Particle Swarm Optimization Clustering Algorithm for Big Data Based on Cloud Storage Environment

10.1145/3482632.3484002 ◽

2021 ◽

Author(s):

Dan Liu

Keyword(s):

Big Data ◽

Particle Swarm Optimization ◽

Cloud Storage ◽

Clustering Algorithm ◽

Particle Swarm ◽

Swarm Optimization ◽

Cloud Storage Environment

Download Full-text

Efficient and Privacy-Preserving Multi-User Outsourced K-Means Clustering

Computer and Information Science ◽

10.5539/cis.v14n2p26 ◽

2021 ◽

Vol 14 (2) ◽

pp. 26

Author(s):

Na Li ◽

Lianguan Huang ◽

Yanling Li ◽

Meng Sun

Keyword(s):

Data Mining ◽

Big Data ◽

Clustering Algorithm ◽

Privacy Preserving ◽

Locality Sensitive Hashing ◽

Sensitive Information ◽

The Public ◽

Big Data Mining ◽

Euclidean Distances ◽

Computational Resources

In recent years, with the development of the Internet, the data on the network presents an outbreak trend. Big data mining aims at obtaining useful information through data processing, such as clustering, clarifying and so on. Clustering is an important branch of big data mining and it is popular because of its simplicity. A new trend for clients who lack of storage and computational resources is to outsource the data and clustering task to the public cloud platforms. However, as datasets used for clustering may contain some sensitive information (e.g., identity information, health information), simply outsourcing them to the cloud platforms can't protect the privacy. So clients tend to encrypt their databases before uploading to the cloud for clustering. In this paper, we focus on privacy protection and efficiency promotion with respect to k-means clustering, and we propose a new privacy-preserving multi-user outsourced k-means clustering algorithm which is based on locality sensitive hashing (LSH). In this algorithm, we use a Paillier cryptosystem encrypting databases, and combine LSH to prune off some unnecessary computations during the clustering. That is, we don't need to compute the Euclidean distances between each data record and each clustering center. Finally, the theoretical and experimental results show that our algorithm is more efficient than most existing privacy-preserving k-means clustering.

Download Full-text