Big Data Clustering Techniques: Recent Advances and Survey

Machine Learning and Data Mining for Emerging Trend in Cyber Dynamics ◽

10.1007/978-3-030-66288-2_3 ◽

2021 ◽

pp. 57-79

Author(s):

Hassan Ibrahim Hayatu ◽

Abdullahi Mohammed ◽

Ahmad Barroon Isma’eel

Keyword(s):

Big Data ◽

Data Clustering ◽

Clustering Techniques ◽

Recent Advances

National internal security policies across Europe – a comparative analysis applying big data clustering techniques

Political Research Exchange ◽

10.1080/2474736x.2020.1787796 ◽

2020 ◽

Vol 2 (1) ◽

pp. 1787796

Author(s):

Andreas Kattler ◽

Felix Ettensperger

Keyword(s):

Big Data ◽

Comparative Analysis ◽

Data Clustering ◽

Security Policies ◽

Internal Security ◽

Clustering Techniques

Peer Review #2 of "Big data clustering techniques based on Spark: a literature review (v0.1)"

10.7287/peerj-cs.321v0.1/reviews/2 ◽

2020 ◽

Keyword(s):

Big Data ◽

Literature Review ◽

Peer Review ◽

Data Clustering ◽

Clustering Techniques

Peer Review #1 of "Big data clustering techniques based on Spark: a literature review (v0.1)"

10.7287/peerj-cs.321v0.1/reviews/1 ◽

2020 ◽

Keyword(s):

Big Data ◽

Literature Review ◽

Peer Review ◽

Data Clustering ◽

Clustering Techniques

Content-aware data distribution over cluster nodes

Intelligent Data Analysis ◽

10.3233/ida-205360 ◽

2021 ◽

Vol 25 (4) ◽

pp. 907-927

Author(s):

Adam Krechowicz

Keyword(s):

Big Data ◽

Data Processing ◽

Data Clustering ◽

Data Distribution ◽

Distributed Environment ◽

Data Set ◽

Clustering Techniques ◽

Big Data Applications ◽

Paper Author ◽

Content Aware

Proper distribution of data items can substantially improve the performance of data processing in a distributed environment. However, typical data storage systems and distributed computational frameworks pay no special attention to this aspect. In this paper, the author introduces two custom data item addressing methods for distributed data storage, using the Scalable Distributed Two-Layer Datastore as an example. The basic idea of these methods is to ensure that data items stored on the same cluster node are similar to each other, following concepts from data clustering. Most data clustering mechanisms, however, scale poorly with data size, which is a severe limitation in Big Data applications. The proposed methods allow a data set to be distributed efficiently over a set of buckets. The experimental results show that all proposed methods produce good results efficiently in comparison with traditional clustering techniques such as k-means, agglomerative, and BIRCH clustering. Experiments in a distributed environment showed that proper data distribution can substantially improve the effectiveness of Big Data processing.
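
The general idea of keeping similar items on the same node can be illustrated with a minimal sketch. This is not the author's addressing method, which the abstract does not specify; it assumes a simple nearest-centroid rule, and the names `nearest_bucket`, `distribute`, and the example centroids are hypothetical.

```python
import math

def nearest_bucket(item, centroids):
    """Return the index of the centroid closest to item (Euclidean distance).

    Assumption: each bucket (cluster node) is represented by one centroid.
    """
    return min(range(len(centroids)), key=lambda i: math.dist(item, centroids[i]))

def distribute(items, centroids):
    """Place each item in the bucket whose centroid is nearest to it."""
    buckets = [[] for _ in centroids]
    for item in items:
        buckets[nearest_bucket(item, centroids)].append(item)
    return buckets

# Hypothetical example: two buckets, four two-dimensional items.
centroids = [(0.0, 0.0), (10.0, 10.0)]
items = [(1.0, 1.0), (9.0, 9.0), (0.0, 2.0), (11.0, 10.0)]
buckets = distribute(items, centroids)
```

Under this rule each item is addressed in time proportional to the number of buckets, without re-clustering the whole data set, which is one plausible way to sidestep the scalability problem the abstract mentions.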

Peer Review #3 of "Big data clustering techniques based on Spark: a literature review (v0.2)"

10.7287/peerj-cs.321v0.2/reviews/3 ◽

2020 ◽

Keyword(s):

Big Data ◽

Literature Review ◽

Peer Review ◽

Data Clustering ◽

Clustering Techniques

Peer Review #3 of "Big data clustering techniques based on Spark: a literature review (v0.1)"

10.7287/peerj-cs.321v0.1/reviews/3 ◽

2020 ◽

Keyword(s):

Big Data ◽

Literature Review ◽

Peer Review ◽

Data Clustering ◽

Clustering Techniques

Peer Review #1 of "Big data clustering techniques based on Spark: a literature review (v0.2)"

10.7287/peerj-cs.321v0.2/reviews/1 ◽

2020 ◽

Keyword(s):

Big Data ◽

Literature Review ◽

Peer Review ◽

Data Clustering ◽

Clustering Techniques

The Rising Role of Big Data Analytics and IoT in Disaster Management: Recent Advances, Taxonomy and Prospects

IEEE Access ◽

10.1109/access.2019.2913340 ◽

2019 ◽

Vol 7 ◽

pp. 54595-54614 ◽

Cited By ~ 11

Author(s):

Syed Attique Shah ◽

Dursun Zafer Seker ◽

Sufian Hameed ◽

Dirk Draheim

Keyword(s):

Big Data ◽

Disaster Management ◽

Data Analytics ◽

Big Data Analytics ◽

Recent Advances

Ensembled Adaptive Fuzzy K-Means With Stochastic Extreme Gradient Boost Big Data Clustering on Geo-Social Networks

2021 International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE) ◽

10.1109/icacite51222.2021.9404574 ◽

2021 ◽

Author(s):

M. Anoop ◽

P. Sripriya

Keyword(s):

Social Networks ◽

Big Data ◽

Data Clustering ◽

Adaptive Fuzzy

A survey of clustering techniques for big data analysis

2014 5th International Conference - Confluence The Next Generation Information Technology Summit (Confluence) ◽

10.1109/confluence.2014.6949256 ◽

2014 ◽

Cited By ~ 19

Author(s):

Saurabh Arora ◽

Inderveer Chana

Keyword(s):

Big Data ◽

Data Analysis ◽

Big Data Analysis ◽

Clustering Techniques
