Customized Replica Balancing Policy for the Apache Hadoop HDFS Balancer

2019 · Author(s): Rhauani W. Fazul, Patricia Pitthan Barcelos

Data replication is a fundamental mechanism of the Hadoop Distributed File System (HDFS). However, the way data is spread across the cluster directly affects the replication balance. The HDFS Balancer is a tool integrated with Hadoop that balances the storage load across machines by moving data between nodes, although its operation does not address the specific needs of applications while performing block rearrangement. This paper proposes a customized balancing policy for the HDFS Balancer based on a system of priorities, which can be adapted and configured according to usage demands. The priorities define whether HDFS parameters or the cluster topology should be considered during the operation, making the balancing process more flexible.
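The paper's priority system is not part of stock Hadoop, so the following is only a minimal Java sketch of how such a policy might be wired on top of Hadoop's real Configuration class; the property name dfs.balancer.custom.priority and the Priority enum are hypothetical, introduced purely for illustration.

```java
import org.apache.hadoop.conf.Configuration;

/**
 * Hypothetical sketch of a configurable balancing priority, loosely
 * following the idea described in the abstract. The property name
 * below is NOT a real HDFS setting; it only illustrates how a
 * priority could be read before block rearrangement starts.
 */
public class CustomBalancingPolicy {

    /** Which aspect the balancer should favour while moving blocks. */
    enum Priority { HDFS_PARAMETERS, CLUSTER_TOPOLOGY, NONE }

    private final Priority priority;

    CustomBalancingPolicy(Configuration conf) {
        // "dfs.balancer.custom.priority" is a made-up key used only for illustration.
        String value = conf.get("dfs.balancer.custom.priority", "none");
        this.priority = Priority.valueOf(value.toUpperCase().replace('-', '_'));
    }

    /** Decide whether rack/topology information should influence block moves. */
    boolean considerTopology() {
        return priority == Priority.CLUSTER_TOPOLOGY;
    }

    /** Decide whether replication-related HDFS parameters should influence block moves. */
    boolean considerHdfsParameters() {
        return priority == Priority.HDFS_PARAMETERS;
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("dfs.balancer.custom.priority", "cluster-topology");
        CustomBalancingPolicy policy = new CustomBalancingPolicy(conf);
        System.out.println("Use topology: " + policy.considerTopology());
    }
}
```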

Author(s): Mariam J. AlKandari, Huda F. Al Rasheedi, Ayed A. Salman

Cloud computing has been the trending model for storing, accessing, and modifying data over the Internet in recent years. The rising use of the cloud has given rise to a related concept: cloud forensics. Cloud forensics can be defined as the investigation of evidence in the cloud, so it can be viewed as a combination of cloud computing and digital forensics. Many issues of applying forensics in the cloud have been addressed. Isolating the location of an incident has become an essential part of the forensic process, as it ensures that evidence will not be modified. Isolating an instance in cloud computing is even more challenging due to the nature of the cloud environment: the same storage or virtual machine may be used by many users, so evidence is likely to be overwritten and lost. The solution proposed in this paper is to isolate a cloud instance by marking the instance residing on the servers as "Under Investigation". To do so, the cloud file system must be studied. One of the well-known file systems used in the cloud is the Apache Hadoop Distributed File System (HDFS). Thus, the methodology used in this paper for isolating a cloud instance is based on the HDFS architecture.
Keywords: cloud computing; digital forensics; cloud forensics
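The abstract does not give an implementation, but HDFS extended attributes are one plausible way to tag data as under investigation. The sketch below uses the real FileSystem.setXAttr/getXAttr API (available since Hadoop 2.5); the attribute name user.investigation and the path are assumptions for illustration only, not the authors' method.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Minimal sketch: tag an HDFS path as "Under Investigation" using
 * extended attributes. The attribute name and path are illustrative
 * assumptions; the xattr API itself is standard Hadoop.
 */
public class InvestigationMarker {

    private static final String XATTR = "user.investigation";

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            Path suspect = new Path("/data/tenant-42/instance-image"); // assumed path

            // Mark the instance so forensic tooling can recognise it and
            // higher-level processes can refuse to overwrite or reallocate it.
            fs.setXAttr(suspect, XATTR,
                    "Under Investigation".getBytes(StandardCharsets.UTF_8));

            // Later: check the marker before allowing any modification.
            byte[] value = fs.getXAttr(suspect, XATTR);
            System.out.println(new String(value, StandardCharsets.UTF_8));
        }
    }
}
```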


2010 · Vol. 30 (8) · pp. 2060-2065 · Author(s): Ning CAO, Zhong-hai WU, Hong-zhi LIU, Qi-xun ZHANG

2020 · Vol. 1444 · pp. 012012 · Author(s): Meisuchi Naisuty, Achmad Nizar Hidayanto, Nabila Clydea Harahap, Ahmad Rosyiq, Agus Suhanto, ...

2016 · pp. 1220-1243 · Author(s): Ilias K. Savvas, Georgia N. Sofianidou, M-Tahar Kechadi

Big data refers to data sets whose size is beyond the capabilities of most current hardware and software technologies. The Apache Hadoop software library is a framework for the distributed processing of large data sets: HDFS is a distributed file system that provides high-throughput access for data-driven applications, and MapReduce is a software framework for distributed computation over large data sets. Huge collections of raw data require fast and accurate mining processes in order to extract useful knowledge. One of the most popular data mining techniques is the K-means clustering algorithm. In this study, the authors develop a distributed version of the K-means algorithm using the MapReduce framework on the Hadoop Distributed File System. The theoretical and experimental results demonstrate the technique's efficiency; thus, HDFS and MapReduce can be applied to big data with very promising results.
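As a rough illustration of the approach (not the authors' code), here is a minimal single-iteration K-means sketch in the Hadoop MapReduce API: the mapper assigns each point to the nearest centroid read from the job configuration, and the reducer recomputes each centroid as the mean of its assigned points. The input format (lines of "x,y"), the 2-D dimensionality, and the configuration key kmeans.centroids are assumptions; a driver would rerun the job until the centroids converge.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * One K-means iteration as MapReduce (illustrative sketch).
 * Points are lines of "x,y"; current centroids are passed in the job
 * configuration under the made-up key "kmeans.centroids" as "x,y;x,y;...".
 */
public class KMeansIteration {

    public static class AssignMapper
            extends Mapper<LongWritable, Text, IntWritable, Text> {

        private double[][] centroids;

        @Override
        protected void setup(Context context) {
            String[] parts = context.getConfiguration()
                    .get("kmeans.centroids").split(";");
            centroids = new double[parts.length][2];
            for (int i = 0; i < parts.length; i++) {
                String[] xy = parts[i].split(",");
                centroids[i][0] = Double.parseDouble(xy[0]);
                centroids[i][1] = Double.parseDouble(xy[1]);
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] xy = value.toString().split(",");
            double x = Double.parseDouble(xy[0]);
            double y = Double.parseDouble(xy[1]);

            // Emit the point keyed by the index of its nearest centroid.
            int best = 0;
            double bestDist = Double.MAX_VALUE;
            for (int i = 0; i < centroids.length; i++) {
                double dx = x - centroids[i][0];
                double dy = y - centroids[i][1];
                double dist = dx * dx + dy * dy;
                if (dist < bestDist) {
                    bestDist = dist;
                    best = i;
                }
            }
            context.write(new IntWritable(best), value);
        }
    }

    public static class RecomputeReducer
            extends Reducer<IntWritable, Text, IntWritable, Text> {

        @Override
        protected void reduce(IntWritable cluster, Iterable<Text> points, Context context)
                throws IOException, InterruptedException {
            double sumX = 0, sumY = 0;
            long count = 0;
            for (Text point : points) {
                String[] xy = point.toString().split(",");
                sumX += Double.parseDouble(xy[0]);
                sumY += Double.parseDouble(xy[1]);
                count++;
            }
            // The new centroid is the mean of all points assigned to this cluster.
            context.write(cluster, new Text((sumX / count) + "," + (sumY / count)));
        }
    }
}
```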

