Using HDFS to Load, Search, and Retrieve Data from Local Data Nodes
Abstract: In a Hadoop environment, data can be loaded onto and searched from local data nodes. Because a dataset may be very large, loading and finding data with a query is often difficult. We propose a method for handling data on local nodes so that it does not overlap with data already acquired by the script. The main purpose of the query is to store information in a distributed environment and to search it quickly. In this work, we define a script that eliminates duplicate data when data is loaded and searched dynamically. In addition, the Hadoop file system operates in a distributed environment.
Keywords: HDFS; Hadoop distributed file system; replica; local; distributed; capacity; SQL; redundancy