Efficient data transfer protocols for big data

Network bandwidth is a scarce resource in big data environments, so data locality is a fundamental problem for data-parallel frameworks such as Hadoop and Spark. This problem is exacerbated in multicore-server-based clusters, where multiple tasks running on the same server compete for the server’s network bandwidth. Existing approaches address this problem by scheduling computational tasks near their input data while considering the server’s free time, data placement, and data transfer costs. However, such approaches usually assign identical values to data transfer costs, even though a multicore server’s data transfer cost increases with the number of data-remote tasks. As a result, they minimize data-processing time ineffectively. As a solution, we propose DynDL (Dynamic Data Locality), a novel data-locality-aware task-scheduling model that handles dynamic data transfer costs for multicore servers. DynDL offers greater flexibility than existing approaches by using a set of non-decreasing functions to evaluate dynamic data transfer costs. We also propose online and offline algorithms (based on DynDL) that minimize data-processing time and adaptively adjust data locality. Although DynDL is NP-complete (nondeterministic polynomial-time complete), we prove that the offline algorithm runs in quadratic time and generates optimal results for specific uses of DynDL. Using a series of simulations and real-world executions, we show that our algorithms achieve 30% better data-processing time than algorithms that do not consider dynamic data transfer costs. Moreover, they can adaptively adjust data locality based on the server’s free time, data placement, and network bandwidth, and can schedule tens of thousands of tasks in sub-second to second-scale time.
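The idea of a non-decreasing transfer cost can be illustrated with a small sketch. This is not the DynDL online or offline algorithm from the paper; it is a simple greedy heuristic written for illustration only. The function names (`linear_cost`, `greedy_schedule`), the linear cost shape, and all numeric values are assumptions that do not come from the abstract.

```python
from typing import Callable, Dict, List, Set, Tuple


def linear_cost(slope: float) -> Callable[[int], float]:
    """Hypothetical non-decreasing cost: the k-th data-remote task on a
    server pays slope * k extra time units (an assumed shape)."""
    return lambda k: slope * k


def greedy_schedule(
    tasks: List[str],
    replicas: Dict[str, Set[str]],
    free_time: Dict[str, float],
    compute_time: float,
    transfer_cost: Callable[[int], float],
) -> Tuple[Dict[str, str], float]:
    """Assign each task to the server with the earliest finish time.

    A task is data-local on a server that holds a replica of its input;
    otherwise it pays transfer_cost(k), where k counts the data-remote
    tasks placed on that server so far, including this one.
    """
    free = dict(free_time)                   # do not mutate the caller's dict
    remote_count = {server: 0 for server in free}
    assignment: Dict[str, str] = {}

    for task in tasks:
        def finish(server: str) -> float:
            if server in replicas[task]:
                return free[server] + compute_time
            extra = transfer_cost(remote_count[server] + 1)
            return free[server] + compute_time + extra

        best = min(free, key=finish)
        free[best] = finish(best)            # evaluated before the count changes
        if best not in replicas[task]:
            remote_count[best] += 1
        assignment[task] = best

    return assignment, max(free.values())


# Example: every input block lives on server A only.
tasks = ["t1", "t2", "t3", "t4"]
replicas = {t: {"A"} for t in tasks}
assignment, makespan = greedy_schedule(
    tasks, replicas, {"A": 0.0, "B": 0.0}, 10.0, linear_cost(4.0)
)
```

In this toy setup, a second remote task on server B would pay a higher transfer cost than the first, so the scheduler keeps later tasks on the data-local server A. A constant cost model would instead send `t4` to B.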

A Swarm Inspired Method for Efficient Data Transfer

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e95.d.2852 ◽

2012 ◽

Vol E95.D (12) ◽

pp. 2852-2859

Author(s):

Yutaka KAWAI ◽

Adil HASAN ◽

Go IWAI ◽

Takashi SASAKI ◽

Yoshiyuki WATASE

Keyword(s):

Data Transfer ◽

Efficient Data

Infrastructure and Energy Conservation in Big Data Computing: A Survey

Journal of Telecommunications and Information Technology ◽

10.26636/jtit.2019.132419 ◽

2019 ◽

Vol 2 ◽

pp. 73-82

Author(s):

Ewa Niewiadomska-Szynkiewicz ◽

Michał P. Karpowicz

Keyword(s):

Big Data ◽

High Performance ◽

Physical Sciences ◽

Communication Management ◽

Data Intensive ◽

Efficient Data ◽

Software Platforms ◽

Allocation Algorithms ◽

And Storage ◽

Big Data Computing

Progress in the life sciences, physical sciences, and technology depends on efficient data mining and modern computing technologies. The rapid growth of data-intensive domains requires continuous development of new solutions for network infrastructure, servers, and storage in order to address Big Data-related problems. Development of software frameworks, including smart calculation, communication management, and data decomposition and allocation algorithms, is clearly one of the major technological challenges we face. Reduction in energy consumption is another challenge arising in connection with the development of efficient HPC infrastructures. This paper addresses the vital problem of energy-efficient high-performance distributed and parallel computing. An overview of recent technologies for Big Data processing is presented, with attention focused on the most popular middleware and software platforms. Various energy-saving approaches are presented and discussed as well.

Efficient data transfer operations for a SIMD processor array system

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2012.6288206 ◽

2012 ◽

Cited By ~ 1

Author(s):

Hanno Lieske ◽

Shorin Kyo ◽

Shohei Nomoto ◽

Sunao Torii ◽

Yuki Kobayashi ◽

...

Keyword(s):

Data Transfer ◽

Processor Array ◽

Efficient Data

Reliable and Energy-Efficient Data Transfer Routing in Wireless Body Area Networks

Mobile Radio Communications and 5G Networks - Lecture Notes in Networks and Systems ◽

10.1007/978-981-15-7130-5_62 ◽

2020 ◽

pp. 757-769

Author(s):

Nikhil Marriwala

Keyword(s):

Energy Efficient ◽

Data Transfer ◽

Body Area Networks ◽

Wireless Body Area Networks ◽

Body Area ◽

Efficient Data

Big Data-Based Spectrum Sensing for Cognitive Radio Networks Using Artificial Intelligence

Big Data Analytics for Sustainable Computing - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-9750-6.ch009 ◽

2020 ◽

pp. 146-159 ◽

Cited By ~ 3

Author(s):

Suriya Murugan ◽

Sumithra M. G.

Keyword(s):

Machine Learning ◽

Big Data ◽

Cognitive Radio ◽

Spectrum Sensing ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Spectrum Utilization ◽

Candidate Solution ◽

Learning Techniques ◽

Efficient Data

Cognitive radio has emerged as a promising candidate solution to improve spectrum utilization in next-generation wireless networks. Spectrum sensing is one of the main challenges encountered by cognitive radio, and the application of big data is a powerful way to solve various problems. As spectrum resources grow increasingly scarce, big-data-based prediction for cognitive radio is becoming an inevitable trend. Signal data from various sources is analyzed using the big data cognitive radio framework, and efficient data analytics can be performed using different types of machine learning techniques. This chapter analyses the process of spectrum sensing in cognitive radio, the challenges of processing spectrum data, and the need for dynamic machine learning algorithms in the decision-making process.

Optimization Algorithms for Data Transfer in the Grid Environment

Grid and Cloud Computing ◽

10.4018/978-1-4666-0879-5.ch210 ◽

2012 ◽

pp. 502-516

Author(s):

Muzhou Xiong ◽

Hai Jin

Keyword(s):

Data Transfer ◽

Optimization Algorithms ◽

Experimental Results ◽

Grid Environment ◽

Transfer Data ◽

Multiple Data ◽

Transfer Channel ◽

Efficient Data ◽

Global Connection

In this chapter, two algorithms are presented for supporting efficient data transfer in the Grid environment. From a node’s perspective, multiple data transfer channels can be formed by selecting other nodes as relays for the transfer. One algorithm requires the sender to be aware of global connection information, while the other does not. Experimental results indicate that both algorithms can transfer data efficiently under various circumstances.
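To make relay-based multi-channel transfer concrete, here is a minimal sketch of the variant in which the sender knows global connection information. It is not either of the chapter's algorithms, whose details the abstract does not give. The function names (`select_relays`, `split_payload`), the bottleneck-bandwidth ranking, the proportional split, and the bandwidth figures are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def select_relays(
    bandwidth: Dict[Link, int], src: str, dst: str, max_relays: int
) -> List[Tuple[str, int]]:
    """Rank candidate relays by the bottleneck of the two-hop path
    src -> relay -> dst and keep the best max_relays of them."""
    nodes = {a for a, _ in bandwidth} | {b for _, b in bandwidth}
    candidates = []
    for node in nodes - {src, dst}:
        bottleneck = min(
            bandwidth.get((src, node), 0), bandwidth.get((node, dst), 0)
        )
        if bottleneck > 0:                    # skip relays with a dead hop
            candidates.append((node, bottleneck))
    # Highest bottleneck first; node name breaks ties deterministically.
    candidates.sort(key=lambda c: (-c[1], c[0]))
    return candidates[:max_relays]


def split_payload(total_bytes: int, channels: Dict[str, int]) -> Dict[str, int]:
    """Split a payload across channels in proportion to their bandwidth.
    Any rounding remainder goes to the fastest channel."""
    total_bw = sum(channels.values())
    shares = {
        name: total_bytes * bw // total_bw for name, bw in channels.items()
    }
    fastest = max(channels, key=lambda name: channels[name])
    shares[fastest] += total_bytes - sum(shares.values())
    return shares


# Assumed link bandwidths in arbitrary units; R3 has a dead second hop.
bandwidth = {
    ("S", "D"): 10,
    ("S", "R1"): 30, ("R1", "D"): 20,
    ("S", "R2"): 5, ("R2", "D"): 50,
    ("S", "R3"): 40, ("R3", "D"): 0,
}
relays = select_relays(bandwidth, "S", "D", max_relays=2)
channels = {"direct": bandwidth[("S", "D")], **dict(relays)}
shares = split_payload(700, channels)
```

The variant without global connection information would need each node to discover usable relays locally, for example by probing its neighbours. That is outside the scope of this sketch.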

Efficient data transfer protocols for big data

Service Scheduling and Resource Allocation for Big Data Transfer in Elastic Optical Network

Bandwidth scheduling for big data transfer with two variable node-disjoint paths

Client-Based Intelligence for Resource Efficient Vehicular Big Data Transfer in Future 6G Networks

DynDL: Scheduling Data-Locality-Aware Tasks with Dynamic Data Transfer Cost for Multicore-Server-Based Big Data Clusters

A Swarm Inspired Method for Efficient Data Transfer

Infrastructure and Energy Conservation in Big Data Computing: A Survey

Efficient data transfer operations for a SIMD processor array system

Reliable and Energy-Efficient Data Transfer Routing in Wireless Body Area Networks

Big Data-Based Spectrum Sensing for Cognitive Radio Networks Using Artificial Intelligence

Optimization Algorithms for Data Transfer in the Grid Environment
