Job scheduling for data-parallel frameworks with hybrid electrical/optical datacenter networks

Parallel computing technology has been widely used to process massive remote sensing data efficiently. To simplify the development of parallel processing systems for remote sensing data, and taking into account the characteristics of remote sensing data pre-processing, this paper designs a cluster-based universal parallel processing framework. The framework encapsulates parallel job scheduling and management, adopts a component-based development strategy, and provides a simple interface through which users can develop new functionality by adding new data-processing components. The framework is implemented on top of the Message Passing Interface (MPI). Experiments that add remote sensing data extraction, radiometric correction, and geometric correction to the framework show that it performs well in computing efficiency and speedup.

Multiple job scheduling in a connection-limited data parallel system

IEEE Transactions on Parallel and Distributed Systems, Vol 17 (2), pp. 125-134, 2006

DOI: 10.1109/tpds.2006.26

Cited By ~ 3

Author(s): A. Amoroso, K. Marzullo

Keyword(s): Job Scheduling, Parallel System, Limited Data, Data Parallel

Co-scheduler: Accelerating Data-Parallel Jobs in Datacenter Networks with Optical Circuit Switching

2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), 2019

DOI: 10.1109/icdcs.2019.00027

Cited By ~ 2

Author(s): Zhuozhao Li, Haiying Shen

Keyword(s): Circuit Switching, Data Parallel, Datacenter Networks, Parallel Jobs, Optical Circuit, Optical Circuit Switching

Data-parallel line relaxation method for the Navier-Stokes equations

AIAA Journal, Vol 36, pp. 1603-1609, 1998

DOI: 10.2514/3.14012

Cited By ~ 3

Author(s): Michael J. Wright, Graham V. Candler, Deepak Bose

Keyword(s): Stokes Equations, Relaxation Method, Parallel Line, Navier Stokes, Navier Stokes Equations, Data Parallel

A SURVEY ON ENERGY AWARE JOB SCHEDULING ALGORITHMS IN CLOUD ENVIRONMENT

i-manager’s Journal on Cloud Computing, Vol 3 (1), pp. 30, 2016

DOI: 10.26634/jcc.3.1.8077

Cited By ~ 1

Author(s): NASEERA SHAIK, JYOTHEESWAI P

Keyword(s): Job Scheduling, Scheduling Algorithms, Cloud Environment, Energy Aware

Space sharing job scheduling policies for parallel computers

1995

DOI: 10.31274/rtd-180813-10085

Author(s): Ismail Mohamed Ismail

Keyword(s): Job Scheduling, Parallel Computers, Scheduling Policies

Dependable grid job scheduling mechanism

Journal of Computer Applications, Vol 30 (8), pp. 2066-2069, 2010

DOI: 10.3724/sp.j.1087.2010.02066

Author(s): Yong-cai TAO, Lei SHI

Keyword(s): Job Scheduling, Grid Job Scheduling

Multi-level Parallelization of Genotype Imputation on Supercomputers

Current Bioinformatics, Vol 15, 2020

DOI: 10.2174/1574893615999200420071307

Author(s): Weiwen Zhang, Long Wang, Theint Theint Aye, Juniarto Samsudin, Yongqing Zhu

Keyword(s): Association Study, Message Passing, High Performance, Message Passing Interface, Genome Wide Association Study, Job Scheduling, Genotype Imputation, Job Level, Multi Level, High Performance Requirement

Background: Genotype imputation as a service is developed to enable researchers to estimate genotypes on haplotyped data without performing whole genome sequencing. However, genotype imputation is computationally intensive, so it remains a challenge to satisfy the high-performance requirements of genome-wide association studies (GWAS). Objective: In this paper, we propose a high performance computing solution for genotype imputation on supercomputers to improve its execution performance. Method: We design and implement a multi-level parallelization that includes job-level, process-level, and thread-level parallelization, enabled by job scheduling management, Message Passing Interface (MPI), and OpenMP, respectively. It involves job distribution, chunk partition and execution, parallelized iteration for imputation, and data concatenation. With this multi-level design, we can exploit the multi-machine/multi-core architecture to improve the performance of genotype imputation. Results: Experimental results show that our proposed method can outperform the Hadoop-based implementation of genotype imputation. Moreover, we conduct experiments on supercomputers to evaluate the performance of the proposed method. The evaluation shows that it can significantly shorten the execution time, thus improving the performance of genotype imputation. Conclusion: The proposed multi-level parallelization, when deployed as imputation as a service, will help bioinformatics researchers in Singapore conduct genotype imputation and enhance association studies.

Symbiosis: Network-aware task scheduling in data-parallel frameworks

Scatter-Gather-Merge: An efficient star-join query processing algorithm for data-parallel frameworks

A Universal Parallel Framework For Remote Sensing Process
