PHash: A memory-efficient, high-performance key-value store for large-scale data-intensive applications

Journal of Systems and Software ◽

10.1016/j.jss.2016.09.047 ◽

2017 ◽

Vol 123 ◽

pp. 33-44 ◽

Author(s):

Hyotaek Shim

Keyword(s):

High Performance ◽

Large Scale ◽

Data Intensive ◽

Large Scale Data ◽

Data Intensive Applications ◽

Memory Efficient

Performance mining of large-scale data-intensive applications

Proceedings 16th International Parallel and Distributed Processing Symposium ◽

10.1109/ipdps.2002.1016582 ◽

2002 ◽

Author(s):

C. Carothers ◽

B.K. Szymanski ◽

M. Zaki

Keyword(s):

Large Scale ◽

Data Intensive ◽

Large Scale Data ◽

Data Intensive Applications ◽

Enabling Large-Scale Biomedical Analysis in the Cloud

BioMed Research International ◽

10.1155/2013/185679 ◽

2013 ◽

Vol 2013 ◽

pp. 1-6 ◽

Author(s):

Ying-Chih Lin ◽

Chin-Sheng Yu ◽

Yen-Jen Lin

Keyword(s):

High Performance ◽

Large Scale ◽

Computing System ◽

Biomedical Data ◽

Data Intensive Computing ◽

Biomedical Analysis ◽

Data Intensive ◽

Large Scale Data ◽

Performance Computing

Recent progress in high-throughput instrumentation has led to astonishing growth in both the volume and the complexity of biomedical data collected from various sources. Data at this planetary scale poses serious challenges for storage and computing technologies. Cloud computing offers one way to meet these challenges because it supports both storage and high-performance computing on large-scale data. This work briefly introduces data-intensive computing systems and summarizes existing cloud-based resources in bioinformatics. These developments and applications should help biomedical research make the vast amount of diverse data meaningful and usable.

Modified Delay Scheduling: A Heuristic Approach for Hadoop Scheduling to Improve Fairness and Response Time

Parallel Processing Letters ◽

10.1142/s0129626415500097 ◽

2015 ◽

Vol 25 (04) ◽

pp. 1550009 ◽

Author(s):

N. P. Gopalan ◽

S. Suresh

Keyword(s):

Response Time ◽

Large Scale ◽

Programming Model ◽

Data Locality ◽

Data Intensive ◽

Large Scale Data ◽

Fair Scheduler ◽

Hadoop Clusters ◽

Data Intensive Applications ◽

Hadoop is a widely used open-source implementation of MapReduce, a popular programming model for parallel processing of large-scale data-intensive applications in a cloud environment. Sharing Hadoop clusters involves a tradeoff between fairness and data locality. When launching a local task is not possible, the Hadoop Fair Scheduler (HFS) with delay scheduling postpones node allocation for the job that fairness would schedule next, in order to achieve high locality. This waiting is wasted when the desired locality cannot be achieved within a reasonable period. In this paper, a modified delay scheduling algorithm for HFS is proposed and implemented in Hadoop. Instead of blindly waiting for a local node, the proposed algorithm first estimates how long the job would have to wait for a local node, and skips the wait whenever locality cannot be achieved within the predefined delay threshold, while accomplishing the same locality. Extensive experiments show that the algorithm performs significantly better in terms of response time and fairness, achieving up to 20% speedup and up to 38% fairness in certain cases.
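
The decision rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the wait time is estimated, so the `est_wait_for_local` field, the `Job` class, and the `decide` function are hypothetical names introduced here purely for illustration, and the estimate is simply supplied as an input.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    # Hypothetical input: estimated seconds until a node holding this
    # job's data becomes free. How to compute it is not given in the abstract.
    est_wait_for_local: float
    # Seconds the job has already spent waiting for a local slot.
    waited: float = 0.0


def decide(job: Job, node_is_local: bool, delay_threshold: float) -> str:
    """Choose an action for the job that fairness would schedule next.

    Returns "launch_local", "launch_nonlocal", or "wait".
    """
    if node_is_local:
        return "launch_local"

    remaining = delay_threshold - job.waited
    if remaining <= 0:
        # Plain delay scheduling: the threshold has expired, give up on locality.
        return "launch_nonlocal"

    if job.est_wait_for_local > remaining:
        # Modified rule (as assumed here): locality is not reachable within
        # the threshold, so skip the wait instead of waiting blindly.
        return "launch_nonlocal"

    return "wait"
```

For example, with a 5-second threshold, a job whose estimated wait is 2 seconds keeps waiting, while a job whose estimated wait is 10 seconds is launched on the non-local node immediately rather than after the threshold runs out.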

Efficient Performance Prediction for Large-Scale, Data-Intensive Applications

The International Journal of High Performance Computing Applications ◽

10.1177/109434200001400305 ◽

2000 ◽

Vol 14 (3) ◽

pp. 216-227 ◽

Author(s):

Tahsin Kurc ◽

Mustafa Uysal ◽

Hyeonsang Eom ◽

Jeff Hollingsworth ◽

Joel Saltz ◽

...

Keyword(s):

Performance Prediction ◽

Large Scale ◽

Data Intensive ◽

Large Scale Data ◽

Efficient Performance ◽

Data Intensive Applications ◽

Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist

Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining ◽

10.1145/3219819.3219927 ◽

2018 ◽

Author(s):

Alex Gittens ◽

Kai Rothauge ◽

Shusen Wang ◽

Michael W. Mahoney ◽

Lisa Gerhardt ◽

...

Keyword(s):

Data Analysis ◽

High Performance Computing ◽

High Performance ◽

Large Scale ◽

Large Scale Data ◽

Performance Computing ◽

Energy-Conservation in Large-Scale Data-Intensive Hadoop Compute Clusters

Green IT: Technologies and Applications ◽

10.1007/978-3-642-22179-8_13 ◽

2011 ◽

pp. 245-265

Author(s):

Rini T. Kaushik ◽

Klara Nahrstedt

Keyword(s):

Energy Conservation ◽

Large Scale ◽

Data Intensive ◽

Large Scale Data ◽

Enabling low latency at large-scale data center and high-performance computing interconnect networks using fine-grained all-optical switching technology

2017 International Conference on Optical Network Design and Modeling (ONDM) ◽

10.23919/ondm.2017.7958532 ◽

2017 ◽

Author(s):

Nan Hua ◽

Zhizhen Zhong ◽

Xiaoping Zheng

Keyword(s):

Data Center ◽

High Performance ◽

Optical Switching ◽

Large Scale ◽

Fine Grained ◽

Large Scale Data ◽

All Optical ◽

Performance Computing ◽

All Optical Switching ◽

Integrating Web service and grid enabling technologies to provide desktop access to high-performance cluster-based components for large-scale data services

36th Annual Simulation Symposium, 2003. ◽

10.1109/simsym.2003.1192810 ◽

2003 ◽

Author(s):

V.P. Holmes ◽

W.R. Johnson ◽

D.J. Miller

Keyword(s):

Web Service ◽

High Performance ◽

Large Scale ◽

Data Services ◽

Large Scale Data ◽

Enabling Technologies ◽

Query Prediction in Large Scale Data Intensive Event Stream Analysis Systems

2008 Seventh International Conference on Grid and Cooperative Computing ◽

10.1109/gcc.2008.115 ◽

2008 ◽

Author(s):

Song Huaiming ◽

Wang Yang ◽

An Mingyuan ◽

Wang Weiping ◽

Sun Ninghui

Keyword(s):

Large Scale ◽

Event Stream ◽

Data Intensive ◽

Large Scale Data ◽

Large-Scale Data-Intensive Computing

Large-Scale Computing ◽

10.1002/9781118130506.ch7 ◽

2012 ◽

pp. 131-140

Author(s):

Mark Parsons

Keyword(s):

Large Scale ◽

Data Intensive Computing ◽

Data Intensive ◽

Large Scale Data ◽
