Building High-Resolution Sky Images Using the Cell/B.E.

Ana Lucia Varbanescu; Alexander S. van Amesfoort; Tim Cornwell; Ger van Diepen; Rob van Nieuwpoort; Bruce G. Elmegreen; Henk Sips

doi:10.1155/2009/408370

Scientific Programming ◽

10.1155/2009/408370 ◽

2009 ◽

Vol 17 (1-2) ◽

pp. 113-134 ◽

Cited By ~ 4

Author(s):

Ana Lucia Varbanescu ◽

Alexander S. van Amesfoort ◽

Tim Cornwell ◽

Ger van Diepen ◽

Rob van Nieuwpoort ◽

...

Keyword(s):

High Performance ◽

Complete Solution ◽

Performance Potential ◽

Data Intensive ◽

Irregular Data ◽

Original Application ◽

The One ◽

High Level ◽

Data Intensive Applications ◽

Application Specific

The performance potential of the Cell/B.E., as well as its availability, has attracted a lot of attention from various high-performance computing (HPC) fields. While computation-intensive kernels proved to be exceptionally well suited for running on the Cell, irregular data-intensive applications are usually considered poor matches. In this paper, we present our complete solution for enabling such a data-intensive application to run efficiently on the Cell/B.E. processor. Specifically, we target radioastronomy data gridding and degridding, two similar imaging filters based on convolutional resampling. Our solution is based on building a high-level application model, used to evaluate parallelization alternatives. Next, we choose the one with the best performance potential, and we gradually exploit this potential by applying platform-specific and application-specific optimizations. After several iterations, our target application shows a speed-up factor between 10 and 20 on a dual-Cell blade when compared with the original application running on a commodity machine. Given these results, and based on our empirical observations, we are able to pinpoint a set of ten guidelines for parallelizing similar applications on the Cell/B.E. Finally, we conclude that the Cell/B.E. can provide high performance for data-intensive applications at the price of increased programming effort and with significant aid from aggressive application-specific optimizations.

Custom templates based heterogeneous resource allocation for data-intensive applications

10.32469/10355/86482 ◽

2020 ◽

Author(s):

Ronny Bazan Antequera

Keyword(s):

High Performance ◽

Real Data ◽

University Of Missouri ◽

Application Performance ◽

Data Intensive ◽

Edge Based ◽

The Right ◽

Heterogeneous Cloud ◽

Data Intensive Applications ◽

Cloud Resources

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI-COLUMBIA AT REQUEST OF AUTHOR.] The increase of data-intensive applications in science and engineering fields (e.g., bioinformatics, cybermanufacturing) demands the use of high-performance computing resources. However, local resources for data-intensive applications usually have limited capacity and availability due to sizable upfront costs. Moreover, using remote public resources presents constraints at the private edge network domain. Specifically, misconfigured network policies cause bottlenecks due to cross-traffic from other applications attempting to use shared networking resources. Additionally, selecting the right remote resources can be cumbersome, especially for users who weigh nonfunctional requirements such as performance, security and cost when executing their applications. Data-intensive applications have recurrent deployments and similar infrastructure requirements that can be addressed by creating templates. In this thesis, we handle application requirements through intelligent resource 'abstractions' coupled with 'reusable' approaches that save time and effort in deploying new cloud architectures. Specifically, we design a novel custom template middleware that can retrieve blueprints of resource configuration, technical/policy information, and benchmarks of workflow performance to facilitate repeatable/reusable resource composition. The middleware uses a hybrid recommendation methodology (online and offline recommendation) to leverage a catalog to rapidly check custom template solution correctness before/during resource consumption. Further, it prescribes application adaptations by fostering effective social interactions during the application's scaling stages.
Based on the above approach, we organize the thesis contributions under two main thrusts: (i) Custom Templates for Cloud Networking for Data-intensive Applications: This involves scheduling transit selection, engineering at the campus-edge based upon real-time policy control. Our solution ensures prioritized application performance delivery for multi-tenant traffic profiles from a diverse set of actual data-intensive applications in bioinformatics. (ii) Custom Templates for Cloud Computing for Data-intensive Applications: This involves recommending cloud resources for data-intensive applications based on a custom template catalog. We develop a novel expert system approach that is implemented as a middleware to abstract data-intensive application requirements for custom template composition. We uniquely consider heterogeneous cloud resource selection for the deployment of cloud architectures for real data-intensive applications in cybermanufacturing.

Empirical Performance Analysis of HPC Benchmarks Across Variations in Cloud Computing

International Journal of Cloud Applications and Computing ◽

10.4018/ijcac.2013010102 ◽

2013 ◽

Vol 3 (1) ◽

pp. 13-26 ◽

Cited By ~ 4

Author(s):

Sanjay P. Ahuja ◽

Sindhu Mani

Keyword(s):

Data Storage ◽

High Performance ◽

Large Data ◽

Extensive Study ◽

Memory Bandwidth ◽

Platform As A Service ◽

Data Intensive ◽

Computational Performance ◽

Empirical Performance ◽

Data Intensive Applications

High Performance Computing (HPC) applications are scientific applications that require significant CPU capabilities. They are also data-intensive applications requiring large data storage. While many researchers have examined the performance of Amazon’s EC2 platform across some HPC benchmarks, an extensive study comparing Amazon’s EC2 and Microsoft’s Windows Azure on metrics such as memory bandwidth, I/O performance, and communication and computational performance is largely missing. The purpose of this paper is to implement existing benchmarks to evaluate and analyze these metrics for EC2 and Windows Azure, which span both Infrastructure-as-a-Service and Platform-as-a-Service types. This was accomplished by running MPI versions of the STREAM, Interleaved or Random (IOR) and NAS Parallel (NPB) benchmarks on small and medium instance types. In addition, a new EC2 medium instance type (m1.medium) was included in the analysis. These benchmarks measure memory bandwidth, I/O performance, and communication and computational performance.

Using Grids for Distributed Knowledge Discovery

Mathematical Methods for Knowledge Discovery and Data Mining ◽

10.4018/978-1-59904-528-3.ch017 ◽

2011 ◽

pp. 284-298 ◽

Cited By ~ 3

Author(s):

Antonio Congiusta ◽

Domenico Talia ◽

Paolo Trunfio

Keyword(s):

Data Mining ◽

Knowledge Discovery ◽

High Performance ◽

Data Transfer ◽

Grid Services ◽

Distributed Knowledge ◽

Data Intensive ◽

Knowledge Grid ◽

Complex Knowledge ◽

High Level

Knowledge discovery is a compute- and data-intensive process that allows for finding patterns, trends, and models in large datasets. The Grid can be effectively exploited for deploying knowledge discovery applications because of the high performance it can offer and its distributed infrastructure. For effective use of Grids in knowledge discovery, the development of middleware is critical to support data management, data transfer, data mining and knowledge representation. To this end, we designed the Knowledge Grid, a high-level environment providing Grid-based knowledge discovery tools and services. Such services allow users to create and manage complex knowledge discovery applications, composed as workflows that integrate data sources and data mining tools provided as distributed Grid services. This chapter describes the Knowledge Grid architecture and how its components can be used to design and implement distributed knowledge discovery applications. Then, the chapter describes how the Knowledge Grid services can be made accessible using the Open Grid Services Architecture (OGSA) model.

Evaluating the usefulness of content addressable storage for high-performance data intensive applications

Proceedings of the 17th international symposium on High performance distributed computing - HPDC '08 ◽

10.1145/1383422.1383428 ◽

2008 ◽

Cited By ~ 13

Author(s):

Partho Nath ◽

Bhuvan Urgaonkar ◽

Anand Sivasubramaniam

Keyword(s):

High Performance ◽

Performance Data ◽

Data Intensive ◽

Data Intensive Applications

Adaptive Page Migration for Irregular Data-intensive Applications under GPU Memory Oversubscription

2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS) ◽

10.1109/ipdps47924.2020.00054 ◽

2020 ◽

Cited By ~ 1

Author(s):

Debashis Ganguly ◽

Ziyu Zhang ◽

Jun Yang ◽

Rami Melhem

Keyword(s):

Data Intensive ◽

Page Migration ◽

Irregular Data ◽

Data Intensive Applications

Creating a portable, high-level graph analytics paradigm for compute and data-intensive applications

International Journal of High Performance Computing and Networking ◽

10.1504/ijhpcn.2017.10007922 ◽

2017 ◽

Vol 1 (1) ◽

pp. 1

Author(s):

Robert Searles ◽

Michela Taufer ◽

Sunita Chandrasekaran ◽

Stephen Herbein ◽

Travis Johnston

Keyword(s):

Graph Analytics ◽

Data Intensive ◽

High Level ◽

Data Intensive Applications

Automated Configuration of NoSQL Performance and Scalability Tactics for Data-Intensive Applications

Informatics ◽

10.3390/informatics7030029 ◽

2020 ◽

Vol 7 (3) ◽

pp. 29

Author(s):

Davy Preuveneers ◽

Wouter Joosen

Keyword(s):

Supervised Machine Learning ◽

Tuning Parameters ◽

Data Intensive ◽

Optimal Service ◽

Challenging Tasks ◽

Growth Changes ◽

High Level ◽

Storage Technologies ◽

Data Intensive Applications ◽

Current Monitoring

This paper presents the architecture, implementation and evaluation of a middleware support layer for NoSQL storage systems. Our middleware automatically selects performance and scalability tactics in terms of application-specific workloads. Enterprises are turning to NoSQL storage technologies for their data-intensive computing and analytics applications. Comprehensive benchmarks of different Big Data platforms can help drive decisions about which solutions to adopt. However, selecting the best performing technology, configuring the deployment for scalability and tuning parameters at runtime for optimal service delivery remain challenging tasks, especially when application workloads evolve over time. Our middleware solves this problem at runtime by monitoring the data growth, changes in the read-write-query mix, as well as other system metrics that are indicative of sub-optimal performance. Our middleware employs supervised machine learning on historic and current monitoring information and corresponding configurations to select the best combinations of high-level tactics and adapt NoSQL systems to evolving workloads. This work has been driven by two real-world case studies with different QoS requirements. The evaluation demonstrates that our middleware can adapt to unseen workloads of data-intensive applications, and automate the configuration of different families of NoSQL systems at runtime to optimize the performance and scalability of such applications.

Impact of high performance sockets on data intensive applications

High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on ◽

10.1109/hpdc.2003.1210013 ◽

2004 ◽

Cited By ~ 10

Author(s):

P. Balaji ◽

Jiesheng Wu ◽

T. Kurc ◽

U. Catalyurek ◽

D.K. Panda ◽

...

Keyword(s):

High Performance ◽

Data Intensive ◽

Data Intensive Applications

The Educational Project in the Context of High-Performance Sports

Sociology of Sport Journal ◽

10.1123/ssj.2020-0069 ◽

2020 ◽

pp. 1-8

Author(s):

Fabrice Burlot ◽

Mathilde Desenfant ◽

Helene Joncheray

Keyword(s):

High Performance ◽

The Other ◽

Training Project ◽

Educational Project ◽

Other Hand ◽

Social Balance ◽

Adjustment Variable ◽

The One ◽

High Level

The requirements of performance sport are becoming more and more time-consuming for athletes. Based on the work of Rosa, the article looks into the ability of athletes to reconcile their training project and the increasing requirements of practice at a high level. To address this issue, we interviewed 63 high-level French athletes who train at the French Institute of Sport. The results show that although the training project appears to be time-consuming, it is nonetheless a source of social balance and a reassuring choice for their future professional retraining. In order to preserve this educational project in the time-consuming context of high-performance sports, athletes on the one hand implement strategies of arrangement in order to produce an acceptable timetable, and on the other hand use this temporality as an adjustment variable allowing them to better manage temporal emergencies. By giving athletes a voice, this work deconstructs the idea of the incompatibility of educational and sports projects and offers recommendations to sports institutions.

PHash: A memory-efficient, high-performance key-value store for large-scale data-intensive applications

Journal of Systems and Software ◽

10.1016/j.jss.2016.09.047 ◽

2017 ◽

Vol 123 ◽

pp. 33-44 ◽

Cited By ~ 2

Author(s):

Hyotaek Shim

Keyword(s):

High Performance ◽

Large Scale ◽

Data Intensive ◽

Large Scale Data ◽

Data Intensive Applications ◽

Scale Data ◽

Memory Efficient