Distributed Computing Software and Data Access Patterns in OSG Midscale Collaborations

2020 ◽  
Vol 245 ◽  
pp. 03005
Author(s):  
Pascal Paschos ◽  
Benedikt Riedel ◽  
Mats Rynge ◽  
Lincoln Bryant ◽  
Judith Stephen ◽  
...  

In this paper we showcase Open Science Grid (OSG) support for Midscale collaborations: the range of computing and storage scale at which multi-institutional researchers collaborate to execute their science workflows on the grid without dedicated technical support teams of their own. Collaboration Services enables such collaborations to take advantage of the distributed resources of the Open Science Grid by facilitating access to submission hosts, helping deploy their applications, and supporting their data management requirements. Distributed computing software adopted from large-scale collaborations, such as CVMFS, Rucio, and xCache, lowers the barrier for intermediate-scale research to integrate with existing infrastructure.

2007 ◽  
Vol 15 (4) ◽  
pp. 249-268 ◽  
Author(s):  
Gurmeet Singh ◽  
Karan Vahi ◽  
Arun Ramakrishnan ◽  
Gaurang Mehta ◽  
Ewa Deelman ◽  
...  

In this paper we examine the issue of optimizing disk usage and scheduling large-scale scientific workflows onto distributed resources where the workflows are data-intensive, requiring large amounts of data storage, and the resources have limited storage. Our approach is two-fold: we minimize the amount of space a workflow requires during execution by removing data files at runtime when they are no longer needed, and we demonstrate that workflows may have to be restructured to reduce their overall data footprint. We show the results of our data management and workflow restructuring solutions using a Laser Interferometer Gravitational-Wave Observatory (LIGO) application and an astronomy application, Montage, running on a large-scale production grid, the Open Science Grid. We show that although the data footprint of Montage can be reduced by 48% with dynamic data cleanup techniques, LIGO Scientific Collaboration workflows require additional restructuring to achieve a 56% reduction in data space usage. We also examine the cost of the workflow restructuring in terms of the application's runtime.
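
The runtime cleanup idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the three-task workflow, the file names, the sizes, and the `peak_footprint` helper are all hypothetical, and the sketch assumes tasks run one at a time and that external inputs are staged in just before first use. It deletes a file as soon as no remaining task consumes it, then compares peak disk usage with and without that cleanup.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical workflow description: task name -> (input files, output files).
Task = Tuple[List[str], List[str]]


def peak_footprint(order: List[str], tasks: Dict[str, Task],
                   sizes: Dict[str, int], keep: Set[str],
                   cleanup: bool) -> int:
    """Return the peak storage used when running `order` sequentially."""
    # Count how many tasks still need each file as an input.
    remaining: Dict[str, int] = {}
    for inputs, _ in tasks.values():
        for f in inputs:
            remaining[f] = remaining.get(f, 0) + 1

    on_disk: Set[str] = set()
    peak = 0
    for name in order:
        inputs, outputs = tasks[name]
        # Inputs and outputs coexist on disk while the task runs.
        on_disk.update(inputs)
        on_disk.update(outputs)
        peak = max(peak, sum(sizes[f] for f in on_disk))
        # The task has finished: it no longer needs its inputs.
        for f in inputs:
            remaining[f] -= 1
        if cleanup:
            # Drop files with no remaining consumer, except final products.
            on_disk = {f for f in on_disk
                       if remaining.get(f, 0) > 0 or f in keep}
    return peak


# Hypothetical linear workflow with made-up file sizes (arbitrary units).
tasks: Dict[str, Task] = {
    "project": (["raw"], ["proj"]),
    "diff": (["proj"], ["diffs"]),
    "mosaic": (["diffs"], ["final"]),
}
sizes = {"raw": 10, "proj": 10, "diffs": 4, "final": 6}
order = ["project", "diff", "mosaic"]
keep = {"final"}

no_cleanup = peak_footprint(order, tasks, sizes, keep, cleanup=False)
with_cleanup = peak_footprint(order, tasks, sizes, keep, cleanup=True)
print(no_cleanup, with_cleanup)
```

In this toy case cleanup alone lowers the peak. The abstract's second point, that some workflows also need restructuring, corresponds to changing `order` or the task graph itself so that large intermediate files become deletable sooner.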


2012 ◽  
Vol 20 (2) ◽  
pp. 89-114 ◽  
Author(s):  
H. Carter Edwards ◽  
Daniel Sunderland ◽  
Vicki Porter ◽  
Chris Amsler ◽  
Sam Mish

Large, complex scientific and engineering application codes have a significant investment in computational kernels to implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge in that these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach to implementing computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices each with its own memory space, (2) data parallel kernels and (3) multidimensional arrays. Kernel execution performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. The optimal data access pattern can differ between manycore devices, potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introducing device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].
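
The separation of data access patterns from kernels described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not the Kokkos Array API: `View2D`, `row_major`, `col_major`, and `fill_kernel` are hypothetical names, and pairing row-major storage with a CPU-like device and column-major storage with a GPU-like device is an assumption made for the example. The layout here is also chosen at runtime, whereas the abstract describes mappings introduced at compile time.

```python
from typing import Callable, List, Tuple

# A layout maps a logical index (i, j) to an offset in flat storage.
Layout = Callable[[int, int, int, int], int]


def row_major(i: int, j: int, n_rows: int, n_cols: int) -> int:
    """Consecutive j values are adjacent in memory."""
    return i * n_cols + j


def col_major(i: int, j: int, n_rows: int, n_cols: int) -> int:
    """Consecutive i values are adjacent in memory."""
    return j * n_rows + i


class View2D:
    """Hypothetical 2-D array whose storage order is set by a layout."""

    def __init__(self, n_rows: int, n_cols: int, layout: Layout) -> None:
        self.n_rows = n_rows
        self.n_cols = n_cols
        self._layout = layout
        self.data: List[float] = [0.0] * (n_rows * n_cols)

    def __getitem__(self, idx: Tuple[int, int]) -> float:
        i, j = idx
        return self.data[self._layout(i, j, self.n_rows, self.n_cols)]

    def __setitem__(self, idx: Tuple[int, int], value: float) -> None:
        i, j = idx
        self.data[self._layout(i, j, self.n_rows, self.n_cols)] = value


def fill_kernel(view: View2D) -> None:
    """Kernel written only against logical indices, never flat offsets."""
    for i in range(view.n_rows):
        for j in range(view.n_cols):
            view[i, j] = 10.0 * i + j


# The same kernel runs unchanged on both layouts.
cpu_view = View2D(2, 3, row_major)
gpu_view = View2D(2, 3, col_major)
fill_kernel(cpu_view)
fill_kernel(gpu_view)
print(cpu_view.data)
print(gpu_view.data)
```

Both views return the same value for any logical index, while the underlying storage order differs. The kernel body never changes when the layout does, which is the property the abstract attributes to the multidimensional array API.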


2012 ◽  
pp. 862-880
Author(s):  
Russ Miller ◽  
Charles Weeks

Grids represent an emerging technology that allows geographically and organizationally distributed resources (e.g., computer systems, data repositories, sensors, imaging systems, and so forth) to be linked in a fashion that is transparent to the user. The New York State Grid (NYS Grid) is an integrated computational and data grid that provides access to a wide variety of resources to users from around the world. NYS Grid can be accessed via a Web portal, where users have access to their data sets and applications but need not be aware of the details of the data storage or computational devices employed in solving their problems. Grid-enabled versions of the SnB and BnP programs, which implement the Shake-and-Bake method of molecular structure (SnB) and substructure (BnP) determination, respectively, have been deployed on NYS Grid. Further, through the Grid Portal, SnB has been run simultaneously on all computational resources on NYS Grid as well as on more than 1100 of the over 3000 processors available through the Open Science Grid.


2020 ◽  
Vol 13 (12) ◽  
pp. 1656-1671 ◽  
Author(s):  
Jizhe Xia ◽  
Sicheng Huang ◽  
Shaobiao Zhang ◽  
Xiaoming Li ◽  
Jianrong Lyu ◽  
...  

2013 ◽  
Vol 10 (4) ◽  
pp. 1-19
Author(s):  
Andrei Hagiescu ◽  
Bing Liu ◽  
R. Ramanathan ◽  
Sucheendra K. Palaniappan ◽  
Zheng Cui ◽  
...  
