Scientific Data Lake for High Luminosity LHC project and other data-intensive particle and astro-particle physics experiments

2020 ◽ Vol 1690 ◽ pp. 012166
Author(s): A Alekseev ◽ A Kiryanov ◽ A Klimentov ◽ T Korchuganova ◽ V Mitsyn ◽ ...

2021 ◽ Vol 251 ◽ pp. 02031
Author(s): Aleksandr Alekseev ◽ Xavier Espinal ◽ Stephane Jezequel ◽ Andrey Kiryanov ◽ Alexei Klimentov ◽ ...

The High Luminosity phase of the LHC, which aims for a tenfold increase in the luminosity of proton-proton collisions, is expected to start operation in eight years. An unprecedented scientific data volume at the multiexabyte scale will be delivered to particle physics experiments at CERN. This amount of data has to be stored, and the corresponding technology must ensure fast and reliable data delivery for processing by the scientific community all over the world. The present LHC computing model will not be able to provide the required infrastructure growth, even taking into account the expected hardware evolution. To address this challenge, the Data Lake R&D project was launched by the DOMA community in the fall of 2019. State-of-the-art data handling technologies are under active development, and their current status for the Russian Scientific Data Lake prototype is presented here.


2016
Author(s): Matthew Dickinson

In recent years, most scientific research in both academia and industry has become increasingly data-driven. According to market estimates, spending related to supporting scientific data-intensive research is expected to increase to $5.8 billion by 2018. Particularly for data-intensive scientific fields such as bioscience or particle physics within academic environments, data storage/processing facilities, expert collaborators and specialized computing resources do not always reside within campus boundaries. With the growing trend of large collaborative partnerships involving researchers, expensive scientific instruments and high performance computing centers, experiments and simulations produce petabytes of data, viz. Big Data, that is likely to be shared and analyzed by scientists in multi-disciplinary areas. Federated multi-cloud resource allocation for data-intensive application workflows is generally performed based on performance or quality-of-service (i.e., QSpecs) considerations. At the same time, the end-to-end security requirements of these workflows across multiple domains are treated as an afterthought due to a lack of standardized formalization methods. Consequently, diverse/heterogeneous domain resource and security policies cause conflicts between an application's security and performance requirements that lead to sub-optimal resource allocations, especially when multiple such applications contend for limited resources. In this thesis, a joint performance- and security-driven federated resource allocation scheme for data-intensive scientific applications is presented. To aid joint resource brokering among multi-cloud domains with diverse/heterogeneous security postures, a data-intensive application's security specifications (i.e., SSpecs) are first defined and characterized. Next, an alignment technique inspired by Portunes Algebra is presented to homogenize the various domain resource policies (i.e., RSpecs) along an application's workflow lifecycle stages. Using this formalization and alignment, a near-optimal, cost-aware, joint QSpecs-SSpecs-driven, RSpecs-compliant resource allocation algorithm for multi-cloud computing resource domain/location selection as well as network path selection is proposed. We implement our security formalization, alignment, and allocation scheme as a framework, viz. "OnTimeURB", and validate it in a multi-cloud environment with exemplar data-intensive application workflows involving distributed computing and remote instrumentation use cases with different performance and security requirements.
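
The brokering idea described in the abstract can be pictured as a two-stage filter-then-score selection: candidate domains are first checked for compliance of their own policies (RSpecs) with the application's security specification (SSpecs), and only the compliant, performance-feasible ones (QSpecs) are ranked by cost. The sketch below is a minimal, hypothetical illustration of that pattern; all class names, fields, and weights are invented for the example and do not reproduce the OnTimeURB framework or its Portunes-based alignment.

# Illustrative sketch only: a toy joint QSpecs/SSpecs-driven, RSpecs-compliant
# selection over candidate cloud domains. Names and weights are hypothetical,
# not taken from OnTimeURB.
from dataclasses import dataclass

@dataclass
class SSpec:                      # application security requirements
    min_encryption_bits: int
    needs_isolated_tenancy: bool

@dataclass
class QSpec:                      # application performance requirements
    max_latency_ms: float
    min_bandwidth_gbps: float

@dataclass
class Domain:                     # one candidate cloud/HPC resource domain
    name: str
    encryption_bits: int          # RSpec: strongest cipher the domain offers
    isolated_tenancy: bool        # RSpec: dedicated-tenancy support
    latency_ms: float
    bandwidth_gbps: float
    cost_per_hour: float

def rspec_compliant(d: Domain, s: SSpec) -> bool:
    """A domain is eligible only if its resource/security policy covers the SSpec."""
    return (d.encryption_bits >= s.min_encryption_bits
            and (d.isolated_tenancy or not s.needs_isolated_tenancy))

def qspec_feasible(d: Domain, q: QSpec) -> bool:
    return d.latency_ms <= q.max_latency_ms and d.bandwidth_gbps >= q.min_bandwidth_gbps

def select_domain(domains, q: QSpec, s: SSpec, cost_weight=1.0, latency_weight=0.1):
    """Pick the cheapest (cost plus weighted latency) domain meeting both specs."""
    eligible = [d for d in domains if rspec_compliant(d, s) and qspec_feasible(d, q)]
    if not eligible:
        return None
    return min(eligible, key=lambda d: cost_weight * d.cost_per_hour
                                       + latency_weight * d.latency_ms)

if __name__ == "__main__":
    domains = [
        Domain("campus-hpc", 256, True, 5.0, 40.0, 0.0),
        Domain("public-cloud-a", 128, False, 20.0, 10.0, 1.2),
        Domain("public-cloud-b", 256, True, 35.0, 25.0, 0.9),
    ]
    choice = select_domain(domains, QSpec(50.0, 20.0), SSpec(256, True))
    print(choice.name if choice else "no compliant domain")

In a real broker the scoring would also cover network path selection and per-stage workflow requirements, but the compliance-then-ranking structure carries over.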


2020 ◽ Vol 35 (33) ◽ pp. 2030022
Author(s): Aleksandr Alekseev ◽ Simone Campana ◽ Xavier Espinal ◽ Stephane Jezequel ◽ Andrey Kirianov ◽ ...

The experiments at CERN’s Large Hadron Collider use the Worldwide LHC Computing Grid, the WLCG, as their distributed computing infrastructure. Through the distributed workload and data management systems, they provide seamless access to hundreds of grid, HPC and cloud based computing and storage resources that are distributed worldwide to thousands of physicists. LHC experiments annually process more than an exabyte of data using an average of 500,000 distributed CPU cores, enabling hundreds of new scientific results from the collider. However, the resources available to the experiments have been insufficient to meet data processing, simulation and analysis needs over the past five years as the volume of data from the LHC has grown. The problem will be even more severe for the next LHC phases. The High Luminosity LHC will be a multiexabyte challenge where the envisaged storage and compute needs are a factor of 10 to 100 above the expected technology evolution. The particle physics community needs to evolve its current computing and data organization models, changing the way it uses and manages the infrastructure, with a focus on optimizations that improve performance and efficiency without neglecting the simplification of operations. In this paper we highlight a recent R&D project on a scientific data lake and federated data storage.
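
To see why the gap cannot be closed by hardware evolution alone, a simple compound-growth estimate is enough: flat-budget capacity improving by some tens of percent per year gains only a factor of a few over the HL-LHC ramp-up, while the needs grow by an order of magnitude or more. The horizon, growth rates and needs factor in the sketch below are illustrative assumptions, not WLCG projections.

# Rough back-of-envelope illustration of the "needs vs. technology evolution" gap.
# The horizon, growth rates and the factor-10 needs increase are assumptions
# chosen only to illustrate the scale of the problem, not official figures.

years = 8                 # illustrative horizon to the HL-LHC era
needs_factor = 10         # assumed growth of storage/compute needs (low end of 10-100)

for annual_gain in (0.10, 0.15, 0.20):            # flat-budget capacity growth per year
    capacity_factor = (1 + annual_gain) ** years  # compound hardware/price evolution
    shortfall = needs_factor / capacity_factor    # how far needs outrun capacity
    print(f"{annual_gain:.0%}/yr -> capacity x{capacity_factor:.1f}, "
          f"remaining gap x{shortfall:.1f}")

Under these assumptions the capacity grows by roughly a factor of 2 to 4, leaving a severalfold shortfall that has to come from changes in data organization and management, which is the motivation for the data lake R&D.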


1977 ◽ Vol 140 (3) ◽ pp. 549-552
Author(s): E.D. Platner ◽ A. Etkin ◽ K.J. Foley ◽ J.H. Goldman ◽ W.A. Love ◽ ...

2004 ◽ Vol 13 (10) ◽ pp. 2355-2359
Author(s): Jonathan L. Feng ◽ Arvind Rajaraman ◽ Fumihiro Takayama

The gravitational interactions of elementary particles are suppressed by the Planck scale M* ~ 10^18 GeV and are typically expected to be far too weak to be probed by experiments. We show that, contrary to conventional wisdom, such interactions may be studied by particle physics experiments in the next few years. As an example, we consider conventional supergravity with a stable gravitino as the lightest supersymmetric particle. The next-lightest supersymmetric particle (NLSP) decays to the gravitino through gravitational interactions after about a year. This lifetime can be measured by stopping NLSPs at colliders and observing their decays. Such studies will yield a measurement of Newton's gravitational constant on unprecedentedly small scales, shed light on dark matter, and provide a window on the early universe.
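
The year-scale lifetime follows from the Planck-suppressed decay width. As an illustrative sketch (not a substitute for the paper's full treatment), the commonly quoted superWIMP expression for a slepton NLSP decaying to its lepton partner and a gravitino is

% Parametric NLSP width, Planck-suppressed by M_*; shown for illustration only.
\[
  \Gamma(\tilde{\ell} \to \ell\,\tilde{G})
    \;\simeq\; \frac{m_{\tilde{\ell}}^{5}}{48\pi\, m_{3/2}^{2} M_{*}^{2}}
    \left[1 - \frac{m_{3/2}^{2}}{m_{\tilde{\ell}}^{2}}\right]^{4},
  \qquad M_{*} \simeq 2.4\times 10^{18}\ \mathrm{GeV}.
\]

For illustrative weak-scale masses (an NLSP of a few hundred GeV and a gravitino of comparable mass), the resulting lifetime tau = hbar/Gamma falls in the 10^6 to 10^8 s range, i.e. up to about a year, which is what makes stopping NLSPs and watching their late decays feasible and turns the measured lifetime into a probe of the gravitational coupling.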

