Integration of Rucio in Belle II

2021, Vol 251, pp. 02057
Author(s): Cédric Serfon, Ruslan Mashinistov, John Steven De Stefano, Michel Hernández Villanueva, Hironori Ito, ...

The Belle II experiment, which started taking physics data in April 2019, will multiply the volume of data currently stored on its nearly 30 storage elements worldwide by one order of magnitude, reaching about 340 PB of data (raw and Monte Carlo simulation data) by the end of operations. To tackle this massive increase and to manage the data even after the end of data taking, it was decided to move the Distributed Data Management software from a homegrown piece of software to a Data Management solution widely used in HEP and beyond: Rucio. This contribution describes the work done to integrate Rucio with the Belle II distributed computing infrastructure, as well as the migration strategy that was successfully carried out to ensure a smooth transition.
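Rucio manages data through declarative replication rules rather than explicit transfer commands. As an illustration only (not the Belle II migration code), the sketch below uses the standard Rucio Python client to register a dataset and request replicas of it; the scope, dataset name and RSE expression are hypothetical placeholders, and a configured Rucio client with valid credentials is assumed.

```python
# Illustrative sketch of typical Rucio DDM operations (not Belle II production code).
# Assumes a configured Rucio client (rucio.cfg) and valid credentials; the scope,
# dataset name and RSE expression below are hypothetical examples.
from rucio.client import Client

client = Client()

scope = "user.jdoe"                    # hypothetical scope
dataset = "example.mc.dataset_0001"    # hypothetical dataset name

# Register a new (empty) dataset DID in the catalogue.
client.add_dataset(scope=scope, name=dataset)

# Declare a replication rule: keep 2 copies of the dataset on matching RSEs.
# Rucio's rule engine then creates and maintains the transfers automatically.
client.add_replication_rule(
    dids=[{"scope": scope, "name": dataset}],
    copies=2,
    rse_expression="type=DISK",        # hypothetical RSE expression
    lifetime=None,                     # keep the rule until explicitly deleted
)

# Inspect where replicas of the dataset's files currently are.
for replica in client.list_replicas([{"scope": scope, "name": dataset}]):
    print(replica["name"], list(replica.get("rses", {}).keys()))
```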

2020, Vol 245, pp. 04007
Author(s): Siarhei Padolski, Hironori Ito, Paul Laycock, Ruslan Mashinistov, Hideki Miyake, ...

The Belle II experiment started taking physics data in April 2018, with an estimated total volume of all files, including raw events, Monte Carlo and skim statistics, of 340 petabytes expected by the end of operations in the late 2020s. Originally designed as a fully integrated component of the BelleDIRAC production system, the Belle II distributed data management (DDM) software needs to manage data across about 29 storage elements worldwide for a collaboration of nearly 1000 physicists. By late 2018, this software required significant performance improvements to meet the requirements of physics data taking and was seriously lacking in automation. Rucio, the DDM solution created by ATLAS, was an obvious alternative but required tight integration with BelleDIRAC and a seamless yet non-trivial migration. This contribution describes the work done on both DDM options, the current status of the software running successfully in production, and the problems associated with trying to balance long-term operations cost against short-term risk.


2019, Vol 214, pp. 04031
Author(s): Malachi Schram

The Belle II experiment at the SuperKEKB collider in Tsukuba, Japan, started taking physics data in early 2018 and plans to accumulate 50 ab⁻¹, approximately 50 times more data than the Belle experiment. The collaboration expects it will require managing and processing approximately 200 PB of data. Computing at this scale requires efficient and coordinated use of the geographically distributed compute resources in North America, Asia and Europe, and will take advantage of high-speed global networks. We present the general Belle II distributed data management system and computing results from the first phase of data taking.


2021, Vol 251, pp. 02026
Author(s): Cédric Serfon, John Steven De Stefano, Michel Hernández Villanueva, Hironori Ito, Yuji Kato, ...

DIRAC and Rucio are two standard pieces of software widely used in the HEP domain. DIRAC provides Workload and Data Management functionalities, among other things, while Rucio is a dedicated, advanced Distributed Data Management system. Many communities that already use DIRAC have expressed their interest in using DIRAC for Workload Management in combination with Rucio for Data Management. In this paper, we describe the integration of the Rucio File Catalog into DIRAC, which was initially developed for the Belle II collaboration.
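With a Rucio-backed catalogue plugin configured in DIRAC, existing client code keeps talking to DIRAC's generic FileCatalog interface while the calls are served by Rucio behind the scenes. The sketch below is a minimal illustration of that client-side view under those assumptions; the LFN is a hypothetical placeholder and the exact plugin configuration depends on the installation.

```python
# Minimal sketch of querying a catalogue through DIRAC's generic FileCatalog
# interface (illustrative only; assumes a DIRAC installation whose configuration
# selects a Rucio-backed catalogue plugin).
from DIRAC.Core.Base import Script
Script.parseCommandLine()  # initialise the DIRAC configuration system

from DIRAC.Resources.Catalog.FileCatalog import FileCatalog

fc = FileCatalog()  # uses whichever catalogue(s) the configuration selects

lfn = "/belle/example/path/file.root"  # hypothetical logical file name

# Standard DIRAC S_OK / S_ERROR result structure with Successful / Failed maps.
result = fc.getReplicas(lfn)
if result["OK"]:
    replicas = result["Value"]["Successful"].get(lfn, {})
    for storage_element, pfn in replicas.items():
        print(storage_element, pfn)
else:
    print("Catalogue query failed:", result["Message"])
```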


2014, Vol 513 (3), pp. 032095
Author(s): Wataru Takase, Yoshimi Matsumoto, Adil Hasan, Francesca Di Lodovico, Yoshiyuki Watase, ...
