DAQExpert the service to increase CMS data-taking efficiency

2020 ◽  
Vol 245 ◽  
pp. 01028
Author(s):  
Gilbert Badaro ◽  
Ulf Behrens ◽  
James Branson ◽  
Philipp Brummer ◽  
Sergio Cittolin ◽  
...  

The Data Acquisition (DAQ) system of the Compact Muon Solenoid (CMS) experiment at the LHC is a complex system responsible for the data readout, event building and recording of accepted events. Its proper functioning plays a critical role in the data-taking efficiency of the CMS experiment. In order to ensure high availability and recover promptly in the event of hardware or software failure of the subsystems, an expert system, the DAQ Expert, has been developed. It aims at improving the data-taking efficiency, reducing human error in operations and minimising the on-call expert demand. Introduced at the beginning of 2017, it assists the shift crew and the system experts in recovering from operational faults, streamlining the post-mortem analysis and, at the end of Run 2, triggering fully automatic recovery without human intervention. The DAQ Expert analyses the real-time monitoring data originating from the DAQ components and the high-level trigger, updated every few seconds. It pinpoints data-flow problems and recovers from them automatically or after operator approval. We analyse the CMS downtime in the 2018 run, focusing on what was improved with the introduction of automated recovery, and present the challenges and design of encoding the expert knowledge into automated recovery jobs. Furthermore, we demonstrate the web-based ReactJS interfaces that ensure effective cooperation between the human operators in the control room and the automated recovery system. We report on the operational experience with automated recovery.

2019 ◽  
Vol 214 ◽  
pp. 01015 ◽  
Author(s):  
Jean-Marc Andre ◽  
Ulf Behrens ◽  
James Branson ◽  
Philipp Brummer ◽  
Sergio Cittolin ◽  
...  

The data acquisition (DAQ) system of the Compact Muon Solenoid (CMS) at CERN reads out the detector at the level-1 trigger accept rate of 100 kHz, assembles events with a bandwidth of 200 GB/s, provides these events to the high-level trigger running on a farm of about 30k cores and records the accepted events. Comprising custom-built and cutting-edge commercial hardware and several thousand instances of software applications, the DAQ system is complex in itself and failures cannot be completely excluded. Moreover, problems in the readout of the detectors, in the first-level trigger system or in the high-level trigger may provoke anomalous behaviour of the DAQ system, which sometimes cannot easily be differentiated from a problem in the DAQ system itself. In order to achieve high data-taking efficiency with operators from the entire collaboration and without relying too heavily on the on-call experts, an expert system, the DAQ-Expert, has been developed that can pinpoint the source of most failures and give advice to the shift crew on how to recover in the quickest way. The DAQ-Expert constantly analyzes monitoring data from the DAQ system and the high-level trigger by making use of logic modules written in Java that encapsulate the expert knowledge about potential operational problems. The results of the reasoning are presented to the operator in a web-based dashboard, may trigger sound alerts in the control room, and are archived for post-mortem analysis, which is presented in a web-based timeline browser. We present the design of the DAQ-Expert and report on the operational experience since 2017, when it was first put into production.
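The logic-module idea in the abstract above can be illustrated with a small sketch: each module inspects a monitoring snapshot and, if its condition holds, returns a finding with advice for the shift crew. The sketch is in Python for brevity, whereas the abstract states that the real modules are written in Java. Every name here (`Snapshot`, `StalledDataFlow`, the field names, the threshold) is a hypothetical stand-in and not taken from the DAQ-Expert code base.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """One monitoring sample. Field names are hypothetical."""
    timestamp_s: float
    l1_rate_hz: float     # level-1 accept rate observed by the DAQ
    backpressure: bool    # True if any component reports backpressure


@dataclass(frozen=True)
class Finding:
    """What a logic module reports when its condition is satisfied."""
    module: str
    timestamp_s: float
    advice: str


class StalledDataFlow:
    """Hypothetical logic module: rate collapsed while backpressure is on.

    The default threshold is an illustrative assumption, not a CMS setting.
    """
    name = "StalledDataFlow"

    def __init__(self, min_rate_hz: float = 1.0) -> None:
        self.min_rate_hz = min_rate_hz

    def evaluate(self, snap: Snapshot) -> Optional[Finding]:
        if snap.backpressure and snap.l1_rate_hz < self.min_rate_hz:
            return Finding(
                module=self.name,
                timestamp_s=snap.timestamp_s,
                advice="Data flow appears stalled; follow the recovery procedure.",
            )
        return None


def analyse(snapshots: Iterable[Snapshot], modules: Iterable) -> List[Finding]:
    """Run every module on every snapshot and collect the findings."""
    module_list = list(modules)
    findings: List[Finding] = []
    for snap in snapshots:
        for module in module_list:
            result = module.evaluate(snap)
            if result is not None:
                findings.append(result)
    return findings


if __name__ == "__main__":
    samples = [
        Snapshot(timestamp_s=0.0, l1_rate_hz=100_000.0, backpressure=False),
        Snapshot(timestamp_s=2.0, l1_rate_hz=0.0, backpressure=True),
    ]
    for finding in analyse(samples, [StalledDataFlow()]):
        print(finding.timestamp_s, finding.module, finding.advice)
```

Keeping each rule in its own small class means a new failure mode can be covered by adding a module, without touching the loop that feeds snapshots to the modules.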


2020 ◽  
Vol 245 ◽  
pp. 01031
Author(s):  
Thiago Rafael Fernandez Perez Tomei

The CMS experiment has been designed with a two-level trigger system: the Level-1 Trigger, implemented on custom-designed electronics, and the High Level Trigger, a streamlined version of the CMS offline reconstruction software running on a computer farm. During its second phase the LHC will reach a luminosity of 7.5 × 10³⁴ cm⁻² s⁻¹ with a pileup of 200 collisions, producing an integrated luminosity greater than 3000 fb⁻¹ over the full experimental run. To fully exploit the higher luminosity, the CMS experiment will introduce a more advanced Level-1 Trigger and increase the full readout rate from 100 kHz to 750 kHz. CMS is designing an efficient data-processing hardware trigger that will include tracking information and high-granularity calorimeter information. The current Level-1 conceptual design is expected to take full advantage of advances in FPGA and link technologies over the coming years, providing a high-performance, low-latency system for large throughput and sophisticated data correlation across diverse sources. The higher luminosity, event complexity and input rate present an unprecedented challenge to the High Level Trigger, which aims to achieve a similar efficiency and rejection factor as today despite the higher pileup and purer preselection. In this presentation we will discuss the ongoing studies and prospects for the online reconstruction and selection algorithms for the high-luminosity era.


2014 ◽  
Vol 31 ◽  
pp. 1460297 ◽  
Author(s):  
Valentina Gori

The CMS experiment has been designed with a 2-level trigger system: the Level 1 Trigger, implemented on custom-designed electronics, and the High Level Trigger (HLT), a streamlined version of the CMS offline reconstruction software running on a computer farm. A software trigger system requires a tradeoff between the complexity of the algorithms running on the available computing power, the sustainable output rate, and the selection efficiency. Here we will present the performance of the main triggers used during the 2012 data taking, ranging from simpler single-object selections to more complex algorithms combining different objects, and applying analysis-level reconstruction and selection. We will discuss the optimisation of the triggers and the specific techniques to cope with the increasing LHC pile-up, reducing its impact on the physics performance.


2019 ◽  
Vol 214 ◽  
pp. 01006
Author(s):  
Jean-Marc André ◽  
Ulf Behrens ◽  
James Branson ◽  
Philipp Brummer ◽  
Sergio Cittolin ◽  
...  

The data acquisition system (DAQ) of the CMS experiment at the CERN Large Hadron Collider (LHC) assembles events of 2 MB at a rate of 100 kHz. The event builder collects event fragments from about 750 sources and assembles them into complete events, which are then handed to the High-Level Trigger (HLT) processes running on O(1000) computers. The aging event-building hardware will be replaced during Long Shutdown 2 of the LHC, taking place in 2019/20. The future data networks will be based on 100 Gb/s interconnects using Ethernet and Infiniband technologies. More powerful computers may make it possible to combine the currently separate functionality of the readout and builder units into a single I/O processor handling 100 Gb/s of input and output traffic simultaneously. It might be beneficial to preprocess data originating from specific detector parts or regions before handing it to generic HLT processors. Therefore, we will investigate how specialized coprocessors, e.g. GPUs, could be integrated into the event builder. We will present the envisioned changes to the event builder compared to today’s system. Initial measurements of the performance of the data networks under the event-building traffic pattern will be shown. Implications of a folded network architecture for the event building and corresponding changes to the software implementation will be discussed.
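The figures quoted in the abstract above imply an aggregate event-building throughput that is easy to check by hand. The sketch below does the arithmetic, under two assumptions: 1 MB is taken as 10^6 bytes, and "about 750 sources" is treated as exactly 750. The link count it prints is only a lower bound at full utilisation, ignoring protocol overhead and traffic imbalance. It is not a statement about the actual network design.

```python
# Inputs taken from the abstract; the unit convention (1 MB = 10**6 bytes)
# and the exact source count are assumptions made for this illustration.
EVENT_SIZE_BYTES = 2_000_000             # 2 MB per event
EVENT_RATE_HZ = 100_000                  # 100 kHz
N_SOURCES = 750                          # "about 750" fragment sources
LINK_SPEED_BITS_PER_S = 100_000_000_000  # 100 Gb/s interconnect


def aggregate_throughput_bytes_per_s(event_size: int, rate_hz: int) -> int:
    """Total bytes per second flowing through the event builder."""
    return event_size * rate_hz


def mean_fragment_size_bytes(event_size: int, n_sources: int) -> float:
    """Average fragment size if the event were split evenly across sources."""
    return event_size / n_sources


def min_links(throughput_bytes_per_s: int, link_bits_per_s: int) -> int:
    """Smallest number of links that could carry the traffic at 100% load."""
    throughput_bits = throughput_bytes_per_s * 8
    return -(-throughput_bits // link_bits_per_s)  # integer ceiling division


if __name__ == "__main__":
    total = aggregate_throughput_bytes_per_s(EVENT_SIZE_BYTES, EVENT_RATE_HZ)
    print(f"aggregate throughput: {total / 1e9:.0f} GB/s")
    print(f"mean fragment size:   "
          f"{mean_fragment_size_bytes(EVENT_SIZE_BYTES, N_SOURCES):.0f} bytes")
    print(f"minimum 100 Gb/s links: {min_links(total, LINK_SPEED_BITS_PER_S)}")
```

Under these assumptions the throughput comes to 200 GB/s, with fragments averaging a few kilobytes each.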


2019 ◽  
Vol 214 ◽  
pp. 01003
Author(s):  
Sioni Summers ◽  
Andrew Rose

Track reconstruction at the CMS experiment uses the Combinatorial Kalman Filter. The algorithm computation time scales exponentially with pileup, which will pose a problem for the High Level Trigger at the High Luminosity LHC. FPGAs, which are already used extensively in hardware triggers, are becoming more widely used for compute acceleration. With a combination of high performance, energy efficiency, and predictable and low latency, FPGA accelerators are an interesting technology for high energy physics. Here, progress towards porting of the CMS track reconstruction to Maxeler Technologies’ Dataflow Engines is shown, programmed with their high level language MaxJ. The performance is compared to CPUs, and further steps to optimise for the architecture are presented.


2019 ◽  
Vol 214 ◽  
pp. 01047
Author(s):  
Andrew Wightman ◽  
Geoffrey Smith ◽  
Kelci Mohrman ◽  
Charles Mueller

One of the major challenges for the Compact Muon Solenoid (CMS) experiment is the task of reducing the event rate from roughly 40 MHz down to a more manageable 1 kHz while keeping as many interesting physics events as possible. This is accomplished through the use of a Level-1 (L1) hardware-based trigger as well as a software-based High-Level Trigger (HLT). Monitoring and understanding the output rates of the L1 and HLT triggers is of key importance for determining the overall performance of the trigger system and is intimately tied to what type of data is being recorded for physics analyses. We present here a collection of tools used by CMS to monitor the L1 and HLT trigger rates. One of these tools is a script (run in the CMS control room) that gives valuable real-time feedback on trigger rates to the shift crew. Another useful tool is a plotting library that is used for observing how trigger rates vary over a range of beam and detector conditions, in particular how the rates of individual triggers scale with event pile-up.
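Two quantities in the abstract above lend themselves to a short worked example: the overall rate reduction factor, and the dependence of a trigger's rate on pile-up. The sketch below computes the first from the quoted rates and fits a straight line for the second. The linear model is an assumption made for illustration, since the abstract does not say how rates scale. The data points are synthetic placeholders, not CMS measurements, and the function names are hypothetical rather than part of the CMS tools.

```python
from typing import Sequence, Tuple


def reduction_factor(input_rate_hz: float, output_rate_hz: float) -> float:
    """How many input events there are per recorded event."""
    return input_rate_hz / output_rate_hz


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Rates quoted in the abstract: roughly 40 MHz in, about 1 kHz out.
    print(f"reduction factor: {reduction_factor(40e6, 1e3):.0f}")

    # Synthetic placeholder points (pile-up, rate in Hz), NOT real data.
    pileup = [20.0, 30.0, 40.0, 50.0]
    rate_hz = [10.0, 15.0, 20.0, 25.0]
    slope, intercept = fit_line(pileup, rate_hz)
    print(f"fitted rate = {slope:.2f} * pileup + {intercept:.2f} Hz")
```

A trigger whose measured rate departs strongly from such a fit at high pile-up would be a natural candidate for closer inspection, though what counts as a significant departure depends on the trigger.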


2020 ◽  
Vol 245 ◽  
pp. 01025
Author(s):  
Robert Bainbridge

The CMS experiment has recorded a high-purity sample of 10 billion unbiased b hadron decays. The CMS trigger and data acquisition systems were configured to deliver a custom data stream at an average throughput of 2 GB s⁻¹, which was “parked” prior to reconstruction. The data stream was defined by level-1 and high-level trigger algorithms that operated at peak trigger rates in excess of 50 and 5 kHz, respectively. New algorithms have been developed to reconstruct and identify electrons with high efficiency at transverse momenta as low as 0.5 GeV. The trigger strategy and electron reconstruction performance were validated with pilot processing campaigns. The accumulation and reconstruction of this data set, now complete, were delivered without significant impact on the core physics programme of CMS. This unprecedented sample provides a unique opportunity for physics analyses in the flavour sector and beyond.

