DAQExpert the service to increase CMS data-taking efficiency

The Data Acquisition (DAQ) system of the Compact Muon Solenoid (CMS) experiment at the LHC is a complex system responsible for the data readout, event building and recording of accepted events. Its proper functioning plays a critical role in the data-taking efficiency of the CMS experiment. In order to ensure high availability and recover promptly in the event of hardware or software failure of the subsystems, an expert system, the DAQ Expert, has been developed. It aims at improving the data-taking efficiency, reducing human error in operations and minimising the demand on on-call experts. Introduced at the beginning of 2017, it assists the shift crew and the system experts in recovering from operational faults and streamlining the post-mortem analysis and, at the end of Run 2, in triggering fully automatic recovery without human intervention. The DAQ Expert analyses real-time monitoring data originating from the DAQ components and the high-level trigger, updated every few seconds. It pinpoints data-flow problems and recovers from them automatically or after operator approval. We analyse the CMS downtime in the 2018 run, focusing on what was improved with the introduction of automated recovery, and present the challenges and design of encoding expert knowledge into automated recovery jobs. Furthermore, we demonstrate the web-based ReactJS interfaces that ensure effective cooperation between the human operators in the control room and the automated recovery system. We report on the operational experience with automated recovery.

Operational experience with the new CMS DAQ-Expert

EPJ Web of Conferences, DOI 10.1051/epjconf/201921401015, 2019, Vol 214, pp. 01015. Cited By ~ 1

Author(s): Jean-Marc Andre, Ulf Behrens, James Branson, Philipp Brummer, Sergio Cittolin, ...

Keyword(s): Expert Knowledge, Operational Experience, Trigger System, Web Based, High Data, Software Applications, Level Trigger, Level 1, Logic Modules, High Level

The data acquisition (DAQ) system of the Compact Muon Solenoid (CMS) at CERN reads out the detector at the level-1 trigger accept rate of 100 kHz, assembles events with a bandwidth of 200 GB/s, provides these events to the high-level trigger running on a farm of about 30k cores and records the accepted events. Comprising custom-built and cutting-edge commercial hardware and several thousand instances of software applications, the DAQ system is complex in itself and failures cannot be completely excluded. Moreover, problems in the readout of the detectors, in the first-level trigger system or in the high-level trigger may provoke anomalous behaviour of the DAQ system, which sometimes cannot easily be differentiated from a problem in the DAQ system itself. In order to achieve high data-taking efficiency with operators from the entire collaboration and without relying too heavily on the on-call experts, an expert system, the DAQ-Expert, has been developed that can pinpoint the source of most failures and give advice to the shift crew on how to recover in the quickest way. The DAQ-Expert constantly analyzes monitoring data from the DAQ system and the high-level trigger by making use of logic modules written in Java that encapsulate the expert knowledge about potential operational problems. The results of the reasoning are presented to the operator in a web-based dashboard, may trigger sound alerts in the control room and are archived for post-mortem analysis, which is presented in a web-based timeline browser. We present the design of the DAQ-Expert and report on the operational experience since 2017, when it was first put into production.
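
The idea of logic modules that encapsulate expert knowledge can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the abstract states that the real modules are written in Java, and it does not reproduce the DAQ-Expert API. Every name here (`Snapshot`, `Finding`, `no_trigger_rate`, `backpressure`, `analyse`), the monitored fields and the advice texts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Snapshot:
    """One monitoring sample. All field names are illustrative assumptions."""
    l1_rate_hz: float
    hlt_output_rate_hz: float
    backpressured_sources: List[str]


@dataclass
class Finding:
    """Result of one logic module: which condition fired and what to advise."""
    module: str
    advice: str


# A logic module maps a snapshot to a finding, or to None if its condition does not hold.
LogicModule = Callable[[Snapshot], Optional[Finding]]


def no_trigger_rate(snapshot: Snapshot) -> Optional[Finding]:
    """Hypothetical rule: fires when the level-1 accept rate has dropped to zero."""
    if snapshot.l1_rate_hz == 0:
        return Finding("NoTriggerRate", "Check which subsystem is blocking triggers.")
    return None


def backpressure(snapshot: Snapshot) -> Optional[Finding]:
    """Hypothetical rule: fires when any readout source reports backpressure."""
    if snapshot.backpressured_sources:
        names = ", ".join(sorted(snapshot.backpressured_sources))
        return Finding("Backpressure", "Inspect sources: " + names)
    return None


def analyse(snapshot: Snapshot, modules: List[LogicModule]) -> List[Finding]:
    """Run every module on the snapshot and collect the findings that fired."""
    findings = []
    for module in modules:
        finding = module(snapshot)
        if finding is not None:
            findings.append(finding)
    return findings


if __name__ == "__main__":
    sample = Snapshot(l1_rate_hz=0.0, hlt_output_rate_hz=0.0,
                      backpressured_sources=["source_b", "source_a"])
    for finding in analyse(sample, [no_trigger_rate, backpressure]):
        print(finding.module, "->", finding.advice)
```

Keeping each rule as a small independent function means that a new failure mode can be covered by adding one module, without touching the loop that evaluates the snapshot.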

The CMS Trigger Upgrade for the HL-LHC

EPJ Web of Conferences, DOI 10.1051/epjconf/202024501031, 2020, Vol 245, pp. 01031

Author(s): Thiago Rafael Fernandez Perez Tomei

Keyword(s): High Performance, Second Phase, Level Trigger, Cms Experiment, Rejection Factor, Reconstruction Software, Efficient Data, Level 1, High Level, Selection Algorithms

The CMS experiment has been designed with a two-level trigger system: the Level-1 Trigger, implemented on custom-designed electronics, and the High Level Trigger, a streamlined version of the CMS offline reconstruction software running on a computer farm. During its second phase the LHC will reach a luminosity of 7.5 × 10³⁴ cm⁻² s⁻¹ with a pileup of 200 collisions, producing an integrated luminosity greater than 3000 fb⁻¹ over the full experimental run. To fully exploit the higher luminosity, the CMS experiment will introduce a more advanced Level-1 Trigger and increase the full readout rate from 100 kHz to 750 kHz. CMS is designing an efficient data-processing hardware trigger that will include tracking information and high-granularity calorimeter information. The current Level-1 conceptual design is expected to take full advantage of advances in FPGA and link technologies over the coming years, providing a high-performance, low-latency system for large throughput and sophisticated data correlation across diverse sources. The higher luminosity, event complexity and input rate present an unprecedented challenge to the High Level Trigger, which aims to achieve an efficiency and rejection factor similar to today's despite the higher pileup and purer preselection. In this presentation we will discuss the ongoing studies and prospects for the online reconstruction and selection algorithms for the high-luminosity era.

Operational experience with the ALICE High Level Trigger

Journal of Physics Conference Series, DOI 10.1088/1742-6596/396/1/012048, 2012, Vol 396 (1), pp. 012048. Cited By ~ 3

Author(s): Artur Szostak

Keyword(s): Operational Experience, Level Trigger, High Level

The CMS high level trigger

International Journal of Modern Physics Conference Series, DOI 10.1142/s201019451460297x, 2014, Vol 31, pp. 1460297. Cited By ~ 1

Author(s): Valentina Gori

Keyword(s): Trigger System, Output Rate, Computing Power, Level Trigger, Cms Experiment, Pile Up, Reconstruction Software, Physics Performance, Level 1, High Level

The CMS experiment has been designed with a 2-level trigger system: the Level 1 Trigger, implemented on custom-designed electronics, and the High Level Trigger (HLT), a streamlined version of the CMS offline reconstruction software running on a computer farm. A software trigger system requires a tradeoff between the complexity of the algorithms running on the available computing power, the sustainable output rate, and the selection efficiency. Here we will present the performance of the main triggers used during the 2012 data taking, ranging from simpler single-object selections to more complex algorithms combining different objects, and applying analysis-level reconstruction and selection. We will discuss the optimisation of the triggers and the specific techniques to cope with the increasing LHC pile-up, reducing its impact on the physics performance.

The CMS Event-Builder System for LHC Run 3 (2021-23)

EPJ Web of Conferences, DOI 10.1051/epjconf/201921401006, 2019, Vol 214, pp. 01006

Author(s): Jean-Marc André, Ulf Behrens, James Branson, Philipp Brummer, Sergio Cittolin, ...

Keyword(s): Network Architecture, Hadron Collider, Software Implementation, Data Networks, Cern Large Hadron Collider, Traffic Pattern, Level Trigger, Cms Experiment, Future Data, High Level

The data acquisition system (DAQ) of the CMS experiment at the CERN Large Hadron Collider (LHC) assembles events of 2 MB at a rate of 100 kHz. The event builder collects event fragments from about 750 sources and assembles them into complete events, which are then handed to the High-Level Trigger (HLT) processes running on O(1000) computers. The aging event-building hardware will be replaced during Long Shutdown 2 of the LHC, taking place in 2019/20. The future data networks will be based on 100 Gb/s interconnects using Ethernet and InfiniBand technologies. More powerful computers may make it possible to combine the currently separate functionality of the readout and builder units into a single I/O processor handling 100 Gb/s of input and output traffic simultaneously. It might be beneficial to preprocess data originating from specific detector parts or regions before handing it to generic HLT processors. Therefore, we will investigate how specialized coprocessors, e.g. GPUs, could be integrated into the event builder. We will present the envisioned changes to the event builder compared to today’s system. Initial measurements of the performance of the data networks under the event-building traffic pattern will be shown. Implications of a folded network architecture for the event building and corresponding changes to the software implementation will be discussed.
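
The figures quoted in this abstract imply an aggregate event-building throughput and a typical fragment size. The short calculation below is only a back-of-the-envelope sketch: it assumes decimal units (1 MB = 10^6 bytes), exactly 750 sources, and fragments of equal size, none of which the abstract states. The function and constant names are hypothetical.

```python
EVENT_SIZE_BYTES = 2e6      # 2 MB per event, decimal units assumed
EVENT_RATE_HZ = 100e3       # 100 kHz event rate
N_SOURCES = 750             # "about 750 sources", taken as exact for the estimate


def aggregate_throughput(event_size_bytes: float, event_rate_hz: float) -> float:
    """Bytes per second flowing through the event builder."""
    return event_size_bytes * event_rate_hz


def mean_fragment_size(event_size_bytes: float, n_sources: int) -> float:
    """Average fragment size if every source contributed equally (an assumption)."""
    return event_size_bytes / n_sources


if __name__ == "__main__":
    throughput = aggregate_throughput(EVENT_SIZE_BYTES, EVENT_RATE_HZ)
    fragment = mean_fragment_size(EVENT_SIZE_BYTES, N_SOURCES)
    print(f"aggregate throughput: {throughput / 1e9:.0f} GB/s")
    print(f"mean fragment size:   {fragment / 1e3:.1f} kB")
```

Under these assumptions the throughput comes to 200 GB/s and the mean fragment to roughly 2.7 kB. Real fragment sizes vary by detector, so the second number is only an order-of-magnitude guide.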

Kalman Filter track reconstruction on FPGAs for acceleration of the High Level Trigger of the CMS experiment at the HL-LHC

EPJ Web of Conferences, DOI 10.1051/epjconf/201921401003, 2019, Vol 214, pp. 01003

Author(s): Sioni Summers, Andrew Rose

Keyword(s): Kalman Filter, High Performance, High Energy Physics, Computation Time, High Energy, Track Reconstruction, Level Trigger, Cms Experiment, High Level, Energy Physics

Track reconstruction at the CMS experiment uses the Combinatorial Kalman Filter. The algorithm's computation time scales exponentially with pileup, which will pose a problem for the High Level Trigger at the High Luminosity LHC. FPGAs, which are already used extensively in hardware triggers, are becoming more widely used for compute acceleration. With a combination of high performance, energy efficiency, and predictable, low latency, FPGA accelerators are an interesting technology for high energy physics. Here, progress towards porting the CMS track reconstruction to Maxeler Technologies’ Dataflow Engines, programmed with their high-level language MaxJ, is shown. The performance is compared to CPUs, and further steps to optimise for the architecture are presented.

Trigger Rate Monitoring Tools at CMS

EPJ Web of Conferences, DOI 10.1051/epjconf/201921401047, 2019, Vol 214, pp. 01047

Author(s): Andrew Wightman, Geoffrey Smith, Kelci Mohrman, Charles Mueller

Keyword(s): Event Rate, Trigger System, Level Trigger, Cms Experiment, Trigger Rate, Monitoring Tools, Pile Up, Overall Performance, Level 1, High Level

One of the major challenges for the Compact Muon Solenoid (CMS) experiment is reducing the event rate from roughly 40 MHz down to a more manageable 1 kHz while keeping as many interesting physics events as possible. This is accomplished through the use of a hardware-based Level-1 (L1) trigger as well as a software-based High-Level Trigger (HLT). Monitoring and understanding the output rates of the L1 and HLT triggers is of key importance for determining the overall performance of the trigger system and is intimately tied to what type of data is being recorded for physics analyses. We present here a collection of tools used by CMS to monitor the L1 and HLT trigger rates. One of these tools is a script, run in the CMS control room, that gives valuable real-time feedback on trigger rates to the shift crew. Another useful tool is a plotting library that is used for observing how trigger rates vary over a range of beam and detector conditions, in particular how the rates of individual triggers scale with event pile-up.
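
The rate reduction described here can be expressed as rejection factors per trigger level. The sketch below uses the 40 MHz and 1 kHz figures from this abstract. The 100 kHz Level-1 output rate is an assumption taken from the other abstracts on this page, not from this one, and the function and constant names are hypothetical.

```python
def rejection_factor(input_rate_hz: float, output_rate_hz: float) -> float:
    """Ratio of events entering a trigger stage to events leaving it."""
    if output_rate_hz <= 0:
        raise ValueError("output rate must be positive")
    return input_rate_hz / output_rate_hz


COLLISION_RATE_HZ = 40e6    # "roughly 40 MHz" input rate
L1_OUTPUT_RATE_HZ = 100e3   # assumption: L1 accept rate quoted in other abstracts
HLT_OUTPUT_RATE_HZ = 1e3    # "a more manageable 1 kHz"

if __name__ == "__main__":
    l1 = rejection_factor(COLLISION_RATE_HZ, L1_OUTPUT_RATE_HZ)
    hlt = rejection_factor(L1_OUTPUT_RATE_HZ, HLT_OUTPUT_RATE_HZ)
    overall = rejection_factor(COLLISION_RATE_HZ, HLT_OUTPUT_RATE_HZ)
    print(f"L1 rejection:      {l1:.0f}")
    print(f"HLT rejection:     {hlt:.0f}")
    print(f"overall rejection: {overall:.0f}")
```

With these inputs the two stages reject by factors of 400 and 100, which multiply to the overall factor of 40 000.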

Recording and reconstructing 10 billion unbiased b hadron decays in CMS

EPJ Web of Conferences, DOI 10.1051/epjconf/202024501025, 2020, Vol 245, pp. 01025

Author(s): Robert Bainbridge

Keyword(s): Data Stream, High Efficiency, Data Set, Level Trigger, Cms Experiment, Trigger Strategy, Reconstruction Performance, Level 1, High Level, New Algorithms

The CMS experiment has recorded a high-purity sample of 10 billion unbiased b hadron decays. The CMS trigger and data acquisition systems were configured to deliver a custom data stream at an average throughput of 2 GB s⁻¹, which was “parked” prior to reconstruction. The data stream was defined by level-1 and high-level trigger algorithms that operated at peak trigger rates in excess of 50 and 5 kHz, respectively. New algorithms have been developed to reconstruct and identify electrons with high efficiency at transverse momenta as low as 0.5 GeV. The trigger strategy and electron reconstruction performance were validated with pilot processing campaigns. The accumulation and reconstruction of this data set, now complete, were delivered without significant impact on the core physics programme of CMS. This unprecedented sample provides a unique opportunity for physics analyses in the flavour sector and beyond.
