Enhancing searches for resonances with machine learning and moment decomposition

Ouail Kitouni; Benjamin Nachman; Constantin Weisser; Mike Williams

doi:10.1007/jhep04(2021)070

Enhancing searches for resonances with machine learning and moment decomposition

Journal of High Energy Physics ◽

10.1007/jhep04(2021)070 ◽

2021 ◽

Vol 2021 (4) ◽

Author(s):

Ouail Kitouni ◽

Benjamin Nachman ◽

Constantin Weisser ◽

Mike Williams

Keyword(s):

Machine Learning ◽

Loss Function ◽

New Physics ◽

Background Estimation ◽

Localized Structures ◽

False Signal ◽

Resonant Feature

Abstract: A key challenge in searches for resonant new physics is that classifiers trained to enhance potential signals must not induce localized structures. Such structures could result in a false signal when the background is estimated from data using sideband methods. A variety of techniques have been developed to construct classifiers that are independent of the resonant feature (often a mass). Such strategies are sufficient to avoid localized structures, but are not necessary. We develop a new set of tools using a novel moment loss function (Moment Decomposition or MoDe) that relax the assumption of independence without creating structures in the background. By allowing classifiers to be more flexible, we enhance the sensitivity to new physics without compromising the fidelity of the background estimation.
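The distinction the abstract draws between a smooth mass dependence and a localized structure can be illustrated with a simple diagnostic. The sketch below is not the MoDe loss itself; it is a hypothetical post-hoc check, assuming that a classifier cut is applied at a fixed threshold and that a straight line is an adequate model of a "smooth" efficiency. The function names, the binning, and the choice of a linear fit are all assumptions made for illustration.

```python
def efficiency_per_bin(masses, scores, threshold, edges):
    """Fraction of events with score > threshold in each mass bin."""
    n_bins = len(edges) - 1
    passed = [0] * n_bins
    total = [0] * n_bins
    for m, s in zip(masses, scores):
        for i in range(n_bins):
            if edges[i] <= m < edges[i + 1]:
                total[i] += 1
                passed[i] += s > threshold  # bool counts as 0 or 1
                break
    return [p / t if t else 0.0 for p, t in zip(passed, total)]


def linear_fit_residuals(xs, ys):
    """Least-squares straight-line fit of ys against xs; returns residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Toy efficiencies at four mass-bin centres (illustrative numbers only).
centres = [1.0, 2.0, 3.0, 4.0]
smooth = [0.1, 0.2, 0.3, 0.4]   # linear in mass: no localized structure
bumpy = [0.1, 0.2, 0.6, 0.4]    # one bin sticks out: a localized structure

print(max(abs(r) for r in linear_fit_residuals(centres, smooth)))
print(max(abs(r) for r in linear_fit_residuals(centres, bumpy)))
```

A background efficiency that varies linearly with mass is not independent of mass, yet it leaves near-zero residuals here, whereas a single outlying bin produces a large residual of the kind that could mimic a resonance in a sideband fit.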

Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18147534 ◽

2021 ◽

Vol 18 (14) ◽

pp. 7534

Author(s):

Ke Wang ◽

Qingwen Xue ◽

Jian John Lu

Keyword(s):

Machine Learning ◽

High Risk ◽

Loss Function ◽

Class Imbalance ◽

Support Vector ◽

Trajectory Data ◽

Recognition Model ◽

Learning Framework ◽

Sampling Cost ◽

Automated Machine Learning

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Due to the class-imbalanced nature of driving data, high-risk samples, as the minority class, are usually ill-treated by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling, cost-sensitive loss function, and probability calibration to handle the class-imbalance problem in the recognition of risky drivers. The hyperparameters that control sampling ratio and class weight, along with other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed automated learning framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the result of automated machine learning is consistent with manual searching but much more computationally efficient. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of the predicted probability.
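To make the idea of a cost-sensitive cross-entropy loss concrete, here is a minimal sketch assuming the common formulation in which errors on the positive (minority) class are multiplied by a class weight. The abstract does not give the exact form used in the paper, so the function name, the `pos_weight` parameter, and the clipping constant are assumptions.

```python
import math


def cost_sensitive_cross_entropy(y_true, p_pred, pos_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with the positive class up-weighted.

    y_true: iterable of 0/1 labels (1 = minority class, e.g. risky driver).
    p_pred: iterable of predicted probabilities for class 1.
    pos_weight: assumed multiplicative cost for errors on class 1.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)


labels = [1, 0, 0, 0]
probs = [0.3, 0.2, 0.1, 0.4]
print(cost_sensitive_cross_entropy(labels, probs))                  # unweighted
print(cost_sensitive_cross_entropy(labels, probs, pos_weight=5.0))  # weighted
```

With `pos_weight=1.0` this reduces to the ordinary log loss. An inverse-frequency heuristic for a 4.29% minority class would give a weight of roughly 22, but the abstract states that the class weight is tuned by Bayesian optimization rather than fixed in advance.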

Anomaly Detection for Resonant New Physics with Machine Learning

Physical Review Letters ◽

10.1103/physrevlett.121.241803 ◽

2018 ◽

Vol 121 (24) ◽

Cited By ~ 37

Author(s):

Jack Collins ◽

Kiel Howe ◽

Benjamin Nachman

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

New Physics

Utilizing Unsupervised Machine Learning in BSM Physics Searches At The LHC

EPJ Web of Conferences ◽

10.1051/epjconf/202024506021 ◽

2020 ◽

Vol 245 ◽

pp. 06021

Author(s):

Adam Leinweber ◽

Martin White

Keyword(s):

Machine Learning ◽

Large Hadron Collider ◽

Hadron Collider ◽

New Physics ◽

Machine Learning Technique ◽

Unsupervised Machine Learning ◽

Learning Technique ◽

Bsm Physics ◽

Supersymmetric Particles ◽

One Machine

Recent searches for supersymmetric particles at the Large Hadron Collider have been unsuccessful in detecting any BSM physics. This is partially because the exact masses of supersymmetric particles are not known, and as such, searching for them is very difficult. The method broadly used in searching for new physics requires one to optimise on the signal being searched for, potentially suppressing sensitivity to new physics which may actually be present that does not resemble the chosen signal. The problem with this approach is that, in order to detect something with this method, one must already know what to look for. I will showcase one machine-learning technique that can be used to define a “signal-agnostic” search. This is a search that does not make any assumptions about the signal being searched for, allowing it to detect a signal in a more general way. This method is applied to simulated BSM physics data and the results are explored.

A Cost-Sensitive Loss Function for Machine Learning

Database Systems for Advanced Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-319-91455-8_22 ◽

2018 ◽

pp. 255-268 ◽

Cited By ~ 1

Author(s):

Shihong Chen ◽

Xiaoqing Liu ◽

Baiqi Li

Keyword(s):

Machine Learning ◽

Loss Function

Using machine-learning methods to analyze economic loss function of quality management processes

Journal of Physics Conference Series ◽

10.1088/1742-6596/1015/3/032031 ◽

2018 ◽

Vol 1015 ◽

pp. 032031

Author(s):

V A Dzedik ◽

P A Lontsikh

Keyword(s):

Machine Learning ◽

Quality Management ◽

Loss Function ◽

Economic Loss ◽

Learning Methods ◽

Machine Learning Methods ◽

Management Processes

Kernel-Based Machine Learning for Background Estimation of NaI Low-Count Gamma-Ray Spectra

IEEE Transactions on Nuclear Science ◽

10.1109/tns.2013.2260868 ◽

2013 ◽

Vol 60 (3) ◽

pp. 2209-2221 ◽

Cited By ~ 9

Author(s):

M. Alamaniotis ◽

J. Mattingly ◽

L. H. Tsoukalas

Keyword(s):

Machine Learning ◽

Gamma Ray ◽

Background Estimation

Building an NCAA men’s basketball predictive model and quantifying its success

Journal of Quantitative Analysis in Sports ◽

10.1515/jqas-2014-0058 ◽

2015 ◽

Vol 11 (1) ◽

Cited By ~ 4

Author(s):

Michael J. Lopez ◽

Gregory J. Matthews

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Predictive Model ◽

Loss Function ◽

Las Vegas ◽

Statistical Tools ◽

Men's Basketball ◽

Predictive Algorithms ◽

Complex Models ◽

Logistic Regressions

Abstract: Computing and machine learning advancements have led to the creation of many cutting-edge predictive algorithms, some of which have been demonstrated to provide more accurate forecasts than traditional statistical tools. In this manuscript, we provide evidence that the combination of modest statistical methods with informative data can meet or exceed the accuracy of more complex models when it comes to predicting the NCAA men’s basketball tournament. First, we describe a prediction model that merges the point spreads set by Las Vegas sportsbooks with possession-based team efficiency metrics by using logistic regressions. The set of probabilities generated from this model most accurately predicted the 2014 tournament, relative to approximately 400 competing submissions, as judged by the log loss function. Next, we attempt to quantify the degree to which luck played a role in the success of this model by simulating tournament outcomes under different sets of true underlying game probabilities. We estimate that under the most optimistic of game probability scenarios, our entry had roughly a 12% chance of outscoring all competing submissions and just less than a 50% chance of finishing with one of the ten best scores.

Detecting an axion-like particle with machine learning at the LHC

Journal of High Energy Physics ◽

10.1007/jhep11(2021)138 ◽

2021 ◽

Vol 2021 (11) ◽

Author(s):

Jie Ren ◽

Daohan Wang ◽

Lei Wu ◽

Jin Min Yang ◽

Mengchao Zhang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Mass Range ◽

New Physics ◽

Machine Learning Techniques ◽

Global Symmetry ◽

Sequential Decay ◽

Production Processes ◽

Learning Techniques ◽

Photon Coupling

Abstract: Axion-like particles (ALPs) appear in various new physics models with spontaneous global symmetry breaking. When the ALP mass is in the range of MeV to GeV, the cosmology and astrophysics bounds are so far quite weak. In this work, we investigate such light ALPs through the ALP-strahlung production processes pp → W±a, Za with the sequential decay a → γγ at the 14 TeV LHC with an integrated luminosity of 3000 fb⁻¹ (HL-LHC). Building on the concept of jet image which uses calorimeter towers as the pixels of the image and measures a jet as an image, we investigate the potential of machine learning techniques based on convolutional neural network (CNN) to identify the highly boosted ALPs which decay to a pair of highly collimated photons. With the CNN tagging algorithm, we demonstrate that our approach can extend current LHC sensitivity and probe the ALP mass range from 0.3 GeV to 5 GeV. The obtained bounds are stronger than the existing limits on the ALP-photon coupling.
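The jet-image concept mentioned in the abstract, where calorimeter towers act as the pixels of an image, can be sketched as a simple 2D binning of energy deposits. The grid size, the window half-width, the coordinate convention (offsets relative to the jet axis), and the unit-sum normalization below are assumptions for illustration, not the configuration used in the paper; a realistic pipeline would also handle the periodicity of the azimuthal angle and further preprocessing.

```python
def jet_image(towers, n_pix=8, half_width=0.4):
    """Bin calorimeter towers into an n_pix x n_pix energy image.

    towers: iterable of (d_eta, d_phi, energy), where d_eta and d_phi are
            offsets from the jet axis (assumed convention).
    Returns a list of rows whose entries sum to 1 if any energy was binned.
    """
    image = [[0.0] * n_pix for _ in range(n_pix)]
    pixel = 2.0 * half_width / n_pix
    for d_eta, d_phi, energy in towers:
        if abs(d_eta) >= half_width or abs(d_phi) >= half_width:
            continue  # tower lies outside the image window
        i = min(int((d_eta + half_width) / pixel), n_pix - 1)
        j = min(int((d_phi + half_width) / pixel), n_pix - 1)
        image[i][j] += energy
    total = sum(sum(row) for row in image)
    if total > 0.0:
        image = [[value / total for value in row] for row in image]
    return image


# Three toy towers; the last one falls outside the window and is dropped.
toy_towers = [(0.02, 0.02, 3.0), (-0.35, 0.32, 1.0), (1.0, 0.0, 5.0)]
for row in jet_image(toy_towers):
    print(" ".join(f"{value:.2f}" for value in row))
```

An image built this way can be passed to a convolutional network as a single-channel input. Two highly collimated photons would typically deposit energy in the same or neighbouring pixels, which is the kind of pattern the abstract's CNN tagger is trained to recognize.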

Quantifying the Impacts of Pre- and Post-Conception TSH Levels on Birth Outcomes: An Examination of Different Machine Learning Models

Frontiers in Endocrinology ◽

10.3389/fendo.2021.755364 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yuantong Sun ◽

Weiwei Zheng ◽

Ling Zhang ◽

Huijuan Zhao ◽

Xun Li ◽

...

Keyword(s):

Machine Learning ◽

Birth Outcomes ◽

Loss Function ◽

Apgar Score ◽

Thyroid Stimulating Hormone ◽

Adverse Birth Outcomes ◽

Learning Models ◽

Neural Network Models ◽

Tsh Levels ◽

Machine Learning Models

Background: While previous studies identified risk factors for diverse pregnancy outcomes, traditional statistical methods had limited ability to quantify their impacts on birth outcomes precisely. We aimed to use a novel approach that applied different machine learning models not only to predict birth outcomes but also to systematically quantify the impacts of pre- and post-conception serum thyroid-stimulating hormone (TSH) levels and other predictive characteristics on birth outcomes. Methods: We used data from women who gave birth in Shanghai First Maternal and Infant Hospital from 2014 to 2015. We included 14,110 women with a measurement of preconception TSH in the first analysis and 3,428 of these 14,110 women with both pre- and post-conception TSH measurements in the second analysis. Synthetic Minority Over-sampling Technique (SMOTE) was applied to adjust for the imbalance of outcomes. We randomly split (7:3) the data into a training set and a test set in both analyses. To assess model performance, we compared Area Under Curve (AUC) for dichotomous outcomes and macro F1 score for categorical outcomes among four machine learning models: a logistic model, a random forest model, an XGBoost model, and multilayer neural network models. The model with the highest AUC or macro F1 score was used to quantify the importance of predictive features for adverse birth outcomes with the loss function algorithm. Results: The XGBoost model provided prominent advantages in terms of improved performance and prediction of polytomous variables. Predictive models with abnormal preconception TSH or not-well-controlled TSH, a novel indicator combining pre- and post-conception TSH levels, provided similarly robust predictions of birth outcomes. The highest AUC, 98.7%, was achieved by the XGBoost model for predicting low Apgar score with not-well-controlled TSH adjusted. By the loss function algorithm, we found that not-well-controlled TSH ranked 4th, 6th, and 7th among 14 features, respectively, in predicting birthweight, induction, and preterm birth, and 3rd among 19 features in predicting low Apgar score. Conclusions: Our four machine learning models offered valid predictions of birth outcomes in women during pre- and post-conception. The predictive features panel suggested the combined TSH indicator (not-well-controlled TSH) could be a potentially competitive biomarker to predict adverse birth outcomes.
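The core step of SMOTE, as it is commonly described, creates a synthetic minority sample by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The sketch below illustrates only that interpolation step in plain Python; it is not the implementation used in the study, and the function name, the default of `k=2` neighbours, and the fixed seed are assumptions. It requires at least two minority samples.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating minority samples.

    minority: list of equal-length numeric tuples (at least two samples).
    k: number of nearest minority neighbours to choose from (assumed value).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x by squared Euclidean distance.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


minority_class = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
for point in smote_like_oversample(minority_class, n_new=5):
    print(point)
```

Every synthetic point lies on a line segment between two real minority samples, so the oversampled class fills in the region the minority class already occupies rather than extrapolating beyond it.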

Search for R-parity-violating supersymmetry in a final state containing leptons and many jets with the ATLAS experiment using $$\sqrt{s} = 13\,\text{TeV}$$ proton–proton collision data

The European Physical Journal C ◽

10.1140/epjc/s10052-021-09761-x ◽

2021 ◽

Vol 81 (11) ◽

Author(s):

G. Aad ◽

B. Abbott ◽

D. C. Abbott ◽

A. Abed Abud ◽

K. Abeling ◽

...

Keyword(s):

Machine Learning ◽

Hadron Collider ◽

Data Driven ◽

Machine Learning Techniques ◽

Proton Collision ◽

Top Squark ◽

Proton Proton ◽

Atlas Experiment ◽

Background Estimation ◽

Proton Proton Collision

Abstract: A search for R-parity-violating supersymmetry in final states characterized by high jet multiplicity, at least one isolated light lepton and either zero or at least three b-tagged jets is presented. The search uses $$139\,\text{fb}^{-1}$$ of $$\sqrt{s} = 13\,\text{TeV}$$ proton–proton collision data collected by the ATLAS experiment during Run 2 of the Large Hadron Collider. The results are interpreted in the context of R-parity-violating supersymmetry models that feature gluino production, top-squark production, or electroweakino production. The dominant sources of background are estimated using a data-driven model, based on observables at medium jet multiplicity, to predict the b-tagged jet multiplicity distribution at the higher jet multiplicities used in the search. Machine-learning techniques are used to reach sensitivity to electroweakino production, extending the data-driven background estimation to the shape of the machine-learning discriminant. No significant excess over the Standard Model expectation is observed and exclusion limits at the 95% confidence level are extracted, reaching as high as 2.4 TeV in gluino mass, 1.35 TeV in top-squark mass, and 320 (365) GeV in higgsino (wino) mass.
