Heat Prediction of High Energy Physical Data Based on LSTM Recurrent Neural Network

2020 · Vol 245 · pp. 04002
Author(s): Zhenjing Cheng, Lu Wang, Yaodong Cheng, Gang Chen

High-energy physics computing is a typical data-intensive workload: petabytes of data need to be analyzed every year, and the demands on data access performance keep growing. Tiered storage systems that present a unified namespace have therefore been widely adopted. In such systems, data are placed on storage devices of different performance and cost according to how frequently they are accessed, and when the heat (access popularity) of the data changes, the data are migrated to the appropriate storage tier. At present, heuristic algorithms based on human experience are widely used for data heat prediction, but because different users have different computing models, their prediction accuracy is low. We propose a method that predicts future access popularity from file access characteristics using an LSTM deep learning model and use it as the basis for data migration in hierarchical storage. This paper uses real data from the high-energy physics experiment LHAASO for comparative testing. The results show that, under the same test conditions, the proposed model achieves higher prediction accuracy and broader applicability than existing prediction models.
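As an illustration of the kind of model the abstract describes, the sketch below shows a minimal LSTM regressor that maps a file's recent access history to a predicted future access heat. This is not the authors' code; the feature layout (daily access count, unique users, bytes read, file age) and all names are illustrative assumptions, and PyTorch is used for convenience.

```python
# Minimal sketch: predict a file's future "heat" from its recent daily access history.
import torch
import torch.nn as nn

class AccessHeatLSTM(nn.Module):
    def __init__(self, n_features: int = 4, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # predicted heat for the next time window

    def forward(self, x):                  # x: (batch, days, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # regress from the last hidden state

# Toy usage: 128 files, 30 days of history, 4 hypothetical per-day features
# (access count, unique users, bytes read, days since creation).
model = AccessHeatLSTM()
history = torch.randn(128, 30, 4)
predicted_heat = model(history)            # shape (128, 1)
loss = nn.MSELoss()(predicted_heat, torch.randn(128, 1))
loss.backward()
```

In a hierarchical storage setting, the predicted heat would then be thresholded to decide which tier each file should be migrated to.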

2020 · Vol 245 · pp. 06042
Author(s): Oliver Gutsche, Igor Mandrichenko

A columnar data representation is known to be an efficient way to store data, particularly when an analysis typically uses only a small fragment of the available data structures. A data representation like Apache Parquet goes a step further than a plain columnar layout by also splitting data horizontally, which makes parallelization of data analysis easy. Based on the general idea of columnar data storage, and working on the [LDRD Project], we have developed a striped data representation which, we believe, is better suited to the needs of High Energy Physics data analysis. A traditional columnar approach allows for efficient analysis of complex data structures; while keeping all the benefits of columnar data representations, the striped mechanism goes further by enabling easy parallelization of computations without requiring special hardware. We present an implementation and some performance characteristics of such a data representation mechanism using a distributed NoSQL database or a local file system, unified under the same API and data representation model. The representation is efficient and at the same time simple enough to allow a common data model and API for a wide range of underlying storage mechanisms, such as distributed NoSQL databases and local file systems. Striped storage adopts NumPy arrays as its basic data representation format, which makes it easy and efficient to use from Python applications. The Striped Data Server is a web service that hides the server implementation details from the end user, exposes data easily to WAN users, and allows well-established data caching solutions to be used to further increase data access efficiency. We consider the Striped Data Server to be the core of an enterprise-scale data analysis platform for High Energy Physics and similar areas of data processing. We have been testing this architecture with a 2 TB dataset from a CMS dark matter search and plan to expand it to the multi-100 TB or even PB scale. We present the striped format, the Striped Data Server architecture, and performance test results.
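The sketch below illustrates the striping idea in plain NumPy: each column of an event record is stored as an array, and the arrays are cut into fixed-size stripes that can be analysed independently and in parallel. The function and parameter names (make_stripes, stripe_size) and the toy columns are illustrative assumptions, not the Striped Data Server API.

```python
# Minimal sketch of columnar data split into independent "stripes".
import numpy as np

def make_stripes(columns: dict, stripe_size: int):
    """Split a columnar dataset {name: 1-D array} into a list of stripes."""
    n_events = len(next(iter(columns.values())))
    stripes = []
    for start in range(0, n_events, stripe_size):
        stripes.append({name: arr[start:start + stripe_size]
                        for name, arr in columns.items()})
    return stripes

# Toy dataset: per-event missing ET and jet multiplicity stored as columns.
data = {
    "met":    np.random.exponential(50.0, size=1_000_000).astype(np.float32),
    "n_jets": np.random.poisson(4, size=1_000_000).astype(np.int16),
}
stripes = make_stripes(data, stripe_size=100_000)

# Each stripe can now be analysed independently, e.g. in a worker process:
partial_sums = [s["met"][s["n_jets"] >= 4].sum() for s in stripes]
print(sum(partial_sums) / 1_000_000)
```

Because each stripe holds plain contiguous NumPy arrays, a worker only needs the columns it actually uses, which is the property the abstract attributes to columnar and striped layouts.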


2017 · Vol 898 · pp. 062003
Author(s): Qiulan Huang, Ran Du, YaoDong Cheng, Jingyan Shi, Gang Chen, ...

Author(s): José Manuel Clavijo Columbié, Paul Glaysher, Jenia Jitsev, Judith Maria Katzy

We apply adversarial domain adaptation to reduce sample bias in a classification machine learning algorithm. We add a gradient reversal layer to a neural network so that it simultaneously classifies signal versus background events while minimising the difference in the classifier response between the nominal background sample and one generated with an alternative MC model. We demonstrate this on simulated LHC events, classifying a $t\bar{t}H$ signal against a $t\bar{t}b\bar{b}$ background.
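A minimal sketch of a gradient reversal layer of the kind described above is shown below. PyTorch, the network sizes, and the binary "MC model" domain label are assumptions for illustration, not the paper's setup: the forward pass is the identity, while the backward pass flips the sign of the gradient so that the shared features become insensitive to which MC model generated the background.

```python
# Minimal sketch of adversarial domain adaptation via gradient reversal.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)            # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # reversed gradient on the backward pass

class AdversarialClassifier(nn.Module):
    def __init__(self, n_inputs: int, lam: float = 1.0):
        super().__init__()
        self.lam = lam
        self.features = nn.Sequential(nn.Linear(n_inputs, 64), nn.ReLU(),
                                      nn.Linear(64, 64), nn.ReLU())
        self.classifier = nn.Linear(64, 1)   # signal vs background logit
        self.adversary = nn.Linear(64, 1)    # which MC model produced the event

    def forward(self, x):
        f = self.features(x)
        class_logit = self.classifier(f)
        domain_logit = self.adversary(GradReverse.apply(f, self.lam))
        return class_logit, domain_logit
```

The total loss would combine the classification loss on labelled events with the domain loss on background events from the two generators; the reversed gradient pushes the shared features towards MC-model independence.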


1995 · Vol 06 (04) · pp. 579-584
Author(s): Clark S. Lindsey, Thomas Lindblad, Givi Sekhniaidze, G. Székely, M. Minerskjöld

The new IBM Zero Instruction Set Computer (ZISC) provides a radial basis function (RBF) neural network. The first-generation chip (ZISC036) allows for 64 8-bit inputs, 36 RBF neurons in the middle layer, and up to 16383 possible output categories. Forward processing takes 4 μs with a 20 MHz clock. Cascading multiple chips increases the number of available RBF neurons with no increase in processing time. The chip also executes a learning algorithm. We report on tests of the ZISC on a task related to high energy physics.
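For readers unfamiliar with this style of network, the sketch below emulates a ZISC-like prototype classifier in NumPy: stored prototypes with an influence radius and a category, L1 (Manhattan) distance matching, and a simplified learning rule that shrinks the radius of conflicting prototypes. Class and parameter names are illustrative assumptions; the chip's actual learning algorithm is more involved than this sketch.

```python
# Minimal sketch of RBF/prototype classification in the spirit of the ZISC036.
import numpy as np

class RBFPrototypeClassifier:
    def __init__(self, max_radius: int = 4096):
        self.centres, self.radii, self.categories = [], [], []
        self.max_radius = max_radius

    def learn(self, pattern: np.ndarray, category: int):
        """Store a prototype; shrink radii of conflicting neighbours (simplified)."""
        pattern = pattern.astype(np.int32)
        for i, (c, cat) in enumerate(zip(self.centres, self.categories)):
            d = int(np.abs(c - pattern).sum())       # L1 (Manhattan) distance
            if cat != category and d < self.radii[i]:
                self.radii[i] = d                    # avoid overlap with the other class
        self.centres.append(pattern)
        self.radii.append(self.max_radius)
        self.categories.append(category)

    def classify(self, pattern: np.ndarray):
        """Return the category of the closest firing prototype, or None."""
        pattern = pattern.astype(np.int32)
        best, best_d = None, None
        for c, r, cat in zip(self.centres, self.radii, self.categories):
            d = int(np.abs(c - pattern).sum())
            if d < r and (best_d is None or d < best_d):
                best, best_d = cat, d
        return best

# Toy usage with 64 8-bit inputs, as on the ZISC036:
clf = RBFPrototypeClassifier()
clf.learn(np.random.randint(0, 256, 64), category=1)
clf.learn(np.random.randint(0, 256, 64), category=2)
print(clf.classify(np.random.randint(0, 256, 64)))
```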


Author(s): Preeti Kumari, Kavita Lalwani, Ranjit Dalal, Ashutosh Bhardwaj, ...
