Machine Learning to Design an Auto-tuning System for the Best Compressed Format Detection for Parallel Sparse Computations

Author(s):  
Olfa Hamdi-Larbi ◽  
Ichrak Mehrez ◽  
Thomas Dufaud

Many applications in scientific computing process very large sparse matrices on parallel architectures. The work presented in this paper is part of a project whose general aim is to develop an auto-tuning system that selects the best matrix compression format in the context of high-performance computing. The target smart system automatically selects the best compression format for a given sparse matrix, the numerical method processing this matrix, the parallel programming model, and the target architecture. This paper describes the design and implementation of the proposed concept. We consider a case study consisting of a numerical method reduced to the sparse matrix-vector product (SpMV), several compression formats, the data-parallel programming model, and a distributed multi-core platform as the target architecture. This study allows us to extract a set of important novel metrics and parameters that are relative to the considered programming model. These metrics are used as input to a machine learning algorithm that predicts the best matrix compression format. An experimental study targeting a distributed multi-core platform and processing random and real-world matrices shows that our system improves the accuracy of the machine learning prediction by up to 7% on average.
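To illustrate the idea of format prediction from matrix metrics, the following is a minimal sketch; the feature set, the candidate formats, and the random-forest learner are illustrative assumptions, not the paper's actual metrics (which also depend on the programming model and platform).

```python
# Sketch: predict the best sparse compression format from structural metrics.
import numpy as np
import scipy.sparse as sp
from sklearn.ensemble import RandomForestClassifier

FORMATS = ["CSR", "CSC", "COO", "DIA", "ELL"]  # hypothetical candidate set

def features(A):
    """Simple structural metrics of a sparse matrix (hypothetical feature set)."""
    A = A.tocsr()
    nnz_per_row = np.diff(A.indptr)
    return [
        A.shape[0], A.shape[1], A.nnz,
        A.nnz / (A.shape[0] * A.shape[1]),     # density
        nnz_per_row.mean(), nnz_per_row.std(), nnz_per_row.max(),
    ]

def train_format_predictor(matrices, best_formats):
    """best_formats[i] is the format that gave the fastest SpMV for matrices[i],
    determined offline by benchmarking each candidate format."""
    X = np.array([features(A) for A in matrices])
    y = np.array([FORMATS.index(f) for f in best_formats])
    return RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

def predict_format(clf, A):
    return FORMATS[int(clf.predict([features(A)])[0])]
```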

Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 656
Author(s):  
Xavier Larriva-Novo ◽  
Víctor A. Villagrá ◽  
Mario Vega-Barbas ◽  
Diego Rivera ◽  
Mario Sanz Rodrigo

Security in IoT networks is currently mandatory, due to the large amount of data these systems have to handle. They are vulnerable to a growing number of increasingly sophisticated cybersecurity attacks. For this reason, new intrusion detection techniques have to be developed that are as accurate as possible for these scenarios. Intrusion detection systems based on machine learning algorithms have already shown high accuracy. This research proposes the study and evaluation of several preprocessing techniques based on traffic categorization for a machine learning neural network algorithm. For its evaluation, this research uses two benchmark datasets, UGR16 and UNSW-NB15, as well as one of the most widely used datasets, KDD99. The preprocessing techniques were evaluated with scaling and normalization functions. All of these preprocessing models were applied to different sets of characteristics based on a categorization composed of four groups of features: basic connection features, content characteristics, statistical characteristics and, finally, a group composed of traffic-based features and connection direction-based traffic characteristics. The objective of this research is to evaluate this categorization by using various data preprocessing techniques to obtain the most accurate model. Our proposal shows that, by applying the categorization of network traffic together with several preprocessing techniques, the accuracy can be enhanced by up to 45%. Preprocessing a specific group of characteristics yields greater accuracy, allowing the machine learning algorithm to correctly classify the parameters related to possible attacks.
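A minimal sketch of the evaluation setup follows; the column names, the scaler assigned to each feature group, and the MLP size are assumptions made for illustration, since the paper compares several scaling/normalization functions over its four feature groups.

```python
# Sketch: per-group preprocessing of (already numeric) traffic features
# feeding a neural-network classifier.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

basic_cols   = ["duration", "src_bytes", "dst_bytes"]                 # hypothetical
content_cols = ["num_failed_logins", "logged_in", "num_file_creations"]
stat_cols    = ["count", "srv_count", "serror_rate"]
traffic_cols = ["dst_host_count", "dst_host_srv_count"]

preprocess = ColumnTransformer([
    ("basic",   StandardScaler(), basic_cols),
    ("content", MinMaxScaler(),   content_cols),
    ("stat",    StandardScaler(), stat_cols),
    ("traffic", MinMaxScaler(),   traffic_cols),
])

model = Pipeline([
    ("prep", preprocess),
    ("mlp", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300)),
])
# model.fit(X_train, y_train); model.score(X_test, y_test)  # X_* as DataFrames
```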


Author(s):  
Simon McIntosh–Smith ◽  
Rob Hunt ◽  
James Price ◽  
Alex Warwick Vesztrocy

High-performance computing systems continue to increase in size in the quest for ever higher performance. The resulting increase in electronic component count, coupled with the decrease in feature sizes of the silicon manufacturing processes used to build these components, may make future exascale systems more susceptible to soft errors caused by cosmic radiation than current high-performance computing systems. Through the use of techniques such as hardware-based error-correcting codes and checkpoint-restart, many of these faults can be mitigated, but at the cost of increased hardware overhead, run time, and energy consumption that can be as much as 10–20%. Some predictions expect these overheads to continue to grow over time. For extreme-scale systems, these overheads will represent megawatts of power consumption and millions of dollars of additional hardware cost, which could potentially be avoided with more sophisticated fault-tolerance techniques. In this paper we present new software-based fault-tolerance techniques that can be applied to one of the most important classes of software in high-performance computing: iterative sparse matrix solvers. Our new techniques enable us to exploit knowledge of the structure of sparse matrices in such a way as to improve the performance, energy efficiency, and fault tolerance of the overall solution.
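As a flavor of software-based fault detection in an iterative sparse solver, the sketch below uses a common checksum relation on the SpMV (for y = A x, the identity sum(y) = (A^T 1) · x); this is a generic ABFT-style check under stated assumptions, not necessarily the paper's exact technique.

```python
# Sketch: conjugate gradient in which every SpMV is verified by a checksum,
# so a transient soft error triggers a local recomputation instead of a
# full checkpoint-restart.
import numpy as np
import scipy.sparse as sp

def checked_spmv(A, x, w, rtol=1e-10):
    y = A @ x
    if not np.isclose(y.sum(), w @ x, rtol=rtol):
        y = A @ x  # checksum mismatch: assume a transient fault and retry
    return y

def cg_with_checks(A, b, tol=1e-8, maxit=1000):
    """Plain CG; w = A^T * ones is precomputed once for the checksum test."""
    w = np.asarray(A.sum(axis=0)).ravel()
    x = np.zeros_like(b)
    r = b - checked_spmv(A, x, w)
    p = r.copy()
    rs = r @ r
    for _ in range(maxit):
        Ap = checked_spmv(A, p, w)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```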


2021 ◽  
Author(s):  
Inger Persson ◽  
Andreas Östling ◽  
Martin Arlbrandt ◽  
Joakim Söderberg ◽  
David Becedas

BACKGROUND Despite decades of research, sepsis remains a leading cause of mortality and morbidity in ICUs worldwide. The key to effective management and good patient outcomes is early detection, yet no prospectively validated machine learning prediction algorithm is available for clinical use in Europe today.
OBJECTIVE To develop a high-performance machine learning sepsis prediction algorithm based on routinely collected ICU data, designed to be implemented in Europe.
METHODS The machine learning algorithm is a convolutional neural network developed on the Massachusetts Institute of Technology Lab for Computational Physiology MIMIC-III Clinical Database, focusing on ICU patients aged 18 years or older. Twenty variables are used for prediction, on an hourly basis. Onset of sepsis is defined in accordance with the international Sepsis-3 criteria.
RESULTS The developed algorithm, NAVOY Sepsis, uses 4 hours of input and can predict, with high accuracy, patients at high risk of developing sepsis in the coming hours. The prediction performance is superior to that of existing sepsis early warning scoring systems and competes well with previously published prediction algorithms designed to predict sepsis onset in accordance with the Sepsis-3 criteria, as measured by the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). NAVOY Sepsis yields AUROC = 0.90 and AUPRC = 0.62 for predictions up to 3 hours before sepsis onset. The predictive performance is externally validated on hold-out test data, where NAVOY Sepsis is confirmed to predict sepsis with high accuracy.
CONCLUSIONS An algorithm with excellent predictive properties has been developed, based on variables routinely collected in ICUs. This algorithm will be further validated in an ongoing prospective randomized clinical trial and will be CE marked as Software as a Medical Device, designed for commercial use in European ICUs.
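For concreteness, here is a minimal sketch of a 1D convolutional model over a 4-hour window of 20 hourly variables; the architecture, layer sizes, and output horizon are assumptions for illustration, as the abstract does not specify the NAVOY Sepsis network.

```python
# Sketch: 1D CNN producing a sepsis-risk probability from a short ICU time window.
import torch
import torch.nn as nn

class SepsisCNN(nn.Module):
    def __init__(self, n_features=20, window=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=2), nn.ReLU(),  # convolve over time
            nn.Conv1d(32, 64, kernel_size=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * (window - 2), 1),   # logit for "sepsis within N hours"
        )

    def forward(self, x):              # x: (batch, n_features, window)
        return self.net(x)

model = SepsisCNN()
x = torch.randn(8, 20, 4)              # 8 patients, 20 variables, 4 hourly time steps
risk = torch.sigmoid(model(x))         # predicted probability of upcoming sepsis onset
```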


2021 ◽  
Author(s):  
Ekaterina Gurina ◽  
Ksenia Antipova ◽  
Nikita Klyuchnikov ◽  
Dmitry Koroteev

Prediction of drilling accidents is an important task in well construction. Drilling support software allows operators to observe the drilling parameters of multiple wells at the same time, and artificial intelligence helps detect precursors of a drilling accident ahead of the emergency situation. We present a machine learning (ML) algorithm for the prediction of accidents such as stuck pipe, mud loss, fluid show, washout, drill string break, and shale collar. The model for forecasting drilling accidents is based on the "bag-of-features" approach, which uses distributions of the directly recorded data as the main features. Bag-of-features labels small segments of data with a particular symbol, called a codeword; a histogram of codewords built over a data segment can then be used as input to the machine learning algorithm. Fragments of real-time mud log data were used to create the model. We define more than 1000 drilling accident precursors for more than 60 real accidents and about 2500 normal drilling cases as a training set for the ML model. The developed model analyzes real-time mud log data and calculates the probability of an accident. The result is presented as a probability curve for each type of accident; if the critical probability value is exceeded, the user is notified of the risk of an accident. The bag-of-features model shows high performance when validated both on historical data and in real time. The prediction quality does not vary from field to field, so the model can be used in different fields without additional training. The software utilizing the ML model has a microservice architecture and is integrated with the WITSML data server. It is capable of real-time accident forecasting without human intervention. As a result, the system notifies the user whenever the situation in the well becomes similar to a pre-accident one, and the engineer has enough time to take the necessary actions to prevent an accident.
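The sketch below illustrates the bag-of-features idea described above; the window length, codebook size, and gradient-boosting classifier are illustrative choices rather than the paper's exact configuration.

```python
# Sketch: codeword histogram ("bag of features") over mud-log fragments,
# then a supervised classifier over the histograms.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier

def windows(fragment, w=16):
    """Split a multichannel mud-log fragment (time x channels) into flat windows."""
    n = (len(fragment) // w) * w
    return fragment[:n].reshape(-1, w * fragment.shape[1])

def fit_codebook(fragments, n_codewords=64):
    parts = np.vstack([windows(f) for f in fragments])
    return KMeans(n_clusters=n_codewords, n_init=10).fit(parts)

def bag_of_features(fragment, codebook):
    """Normalized histogram of codeword occurrences within one data segment."""
    codes = codebook.predict(windows(fragment))
    hist = np.bincount(codes, minlength=codebook.n_clusters)
    return hist / max(hist.sum(), 1)

# codebook = fit_codebook(train_fragments)
# X = np.array([bag_of_features(f, codebook) for f in train_fragments])
# clf = GradientBoostingClassifier().fit(X, labels)  # labels: accident type / normal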


Author(s):  
Lena Oden ◽  
Holger Fröning

Due to their massive parallelism and high performance per watt, GPUs have gained high popularity in high-performance computing and are a strong candidate for future exascale systems. But communication and data transfer in GPU-accelerated systems remain a challenging problem. Since the GPU is normally not able to control a network device, a hybrid programming model is preferred, in which the GPU performs the computation and the CPU handles the communication. As a result, communication between distributed GPUs suffers from unnecessary overhead introduced by switching control flow from GPUs to CPUs and vice versa. Furthermore, a dedicated CPU thread is often required to control GPU-related communication. In this work, we modify user-space libraries and device drivers of GPUs and of the InfiniBand network device so that the GPU can control an InfiniBand network device and independently source and sink communication requests without any CPU involvement. Our results show that complex networking protocols such as InfiniBand Verbs are better handled by CPUs, since the overhead of work-request generation cannot be parallelized and is not suited to the highly parallel programming model of GPUs. The massive number of instructions and accesses to host memory that are required to source and sink a communication request on the GPU degrades performance. Only through a massive reduction in the complexity of the InfiniBand protocol can some performance improvements be achieved.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
T. Bagni ◽  
G. Bovone ◽  
A. Rack ◽  
D. Mauro ◽  
C. Barth ◽  
...  

The electro-mechanical and electro-thermal properties of high-performance Restacked-Rod-Process (RRP) Nb3Sn wires are key factors in the realization of compact magnets above 15 T for future particle physics experiments. Combining X-ray micro-tomography with an unsupervised machine learning algorithm, we provide a new tool capable of studying the internal features of RRP wires and unlocking different approaches to enhance their performance. Such a tool is ideal for characterizing the distribution and morphology of the voids that are generated during the heat treatment necessary to form the Nb3Sn superconducting phase. Two different types of voids can be detected in this type of wire: one inside the copper matrix and the other inside the Nb3Sn sub-elements. The former type can be related to Sn leaking from the sub-elements into the copper matrix, which leads to poor electro-thermal stability of the whole wire. The second type is detrimental to the electro-mechanical performance of the wires, as superconducting wires experience large electromagnetic stresses under high-field, high-current conditions. We analyze these aspects thoroughly and discuss the potential of the X-ray tomography analysis tool to help model and predict the electro-mechanical and electro-thermal behavior of RRP wires and to optimize their design.
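As a rough illustration of unsupervised void segmentation in a reconstructed tomography volume, the sketch below clusters voxel grey values and measures the resulting connected voids; the three-class k-means step and the morphology statistics are assumptions, and the paper's actual pipeline may differ.

```python
# Sketch: segment voids by clustering voxel intensities, then measure void sizes.
import numpy as np
from sklearn.cluster import KMeans
from scipy import ndimage

def segment_voids(volume, n_classes=3):
    """Cluster grey values into classes and take the darkest class as voids."""
    km = KMeans(n_clusters=n_classes, n_init=10).fit(volume.reshape(-1, 1))
    void_label = int(np.argmin(km.cluster_centers_))
    return (km.labels_ == void_label).reshape(volume.shape)

def void_statistics(void_mask, voxel_size_um=1.0):
    """Label connected voids and report their count and volumes."""
    labels, n_voids = ndimage.label(void_mask)
    volumes = ndimage.sum(void_mask, labels, index=range(1, n_voids + 1))
    return n_voids, np.asarray(volumes) * voxel_size_um**3

# mask = segment_voids(reconstructed_volume)          # 3D array of grey values
# n, volumes_um3 = void_statistics(mask, voxel_size_um=0.65)
```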


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Maryam Abedi ◽  
Hamid Reza Marateb ◽  
Mohammad Reza Mohebian ◽  
Seyed Hamid Aghaee-Bakhtiari ◽  
Seyed Mahdi Nassiri ◽  
...  

Diabetic nephropathy (DN), the leading cause of end-stage renal disease, has become a massive global health burden. Despite considerable efforts, the underlying mechanisms have not yet been comprehensively understood. In this study, a systematic approach was utilized to identify the microRNA signature of DN and to introduce novel drug targets (DTs) for DN. Using microarray profiling followed by qPCR confirmation, 13 and 6 differentially expressed (DE) microRNAs were identified in the kidney cortex and medulla, respectively. The microRNA-target interaction networks for each anatomical compartment were constructed and central nodes were identified. Moreover, enrichment analysis was performed to identify key signaling pathways. To develop a strategy for DT prediction, the human proteome was annotated with 65 biochemical characteristics and 23 network topology parameters. Furthermore, all proteins targeted by at least one FDA-approved drug were identified. Next, mGMDH-AFS, a high-performance machine learning algorithm capable of tolerating massively imbalanced class sizes, was developed to classify DT and non-DT proteins. The sensitivity, specificity, accuracy, and precision of the proposed method were 90%, 86%, 88%, and 89%, respectively. Moreover, it significantly outperformed the state of the art (P-value ≤ 0.05) and showed very good diagnostic accuracy and high agreement between predicted and observed class labels. The cortex and medulla networks were then analyzed with this validated model to identify potential DTs. Among the high-ranking DT candidates are Egfr, Prkce, Clic5, Kit, and Agtr1a, the last of which is a well-known current target in DN. In conclusion, a combination of experimental and computational approaches was exploited to provide holistic insight into the disorder and to introduce novel therapeutic targets.
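For intuition about the classification step, here is a generic sketch of DT versus non-DT prediction on a heavily imbalanced protein dataset, reporting the same metrics as the paper; a class-weighted random forest stands in for mGMDH-AFS, whose implementation is not given in the abstract.

```python
# Sketch: class-weighted classification of drug-target (DT) vs non-DT proteins
# from biochemical and network-topology features, with the paper's four metrics.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

def evaluate_dt_classifier(X, y):
    """X: feature matrix (biochemical + topology features per protein);
    y: 1 if the protein is targeted by at least one FDA-approved drug, else 0."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, stratify=y, test_size=0.2, random_state=0)
    clf = RandomForestClassifier(
        n_estimators=300, class_weight="balanced", random_state=0).fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "precision":   tp / (tp + fp),
    }, clf
```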

