Predicting file lifetimes for data placement in multi-tiered storage systems for HPC

2021 ◽  
Vol 55 (1) ◽  
pp. 99-107
Author(s):  
Luis Thomas ◽  
Sebastien Gougeaud ◽  
Stéphane Rubini ◽  
Philippe Deniel ◽  
Jalil Boukhobza

The emergence of exascale machines in HPC will put more pressure on the storage systems in place, not only in terms of capacity but also of bandwidth and latency. With a limited budget, building such systems solely from storage class memory is not realistic, which leads to the use of a heterogeneous, tiered storage hierarchy. To make the most efficient use of the high-performance tier in this hierarchy, user data must be placed on the right tier at the right time. In this paper, we assume a two-tier storage hierarchy with a high-performance tier and a high-capacity archival tier. Files are placed on the high-performance tier at creation time and moved to the capacity tier once their lifetime expires, that is, once they are no longer accessed. The main contribution of this paper lies in the design of a file lifetime prediction model based solely on the file's path, using a Convolutional Neural Network. Results show that our solution strikes a good trade-off between accuracy and under-estimation. Our model reaches an accuracy close to that of previous work (around 98.60% compared to 98.84%) while reducing under-estimations by almost 10x, to 2.21% (compared to 21.86%). The reduction in under-estimations is crucial, as it avoids misplacing files on the capacity tier while they are still in use.
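The abstract does not include code; as a rough, hypothetical sketch of the general idea (not the authors' actual architecture), a character-level 1-D CNN can map a raw path string to a coarse lifetime class. The vocabulary, layer sizes, lifetime buckets, and example path below are illustrative assumptions.

```python
# Hypothetical sketch: character-level 1-D CNN that maps a file path to a
# coarse lifetime class (e.g. hours / days / weeks). Not the paper's model;
# architecture, vocabulary and class buckets are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN = 128                      # paths truncated/padded to this length
VOCAB = sorted(set("abcdefghijklmnopqrstuvwxyz0123456789/_.-"))
CHAR2ID = {c: i + 1 for i, c in enumerate(VOCAB)}   # 0 reserved for padding
NUM_CLASSES = 4                    # assumed lifetime buckets

def encode_path(path: str) -> list[int]:
    """Lower-case the path and map each character to an integer id."""
    ids = [CHAR2ID.get(c, 0) for c in path.lower()[:MAX_LEN]]
    return ids + [0] * (MAX_LEN - len(ids))

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(input_dim=len(VOCAB) + 1, output_dim=32),
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Example (untrained) prediction for a hypothetical path:
scores = model.predict(tf.constant([encode_path("/scratch/projA/run42/output.h5")]))
```

In such a setup the training labels would come from observed file lifetimes binned into the assumed classes; the authors' exact encoding, architecture, and loss may differ.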

2021 ◽  
Vol 17 (4) ◽  
pp. 1-21
Author(s):  
Devarshi Ghoshal ◽  
Lavanya Ramakrishnan

Scientific workflows in High Performance Computing (HPC) environments process large amounts of data. The storage hierarchy on HPC systems is getting deeper, driven by new technologies (NVRAMs, SSDs, etc.). There is a need for new programming abstractions that allow users to seamlessly manage data at the workflow level on multi-tiered storage systems, while providing optimal workflow performance and use of storage resources. In previous work, we introduced Managing Data on Tiered Storage for Scientific Workflows (MaDaTS), a software architecture that uses a Virtual Data Space (VDS) abstraction to hide the complexities of the underlying storage system while allowing users to control data management strategies. In this article, we detail the data-centric programming abstractions that allow users to manage a workflow around its data on the storage layer. These abstractions simplify data management for scientific workflows on multi-tiered storage systems without affecting workflow performance or storage capacity. We measure the overheads introduced by the programming abstractions of MaDaTS and evaluate their effectiveness. Our results show that these abstractions make optimal use of the capacity of the smaller storage tiers and simplify data management without adding any performance overheads.
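MaDaTS and VDS are real systems, but their API is not reproduced here; the sketch below is a hypothetical illustration of what a data-centric, tier-aware abstraction of this kind could look like. The class and method names (VirtualDataSpace, DataObject, place) and the deliberately naive greedy placement policy are made up for illustration.

```python
# Hypothetical sketch of a data-centric workflow abstraction in the spirit of
# a Virtual Data Space. These classes and method names are NOT the real
# MaDaTS API; they only illustrate declaring data once at the workflow level
# and letting a policy decide which storage tier each object lives on.
from dataclasses import dataclass, field

@dataclass
class DataObject:
    name: str
    size_gb: float
    tier: str = "unassigned"       # filled in by the placement policy

@dataclass
class VirtualDataSpace:
    """Workflow-level view of data, decoupled from physical storage tiers."""
    fast_capacity_gb: float        # e.g. burst buffer / NVRAM tier
    objects: list = field(default_factory=list)

    def add(self, obj: DataObject) -> None:
        self.objects.append(obj)

    def place(self) -> None:
        """Naive greedy policy: data goes to the fast tier until it is full."""
        used = 0.0
        for obj in self.objects:
            if used + obj.size_gb <= self.fast_capacity_gb:
                obj.tier, used = "fast", used + obj.size_gb
            else:
                obj.tier = "capacity"

vds = VirtualDataSpace(fast_capacity_gb=100.0)
vds.add(DataObject("raw_input", 80.0))
vds.add(DataObject("intermediate", 40.0))
vds.place()
print([(o.name, o.tier) for o in vds.objects])
```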


2008 ◽  
Vol 59 (7) ◽  
Author(s):  
Corina Samoila ◽  
Alfa Xenia Lupea ◽  
Andrei Anghel ◽  
Marilena Motoc ◽  
Gabriela Otiman ◽  
...  

Denaturing High Performance Liquid Chromatography (DHPLC) is a relatively new method used for screening DNA sequences, characterized by a high capacity to detect mutations and polymorphisms. This study focuses on Transgenomic WAVE™ DNA fragment analysis (based on the DHPLC separation method) of a 485 bp fragment from the human EC-SOD gene promoter, in order to detect single nucleotide polymorphisms (SNPs) associated with atherosclerosis and risk factors of cardiovascular disease. The fragment of interest was amplified by PCR and analyzed by DHPLC in 100 healthy subjects and 70 patients with atheroma. No different melting profiles were detected for the analyzed DNA samples. A combination of computational methods was used to predict putative transcription factor binding sites in the fragment of interest. Several putative binding sites were identified in the evolutionarily conserved regions, for factors from the Ets-1 oncogene family (ETS member Elk-1, polyomavirus enhancer activator-3 (PEA3), protein C-Ets-1 (Ets-1), GA binding protein (GABP), and Spi-1 and Spi-B/PU.1 related transcription factors) and from the Krueppel-like family (gut-enriched Krueppel-like factor (GKLF), erythroid Krueppel-like factor (EKLF), basic Krueppel-like factor (BKLF)), as well as the GC box and the myeloid zinc finger protein MZF-1. These bioinformatics results need to be investigated further in other studies using experimental approaches.
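As a loose illustration of how this kind of binding-site prediction works in principle (not the specific tools or matrices used in the study), a sequence can be scanned against a consensus pattern such as the GGAA core shared by Ets-family factors. The promoter string and regular expression below are placeholders, not the EC-SOD promoter or the study's motifs.

```python
# Minimal, generic illustration of consensus-based binding-site scanning,
# not the actual tool chain used in the study. The GGAA core is the
# well-known Ets-family core motif; the sequence is a placeholder.
import re

def scan_motif(sequence: str, motif_regex: str) -> list[tuple[int, str]]:
    """Return (position, matched substring) for every motif hit."""
    return [(m.start(), m.group()) for m in re.finditer(motif_regex, sequence)]

promoter = "CCGGAAGTACCGGATGTTTCCGGAAAC"   # placeholder, not the EC-SOD promoter
ets_core = r"[AC]GGAA[GA]"                # loose consensus around the GGAA core
print(scan_motif(promoter, ets_core))
```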


Author(s):  
Joshua May

This chapter considers remaining empirical challenges to the idea that we’re commonly motivated to do what’s right for the right reasons. Two key factors threaten to defeat claims to virtuous motivation: self-interest (egoism) and arbitrary situational factors (situationism). Both threats aim to identify defective influences on moral behavior that reveal us to be commonly motivated by the wrong reasons. However, there are limits to such wide-ranging skeptical arguments. Ultimately, like debunking arguments, defeater challenges succumb to a Defeater’s Dilemma: one can identify influences on many of our morally relevant behaviors that are either substantial or arbitrary, but not both. The science suggests a familiar trade-off in which substantial influences on many morally relevant actions are rarely defective. Arriving at this conclusion requires carefully scrutinizing a range of studies, including those on framing effects, dishonesty, implicit bias, mood effects, and moral hypocrisy (vs. integrity).


Author(s):  
Mark Endrei ◽  
Chao Jin ◽  
Minh Ngoc Dinh ◽  
David Abramson ◽  
Heidi Poxon ◽  
...  

Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload and their effect on performance and energy efficiency are typically difficult for application users to assess and to control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that only require a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS), and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH). We demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
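As a hedged illustration of the overall workflow (not the authors' statistical or machine learning models), one can fit simple surrogate models of runtime and energy from a handful of measured runs over user-controllable parameters and then extract the Pareto-optimal configurations. The parameters, measurements, and the linear-model choice below are made up.

```python
# Illustrative sketch, not the paper's models: fit simple surrogates of
# runtime and energy from a few measured runs, then extract the
# Pareto-optimal configurations. Parameter names and data are invented.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training runs: (threads, cpu_freq_GHz) -> (runtime_s, energy_J)
X = np.array([[4, 2.0], [8, 2.0], [16, 2.0], [4, 3.0], [8, 3.0], [16, 3.0]])
runtime = np.array([120.0, 70.0, 45.0, 95.0, 55.0, 38.0])
energy = np.array([9000.0, 8200.0, 8600.0, 9800.0, 9100.0, 9900.0])

time_model = LinearRegression().fit(X, runtime)
energy_model = LinearRegression().fit(X, energy)

# Predict both objectives over a grid of candidate configurations.
candidates = np.array([[t, f] for t in (4, 8, 12, 16) for f in (2.0, 2.5, 3.0)])
pred = np.column_stack([time_model.predict(candidates),
                        energy_model.predict(candidates)])

def pareto_front(points: np.ndarray) -> np.ndarray:
    """Indices of points not dominated in both objectives (lower is better)."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(np.all(q <= p) and np.any(q < p) for q in points)
        if not dominated:
            keep.append(i)
    return np.array(keep)

for i in pareto_front(pred):
    print(candidates[i], pred[i])
```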


Polymers ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 631
Author(s):  
Aleksander Cholewinski ◽  
Pengxiang Si ◽  
Marianna Uceda ◽  
Michael Pope ◽  
Boxin Zhao

Binders play an important role in electrode processing for energy storage systems. While conventional binders often require hazardous and costly organic solvents, there has been increasing development of greener and less expensive binders, with a focus on those that can be processed in aqueous conditions. Due to their functional groups, many of these aqueous binders offer further beneficial properties, such as higher adhesion to withstand the large volume changes of several high-capacity electrode materials. In this review, we first discuss the roles of binders in the construction of electrodes, particularly for energy storage systems, and summarize typical binder characterization techniques. We then highlight recent advances in aqueous binder systems, aiming to provide a stepping stone for the development of polymer binders with better sustainability and improved functionalities.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Michal Sitina ◽  
Heiko Stark ◽  
Stefan Schuster

In humans and higher animals, a trade-off between sufficiently high erythrocyte concentrations to bind oxygen and sufficiently low blood viscosity to allow rapid blood flow has been achieved during evolution. Optimal hematocrit theory has been successful in predicting hematocrit (HCT) values of about 0.3–0.5, in very good agreement with the normal values observed for humans and many animal species. However, according to those calculations, the optimal value should be independent of the mechanical load of the body. This is in contradiction to the exertional increase in HCT observed in some animals called natural blood dopers and to the illegal practice of blood boosting in high-performance sports. Here, we present a novel calculation to predict the optimal HCT value under the constraint of constant cardiac power and compare it to the optimal value obtained for constant driving pressure. We show that the optimal HCT under constant power ranges from 0.5 to 0.7, in agreement with observed values in natural blood dopers at exertion. We use this result to explain the tendency to better exertional performance at an increased HCT.
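A toy numerical version of this argument (not the paper's actual model) follows from assuming an exponential viscosity law eta(H) = eta0·exp(k·H), one common empirical choice, and noting that flow scales as 1/eta at constant driving pressure but roughly as 1/sqrt(eta) at constant cardiac power, so the optimum shifts to a higher hematocrit. The steepness k below is a placeholder.

```python
# Toy numerical illustration (not the paper's model): oxygen transport is
# proportional to hematocrit times flow. With an assumed exponential
# viscosity law eta(H) = exp(k*H), flow scales as 1/eta at constant pressure
# but as 1/sqrt(eta) at constant cardiac power, shifting the optimum upward.
import numpy as np

k = 3.0                                   # assumed viscosity steepness (placeholder)
H = np.linspace(0.05, 0.95, 1000)         # candidate hematocrit values
eta = np.exp(k * H)                       # relative viscosity (eta0 = 1)

transport_pressure = H / eta              # constant driving pressure: Q ~ 1/eta
transport_power = H / np.sqrt(eta)        # constant cardiac power:    Q ~ 1/sqrt(eta)

print("optimum at constant pressure:", H[np.argmax(transport_pressure)])  # ~1/k ≈ 0.33
print("optimum at constant power:   ", H[np.argmax(transport_power)])     # ~2/k ≈ 0.67
```

Under these placeholder assumptions the constant-power optimum is exactly twice the constant-pressure optimum, which is consistent in spirit with the higher exertional HCT values the paper reports.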


Author(s):  
Kersten Schuster ◽  
Philip Trettner ◽  
Leif Kobbelt

We present a numerical optimization method to find highly efficient (sparse) approximations for convolutional image filters. Using a modified parallel tempering approach, we solve a constrained optimization that maximizes approximation quality while strictly staying within a user-prescribed performance budget. The results are multi-pass filters where each pass computes a weighted sum of bilinearly interpolated sparse image samples, exploiting hardware acceleration on the GPU. We systematically decompose the target filter into a series of sparse convolutions, trying to find good trade-offs between approximation quality and performance. Since our sparse filters are linear and translation-invariant, they do not exhibit the aliasing and temporal coherence issues that often appear in filters working on image pyramids. We show several applications, ranging from simple Gaussian or box blurs to the emulation of sophisticated Bokeh effects with user-provided masks. Our filters achieve high performance as well as high quality, often providing significant speed-up at acceptable quality even for separable filters. The optimized filters can be baked into shaders and used as a drop-in replacement for filtering tasks in image processing or rendering pipelines.
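As a much-simplified stand-in for the paper's method (which uses parallel tempering and hardware-accelerated bilinear sampling on the GPU), the sketch below approximates a dense 2-D Gaussian kernel with a fixed budget of sparse taps using plain random search over integer tap positions. Kernel size, tap budget, and iteration count are arbitrary choices for illustration.

```python
# Simplified stand-in for the paper's approach: approximate a dense 2-D
# Gaussian kernel with a fixed budget of sparse taps, optimized here by plain
# random search instead of parallel tempering, and without bilinear sampling.
import numpy as np

rng = np.random.default_rng(0)
SIZE, SIGMA, N_TAPS, ITERS = 9, 1.5, 6, 5000

# Dense target kernel: normalized Gaussian.
ax = np.arange(SIZE) - SIZE // 2
gx = np.exp(-ax**2 / (2 * SIGMA**2))
target = np.outer(gx, gx)
target /= target.sum()

def sparse_kernel(taps):
    """Build a dense SIZE x SIZE kernel from (y, x, weight) taps."""
    k = np.zeros((SIZE, SIZE))
    for y, x, w in taps:
        k[int(y), int(x)] += w
    return k

def random_taps():
    pos = rng.integers(0, SIZE, size=(N_TAPS, 2))
    w = rng.random(N_TAPS)
    w /= w.sum()                      # keep the sparse filter normalized
    return [(*p, wi) for p, wi in zip(pos, w)]

best, best_err = None, np.inf
for _ in range(ITERS):
    taps = random_taps()
    err = np.sum((sparse_kernel(taps) - target) ** 2)
    if err < best_err:
        best, best_err = taps, err

print(f"approximation error with {N_TAPS} taps: {best_err:.5f}")
```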

