Scientific Computing, High-Performance Computing and Data Science in Higher Education

2019 ◽  
Vol 10 (1) ◽  
pp. 24-31 ◽  
Author(s):  
Marcelo Ponce ◽  
Erik Spence ◽  
Ramses van Zon ◽  
Daniel Gruner
2019 ◽  
Vol 214 ◽  
pp. 03031
Author(s):  
Dirk Hufnagel ◽  
Burt Holzman ◽  
David Mason ◽  
Parag Mhashilkar ◽  
Steven Timm ◽  
...  

The higher energy and luminosity from the LHC in Run 2 have put increased pressure on CMS computing resources. Extrapolating to even higher luminosities (and thus higher event complexities and trigger rates) beyond Run 3, it becomes clear that simply scaling up the current model of CMS computing alone will become economically unfeasible. High Performance Computing (HPC) facilities, widely used in scientific computing outside of HEP, have the potential to help fill the gap. Here we describe the U.S.CMS efforts to integrate US HPC resources into CMS Computing via the HEPCloud project at Fermilab. We present advancements in our ability to use NERSC resources at scale and efforts to integrate other HPC sites as well. We present experience in the elastic use of HPC resources, quickly scaling up use when so required by CMS workflows. We also present performance studies of the CMS multi-threaded framework on both Haswell and KNL HPC resources.


Author(s):  
A Grannan ◽  
K Sood ◽  
B Norris ◽  
A Dubey

Scientific discovery increasingly relies on computation through simulations, analytics, and machine and deep learning. Of these, simulations on high-performance computing (HPC) platforms have been the cornerstone of scientific computing for more than two decades. However, the development of simulation software has, in general, occurred through accretion, with a few exceptions. With an increase in scientific understanding, models have become more complex, rendering an accretion mode untenable to the point where software productivity and sustainability have become active concerns in scientific computing. In this survey paper, we examine a modest set of HPC scientific simulation applications that are already using cutting-edge HPC platforms. Several have been in existence for a decade or more. Our objective in this survey is twofold: first, to understand the landscape of scientific computing on HPC platforms in order to distill the currently scattered knowledge about software practices that have helped both developer and software productivity, and second, to understand the kind of tools and methodologies that need attention for continued productivity.


2021 ◽  
Vol 19 (4) ◽  
pp. e49
Author(s):  
Anas Oujja ◽  
Mohamed Riduan Abid ◽  
Jaouad Boumhidi ◽  
Safae Bourhnane ◽  
Asmaa Mourhir ◽  
...  

Nowadays, genomic data constitutes one of the fastest-growing datasets in the world. By 2025, it is expected to become the fourth largest source of Big Data, which calls for an adequate high-performance computing (HPC) platform to process it. With the latest unprecedented and unpredictable mutations in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the research community is in crucial need of ICT tools to process SARS-CoV-2 RNA data, e.g., by classifying it (i.e., clustering), and thus to assist in tracking virus mutations and predicting future ones. In this paper, we present an HPC-based SARS-CoV-2 RNA clustering tool. We adopt a data science approach, from data collection, through analysis, to visualization. In the analysis step, we present how our clustering approach leverages HPC and the longest common subsequence (LCS) algorithm. The approach uses the Hadoop MapReduce programming paradigm and adapts the LCS algorithm in order to efficiently compute the length of the LCS for each pair of SARS-CoV-2 RNA sequences. The sequences are extracted from the U.S. National Center for Biotechnology Information (NCBI) Virus repository. The computed LCS lengths are used to measure the dissimilarities between RNA sequences in order to work out existing clusters. In addition, we present a comparative study of the LCS algorithm's performance based on variable workloads and different numbers of Hadoop worker nodes.
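
The pairwise LCS step described in the abstract above can be illustrated with a minimal single-process sketch. It is not the authors' Hadoop MapReduce implementation. The function names (`lcs_length`, `dissimilarity`, `pairwise_dissimilarities`) are hypothetical, and the dissimilarity formula, one minus the LCS length divided by the longer sequence length, is an assumption, since the abstract does not state how LCS lengths are converted into dissimilarities.

```python
from itertools import combinations


def lcs_length(a: str, b: str) -> int:
    """Return the LCS length of a and b using two rolling DP rows."""
    if len(a) < len(b):
        a, b = b, a  # LCS is symmetric; keep the shorter string as the row
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def dissimilarity(a: str, b: str) -> float:
    """Assumed measure (not from the paper): 1 - LCS / max(len(a), len(b))."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest


def pairwise_dissimilarities(seqs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Compute the assumed dissimilarity for every unordered pair of sequences."""
    return {
        (i, j): dissimilarity(seqs[i], seqs[j])
        for i, j in combinations(seqs, 2)
    }
```

Each pair is independent of the others, which is what makes the workload easy to distribute: in a MapReduce setting, a mapper could emit one result per pair, although the partitioning the authors actually use is not described in the abstract.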

