Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructure

2020 ◽  
Vol 7 (1) ◽  
Author(s):  
E. A. Huerta ◽  
Asad Khan ◽  
Edward Davis ◽  
Colleen Bushell ◽  
William D. Gropp ◽  
...  

Abstract Significant investments to upgrade and construct large-scale scientific facilities demand commensurate investments in R&D to design algorithms and computing approaches to enable scientific and engineering breakthroughs in the big data era. Innovative Artificial Intelligence (AI) applications have powered transformational solutions for big data challenges in industry and technology that now drive a multi-billion-dollar industry, and which play an ever-increasing role in shaping human social patterns. As AI continues to evolve into a computing paradigm endowed with statistical and mathematical rigor, it has become apparent that single-GPU solutions for training, validation, and testing are no longer sufficient for computational grand challenges brought about by scientific facilities that produce data at a rate and volume that outstrip the computing capabilities of available cyberinfrastructure platforms. This realization has been driving the confluence of AI and high performance computing (HPC) to reduce time-to-insight, and to enable a systematic study of domain-inspired AI architectures and optimization schemes that enable data-driven discovery. In this article we present a summary of recent developments in this field, and describe specific advances that the authors of this article are spearheading to accelerate and streamline the use of HPC platforms to design and apply accelerated AI algorithms in academia and industry.
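To make the single-GPU limitation concrete, the sketch below shows the kind of data-parallel training that the AI/HPC convergence enables, using PyTorch's DistributedDataParallel with one process per GPU. The model, dataset, and hyperparameters are illustrative placeholders, not the workloads studied by the authors.

```python
# Minimal sketch of multi-GPU data-parallel training with PyTorch
# DistributedDataParallel (DDP). Model, dataset, and hyperparameters
# are illustrative placeholders, not the authors' actual workloads.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    # One process per GPU; torchrun sets RANK, LOCAL_RANK, WORLD_SIZE.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy dataset and model standing in for a domain-science workload.
    data = TensorDataset(torch.randn(4096, 128), torch.randn(4096, 1))
    sampler = DistributedSampler(data)  # shards the data across ranks
    loader = DataLoader(data, batch_size=64, sampler=sampler)

    model = torch.nn.Linear(128, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()

    for epoch in range(5):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            opt.zero_grad()
            loss_fn(model(x), y).backward()  # gradient all-reduce happens here
            opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, e.g., `torchrun --nproc_per_node=4 train.py`, each process trains on its own shard of the data while gradients are averaged across GPUs by all-reduce, which is what lets training scale beyond a single device.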


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Marek Nowicki ◽  
Łukasz Górski ◽  
Piotr Bała

Abstract With the development of peta- and exascale computational systems, there is growing interest in running Big Data and Artificial Intelligence (AI) applications on them. Big Data and AI applications are implemented in Java, Scala, Python, and other languages that are not widely used in High-Performance Computing (HPC), which is still dominated by C and Fortran. Moreover, they are based on dedicated environments such as Hadoop or Spark, which are difficult to integrate with traditional HPC management systems. We have developed the Parallel Computing in Java (PCJ) library, a tool for scalable high-performance computing and Big Data processing in Java. In this paper, we present the basic functionality of the PCJ library with examples of highly scalable applications running on large-scale resources. Performance results are presented for different classes of applications, including traditional compute-intensive HPC workloads (e.g. stencil computations) as well as communication-intensive algorithms such as the Fast Fourier Transform (FFT). We present implementation details and performance results for Big Data-type processing running on petascale systems, along with examples of large-scale AI workloads parallelized using PCJ.
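PCJ itself is a Java library; to keep the code examples in this piece in one language, the following is a sketch of the communication-intensive pattern the paper benchmarks, a distributed 2-D FFT, written with mpi4py and NumPy rather than the PCJ API. The problem size is illustrative, and the sketch assumes the global size divides evenly across ranks.

```python
# Sketch of the communication-intensive kernel behind a distributed 2-D FFT:
# FFT the local rows, perform a global all-to-all transpose, FFT again.
# Written with mpi4py/NumPy as a language-neutral stand-in for PCJ (Java).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N = 1024                 # global problem size; assumes N % size == 0
n_loc = N // size        # each rank owns n_loc contiguous rows

# Local slab of a synthetic N x N complex input.
local = np.random.rand(n_loc, N) + 0j

# Stage 1: FFT along the locally contiguous dimension (rows).
local = np.fft.fft(local, axis=1)

# Stage 2: global transpose via MPI_Alltoall. Pack the column block
# destined for rank j as send[j]; after the exchange, recv[j] holds
# rank j's rows restricted to our column block.
send = np.ascontiguousarray(local.reshape(n_loc, size, n_loc).transpose(1, 0, 2))
recv = np.empty_like(send)
comm.Alltoall(send, recv)
local = recv.transpose(2, 0, 1).reshape(n_loc, N)  # rows are now old columns

# Stage 3: FFT along the new rows; `local` is the transposed 2-D FFT.
local = np.fft.fft(local, axis=1)
```

Run with, e.g., `mpiexec -n 4 python fft2d.py`. The all-to-all transpose is the step whose scaling dominates at large node counts, which is why FFT serves as a communication-intensive benchmark in the paper.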


2020 ◽  
Vol 245 ◽  
pp. 09011
Author(s):  
Michael Hildreth ◽  
Kenyi Paolo Hurtado Anampa ◽  
Cody Kankel ◽  
Scott Hampton ◽  
Paul Brenner ◽  
...  

The NSF-funded Scalable CyberInfrastructure for Artificial Intelligence and Likelihood Free Inference (SCAILFIN) project aims to develop and deploy artificial intelligence (AI) and likelihood-free inference (LFI) techniques and software using scalable cyberinfrastructure (CI) built on top of existing CI elements. Specifically, the project has extended the CERN-based REANA framework, a cloud-based data analysis platform deployed on top of Kubernetes clusters that was originally designed to enable analysis reusability and reproducibility. REANA is capable of orchestrating extremely complicated multi-step workflows, and it uses Kubernetes both to schedule and distribute container-based workloads across a cluster of available machines and to instantiate and monitor the concrete workloads themselves. This work describes the challenges and development efforts involved in extending REANA, and the components that were developed, to enable large-scale deployment on High Performance Computing (HPC) resources. Using the Virtual Clusters for Community Computation (VC3) infrastructure as a starting point, we extended REANA to work with a number of different workload managers, spanning both high-performance and high-throughput systems, while simultaneously removing REANA’s dependence on Kubernetes support at the worker level.
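Below is a hypothetical sketch of the architectural idea described here: hiding several workload managers behind one submission interface so the workflow engine no longer needs Kubernetes on the workers. Class names, flags, and file paths are illustrative, not REANA's or SCAILFIN's actual internals.

```python
# Hypothetical sketch of a backend-agnostic submission layer: the workflow
# engine talks to one interface, while interchangeable backends target an
# HPC batch system (Slurm) or a high-throughput system (HTCondor).
from abc import ABC, abstractmethod
import subprocess

class WorkloadManager(ABC):
    @abstractmethod
    def submit(self, job_name: str, image: str, command: str) -> str:
        """Submit a containerized job; return the backend's job ID."""

class SlurmManager(WorkloadManager):
    def submit(self, job_name, image, command):
        # Run the container step under Singularity, a common pattern on
        # HPC systems where Docker/Kubernetes are unavailable.
        wrapped = f"singularity exec {image} {command}"
        out = subprocess.run(
            ["sbatch", "--job-name", job_name, "--wrap", wrapped],
            capture_output=True, text=True, check=True)
        return out.stdout.split()[-1]   # "Submitted batch job <id>"

class HTCondorManager(WorkloadManager):
    def submit(self, job_name, image, command):
        # HTCondor takes a submit description file; minimal version here.
        desc = ("universe = vanilla\n"
                "executable = /usr/bin/singularity\n"
                f"arguments = exec {image} {command}\n"
                f"batch_name = {job_name}\n"
                "queue\n")
        with open("job.sub", "w") as f:
            f.write(desc)
        out = subprocess.run(["condor_submit", "job.sub"],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip().split()[-1].rstrip(".")

def run_step(manager: WorkloadManager) -> str:
    # The workflow engine stays backend-agnostic.
    return manager.submit("scailfin-step", "fitting.sif", "python fit.py")
```

The design choice is the usual adapter pattern: each backend translates one abstract "run this container with this command" request into its own submission mechanics, which is what lets a single workflow specification target both high-performance and high-throughput resources.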


1998 ◽  
Vol 30 (10) ◽  
pp. 1839-1856 ◽  
Author(s):  
I Turton ◽  
S Openshaw

In this paper we outline some of the results that were obtained by applying a Cray T3D parallel supercomputer to human geography problems. We emphasise the fundamental importance of high-performance computing (HPC) as a relevant future paradigm for doing geography. We offer an introduction to recent developments and illustrate how new computational intelligence technologies can begin to exploit the opportunities created by the data riches of geographic information systems, artificial intelligence tools, and HPC in geography.


2020 ◽  
Vol 3 (2) ◽  
pp. 134-164
Author(s):  
Erick Giovani Sperandio Nascimento ◽  
Adhvan Novais Furtado ◽  
Roberto Badaró ◽  
Luciana Knop

The pandemic of the new coronavirus affected people’s lives on an unprecedented scale. The need for isolation and for treatments, drugs, and vaccines pushed digital health technologies, such as Artificial Intelligence (AI), Big Data Analytics (BDA), Blockchain, Telecommunication Technology (TT), and High-Performance Computing (HPC), among others, to historic levels of adoption. These technologies are being used to mitigate the pandemic, facilitate response strategies, and find treatments and vaccines. This paper surveys articles about new technologies applied to COVID-19 published in the main databases (PubMed/Medline, Elsevier Science Direct, Scopus, ISI Web of Science, Embase, Excerpta Medica, UpToDate, Lilacs, and the Novel Coronavirus Resource Directory from Elsevier), in high-impact international scientific journals (as ranked by the Scimago Journal and Country Rank, SJR, and the Journal Citation Reports, JCR), such as The Lancet, Science, Nature, The New England Journal of Medicine, Physiological Reviews, Journal of the American Medical Association, PLOS ONE, and Journal of Clinical Investigation, and in data from the Centers for Disease Control and Prevention (CDC), National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID), and World Health Organization (WHO). We prioritized meta-analyses, systematic reviews, review articles, and original articles, in that order. We reviewed 252 articles and used 140, covering March to June 2020, retrieved with the terms coronavirus, SARS-CoV-2, novel coronavirus, Wuhan coronavirus, severe acute respiratory syndrome, 2019-nCoV, 2019 novel coronavirus, n-CoV-2, covid, n-SARS-2, COVID-19, corona virus, coronaviruses, New Technologies, Artificial Intelligence, Telemedicine, Telecommunication Technologies, AI, Big Data, BDA, TT, High-Performance Computing, Deep Learning, Neural Network, and Blockchain, combined with MeSH (Medical Subject Headings) terms and the Boolean operators AND and OR to ensure the best coverage of the review topics. We conclude that this pandemic has consolidated the new-technology era and will change the way human beings live and interact socially. Medicine will also take a major leap forward in procedures, protocols, drug design, and care delivery, encompassing all health areas, as well as in social and business behaviors.
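As an illustration of the Boolean search strategy the review describes, the sketch below assembles a (virus terms) AND (technology terms) query and runs it against PubMed's public E-utilities endpoint. The term subsets, date range, and result limit are assumptions for the example, and the actual review spanned many more databases than PubMed alone.

```python
# Illustrative sketch of a Boolean literature search in the spirit of the
# review's methodology, against PubMed's public E-utilities API. The term
# lists below are subsets of those in the abstract.
import requests

VIRUS_TERMS = ["coronavirus", "SARS-CoV-2", "novel coronavirus",
               "2019-nCoV", "COVID-19", "severe acute respiratory syndrome"]
TECH_TERMS = ["Artificial Intelligence", "Big Data", "Telemedicine",
              "High-Performance Computing", "Deep Learning", "Blockchain"]

def build_query(virus_terms, tech_terms):
    # (any virus term) AND (any technology term), quoted as exact phrases.
    virus = " OR ".join(f'"{t}"' for t in virus_terms)
    tech = " OR ".join(f'"{t}"' for t in tech_terms)
    return f"({virus}) AND ({tech})"

resp = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={"db": "pubmed",
            "term": build_query(VIRUS_TERMS, TECH_TERMS),
            "mindate": "2020/03/01", "maxdate": "2020/06/30",
            "datetype": "pdat",     # filter on publication date
            "retmax": 100, "retmode": "json"},
    timeout=30)
ids = resp.json()["esearchresult"]["idlist"]
print(f"{len(ids)} PubMed IDs retrieved")
```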

