A survey of software implementations used by application codes in the Exascale Computing Project

Author(s):  
Thomas M Evans ◽  
Andrew Siegel ◽  
Erik W Draeger ◽  
Jack Deslippe ◽  
Marianne M Francois ◽  
...  

The US Department of Energy Office of Science and the National Nuclear Security Administration initiated the Exascale Computing Project (ECP) in 2016 to prepare mission-relevant applications and scientific software for the delivery of exascale computers starting in 2023. The ECP currently supports 24 efforts directed at specific applications and six supporting co-design projects. These 24 application projects contain 62 application codes that are implemented in three high-level languages (C, C++, and Fortran) and use 22 combinations of graphics processing unit (GPU) programming models. The most common implementation language is C++, which is used in 53 different application codes. The most common programming models across ECP applications are CUDA and Kokkos, which are employed in 15 and 14 applications, respectively. This article surveys the programming languages and models used in the ECP application codebase to achieve performance on future exascale hardware platforms.
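To make the "programming model" distinction concrete, the sketch below shows what a single portable parallel loop looks like when written with Kokkos. It is an illustrative example only, not code drawn from any ECP application, and the array name and sizes are arbitrary.

```cpp
// Minimal illustration (not from any ECP code): a portable parallel loop
// written once with Kokkos, which can target CUDA GPUs, OpenMP threads,
// or a serial backend depending on how the build was configured.
#include <Kokkos_Core.hpp>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1 << 20;
    // Device-resident array; its memory space follows the active backend.
    Kokkos::View<double*> x("x", n);

    // The same loop body runs as a GPU kernel or a CPU threaded loop.
    Kokkos::parallel_for("fill", n, KOKKOS_LAMBDA(const int i) {
      x(i) = 2.0 * static_cast<double>(i);
    });

    double sum = 0.0;
    Kokkos::parallel_reduce("sum", n, KOKKOS_LAMBDA(const int i, double& acc) {
      acc += x(i);
    }, sum);
  }
  Kokkos::finalize();
  return 0;
}
```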

Author(s):  
Soumya Ranjan Nayak ◽  
S Sivakumar ◽  
Akash Kumar Bhoi ◽  
Gyoo-Soo Chae ◽  
Pradeep Kumar Mallick

Graphics processing units (GPUs) have gained popularity among researchers in the field of decision making and knowledge discovery systems. However, most earlier studies are limited by GPU memory utilization, computational time, and accuracy. The main contribution of this paper is a novel algorithm, the Mixed Mode Database Miner (MMDBM) classifier, which applies multithreading concepts to a large number of attributes. The proposed method uses the quick sort algorithm in GPU parallel computing to overcome these limitations and applies a dynamic rule generation approach to construct the decision tree from the predicted rules. Moreover, the implementation results are compared with both SLIQ and MMDBM implementations in Java and on the GPU, with the acceleration ratio computed on the BP dataset. The primary objective of this work is to improve performance with less processing time. The results are also analyzed using various thread counts in GPU mining on eight datasets from the UCI Machine Learning repository. The proposed MMDBM algorithm has been validated on these eight datasets with an accuracy of 91.3% on diabetes, 89.1% on breast cancer, 96.6% on iris, 89.9% on labor, 95.4% on vote, 89.5% on credit card, 78.7% on supermarket, and 78.7% on BP, while also requiring less computational time on the given datasets. The outcome of this work will help the research community develop more effective multithread-based GPU solutions for handling large data sets in minimal processing time. Therefore, this can be considered a more reliable and precise method for GPU computing.
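As a rough illustration of the general idea of splitting sort work across threads, the sketch below shows a task-parallel quicksort on CPU threads. It is a simplified stand-in, not the authors' GPU implementation; the recursion depth and data are arbitrary.

```cpp
// Illustrative sketch only: a task-parallel quicksort on CPU threads.
// The MMDBM classifier sorts attribute values on the GPU; this simplified
// version just shows how partitioned ranges can be sorted concurrently.
#include <algorithm>
#include <future>
#include <vector>

void parallel_quicksort(std::vector<int>& v, int lo, int hi, int depth) {
  if (lo >= hi) return;
  int pivot = v[lo + (hi - lo) / 2];
  int i = lo, j = hi;
  while (i <= j) {                         // Hoare-style partition
    while (v[i] < pivot) ++i;
    while (v[j] > pivot) --j;
    if (i <= j) std::swap(v[i++], v[j--]);
  }
  if (depth > 0) {
    // Sort one half on a separate thread while this thread sorts the other.
    auto left = std::async(std::launch::async,
                           parallel_quicksort, std::ref(v), lo, j, depth - 1);
    parallel_quicksort(v, i, hi, depth - 1);
    left.get();
  } else {
    parallel_quicksort(v, lo, j, 0);
    parallel_quicksort(v, i, hi, 0);
  }
}

int main() {
  std::vector<int> data = {42, 7, 19, 3, 88, 1, 56, 23};
  parallel_quicksort(data, 0, static_cast<int>(data.size()) - 1, 2);
  return std::is_sorted(data.begin(), data.end()) ? 0 : 1;
}
```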


2021 ◽  
Vol 43 (1) ◽  
pp. 1-46
Author(s):  
David Sanan ◽  
Yongwang Zhao ◽  
Shang-Wei Lin ◽  
Liu Yang

To make the verification of large and complex concurrent systems feasible and scalable, compositional techniques are necessary even at the highest abstraction layers. At the lowest software abstraction layers, such as the implementation or the machine code, the level of detail makes direct verification of properties very difficult and expensive. It is therefore essential to use techniques that simplify verification on these layers. One such technique is top-down verification, where properties verified on the top layers (representing abstract specifications of a system) are propagated down, by means of simulation, to the lowest layers (which implement the top layers). Simulation of concurrent systems implies a greater level of complexity, so compositional techniques for checking simulation between layers are also desirable when seeking both feasibility and scalability of refinement verification. In this article, we present CSim2, a compositional rely-guarantee-based framework for the top-down verification of complex concurrent systems in the Isabelle/HOL theorem prover. CSim2 uses CSimpl, a highly expressive language designed for the specification of concurrent programs. Thanks to this expressiveness, CSimpl can model many of the features found in real-world programming languages, such as exceptions, assertions, and procedures. CSim2 provides a framework for the verification of rely-guarantee properties to reason compositionally on CSimpl specifications. Focusing on top-down verification, CSim2 also provides a simulation-based framework for preserving CSimpl rely-guarantee properties from specifications to implementations. Using the simulation framework, properties proven on the top layers (abstract specifications) are compositionally propagated down to the lowest layers (source or machine code) in each concurrent component of the system. Finally, we show the usability of CSim2 with a case study over two CSimpl specifications of an ARINC 653 communication service, in which we prove a complex property on a specification and use CSim2 to preserve the property on lower abstraction layers.
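For readers unfamiliar with rely-guarantee reasoning, the judgment and parallel composition rule below are the standard textbook forms, not CSim2's exact Isabelle/HOL syntax; they convey what it means to reason about each component under assumptions (the rely) on its environment.

```latex
% Standard rely-guarantee quintuple (textbook notation, not CSim2's syntax):
% component c, started in a state satisfying P and run under environment steps
% bounded by the rely R, makes only transitions allowed by the guarantee G
% and, if it terminates, ends in a state satisfying Q.
\[
  R, G \vdash \{P\}\; c\; \{Q\}
\]
% Compositional parallel rule: two components can be verified separately as
% long as each one's guarantee is tolerated by the other's rely.
\[
  \frac{R \cup G_2,\; G_1 \vdash \{P_1\}\, c_1\, \{Q_1\}
        \qquad
        R \cup G_1,\; G_2 \vdash \{P_2\}\, c_2\, \{Q_2\}}
       {R,\; G_1 \cup G_2 \vdash \{P_1 \wedge P_2\}\; c_1 \parallel c_2\; \{Q_1 \wedge Q_2\}}
\]
```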


2021 ◽  
Vol 11 (2) ◽  
pp. 23
Author(s):  
Duy-Anh Nguyen ◽  
Xuan-Tu Tran ◽  
Francesca Iacopi

Deep Learning (DL) has contributed to the success of many applications in recent years. These applications range from simple ones, such as recognizing tiny images or simple speech patterns, to highly complex ones, such as playing the game of Go. However, this superior performance comes at a high computational cost, which makes porting DL applications to conventional hardware platforms a challenging task. Many approaches have been investigated, and the Spiking Neural Network (SNN) is one of the promising candidates. SNNs are the third generation of Artificial Neural Networks (ANNs), in which each neuron in the network uses discrete spikes to communicate in an event-based manner. SNNs have the potential advantage of achieving better energy efficiency than their ANN counterparts. Although SNN models generally incur some loss of accuracy, new algorithms have helped to close the accuracy gap. For hardware implementations, SNNs have attracted much attention in the neuromorphic hardware research community. In this work, we review the basic background of SNNs, the current state and challenges of training algorithms for SNNs, and the current implementations of SNNs on various hardware platforms.
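A minimal sketch of the event-based neuron dynamics the review refers to is given below. It implements a basic leaky integrate-and-fire update; all constants are chosen for illustration rather than taken from any particular SNN model in the survey.

```cpp
// Minimal sketch of a leaky integrate-and-fire (LIF) neuron, the kind of
// event-driven dynamics described in the review. All constants are illustrative.
#include <cstdio>
#include <vector>

struct LIFNeuron {
  double v = 0.0;            // membrane potential
  double tau = 20.0;         // leak time constant (ms)
  double v_thresh = 1.0;     // firing threshold
  double v_reset = 0.0;      // reset potential after a spike

  // Advance the neuron by dt milliseconds given the summed input current.
  // Returns true if the neuron emits a discrete spike on this step.
  bool step(double input_current, double dt) {
    v += dt * (-v / tau + input_current);   // leaky integration
    if (v >= v_thresh) {
      v = v_reset;                          // spike and reset
      return true;
    }
    return false;
  }
};

int main() {
  LIFNeuron n;
  std::vector<double> input(100, 0.08);     // constant drive for 100 steps
  for (std::size_t t = 0; t < input.size(); ++t) {
    if (n.step(input[t], 1.0)) {
      std::printf("spike at t = %zu ms\n", t);
    }
  }
  return 0;
}
```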


2014 ◽  
Vol 599-601 ◽  
pp. 1407-1410
Author(s):  
Xu Liang ◽  
Ke Ming Wang ◽  
Gui Yu Xin

Compared with other high-level programming languages, C Sharp (C#) is more efficient for software development, while the MATLAB language provides a series of powerful numerical-calculation functions that facilitate the implementation of algorithms widely applied in blind source separation (BSS). Combining the advantages of the two languages, this paper presents a mixed-programming implementation and the development of a simplified blind signal processing system. Application results show that the system developed through mixed programming is successful.


2010 ◽  
Vol 19 (01) ◽  
pp. 65-99 ◽  
Author(s):  
MARC POULY

Computing inference from a given knowledge base is one of the key competences of computer science. Therefore, numerous formalisms and specialized inference routines have been introduced and implemented for this task. Typical examples are Bayesian networks, constraint systems, and different kinds of logic. It is known today that these formalisms can be unified under a common algebraic roof called a valuation algebra. Based on this system, generic inference algorithms for processing arbitrary valuation algebras can be defined. Researchers benefit from this high level of abstraction to address open problems independently of the underlying formalism. It is therefore all the more astonishing that this theory has not found its way into concrete software projects. Indeed, all modern programming languages provide generic sorting procedures, for example, but generic inference algorithms are still mythical creatures. NENOK breaks new ground and offers an extensive library of generic inference tools based on the valuation algebra framework. All methods are implemented as distributed algorithms that process local and remote knowledge bases in a transparent manner. Besides its main purpose as a software library, NENOK also provides a sophisticated graphical user interface to inspect the inference process and the graphical structures involved. This can be used for educational purposes and also as a fast prototyping architecture for inference formalisms.
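To indicate the kind of abstraction involved, the following is a hypothetical interface sketch (in C++, not NENOK's actual API) of the two operations a valuation algebra must provide; any formalism implementing them can be handed to the same generic inference routine.

```cpp
// Hypothetical sketch (not NENOK's actual API): the two operations every
// valuation algebra must supply. Any formalism implementing them, whether
// Bayesian network potentials, constraints, or logical clauses, can be
// processed by the same generic inference algorithm.
#include <memory>
#include <set>
#include <string>
#include <vector>

struct Valuation {
  virtual ~Valuation() = default;
  // Variables (the "domain") this piece of knowledge talks about.
  virtual std::set<std::string> domain() const = 0;
  // Combination: aggregate two pieces of knowledge.
  virtual std::unique_ptr<Valuation> combine(const Valuation& other) const = 0;
  // Marginalization: project the knowledge onto a subset of its variables.
  virtual std::unique_ptr<Valuation> marginalize(
      const std::set<std::string>& vars) const = 0;
};

// Naive generic inference over any valuation algebra: combine all factors,
// then project onto the query (assumes at least two factors). Practical
// libraries use local computation on a join tree instead of building this
// full joint, but they call exactly the same two operations.
std::unique_ptr<Valuation> answer_query(
    const std::vector<const Valuation*>& factors,
    const std::set<std::string>& query) {
  std::unique_ptr<Valuation> joint = factors[0]->combine(*factors[1]);
  for (std::size_t k = 2; k < factors.size(); ++k) {
    joint = joint->combine(*factors[k]);
  }
  return joint->marginalize(query);
}
```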


Author(s):  
Francis J Alexander ◽  
James Ang ◽  
Jenna A Bilbrey ◽  
Jan Balewski ◽  
Tiernan Casey ◽  
...  

Rapid growth in data, computational methods, and computing power is driving a remarkable revolution in what is variously termed machine learning (ML), statistical learning, computational learning, and artificial intelligence. In addition to highly visible successes in machine-based natural language translation, playing the game of Go, and self-driving cars, these new technologies also have profound implications for computational and experimental science and engineering, as well as for the exascale computing systems that the Department of Energy (DOE) is developing to support those disciplines. These learning technologies not only open up exciting opportunities for scientific discovery on exascale systems, they also appear poised to have important implications for the design and use of exascale computers themselves, including high-performance computing (HPC) for ML and ML for HPC. The overarching goal of the ExaLearn co-design project is to provide exascale ML software for use by Exascale Computing Project (ECP) applications, other ECP co-design centers, and DOE experimental facilities and leadership-class computing facilities.


2013 ◽  
Vol 647 ◽  
pp. 258-263 ◽  
Author(s):  
Jun Ma ◽  
Shu Yan Li ◽  
Yi De Ma

The formula that the life process follows has been a major scientific mystery for centuries. Some have brought programming ideas into this field; Gates, for example, suggested that "Human DNA is like a computer program but far, far more advanced than any software we've ever created" [1]. Here we propose a more specific hypothesis: DNA is a set of p-code [2], and the enzymes that control chemical reactions and transport processes in cell metabolism are the basic instructions. Based on this hypothesis, program models were developed in this work to simulate key processes of life: gene expression, cell division and differentiation, and the evolution of life. The results of these simulations show a high level of similarity between life phenomena and computer programs; the processes of cell differentiation and the evolution of life can be explained in a programming way. These models also suggest that reflection technology [3, 4] is essential to the life process. In addition, the C-value paradox, the N-value paradox [5], and pseudogenes, as well as some other biological problems, can also be explained by these programming models. These conclusions imply that life phenomena are consistent with the concept of a "process" in computing.


2008 ◽  
Vol 08 (01) ◽  
pp. 81-98 ◽  
Author(s):  
NICOLAS COURTY ◽  
PIERRE HELLIER

There is an increasing need for real-time implementation of 3D image analysis processes, especially in the context of image-guided surgery. Among the various image analysis tasks, non-rigid image registration is particularly needed and is also computationally prohibitive. This paper presents a GPU (graphics processing unit) implementation of the popular Demons algorithm using Gaussian recursive filtering. Acceleration over the classical method is achieved mainly by a new filtering scheme on the GPU, which could be reused in or extended to other applications and represents a significant contribution to the GPU-based image processing domain. This implementation was able to perform a non-rigid registration of 3D MR volumes in less than one minute, corresponding to an acceleration factor of 10 over the corresponding CPU implementation. This demonstrates the usefulness of such a method in an intra-operative context.
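As a simplified illustration of why recursive filtering is attractive here (this is not the paper's exact GPU filter): a recursive smoothing pass touches each sample a constant number of times regardless of the effective Gaussian width, unlike a direct convolution whose cost grows with the kernel size. The sketch below shows a basic causal plus anti-causal 1D pass of this kind.

```cpp
// Simplified sketch (not the paper's exact filter): a 1D recursive smoothing
// pass run causally and then anti-causally. Its cost per sample is constant
// regardless of the smoothing width, which is the property that makes
// recursive approximations of Gaussian filtering attractive on the GPU.
#include <vector>

void recursive_smooth_1d(std::vector<float>& line, float alpha) {
  // alpha in (0, 1]: smaller alpha gives a wider effective smoothing kernel.
  if (line.size() < 2) return;
  // Causal pass: y[i] = alpha * x[i] + (1 - alpha) * y[i - 1]
  for (std::size_t i = 1; i < line.size(); ++i) {
    line[i] = alpha * line[i] + (1.0f - alpha) * line[i - 1];
  }
  // Anti-causal pass makes the overall response symmetric.
  for (std::size_t i = line.size() - 1; i-- > 0; ) {
    line[i] = alpha * line[i] + (1.0f - alpha) * line[i + 1];
  }
}

// A 3D volume is smoothed by applying the 1D pass along each axis in turn;
// on the GPU, all lines along one axis can be filtered in parallel.
```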

