The Impact of Voltage-Frequency Scaling for the Matrix-Vector Product on the IBM POWER8

A PARALLEL FAST MULTIPOLE METHOD FOR THE HELMHOLTZ EQUATION

Parallel Processing Letters ◽

10.1142/s0129626495000242 ◽

1995 ◽

Vol 05 (02) ◽

pp. 263-274 ◽

Cited By ~ 3

Author(s):

MARK A. STALZER

Keyword(s):

Helmholtz Equation ◽

Parallel Algorithm ◽

Fast Multipole Method ◽

Iterative Solvers ◽

Dense Matrix ◽

Vector Product ◽

Fast Multipole ◽

Multipole Method ◽

The Matrix ◽

Matrix Vector

Presented is a parallel algorithm based on the fast multipole method (FMM) for the Helmholtz equation. This variant of the FMM is useful for computing radar cross sections and antenna radiation patterns. The FMM decomposes the impedance matrix into sparse components, reducing the operation count of the matrix-vector multiplication in iterative solvers to O(N^{3/2}) (where N is the number of unknowns). The parallel algorithm divides the problem into groups and assigns the computation involved with each group to a processor node. Careful consideration is given to the communications costs. A time complexity analysis of the algorithm is presented and compared with empirical results from a Paragon XP/S running the lightweight Sandia/University of New Mexico operating system (SUNMOS). For a 90,000-unknown problem running on 60 nodes, the sparse representation fits in memory and the algorithm computes the matrix-vector product in 1.26 seconds. It sustains an aggregate rate of 1.4 Gflop/s. The corresponding dense matrix would occupy over 100 Gbytes and, assuming that I/O is free, would require on the order of 50 seconds to form the matrix-vector product.
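The dense-matrix figures quoted in this abstract can be roughly reproduced with a short calculation. The sketch below is only a plausibility check: the 16-byte complex entry size, the count of 8 real flops per complex multiply-add, and the idea that a dense product would run at the same 1.4 Gflop/s rate are all assumptions, not statements from the abstract.

```python
# Back-of-the-envelope check of the dense-matrix figures in the abstract.
N = 90_000                  # number of unknowns (from the abstract)
BYTES_PER_ENTRY = 16        # assumed: double-precision complex entries
FLOPS_PER_ENTRY = 8         # assumed: 6 flops for the complex multiply, 2 for the add
SUSTAINED_FLOPS = 1.4e9     # aggregate rate from the abstract, assumed to carry over

dense_bytes = N * N * BYTES_PER_ENTRY
dense_seconds = N * N * FLOPS_PER_ENTRY / SUSTAINED_FLOPS

print(f"dense storage: {dense_bytes / 1e9:.1f} GB")
print(f"dense matvec:  {dense_seconds:.1f} s")
```

Under these assumptions the storage comes out above 100 Gbytes and the product time lands near 50 seconds, in line with the abstract's estimates.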

Efficient calculation of a normal matrix–vector product for anisotropic full-matrix least-squares refinement of macromolecular structures

Journal of Applied Crystallography ◽

10.1107/s0021889809040989 ◽

2009 ◽

Vol 42 (6) ◽

pp. 1020-1029 ◽

Cited By ~ 2

Author(s):

Boris V. Strokopytov

Keyword(s):

Protein Structures ◽

Normal Matrix ◽

Explicit Calculation ◽

Normal Equation ◽

Conjugate Directions ◽

Matrix Equations ◽

Vector Product ◽

Efficient Calculation ◽

The Matrix ◽

Matrix Vector

A novel algorithm is described for multiplying a normal equation matrix by an arbitrary real vector using the fast Fourier transform technique during anisotropic crystallographic refinement. The matrix–vector algorithm allows one to solve normal matrix equations using the conjugate-gradients or conjugate-directions technique without explicit calculation of a normal matrix. The anisotropic version of the algorithm has been implemented in a new version of the computer program FMLSQ. The updated program has been tested on several protein structures at high resolution. In addition, rapid methods for preconditioner and normal matrix–vector product calculations are described.
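The idea of solving normal equations without ever forming the normal matrix can be illustrated with a small matrix-free conjugate-gradient sketch. This is not the FMLSQ method: the paper uses fast Fourier transforms for the product, whereas the hypothetical `apply_normal` helper below simply applies A and then its transpose. The matrix `A` and the observations are made-up illustration data.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(apply_normal, rhs, tol=1e-12, max_iter=100):
    """Solve (A^T A) x = rhs using only products with the normal matrix."""
    x = [0.0] * len(rhs)
    r = list(rhs)                 # residual for the initial guess x = 0
    p = list(r)                   # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        q = apply_normal(p)       # the only place the matrix is touched
        alpha = rs_old / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Hypothetical 3x2 design matrix and observations (illustration only).
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
obs = [1.0, 2.0, 3.0]

def apply_normal(v):
    """Compute A^T (A v) with two products; A^T A is never stored."""
    av = [dot(row, v) for row in A]
    return [sum(A[i][j] * av[i] for i in range(len(A))) for j in range(len(v))]

rhs = [sum(A[i][j] * obs[i] for i in range(len(A))) for j in range(2)]
solution = conjugate_gradient(apply_normal, rhs)
print(solution)
```

Because the solver only calls `apply_normal`, any faster way of computing the same product could be substituted without changing the iteration itself.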

Convolutional neural nets for estimating the run time and energy consumption of the sparse matrix-vector product

The International Journal of High Performance Computing Applications ◽

10.1177/1094342020953196 ◽

2020 ◽

pp. 109434202095319

Author(s):

Maria Barreda ◽

Manuel F Dolz ◽

M Asunción Castaño

Keyword(s):

Energy Consumption ◽

Computer Architecture ◽

Ad Hoc ◽

Sparse Matrix ◽

Neural Nets ◽

Vector Product ◽

Consumption Ratio ◽

The Matrix ◽

Memory Accesses ◽

Matrix Vector

Modeling the performance and energy consumption of the sparse matrix-vector product (SpMV) is essential to perform off-line analysis and, for example, choose a target computer architecture that delivers the best performance-energy consumption ratio. However, this task is especially complex given the memory-bound nature and irregular memory accesses of the SpMV, mainly dictated by the input sparse matrix. In this paper, we propose a Machine Learning (ML)-driven approach that leverages Convolutional Neural Networks (CNNs) to provide accurate estimations of the performance and energy consumption of the SpMV kernel. The proposed CNN-based models use a blockwise approach to make the CNN architecture independent of the matrix size. These models are trained to estimate execution time as well as total, package, and DRAM energy consumption at different processor frequencies. The experimental results reveal that the overall relative error ranges between 0.5% and 14%, while the error at the matrix level does not exceed 10%. To demonstrate the applicability and accuracy of the SpMV CNN-based models, this study is complemented with an ad-hoc time-energy model for the PageRank algorithm, a popular algorithm for web information retrieval used by search engines, which internally realizes the SpMV kernel.
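To make the link between PageRank and the SpMV kernel concrete, the following minimal sketch runs the PageRank power iteration, where each iteration is one sparse matrix-vector product over the link structure. The three-page graph, the damping factor of 0.85, and the fixed iteration count are assumptions for illustration, and the sketch assumes every page has at least one outgoing link.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power iteration; the inner loops are a sparse matrix-vector product."""
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [(1.0 - damping) / n] * n
        # Adjacency lists act as a column-wise sparse matrix: each source
        # page spreads its damped rank evenly over the pages it links to.
        for src, targets in enumerate(out_links):
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

# Hypothetical graph: page 0 links to 1 and 2, page 1 to 2, page 2 to 0.
links = [[1, 2], [2], [0]]
ranks = pagerank(links)
print(ranks)
```

The work per iteration is proportional to the number of links, which is why SpMV time and energy dominate the cost of the algorithm.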

SPARSE COMPUTATION WITH PEI

International Journal of Foundations of Computer Science ◽

10.1142/s0129054199000307 ◽

1999 ◽

Vol 10 (04) ◽

pp. 425-442 ◽

Cited By ~ 1

Author(s):

FRÉDÉRIQUE VOISIN ◽

GUY-RENÉ PERRIN

Keyword(s):

Sparse Matrices ◽

Parallel Programs ◽

Data Parallelism ◽

Vector Product ◽

The Matrix ◽

Dense Matrices ◽

Matrix Vector

PEI formalism has been designed to reason and develop parallel programs in the context of data parallelism. In this paper, we focus on the use of PEI to transform a program involving dense matrices into a new program involving sparse matrices, using the example of the matrix-vector product.
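The dense-to-sparse transformation of the matrix-vector product mentioned here can be pictured with a plain example. The sketch below does not use the PEI formalism; it only contrasts a dense product with an equivalent one over a compressed sparse row (CSR) layout, which is an assumed choice of sparse format.

```python
def dense_matvec(a, x):
    """Dense product: every entry is visited, zeros included."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def to_csr(a):
    """Keep only nonzero entries, with their column indices and row offsets."""
    values, col_idx, row_ptr = [], [], [0]
    for row in a:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """Sparse product: only stored nonzeros contribute."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

matrix = [[4, 0, 0], [0, 0, 2], [1, 0, 3]]
vector = [1, 2, 3]
values, col_idx, row_ptr = to_csr(matrix)
dense_result = dense_matvec(matrix, vector)
sparse_result = csr_matvec(values, col_idx, row_ptr, vector)
print(dense_result, sparse_result)
```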

Computing the sparse matrix vector product using block-based kernels without zero padding on processors with AVX-512 instructions

PeerJ Computer Science ◽

10.7717/peerj-cs.151 ◽

2018 ◽

Vol 4 ◽

pp. e151 ◽

Cited By ~ 3

Author(s):

Bérenger Bramas ◽

Pavel Kus

Keyword(s):

Open Source ◽

High Performance ◽

Sparse Matrix ◽

Assembly Language ◽

Memory Storage ◽

Vector Product ◽

Zero Padding ◽

The Matrix ◽

Block Based ◽

Matrix Vector

The sparse matrix-vector product (SpMV) is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC) community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. Although it has been shown that block-based kernels help to achieve high performance, they are difficult to use in practice because of the zero padding they require. In the current paper, we propose new kernels using the AVX-512 instruction set, which makes it possible to use a blocking scheme without any zero padding in the matrix memory storage. We describe mask-based sparse matrix formats and their corresponding SpMV kernels highly optimized in assembly language. Considering that the optimal blocking size depends on the matrix, we also provide a method to predict the best kernel to be used utilizing a simple interpolation of results from previous executions. We compare the performance of our approach to that of the Intel MKL CSR kernel and the CSR5 open-source package on a set of standard benchmark matrices. We show that we can achieve significant improvements in many cases, both for sequential and for parallel executions. Finally, we provide the corresponding code in an open source library, called SPC5.
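A mask-based block format that avoids zero padding can be sketched in scalar code. The layout below is a simplified, hypothetical one (aligned 1x8 blocks, one bitmask per block, nonzeros packed contiguously) and is not the actual SPC5 storage format or its assembly kernels; the inner loop merely emulates in plain Python what a masked vector load would do in hardware.

```python
BLOCK = 8  # assumed block width: eight doubles fill one 512-bit register

def to_masked_blocks(dense):
    """Per row, store (start_col, bitmask, packed nonzeros) with no padding."""
    rows = []
    for row in dense:
        blocks = []
        for start in range(0, len(row), BLOCK):
            mask, packed = 0, []
            for offset, v in enumerate(row[start:start + BLOCK]):
                if v != 0:
                    mask |= 1 << offset   # bit set where a nonzero lives
                    packed.append(v)      # zeros are never stored
            if mask:
                blocks.append((start, mask, packed))
        rows.append(blocks)
    return rows

def masked_matvec(rows, x):
    """Multiply using the masks to place each packed value in its column."""
    y = []
    for blocks in rows:
        acc = 0.0
        for start, mask, packed in blocks:
            k = 0
            for offset in range(BLOCK):
                if (mask >> offset) & 1:
                    acc += packed[k] * x[start + offset]
                    k += 1
        y.append(acc)
    return y

dense = [
    [1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0],
    [0.0] * 10,
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0, 5.0, 0.0],
]
x = [float(i + 1) for i in range(10)]
blocks = to_masked_blocks(dense)
result = masked_matvec(blocks, x)
print(blocks[0], result)
```

The storage cost per block is one mask plus only the nonzero values, which is the property that removes the zero-padding overhead of classical blocked formats.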

Efficient Three-Way Split Formulas for Binary Polynomial Multiplication and Toeplitz Matrix Vector Product

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1587/transfun.e101.a.239 ◽

2018 ◽

Vol E101.A (1) ◽

pp. 239-248

Author(s):

Sun-Mi PARK ◽

Ku-Young CHANG ◽

Dowon HONG ◽

Changho SEO

Keyword(s):

Toeplitz Matrix ◽

Vector Product ◽

Polynomial Multiplication ◽

Matrix Vector

Impact Strength, Flexural Modulus and Wear Rate of PMMA Composites Reinforced by Eggshell Powders

Engineering and Technology Journal ◽

10.30684/etj.v38i7a.384 ◽

2020 ◽

Vol 38 (7A) ◽

pp. 960-966

Author(s):

Aseel M. Abdullah ◽

Hussein Jaber ◽

Hanaa A. Al-Kaisy

Keyword(s):

Impact Strength ◽

Wear Rate ◽

Flexural Modulus ◽

Matrix Material ◽

Weight Fraction ◽

Pure Pmma ◽

Calcination Process ◽

The Matrix ◽

The Impact ◽

Interface Bond

In the present study, the impact strength, flexural modulus, and wear rate of poly(methyl methacrylate) (PMMA) composites with eggshell powder (ESP) have been investigated. PMMA was used as a matrix material and reinforced with ESP in two different states: untreated eggshell powder (UTESP) and treated eggshell powder (TESP). Both UTESP and TESP were mixed with PMMA at weight fractions ranging from 1 to 5 wt.%. The results revealed that the mechanical properties of the PMMA/ESP composites were enhanced steadily with increasing eggshell content. The samples with 5 wt.% of UTESP and TESP additions give the maximum values of impact strength, about twice the value of the pure PMMA sample. The calcination process of eggshell powders gives better properties of the PMMA samples compared with the UTESP at the same weight fraction due to improvements in the interface bond between the matrix and particles. The wear characteristics of the PMMA composites decrease by about 57% as the weight fraction of TESP increases up to 5 wt.%. The flexural modulus values are slightly enhanced by increasing the ESP content in the PMMA composites.

The Interactions between Polyphenols and Microorganisms, Especially Gut Microbiota

Antioxidants ◽

10.3390/antiox10020188 ◽

2021 ◽

Vol 10 (2) ◽

pp. 188

Author(s):

Małgorzata Makarewicz ◽

Iwona Drożdż ◽

Tomasz Tarko ◽

Aleksandra Duda-Chodak

Keyword(s):

Intestinal Bacteria ◽

Research Data ◽

Bioactive Metabolites ◽

Human Organism ◽

Comprehensive Knowledge ◽

Bidirectional Relationship ◽

The Matrix ◽

Intestinal Pathogens ◽

The Impact

This review presents comprehensive knowledge about the bidirectional relationship between polyphenols and the gut microbiome. The first part is related to polyphenols’ impacts on various microorganisms, especially bacteria, and their influence on intestinal pathogens. The research data on the mechanisms of polyphenol action were collected and organized. The impact of various polyphenol groups on intestinal bacteria, both on the whole “microbiota” and on particular species, including probiotics, is presented. Moreover, the impact of polyphenols present in food (bound to the matrix) was compared with that of purified polyphenols (such as in dietary supplements), and the impact of polyphenols in the form of derivatives (such as glycosides) with that of polyphenols in the form of aglycones. The second part of the paper discusses in detail the mechanisms (pathways) and the role of bacterial biotransformation of the most important groups of polyphenols, including the production of bioactive metabolites with a significant impact on the human organism (both positive and negative).

Compression and load balancing for efficient sparse matrix‐vector product on multicore processors and graphics processing units

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.6515 ◽

2021 ◽

Author(s):

José I. Aliaga ◽

Hartwig Anzt ◽

Thomas Grützmacher ◽

Enrique S. Quintana‐Ortí ◽

Andrés E. Tomás

Keyword(s):

Load Balancing ◽

Graphics Processing Units ◽

Sparse Matrix ◽

Multicore Processors ◽

Vector Product ◽

Graphics Processing ◽

Matrix Vector

Selecting optimal SpMV realizations for GPUs via machine learning

The International Journal of High Performance Computing Applications ◽

10.1177/1094342021990738 ◽

2021 ◽

pp. 109434202199073

Author(s):

Ernesto Dufrechou ◽

Pablo Ezzatti ◽

Enrique S Quintana-Ortí

Keyword(s):

Machine Learning ◽

Sparse Matrix ◽

Machine Learning Techniques ◽

Optimal Method ◽

Learning Techniques ◽

General Rules ◽

Machine Learning Approach ◽

The Matrix ◽

Time And Energy ◽

Matrix Vector

More than 10 years of research related to the development of efficient GPU routines for the sparse matrix-vector product (SpMV) have led to several realizations, each with its own strengths and weaknesses. In this work, we review some of the most relevant efforts on the subject, evaluate a few prominent routines that are publicly available using more than 3000 matrices from different applications, and apply machine learning techniques to anticipate which SpMV realization will perform best for each sparse matrix on a given parallel platform. Our numerical experiments confirm that the methods offer such varied behaviors depending on the matrix structure that the identification of general rules to select the optimal method for a given matrix becomes extremely difficult, though some useful strategies (heuristics) can be defined. Using a machine learning approach, we show that it is possible to obtain inexpensive classifiers that predict the best method for a given sparse matrix with over 80% accuracy, demonstrating that this approach can deliver important reductions in both execution time and energy consumption.
