precision matrix
Recently Published Documents

TOTAL DOCUMENTS: 138 (FIVE YEARS: 34)
H-INDEX: 16 (FIVE YEARS: 2)

2022 ◽  
Vol 19 (1) ◽  
pp. 1-23
Author(s):  
Yaosheng Fu ◽  
Evgeny Bolotin ◽  
Niladrish Chatterjee ◽  
David Nellans ◽  
Stephen W. Keckler

As GPUs scale their low-precision matrix math throughput to boost deep learning (DL) performance, they upset the balance between math throughput and memory system capabilities. We demonstrate that a converged GPU design trying to address diverging architectural requirements between FP32 (or larger)-based HPC and FP16 (or smaller)-based DL workloads results in sub-optimal configurations for either of the application domains. We argue that a Composable On-Package GPU (COPA-GPU) architecture to provide domain-specialized GPU products is the most practical solution to these diverging requirements. A COPA-GPU leverages multi-chip-module disaggregation to support maximal design reuse, along with memory system specialization per application domain. We show how a COPA-GPU enables DL-specialized products by modular augmentation of the baseline GPU architecture with up to 4× higher off-die bandwidth, 32× larger on-package cache, and 2.3× higher DRAM bandwidth and capacity, while conveniently supporting scaled-down HPC-oriented designs. This work explores the microarchitectural design necessary to enable composable GPUs and evaluates the benefits composability can provide to HPC, DL training, and DL inference. We show that when compared to a converged GPU design, a DL-optimized COPA-GPU featuring a combination of 16× larger cache capacity and 1.6× higher DRAM bandwidth scales per-GPU training and inference performance by 31% and 35%, respectively, and reduces the number of GPU instances by 50% in scale-out training scenarios.
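A rough sense of the math-versus-memory balance argument can be gained from a roofline-style calculation. The sketch below is a minimal illustration only: the peak-throughput and bandwidth figures are hypothetical placeholders, not numbers from the paper.

```python
# Minimal roofline-style sketch of the math-vs-memory balance argument.
# All figures are hypothetical, chosen only to illustrate the trend.

def attainable_tflops(peak_tflops, dram_bw_tbs, flops_per_byte):
    """Roofline model: delivered throughput is capped by math or by memory traffic."""
    return min(peak_tflops, dram_bw_tbs * flops_per_byte)

peak_fp32 = 20.0    # hypothetical FP32 peak, TFLOP/s
peak_fp16 = 160.0   # hypothetical FP16 matrix-math peak, TFLOP/s
dram_bw   = 2.0     # hypothetical DRAM bandwidth, TB/s

# Machine balance: the arithmetic intensity (FLOP/byte) needed to stay compute-bound.
print("FP32 balance point:", peak_fp32 / dram_bw, "FLOP/byte")
print("FP16 balance point:", peak_fp16 / dram_bw, "FLOP/byte")

# A workload whose arithmetic intensity sits between the two balance points
# is compute-bound at FP32 rates but memory-bound at FP16 rates.
ai = 40.0
print("FP32 attainable:", attainable_tflops(peak_fp32, dram_bw, ai), "TFLOP/s")
print("FP16 attainable:", attainable_tflops(peak_fp16, dram_bw, ai), "TFLOP/s")
```

Under these made-up numbers the FP16 configuration leaves half of its math throughput idle, which is the kind of imbalance the DL-specialized COPA-GPU addresses with extra off-die bandwidth and on-package cache.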


2021 ◽  
pp. 101389
Author(s):  
Aryan Eftekhari ◽  
Dimosthenis Pasadakis ◽  
Matthias Bollhöfer ◽  
Simon Scheidegger ◽  
Olaf Schenk

Author(s):  
H. Chatrabgoun ◽  
A. R. Soltanian ◽  
H. Mahjub ◽  
F. Bahreini

A large amount of research effort has been devoted to learning gene regulatory networks (GRNs) from gene expression data in order to understand the functional basis of a living organism. Under the assumption that the joint distribution of the gene expressions of interest is multivariate normal, such networks can be constructed by assessing the nonzero elements of the inverse covariance matrix, the so-called precision matrix or concentration matrix. However, considering only pairwise linear correlations may not reflect the true connectivity between genes. To relax this limiting constraint, we employ the Gaussian process (GP) model, a well-known, computationally efficient, non-parametric Bayesian machine learning technique. GPs belong to the class of methods known as kernel machines, which can approximate complex problems by tuning their hyperparameters. In effect, a GP makes it possible to exploit the capacity of different kernels when constructing the precision matrix and the GRN. In this paper, we first choose a GP with an appropriate kernel to learn the considered GRNs from the observed genetic data, and then estimate the kernel hyperparameters using a rule-of-thumb technique. These hyperparameters also control the degree of sparseness in the precision matrix. We then obtain a kernel-based precision matrix, analogous to GLASSO, from which a kernel-based GRN is constructed. We use these findings to construct high-performance GRNs for different species of the Drosophila fly rather than simply relying on the multivariate normality assumption, and we show that the GPs, by exploiting the capacity of the kernels, perform much better than the multivariate Gaussian distribution assumption.
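A minimal sketch of the kernel-based precision-matrix idea follows; it is not the authors' exact procedure. It builds an RBF Gram matrix over gene expression profiles, uses a median-heuristic ("rule-of-thumb") bandwidth, inverts the regularized kernel matrix, and thresholds small entries to obtain a sparse, GLASSO-like network estimate. The data, bandwidth, and threshold are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))          # hypothetical data: 50 samples x 10 genes

# Squared-Euclidean distances between gene expression profiles (columns of X).
D = squareform(pdist(X.T, metric="sqeuclidean"))

# Rule-of-thumb (median heuristic) bandwidth for the RBF kernel.
sigma2 = np.median(D[D > 0])
K = np.exp(-D / (2.0 * sigma2))

# Kernel-based "covariance"; a small jitter keeps the inverse well conditioned.
K += 1e-3 * np.eye(K.shape[0])
precision = np.linalg.inv(K)

# Threshold small off-diagonal entries to get a sparse, GLASSO-like adjacency.
adjacency = (np.abs(precision) > 0.05).astype(int)
np.fill_diagonal(adjacency, 0)
print("Estimated edges:", int(adjacency.sum() // 2))
```

Swapping in a different kernel (for example Matérn or polynomial) changes the notion of similarity between genes without changing the rest of the pipeline, which is the flexibility the abstract attributes to kernel machines.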


2021 ◽  
Vol 47 (2) ◽  
pp. 1-26
Author(s):  
Field G. Van Zee ◽  
Devangi N. Parikh ◽  
Robert A. Van De Geijn

We approach the problem of implementing mixed-datatype support within the general matrix multiplication (gemm) operation of the BLAS-like Library Instantiation Software framework, whereby each of the matrix operands A, B, and C may be stored as single- or double-precision real or complex values. Another factor of complexity, whereby the matrix product and accumulation are allowed to take place in a precision different from the storage precisions of either A or B, is also discussed. We first break the problem into orthogonal dimensions, considering the mixing of domains separately from the mixing of precisions. Support for all combinations of matrix operands stored in either the real or complex domain is mapped out by enumerating the cases and describing an implementation approach for each. Supporting all combinations of storage and computation precisions is handled by typecasting the matrices at key stages of the computation: during packing and/or accumulation, as needed. Several optional optimizations are also documented. Performance results gathered on a 56-core Marvell ThunderX2 and a 52-core Intel Xeon Platinum demonstrate that high performance is mostly preserved, with modest slowdowns incurred from unavoidable typecast instructions. The mixed-datatype implementation confirms that combinatorial intractability is avoided, with the framework relying on only two assembly microkernels to implement 128 datatype combinations.
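The typecast-at-key-stages approach can be illustrated with a short sketch. This is not the BLAS-like Library Instantiation Software API, just a hypothetical helper showing operands stored in different precisions being cast to a computation precision on the way in (standing in for packing), accumulated there, and cast back to C's storage precision on output.

```python
import numpy as np

def mixed_gemm(A, B, C, alpha=1.0, beta=0.0, comp_dtype=np.float64):
    """Illustrative mixed-datatype gemm: C := beta*C + alpha*A@B.
    A, B, and C may be stored in different precisions; the product and
    accumulation happen in comp_dtype, mirroring typecasting during
    packing and/or accumulation."""
    A_p = A.astype(comp_dtype)            # typecast while "packing" A
    B_p = B.astype(comp_dtype)            # typecast while "packing" B
    acc = beta * C.astype(comp_dtype) + alpha * (A_p @ B_p)
    return acc.astype(C.dtype)            # cast back to C's storage precision

# Storage precisions differ: A is single real, B is double real, C is single real.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3)).astype(np.float32)
B = rng.standard_normal((3, 2)).astype(np.float64)
C = np.zeros((4, 2), dtype=np.float32)

C = mixed_gemm(A, B, C, comp_dtype=np.float64)
print(C.dtype, C.shape)
```

A real implementation folds the casts into the packing routines rather than making full-matrix copies; the abstract notes that the remaining typecast instructions cost only modest slowdowns.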

