sparse matrices
Recently Published Documents


Total documents: 611 (five years: 94)
H-index: 37 (five years: 5)

Author(s):  
Tuan Quoc Nguyen ◽  
Katsumi Inoue ◽  
Chiaki Sakama

Algebraic characterization of logic programs has received increasing attention in recent years. Researchers attempt to exploit connections between linear algebraic computation and symbolic computation to perform logical inference in large-scale knowledge bases. In this paper, we analyze the complexity of the linear algebraic methods for logic programs and propose a further improvement: using sparse matrices to embed logic programs in vector spaces. We show that this representation is highly effective at reaching the fixed point of the immediate consequence operator. In particular, performance for computing the least models of definite programs is dramatically improved using the sparse matrix representation. We also apply the method to the computation of stable models of normal programs, in which the guesses are associated with initial matrices, and verify its effect when the programs contain only a small number of negations. These results show a good performance gain for computing consequences of programs and illustrate the potential of tensorized logic programs.
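
As a rough illustration of reaching the fixed point of the immediate consequence operator with a sparse program matrix, the following sketch may help. It is not the authors' implementation. It assumes a definite program in which every atom is the head of at most one rule, stores each matrix row as a dictionary of nonzero weights, and uses a hypothetical helper atom `__top__` that is always true. The names `build_program_matrix` and `least_model` are invented for this example.

```python
from fractions import Fraction

TOP = "__top__"  # hypothetical name for an always-true helper atom


def build_program_matrix(rules):
    """Encode a definite program as sparse rows {head: {body_atom: weight}}.

    `rules` maps each head atom to its body (a list of atoms; [] for a fact).
    Assumption: every atom is the head of at most one rule.
    """
    rows = {TOP: {TOP: Fraction(1)}}  # the helper atom keeps itself true
    for head, body in rules.items():
        atoms = set(body) or {TOP}  # a fact depends only on the helper atom
        weight = Fraction(1, len(atoms))
        rows[head] = {atom: weight for atom in atoms}
    return rows


def least_model(rules):
    """Iterate v <- threshold(M v), starting from the helper atom alone."""
    rows = build_program_matrix(rules)
    true_atoms = {TOP}
    while True:
        # A row sum reaches 1 only when all of its body atoms are true.
        derived = {
            head
            for head, row in rows.items()
            if sum(w for atom, w in row.items() if atom in true_atoms) >= 1
        }
        if derived == true_atoms:  # fixed point reached
            return true_atoms - {TOP}
        true_atoms = derived


program = {"q": [], "r": ["q"], "p": ["q", "r"], "s": ["t"]}
print(sorted(least_model(program)))  # ['p', 'q', 'r']
```

Only the nonzero entries of each row are stored and visited, so the cost of one iteration scales with the total size of the rule bodies rather than with the square of the number of atoms. Exact fractions are used so that a body of m atoms sums to exactly 1.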


Author(s):  
Olfa Hamdi-Larbi ◽  
Ichrak Mehrez ◽  
Thomas Dufaud

Many applications in scientific computing process very large sparse matrices on parallel architectures. The work presented in this paper is part of a project whose general aim is to develop an auto-tuner system for selecting the best matrix compression format in the context of high-performance computing. The target smart system can automatically select the best compression format for a given sparse matrix, a numerical method processing this matrix, a parallel programming model, and a target architecture. This paper describes the design and implementation of the proposed concept. We consider a case study consisting of a numerical method reduced to the sparse matrix-vector product (SpMV), some compression formats, data parallelism as the programming model, and a distributed multi-core platform as the target architecture. This study allows us to extract a set of important novel metrics and parameters that are related to the considered programming model. Our metrics are used as input to a machine-learning algorithm to predict the best matrix compression format. An experimental study targeting a distributed multi-core platform and processing random and real-world matrices shows that our system can improve the accuracy of the machine learning by up to 7% on average.
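
For readers unfamiliar with the kernel at the center of this case study, here is a minimal sequential sketch of SpMV over one common compression format, compressed sparse row (CSR). The abstract does not say which formats were evaluated, so CSR is an assumption made for illustration, and the function names `dense_to_csr` and `csr_spmv` are invented here.

```python
def dense_to_csr(dense):
    """Compress a dense row-major matrix into CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzeros, their column indices,
    and the offset in `values` at which each row starts.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row, start of the next
    return values, col_idx, row_ptr


def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A x, touching only the stored nonzeros of A."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


A = [[1, 0, 2],
     [0, 0, 3],
     [4, 5, 0]]
values, col_idx, row_ptr = dense_to_csr(A)
print(csr_spmv(values, col_idx, row_ptr, [1, 2, 3]))  # [7, 9, 14]
```

Each output row is computed independently, which is why row-wise formats such as CSR map naturally onto a data-parallel model. Other formats trade this off against memory access patterns, which is the kind of choice an auto-tuner has to make.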


2021 ◽  
Author(s):  
Mikhail Karasikov ◽  
Harun Mustafa ◽  
Gunnar Rätsch ◽  
André Kahles

High-throughput sequencing data is rapidly accumulating in public repositories. Making this resource accessible for interactive analysis at scale requires efficient approaches for its storage and indexing. There have recently been remarkable advances in solving the experiment discovery problem and building compressed representations of annotated de Bruijn graphs where k-mer sets can be efficiently indexed and interactively queried. However, approaches for representing and retrieving other quantitative attributes such as gene expression or genome positions in a general manner have yet to be developed. In this work, we propose the concept of Counting de Bruijn graphs generalizing the notion of annotated (or colored) de Bruijn graphs. Counting de Bruijn graphs supplement each node-label relation with one or many attributes (e.g., a k-mer count or its positions in the genome). To represent them, we first observe that many schemes for the representation of compressed binary matrices already support the rank operation on the columns or rows, which can be used to define an inherent indexing of any additional quantitative attributes. Based on this property, we generalize these schemes and introduce a new approach for representing non-binary sparse matrices in compressed data structures. Finally, we notice that relation attributes are often easily predictable from a node's local neighborhood in the graph. Notable examples are genome positions shifting by 1 for neighboring nodes in the graph, or expression levels that are often shared across neighbors. We exploit this regularity of graph annotations and apply an invertible delta-like coding to achieve better compression. We show that Counting de Bruijn graphs index k-mer counts from 2,652 human RNA-Seq read sets in representations over 8-fold smaller and yet faster to query compared to state-of-the-art bioinformatics tools.
Furthermore, Counting de Bruijn graphs with positional annotations losslessly represent entire reads in indexes on average 27% smaller than the input compressed with gzip -9 for human Illumina RNA-Seq and 57% smaller for PacBio HiFi sequencing of viral samples. A complete joint searchable index of all viral PacBio SMRT reads from NCBI's SRA (152,884 read sets, 875 Gbp) comprises only 178 GB. Finally, on the full RefSeq collection, they generate a lossless and fully queryable index that is 4.4-fold smaller compared to the MegaBLAST index. The techniques proposed in this work naturally complement existing methods and tools employing de Bruijn graphs and significantly broaden their applicability: from indexing k-mer counts and genome positions to implementing novel sequence alignment algorithms on top of highly compressed and fully searchable graph-based sequence indexes.
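
The observation that a rank operation on a binary row gives an inherent index into a separate attribute array can be sketched as follows. This is a simplified illustration and not the data structure from the paper: it stores an uncompressed prefix-sum array instead of a succinct rank structure, and the class name `RankIndexedRow` is invented here.

```python
class RankIndexedRow:
    """One sparse non-binary row: a bit vector plus the nonzero values.

    bits[i] == 1 marks a nonzero at column i; `values` holds the nonzeros
    in column order. rank1(i) counts the ones in bits[0:i], which is exactly
    the position in `values` of the entry for column i.
    """

    def __init__(self, bits, values):
        if sum(bits) != len(values):
            raise ValueError("need exactly one value per set bit")
        self.bits = bits
        self.values = values
        # Plain prefix sums stand in for a compressed rank structure.
        self.prefix = [0]
        for b in bits:
            self.prefix.append(self.prefix[-1] + b)

    def rank1(self, i):
        """Number of set bits strictly before position i."""
        return self.prefix[i]

    def get(self, i):
        """Value at column i (0 if the entry is absent)."""
        if not self.bits[i]:
            return 0
        return self.values[self.rank1(i)]


row = RankIndexedRow([0, 1, 0, 0, 1, 1], [7, 3, 9])
print([row.get(i) for i in range(6)])  # [0, 7, 0, 0, 3, 9]
```

Because the bit vector and the value array are decoupled, each can be compressed with whatever scheme suits it, as long as the bit vector still answers rank queries.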


2021 ◽  
Vol 2099 (1) ◽  
pp. 012005
Author(s):  
V P Il’in ◽  
D I Kozlov ◽  
A V Petukhov

The objective of this research is to develop and study iterative methods in Krylov subspaces for solving systems of linear algebraic equations (SLAEs) with high-order non-symmetric sparse matrices arising in the approximation of multi-dimensional boundary value problems on unstructured grids. These methods are also relevant in many applications, including diffusion-convection equations. The considered algorithms are based on constructing A^T A-orthogonal direction vectors calculated using short recursions and providing global minimization of the residual at each iteration. Methods based on the Lanczos orthogonalization, the A^T-preconditioned conjugate residuals algorithm, and the left Gauss transform for the original SLAEs are implemented. In addition, the efficiency of these iterative processes is investigated when solving algebraic preconditioned systems using an approximate factorization of the original matrix in the Eisenstat modification. The results of a set of computational experiments for various grids and values of convective coefficients are presented, which demonstrate sufficiently high efficiency of the approaches under consideration.
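
One textbook method with the properties named above (A^T A-orthogonal directions, short recursions, residual minimization over the Krylov subspace) is conjugate gradients applied to the normal equations A^T A x = A^T b, often called CGNR. The sketch below shows that standard method for a sparse matrix stored as (row, column, value) triples. It is offered as an assumption about the general family, not as the authors' algorithms, and it omits preconditioning. All function names are invented for this example.

```python
def matvec(entries, n_rows, x):
    """y = A x for A given as (row, col, value) triples."""
    y = [0.0] * n_rows
    for i, j, v in entries:
        y[i] += v * x[j]
    return y


def rmatvec(entries, n_cols, y):
    """x = A^T y for the same triple storage."""
    x = [0.0] * n_cols
    for i, j, v in entries:
        x[j] += v * y[i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cgnr(entries, n, b, tol=1e-12, max_iter=200):
    """Solve the square system A x = b by CG on the normal equations."""
    x = [0.0] * n
    r = list(b)                   # residual b - A x for x = 0
    z = rmatvec(entries, n, r)    # normal-equation residual A^T r
    p = list(z)                   # first search direction
    zz = zz0 = dot(z, z)
    for _ in range(max_iter):
        if zz <= tol * tol * zz0:  # relative stopping test on ||A^T r||
            break
        w = matvec(entries, n, p)
        ww = dot(w, w)
        if ww == 0.0:
            break
        alpha = zz / ww            # step that minimizes ||b - A x|| along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * wi for ri, wi in zip(r, w)]
        z = rmatvec(entries, n, r)
        zz_new = dot(z, z)
        beta = zz_new / zz         # keeps the new p A^T A-orthogonal to the old
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        zz = zz_new
    return x


A = [(0, 0, 4.0), (0, 1, 1.0), (1, 1, 3.0), (1, 2, 1.0), (2, 0, 1.0), (2, 2, 2.0)]
print(cgnr(A, 3, [6.0, 9.0, 7.0]))  # approximately [1.0, 2.0, 3.0]
```

The matrix is accessed only through products with A and A^T, so sparsity is preserved throughout. The price is that the condition number of A^T A is the square of that of A, which is one reason preconditioning matters in practice.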


2021 ◽  
pp. 102874
Author(s):  
Boro Sofranac ◽  
Ambros Gleixner ◽  
Sebastian Pokutta

Mathematics ◽  
2021 ◽  
Vol 9 (19) ◽  
pp. 2527
Author(s):  
József Abaffy ◽  
Szabina Fodor

Efficient solution of linear systems of equations is one of the central topics of numerical computation. Linear systems with complex coefficients arise from various physics and quantum chemistry problems. In this paper, we propose a novel ABS-based algorithm that is able to solve complex systems of linear equations. Theoretical analysis is given to highlight the basic features of our new algorithm. Four variants of our algorithm were also implemented and intensively tested on randomly generated full and sparse matrices and real-life problems. The results of numerical experiments reveal that our ABS-based algorithm is able to compute the solution with high accuracy. The performance of our algorithm was compared with commercially available software, Matlab’s mldivide (\) algorithm. Our algorithm outperformed the Matlab algorithm in most cases in terms of computational accuracy. These results expand the practical usefulness of our algorithm.
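
To give a feel for the ABS (Abaffy-Broyden-Spedicato) class in general, the sketch below implements the basic unscaled ABS recursion with one particular parameter choice: the two free parameter vectors are both set to a unit vector picked by largest magnitude, which acts as a pivoting rule. This choice is an assumption made for illustration. It is not one of the four variants tested in the paper, and it uses dense Python lists rather than sparse storage. The function name `abs_solve` is invented here.

```python
def abs_solve(A, b):
    """Solve A x = b (real or complex) by a basic ABS recursion.

    After step i the iterate satisfies equations 0..i exactly, and the
    projection matrix H annihilates the rows processed so far.
    """
    n = len(A)
    x = [0j] * n
    H = [[(1 + 0j) if r == c else 0j for c in range(n)] for r in range(n)]
    for i in range(n):
        a = A[i]
        s = [sum(H[r][c] * a[c] for c in range(n)) for r in range(n)]  # H a_i
        k = max(range(n), key=lambda r: abs(s[r]))  # parameter choice e_k
        if s[k] == 0:
            raise ValueError("row %d depends linearly on earlier rows" % i)
        p = H[k][:]  # search direction H^T e_k, i.e. row k of H
        # Residual of equation i (plain transpose, no conjugation).
        tau = sum(ac * xc for ac, xc in zip(a, x)) - b[i]
        alpha = tau / s[k]  # a_i^T p equals s[k]
        x = [xc - alpha * pc for xc, pc in zip(x, p)]
        # Rank-one update H <- H - (H a_i)(e_k^T H) / (e_k^T H a_i).
        H = [[H[r][c] - s[r] * p[c] / s[k] for c in range(n)] for r in range(n)]
    return x


A = [[2 + 1j, 1], [1, 3 - 1j]]
x_true = [1 + 1j, 2 - 1j]
b = [sum(A[i][j] * x_true[j] for j in range(2)) for i in range(2)]
print(abs_solve(A, b))  # close to [(1+1j), (2-1j)]
```

The method terminates in at most n steps because each step satisfies one more equation without disturbing the earlier ones. Different choices of the free parameters yield different members of the class.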


2021 ◽  
Author(s):  
Huiya Gu ◽  
Hannah Harris ◽  
Moshe Olshansky ◽  
Kiana Mohajeri ◽  
Yossi Eliaz ◽  
...  

Megabase-scale intervals of active, gene-rich and inactive, gene-poor chromatin are known to segregate, forming the A and B compartments. Fine mapping of the contents of these A and B compartments has been hitherto impossible, owing to the extraordinary sequencing depths required to distinguish between the long-range contact patterns of individual loci, and to the computational complexity of the associated calculations. Here, we generate the largest published in situ Hi-C map to date, spanning 33 billion contacts. We also develop a computational method, dubbed PCA of Sparse, Super Massive Matrices (POSSUMM), that is capable of efficiently calculating eigenvectors for sparse matrices with millions of rows and columns. Applying POSSUMM to our Hi-C dataset makes it possible to assign loci to the A and B compartments at 500 bp resolution. We find that loci frequently alternate between compartments as one moves along the contour of the genome, such that the median compartment interval is only 12.5 kb long. Contrary to the findings in coarse-resolution compartment profiles, we find that individual genes are not uniformly positioned in either the A compartment or the B compartment. Instead, essentially all (95%) active gene promoters localize in the A compartment, but the likelihood of localizing in the A compartment declines along the body of active genes, such that the transcriptional termini of long genes (>60 kb) tend to localize in the B compartment. Similarly, essentially all active enhancer elements (95%) localize in the A compartment, even when the flanking sequences are composed entirely of inactive chromatin and localize in the B compartment. These results are consistent with a model in which DNA-bound regulatory complexes give rise to phase separation at the scale of individual DNA elements.
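
The abstract does not describe how POSSUMM works internally, so the following is only a generic sketch of one way to obtain a leading principal component from a sparse matrix without ever forming a dense centered copy: power iteration in which column centering is applied implicitly. The function name `leading_component`, the triple-based storage, and the fixed iteration count are all assumptions made for this example.

```python
import math


def leading_component(entries, n_rows, n_cols, iters=200):
    """Top eigenpair of the scatter matrix Xc^T Xc, where Xc is X with
    column means removed. X is given as (row, col, value) triples.
    """
    mu = [0.0] * n_cols  # column means
    for _, j, val in entries:
        mu[j] += val / n_rows

    def scatter_times(v):
        # u = Xc v = X v - (mu . v) * ones, built from the nonzeros only.
        shift = sum(m * c for m, c in zip(mu, v))
        u = [-shift] * n_rows
        for i, j, val in entries:
            u[i] += val * v[j]
        # out = Xc^T u = X^T u - mu * sum(u)
        total = sum(u)
        out = [-m * total for m in mu]
        for i, j, val in entries:
            out[j] += val * u[i]
        return out

    v = [1.0 / math.sqrt(n_cols)] * n_cols  # arbitrary unit start vector
    for _ in range(iters):
        w = scatter_times(v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    lam = sum(a * b for a, b in zip(v, scatter_times(v)))  # Rayleigh quotient
    return lam, v


X = [(0, 0, 10.0), (0, 2, 1.0), (1, 1, 1.0), (2, 0, 8.0),
     (3, 2, 2.0), (4, 0, 9.0), (4, 1, 1.0)]
eigenvalue, component = leading_component(X, n_rows=5, n_cols=3)
print(eigenvalue, component)
```

Subtracting the means explicitly would turn every zero into a nonzero and destroy sparsity. Folding the correction into the matrix-vector products keeps the cost proportional to the number of stored entries.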


10.6036/10004 ◽  
2021 ◽  
Vol 96 (5) ◽  
pp. 512-519
Author(s):  
GORKA URKULLU MARTIN ◽  
IGOR FERNANDEZ DE BUSTOS ◽  
ANDER OLABARRIETA ◽  
RUBEN ANSOLA

The direct integration method by central differences (DIMCD) is an explicit method of order two for integrating the equations governing the dynamic analysis of multibody systems. So far, development has focused only on verifying the quality of the results. In this paper, it is shown that in addition to providing optimal results, it is also competitive from the point of view of computational efficiency, at least for systems with up to six bodies. For this purpose, an appropriate implementation of the method in a compiled language is presented. In turn, it is shown that the methodology is suitable for modeling in sparse matrices, although the proposed implementation is based on dense matrices. The resulting code is applied to different benchmark examples. Results from various commercial software are also included.
Keywords: Computational efficiency, multibody dynamics, central differences, null space, dense matrices, quaternions
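
The explicit second-order central-difference scheme underlying this kind of integrator can be illustrated on a single unconstrained degree of freedom. The sketch below is not DIMCD itself, which handles constrained multibody systems. It assumes a scalar equation x'' = a(x), here a linear oscillator, and the function name `central_difference` is invented for this example.

```python
import math


def central_difference(accel, x0, v0, h, n_steps):
    """Integrate x'' = accel(x) with the explicit central-difference rule
    x[n+1] = 2 x[n] - x[n-1] + h^2 accel(x[n]).
    """
    # Fictitious previous point from a Taylor expansion about t = 0.
    x_prev = x0 - h * v0 + 0.5 * h * h * accel(x0)
    x = x0
    trajectory = [x0]
    for _ in range(n_steps):
        x_next = 2.0 * x - x_prev + h * h * accel(x)
        x_prev, x = x, x_next
        trajectory.append(x)
    return trajectory


omega = 2.0
traj = central_difference(lambda x: -omega * omega * x,
                          x0=1.0, v0=0.0, h=0.001, n_steps=1000)
# Compare with the exact solution cos(omega * t) at t = 1.
print(traj[-1], math.cos(omega * 1.0))
```

The scheme needs only one acceleration evaluation per step and no linear solve for this unconstrained case. It is conditionally stable, so for the oscillator the step must satisfy h * omega < 2.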

