Parallel Transposition of Sparse Data Structures

Author(s):  
Hao Wang ◽  
Weifeng Liu ◽  
Kaixi Hou ◽  
Wu-chun Feng
Keyword(s):  
2019 ◽  
Author(s):  
Thomas D. Sherman ◽  
Tiger Gao ◽  
Elana J. Fertig

Abstract

Motivation: Bayesian factorization methods, including Coordinated Gene Activity in Pattern Sets (CoGAPS), are emerging as powerful analysis tools for single-cell data. However, these methods have greater computational costs than their gradient-based counterparts. These costs are often prohibitive for analysis of large single-cell datasets. Many such methods can be run in parallel, which allows this limitation to be overcome by running on more powerful hardware. However, the constraints imposed by the prior distributions in CoGAPS limit the applicability of parallelization methods to enhance computational efficiency for single-cell analysis.

Results: We upgraded CoGAPS in Version 3 to overcome the computational limitations of Bayesian matrix factorization for single-cell data analysis. This software includes a new parallelization framework that is designed around the sequential updating steps of the algorithm to enhance computational efficiency. These algorithmic advances were coupled with a new software architecture and sparse data structures to reduce the memory overhead for single-cell data. Altogether, these updates to CoGAPS enhance the efficiency of the algorithm so that it can analyze 1000 times more cells, enabling factorization of large single-cell datasets.

Availability: CoGAPS is available as a Bioconductor package and the source code is provided at github.com/FertigLab/CoGAPS. All efficiency updates to enable single-cell analysis are available as of version [email protected]


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Thomas D. Sherman ◽  
Tiger Gao ◽  
Elana J. Fertig

Abstract

Background: Bayesian factorization methods, including Coordinated Gene Activity in Pattern Sets (CoGAPS), are emerging as powerful analysis tools for single-cell data. However, these methods have greater computational costs than their gradient-based counterparts. These costs are often prohibitive for analysis of large single-cell datasets. Many such methods can be run in parallel, which allows this limitation to be overcome by running on more powerful hardware. However, the constraints imposed by the prior distributions in CoGAPS limit the applicability of parallelization methods to enhance computational efficiency for single-cell analysis.

Results: We developed a new software framework for parallel matrix factorization in Version 3 of the CoGAPS R/Bioconductor package to overcome the computational limitations of Bayesian matrix factorization for single-cell data analysis. This parallelization framework provides asynchronous updates for sequential updating steps of the algorithm to enhance computational efficiency. These algorithmic advances were coupled with a new software architecture and sparse data structures to reduce the memory overhead for single-cell data.

Conclusions: Altogether, our new software enhances the efficiency of the CoGAPS Bayesian matrix factorization algorithm so that it can analyze 1000 times more cells, enabling factorization of large single-cell datasets.


MACRo 2015 ◽  
2015 ◽  
Vol 1 (1) ◽  
pp. 283-292
Author(s):  
Péter Böröcz ◽  
Péter Tar ◽  
István Maros

Abstract

Sparse linear algebraic data structures are widely used in solving large-scale linear optimization problems. The efficiency of the solver is significantly influenced by the data structures used, and implementing such data structures is not trivial. A performance analysis of the available data structures can provide valuable information for improving efficiency. In the talk we present our software that supports this task, as well as our new, special vector representation. We also report results on resolving numerical issues that affect the performance of sparse linear algebraic operations.
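
As a rough illustration of why the choice of data structure matters for sparse linear algebra, the following minimal sketch stores only the nonzero entries of a vector and computes a dot product by walking the two sorted index lists. It is a generic index/value layout chosen for illustration, not the special vector representation described in the abstract; the class name `SparseVector` and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SparseVector:
    """Hypothetical sparse vector that stores only nonzero entries."""
    size: int
    indices: list = field(default_factory=list)  # sorted positions of nonzeros
    values: list = field(default_factory=list)   # values at those positions

    @classmethod
    def from_dense(cls, dense):
        # Keep only the positions whose value is nonzero.
        idx = [i for i, x in enumerate(dense) if x != 0]
        return cls(len(dense), idx, [dense[i] for i in idx])

    def dot(self, other):
        # Merge-style walk over both sorted index lists: the cost depends
        # on the number of nonzeros, not on the full vector length.
        if self.size != other.size:
            raise ValueError("vectors must have the same length")
        i = j = 0
        total = 0.0
        while i < len(self.indices) and j < len(other.indices):
            a, b = self.indices[i], other.indices[j]
            if a == b:
                total += self.values[i] * other.values[j]
                i += 1
                j += 1
            elif a < b:
                i += 1
            else:
                j += 1
        return total


u = SparseVector.from_dense([0, 2, 0, 0, 3])
v = SparseVector.from_dense([1, 4, 0, 0, 5])
print(u.dot(v))
```

In this layout the dot product touches at most the stored nonzeros of both vectors, which is the kind of trade-off a performance analysis of sparse data structures would measure against dense storage.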

