memory efficiency
Recently Published Documents



2022 ◽  
Author(s):  
David Pellow ◽  
Abhinav Dutta ◽  
Ron Shamir

As sequencing datasets keep growing larger, time and memory efficiency of read mapping are becoming more critical. Many clever algorithms and data structures were used to develop mapping tools for next generation sequencing, and in the last few years also for third generation long reads. A key idea in mapping algorithms is to sketch sequences with their minimizers. Recently, syncmers were introduced as an alternative sketching method that is more robust to mutations and sequencing errors. Here we introduce parameterized syncmer schemes, and provide a theoretical analysis for multi-parameter schemes. By combining these schemes with downsampling or minimizers we can achieve any desired compression and window guarantee. We introduced syncmer schemes into the popular minimap2 and Winnowmap2 mappers. In tests on simulated and real long read data from a variety of genomes, the syncmer-based algorithms reduced unmapped reads by 20-60% at high compression while using less memory. The advantage of syncmer-based mapping was even more pronounced at lower sequence identity. At sequence identity of 65-75% and medium compression, syncmer mappers had 50-60% fewer unmapped reads, and ∼10% fewer of the reads that did map were incorrectly mapped. We conclude that syncmer schemes improve mapping under higher error and mutation rates. This situation happens, for example, when the high error rate of long reads is compounded by a high mutation rate in a cancer tumor, or due to differences between strains of viruses or bacteria.
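As a rough illustration of the sketching idea in this abstract, the following minimal Python sketch selects k-mers whose smallest s-mer starts at one of a chosen set of offsets. The function names, the lexicographic ordering of s-mers, and the leftmost tie-breaking rule are assumptions made for this example; mapping tools typically order s-mers by a hash, and this is not the code used in minimap2 or Winnowmap2.

```python
def min_smer_offset(kmer, s):
    """Offset of the lexicographically smallest s-mer in kmer (leftmost on ties)."""
    smers = [kmer[i:i + s] for i in range(len(kmer) - s + 1)]
    return smers.index(min(smers))


def syncmers(seq, k, s, offsets):
    """Return (position, k-mer) pairs whose smallest s-mer starts at an allowed offset.

    `offsets` is the parameter set of the scheme (hypothetical name).
    """
    allowed = set(offsets)
    selected = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if min_smer_offset(kmer, s) in allowed:
            selected.append((i, kmer))
    return selected


def closed_syncmers(seq, k, s):
    """Closed syncmers: the smallest s-mer is the first or the last s-mer of the k-mer."""
    return syncmers(seq, k, s, (0, k - s))


if __name__ == "__main__":
    print(closed_syncmers("ACGTACGT", k=4, s=2))
```

Whether a k-mer is selected depends only on the k-mer itself, not on its neighbours in the sequence, which is the property that distinguishes this kind of scheme from window-based minimizers.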


2021 ◽  
Author(s):  
Miquel Anglada-Girotto ◽  
Samuel Miravet-Verde ◽  
Luis Serrano ◽  
Sarah A. Head

Motivation: Independent Component Analysis (ICA) allows the dissection of omic datasets into modules that help to interpret global molecular signatures. The inherent randomness of this algorithm can be overcome by clustering many iterations of ICA together to obtain robust components. Existing algorithms for robust ICA are dependent on the choice of clustering method and on computing a potentially biased and large Pearson distance matrix. Results: We present robustica, a Python-based package to compute robust independent components with a fully customizable clustering algorithm and distance metric. Here, we exploited its customizability to revisit and optimize robust ICA systematically. From the 6 popular clustering algorithms considered, DBSCAN performed the best at clustering independent components across ICA iterations. After confirming the bias introduced with Pearson distances, we created a subroutine that infers and corrects the components' signs across ICA iterations to enable using Euclidean distance. Our subroutine effectively corrected the bias while simultaneously increasing the precision, robustness, and memory efficiency of the algorithm. Finally, we show the applicability of robustica by dissecting over 500 tumor samples from low-grade glioma (LGG) patients, where we define a new gene expression module with the key modulators of tumor aggressiveness downregulated upon IDH1 mutation. Availability and implementation: robustica is written in Python under the open-source BSD 3-Clause license. The source code and documentation are freely available at https://github.com/CRG-CNAG/robustica. Additionally, all scripts to reproduce the work presented are available at https://github.com/MiqG/publication_robustica.
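To make the sign-correction idea concrete: ICA components are only defined up to sign, so the same component recovered in two runs can look maximally distant under Euclidean distance. The sketch below uses one plausible heuristic, flipping each component so that its largest-magnitude weight is positive. That rule and the function names are assumptions for illustration; they are not taken from the robustica API and may differ from the rule the package applies.

```python
import math


def align_sign(component):
    """Flip a component so that its largest-magnitude weight is positive (assumed rule)."""
    peak = max(component, key=abs)
    return [-x for x in component] if peak < 0 else list(component)


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    run_a = [0.1, -2.0, 0.3]   # a component from one ICA run
    run_b = [-0.1, 2.0, -0.3]  # the same component with flipped sign in another run
    print(euclidean(run_a, run_b))                          # large before alignment
    print(euclidean(align_sign(run_a), align_sign(run_b)))  # zero after alignment
```

Once signs are aligned this way, components from different runs can be compared with Euclidean distance directly, without building a full correlation-based distance matrix.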


2021 ◽  
Author(s):  
Yixin Guo ◽  
Pengcheng Li ◽  
Yingwei Luo ◽  
Xiaolin Wang ◽  
Zhenlin Wang

Author(s):  
Hanno Becker ◽  
Jose Maria Bermudo Mera ◽  
Angshuman Karmakar ◽  
Joseph Yiu ◽  
Ingrid Verbauwhede

High-degree, low-precision polynomial arithmetic is a fundamental computational primitive underlying structured lattice-based cryptography. Its algorithmic properties and suitability for implementation on different compute platforms are an active area of research, and this article contributes to this line of work. Firstly, we present memory-efficiency and performance improvements for the Toom-Cook/Karatsuba polynomial multiplication strategy. Secondly, we provide implementations of those improvements on the Arm® Cortex®-M4 CPU, as well as the newer Cortex-M55 processor, the first M-profile core implementing the M-profile Vector Extension (MVE), also known as Arm® Helium™ technology. We also implement the Number Theoretic Transform (NTT) on the Cortex-M55 processor. We show that although the Cortex-M55 is single-issue and in-order and offers only 8 vector registers, compared to 32 on A-profile SIMD architectures like Arm® Neon™ technology and the Scalable Vector Extension (SVE), careful register management and instruction scheduling yield a 3× to 5× performance improvement over already highly optimized implementations on Cortex-M4, while maintaining the low area and energy profile necessary for use in the embedded market. Finally, as a real-world application, we integrate our multiplication techniques into the post-quantum key-encapsulation mechanism Saber.
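For readers unfamiliar with the Karatsuba strategy named in this abstract, here is a minimal reference sketch in Python. It works on plain integer coefficient lists and omits everything the article is actually about (modular reduction, memory layout, vectorization, and the Toom-Cook layer); the function names and the `threshold` parameter are assumptions for illustration only.

```python
def schoolbook(a, b):
    """Quadratic-time product of two coefficient lists (lowest degree first)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def karatsuba(a, b, threshold=4):
    """Product of two equal-length coefficient lists using three half-size products."""
    n = len(a)
    if len(b) != n:
        raise ValueError("inputs must have the same length")
    # Fall back to schoolbook for short or odd-length inputs.
    if n <= threshold or n % 2 == 1:
        return schoolbook(a, b)
    h = n // 2
    a0, a1 = a[:h], a[h:]
    b0, b1 = b[:h], b[h:]
    low = karatsuba(a0, b0, threshold)    # a0 * b0
    high = karatsuba(a1, b1, threshold)   # a1 * b1
    mid = karatsuba([x + y for x, y in zip(a0, a1)],
                    [x + y for x, y in zip(b0, b1)], threshold)  # (a0+a1) * (b0+b1)
    out = [0] * (2 * n - 1)
    for i, v in enumerate(low):
        out[i] += v
        out[i + h] -= v
    for i, v in enumerate(high):
        out[i + n] += v
        out[i + h] -= v
    for i, v in enumerate(mid):
        out[i + h] += v
    return out


if __name__ == "__main__":
    a = list(range(1, 9))
    b = list(range(8, 0, -1))
    print(karatsuba(a, b) == schoolbook(a, b))
```

Each recursion level replaces four half-size products with three, at the cost of extra additions and temporary buffers; those temporaries are where memory-efficiency improvements of the kind the abstract mentions would apply.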


2021 ◽  
pp. 108071
Author(s):  
John A.E. Anderson ◽  
John G. Grundy ◽  
Cheryl L. Grady ◽  
Fergus I.M. Craik ◽  
Ellen Bialystok

2021 ◽  
Vol 10 (12) ◽  
pp. e66101220105
Author(s):  
Lívia Maria de Lima Leôncio ◽  
Flávio Henrique de Santana ◽  
Clécia Gabriela Bezerra ◽  
Gilberto Ramos Vieira ◽  
Letycia dos Santos Neves ◽  
...  

Daytime sleepiness may impair memorization in schoolchildren. The aim of this study was therefore to examine the effect of daytime sleepiness on the visual memory of schoolchildren at different times during the school semester. Individuals of both genders (n = 88), aged 9 to 11 years and regularly enrolled at the Mariana Amália Municipal School, were selected. Data were collected at two time points: the beginning and the end of the academic semester. A semi-structured questionnaire was used to collect sociodemographic information, the Epworth Sleepiness Scale was used to assess sleepiness, and memory was assessed with the Rey-Osterrieth complex figure, object recall, scrambling figures, and addition of dictated numbers tests. The data revealed no direct relationship between sleepiness and impaired memory on the tests used at either time point. However, children showed lower visuospatial memory efficiency at the beginning of the school semester, indicating that they may have greater difficulty in memory retention. Lastly, the degree of sleepiness was abnormal at the end of the school semester, and female participants showed efficiency in immediate and late memory.


2021 ◽  
Author(s):  
Xiangyu Ye ◽  
Zhiquan Lai ◽  
Shengwei Li ◽  
Lei Cai ◽  
Ding Sun ◽  
...  

2021 ◽  
Vol 26 ◽  
pp. 1-67
Author(s):  
Patrick Dinklage ◽  
Jonas Ellert ◽  
Johannes Fischer ◽  
Florian Kurpicz ◽  
Marvin Löbel

We present new sequential and parallel algorithms for wavelet tree construction based on a new bottom-up technique. Wavelet trees refine the characters represented in a node with increasing depth; our technique exploits this structure in the opposite direction, first computing the leaves (most refined) and then propagating this information upwards to the root of the tree. We first describe new sequential algorithms, both in RAM and external memory. Based on these results, we adapt these algorithms to parallel computers, where we address both shared memory and distributed memory settings. In practice, all our algorithms outperform previous ones in both time and memory efficiency, because we can compute all auxiliary information solely based on the information we obtained from computing the leaves. Most of our algorithms are also adapted to the wavelet matrix, a variant that is particularly suited for large alphabets.
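The bottom-up idea can be sketched in a few lines of Python: count the characters once (the leaf level), then derive the node boundaries of every shallower level by merging adjacent counts, so no level needs a separately sorted copy of the text. This is a simplified sequential sketch under stated assumptions (integer symbols in the range 0 to 2**bits - 1, at least one bit, bit vectors stored as plain lists, hypothetical function name), not the authors' implementation.

```python
def wavelet_tree_levels(text, bits):
    """Level-wise bit vectors of a wavelet tree over integers in [0, 2**bits), bits >= 1."""
    n = len(text)
    levels = [[0] * n for _ in range(bits)]

    # Leaf information: one histogram over the full alphabet.
    hist = [0] * (1 << bits)
    for i, c in enumerate(text):
        hist[c] += 1
        levels[0][i] = (c >> (bits - 1)) & 1  # root level keeps text order

    # Propagate upwards: merge sibling counts to get the node sizes of each level.
    for level in range(bits - 1, 0, -1):
        hist = [hist[2 * p] + hist[2 * p + 1] for p in range(1 << level)]
        # Start position of each node (bit prefix of length `level`) within the level.
        borders = [0] * (1 << level)
        for p in range(1, 1 << level):
            borders[p] = borders[p - 1] + hist[p - 1]
        # One scan of the text writes every bit of this level to its final position.
        for c in text:
            p = c >> (bits - level)
            levels[level][borders[p]] = (c >> (bits - 1 - level)) & 1
            borders[p] += 1
    return levels


if __name__ == "__main__":
    print(wavelet_tree_levels([0, 3, 1, 2, 3, 0, 1], bits=2))
    # [[0, 1, 0, 1, 1, 0, 0], [0, 1, 0, 1, 1, 0, 1]]
```

The only auxiliary data is the histogram and the border array, both sized by the alphabet rather than by the text, which is consistent with the abstract's point that all auxiliary information follows from the leaves.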


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Botao Fa ◽  
Ting Wei ◽  
Yuan Zhou ◽  
Luke Johnston ◽  
Xin Yuan ◽  
...  

Single cell RNA sequencing (scRNA-seq) is a powerful tool in detailing the cellular landscape within complex tissues. Large-scale single cell transcriptomics provide both opportunities and challenges for identifying rare cells playing crucial roles in development and disease. Here, we develop GapClust, a light-weight algorithm to detect rare cell types from ultra-large scRNA-seq datasets with state-of-the-art speed and memory efficiency. Benchmarking on diverse experimental datasets demonstrates the superior performance of GapClust compared to other recently proposed methods. When applying our algorithm to an intestine and 68 k PBMC datasets, GapClust identifies the tuft cells and a previously unrecognised subtype of monocyte, respectively.

