FORGe: prioritizing variants for graph genomes

Mapping Intimacies ◽

10.1101/311720 ◽

2018 ◽

Author(s):

Jacob Pritt ◽

Nae-Chyun Chen ◽

Ben Langmead

Keyword(s):

Genetic Variants ◽

Reference Genome ◽

Software Tool ◽

Alignment Score ◽

Alignment Accuracy ◽

Computational Overhead ◽

Trade Offs ◽

The Cost ◽

Positive Effect ◽

Allelic Bias

Abstract: There is growing interest in using genetic variants to augment the reference genome into a “graph genome” to improve read alignment accuracy and reduce allelic bias. While adding a variant has the positive effect of removing an undesirable alignment-score penalty, it also increases both the ambiguity of the reference genome and the cost of storing and querying the genome index. We introduce methods and a software tool called FORGe for modeling these effects and prioritizing variants accordingly. We show that FORGe enables a range of advantageous and measurable trade-offs between accuracy and computational overhead.

A natural encoding of genetic variation in a Burrows-Wheeler Transform to enable mapping and genome inference

10.1101/059170 ◽

2016 ◽

Cited By ~ 6

Author(s):

Sorina Maciuca ◽

Carlos del Ojo Elias ◽

Gil McVean ◽

Zamin Iqbal

Keyword(s):

Genetic Variation ◽

Human Genome ◽

Genetic Variants ◽

Reference Genome ◽

Exact Matching ◽

Performance Impact ◽

Alphabet Size ◽

The Cost ◽

Burrows Wheeler Transform

Abstract: We show how positional markers can be used to encode genetic variation within a Burrows-Wheeler Transform (BWT), and use this to construct a generalisation of the traditional “reference genome”, incorporating known variation within a species. Our goal is to support the inference of the closest mosaic of previously known sequences to the genome(s) under analysis. Our scheme results in an increased alphabet size, and by using a wavelet tree encoding of the BWT we reduce the performance impact on rank operations. We give a specialised form of the backward search that allows variation-aware exact matching. We implement this, and demonstrate that the cost of constructing an index of the whole human genome with 8 million genetic variants is 25 GB of RAM. We also show that inferring a closer reference can close large kilobase-scale coverage gaps in P. falciparum.
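
For orientation, the standard (variation-unaware) backward search that the abstract above generalises can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a naive suffix array and a plain rank table instead of a wavelet tree, it has no positional markers for variants, and the function names `bwt_index` and `backward_search` are hypothetical.

```python
from collections import Counter


def bwt_index(text):
    """Build a toy BWT index; '$' is assumed not to occur in text."""
    s = text + "$"
    # Naive suffix array: fine for short strings, not for genomes.
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    # BWT character = character preceding each sorted suffix (wraps to '$').
    bwt = "".join(s[i - 1] for i in sa)
    # c_table[c] = number of characters in s strictly smaller than c.
    counts = Counter(s)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i] = occurrences of c in bwt[:i] (a plain rank table).
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == b else 0)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Return the sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        # Two rank queries narrow the interval by one more character.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])
```

With the text "ACGTACGT", searching for "ACG" returns positions 0 and 4. Each pattern character costs two rank queries, which is why the cost of rank over a larger alphabet matters for an index of this kind.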

A framework for reducing the overhead of the quantum oracle for use with Grover’s algorithm with applications to cryptanalysis of SIKE

Journal of Mathematical Cryptology ◽

10.1515/jmc-2020-0080 ◽

2020 ◽

Vol 15 (1) ◽

pp. 143-156

Author(s):

Jean-François Biasse ◽

Benjamin Pring

Keyword(s):

Search Algorithm ◽

Circuit Complexity ◽

Quantum Circuit ◽

Search Problems ◽

Trade Offs ◽

Single Target ◽

Grover's Algorithm ◽

Quantum Oracle ◽

The Cost

Abstract: In this paper we provide a framework for applying classical search and preprocessing to quantum oracles for use with Grover’s quantum search algorithm in order to lower the quantum circuit-complexity of Grover’s algorithm for single-target search problems. This has the effect (for certain problems) of reducing a portion of the polynomial overhead contributed by the implementation cost of quantum oracles and can be used to provide either strict improvements or advantageous trade-offs in circuit-complexity. Our results indicate that, for certain single-target preimage search problems, it is possible to reduce the quantum circuit-size from $O(2^{n/2} \cdot mC)$ (where $C$ originates from the cost of implementing the quantum oracle) to $O(2^{n/2} \cdot m\sqrt{C})$ without the use of quantum RAM, whilst also slightly reducing the number of required qubits. This framework captures a previous optimisation of Grover’s algorithm using preprocessing [21] applied to cryptanalysis, providing new asymptotic analysis. We additionally provide insights and asymptotic improvements on recent cryptanalysis [16] of SIKE [14] via Grover’s algorithm, demonstrating that the speedup applies to this attack and impacting upon quantum security estimates [16] incorporated into the SIKE specification [14].
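
As a rough reading of the two bounds quoted in the abstract above (ignoring the constants hidden by the $O$-notation, and assuming only that $m$ appears identically in both), the ratio of the circuit sizes is

$$\frac{2^{n/2} \cdot mC}{2^{n/2} \cdot m\sqrt{C}} = \sqrt{C},$$

so the saving grows with the square root of the oracle cost. For a purely illustrative value of $C = 2^{20}$, the factor would be $2^{10}$.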

Reference flow: reducing reference bias using multiple population genomes

Genome Biology ◽

10.1186/s13059-020-02229-3 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Nae-Chyun Chen ◽

Brad Solomon ◽

Taher Mun ◽

Sheila Iyer ◽

Ben Langmead

Keyword(s):

Genetic Variation ◽

Reference Genome ◽

Alignment Method ◽

Sequencing Data ◽

Computational Overhead ◽

Reference Flow ◽

Multiple Population ◽

Reference Bias ◽

Flow Alignment ◽

Reference Genomes

Abstract: Most sequencing data analyses start by aligning sequencing reads to a linear reference genome, but failure to account for genetic variation leads to reference bias and confounding of results downstream. Other approaches replace the linear reference with structures like graphs that can include genetic variation, incurring major computational overhead. We propose the reference flow alignment method that uses multiple population reference genomes to improve alignment accuracy and reduce reference bias. Compared to the graph aligner vg, reference flow achieves a similar level of accuracy and bias avoidance but with 14% of the memory footprint and 5.5 times the speed.

Wideband Rectangular Foldable and Non-foldable Antenna for Internet of Things Applications

International Journal of Antennas and Propagation ◽

10.1155/2019/2125713 ◽

2019 ◽

Vol 2019 ◽

pp. 1-5 ◽

Cited By ~ 1

Author(s):

Steve W. Y. Mung ◽

Cheuk Yin Cheung ◽

Ka Ming Wu ◽

Joseph S. M. Yuen

Keyword(s):

Internet Of Things ◽

Printed Circuit Board ◽

Circuit Board ◽

Design Parameters ◽

Multiple Frequency ◽

Wireless Applications ◽

Printed Circuit ◽

Iot Applications ◽

Trade Offs ◽

The Cost

This article presents a simple wideband rectangular antenna in foldable and non-foldable (printed circuit board (PCB)) structures for Internet of Things (IoT) applications. Both are simple structures with two similar rectangular metal planes, and both cover multiple frequency bands such as GPS, WCDMA/LTE, and the 2.4 GHz industrial, scientific, and medical (ISM) band. The wideband antenna is suitable for integration into short- and long-range wireless applications, such as the short-range 2.4 GHz ISM band and standard cellular bands. This lowers the overall size of the product as well as its cost. The configuration and operation principle are presented, together with the trade-offs among the design parameters. Simulated and experimental results for the foldable and non-foldable (PCB) structures show that the antenna is suited to IoT applications.

A Scalable Redefined Stochastic Blockmodel

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3442589 ◽

2021 ◽

Vol 15 (3) ◽

pp. 1-28

Author(s):

Xueyan Liu ◽

Bo Yang ◽

Hechang Chen ◽

Katarzyna Musial ◽

Hongxu Chen ◽

...

Keyword(s):

Large Scale ◽

Network Science ◽

Learning Algorithm ◽

State Of The Art ◽

Real World Data ◽

Computational Overhead ◽

Stochastic Blockmodel ◽

Np Hard Problem ◽

Large Scale Networks ◽

The Cost

The stochastic blockmodel (SBM) is a widely used statistical network representation model with good interpretability, expressiveness, generalization, and flexibility, and it has become prevalent and important in network science in recent years. However, learning an optimal SBM for a given network is an NP-hard problem. This significantly limits the application of SBMs to large-scale networks, because existing SBM models and their learning methods carry substantial computational overhead. Reducing the cost of SBM learning and making it scalable to large-scale networks, while maintaining the good theoretical properties of SBM, remains an unresolved problem. In this work, we address this challenging task from a novel perspective of model redefinition. We propose a novel redefined SBM with Poisson distribution and its block-wise learning algorithm that can efficiently analyse large-scale networks. Extensive validation conducted on both artificial and real-world data shows that our proposed method significantly outperforms the state-of-the-art methods in terms of a reasonable trade-off between accuracy and scalability.
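
To make the idea of an SBM with Poisson-distributed edges concrete, the sketch below evaluates the log-likelihood of a textbook Poisson blockmodel for a fixed block assignment. It is an illustration under stated assumptions, not the redefined model or the block-wise learning algorithm from the abstract above: the graph is assumed undirected, each node pair's edge count is assumed Poisson with a mean that depends only on the two blocks, and the name `poisson_sbm_loglik` is hypothetical.

```python
import math


def poisson_sbm_loglik(adj, labels, rates):
    """Log-likelihood of an undirected graph under a simple Poisson SBM.

    adj[i][j]   : observed edge count between nodes i and j
    labels[i]   : block index of node i
    rates[r][s] : Poisson mean for a pair with one node in block r, one in s
    """
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once
            lam = rates[labels[i]][labels[j]]
            k = adj[i][j]
            # log of Poisson pmf: k*log(lam) - lam - log(k!)
            total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total
```

The double loop visits every node pair, so a single evaluation is quadratic in the number of nodes. Searching over block assignments on top of that is where the computational overhead of naive SBM learning comes from.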

Demographic life history traits of reproductive natterjack toads (Bufo calamita) vary between northern and southern latitudes

Amphibia-Reptilia ◽

10.1163/156853806778189918 ◽

2006 ◽

Vol 27 (3) ◽

pp. 365-375 ◽

Cited By ~ 31

Author(s):

Delfi Sanuy ◽

Christoph Leskovar ◽

Neus Oromi ◽

Ulrich Sinsch

Keyword(s):

Life History ◽

Life History Traits ◽

Large Female ◽

Breeding Period ◽

Female Size ◽

Data Set ◽

Bufo Calamita ◽

Trade Offs ◽

Age Variation ◽

The Cost

Abstract: Demographic life history traits were investigated in three Bufo calamita populations in Germany (Rhineland-Palatinate: Urmitz, 50°N; 1998-2000) and Spain (Catalonia: Balaguer, Mas de Melons, 41°N; 2004). We used skeletochronology to estimate the age as number of lines of arrested growth in breeding adults collected during the spring breeding period (all localities) and during the summer breeding period (only Urmitz). A data set including the variables sex, age and size of 185 males and of 87 females was analyzed with respect to seven life history traits (age and size at maturity of the youngest first breeders, age variation in first breeders, longevity, potential reproductive lifespan, median lifespan, age-size relationship). Spring and summer cohorts at the German locality differed with respect to longevity and potential reproductive lifespan by one year in favour of the early breeders. The potential consequences on fitness and stability of cohorts are discussed. Latitudinal variation of life history traits was mainly limited to female natterjacks, in which, along a south-north gradient, longevity and potential reproductive lifespan increased while size decreased. These results and a review of published information on natterjack demography suggest that the lifetime number of offspring seems to be optimized by locally different trade-offs: large female size at the cost of longevity in southern populations and increased longevity at the cost of size in northern ones.

Parallel Software to Offset the Cost of Higher Precision

ACM SIGAda Ada Letters ◽

10.1145/3463478.3463483 ◽

2021 ◽

Vol 40 (2) ◽

pp. 59-64

Author(s):

Jan Verschelde

Keyword(s):

Power Series ◽

Parallel Algorithms ◽

Series Expansions ◽

Use Case ◽

Double Precision ◽

Algebraic Space ◽

Space Curves ◽

Computational Overhead ◽

Power Series Expansions ◽

The Cost

Hardware double precision is often insufficient to solve large scientific problems accurately. Computing in higher precision defined by software causes significant computational overhead. The application of parallel algorithms compensates for this overhead. Newton's method to develop power series expansions of algebraic space curves is the use case for this application.
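
To illustrate why software-defined precision is costly, the sketch below implements double-double addition, one common way to extend hardware doubles. It is an assumption chosen for illustration: the abstract above does not say which multiple-precision scheme is used, and the names `two_sum` and `dd_add` are hypothetical.

```python
def two_sum(a, b):
    """Knuth's error-free transformation: returns (s, e) with a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-double numbers, each given as a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    # Renormalize so that hi carries the leading bits and lo the remainder.
    return two_sum(s, e)
```

In this sketch one addition takes 14 hardware floating-point operations instead of one. For example, `dd_add((1.0, 0.0), (2.0**-60, 0.0))` keeps the small term in the low word, whereas `1.0 + 2.0**-60` in plain double precision rounds to `1.0`.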

An Experiential Community Orientation to Improve Knowledge and Assess Resident Attitudes Toward Poor Patients

Journal of Graduate Medical Education ◽

10.4300/jgme-d-12-00015.1 ◽

2013 ◽

Vol 5 (1) ◽

pp. 119-124 ◽

Cited By ~ 5

Author(s):

Erik A. Wallace ◽

Julie E. Miller-Cribbs ◽

F. Daniel Duffy

Keyword(s):

Descriptive Analysis ◽

Quality Health Care ◽

Resident Attitudes ◽

Community Orientation ◽

Knowledge And Attitudes ◽

Underserved Patients ◽

Quality Health ◽

Care Training ◽

The Cost ◽

Positive Effect

Abstract: Background. Future physicians may not be prepared for the challenges of caring for the growing population of poor patients in this country. Given the potential for a socioeconomic “gulf” between physicians and patients and the lack of curricula that address the specific needs of poor patients, resident knowledge about caring for this underserved population is low. Intervention. We created a 2-day Resident Academy orientation, before the start of residency training, to improve community knowledge and address resident attitudes toward poor patients through team-based experiential activities. We collected demographic and satisfaction data through anonymous presurveys and postsurveys; t tests and descriptive analysis of the quantitative data were conducted. Qualitative comments from open-ended questions were reviewed, coded, and divided into themes. We also offer information on the cost and replicability of the Academy. Results. Residents rated most components of the Academy as “very good” or “excellent.” Satisfaction scores were higher among residents in primary care training programs than among residents in nonprimary care programs for most Academy elements. Qualitative data demonstrated an overall positive effect on resident knowledge and attitudes about community resource availability for underserved patients, and about the challenges poor patients face in accessing high-quality health care. Conclusions. The Resident Academy orientation improved knowledge and attitudes of new residents before the start of residency, and residents were satisfied with the experience. The commitment of institutional leaders is essential for success.

Sustainable Distribution Design: Contrasting Disposable, Recyclable, and Reusable Strategies for Packaging Materials Using a Total Cost Analysis With an Illustration of Milk Distribution

Volume 6: 15th Design for Manufacturing and the Lifecycle Conference; 7th Symposium on International Design and Design Education ◽

10.1115/detc2010-28823 ◽

2010 ◽

Author(s):

Sri Satya Kanaka Nagendra Jayanty ◽

William J. Sawaya ◽

Michael D. Johnson

Keyword(s):

Cost Benefit ◽

Packaging Materials ◽

Policy Makers ◽

Specific Product ◽

Cost Structures ◽

Trade Offs ◽

Technological Advances ◽

The Cost ◽

Total Cost Analysis ◽

Plastic Containers

Engineers, policy makers, and managers have shown growing interest in increasing the sustainability of products over their complete lifecycles, from ‘cradle to grave’, that is, from production to the disposal of each specific product. However, a significant amount of material is disposed of in landfills rather than being reused in some form. A sizeable proportion of the products being dumped in landfills consists of packaging materials for consumable products. Technological advances in plastics, packaging, cleaning, and logistics, together with new environmental awareness and understanding, may have altered the cost structures surrounding the lifecycle use and disposal of many materials and products, resulting in different cost-benefit trade-offs. An explicit and well-informed economic analysis of reusing certain containers might change current practices and result in significantly less waste disposal in landfills and in less consumption of resources for manufacturing packaging materials. This work presents a method for calculating the costs associated with a complete process of implementing a system to reuse plastic containers for food products. Specifically, the relative costs of using a container and then either disposing of it in a landfill, recycling the material, or reconditioning the container for reuse are compared explicitly. Specific numbers and values are calculated for the case of plastic milk bottles to demonstrate the complicated interactions and the feasibility of such a strategy.

Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files

Cancer Informatics ◽

10.4137/cin.s26470 ◽

2015 ◽

Vol 14 ◽

pp. CIN.S26470 ◽

Cited By ~ 2

Author(s):

Richard P. Finney ◽

Qing-Rong Chen ◽

Cu V. Nguyen ◽

Chih Hao Hsu ◽

Chunhua Yan ◽

...

Keyword(s):

Graphical User Interface ◽

Reference Genome ◽

Source Code ◽

Software Tool ◽

Command Line ◽

Sequencing Data ◽

Genome Data ◽

Command Line Tool ◽

Portable Software ◽

Microsoft Windows

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview . The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview .