popSTR2 enables clinical and population-scale genotyping of microsatellites

Snædis Kristmundsdottir; Hannes P Eggertsson; Gudny A Arnadottir; Bjarni V Halldorsson

doi:10.1093/bioinformatics/btz913

Bioinformatics ◽

10.1093/bioinformatics/btz913 ◽

2019 ◽

Vol 36 (7) ◽

pp. 2269-2271 ◽

Cited By ~ 4

Author(s):

Snædis Kristmundsdottir ◽

Hannes P Eggertsson ◽

Gudny A Arnadottir ◽

Bjarni V Halldorsson

Keyword(s):

Population Based ◽

Supplementary Information ◽

Supplementary Data ◽

Clinical Sequencing ◽

Manual Inspection ◽

Repeat Expansions ◽

Population Scale

Abstract Summary popSTR2 is an update and augmentation of our previous work ‘popSTR: a population-based microsatellite genotyper’. To make genotyping sensitive to inter-sample differences, we supply a kernel to estimate sample-specific slippage rates. For clinical sequencing purposes, a panel of known pathogenic repeat expansions is provided along with a script that scans and flags for manual inspection markers indicative of a pathogenic expansion. Like its predecessor, popSTR2 allows for joint genotyping of samples at a population scale. We now provide a binning method that makes the microsatellite genotypes more amenable to analysis within standard association pipelines and can increase association power. Availability and implementation https://github.com/DecodeGenetics/popSTR. Supplementary information Supplementary data are available at Bioinformatics online.

Accurate, scalable cohort variant calls using DeepVariant and GLnexus

Bioinformatics ◽

10.1093/bioinformatics/btaa1081 ◽

2021 ◽

Author(s):

Taedong Yun ◽

Helen Li ◽

Pi-Chuan Chang ◽

Michael F Lin ◽

Andrew Carroll ◽

...

Keyword(s):

Best Practices ◽

Quality Metrics ◽

Supplementary Information ◽

Public Research ◽

Supplementary Data ◽

Quality Improvements ◽

1000 Genomes Project ◽

Individual Level ◽

1000 Genomes ◽

Population Scale

Abstract Motivation Population-scale sequenced cohorts are foundational resources for genetic analyses, but processing raw reads into analysis-ready cohort-level variants remains challenging. Results We introduce an open-source cohort-calling method that uses the highly-accurate caller DeepVariant and scalable merging tool GLnexus. Using callset quality metrics based on variant recall and precision in benchmark samples and Mendelian consistency in father-mother-child trios, we optimized the method across a range of cohort sizes, sequencing methods, and sequencing depths. The resulting callsets show consistent quality improvements over those generated using existing best practices with reduced cost. We further evaluate our pipeline in the deeply sequenced 1000 Genomes Project (1KGP) samples and show superior callset quality metrics and imputation reference panel performance compared to an independently-generated GATK Best Practices pipeline. Availability and Implementation We publicly release the 1KGP individual-level variant calls and cohort callset (https://console.cloud.google.com/storage/browser/brain-genomics-public/research/cohort/1KGP) to foster additional development and evaluation of cohort merging methods as well as broad studies of genetic variation. Both DeepVariant (https://github.com/google/deepvariant) and GLnexus (https://github.com/dnanexus-rnd/GLnexus) are open-sourced, and the optimized GLnexus setup discovered in this study is also integrated into GLnexus public releases v1.2.2 and later. Supplementary information Supplementary data are available at Bioinformatics online.
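The Mendelian-consistency metric mentioned in this abstract can be illustrated with a minimal sketch. It assumes diploid autosomal genotypes given as pairs of allele indices with no missing calls. The function names are hypothetical and are not part of DeepVariant or GLnexus.

```python
def is_mendelian_consistent(father, mother, child):
    """True if the child's two alleles can be explained by one allele
    inherited from each parent (diploid, autosomal, no missing calls)."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def mendelian_consistency_rate(trios):
    """Fraction of (father, mother, child) genotype triples that are consistent."""
    checked = [is_mendelian_consistent(f, m, c) for f, m, c in trios]
    return sum(checked) / len(checked) if checked else float("nan")


trios = [
    ((0, 0), (0, 1), (0, 1)),  # consistent: 0 from father, 1 from mother
    ((0, 0), (0, 0), (0, 1)),  # violation: neither parent carries allele 1
    ((0, 1), (0, 1), (1, 1)),  # consistent: 1 from each parent
]
print(mendelian_consistency_rate(trios))
```

A real evaluation would also have to handle missing genotypes, sex chromosomes and de novo mutations, which this sketch ignores.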

GLASS: assisted and standardized assessment of gene variations from Sanger sequence trace data

10.1101/088401 ◽

2016 ◽

Author(s):

Karol Pal ◽

Vojtech Bystry ◽

Tomas Reigl ◽

Martin Demko ◽

Adam Krejci ◽

...

Keyword(s):

Sanger Sequencing ◽

Reference Method ◽

Supplementary Information ◽

Sequence Variant ◽

Supplementary Data ◽

Sequencing Data ◽

Manual Inspection ◽

Sanger Sequence ◽

Variant Detection ◽

Sequence Trace

Abstract Motivation Sanger sequencing remains the reference method for sequence variant detection, especially in a clinical setting. However, chromatogram interpretation often requires manual inspection and in some cases considerable expertise. Additionally, variant reporting and nomenclature are typically left to the user, which can lead to inconsistencies. Results We introduce GLASS, a tool built to assist with the assessment of gene variations in Sanger sequencing data. Critically, it provides a standardized variant output as recommended by the Human Genome Variation Society. Availability The program is freely available online at http://bat.infspire.org/genomepd/glass/ Contact [email protected], [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

ccNetViz: a WebGL-based JavaScript library for visualization of large networks

Bioinformatics ◽

10.1093/bioinformatics/btaa559 ◽

2020 ◽

Vol 36 (16) ◽

pp. 4527-4529

Author(s):

Ales Saska ◽

David Tichy ◽

Robert Moore ◽

Achilles Rasquinha ◽

Caner Akdas ◽

...

Keyword(s):

Systems Biology ◽

Complex Networks ◽

Open Source ◽

High Speed ◽

A Priori ◽

Supplementary Information ◽

Network Visualization ◽

Supplementary Data ◽

Web Based ◽

Flow Of Information

Abstract Summary Visualizing a network provides a concise and practical understanding of the information it represents. Open-source web-based libraries help accelerate the creation of biologically based networks and their use. ccNetViz is an open-source, high speed and lightweight JavaScript library for visualization of large and complex networks. It implements customization and analytical features for easy network interpretation. These features include edge and node animations, which illustrate the flow of information through a network as well as node statistics. Properties can be defined a priori or dynamically imported from models and simulations. ccNetViz is thus a network visualization library particularly suited for systems biology. Availability and implementation The ccNetViz library, demos and documentation are freely available at http://helikarlab.github.io/ccNetViz/. Supplementary information Supplementary data are available at Bioinformatics online.

Epidemiological modeling in StochSS Live!

Bioinformatics ◽

10.1093/bioinformatics/btab061 ◽

2021 ◽

Author(s):

Richard Jiang ◽

Bruno Jacob ◽

Matthew Geiger ◽

Sean Matthew ◽

Bryan Rumsey ◽

...

Keyword(s):

Stochastic Model ◽

Epidemiological Model ◽

Supplementary Information ◽

Supplementary Data ◽

Web Based ◽

Epidemiological Modeling ◽

Modeling Simulation ◽

Wide Range ◽

Biochemical Systems

Abstract Summary We present StochSS Live!, a web-based service for modeling, simulation and analysis of a wide range of mathematical, biological and biochemical systems. Using an epidemiological model of COVID-19, we demonstrate the power of StochSS Live! to enable researchers to quickly develop a deterministic or a discrete stochastic model, infer its parameters and analyze the results. Availability and implementation StochSS Live! is freely available at https://live.stochss.org/ Supplementary information Supplementary data are available at Bioinformatics online.
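As an illustration of what a discrete stochastic epidemiological model looks like, here is a minimal sketch of a Gillespie (stochastic simulation algorithm) run of a basic SIR model. It does not use the StochSS Live! API. The function name, the parameter values and the choice of an SIR model are assumptions made for this example only, and the recovery rate `gamma` is assumed to be positive.

```python
import random


def gillespie_sir(s, i, r, beta, gamma, t_end, seed=None):
    """Simulate an SIR epidemic with the Gillespie direct method.

    Returns a list of (time, S, I, R) tuples, one per reaction event.
    Assumes gamma > 0 so the total rate is positive while I > 0.
    """
    rng = random.Random(seed)
    n = s + i + r
    t = 0.0
    trajectory = [(t, s, i, r)]
    while i > 0:
        infection = beta * s * i / n   # propensity of S + I -> 2I
        recovery = gamma * i           # propensity of I -> R
        total = infection + recovery
        t += rng.expovariate(total)    # exponential waiting time to next event
        if t > t_end:
            break
        if rng.random() * total < infection:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        trajectory.append((t, s, i, r))
    return trajectory


traj = gillespie_sir(s=99, i=1, r=0, beta=0.3, gamma=0.1, t_end=100.0, seed=1)
print(len(traj), traj[-1])
```

Each run with a different seed gives a different trajectory, and averaging many runs approaches the behaviour of the corresponding deterministic model.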

KEC: unique sequence search by K-mer exclusion

Bioinformatics ◽

10.1093/bioinformatics/btab196 ◽

2021 ◽

Author(s):

Pavel Beran ◽

Dagmar Stehlíková ◽

Stephen P Cohen ◽

Vladislav Čurn

Keyword(s):

Amino Acid ◽

Nucleic Acid ◽

Source Code ◽

Unique Sequence ◽

Supplementary Information ◽

Supplementary Data ◽

Laptop Computers ◽

Sequence Search ◽

Target Sequences ◽

Cross Reference

Abstract Summary Searching for amino acid or nucleic acid sequences unique to one organism may be challenging depending on the size of the available datasets. K-mer elimination by cross-reference (KEC) allows users to quickly and easily find unique sequences by providing target and non-target sequences. Due to its speed, it can be used for datasets of genomic size and can be run on desktop or laptop computers with modest specifications. Availability and implementation KEC is freely available for non-commercial purposes. Source code and executable binary files compiled for Linux, Mac and Windows can be downloaded from https://github.com/berybox/KEC. Supplementary information Supplementary data are available at Bioinformatics online.
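The general idea of k-mer exclusion can be sketched as follows. This is not KEC's actual implementation. The function names are hypothetical, and the sketch assumes a single target sequence, exact matching, and no handling of reverse complements.

```python
def kmer_set(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_regions(target, non_targets, k):
    """Return maximal stretches of target whose k-mers never occur in non_targets."""
    excluded = set()
    for seq in non_targets:
        excluded |= kmer_set(seq, k)

    regions = []
    start = None  # start index of the current run of unique k-mers
    for i in range(len(target) - k + 1):
        if target[i:i + k] not in excluded:
            if start is None:
                start = i
        elif start is not None:
            # The run ended with the k-mer at i - 1, which covers bases up to i - 1 + k.
            regions.append(target[start:i - 1 + k])
            start = None
    if start is not None:
        regions.append(target[start:])
    return regions


print(unique_regions("ACGTACGTTTGA", ["ACGTACGT"], 4))  # ['CGTTTGA']
```

Building the excluded set once and testing membership per k-mer keeps the scan linear in the target length, which is the property that makes this kind of approach practical for genome-sized inputs.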

CorGAT: a tool for the functional annotation of SARS-CoV-2 genomes

Bioinformatics ◽

10.1093/bioinformatics/btaa1047 ◽

2020 ◽

Author(s):

Matteo Chiara ◽

Federico Zambelli ◽

Marco Antonio Tangaro ◽

Pietro Mandreoli ◽

David S Horner ◽

...

Keyword(s):

Functional Annotation ◽

Ad Hoc ◽

State Of The Art ◽

Supplementary Information ◽

Genomic Sequences ◽

Supplementary Data ◽

Evolutionary Patterns ◽

Genomic Variants ◽

Art Methods ◽

Available Resources

Abstract Summary While over 200 000 genomic sequences are currently available through dedicated repositories, ad hoc methods for the functional annotation of SARS-CoV-2 genomes do not harness all currently available resources for the annotation of functionally relevant genomic sites. Here, we present CorGAT, a novel tool for the functional annotation of SARS-CoV-2 genomic variants. By comparisons with other state of the art methods we demonstrate that, by providing a more comprehensive and rich annotation, our method can facilitate the identification of evolutionary patterns in the genome of SARS-CoV-2. Availability and implementation Galaxy http://corgat.cloud.ba.infn.it/galaxy; software: https://github.com/matteo14c/CorGAT/tree/Revision_V1; docker: https://hub.docker.com/r/laniakeacloud/galaxy_corgat. Supplementary information Supplementary data are available at Bioinformatics online.

UniBioDicts: Unified access to Biological Dictionaries

Bioinformatics ◽

10.1093/bioinformatics/btaa1065 ◽

2020 ◽

Author(s):

John Zobolas ◽

Vasundra Touré ◽

Martin Kuiper ◽

Steven Vercruysse

Keyword(s):

User Interface ◽

Life Science ◽

Biological Data ◽

Supplementary Information ◽

Supplementary Data ◽

Query Interface ◽

Controlled Vocabularies ◽

Search String ◽

Software Packages ◽

The Right

Abstract Summary We present a set of software packages that provide uniform access to diverse biological vocabulary resources that are instrumental for current biocuration efforts and tools. The Unified Biological Dictionaries (UniBioDicts or UBDs) provide a single query-interface for accessing the online API services of leading biological data providers. Given a search string, UBDs return a list of matching term, identifier and metadata units from databases (e.g. UniProt), controlled vocabularies (e.g. PSI-MI) and ontologies (e.g. GO, via BioPortal). This functionality can be connected to input fields (user-interface components) that offer autocomplete lookup for these dictionaries. UBDs create a unified gateway for accessing life science concepts, helping curators find annotation terms across resources (based on descriptive metadata and unambiguous identifiers), and helping data users search and retrieve the right query terms. Availability and implementation The UBDs are available through npm and the code is available in the GitHub organisation UniBioDicts (https://github.com/UniBioDicts) under the Affero GPL license. Supplementary information Supplementary data are available at Bioinformatics online.

CONCUR: quick and robust calculation of codon usage from ribosome profiling data

Bioinformatics ◽

10.1093/bioinformatics/btaa733 ◽

2020 ◽

Author(s):

Michaela Frye ◽

Susanne Bornelöv

Keyword(s):

Codon Usage ◽

Ribosome Profiling ◽

Supplementary Information ◽

Supplementary Data ◽

Usage Analysis

Abstract Summary CONCUR is a standalone tool for codon usage analysis in ribosome profiling experiments. CONCUR uses the aligned reads in BAM format to estimate codon counts at the ribosome E-, P- and A-sites and at flanking positions. Availability and implementation CONCUR is written in Perl and is freely available at https://github.com/susbo/concur. Supplementary information Supplementary data are available at Bioinformatics online.
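To make the idea of counting codons at the ribosomal E-, P- and A-sites concrete, here is a minimal sketch. It is not CONCUR's code, which is written in Perl and reads BAM files. The function name is hypothetical, read positions are given as 5' end coordinates relative to the first base of the coding sequence, and the P-site offset of 12 nt is an assumed illustrative value rather than one taken from the tool.

```python
from collections import Counter


def site_codon_counts(cds, read_starts, p_offset=12):
    """Count codons at the E-, P- and A-sites implied by read 5' ends.

    cds         -- coding sequence, starting at the first base of the start codon
    read_starts -- read 5' positions relative to the CDS start (may be negative)
    p_offset    -- assumed distance from the read 5' end to the P-site codon
    """
    counts = {"E": Counter(), "P": Counter(), "A": Counter()}
    shifts = {"E": -3, "P": 0, "A": 3}  # E is one codon upstream, A one downstream
    for start in read_starts:
        p_pos = start + p_offset
        if p_pos % 3 != 0:
            continue  # this sketch keeps only in-frame reads
        for site, shift in shifts.items():
            pos = p_pos + shift
            if pos >= 0 and pos + 3 <= len(cds):
                counts[site][cds[pos:pos + 3]] += 1
    return counts


cds = "ATGGCTAAATTTGGGTAA"  # ATG GCT AAA TTT GGG TAA
counts = site_codon_counts(cds, read_starts=[-9, -9, -6, -5])
print(counts["P"])
```

In practice the offset depends on read length and library preparation, so it would normally be estimated from the data rather than fixed as it is here.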

MorphOT: transport-based interpolation between EM maps with UCSF ChimeraX

Bioinformatics ◽

10.1093/bioinformatics/btaa1019 ◽

2020 ◽

Author(s):

Arthur Ecoffet ◽

Frédéric Poitevin ◽

Khanh Dao Duc

Keyword(s):

Optimal Transport ◽

Three Dimensional ◽

Linear Interpolation ◽

The Other ◽

Supplementary Information ◽

Conformational Heterogeneity ◽

Supplementary Data ◽

Image Dataset ◽

Standard Linear ◽

Unique Potential

Abstract Motivation Cryogenic electron microscopy (cryo-EM) offers the unique potential to capture conformational heterogeneity, by solving multiple three-dimensional classes that co-exist within a single cryo-EM image dataset. To investigate the extent and implications of such heterogeneity, we propose to use an optimal-transport-based metric to interpolate barycenters between EM maps and produce morphing trajectories. Results While standard linear interpolation mostly fails to produce realistic transitions, our method yields continuous trajectories that displace densities to morph one map into the other, instead of blending them. Availability and implementation Our method is implemented as a plug-in for ChimeraX called MorphOT, which allows the use of either CPU or GPU resources. The code is publicly available on GitHub (https://github.com/kdd-ubc/MorphOT.git), with documentation containing a tutorial and datasets. Supplementary information Supplementary data are available at Bioinformatics online.
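The difference between displacing density and blending it can be seen in a one-dimensional toy case. The sketch below is not the MorphOT algorithm, which works on three-dimensional maps. It assumes equal-weight point masses on a line, where the optimal transport plan for a convex cost simply pairs the sorted source points with the sorted target points. The function name is hypothetical.

```python
def displacement_interpolation(source, target, t):
    """Move equal-weight 1D point masses a fraction t of the way along the
    optimal transport plan, which in 1D pairs sorted source with sorted target."""
    if len(source) != len(target):
        raise ValueError("source and target need the same number of points")
    return [(1 - t) * x + t * y for x, y in zip(sorted(source), sorted(target))]


# Halfway between masses at {0, 1} and {10, 11}, the mass sits at {5, 6}.
print(displacement_interpolation([0.0, 1.0], [10.0, 11.0], 0.5))  # [5.0, 6.0]
```

Linear blending of the two densities at the same halfway point would instead leave half of the mass at {0, 1} and half at {10, 11}, with nothing in between, which is the unrealistic transition the abstract refers to.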

BioCommons: a robust java library for RNA structural bioinformatics

Bioinformatics ◽

10.1093/bioinformatics/btab069 ◽

2021 ◽

Author(s):

Tomasz Zok

Keyword(s):

Source Code ◽

Structural Bioinformatics ◽

Supplementary Information ◽

Supplementary Data ◽

Bioinformatic Tools ◽

Data Formats ◽

Central Repository ◽

Diverse Data ◽

2D And 3D ◽

Java Library

Abstract Motivation Biomolecular structures come in multiple representations and diverse data formats. Their incompatibility with the requirements of data analysis programs significantly hinders the analytics and the creation of new structure-oriented bioinformatic tools. Therefore, the need for robust libraries of data processing functions is still growing. Results BioCommons is an open-source Java library for structural bioinformatics. It contains many functions working with the 2D and 3D structures of biomolecules, with a particular emphasis on RNA. Availability and implementation The library is available in Maven Central Repository and its source code is hosted on GitHub: https://github.com/tzok/BioCommons Supplementary information Supplementary data are available at Bioinformatics online.
