G-Graph: An interactive genomic graph viewer

Abstract. Motivation. Effective and efficient exploration of numeric data and annotations as a function of genomic position requires specialized software. Results. We present G-Graph, an interactive genomic scatter plot viewer. G-Graph stacks or tiles multiple data series in one graph using different colors and markers. It displays gene annotation and other metadata, allows easy changes to the appearance of data series, implements stack-based undo functionality, and saves user-selected application views as image and PDF files. G-Graph delivers smooth and rapid scrolling and zooming even for datasets with millions of points and line segments. The primary target user is a researcher examining many copy number profiles to identify potentially deleterious variants. G-Graph runs under Linux, Mac OSX and Windows. Availability. https://github.com/docpaa/mumdex/ or https://mumdex.com/ggraph/

Spatial and temporal dynamics of Pacific capelin Mallotus catervarius in the Gulf of Alaska: implications for ecosystem-based fisheries management

Marine Ecology Progress Series ◽ 10.3354/meps13211 ◽ 2020 ◽ Vol 637 ◽ pp. 117-140 ◽ Cited By ~ 1

Author(s): DW McGowan ◽ ED Goldstein ◽ ML Arimitsu ◽ AL Deary ◽ O Ormseth ◽ ...

Keyword(s): Temporal Dynamics ◽ Marine Ecosystem ◽ Gulf Of Alaska ◽ Data Series ◽ Limited Information ◽ Important Species ◽ Multiple Data ◽ Spatial And Temporal Dynamics ◽ Spawning Areas ◽ Spatio Temporal

Pacific capelin Mallotus catervarius are planktivorous small pelagic fish that serve an intermediate trophic role in marine food webs. Due to the lack of a directed fishery or monitoring of capelin in the Northeast Pacific, limited information is available on their distribution and abundance, and how spatio-temporal fluctuations in capelin density affect their availability as prey. To provide information on life history, spatial patterns, and population dynamics of capelin in the Gulf of Alaska (GOA), we modeled distributions of spawning habitat and larval dispersal, and synthesized spatially indexed data from multiple independent sources from 1996 to 2016. Potential capelin spawning areas were broadly distributed across the GOA. Models of larval drift show the GOA’s advective circulation patterns disperse capelin larvae over the continental shelf and upper slope, indicating potential connections between spawning areas and observed offshore distributions that are influenced by the location and timing of spawning. Spatial overlap in composite distributions of larval and age-1+ fish was used to identify core areas where capelin consistently occur and concentrate. Capelin primarily occupy shelf waters near the Kodiak Archipelago, and are patchily distributed across the GOA shelf and inshore waters. Interannual variations in abundance along with spatio-temporal differences in density indicate that the availability of capelin to predators and monitoring surveys is highly variable in the GOA. We demonstrate that the limitations of individual data series can be compensated for by integrating multiple data sources to monitor fluctuations in distributions and abundance trends of an ecologically important species across a large marine ecosystem.

Biomass dynamic model for multiple data series: An improved approach for the management of the red grouper (Epinephelus morio) fishery of the Campeche Bank, Mexico

Regional Studies in Marine Science ◽ 10.1016/j.rsma.2021.101962 ◽ 2021 ◽ pp. 101962

Author(s): Olivia Echazabal-Salazar ◽ Enrique Morales-Bojórquez ◽ Francisco Arreguín-Sánchez

Keyword(s): Dynamic Model ◽ Data Series ◽ Multiple Data ◽ Epinephelus Morio ◽ Red Grouper ◽ Biomass Dynamic

Building online genomics applications using BioPyramid

10.1101/243378 ◽ 2018

Author(s): Liam Stephenson ◽ Yoshua Wakeham ◽ Nick Seidenman ◽ Jarny Choi

Keyword(s): Gene Annotation ◽ Number Of Components ◽ Link Type ◽ Data Portal ◽ Python Package

Abstract. BioPyramid is a Python package that can serve as a scaffold for building an online genomics application. BioPyramid contains a number of components designed to reduce the time and effort of building such an application from scratch, including gene annotation, dataset models and visualisation tools. The user can rapidly deploy a data portal with the example dataset included, and start customising components as required. BioPyramid is implemented in Python and JavaScript and is freely available at http://github.com/jarny/biopyramid.

immuneSIM: tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking

10.1101/759795 ◽ 2019 ◽ Cited By ~ 3

Author(s): Cédric R. Weber ◽ Rahmad Akbar ◽ Alexander Yermanos ◽ Milena Pavlović ◽ Igor Snapkov ◽ ...

Keyword(s): T Cell ◽ T Cell Receptor ◽ Network Architecture ◽ Gene Annotation ◽ Sequence Similarity ◽ Cell Receptor ◽ Germline Gene ◽ Immune Receptor ◽ Link Type ◽ Estimation Sequence

Abstract. Summary. B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences. immuneSIM enables the tuning of the following immune receptor features: (i) species and chain type (BCR, TCR, single, paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation, and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis, and machine learning methods for motif detection. Availability. The package is available via https://github.com/GreiffLab/immuneSIM and will also be available at CRAN (submitted). Documentation is hosted online.

Megadepth: efficient coverage quantification for BigWigs and BAMs

10.1101/2020.12.17.423317 ◽ 2020

Author(s): Christopher Wilks ◽ Omar Ahmed ◽ Daniel N. Baker ◽ David Zhang ◽ Leonardo Collado-Torres ◽ ...

Keyword(s): Gene Annotation ◽ Command Line ◽ Bioconductor Package ◽ Input File ◽ Link Type ◽ Command Line Tool

Abstract. Motivation. A common way to summarize sequencing datasets is to quantify data lying within genes or other genomic intervals. This can be slow and can require different tools for different input file types. Results. Megadepth is a fast tool for quantifying alignments and coverage for BigWig and BAM/CRAM input files, using substantially less memory than the next-fastest competitor. Megadepth can summarize coverage within all disjoint intervals of the Gencode V35 gene annotation for more than 19,000 GTExV8 BigWig files in approximately one hour using 32 threads. Megadepth is available both as a command-line tool and as an R/Bioconductor package providing much faster quantification compared to the rtracklayer package. Availability. https://github.com/ChristopherWilks/megadepth, https://bioconductor.org/packages/
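To make "quantifying coverage within genomic intervals" concrete, here is a minimal Python sketch of the general idea: summing per-base coverage over half-open intervals with a prefix-sum array. It is an illustration only and is not Megadepth's implementation or interface; the function name `interval_coverage_sums` and the toy coverage values are assumptions made for this example.

```python
from itertools import accumulate


def interval_coverage_sums(coverage, intervals):
    """Sum per-base coverage inside each half-open interval [start, end).

    Hypothetical helper for illustration; not part of Megadepth.
    """
    # prefix[i] holds the total coverage of bases 0 .. i-1,
    # so any interval sum is a single subtraction.
    prefix = [0] + list(accumulate(coverage))
    return [prefix[end] - prefix[start] for start, end in intervals]


# Toy per-base coverage for an 8-base region (made-up values).
coverage = [0, 2, 3, 3, 1, 0, 4, 4]
# Two disjoint intervals, as in a flattened gene annotation.
intervals = [(1, 4), (4, 8)]
print(interval_coverage_sums(coverage, intervals))
```

With the prefix sums built once, each additional interval costs constant time, which is one reason interval summaries over a whole annotation can be computed quickly.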

Complete genome sequence and probiotic properties of Lactococcus petauri LZys1 isolated from healthy human gut

Journal of Medical Microbiology ◽ 10.1099/jmm.0.001397 ◽ 2021 ◽ Vol 70 (8)

Author(s): Ouyang Li ◽ Huijian Zhang ◽ Wenjing Wang ◽ Yuxin Liang ◽ Wenbi Chen ◽ ...

Keyword(s): Lactic Acid ◽ Complete Genome ◽ Type Species ◽ Gene Annotation ◽ Probiotic Properties ◽ Human Gut ◽ Healthy Human ◽ Significant Similarity ◽ Content Type ◽ Link Type

Introduction. Lactococcus petauri LZys1 (L. petauri LZys1) is a lactic acid bacterium (LAB) that was initially isolated from healthy human gut. Hypothesis/Gap Statement. It was previously anticipated that L. petauri LZys1 has potential probiotic properties. The genetic structure and the regulation functions of L. petauri LZys1 need to be better characterized. Aim. The aim of this study was to detect the probiotic properties of L. petauri LZys1 and to reveal the genome information related to its genetic adaptation and probiotic profiles. Methodology. Multiple in vitro experiments were carried out to evaluate its lactic acid-producing ability, resistance to pathogenic bacterial strains, auto-aggregation and co-aggregation ability, and so on. Additionally, complete genome sequencing, gene annotation, and probiotic-associated gene analysis were performed. Results. The complete genome of L. petauri LZys1 comprised 1 985 765 bp, with a DNA G+C content of 38.07 %, and contained 50 tRNA, seven rRNA, and four sRNA genes. A total of 1931 genes were classified into six functional categories by the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The neighbour-joining phylogenetic tree based on the whole genomes of L. petauri LZys1 and other probiotics demonstrated that L. petauri LZys1 has a significant similarity to Lactococcus garvieae. Functional genes, such as the atpB gene, were detected to elucidate the molecular mechanisms and biochemical processes underlying its potential probiotic properties. Conclusion. All the results described in this study, together with relevant information previously reported, make L. petauri LZys1 a very interesting potential strain to be considered as a prominent candidate for probiotic use.

Efficient Exploration of Long Data Series: A Data Event-driven HMI Concept

Communications in Computer and Information Science - HCI International 2020 - Posters ◽ 10.1007/978-3-030-50732-9_64 ◽ 2020 ◽ pp. 495-503

Author(s): Bertram Wortelen ◽ Viviane Herdel ◽ Oliver Pfeiffer ◽ Marie-Christin Harre ◽ Marcel Saager ◽ ...

Keyword(s): Data Series ◽ Event Driven ◽ Efficient Exploration

An automated approach for annual layer counting in ice cores

Climate of the Past ◽ 10.5194/cp-8-1881-2012 ◽ 2012 ◽ Vol 8 (6) ◽ pp. 1881-1895 ◽ Cited By ~ 29

Author(s): M. Winstrup ◽ A. M. Svensson ◽ S. O. Rasmussen ◽ O. Winther ◽ E. J. Steig ◽ ...

Keyword(s): Markov Models ◽ Ice Core ◽ Ice Cores ◽ Detection Algorithm ◽ Data Series ◽ Probabilistic Uncertainty ◽ Annual Layer ◽ Statistical Framework ◽ Multiple Data ◽ Paleoclimate Records

Abstract. A novel method for automated annual layer counting in seasonally-resolved paleoclimate records has been developed. It relies on algorithms from the statistical framework of hidden Markov models (HMMs), which originally was developed for use in machine speech recognition. The strength of the layer detection algorithm lies in the way it is able to imitate the manual procedures for annual layer counting, while being based on statistical criteria for annual layer identification. The most likely positions of multiple layer boundaries in a section of ice core data are determined simultaneously, and a probabilistic uncertainty estimate of the resulting layer count is provided, ensuring an objective treatment of ambiguous layers in the data. Furthermore, multiple data series can be incorporated and used simultaneously. In this study, the automated layer counting algorithm has been applied to two ice core records from Greenland: one displaying a distinct annual signal and one which is more challenging. The algorithm shows high skill in reproducing the results from manual layer counts, and the resulting timescale compares well to absolute-dated volcanic marker horizons where these exist.
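As a rough illustration of how a hidden Markov model can place several layer boundaries at once, the Python sketch below runs standard Viterbi decoding on a toy two-state model and counts each winter-to-summer transition as one annual layer. This is a generic textbook technique used here for illustration, not the authors' algorithm; the two states, the discretized low/high signal, and all probabilities are assumptions invented for the example.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a discrete HMM (log space)."""
    n_states = len(start_p)
    # Best log-probability of any path ending in each state after the first observation.
    score = [math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        score, pointers = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + math.log(trans_p[r][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][o]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def count_layers(path):
    """Count transitions from state 0 (winter) to state 1 (summer)."""
    return sum(1 for a, b in zip(path, path[1:]) if a == 0 and b == 1)


# Hypothetical model: state 0 = winter, state 1 = summer; observation 0 = low, 1 = high signal.
start_p = [0.5, 0.5]
trans_p = [[0.7, 0.3], [0.3, 0.7]]
emit_p = [[0.9, 0.1], [0.1, 0.9]]
obs = [0, 0, 1, 1, 0, 0, 1, 1, 0]

path = viterbi(obs, start_p, trans_p, emit_p)
print(path, count_layers(path))
```

Viterbi returns only the single most likely path; a probabilistic uncertainty estimate of the layer count, as described in the abstract, would need the full posterior over paths rather than this point estimate.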

Data fusion challenges in precision beekeeping: a review

10.22616/rrd.26.2020.037 ◽ 2020

Keyword(s): Data Fusion ◽ Sensor Fusion ◽ Input Data ◽ Data Series ◽ Time Data ◽ Related Data ◽ Multiple Parameters ◽ Multiple Data ◽ Fusion Methods ◽ Global Research

The objective of precision beekeeping is to minimize resource consumption and maximize the productivity of bees. This is achieved by detecting and predicting beehive states through monitoring of apiary- and beehive-related parameters such as temperature, weight, humidity, noise, vibrations, air pollution, wind, and precipitation. These parameters are collected as raw input data by multiple different sensing devices; the data are often imperfect and require correlating the time data series with one another. Currently, most research focuses on monitoring and processing each parameter separately, whereas combining multiple parameters produces more sophisticated information. Raw input data sets that complement one another can be pre-processed with data fusion methods to gain an understanding of the global research subject. There are multiple data fusion methods and classification models, distinguished by raw input data type or device usage; sensor-related data fusion is called sensor fusion. This paper analyses existing data fusion methods and processes in order to identify data fusion challenges and relate them to precision beekeeping objectives. The research was conducted over a period of 5 months, starting in October 2019, and was based on analysis and synthesis of the scientific literature. It concludes that the need to apply data fusion in precision beekeeping is determined by the global research objective, whereas the input data introduce the main challenges of data and sensor fusion, because their attributes correlate with the potential result.
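One elementary step in fusing sensor data is aligning two time series on shared timestamps before correlating them. The Python sketch below shows that step under simplifying assumptions: the sensor names, timestamps (in minutes), and readings are invented for illustration, and the paper itself does not prescribe this particular method.

```python
import math


def align_series(a, b):
    """Keep only the readings whose timestamps occur in both series."""
    common = sorted(set(a) & set(b))
    return [a[t] for t in common], [b[t] for t in common]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    spread_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x))
    spread_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return cov / (spread_x * spread_y)


# Hypothetical readings keyed by timestamp in minutes; the sensors sample at different times.
temperature = {0: 34.0, 10: 34.5, 20: 35.0, 30: 35.5}  # degrees Celsius
weight = {10: 41.0, 20: 42.0, 30: 43.0, 40: 44.0}      # kilograms

temps, weights = align_series(temperature, weight)
r = pearson(temps, weights)
print(temps, weights, round(r, 3))
```

Dropping unmatched timestamps is the simplest alignment policy; real deployments with imperfect data would more likely resample or interpolate so that readings are not discarded.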

G-OnRamp: Generating genome browsers to facilitate undergraduate-driven collaborative genome annotation

10.1101/781658 ◽ 2019

Author(s): Luke Sargent ◽ Yating Liu ◽ Wilson Leung ◽ Nathan T. Mortimer ◽ David Lopatto ◽ ...

Keyword(s): Genome Annotation ◽ Gene Annotation ◽ Sequence Similarity ◽ Gene Prediction ◽ Phenotypic Traits ◽ Wasp Species ◽ Major Barrier ◽ Link Type ◽ A Genome ◽ Genome Browsers

Abstract. Scientists are sequencing new genomes at an increasing rate with the goal of associating genome contents with phenotypic traits. After a new genome is sequenced and assembled, structural gene annotation is often the first step in analysis. Despite advances in computational gene prediction algorithms, most eukaryotic genomes still benefit from manual gene annotation. Undergraduates can become skilled annotators, and in the process learn both about genes/genomes and about how to utilize large datasets. Data visualizations provided by a genome browser are essential for manual gene annotation, enabling annotators to quickly evaluate multiple lines of evidence (e.g., sequence similarity, RNA-Seq, gene predictions, repeats). However, creating genome browsers requires extensive computational skills; lack of the expertise required remains a major barrier for many biomedical researchers and educators. To address these challenges, the Genomics Education Partnership (GEP; https://gep.wustl.edu/) has partnered with the Galaxy Project (https://galaxyproject.org) to develop G-OnRamp (http://g-onramp.org), a web-based platform for creating UCSC Assembly Hubs and JBrowse genome browsers. G-OnRamp can also convert a JBrowse instance into an Apollo instance for collaborative genome annotations in research and educational settings. G-OnRamp enables researchers to easily visualize their experimental results, educators to create Course-based Undergraduate Research Experiences (CUREs) centered on genome annotation, and students to participate in genomics research. Development of G-OnRamp was guided by extensive user feedback from in-person workshops. Sixty-five researchers and educators from over 40 institutions participated in these workshops, which produced over 20 genome browsers now available for research and education. For example, genome browsers for four parasitoid wasp species were used in a CURE engaging 142 students taught by 13 faculty members, producing a total of 192 gene models. G-OnRamp can be deployed on a personal computer or on cloud computing platforms, and the genome browsers produced can be transferred to the CyVerse Data Store for long-term access.