Metavisitor, a suite of Galaxy tools for simple and rapid detection and discovery of viruses in deep sequence data

2016 ◽  
Author(s):  
Guillaume Carissimo ◽  
Marius van den Beek ◽  
Juliana Pegoraro ◽  
Kenneth D Vernick ◽  
Christophe Antoniewski

Abstract: We present user-friendly and adaptable software to provide biologists, clinical researchers and possibly diagnostic clinicians with the ability to robustly detect and reconstruct viral genomes from complex deep sequence datasets. A set of modular bioinformatic tools and workflows was implemented as the Metavisitor package in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor can be used on our Mississippi server or installed on any Galaxy server instance; a pre-configured Metavisitor server image is also provided. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions.

Genes ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 1576
Author(s):  
Jin-Ok Lee ◽  
Minho Lee ◽  
Yeun-Jun Chung

Transfer RNA (tRNA), a key component of the translation machinery, plays critical roles in stress conditions and various diseases. While knowledge regarding the importance of tRNA function is increasing, its biological roles are still not well understood. There is currently no comprehensive database or web server providing the expression landscape of tRNAs across a variety of human tissues and diseases. Here, we constructed a user-friendly and interactive database, DBtRend, which provides a profile of mature tRNA expression across various biological conditions by reanalyzing human small RNA or microRNA sequencing data from the Cancer Genome Atlas (TCGA) and NCBI’s Gene Expression Omnibus (GEO). Users can explore not only the expression values of individual mature tRNAs in the human genome, but also those of isodecoders and isoacceptors based on our specific pipelines. DBtRend provides the expression patterns of tRNAs, the tRNAs differentially expressed between biological conditions, and information on samples or patients, tissue types, and molecular subtypes of cancers. The database is expected to help researchers interested in functional discoveries of tRNAs.


2021 ◽  
Author(s):  
Víctor García-Olivares ◽  
Adrián Muñoz-Barrera ◽  
José Miguel Lorenzo-Salazar ◽  
Carlos Zaragoza-Trello ◽  
Luis A. Rubio-Rodríguez ◽  
...  

Abstract: The mitochondrial genome (mtDNA) is of interest for a range of fields including evolutionary, forensic, and medical genetics. Human mitogenomes can be classified into evolutionarily related haplogroups that provide ancestral information and pedigree relationships. Because of this and the advent of high-throughput sequencing (HTS) technology, there is a diversity of bioinformatic tools for haplogroup classification. We present a benchmarking of the 11 most salient tools for human mtDNA classification using empirical whole-genome (WGS) and whole-exome (WES) short-read sequencing data from 36 unrelated donors. In addition, given its relevance, we also assess the best performing tool on third-generation long noisy read WGS data obtained with nanopore technology for a subset of the donors. We found that, for short-read WGS, most of the tools exhibit high accuracy for haplogroup classification irrespective of the input file used for the analysis. However, for short-read WES, Haplocheck and MixEmt were the most accurate tools. Based on the performance shown for WGS and WES, and the accompanying qualitative assessment, Haplocheck stands out as the most complete tool. For third-generation HTS data, we also showed that Haplocheck was able to accurately retrieve mtDNA haplogroups for all samples assessed, although only after following assembly-based approaches (based on either a reference-based assembly or a hybrid de novo assembly). Taken together, our results provide guidance for researchers to select the most suitable tool to conduct mtDNA analyses from HTS data.


2013 ◽  
Vol 35 (4) ◽  
pp. 342-347 ◽  
Author(s):  
Jeongsoo Lee ◽  
Dong-in Kim ◽  
June Hyun Park ◽  
Ik-Young Choi ◽  
Chanseok Shin

2019 ◽  
Author(s):  
Yuzhe Sun ◽  
Hefu Zhen ◽  
Mei Guo ◽  
Jingyu Ye ◽  
Zhili Liu ◽  
...  

Abstract: Exosomes are cell-derived lipid bilayer particles that are abundant in biological fluids. Exosomes are an emerging source of biomarkers for diagnosing various human diseases. Sequencing-based exosomal studies could provide a comprehensive view of exosomal RNA and protein. To extract these inclusions, exosomes must first be isolated from plasma. Several exosome isolation methods have been introduced since the discovery of exosomes. To promote the clinical application of exosomal inclusions, different isolation methods should be compared. We isolated exosomes from human plasma using two user-friendly, commercially available kits, SBI ExoQuick and QIAGEN exoRNeasy. Subsequently, small RNA sequencing was performed on two groups of isolated exosome samples and one group of plasma samples. No fundamental differences in exRNA yield between SC and EQ were found. In RNA profile analysis, the small RNA aligned reads, miRNA pattern, and sample clustering varied as a result of methodological differences. Small RNA isolated by ExoQuick presented better data quality and RNA profile than that isolated by exoRNeasy. This study compared sRNA sequencing data generated from two exosome isolation kits and provides a reference for future small RNA studies and biomarker prediction in human plasma exosomes.


Plants ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 267
Author(s):  
Axel J. Giudicatti ◽  
Ariel H. Tomassi ◽  
Pablo A. Manavella ◽  
Agustin L. Arce

MicroRNAs (miRNAs) are small regulatory RNAs involved in several processes in plants, ranging from development and stress responses to defense against pathogens. In order to accomplish their molecular functions, miRNAs are methylated and loaded into an ARGONAUTE (AGO) protein, most commonly AGO1, to stabilize and protect the molecule and to assemble a functional RNA-induced silencing complex (RISC). A specific machinery controls miRNA turnover to ensure the silencing release of targeted genes in given circumstances. The trimming and tailing of miRNAs are fundamental modifications related to their turnover and, hence, to their action. In order to gain a better understanding of these modifications, we analyzed Arabidopsis thaliana small RNA sequencing data from a diversity of mutants related to miRNA biogenesis, action, and turnover, and from different cellular fractions and immunoprecipitations. Besides confirming the effects of known players in these pathways, we found increased trimming and tailing in miRNA biogenesis mutants. More importantly, our analysis allowed us to reveal the importance of AGO1 loading, slicing activity, and cellular localization in trimming and tailing of miRNAs.


2020 ◽  
Author(s):  
Katarzyna Siudeja ◽  
Marius van den Beek ◽  
Nick Riddiford ◽  
Benjamin Boumard ◽  
Annabelle Wurmser ◽  
...  

Abstract: Transposable elements (TEs) play a significant role in evolution by contributing to genetic variation through germline insertional activity. However, how TEs act in somatic cells and tissues is not well understood. Here, we address the prevalence of transposition in a somatic tissue, exploiting the Drosophila midgut as a model system. Using whole-genome sequencing of in vivo clonally expanded gut tissue, we map hundreds of high-confidence somatic TE integration sites genome-wide. We show that somatic retrotransposon insertions are associated with inactivation of the tumor suppressor Notch, likely contributing to neoplasia formation. Moreover, by applying Oxford Nanopore long-read sequencing technology, as well as by mapping germline TE activity, we provide evidence suggesting tissue-specific differences in retrotransposition. By comparing somatic TE insertional activity with transcriptomic and small RNA sequencing data, we demonstrate that transposon mobility cannot be simply predicted by whole tissue TE expression levels or by small RNA pathway activity. Finally, we reveal that somatic TE insertions in the adult fly intestine are found preferentially in genic regions and open, transcriptionally active chromatin. Together, our findings provide clear evidence of ongoing somatic transposition in Drosophila and delineate previously unknown underlying features of somatic TE mobility in vivo.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12129
Author(s):  
Paul E. Oluniyi ◽  
Fehintola Ajogbasile ◽  
Judith Oguzie ◽  
Jessica Uwanibe ◽  
Adeyemi Kayode ◽  
...  

Next generation sequencing (NGS)-based studies have vastly increased our understanding of viral diversity. Viral sequence data obtained from NGS experiments are a rich source of information: these data can be used to study viral epidemiology, evolution, and transmission patterns, and can also inform drug and vaccine design. Viral genomes, however, represent a great challenge to bioinformatics due to their high mutation rates and the formation of quasispecies within the same infected host, bringing about the need for advanced bioinformatics tools to assemble consensus genomes that are well representative of the viral population circulating in individual patients. Many tools have been developed to preprocess sequencing reads, carry out de novo or reference-assisted assembly of viral genomes and assess the quality of the genomes obtained. Most of these tools, however, exist as standalone workflows and usually require huge computational resources. Here we present VGEA (Viral Genomes Easily Analyzed), a Snakemake workflow for analyzing RNA viral genomes. VGEA enables users to map sequencing reads to the human genome to remove human contaminants, split bam files into forward and reverse reads, carry out de novo assembly of forward and reverse reads to generate contigs, pre-process reads for quality and contamination, map reads to a reference tailored to the sample using corrected contigs supplemented by the user’s choice of reference sequences and evaluate/compare genome assemblies. We designed a project with the aim of creating a flexible, easy-to-use and all-in-one pipeline from existing/stand-alone bioinformatics tools for viral genome analysis that can be deployed on a personal computer.
VGEA was built on the Snakemake workflow management system and utilizes existing tools for each step: fastp (Chen et al., 2018) for read trimming and read-level quality control, BWA (Li & Durbin, 2009) for mapping sequencing reads to the human reference genome, SAMtools (Li et al., 2009) for extracting unmapped reads and also for splitting bam files into fastq files, IVA (Hunt et al., 2015) for de novo assembly to generate contigs, shiver (Wymant et al., 2018) to pre-process reads for quality and contamination, then map to a reference tailored to the sample using corrected contigs supplemented with the user’s choice of existing reference sequences, SeqKit (Shen et al., 2016) for cleaning shiver assembly for QUAST, QUAST (Gurevich et al., 2013) to evaluate/assess the quality of genome assemblies and MultiQC (Ewels et al., 2016) for aggregation of the results from fastp, BWA and QUAST. Our pipeline was successfully tested and validated with SARS-CoV-2 (n = 20), HIV-1 (n = 20) and Lassa Virus (n = 20) datasets all of which have been made publicly available. VGEA is freely available on GitHub at: https://github.com/pauloluniyi/VGEA under the GNU General Public License.
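
The tool chain described above can be summarized as an ordered list. The following is a minimal Python sketch, not VGEA's actual Snakemake rules: the `Stage` type, the `VGEA_STAGES` name and the `describe` helper are hypothetical, and the order simply follows the sequence in which the text lists the tools, which may differ from the real execution order.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    """One pipeline step: the tool used and what it is used for."""
    tool: str
    purpose: str


# Illustrative only: tools and purposes are taken from the description above,
# listed in the order the text mentions them (an assumption, not the real DAG).
VGEA_STAGES: List[Stage] = [
    Stage("fastp", "read trimming and read-level quality control"),
    Stage("BWA", "map sequencing reads to the human reference genome"),
    Stage("SAMtools", "extract unmapped reads and split bam files into fastq files"),
    Stage("IVA", "de novo assembly to generate contigs"),
    Stage("shiver", "pre-process reads, then map to a sample-tailored reference"),
    Stage("SeqKit", "clean the shiver assembly for QUAST"),
    Stage("QUAST", "assess the quality of genome assemblies"),
    Stage("MultiQC", "aggregate results from fastp, BWA and QUAST"),
]


def describe(stages: List[Stage]) -> List[str]:
    """Return one numbered, human-readable line per stage."""
    return [f"{i}. {s.tool}: {s.purpose}" for i, s in enumerate(stages, start=1)]


if __name__ == "__main__":
    print("\n".join(describe(VGEA_STAGES)))
```

Running the script prints the eight steps as a numbered plan, which can be a convenient checklist when comparing the description with the rules in the repository.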


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Ting Hon ◽  
Kristin Mars ◽  
Greg Young ◽  
Yu-Chih Tsai ◽  
Joseph W. Karalius ◽  
...  

Abstract: The PacBio® HiFi sequencing method yields highly accurate long-read sequencing datasets with read lengths averaging 10–25 kb and accuracies greater than 99.5%. These accurate long reads can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes. Currently, there is a need for sample data sets both to evaluate the benefits of these long accurate reads and to develop bioinformatic tools including genome assemblers, variant callers, and haplotyping algorithms. We present deep coverage HiFi datasets for five complex samples including the two inbred model genomes Mus musculus and Zea mays, as well as two complex genomes, octoploid Fragaria × ananassa and the diploid anuran Rana muscosa. Additionally, we release sequence data from a mock metagenome community. The datasets reported here can be used without restriction to develop new algorithms and explore complex genome structure and evolution. Data were generated on the PacBio Sequel II System.
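
For readers more used to Phred quality scores, the 99.5% accuracy figure quoted above can be converted with the standard relation Q = -10 · log10(1 - accuracy). The sketch below is only a worked illustration of that arithmetic; the function name `phred_from_accuracy` is hypothetical and nothing here comes from the dataset paper itself.

```python
import math


def phred_from_accuracy(accuracy: float) -> float:
    """Convert a per-read accuracy in [0, 1) to a Phred-scaled quality value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    # Phred definition: Q = -10 * log10(error probability).
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    # 99.5% accuracy corresponds to an error rate of 0.5%.
    print(f"{phred_from_accuracy(0.995):.1f}")  # prints 23.0
```

Under this relation, an accuracy above 99.5% corresponds to a quality value above roughly Q23.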


2020 ◽  
Vol 522 (3) ◽  
pp. 776-782
Author(s):  
Wei-Hao Lee ◽  
Kai-Pu Chen ◽  
Kai Wang ◽  
Hsuan-Cheng Huang ◽  
Hsueh-Fen Juan
