Precise annotation of Drosophila mitochondrial genomes leads to insights into AT-rich regions

In the present study, we precisely annotated the mitochondrial (mt) genomes of Drosophila melanogaster, D. simulans, D. grimshawi, and Bactrocera oleae by pan RNA-seq analysis. Our new annotations corrected or modified some of the previous annotations, and two important findings were reported for the first time: the discovery of conserved polyA(+) and polyA(-) motifs in the control regions (CRs) of insect mt genomes, and the addition of CCAs to the 3' ends of two antisense tRNAs in the D. melanogaster mt genome. Using PacBio cDNA-seq data from D. simulans, we precisely annotated the Transcription Initiation Sites (TISs) of the mt Heavy and Light strands in Drosophila mt genomes and reported that the polyA(+) and polyA(-) motifs in the CRs are associated with TISs. The discovery of the conserved polyA(+) and polyA(-) motifs provides insights into the many polyA and polyT sequences in the CRs of insect mt genomes, which should help reveal mt transcription and its regulation in invertebrates. In addition, we provided a high-quality, well-curated and precisely annotated D. simulans mt genome (GenBank: MN611461), which should be included in the NCBI RefSeq database to replace the current reference genome NC_005781.


The Blood And Clot Thrombectomy Registry And Collaboration (BACTRAC) protocol: novel method for evaluating human stroke

Journal of NeuroInterventional Surgery ◽

10.1136/neurintsurg-2018-014118 ◽

2018 ◽

Vol 11 (3) ◽

pp. 265-270 ◽

Cited By ~ 11

Author(s):

Justin F Fraser ◽

Lisa A Collier ◽

Amy A Gorman ◽

Sarah R Martha ◽

Kathleen E Salmeron ◽

...

Keyword(s):

Ischemic Stroke ◽

Animal Models ◽

Mechanical Thrombectomy ◽

Tissue Banking ◽

Human Condition ◽

Molecular Response ◽

High Quality ◽

Link Type ◽

Arterial Blood ◽

First Time

Background: Ischemic stroke research faces difficulties in translating pathology between animal models and human patients to develop treatments. Mechanical thrombectomy, for the first time, offers a momentary window into the changes occurring in ischemia. We developed a tissue banking protocol to capture intracranial thrombi and the blood immediately proximal and distal to it. Objective: To develop and share a reproducible protocol to bank these specimens for future analysis. Methods: We established a protocol approved by the institutional review board for tissue processing during thrombectomy (www.clinicaltrials.gov NCT03153683). The protocol was a joint clinical/basic science effort among multiple laboratories and the NeuroInterventional Radiology service line. We constructed a workspace in the angiography suite, and developed a step-by-step process for specimen retrieval and processing. Results: Our protocol successfully yielded samples for analysis in all but one case. In our preliminary dataset, the process produced adequate amounts of tissue from distal blood, proximal blood, and thrombi for gene expression and proteomics analyses. We describe the tissue banking protocol, and highlight training protocols and mechanics of on-call research staffing. In addition, preliminary integrity analyses demonstrated high-quality yields for RNA and protein. Conclusions: We have developed a novel tissue banking protocol using mechanical thrombectomy to capture thrombus along with arterial blood proximal and distal to it. The protocol provides high-quality specimens, facilitating analysis of the initial molecular response to ischemic stroke in the human condition for the first time. This approach will permit reverse translation to animal models for treatment development.


The value of genotype-specific reference for transcriptome analyses

10.1101/2021.09.14.460213 ◽

2021 ◽

Author(s):

Wenbin Guo ◽

Max Coulter ◽

Robbie Waugh ◽

Runxuan Zhang

Keyword(s):

Alternative Splicing ◽

Reference Genome ◽

Transcriptome Assembly ◽

Specific Reference ◽

Rna Seq ◽

High Quality ◽

Common Reference ◽

Transcript Quantification ◽

Gene Level ◽

Reference Transcript

High-quality transcriptome assembly using short reads from RNA-seq data still relies heavily upon reference-based approaches, of which the primary step is to align RNA-seq reads to a single reference genome of haploid sequence. However, it is increasingly apparent that while different genotypes within a species share core genes, they also contain variable numbers of specific genes that are only present in a subset of individuals. Using a common reference may thus lead to a loss of genotype-specific information in the assembled transcript dataset and the generation of erroneous, incomplete or misleading transcriptomics analysis results. With the recent development of pan-genome information in many species, it is important that we understand the limitations of single-genotype references for transcriptomics analysis. In this study, we quantitatively evaluated the advantages of using genotype-specific reference genomes for transcriptome assembly and analysis using cultivated barley as a model. We mapped barley cultivar Barke RNA-seq reads to the Barke genome and to the cultivar Morex genome (the common barley genome reference) to construct a genotype-specific Reference Transcript Dataset (sRTD) and a common Reference Transcript Dataset (cRTD), respectively. We compared the two RTDs according to their transcript diversity, transcript sequence and structure similarity, and the accuracy they provided for transcript quantification and differential expression analysis. Our evaluation shows that the sRTD has a significantly higher diversity of transcripts and alternative splicing events. Despite using a high-quality reference genome for assembly of the cRTD, we miss ca. 40% of the transcripts present in the sRTD, and the cRTD has only ca. 70% true assemblies. We found that the sRTD is more accurate for transcript quantification as well as differential expression and differential alternative splicing analysis. However, gene-level quantification and comparative expression analysis are less affected by the source RTD, which indicates that analysing transcriptomic data at the gene level may be a reasonable compromise when a high-quality genotype-specific reference is not available.


Evidence of multiple genome duplication events in Mytilus evolution

10.1101/2021.08.17.456601 ◽

2021 ◽

Author(s):

Ana Corrochano-Fraile ◽

Andrew Davie ◽

Stefano Carboni ◽

Michaël Bekaert

Keyword(s):

Whole Genome Duplication ◽

Evolutionary History ◽

Reference Genome ◽

Genome Duplication ◽

Correct Identification ◽

High Quality ◽

Sequencing Technologies ◽

Abiotic Stressors ◽

Duplication Events ◽

First Time

Molluscs remain a significantly under-represented taxon amongst available genomic resources, despite being the second-largest animal phylum and despite recent advances in genome sequencing technologies and genome assembly techniques. With the present work, we want to contribute to the growing efforts to fill this gap by presenting a new high-quality reference genome for Mytilus edulis and investigating the evolutionary history within the Mytilidae family, in relation to other species in the class Bivalvia. Here we present, for the first time, the discovery of multiple whole genome duplication events in the Mytilidae family and, more generally, in the class Bivalvia. In addition, the calculation of evolution rates for three species of the Mytilinae subfamily sheds new light on the evolution of these taxa and highlights key orthologs of interest for the study of Mytilus species divergences. The reference genome presented here will enable the correct identification of molecular markers for evolutionary, population genetics, and conservation studies. Mytilidae have the capability to become a model shellfish for climate change adaptation using genome-enabled systems biology and multi-disciplinary studies of interactions between abiotic stressors, pathogen attacks, and aquaculture practices.


Siberian sturgeon multi-tissue reference transcriptome database

Database ◽

10.1093/database/baaa082 ◽

2020 ◽

Vol 2020 ◽

Author(s):

Christophe Klopp ◽

Cédric Cabau ◽

Gonzalo Greif ◽

André Lasalle ◽

Santiago Di Landro ◽

...

Keyword(s):

Reference Genome ◽

Transcriptome Assembly ◽

Fish Farming ◽

Siberian Sturgeon ◽

Supplementary Information ◽

Rna Seq ◽

High Quality ◽

Reference Transcriptome ◽

Functional Studies ◽

Transcriptome Database

Motivation: Siberian sturgeon is a long-lived and late-maturing fish farmed for caviar production in 50 countries. Functional genomics makes it possible to find genes of interest for fish farming. In the absence of a reference genome, a reference transcriptome is very useful for sequencing-based functional studies. Results: We present here a high-quality transcriptome assembly database built using RNA-seq reads from brain, pituitary, gonadal, liver, stomach, kidney, anterior kidney, heart, embryonic and pre-larval tissues. It will facilitate crucial research on topics such as puberty, reproduction, growth, food intake and immunology. This database represents a major contribution to the publicly available sturgeon transcriptome reference datasets. Availability: The database is publicly available at http://siberiansturgeontissuedb.sigenae.org Supplementary information: Supplementary data are available at Database online.


Slinker: Visualising novel splicing events in RNA-Seq data

F1000Research ◽

10.12688/f1000research.74836.1 ◽

2021 ◽

Vol 10 ◽

pp. 1255

Author(s):

Breon Schmidt ◽

Marek Cmero ◽

Paul Ekert ◽

Nadia Davidson ◽

Alicia Oshlack

Keyword(s):

Rare Disease ◽

Human Genome ◽

Rna Sequencing ◽

Reference Genome ◽

Data Driven ◽

Rna Seq ◽

Bioinformatics Pipeline ◽

Link Type ◽

Muscular Disorders ◽

Data Driven Approach

Visualisation of the transcriptome relative to a reference genome is fraught with sparsity. This is due to RNA sequencing (RNA-Seq) reads being predominantly mapped to exons that account for just under 3% of the human genome. Recently, we have used exon-only references, superTranscripts, to improve visualisation of aligned RNA-Seq data through the omission of supposedly unexpressed regions such as introns. However, variation within these regions can lead to novel splicing events that may drive a pathogenic phenotype. In these cases, the loss of information in only retaining annotated exons presents significant drawbacks. Here we present Slinker, a bioinformatics pipeline written in Python and Bpipe that uses a data-driven approach to assemble sample-specific superTranscripts. At its core, Slinker uses Stringtie2 to assemble transcripts with any sequence across any gene. This assembly is merged with reference transcripts and converted to a superTranscript, from which rich visualisations are made through Plotly with associated annotation and coverage information. Slinker was validated on five novel splicing events of rare disease samples from a cohort of primary muscular disorders. In addition, Slinker was shown to be effective in visualising deletion events within transcriptomes of tumour samples in the important leukemia gene, IKZF1. Slinker offers a succinct visualisation of RNA-Seq alignments across typically sparse regions and is freely available on GitHub.


High-Quality Reference Genome Sequence for the Oomycete Vegetable Pathogen Phytophthora capsici Strain LT1534

Microbiology Resource Announcements ◽

10.1128/mra.00295-21 ◽

2021 ◽

Vol 10 (21) ◽

Author(s):

Jason E. Stajich ◽

Andrea L. Vu ◽

Howard S. Judelson ◽

Gregory M. Vogel ◽

Michael A. Gore ◽

...

Keyword(s):

Genome Sequence ◽

Genome Assembly ◽

Transcriptome Sequencing ◽

Reference Genome ◽

Phytophthora Capsici ◽

Rna Seq ◽

High Quality ◽

Content Type ◽

Wide Range ◽

Illumina Data

The oomycete Phytophthora capsici is a destructive pathogen of a wide range of vegetable hosts, especially peppers and cucurbits. A 94.17-Mb genome assembly was constructed using PacBio and Illumina data and annotated with support from transcriptome sequencing (RNA-Seq) reads.


BaRTv1.0: an improved barley reference transcript dataset to determine accurate changes in the barley transcriptome using RNA-seq

10.1101/638106 ◽

2019 ◽

Cited By ~ 2

Author(s):

Paulo Rapazote-Flores ◽

Micha Bayer ◽

Linda Milne ◽

Claus-Dieter Mayer ◽

John Fuller ◽

...

Keyword(s):

Gene Expression ◽

Reference Genome ◽

Splice Junction ◽

Rna Seq ◽

Rt Pcr ◽

High Quality ◽

Transcript Quantification ◽

Reference Transcript ◽

Alternatively Spliced ◽

Comprehensive Reference

Background: Time-consuming computational assembly and quantification of gene expression and splicing analysis from RNA-seq data vary considerably. Recent fast non-alignment tools such as Kallisto and Salmon overcome these problems, but these tools require a high-quality, comprehensive reference transcript dataset (RTD), which is rarely available in plants. Results: A high-quality, non-redundant barley gene RTD and database (Barley Reference Transcripts – BaRTv1.0) has been generated. BaRTv1.0 was constructed from a range of tissues, cultivars and abiotic treatments, with transcripts assembled and aligned to the barley cv. Morex reference genome (Mascher et al., 2017). Full-length cDNAs from the barley variety Haruna nijo (Matsumoto et al., 2011) determined transcript coverage, and high-resolution RT-PCR validated alternatively spliced (AS) transcripts of 86 genes in five different organs and tissues. These methods were used as benchmarks to select an optimal barley RTD. BaRTv1.0-Quantification of Alternatively Spliced Isoforms (QUASI) was also made to overcome inaccurate quantification due to variation in the 5’ and 3’ UTR ends of transcripts. BaRTv1.0-QUASI was used for accurate transcript quantification of RNA-seq data of five barley organs/tissues. This analysis identified 20,972 significant differentially expressed genes, 2,791 differentially alternatively spliced genes and 2,768 transcripts with differential transcript usage. Conclusion: A high-confidence barley reference transcript dataset consisting of 60,444 genes with 177,240 transcripts has been generated. Compared to current barley transcripts, BaRTv1.0 transcripts are generally longer, have less fragmentation and improved gene models that are well supported by splice junction reads. Precise transcript quantification using BaRTv1.0 allows routine analysis of gene expression and AS.
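
The gene and transcript totals quoted in the abstract above imply an average number of transcripts per gene. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative assumptions and do not come from the BaRTv1.0 paper or its tooling.

```python
def transcripts_per_gene(n_transcripts: int, n_genes: int) -> float:
    """Return the mean number of transcripts per gene."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return n_transcripts / n_genes


# Totals as stated in the abstract: 60,444 genes with 177,240 transcripts.
BART_GENES = 60_444
BART_TRANSCRIPTS = 177_240

mean_isoforms = transcripts_per_gene(BART_TRANSCRIPTS, BART_GENES)
print(f"{mean_isoforms:.2f} transcripts per gene on average")
```

This is only a derived summary figure; it says nothing about how transcripts are distributed across individual genes.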


A new long-read dog assembly uncovers thousands of exons and functional elements missing in the previous reference

10.1101/2020.07.02.185108 ◽

2020 ◽

Cited By ~ 2

Author(s):

Chao Wang ◽

Ola Wallerman ◽

Maja-Louise Arendt ◽

Elisabeth Sundström ◽

Åsa Karlsson ◽

...

Keyword(s):

Reference Genome ◽

Cancer Genes ◽

Rna Seq ◽

Structural Variants ◽

Functional Elements ◽

High Quality ◽

Protein Coding ◽

Protein Coding Genes ◽

Long Read ◽

Genomic Regions

Here we present a new high-quality canine reference genome with gap number reduced 41-fold, from 23,836 to 585. Analysis of existing and novel data (RNA-seq, miRNA-seq and ATAC-seq) revealed that a large proportion of these gaps harboured previously hidden elements, including genes, promoters and miRNAs. Short-read dark regions were detected, and genomic regions completed, including the DLA, TCR and 366 cancer genes. 10x sequencing of 27 dogs uncovered a total of 22.1 million SNPs, indels and larger structural variants (SVs). Of these, 1.4% overlap with protein-coding genes and could provide a source of normal or aberrant phenotypic modifications.
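
The fold reduction quoted in the abstract above can be checked directly from the two gap counts. The following is a minimal sketch of that arithmetic; the function name is an illustrative assumption and is not part of the paper.

```python
def fold_reduction(before: int, after: int) -> float:
    """Return how many times smaller `after` is than `before`."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


# Gap counts as stated in the abstract.
GAPS_PREVIOUS = 23_836
GAPS_NEW = 585

fold = fold_reduction(GAPS_PREVIOUS, GAPS_NEW)
print(f"Gap count reduced about {fold:.1f}-fold")
```

The ratio rounds to the 41-fold figure given in the abstract.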


Disability as a social phenomenon

10.33920/med-03-2007-01 ◽

2020 ◽

pp. 22-38

Author(s):

Natalia Guseva ◽

Vitaliy Berdutin

Keyword(s):

Social Protection ◽

Russian Society ◽

Stage Model ◽

High Quality ◽

Rehabilitation Programs ◽

Current State ◽

The Social ◽

Consistent Solution ◽

First Time

At present, the problem of establishing disability is a point at issue in Russia. Despite the fact that medical criteria for disability are being developed very actively, high-quality methods for assessing social hallmarks are still lacking. Since disability is a phenomenon inherent in any society, each state forms a social and economic policy for people with disabilities in accordance with its level of development, priorities and opportunities. We have proposed a three-stage model, which includes a system for the consistent solution of the main tasks aimed at studying the causes and consequences of the problems encountered today in the social protection of citizens with health problems. The article shows why the existing approaches to the determination of disability and rehabilitation programs do not correspond to the current state of Russian society and why a decrease in the rate of persons recognized as disabled for the first time does not indicate an improvement in the health of the population. The authors proposed a number of measures with a view to correcting the situation according to the results of the study.


Faculty Opinions recommendation of Full-length transcriptome assembly from RNA-Seq data without a reference genome.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.13296969.14657090 ◽

2011 ◽

Author(s):

Steven Salzberg ◽

Michael Schatz

Keyword(s):

Reference Genome ◽

Transcriptome Assembly ◽

Full Length ◽

Rna Seq
