Precise annotation of Drosophila mitochondrial genomes leads to insights into AT-rich regions

2021 ◽  
Author(s):  
Guangcai Liang ◽  
Jia Chang ◽  
Tung On Yau ◽  
Xin Li ◽  
Bingjun He ◽  
...  

In the present study, we performed precise annotation of the mitochondrial (mt) genomes of Drosophila melanogaster, D. simulans, D. grimshawi and Bactrocera oleae by pan RNA-seq analysis. Our new annotations corrected or modified some of the previous annotations, and two important findings were reported for the first time: the discovery of conserved polyA(+) and polyA(-) motifs in the control regions (CRs) of insect mt genomes, and the addition of CCAs to the 3' ends of two antisense tRNAs in the D. melanogaster mt genome. Using PacBio cDNA-seq data from D. simulans, we precisely annotated the Transcription Initiation Sites (TISs) of the mt Heavy and Light strands in Drosophila mt genomes and reported that the polyA(+) and polyA(-) motifs in the CRs are associated with TISs. The discovery of the conserved polyA(+) and polyA(-) motifs provides insights into the many polyA and polyT sequences in the CRs of insect mt genomes and helps to reveal mt transcription and its regulation in invertebrates. In addition, we provided a high-quality, well-curated and precisely annotated D. simulans mt genome (GenBank: MN611461), which should be included in the NCBI RefSeq database to replace the current reference genome NC_005781.

2018 ◽  
Vol 11 (3) ◽  
pp. 265-270 ◽  
Author(s):  
Justin F Fraser ◽  
Lisa A Collier ◽  
Amy A Gorman ◽  
Sarah R Martha ◽  
Kathleen E Salmeron ◽  
...  

Background: Ischemic stroke research faces difficulties in translating pathology between animal models and human patients to develop treatments. Mechanical thrombectomy, for the first time, offers a momentary window into the changes occurring in ischemia. We developed a tissue banking protocol to capture intracranial thrombi and the blood immediately proximal and distal to it. Objective: To develop and share a reproducible protocol to bank these specimens for future analysis. Methods: We established a protocol approved by the institutional review board for tissue processing during thrombectomy (www.clinicaltrials.gov NCT03153683). The protocol was a joint clinical/basic science effort among multiple laboratories and the NeuroInterventional Radiology service line. We constructed a workspace in the angiography suite, and developed a step-by-step process for specimen retrieval and processing. Results: Our protocol successfully yielded samples for analysis in all but one case. In our preliminary dataset, the process produced adequate amounts of tissue from distal blood, proximal blood, and thrombi for gene expression and proteomics analyses. We describe the tissue banking protocol, and highlight training protocols and mechanics of on-call research staffing. In addition, preliminary integrity analyses demonstrated high-quality yields for RNA and protein. Conclusions: We have developed a novel tissue banking protocol using mechanical thrombectomy to capture thrombus along with arterial blood proximal and distal to it. The protocol provides high-quality specimens, facilitating analysis of the initial molecular response to ischemic stroke in the human condition for the first time. This approach will permit reverse translation to animal models for treatment development.


2021 ◽  
Author(s):  
Wenbin Guo ◽  
Max Coulter ◽  
Robbie Waugh ◽  
Runxuan Zhang

High-quality transcriptome assembly using short reads from RNA-seq data still relies heavily upon reference-based approaches, of which the primary step is to align RNA-seq reads to a single reference genome of haploid sequence. However, it is increasingly apparent that while different genotypes within a species share core genes, they also contain variable numbers of specific genes that are only present in a subset of individuals. Using a common reference may thus lead to a loss of genotype-specific information in the assembled transcript dataset and the generation of erroneous, incomplete or misleading transcriptomics analysis results. With the recent development of pan-genome information in many species, it is important that we understand the limitations of single-genotype references for transcriptomics analysis. In this study, we quantitatively evaluated the advantages of using genotype-specific reference genomes for transcriptome assembly and analysis using cultivated barley as a model. We mapped barley cultivar Barke RNA-seq reads to the Barke genome and to the cultivar Morex genome (the common barley genome reference) to construct a genotype-specific Reference Transcript Dataset (sRTD) and a common Reference Transcript Dataset (cRTD), respectively. We compared the two RTDs according to their transcript diversity, transcript sequence and structure similarity, and the accuracy they provided for transcript quantification and differential expression analysis. Our evaluation shows that the sRTD has a significantly higher diversity of transcripts and alternative splicing events. Despite using a high-quality reference genome for assembly of the cRTD, we miss ca. 40% of the transcripts present in the sRTD, and the cRTD has only ca. 70% true assemblies. We found that the sRTD is more accurate for transcript quantification as well as for differential expression and differential alternative splicing analysis. However, gene-level quantification and comparative expression analysis are less affected by the source RTD, which indicates that analysing transcriptomic data at the gene level may be a reasonable compromise when a high-quality genotype-specific reference is not available.


2021 ◽  
Author(s):  
Ana Corrochano-Fraile ◽  
Andrew Davie ◽  
Stefano Carboni ◽  
Michaël Bekaert

Molluscs remain a significantly under-represented taxon amongst available genomic resources, despite being the second-largest animal phylum and despite recent advances in genome sequencing technologies and genome assembly techniques. With the present work, we want to contribute to the growing efforts to fill this gap by presenting a new high-quality reference genome for Mytilus edulis and investigating the evolutionary history within the Mytilidae family, in relation to other species in the class Bivalvia. Here we present, for the first time, the discovery of multiple whole genome duplication events in the Mytilidae family and, more generally, in the class Bivalvia. In addition, the calculation of evolution rates for three species of the Mytilinae subfamily sheds new light on the evolution of the taxa and highlights key orthologs of interest for the study of Mytilus species divergences. The reference genome presented here will enable the correct identification of molecular markers for evolutionary, population genetics, and conservation studies. Mytilidae have the capability to become a model shellfish for climate change adaptation using genome-enabled systems biology and multi-disciplinary studies of interactions between abiotic stressors, pathogen attacks, and aquaculture practices.


Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Christophe Klopp ◽  
Cédric Cabau ◽  
Gonzalo Greif ◽  
André Lasalle ◽  
Santiago Di Landro ◽  
...  

Motivation: Siberian sturgeon is a long-lived and late-maturing fish farmed for caviar production in 50 countries. Functional genomics enables the identification of genes of interest for fish farming. In the absence of a reference genome, a reference transcriptome is very useful for sequencing-based functional studies. Results: We present here a high-quality transcriptome assembly database built using RNA-seq reads coming from brain, pituitary, gonadal, liver, stomach, kidney, anterior kidney, heart, embryonic and pre-larval tissues. It will facilitate crucial research on topics such as puberty, reproduction, growth, food intake and immunology. This database represents a major contribution to the publicly available sturgeon transcriptome reference datasets. Availability: The database is publicly available at http://siberiansturgeontissuedb.sigenae.org Supplementary information: Supplementary data are available at Database online.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1255
Author(s):  
Breon Schmidt ◽  
Marek Cmero ◽  
Paul Ekert ◽  
Nadia Davidson ◽  
Alicia Oshlack

Visualisation of the transcriptome relative to a reference genome is fraught with sparsity. This is due to RNA sequencing (RNA-Seq) reads being predominantly mapped to exons that account for just under 3% of the human genome. Recently, we have used exon-only references, superTranscripts, to improve visualisation of aligned RNA-Seq data through the omission of supposedly unexpressed regions such as introns. However, variation within these regions can lead to novel splicing events that may drive a pathogenic phenotype. In these cases, the loss of information in only retaining annotated exons presents significant drawbacks. Here we present Slinker, a bioinformatics pipeline written in Python and Bpipe that uses a data-driven approach to assemble sample-specific superTranscripts. At its core, Slinker uses Stringtie2 to assemble transcripts with any sequence across any gene. This assembly is merged with reference transcripts and converted to a superTranscript, from which rich visualisations are made through Plotly with associated annotation and coverage information. Slinker was validated on five novel splicing events of rare disease samples from a cohort of primary muscular disorders. In addition, Slinker was shown to be effective in visualising deletion events within transcriptomes of tumour samples in the important leukemia gene, IKZF1. Slinker offers a succinct visualisation of RNA-Seq alignments across typically sparse regions and is freely available on GitHub.


2021 ◽  
Vol 10 (21) ◽  
Author(s):  
Jason E. Stajich ◽  
Andrea L. Vu ◽  
Howard S. Judelson ◽  
Gregory M. Vogel ◽  
Michael A. Gore ◽  
...  

The oomycete Phytophthora capsici is a destructive pathogen of a wide range of vegetable hosts, especially peppers and cucurbits. A 94.17-Mb genome assembly was constructed using PacBio and Illumina data and annotated with support from transcriptome sequencing (RNA-Seq) reads.


2019 ◽  
Author(s):  
Paulo Rapazote-Flores ◽  
Micha Bayer ◽  
Linda Milne ◽  
Claus-Dieter Mayer ◽  
John Fuller ◽  
...  

Background: Time-consuming computational assembly and quantification of gene expression and splicing analysis from RNA-seq data vary considerably. Recent fast non-alignment tools such as Kallisto and Salmon overcome these problems, but these tools require a high-quality, comprehensive reference transcript dataset (RTD), which is rarely available in plants. Results: A high-quality, non-redundant barley gene RTD and database (Barley Reference Transcripts – BaRTv1.0) has been generated. BaRTv1.0 was constructed from a range of tissues, cultivars and abiotic treatments, with transcripts assembled and aligned to the barley cv. Morex reference genome (Mascher et al., 2017). Full-length cDNAs from the barley variety Haruna nijo (Matsumoto et al., 2011) determined transcript coverage, and high-resolution RT-PCR validated alternatively spliced (AS) transcripts of 86 genes in five different organs and tissues. These methods were used as benchmarks to select an optimal barley RTD. BaRTv1.0-Quantification of Alternatively Spliced Isoforms (QUASI) was also made to overcome inaccurate quantification due to variation in the 5' and 3' UTR ends of transcripts. BaRTv1.0-QUASI was used for accurate transcript quantification of RNA-seq data of five barley organs/tissues. This analysis identified 20,972 significant differentially expressed genes, 2,791 differentially alternatively spliced genes and 2,768 transcripts with differential transcript usage. Conclusion: A high-confidence barley reference transcript dataset consisting of 60,444 genes with 177,240 transcripts has been generated. Compared to current barley transcripts, BaRTv1.0 transcripts are generally longer, have less fragmentation and improved gene models that are well supported by splice junction reads. Precise transcript quantification using BaRTv1.0 allows routine analysis of gene expression and AS.


Author(s):  
Chao Wang ◽  
Ola Wallerman ◽  
Maja-Louise Arendt ◽  
Elisabeth Sundström ◽  
Åsa Karlsson ◽  
...  

Here we present a new high-quality canine reference genome with gap number reduced 41-fold, from 23,836 to 585. Analysis of existing and novel data, RNA-seq, miRNA-seq and ATAC-seq, revealed a large proportion of these harboured previously hidden elements, including genes, promoters and miRNAs. Short-read dark regions were detected, and genomic regions completed, including the DLA, TCR and 366 cancer genes. 10x sequencing of 27 dogs uncovered a total of 22.1 million SNPs, Indels and larger structural variants (SVs). 1.4% overlap with protein coding genes and could provide a source of normal or aberrant phenotypic modifications.


2020 ◽  
pp. 22-38
Author(s):  
Natalia Guseva ◽  
Vitaliy Berdutin

At present, the problem of establishing disability is a point at issue in Russia. Despite the fact that medical criteria for disability are being developed very actively, high-quality methods for assessing social hallmarks are still lacking. Since disability is a phenomenon inherent in any society, each state forms a social and economic policy for people with disabilities in accordance with its level of development, priorities and opportunities. We have proposed a three-stage model, which includes a system for the consistent solution of the main tasks aimed at studying the causes and consequences of the problems encountered today in the social protection of citizens with health problems. The article shows why the existing approaches to the determination of disability and rehabilitation programs do not correspond to the current state of Russian society and why a decrease in the rate of persons recognized as disabled for the first time does not indicate an improvement in the health of the population. The authors proposed a number of measures with a view to correcting the situation according to the results of the study.

