Strategies for quantitative RNA-seq analyses among closely related species

2018 ◽  
Author(s):  
Swati Parekh ◽  
Beate Vieth ◽  
Christoph Ziegenhain ◽  
Wolfgang Enard ◽  
Ines Hellmann

Abstract With the growing appreciation for the role of regulatory differences in evolution, researchers need to reliably quantify expression levels within and among species. However, for non-model organisms, genome assemblies and annotations are often unavailable or of inferior quality, biasing the inference of expression changes to an unknown extent. Here, we explore the possibility of mapping RNA-seq reads from diverged species to one high-quality reference genome. As a test case, we used a small primate phylogeny ranging from human to marmoset, spanning 12% nucleotide divergence. To distinguish the effects of sequence divergence and genome quality, we used in silico evolved genomes and existing genomes to simulate RNA-seq reads. These were then mapped to the genome of origin (self-mapping) as well as to one common reference (cross-mapping) to infer the quantification biases. We find that the bias due to cross-mapping is small for the closely related great apes (≤ 4% divergence), and that cross-mapping is preferable to self-mapping given current genome qualities. For closely related species, cross-mapping provides easy access, high power and a well-controlled false discovery rate for both the analysis of intra-species expression differences and the detection of relative differences between species. If divergence increases so that a substantial fraction of reads exceeds the limits of the mapper used, we find that gene-specific corrections and effect-size cutoffs can limit the bias before self-mapping becomes unavoidable. In summary, we systematically quantify, for the first time, biases in cross-species RNA-seq studies, providing guidance on best practices for these important evolutionary studies.
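The comparison of self-mapping and cross-mapping described in this abstract can be illustrated with a small sketch. The abstract does not state which bias metric the authors used, so the per-gene log2 ratio below, the pseudocount, the cutoff value, and the function names `log2_mapping_bias` and `flag_biased_genes` are all assumptions made for illustration, and the counts are toy numbers.

```python
import math


def log2_mapping_bias(self_counts, cross_counts, pseudocount=1.0):
    """Per-gene log2 ratio of cross-mapped to self-mapped read counts.

    Assumed metric, not taken from the paper: 0 means no bias, negative
    values mean reads were lost when mapping to the common reference.
    """
    bias = {}
    for gene, self_n in self_counts.items():
        cross_n = cross_counts.get(gene, 0)
        bias[gene] = math.log2((cross_n + pseudocount) / (self_n + pseudocount))
    return bias


def flag_biased_genes(bias, cutoff=1.0):
    """Return genes whose absolute log2 bias reaches a hypothetical effect-size cutoff."""
    return sorted(gene for gene, value in bias.items() if abs(value) >= cutoff)


# Toy counts for three hypothetical genes (not real data).
self_counts = {"geneA": 99, "geneB": 199, "geneC": 49}
cross_counts = {"geneA": 99, "geneB": 99, "geneC": 45}

bias = log2_mapping_bias(self_counts, cross_counts)
flagged = flag_biased_genes(bias, cutoff=1.0)
print(bias)
print(flagged)
```

In this toy input only `geneB` loses half of its reads under cross-mapping, so it is the only gene that an effect-size cutoff of one log2 unit would flag.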

2009 ◽  
Vol 37 (4) ◽  
pp. 778-782 ◽  
Author(s):  
Macarena Toll-Riera ◽  
Robert Castelo ◽  
Nicolás Bellora ◽  
M. Mar Albà

Genomes contain a large number of genes that do not have recognizable homologues in other species. These genes, found in only one or a few closely related species, are known as orphan genes. Their limited distribution implies that many of them are probably involved in lineage-specific adaptive processes. One important question that has remained elusive to date is how orphan genes originate. It has been proposed that they might have arisen by gene duplication followed by a period of very rapid sequence divergence, which would have erased any traces of similarity to other evolutionarily related genes. However, this explanation does not seem plausible for genes lacking homologues in very closely related species. In the present article, we review recent efforts to identify the mechanisms of formation of primate orphan genes. These studies reveal an unexpectedly important role of transposable elements in the formation of novel protein-coding genes in the genomes of primates.


2014 ◽  
Vol 64 (Pt_11) ◽  
pp. 3856-3861 ◽  
Author(s):  
Yong-Cheng Ren ◽  
Yun Wang ◽  
Liang Chen ◽  
Tao Ke ◽  
Feng-Li Hui

Two strains representing Wickerhamiella allomyrinae f.a., sp. nov. were isolated from the gut of Allomyrina dichotoma (Coleoptera: Scarabaeidae) collected from the Baotianman National Nature Reserve, Nanyang, Henan Province, China. Sequence analyses of the D1/D2 domains of the LSU rRNA gene revealed that this novel species was located in the Wickerhamiella clade (Saccharomycetes, Saccharomycetales), with three described species of the genus Candida, namely Candida musiphila, Candida spandovensis and Candida sergipensis, as the most closely related species. The novel species differed from these three species by 9.3–9.8 % sequence divergence (35–45 nt substitutions) in the D1/D2 sequences. The species could also be distinguished from the closely related species, C. musiphila, C. spandovensis and C. sergipensis, by growth on vitamin-free medium and at 37 °C. The type strain is Wickerhamiella allomyrinae sp. nov. NYNU 13920T (= CICC 33031T = CBS 13167T).


2015 ◽  
Vol 9S4 ◽  
pp. BBI.S29334 ◽  
Author(s):  
Jessica P. Hekman ◽  
Jennifer L Johnson ◽  
Anna V. Kukekova

Domesticated species occupy a special place in the human world due to their economic and cultural value. In the era of genomic research, domesticated species provide unique advantages for investigation of diseases and complex phenotypes. RNA sequencing, or RNA-seq, has recently emerged as a new approach for studying transcriptional activity of the whole genome, changing the focus from individual genes to gene networks. RNA-seq analysis in domesticated species may complement genome-wide association studies of complex traits with economic importance or direct relevance to biomedical research. However, RNA-seq studies are more challenging in domesticated species than in model organisms. These challenges are at least in part associated with the lack of quality genome assemblies for some domesticated species and the absence of genome assemblies for others. In this review, we discuss strategies for analyzing RNA-seq data, focusing particularly on questions and examples relevant to domesticated species.


2000 ◽  
Vol 68 (12) ◽  
pp. 7180-7185 ◽  
Author(s):  
O. Colin Stine ◽  
Shanmuga Sozhamannan ◽  
Qing Gou ◽  
Siqen Zheng ◽  
J. Glenn Morris ◽  
...  

ABSTRACT We sequenced a 705-bp fragment of the recA gene from 113 Vibrio cholerae strains and closely related species. One hundred eighty-seven nucleotides were phylogenetically informative, 55 were phylogenetically uninformative, and 463 were invariant. Not unexpectedly, Vibrio parahaemolyticus and Vibrio vulnificus strains formed out-groups; we also identified isolates which resembled V. cholerae biochemically but which did not cluster with V. cholerae. In many instances, V. cholerae serogroup designations did not correlate with phylogeny, as reflected by recA sequence divergence. This observation is consistent with the idea that there is horizontal transfer of O-antigen biosynthesis genes among V. cholerae strains.
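The three site categories reported above (187 + 55 + 463 = 705, the length of the fragment) can be tallied from an alignment with a short sketch. The abstract does not define "phylogenetically informative", so the code assumes the standard parsimony-informative criterion: at least two different bases, each present in at least two sequences. The function names and the toy alignment are hypothetical.

```python
from collections import Counter


def classify_site(column):
    """Classify one alignment column under the assumed parsimony criterion."""
    # Count only unambiguous uppercase bases; gaps and ambiguity codes are ignored.
    counts = Counter(base for base in column if base in "ACGT")
    if len(counts) <= 1:
        # Also covers columns with no unambiguous base at all.
        return "invariant"
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "uninformative"


def summarize_alignment(sequences):
    """Tally site categories across equal-length aligned sequences."""
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    tally = Counter(classify_site(column) for column in zip(*sequences))
    return {key: tally.get(key, 0) for key in ("informative", "uninformative", "invariant")}


# Toy alignment of four hypothetical sequences (not recA data).
toy_alignment = ["ACGTA", "ACGCA", "ATGCA", "ATGTT"]
summary = summarize_alignment(toy_alignment)
print(summary)
```

As with the counts in the abstract, the three categories partition the alignment, so their sum equals the alignment length.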


2022 ◽  
Author(s):  
Leeban Yusuf ◽  
Venera Tyukmaeva ◽  
Anneli Hoikkala ◽  
Michael G Ritchie

Speciation with gene flow is now widely regarded as common. However, the frequency of introgression between recently diverged species and the evolutionary consequences of gene flow are still poorly understood. The virilis group of Drosophila contains around a dozen species that are geographically widespread and show varying levels of pre-zygotic and post-zygotic isolation. Here, we utilize de novo genome assemblies and whole-genome sequencing data to resolve phylogenetic relationships and describe patterns of introgression and divergence across the group. We suggest that the virilis group consists of three, rather than the traditional two, subgroups. We found evidence of pervasive phylogenetic discordance caused by ancient introgression events between distant lineages within the group, and much more recent gene flow between closely-related species. When assessing patterns of genome-wide divergence in species pairs across the group, we found no consistent genomic evidence of a disproportionate role for the X chromosome. Some genes undergoing rapid sequence divergence across the group were involved in chemical communication and may be related to the evolution of sexual isolation. We suggest that gene flow between closely-related species has potentially had an impact on lineage-specific adaptation and the evolution of reproductive barriers. Our results show how ancient and recent introgression confuse phylogenetic reconstruction, and suggest that shared variation can facilitate adaptation and speciation.


2019 ◽  
Vol 37 (1) ◽  
pp. 260-279 ◽  
Author(s):  
Carina F Mugal ◽  
Verena E Kutschera ◽  
Fidel Botero-Castro ◽  
Jochen B W Wolf ◽  
Ingemar Kaj

Abstract The ratio of nonsynonymous over synonymous sequence divergence, dN/dS, is a widely used estimate of the nonsynonymous over synonymous fixation rate ratio ω, which measures the extent to which natural selection modulates protein sequence evolution. Its computation is based on a phylogenetic approach and computes sequence divergence of protein-coding DNA between species, traditionally using a single representative DNA sequence per species. This approach ignores the presence of polymorphisms and relies on the indirect assumption that new mutations fix instantaneously, an assumption which is generally violated and reasonable only for distantly related species. The violation of the underlying assumption leads to a time-dependence of sequence divergence, and biased estimates of ω in particular for closely related species, where the contribution of ancestral and lineage-specific polymorphisms to sequence divergence is substantial. We here use a time-dependent Poisson random field model to derive an analytical expression of dN/dS as a function of divergence time and sample size. We then extend our framework to the estimation of the proportion of adaptive protein evolution α. This mathematical treatment enables us to show that the joint usage of polymorphism and divergence data can assist the inference of selection for closely related species. Moreover, our analytical results provide the basis for a protocol for the estimation of ω and α for closely related species. We illustrate the performance of this protocol by studying a population data set of four corvid species, which involves the estimation of ω and α at different time-scales and for several choices of sample sizes.
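For orientation, the two quantities discussed in this abstract can be written down in their classical, time-independent form. The sketch below is not the time-dependent Poisson random field model of the paper. It assumes uncorrected per-site divergences for dN/dS and the widely used McDonald-Kreitman-style estimator α = 1 − (Ds·Pn)/(Dn·Ps); the function names and input counts are hypothetical.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Classical dN/dS from substitution counts and site counts.

    No multiple-hit correction is applied; this is a simplification.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if syn_subs == 0:
        raise ValueError("dS is zero, so dN/dS is undefined")
    d_n = nonsyn_subs / nonsyn_sites  # nonsynonymous divergence per site
    d_s = syn_subs / syn_sites        # synonymous divergence per site
    return d_n / d_s


def alpha_mk(d_nonsyn, d_syn, p_nonsyn, p_syn):
    """McDonald-Kreitman-style estimate of the adaptive proportion alpha.

    d_* are counts of fixed differences, p_* are counts of polymorphisms.
    """
    if d_nonsyn == 0 or p_syn == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (d_syn * p_nonsyn) / (d_nonsyn * p_syn)


# Toy counts (not data from the corvid study).
omega = dn_ds(nonsyn_subs=30, nonsyn_sites=300, syn_subs=20, syn_sites=100)
alpha = alpha_mk(d_nonsyn=30, d_syn=20, p_nonsyn=10, p_syn=20)
print(omega, alpha)
```

Because these estimators treat every observed difference as a fixed substitution, they are subject to the bias for closely related species that the abstract describes.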


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Angel Ruiz-Reche ◽  
Akanksha Srivastava ◽  
Joel A. Indi ◽  
Ivan de la Rubia ◽  
Eduardo Eyras

Abstract We describe ReorientExpress, a method to perform reference-free orientation of transcriptomic long sequencing reads. ReorientExpress uses deep learning to correctly predict the orientation of the majority of reads, in particular when trained on a closely related species or in combination with read clustering. ReorientExpress enables long-read transcriptomics in non-model organisms and in samples without a genome reference, without using additional technologies, and is available at https://github.com/comprna/reorientexpress.


2020 ◽  
Vol 65 (2) ◽  
Author(s):  
Arseniy Lobov ◽  
Irina Babkina ◽  
Arina Maltseva ◽  
Natalia Mikhailova ◽  
Andrey Granovitch

The forces driving the emergence of reproductive isolation during sympatric speciation are still intensely debated. Mechanisms of gametic isolation (which are known to form rapidly in several models) take the central place in these debates. Nevertheless, the extent to which findings from the few investigated models can be extrapolated to other taxa could be questioned, generating demand for the adoption of additional model organisms to study sympatric speciation. The group of closely related species of the genus Littorina (subgenus Neritrema) sympatrically inhabiting seashores is promising. In this study, we performed comparative proteomic analysis of penial tissues of four Neritrema species to identify potential effectors contributing to gametic isolation. Among 272 analyzed proteins, 13 mamilliform gland-specific proteins (possibly transferred to the female during copulation) were detected, as well as five proteins specifically expressed in the epithelium of the penial basal part. Eight of these proteins were species-specific and may be involved in the maintenance of reproductive barriers.

