Homology cluster differential expression analysis for interspecies mRNA-Seq experiments

Author(s):  
Jonathan A. Gelfond ◽  
Joseph G. Ibrahim ◽  
Ming-Hui Chen ◽  
Wei Sun ◽  
Kaitlyn Lewis ◽  
...  

Abstract: There is an increasing demand for exploration of the transcriptomes of multiple species with extraordinary traits such as the naked mole-rat (NMR). The NMR is remarkable because of its longevity and resistance to developing cancer. It is of scientific interest to understand the molecular mechanisms that impart these traits, and RNA-sequencing experiments with comparator species can correlate transcriptome dynamics with these phenotypes. Comparing transcriptome differences requires a homology mapping of each transcript in one species to transcript(s) within the other. Such mappings are necessary, especially if one species does not have a well-annotated genome available. Current approaches for this type of analysis typically identify the best match for each transcript, but the best-match analysis ignores the inherent risks of mismatch when there are multiple candidate transcripts with similar homology scores. We present a method that treats the set of homologs from a novel species as a cluster corresponding to a single gene in the reference species, and we compare the cluster-based approach to a conventional best-match analysis in both simulated data and a case study with NMR and mouse tissues. We demonstrate that the cluster-based approach has superior power to detect differential expression.
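The contrast between best-match and cluster-based summarization can be illustrated with a minimal sketch. This is not the authors' statistical model: it only shows, under simplifying assumptions, how counts might be summarized per reference gene in each approach. The transcript names, gene names, scores, counts, and the `min_score` threshold are all hypothetical, and here the best match is chosen per reference gene for simplicity.

```python
from collections import defaultdict

# Hypothetical homology hits: (novel-species transcript, reference gene, homology score)
hits = [
    ("nmr_t1", "GeneA", 0.95),
    ("nmr_t2", "GeneA", 0.94),  # nearly tied with nmr_t1: a best-match call is risky
    ("nmr_t3", "GeneA", 0.60),
    ("nmr_t4", "GeneB", 0.90),
]

# Hypothetical read counts per novel-species transcript
counts = {"nmr_t1": 40, "nmr_t2": 55, "nmr_t3": 5, "nmr_t4": 120}


def best_match_counts(hits, counts):
    """Keep only the single highest-scoring transcript for each reference gene."""
    best = {}
    for transcript, gene, score in hits:
        if gene not in best or score > best[gene][1]:
            best[gene] = (transcript, score)
    return {gene: counts[transcript] for gene, (transcript, _) in best.items()}


def cluster_counts(hits, counts, min_score=0.0):
    """Sum counts over every homolog of a reference gene scoring at least min_score."""
    totals = defaultdict(int)
    for transcript, gene, score in hits:
        if score >= min_score:
            totals[gene] += counts[transcript]
    return dict(totals)


print(best_match_counts(hits, counts))
print(cluster_counts(hits, counts))
```

In this toy input the two top GeneA candidates have almost identical scores, so the best-match summary discards more than half of the reads that plausibly belong to GeneA, while the cluster summary retains them.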

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Matthew Chung ◽  
Vincent M. Bruno ◽  
David A. Rasko ◽  
Christina A. Cuomo ◽  
José F. Muñoz ◽  
...  

Abstract: Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.


2021 ◽  
Author(s):  
Mariana Costa Dias ◽  
Cecílio Caldeira ◽  
Markus Gastauer ◽  
Silvio Ramos ◽  
Guilherme Oliveira

Abstract. Background: Canga is the Brazilian term for the savanna-like vegetation harboring several endemic species on iron-rich rocky outcrops, usually considered for mining activities. Parkia platycephala Benth. and Stryphnodendron pulcherrimum (Willd.) Hochr. naturally occur in the cangas of Serra dos Carajás (eastern Amazonia, Brazil) and the surrounding forest, indicating high phenotypic plasticity. The morphological and physiological mechanisms of the plants’ establishment in the canga environment are well studied, but the molecular adaptive responses are still unknown. We aimed to identify molecular mechanisms that allow the establishment of these plants in the canga environment. Results: Plants were grown in canga and forest substrates collected in the Carajás Mineral Province. RNA was extracted from pooled leaf tissue, and RNA-seq paired-end reads were assembled into representative transcriptomes for P. platycephala and S. pulcherrimum containing 31,728 and 31,311 primary transcripts, respectively. We identified both species-specific and core molecular responses in plants grown in the canga substrate using differential expression analyses. In the species-specific analysis, we identified 1,112 and 838 differentially expressed genes for P. platycephala and S. pulcherrimum, respectively. Enrichment analyses showed unique biological processes and metabolic pathways affected for each species. Comparative differential expression analysis was based on shared single-copy orthologs. The overall pattern of ortholog expression was species-specific. Even so, almost 300 altered genes were identified between plants in canga and forest substrates, responding the same way in both species. The genes were functionally associated with the response to light stimulus and the circadian rhythm pathway. Conclusions: Plants possess species-specific adaptive responses to cope with the substrates. Our results also suggest that plants adapted to both canga and forest environments can adjust the circadian rhythm in a substrate-dependent manner. The circadian clock gene modulation might be a central mechanism regulating the plants’ development in the canga substrate in the studied legume species. This may be a common mechanism of abiotic stress compensation shared with other native species.


2019 ◽  
Author(s):  
Avi Srivastava ◽  
Laraib Malik ◽  
Hirak Sarkar ◽  
Mohsen Zakeri ◽  
Fatemeh Almodaresi ◽  
...  

Abstract. Background: The accuracy of transcript quantification using RNA-seq data depends on many factors, such as the choice of alignment or mapping method and the quantification model being adopted. While the choice of quantification model has been shown to be important, considerably less attention has been given to comparing the effect of various read alignment approaches on quantification accuracy. Results: We investigate the influence of mapping and alignment on the accuracy of transcript quantification in both simulated and experimental data, as well as the effect on subsequent differential expression analysis. We observe that, even when the quantification model itself is held fixed, the effect of choosing a different alignment methodology, or aligning reads using different parameters, on quantification estimates can sometimes be large, and can affect downstream differential expression analyses as well. These effects can go unnoticed when assessment is focused too heavily on simulated data, where the alignment task is often simpler than in experimentally-acquired samples. We also introduce a new alignment methodology, called selective alignment, to overcome the shortcomings of lightweight approaches without incurring the computational cost of traditional alignment. Conclusion: We observe that, on experimental datasets, the performance of lightweight mapping and alignment-based approaches varies significantly and highlight some of the underlying factors. We show this variation both in terms of quantification and downstream differential expression analysis. In all comparisons, we also show the improved performance of our proposed selective alignment method and suggest best practices for performing RNA-seq quantification.


2017 ◽  
Author(s):  
Henning Onsbring Gustafson ◽  
Mahwash Jamy ◽  
Thijs J. G. Ettema

Summary: While ciliates of the genus Stentor are known for their ability to regenerate when their cells are damaged or even fragmented, the physical and molecular mechanisms underlying this process are poorly understood. To identify genes involved in the regenerative capability of Stentor cells, RNA sequencing of individual Stentor polymorphus cell fragments was performed. After splitting a cell over the anterior-posterior axis, the posterior fragment has to regenerate the oral apparatus, while the anterior part needs to regenerate the holdfast. Altogether, differential expression analysis of both posterior and anterior S. polymorphus cell fragments for four different post-split time points revealed over 10,000 up-regulated genes throughout the regeneration process. Among these, genes involved in cell signaling, microtubule-based movement and cell cycle regulation seemed to be particularly important during cellular regeneration. We identified roughly nine times as many up-regulated genes in regenerating S. polymorphus posterior fragments as compared to anterior fragments, indicating that regeneration of the anterior oral apparatus is a complex process that involves many genes. Our analyses identified several expanded groups of genes, such as dual-specific tyrosine-(Y)-phosphorylation regulated kinases and MORN domain containing proteins, that seemingly act as key regulators of cellular regeneration. In agreement with earlier morphological and cell biological studies, our differential expression analyses indicate that cellular regeneration and vegetative division share many similarities.


2021 ◽  
Author(s):  
Joseph W Saelens ◽  
Jens E.V. Petersen ◽  
Elizabeth Freedman ◽  
Robert C Moseley ◽  
Drissa Konate ◽  
...  

Sickle-trait hemoglobin (HbAS) confers near-complete protection from severe, life-threatening falciparum malaria in African children. Despite this clear protection, the molecular mechanisms by which HbAS confers these protective phenotypes remain incompletely understood. As a forward genetic screen for aberrant parasite transcriptional responses associated with parasite neutralization in HbAS red blood cells (RBCs), we performed comparative transcriptomic analyses of Plasmodium falciparum in normal (HbAA) and HbAS erythrocytes during both in vitro cultivation of reference parasite strains and naturally-occurring P. falciparum infections in Malian children with HbAA or HbAS. During in vitro cultivation, parasites matured normally in HbAS RBCs, and the temporal expression of the highly ordered transcriptional program that underlies the parasite's maturation throughout the intraerythrocytic development cycle (IDC) was largely unperturbed. However, differential expression analysis identified hundreds of transcripts aberrantly expressed in HbAS, largely occurring late in the IDC. Surprisingly, transcripts encoding members of the Maurer's clefts were overexpressed in HbAS despite impaired parasite protein export in these RBCs, while parasites in HbAS RBCs underexpressed transcripts associated with the endoplasmic reticulum and those encoding serine repeat antigen proteases that promote parasite egress. Analyses of P. falciparum transcriptomes from 32 children with uncomplicated malaria identified stage-specific differential expression: among infections composed of ring-stage parasites, only cyclophilin 19B was underexpressed in children with HbAS, while trophozoite-stage infections identified a range of differentially-expressed transcripts, including downregulation in HbAS of several transcripts associated with severe malaria in collateral studies. Collectively, our comparative transcriptomic screen in vitro and in vivo indicates that P. falciparum adapts to HbAS by altering its protein chaperone and folding machinery, oxidative stress response, and protein export machinery. Because HbAS consistently protects from severe P. falciparum, modulation of these responses may offer avenues by which to neutralize P. falciparum parasites.


2015 ◽  
Author(s):  
Rahul Reddy

As RNA-Seq and other high-throughput sequencing grow in use and remain critical for gene expression studies, technical variability in count data impedes studies of differential expression, comparison of data across samples and experiments, and reproduction of results. Studies like Dillies et al. (2013) compare several between-lane normalization methods involving scaling factors, while Hansen et al. (2012) and Risso et al. (2014) propose methods that correct for sample-specific bias or use sets of control genes to isolate and remove technical variability. This paper evaluates four normalization methods in terms of reducing intra-group, technical variability and facilitating differential expression analysis or other research where the biological, inter-group variability is of interest. To this end, the four methods were evaluated in differential expression analysis between data from Pickrell et al. (2010) and Montgomery et al. (2010) and between simulated data modeled on these two datasets. Though the between-lane scaling factor methods perform worse on real data sets, they are much stronger for simulated data. We cannot reject the recommendation of Dillies et al. to use TMM and DESeq normalization, but further study of power to detect effects of different size under each normalization method is merited.
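As an illustration of what a scaling-factor normalization does, the following is a minimal sketch of a median-of-ratios size factor, the general idea behind DESeq-style normalization. It is a simplified stand-in rather than the DESeq package implementation or the code evaluated in the paper; the function names and the toy count matrix are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one entry per sample).
    Genes with a zero count in any sample are skipped, since their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_rows = [[math.log(c) for c in row] for row in counts if all(c > 0 for c in row)]
    # Log of each gene's geometric mean across samples
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Toy data: sample 2 was sequenced at twice the depth of sample 1
toy = [[10, 20], [100, 200], [5, 10], [0, 7]]
factors = size_factors(toy)
print(factors)
print(normalize(toy, factors))
```

Because the median is taken over per-gene ratios, a handful of strongly differentially expressed genes has little influence on the factor, which is the usual argument for this family of methods over scaling by total read count.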


2020 ◽  
Author(s):  
Gabriel E. Hoffman ◽  
Yixuan Ma ◽  
Kelsey S. Montgomery ◽  
Jaroslav Bendl ◽  
Manoj Kumar Jaiswal ◽  
...  

Abstract: While schizophrenia differs between males and females in age of onset, symptomatology and the course of the disease, the molecular mechanisms underlying these differences remain uncharacterized. In order to address questions about the sex-specific effects of schizophrenia, we performed a large-scale transcriptome analysis of RNA-seq data from 437 controls and 341 cases from two distinct cohorts from the CommonMind Consortium. Analysis across the cohorts identifies a reproducible gene expression signature of schizophrenia that is highly concordant with previous work. Differential expression across sex is reproducible across cohorts and identifies X- and Y-linked genes, as well as those involved in dosage compensation. Intriguingly, the sex expression signature is also enriched for genes involved in neurexin family protein binding and synaptic organization. Differential expression analysis testing a sex-by-diagnosis interaction effect did not identify any genome-wide signature after multiple testing corrections. Gene coexpression network analysis was performed to reduce dimensionality and elucidate interactions among genes. We found enrichment of co-expression modules for sex-by-diagnosis differential expression signatures, which were highly reproducible across the two cohorts and involve a number of diverse pathways, including neural nucleus development, neuron projection morphogenesis, and regulation of neural precursor cell proliferation. Overall, our results indicate that the effect size of sex differences in schizophrenia gene expression signatures is small and underscore the challenge of identifying robust sex-by-diagnosis signatures, which will require future analyses in larger cohorts.


2020 ◽  
Author(s):  
Kameshwar P. Singh ◽  
Krishna P. Maremanda ◽  
Dongmei Li ◽  
Irfan Rahman

Abstract. Background: Electronic cigarettes (e-cigs) produce aerosolized substances by heating a liquid, which contains a large number of chemicals. The aerosol generated by e-cigs may produce serious health effects. Cigarette smoke exposure may cause various diseases including COPD, atherosclerosis, and lung cancer. Waterpipe tobacco smoking also causes various acute and chronic health effects including cardiopulmonary diseases. MicroRNAs are present in higher concentration in exosomes, which play a major role in various normal physiological functions and diseases. We hypothesized that non-coding RNA transcripts may serve as biomarkers of susceptibility to disease caused by smoking and vaping. Results: Our data show the enrichment of various non-coding RNAs, including microRNAs, tRNAs, piRNAs, snoRNAs, snRNAs, Mt-tRNAs, and other biotypes, in exosomes. The detailed differential expression analysis of microRNAs, tRNAs and piRNAs showed significant changes in pairwise comparisons of the different groups. Common changes in the differential expression of 8 microRNAs (hsa-let-7a-5p, hsa-miR-21-5p, hsa-miR-29b-3p, hsa-let-7f-5p, hsa-miR-143-3p, hsa-miR-30a-5p, hsa-let-7i-5p, and hsa-let-7g-5p) were found when all smoking and vaping groups were compared with the non-smoking group. The e-cig group has 7 differentially expressed microRNAs (hsa-miR-224-5p, hsa-let-7c-5p, hsa-miR-193b-3p, hsa-miR-30e-5p, hsa-miR-423-3p, hsa-miR-500b-3p, hsa-miR-365a-3p|hsa-miR-365b-3p) that are specific to this group and not expressed in the other three groups. Gene set enrichment analysis of microRNAs showed significant changes in the top six enriched functions, which consisted of biological pathway, biological process, molecular function, cellular component, site of expression and transcription factor, in all groups. Further, the pairwise comparison of tRNAs and piRNAs in all groups also revealed significant changes in differential expression. Conclusions: Plasma exosomes of cigarette smokers, waterpipe smokers, e-cig users and dual smokers have common differential expression of microRNAs (hsa-let-7a-5p, hsa-miR-21-5p, hsa-miR-29b-3p, hsa-let-7f-5p, hsa-miR-143-3p, hsa-miR-30a-5p, hsa-let-7i-5p, and hsa-let-7g-5p), which may be biomarkers for tobacco exposure. Additionally, e-cig users also have differentially expressed microRNAs (hsa-miR-224-5p, hsa-let-7c-5p, hsa-miR-193b-3p, hsa-miR-30e-5p, hsa-miR-423-3p, hsa-miR-500b-3p, and hsa-miR-365a-3p|hsa-miR-365b-3p) that are specific to this group. This study will help to better understand the molecular mechanisms of plasma exosome non-coding RNAs and to develop biomarkers that may be useful in the diagnosis and therapy of pulmonary injury and disease caused by smoking and vaping.


2020 ◽  
Vol 15 ◽  
Author(s):  
Xiaowei Jiang ◽  
Pu Ying ◽  
Yingchao Shen ◽  
Yiming Miu ◽  
Wenbin Kong ◽  
...  

Background: Osteoporosis is the most common bone metabolic disease. Abnormal osteoclast formation and resorption play a fundamental role in osteoporosis pathogenesis. Recent research has greatly broadened our understanding of the molecular mechanisms of osteoporosis. However, the molecular mechanisms leading to osteoporosis are still not entirely clear. Objective: The purpose of this work is to study the critical regulatory genes, functional modules, and signaling pathways. Methods: Differential expression analysis, network topology-based analysis, and overrepresentation enrichment analysis (ORA) were used to identify differentially expressed genes (DEGs), gene subnetworks, and signaling pathways related to osteoporosis, respectively. Results: Differential expression analysis identified DEGs, such as POGLUT1, DAPK3 and NFKBIA, associated with osteoclastogenesis, which highlighted the Notch, apoptosis and NF-κB signaling pathways. Network topology-based analysis identified the upregulated subnetwork characterized by EXOSC8 and DIS3L from the RNA exosome complex, and the downregulated subnetwork composed of histone deacetylases and the cofactors MORF4L1 and JDP2. Furthermore, the overrepresentation enrichment analysis highlighted that the corticotrophin-releasing hormone signaling pathway may affect osteoclastogenesis through its component NR4A1, and may suppress osteoclast differentiation and osteoclast bone resorption via urocortin (UCN). Conclusion: Our systematic analysis not only discovered novel molecular mechanisms, but also proposed potential drug targets for osteoporosis.
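Overrepresentation enrichment analysis is commonly computed as a one-sided hypergeometric test, and a minimal sketch of that calculation follows. The abstract does not state which tool or test the authors used, so this should be read as a generic illustration; the function name and the example numbers are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_degs, n_overlap):
    """One-sided hypergeometric p-value for overrepresentation.

    Probability of seeing at least n_overlap pathway genes among n_degs
    genes drawn without replacement from a universe of n_universe genes,
    of which n_pathway belong to the pathway.
    """
    total = comb(n_universe, n_degs)
    upper = min(n_pathway, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20,000 genes in the universe, a 100-gene pathway,
# 500 DEGs, 10 of which fall in the pathway.
print(ora_p_value(20000, 100, 500, 10))
```

In practice the resulting p-values would be corrected for multiple testing across all pathways examined.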

