Assessing the impact of human genome annotation choice on RNA-seq expression estimates

2013 ◽  
Vol 14 (Suppl 11) ◽  
pp. S8 ◽  
Author(s):  
Po-Yen Wu ◽  
John H Phan ◽  
May D Wang
2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Li Tong ◽  
◽  
Po-Yen Wu ◽  
John H. Phan ◽  
Hamid R. Hassazadeh ◽  
...  

Abstract To use next-generation sequencing technology such as RNA-seq for medical and health applications, choosing proper analysis methods for biomarker identification remains a critical challenge for most users. The US Food and Drug Administration (FDA) has led the Sequencing Quality Control (SEQC) project to conduct a comprehensive investigation of 278 representative RNA-seq data analysis pipelines consisting of 13 sequence mapping, three quantification, and seven normalization methods. In this article, we focused on the joint effects of RNA-seq pipeline components on gene expression estimation as well as on the downstream prediction of disease outcomes. First, we developed and applied three metrics (i.e., accuracy, precision, and reliability) to quantitatively evaluate each pipeline’s performance on gene expression estimation. We then investigated the correlation between the proposed metrics and the downstream prediction performance using two real-world cancer datasets (i.e., the SEQC neuroblastoma dataset and the NIH/NCI TCGA lung adenocarcinoma dataset). We found that RNA-seq pipeline components jointly and significantly impacted the accuracy of gene expression estimation, and that this impact extended to the downstream prediction of these cancer outcomes. Specifically, RNA-seq pipelines that produced more accurate, precise, and reliable gene expression estimation tended to perform better in the prediction of disease outcome. Finally, we provided scenarios as guidelines for using these three metrics to select sensible RNA-seq pipelines that improve the accuracy, precision, and reliability of gene expression estimation, which in turn improves downstream gene expression-based prediction of disease outcome.


Genes ◽  
2014 ◽  
Vol 5 (3) ◽  
pp. 518-535 ◽  
Author(s):  
Jessica Bailey ◽  
Margaret Pericak-Vance ◽  
Jonathan Haines

2012 ◽  
Vol 22 (9) ◽  
pp. 1698-1710 ◽  
Author(s):  
C. Howald ◽  
A. Tanzer ◽  
J. Chrast ◽  
F. Kokocinski ◽  
T. Derrien ◽  
...  
Keyword(s):  
RNA-Seq ◽  

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Geneviève Bart ◽  
Daniel Fischer ◽  
Anatoliy Samoylenko ◽  
Artem Zhyvolozhnyi ◽  
Pavlo Stehantsev ◽  
...  

Abstract Background: Human sweat is a mixture of secretions from three types of glands: eccrine, apocrine, and sebaceous. Eccrine glands open directly on the skin surface and produce large amounts of water-based fluid in response to heat, emotion, and physical activity, whereas the other glands produce oily fluids and waxy sebum. While most body fluids have been shown to contain nucleic acids, both as ribonucleoprotein complexes and associated with extracellular vesicles (EVs), these have not been investigated in sweat. In this study we aimed to explore and characterize the nucleic acids associated with sweat particles. Results: We used next-generation sequencing (NGS) to characterize DNA and RNA in pooled and individual samples of EV-enriched sweat collected from volunteers performing rigorous exercise. In all sequenced samples, we identified DNA originating from all human chromosomes, but only the mitochondrial chromosome was highly represented, with 100% coverage. Most of the DNA mapped to unannotated regions of the human genome, with some regions highly represented in all samples. Approximately 5% of the reads mapped to other genomes, including bacteria (83%), archaea (3%), and viruses (13%); the identified bacterial species were consistent with those commonly colonizing the skin of the human upper body and arm. Small RNA-seq of EV-enriched pooled sweat RNA resulted in 74% of the trimmed reads mapping to the human genome, with 29% corresponding to unannotated regions. Over 70% of the RNA reads mapping to an annotated region were tRNA, while misc. RNA (18.5%), protein-coding RNA (5%), and miRNA (1.85%) were much less represented. RNA-seq of individually processed EV-enriched sweat collections generally resulted in a lower percentage of reads mapping to the human genome (7–45%), with 50–60% of those reads mapping to unannotated regions of the genome, 30–55% being tRNAs, and a lower percentage being rRNA, lincRNA, misc. RNA, and protein-coding RNA.
Conclusions: Our data demonstrate that sweat, like all other body fluids, contains a wealth of nucleic acids, including DNA and RNA of human and microbial origin, opening the possibility of investigating sweat as a source of biomarkers for specific health parameters.


2021 ◽  
pp. 1-28
Author(s):  
Ya-Wen Lei

Abstract Literature on scientific controversies has inadequately attended to the impact of globalization and, more specifically, the emergence of China as a leader in scientific research. To bridge this gap in the literature, this article develops a theoretical framework to analyse global scientific controversies surrounding research in China. The framework highlights the existence of four overlapping discursive arenas: China's national public sphere and national expert sphere, the transnational public sphere and the transnational expert sphere. It then examines the struggles over inclusion/exclusion and publicity within these spheres as well as the within- and across-sphere effects of such struggles. Empirically, the article analyses the human genome editing controversy surrounding research conducted by scientists in China between 2015 and 2019. It shows how elite scientists negotiated expert–public relationships within and across the national and transnational expert spheres, how unexpected disruption at the nexus of the four spheres disrupted expert–public relationships as envisioned by elite experts, and how the Chinese state intervened to redraw the boundary between openness and secrecy at both national and transnational levels.


2020 ◽  
Author(s):  
Colin Peter Singer Kruse ◽  
Alexander D Meyers ◽  
Proma Basu ◽  
Sarahann Hutchinson ◽  
Darron R Luesse ◽  
...  

Abstract Background: Understanding of gravity sensing and response is critical to long-term human habitation in space and can provide new advantages for terrestrial agriculture. To this end, the altered gene expression profile induced by microgravity has been repeatedly queried by microarray and RNA-seq experiments to understand gravitropism. However, the quantification of altered protein abundance in space has been minimally investigated. Results: Proteomic (iTRAQ-labelled LC-MS/MS) and transcriptomic (RNA-seq) analyses simultaneously quantified protein and transcript differential expression of three-day-old, etiolated Arabidopsis thaliana seedlings grown aboard the International Space Station along with their ground control counterparts. Protein extracts were fractionated to isolate soluble and membrane proteins and analyzed to detect differentially phosphorylated peptides. In total, 968 RNAs, 107 soluble proteins, and 103 membrane proteins were identified as differentially expressed. In addition, the proteomic analyses identified 16 differential phosphorylation events. Proteomic data delivered novel insights and simultaneously provided new context to previously made observations of gene expression in microgravity. There is a sweeping shift in post-transcriptional mechanisms of gene regulation, including the RNA-decapping protein DCP5, the splicing factors GRP7 and GRP8, and AGO4. These data also indicate AHA2 and FERONIA as well as CESA1 and SHOU4 as central to the cell wall adaptations seen in spaceflight. Patterns of tubulin-α 1, 3, 4, and 6 phosphorylation further reveal an interaction of microtubule and redox homeostasis that mirrors osmotic response signaling elements. The absence of gravity also results in a seemingly wasteful dysregulation of plastid gene transcription.
Conclusions: The datasets gathered from Arabidopsis seedlings exposed to microgravity revealed marked impacts on post-transcriptional regulation, cell wall synthesis, redox/microtubule dynamics, and plastid gene transcription. The impact of post-transcriptional regulatory alterations represents an unstudied element of the plant microgravity response with the potential to significantly impact plant growth efficiency and beyond. Moreover, addressing the effects of microgravity on AHA2, CESA1, and alpha tubulins has the potential to enhance cytoskeletal organization and cell wall composition, thereby enhancing biomass production and growth in microgravity. Finally, understanding and manipulating the dysregulation of plastid gene transcription has further potential to address the goal of enhancing plant growth in the stressful conditions of microgravity.


2020 ◽  
Author(s):  
Silvia Llonch ◽  
Montserrat Barragán ◽  
Paula Nieto ◽  
Anna Mallol ◽  
Marc Elosua-Bayes ◽  
...  

Abstract
Study question: To what degree does maternal age affect the transcriptome of human oocytes at the germinal vesicle (GV) stage or at metaphase II after maturation in vitro (IVM-MII)?
Summary answer: While the oocytes’ transcriptome is predominantly determined by maturation stage, transcript levels of genes related to chromosome segregation, mitochondria and RNA processing are affected by age after in vitro maturation of denuded oocytes.
What is known already: Female fertility is inversely correlated with maternal age due to both a depletion of the oocyte pool and a reduction in oocyte developmental competence. Few studies have addressed the effect of maternal age on the human mature oocyte (MII) transcriptome, which is established during oocyte growth and maturation, and the pathways involved remain unclear. Here, we characterize and compare the transcriptomes of a large cohort of fully grown GV and IVM-MII oocytes from women of varying reproductive age.
Study design, size, duration: In this prospective molecular study, 37 women were recruited from May 2018 to June 2019. The mean age was 28.8 years (SD=7.7, range 18-43). A total of 72 oocytes were included in the study at GV stage after ovarian stimulation, and analyzed as GV (n=40) and in vitro matured oocytes (IVM-MII; n=32).
Participants/materials, setting, methods: Denuded oocytes were included either as GV at the time of ovum pick-up or as IVM-MII after in vitro maturation for 30 hours in G2™ medium, and processed for transcriptomic analysis by single-cell RNA-seq using the Smart-seq2 technology. Cluster and maturation stage marker analysis were performed using the Seurat R package. Genes with an average fold change greater than 2 and a p-value < 0.01 were considered maturation stage markers. A Pearson correlation test was used to identify genes whose expression levels changed progressively with age. Genes with an absolute correlation value |R| ≥ 0.3 and a p-value < 0.05 were considered significant.
Main results and the role of chance: First, by exploration of the RNA-seq data using tSNE dimensionality reduction, we identified two clusters of cells reflecting the oocyte maturation stage (GV and IVM-MII) with 4,445 and 324 putative marker genes, respectively. Next, we identified genes for which RNA levels either progressively increased or decreased with age. This analysis was performed independently for GV and IVM-MII oocytes. Our results indicate that the transcriptome is more affected by age in IVM-MII oocytes (1,219 genes) than in GV oocytes (596 genes). In particular, we found that genes involved in chromosome segregation and RNA splicing significantly increase in transcript levels with age, while genes related to mitochondrial activity present lower transcript levels with age. Gene regulatory network analysis revealed potential upstream master regulator functions for genes whose transcript levels present positive (GPBP1, RLF, SON, TTF1) or negative (BNC1, THRB) correlation with age.
Limitations, reasons for caution: IVM-MII oocytes used in this study were obtained after in vitro maturation of denuded GV oocytes; therefore, their transcriptome might not be fully representative of in vivo matured MII oocytes. The Smart-seq2 methodology used in this study detects polyadenylated transcripts only, and we could therefore not assess non-polyadenylated transcripts.
Wider implications of the findings: Our analysis suggests that advanced maternal age does not globally affect the oocyte transcriptome at GV or IVM-MII stages. Nonetheless, hundreds of genes displayed altered transcript levels with age, particularly in IVM-MII oocytes. Especially affected by age were genes related to chromosome segregation and mitochondrial function, pathways known to be involved in oocyte ageing. Our study thereby suggests that misregulation of chromosome segregation and mitochondrial pathways also at the RNA level might contribute to the age-related quality decline in human oocytes.
Study funding/competing interest(s): This study was funded by the AXA research fund, the European Commission, intramural funding of Clinica EUGIN, the Spanish Ministry of Science, Innovation and Universities, the Catalan Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR) and by contributions of the Spanish Ministry of Economy, Industry and Competitiveness (MEIC) to the EMBL partnership and to the “Centro de Excelencia Severo Ochoa”. The authors have no conflict of interest to declare.
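
The age-correlation filter described in the methods of the abstract above (a Pearson test per gene, keeping genes with an absolute correlation of at least 0.3 and a p-value below 0.05) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy expression values, and the gene labels are all hypothetical, and the p-value is computed from the exact null distribution of r under bivariate normality by numerical integration, which is an assumption about how the test might be carried out.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient (raises ZeroDivisionError if an input is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def two_sided_p(r, n, steps=2000):
    """Two-sided p-value for r with n samples, assuming bivariate normal data.

    Under the null, r has density (1 - x^2)^((n - 4) / 2) / B(1/2, (n - 2) / 2)
    on [-1, 1]; the mass inside (-|r|, |r|) is integrated with Simpson's rule.
    """
    if n < 4:
        raise ValueError("this sketch needs at least 4 samples")
    r = min(abs(r), 1.0)
    exponent = (n - 4) / 2
    log_beta = math.lgamma(0.5) + math.lgamma((n - 2) / 2) - math.lgamma((n - 1) / 2)
    norm = math.exp(log_beta)

    def density(x):
        return (1.0 - x * x) ** exponent / norm

    h = r / steps
    total = density(0.0) + density(r)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    central_mass = 2.0 * total * h / 3.0
    return min(1.0, max(0.0, 1.0 - central_mass))


def age_associated_genes(ages, expression, r_min=0.3, alpha=0.05):
    """Return {gene: (r, p)} for genes passing both thresholds (hypothetical helper)."""
    hits = {}
    for gene, values in expression.items():
        r = pearson_r(ages, values)
        p = two_sided_p(r, len(ages))
        if abs(r) >= r_min and p < alpha:
            hits[gene] = (r, p)
    return hits


# Toy data, invented for illustration only.
ages = [20, 24, 28, 32, 36, 40, 44]
expression = {
    "geneA": [1.0, 1.4, 1.9, 2.1, 2.8, 3.0, 3.6],  # rises with age
    "geneB": [2.0, 1.0, 2.5, 1.2, 2.2, 1.1, 2.1],  # no clear trend
    "geneC": [5.0, 4.6, 4.1, 3.9, 3.2, 3.0, 2.4],  # falls with age
}
hits = age_associated_genes(ages, expression)
for gene, (r, p) in sorted(hits.items()):
    print(f"{gene}: r={r:.3f}, p={p:.4g}")
```

Applying both thresholds together matters for small cohorts: a gene can exceed the correlation cutoff and still fail the significance cutoff when few samples are available. A real analysis over thousands of genes would also typically consider multiple-testing correction, which the abstract does not mention and this sketch does not attempt.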


2021 ◽  
Author(s):  
Dennis A Sun ◽  
Nipam H Patel

Abstract Emerging research organisms enable the study of biology that cannot be addressed using classical “model” organisms. The development of novel data resources can accelerate research in such animals. Here, we present new functional genomic resources for the amphipod crustacean Parhyale hawaiensis, facilitating the exploration of gene regulatory evolution using this emerging research organism. We use Omni-ATAC-Seq, an improved form of the Assay for Transposase-Accessible Chromatin coupled with next-generation sequencing (ATAC-Seq), to identify accessible chromatin genome-wide across a broad time course of Parhyale embryonic development. This time course encompasses many major morphological events, including segmentation, body regionalization, gut morphogenesis, and limb development. In addition, we use short- and long-read RNA-Seq to generate an improved Parhyale genome annotation, enabling deeper classification of identified regulatory elements. We leverage a variety of bioinformatic tools to discover differential accessibility, predict nucleosome positioning, infer transcription factor binding, cluster peaks based on accessibility dynamics, classify biological functions, and correlate gene expression with accessibility. Using a Minos transposase reporter system, we demonstrate the potential to identify novel regulatory elements using this approach, including distal regulatory elements. This work provides a platform for the identification of novel developmental regulatory elements in Parhyale, and offers a framework for performing such experiments in other emerging research organisms.
Primary Findings
- Omni-ATAC-Seq identifies cis-regulatory elements genome-wide during crustacean embryogenesis
- Combined short- and long-read RNA-Seq improves the Parhyale genome annotation
- ImpulseDE2 analysis identifies dynamically regulated candidate regulatory elements
- NucleoATAC and HINT-ATAC enable inference of nucleosome occupancy and transcription factor binding
- Fuzzy clustering reveals peaks with distinct accessibility and chromatin dynamics
- Integration of accessibility and gene expression reveals possible enhancers and repressors
- Omni-ATAC can identify known and novel regulatory elements

