Protein coding genes as hosts for noncoding RNA expression

2018 ◽  
Vol 75 ◽  
pp. 3-12 ◽  
Author(s):  
Vincent Boivin ◽  
Gabrielle Deschamps-Francoeur ◽  
Michelle S Scott
2019 ◽  
Vol 116 (44) ◽  
pp. 22020-22029 ◽  
Author(s):  
Aritro Nath ◽  
Eunice Y. T. Lau ◽  
Adam M. Lee ◽  
Paul Geeleher ◽  
William C. S. Cho ◽  
...  

Large-scale cancer cell line screens have identified thousands of protein-coding genes (PCGs) as biomarkers of anticancer drug response. However, systematic evaluation of long noncoding RNAs (lncRNAs) as pharmacogenomic biomarkers has so far proven challenging. Here, we study the contribution of lncRNAs as drug response predictors beyond spurious associations driven by correlations with proximal PCGs, tissue lineage, or established biomarkers. We show that, as a whole, the lncRNA transcriptome is equally potent as the PCG transcriptome at predicting response to hundreds of anticancer drugs. Analysis of individual lncRNA transcripts associated with drug response reveals nearly half of the significant associations are in fact attributable to proximal cis-PCGs. However, adjusting for effects of cis-PCGs revealed significant lncRNAs that augment drug response predictions for most drugs, including those with well-established clinical biomarkers. In addition, we identify lncRNA-specific somatic alterations associated with drug response by adopting a statistical approach to determine lncRNAs carrying somatic mutations that undergo positive selection in cancer cells. Lastly, we experimentally demonstrate that 2 lncRNAs, EGFR-AS1 and MIR205HG, are functionally relevant predictors of anti-epidermal growth factor receptor (EGFR) drug response.


2018 ◽  
Vol 49 (6) ◽  
Author(s):  
Elsahookie et al.

The endosperm in cereals supplies nutrients to the developing kernel and seedling, and it is the primary tissue in which gene imprinting occurs. Developing maize (Zea mays L.) endosperms were analysed for allelic gene expression in both reciprocal crosses of inbreds B73 and Mo17. High-throughput transcriptome sequencing was performed on kernels at 0, 3 up to 15 DAP from both reciprocal crosses, and it revealed a gradual increase in paternal transcript expression in 3 and 5 DAP kernels. Meanwhile, in 7 DAP endosperm, most of the genes tested gave a 2:1 maternal:paternal ratio, suggesting that paternal genes are almost fully activated at 7 DAP. There were 300 PEGs and 499 MEGs identified across endosperm development stages. A total of 63 genes out of 116, 234 that exhibited parent-specific expression were identified at 7, 10 and 15 DAP. Most paternally expressed genes were detected at 7 DAP, owing to deviation in paternal allele expression at this stage of development. For imprinted genes, the relative expression of maternal and paternal alleles differed at least five-fold in both crosses. A total of 179 (1.6%) protein coding genes expressed in the endosperm were imprinted; 68 of them showed maternal preferential expression and 111 showed paternal preferential expression. In addition, 38 long noncoding RNAs were found to be imprinted and transcribed in either sense or antisense direction from intronic regions of normal protein coding genes or from intergenic regions. Imprinted genes showed clustering around the genome. A total of 21 imprinted genes in the maize hybrid endosperm had differentially methylated regions (DMRs). All DMRs were found to be hypomethylated in maternal alleles and hypermethylated in paternal alleles. These results confirm a complex mechanism controlling the maize endosperm, involving imprinting, auxin activity, and developmental regulation. Studying F2 kernels on F1 plants may shed new light on the control of kernel number and weight per unit area.
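
The five-fold criterion mentioned in the abstract above can be illustrated with a minimal sketch. It assumes allele-specific read counts are available per gene for each reciprocal cross; the function name `classify_imprinting`, the input layout, and the use of a raw fold ratio (without correcting for the 2:1 maternal:paternal dosage of the endosperm) are illustrative assumptions, not the authors' pipeline.

```python
def classify_imprinting(cross_a, cross_b, fold=5.0):
    """Classify one gene from allele-specific read counts.

    cross_a, cross_b: (maternal_reads, paternal_reads) tuples, one per
    reciprocal cross (hypothetical input layout).
    Returns "MEG", "PEG", or "not imprinted".
    """
    def bias(maternal, paternal):
        # A cross supports maternal bias if maternal reads exceed
        # paternal reads by at least `fold`, and vice versa.
        if maternal > 0 and maternal >= fold * paternal:
            return "MEG"
        if paternal > 0 and paternal >= fold * maternal:
            return "PEG"
        return None

    a, b = bias(*cross_a), bias(*cross_b)
    # Require the same direction of bias in both reciprocal crosses,
    # which separates parent-of-origin effects from genotype effects.
    return a if a is not None and a == b else "not imprinted"


print(classify_imprinting((100, 10), (80, 5)))
```

Requiring agreement between both reciprocal crosses is the design point: a gene that is biased toward one inbred's allele regardless of which parent contributed it would fail this check.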


2021 ◽  
Author(s):  
Xiang Li ◽  
Qiongyi Zhao ◽  
Ziqi Wang ◽  
Wei-Siang Liau ◽  
Dean Basic ◽  
...  

Long noncoding RNAs (lncRNAs) comprise a new class of genes that have been assigned key roles in development and disease. Many lncRNAs are specifically transcribed in the brain where they regulate the expression of protein-coding genes that underpin neuronal function; however, their role in learning and memory remains largely unexplored. We used RNA Capture-Seq to identify a large population of lncRNAs that are expressed in the infralimbic cortex of adult male mice in response to fear-related learning, with 14.5% of these annotated in the GENCODE database as lncRNAs with no known function. We combined these data with cell-type-specific ATAC-seq on neurons that had been selectively activated by fear-extinction learning, and revealed 434 lncRNAs derived from enhancer regions in the vicinity of protein-coding genes. In particular, we discovered an experience-induced lncRNA called ADRAM that acts as both a scaffold and a combinatorial guide to recruit the brain-enriched chaperone protein 14-3-3 to the promoter of the memory-associated immediate early gene Nr4a2. This leads to the expulsion of histone deacetylases 3 and 4, and the recruitment of the histone acetyltransferase CREB binding protein, which drives learning-induced Nr4a2 expression. Knockdown of ADRAM disrupts this interaction, blocks the expression of Nr4a2, and ultimately impairs the formation of fear-extinction memory. This study expands the lexicon of experience-dependent lncRNA activity in the brain, highlights enhancer-derived RNAs (eRNAs) as key players in the epigenetic regulation of gene expression associated with fear extinction, and suggests eRNAs, such as ADRAM, may constitute viable targets in developing novel treatments for fear-related anxiety disorders.


2019 ◽  
Vol 8 (16) ◽  
Author(s):  
Sukjung Choi ◽  
Eun Bae Kim

Lactobacillus plantarum strain EBKLp545 was isolated from piglet feces in South Korea and sequenced using an Illumina HiSeq system. This draft genome of strain EBKLp545 consists of 3,306,513 bp with 3,049 protein-coding genes in 138 contigs (≥500 bp), 54 noncoding RNA genes, and a 44.3% G+C content.
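
Summary statistics of the kind reported above (contig count at a length cutoff, total assembly size, G+C content) can be computed from contig sequences with a few lines of code. The sketch below assumes the contigs are already loaded as plain strings; the function names `gc_content` and `assembly_summary` are hypothetical and not taken from the genome announcement.

```python
def gc_content(seq):
    """Return the G+C percentage of a nucleotide sequence.

    Only A, C, G and T are counted, so ambiguous bases such as N
    do not dilute the percentage.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def assembly_summary(contigs, min_len=500):
    """Summarise an assembly, keeping only contigs of at least min_len bp."""
    kept = [c for c in contigs if len(c) >= min_len]
    return {
        "contigs": len(kept),
        "total_bp": sum(len(c) for c in kept),
        "gc_percent": gc_content("".join(kept)),
    }


# Toy input: two 600 bp contigs pass the cutoff, one 100 bp contig is dropped.
toy_contigs = ["GC" * 300, "AT" * 300, "G" * 100]
print(assembly_summary(toy_contigs))
```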


2019 ◽  
Vol 36 (7) ◽  
pp. 2025-2032
Author(s):  
Yuwei Zhang ◽  
Tianfei Yi ◽  
Huihui Ji ◽  
Guofang Zhao ◽  
Yang Xi ◽  
...  

Abstract Motivation: Long noncoding RNA (lncRNA) has been verified to interact with other biomolecules, especially protein-coding genes (PCGs), thus playing essential regulatory roles in life activities and disease development. However, the inner mechanisms of most lncRNA–PCG relationships are still unclear. Our study investigated the characteristics of true lncRNA–PCG relationships and constructed a novel predictor with machine learning algorithms. Results: We obtained 307 true lncRNA–PCG pairs from a database and found that there are significant differences in multiple characteristics between true and random lncRNA–PCG sets. In addition, 3-fold cross-validation and prediction results on independent test sets show high AUC values for LR, SVM and RF, among which RF has the best performance, with an average AUC of 0.818 for cross-validation, and 0.823 and 0.853 for the two independent test sets, respectively. In a case study, some candidate lncRNA–PCG relationships in colorectal cancer were found, and the HOTAIR–COMP interaction was specifically exemplified. The proportion of reported pairs in the predicted positive results was significantly higher than that in the negative results (P < 0.05). Supplementary information: Supplementary data are available at Bioinformatics online.
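
The AUC figures quoted above are averages over cross-validation folds. As a rough illustration of how such a number is obtained, the sketch below computes AUC as the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, then averages it across folds. The function names and the toy labels and scores are assumptions for illustration only; they are not the authors' code or data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels; scores: predicted scores.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_fold_auc(folds):
    """Average AUC over folds, each given as a (labels, scores) pair."""
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return sum(aucs) / len(aucs)


# Toy held-out predictions for two folds (hypothetical values).
toy_folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    ([1, 0], [0.8, 0.2]),
]
print(mean_fold_auc(toy_folds))
```

The pairwise form is quadratic in the number of examples, which is fine for a few hundred pairs; rank-based implementations are preferable for larger sets.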


Blood ◽  
2021 ◽  
Vol 138 (Supplement 1) ◽  
pp. 500-500
Author(s):  
Michelle Ng ◽  
Lonneke Verboon ◽  
Hasan Issa ◽  
Raj Bhayadia ◽  
Oriol Alejo ◽  
...  

Abstract The noncoding genome presents a largely untapped source of biological insights, including tens of thousands of long noncoding RNA (lncRNA) loci. While some produce bona fide lncRNAs, others exert transcript-independent cis-regulatory effects, and the lack of predictive features renders their mechanistic dissection highly challenging. Here, we describe CTCF-enriched lncRNA loci (C-LNC) as a putative new subclass of functional genetic elements exemplified by MYNRL15 - myeloid leukemia noncoding regulatory locus on chromosome 15. Initially identified by an expression-guided CRISPRi screen of hematopoietic stem and progenitor (HSPC) / acute myeloid leukemia (AML) lncRNA signatures (480 genes, 1545 sgRNAs), we found MYNRL15 dependency in myeloid leukemia cells of diverse genetic backgrounds. Interestingly, cis and trans perturbation approaches revealed both the MYNRL15 transcript and its flanking protein-coding genes to be dispensable. High density CRISPR tiling of a 15 kb area centered on MYNRL15 (1613 sgRNAs) instead uncovered two crucial, candidate cis-regulatory DNA elements in the locus, which drive the MYNRL15 perturbation phenotype. To determine the molecular basis of MYNRL15 dependence, we performed transcriptome, chromatin conformation, chromatin accessibility, and CTCF profiling. RNA-sequencing established MYNRL15's involvement in maintaining key cancer dependency pathways (e.g. cell cycle, ribosome, spliceosome). Further, MYNRL15 perturbation associated with the coordinated dysregulation of several chromosome 15 neighbourhoods, and formation of a long-range chromatin interaction between the locus and the base of a distal loop, as detected via next-generation Capture-C. 
The gained interaction was accompanied by diffuse gains in chromatin accessibility across the distal interaction sites (ATAC-seq) as well as reduced CTCF occupancy at the MYNRL15 locus (CTCF CUT&RUN), altogether indicating the 3D re-organization of chromosome 15 following MYNRL15 perturbation. Integrative analysis of the chromatin conformation and transcriptome data, combined with a small CRISPR-Cas9 knockout screen of protein-coding genes from the gained interaction region (29 genes, 149 sgRNAs), pinpointed two potent cancer dependency genes that are located in the region and downregulated following MYNRL15 perturbation: namely, WDR61 and IMP3. Individual knockout of both genes robustly depleted myeloid leukemia cells, recapitulating the MYNRL15 perturbation phenotype and positioning WDR61 and IMP3 as its regulatory targets. Importantly, in primary cells, MYNRL15 perturbation eradicated AML blasts while sparing 50-60% of CD34+ HSPCs in vitro, and reduced patient-derived AML xenografts up to 10-fold in vivo, indicating a potential therapeutic window. Having implicated MYNRL15 in 3D genome organization and demonstrated its role in myeloid leukemia cells, we explored whether MYNRL15 may belong to a sub-category of biologically relevant lncRNA loci that have thus far been overlooked due to their lack of transcript-specific functions. Remarkably, elevated CTCF density (e.g. number of CTCF binding sites per kb of gene length) distinguishes MYNRL15 and 531 other lncRNA loci in K562 cells, of which 43-54% associate with genetic subgroups and/or survival in AML patient cohorts, and 18.4% are functionally required for leukemia maintenance as determined by CRISPR-Cas9 screening.
The latter hit identification rate represents a substantial improvement over typical lncRNA essentiality screens (which range from 2-6%) - illustrating the effectiveness of CTCF density metrics in refining functional lncRNA candidate lists, and underlining the relevance such loci hold for AML and cancer pathophysiology in general. Curated C-LNC catalogs in other cell types will facilitate the search for noncoding oncogenic vulnerabilities in AML and other malignancies. Disclosures Reinhardt: Celgene Corporation: Consultancy; Novartis: Consultancy; Bluebird Bio: Consultancy; Janssen: Consultancy; CSL Behring: Research Funding; Roche: Research Funding. Klusmann: Bluebird Bio: Consultancy; Novartis: Consultancy; Roche: Consultancy; Jazz Pharmaceuticals: Consultancy.
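
The CTCF density metric mentioned in the abstract above (number of CTCF binding sites per kb of gene length) is simple to compute once site counts and locus coordinates are available. The following is a minimal sketch under that assumption; the function names, the input layout, and the locus names `locus_A` and `locus_B` are hypothetical, and no cutoff for "elevated" density is implied because the abstract does not state one.

```python
def ctcf_density(n_sites, start, end):
    """CTCF binding sites per kb of locus length.

    start and end are genomic coordinates in bp (end exclusive).
    """
    length_kb = (end - start) / 1000.0
    if length_kb <= 0:
        raise ValueError("locus end must be greater than start")
    return n_sites / length_kb


def rank_by_density(loci):
    """Rank loci from highest to lowest CTCF density.

    loci: dict mapping a locus name to (n_sites, start, end).
    Returns a list of (name, density) tuples.
    """
    densities = {name: ctcf_density(*vals) for name, vals in loci.items()}
    return sorted(densities.items(), key=lambda item: item[1], reverse=True)


# Toy loci with made-up coordinates and site counts.
toy_loci = {
    "locus_A": (6, 1000, 4000),   # 6 sites over 3 kb
    "locus_B": (1, 0, 10000),     # 1 site over 10 kb
}
print(rank_by_density(toy_loci))
```

Normalising by length keeps long loci from ranking highly simply because they span more sequence.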


2012 ◽  
Vol 11 (4) ◽  
pp. 417-429 ◽  
Author(s):  
Karen Chinchilla ◽  
Juan B. Rodriguez-Molina ◽  
Doris Ursic ◽  
Jonathan S. Finkel ◽  
Aseem Z. Ansari ◽  
...  

ABSTRACT The Saccharomyces cerevisiae SEN1 gene codes for a nuclear, ATP-dependent helicase which is embedded in a complex network of protein-protein interactions. Pleiotropic phenotypes of mutations in SEN1 suggest that Sen1 functions in many nuclear processes, including transcription termination, DNA repair, and RNA processing. Sen1, along with termination factors Nrd1 and Nab3, is required for the termination of noncoding RNA transcripts, but Sen1 is associated during transcription with coding and noncoding genes. Sen1 and Nrd1 both interact directly with Nab3, as well as with the C-terminal domain (CTD) of Rpb1, the largest subunit of RNA polymerase II. It has been proposed that Sen1, Nab3, and Nrd1 form a complex that associates with Rpb1 through an interaction between Nrd1 and the Ser5-phosphorylated (Ser5-P) CTD. To further study the relationship between the termination factors and Rpb1, we used two-hybrid analysis and immunoprecipitation to characterize sen1-R302W, a mutation that impairs an interaction between Sen1 and the Ser2-phosphorylated CTD. Chromatin immunoprecipitation indicates that the impairment of the interaction between Sen1 and Ser2-P causes the reduced occupancy of mutant Sen1 across the entire length of noncoding genes. For protein-coding genes, mutant Sen1 occupancy is reduced early and late in transcription but is similar to that of the wild type across most of the coding region. The combined data suggest a handoff model in which proteins differentially transfer from the Ser5- to the Ser2-phosphorylated CTD to promote the termination of noncoding transcripts or other cotranscriptional events for protein-coding genes.


2016 ◽  
Author(s):  
Chia-Yi Cheng ◽  
Vivek Krishnakumar ◽  
Agnes Chan ◽  
Seth Schobel ◽  
Christopher D. Town

ABSTRACT The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, noncoding RNA, and small RNA. The most recent annotation update (TAIR10), released more than five years ago, had a profound impact on Arabidopsis research. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue-specific RNA-seq libraries from 113 datasets and constructed 48,359 transcript models of protein-coding genes in eleven tissues. In addition, we annotated various classes of noncoding RNA including small RNA, long intergenic RNA, small nucleolar RNA, natural antisense transcript, small nuclear RNA, and microRNA using published datasets and in-house analytic results. Altogether, we identified 738 novel protein-coding genes, 508 novel transcribed regions, 5,051 non-coding genes, and 35,846 small-RNA loci that formerly eluded annotation. Analysis of the splicing events and RNA-seq based expression profiles revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. Furthermore, we present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation, with a substantially increased resolution of gene models, will further our understanding of the biological processes not only of this plant model but also of other species.

