Discovering long noncoding RNA predictors of anticancer drug sensitivity beyond protein-coding genes

2019 ◽  
Vol 116 (44) ◽  
pp. 22020-22029 ◽  
Author(s):  
Aritro Nath ◽  
Eunice Y. T. Lau ◽  
Adam M. Lee ◽  
Paul Geeleher ◽  
William C. S. Cho ◽  
...  

Large-scale cancer cell line screens have identified thousands of protein-coding genes (PCGs) as biomarkers of anticancer drug response. However, systematic evaluation of long noncoding RNAs (lncRNAs) as pharmacogenomic biomarkers has so far proven challenging. Here, we study the contribution of lncRNAs as drug response predictors beyond spurious associations driven by correlations with proximal PCGs, tissue lineage, or established biomarkers. We show that, as a whole, the lncRNA transcriptome is as potent as the PCG transcriptome at predicting response to hundreds of anticancer drugs. Analysis of individual lncRNA transcripts associated with drug response reveals that nearly half of the significant associations are in fact attributable to proximal cis-PCGs. However, adjusting for effects of cis-PCGs revealed significant lncRNAs that augment drug response predictions for most drugs, including those with well-established clinical biomarkers. In addition, we identify lncRNA-specific somatic alterations associated with drug response by adopting a statistical approach to determine lncRNAs carrying somatic mutations that undergo positive selection in cancer cells. Lastly, we experimentally demonstrate that 2 lncRNAs, EGFR-AS1 and MIR205HG, are functionally relevant predictors of anti-epidermal growth factor receptor (EGFR) drug response.

2019 ◽  
Author(s):  
Aritro Nath ◽  
Eunice Y.T. Lau ◽  
Adam M. Lee ◽  
Paul Geeleher ◽  
William C.S. Cho ◽  
...  

Large-scale cancer cell line screens have identified thousands of protein-coding genes (PCGs) as biomarkers of anticancer drug response. However, systematic evaluation of long non-coding RNAs (lncRNAs) as pharmacogenomic biomarkers has so far proven challenging. Here, we study the contribution of lncRNAs as drug response predictors beyond spurious associations driven by correlations with proximal PCGs, tissue lineage, or established biomarkers. We show that, as a whole, the lncRNA transcriptome is as potent as the PCG transcriptome at predicting response to hundreds of anticancer drugs. Analysis of individual lncRNA transcripts associated with drug response reveals that nearly half of the significant associations are in fact attributable to proximal cis-PCGs. However, adjusting for effects of cis-PCGs revealed significant lncRNAs that augment drug response predictions for most drugs, including those with well-established clinical biomarkers. In addition, we identify lncRNA-specific somatic alterations associated with drug response by adopting a statistical approach to determine lncRNAs carrying somatic mutations that undergo positive selection in cancer cells. Lastly, we experimentally demonstrate that two novel lncRNAs, EGFR-AS1 and MIR205HG, are functionally relevant predictors of anti-EGFR drug response.


2017 ◽  
Author(s):  
Morgan N. Price ◽  
Adam P. Arkin

Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources that link protein sequences to scientific articles (Swiss-Prot, GeneRIF, and EcoCyc). PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/.


eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Vladislava Chalei ◽  
Stephen N Sansom ◽  
Lesheng Kong ◽  
Sheena Lee ◽  
Juan F Montiel ◽  
...  

Many intergenic long noncoding RNA (lncRNA) loci regulate the expression of adjacent protein coding genes. Less clear is whether intergenic lncRNAs commonly regulate transcription by modulating chromatin at genomically distant loci. Here, we report both genomically local and distal RNA-dependent roles of Dali, a conserved central nervous system expressed intergenic lncRNA. Dali is transcribed downstream of the Pou3f3 transcription factor gene and its depletion disrupts the differentiation of neuroblastoma cells. Locally, Dali transcript regulates transcription of the Pou3f3 locus. Distally, it preferentially targets active promoters and regulates expression of neural differentiation genes, in part through physical association with the POU3F3 protein. Dali interacts with the DNMT1 DNA methyltransferase in mouse and human and regulates DNA methylation status of CpG island-associated promoters in trans. These results demonstrate, for the first time, that a single intergenic lncRNA controls the activity and methylation of genomically distal regulatory elements to modulate large-scale transcriptional programmes.


2018 ◽  
Vol 49 (6) ◽  
Author(s):  
Elsahookie et al.

The endosperm in cereals supplies nutrients to the developing kernel and seedling, and it is the primary tissue in which gene imprinting occurs. Developing maize (Zea mays L.) endosperms were analysed for allelic gene expression in both reciprocal crosses of inbreds B73 and Mo17. High-throughput transcriptome sequencing was performed on kernels of both reciprocals at 0, 3 up to 15 days after pollination (DAP), and found a gradual increase in paternal transcript expression in 3 and 5 DAP kernels. Meanwhile, in 7 DAP endosperm, most of the genes tested gave a 2:1 maternal:paternal ratio, suggesting that paternal genes are almost fully activated at 7 DAP. There were 300 paternally expressed genes (PEGs) and 499 maternally expressed genes (MEGs) identified across endosperm development stages. A 63 genes out of 116, 234 exhibited parent-specific expression were identified at 7, 10 and 15 DAP. Most paternally expressed genes were found at 7 DAP, owing to deviation of paternal allele expression at this stage of development. The relative expression of maternal and paternal alleles of imprinted genes differed at least fivefold in both crosses. A total of 179 (1.6%) protein-coding genes expressed in the endosperm were imprinted: 68 showed maternal preferential expression and 111 showed paternal expression. In addition, 38 long noncoding RNAs were found to be imprinted and transcribed in either sense or antisense direction from intronic regions of normal protein-coding genes or from intergenic regions. Imprinted genes were clustered around the genome. A total of 21 imprinted genes in the maize hybrid endosperm had differentially methylated regions (DMRs). All DMRs were found to be hypomethylated in maternal alleles and hypermethylated in paternal alleles. These results confirm a complex mechanism controlling imprinting, auxin activity, and developmental regulation in the maize endosperm. Studying F2 kernels on F1 plants may shed new light on controlling kernel number and weight per unit area.


Genetics ◽  
1999 ◽  
Vol 153 (1) ◽  
pp. 179-219 ◽  
Author(s):  
M Ashburner ◽  
S Misra ◽  
J Roote ◽  
S E Lewis ◽  
R Blazej ◽  
...  

A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized “Adh region.” A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species.
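The densities and percentages quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: the abstract gives the region length as "nearly 3 Mb", so the value of 2,900 kb used here is an assumption, and the variable names are hypothetical. It also assumes the 44% and 66% figures are taken relative to the 218 predicted protein-coding genes.

```python
# Counts quoted in the abstract.
protein_coding_genes = 218
transposable_elements = 17
genes_with_est_match = 95
genes_with_protein_similarity = 144

# Assumption: "nearly 3 Mb" is taken as 2,900 kb for illustration only.
region_kb = 2900

# Average spacing: region length divided by the number of features.
kb_per_gene = region_kb / protein_coding_genes
kb_per_te = region_kb / transposable_elements

# Percentages, assuming the 218 predicted genes are the denominator.
est_pct = 100 * genes_with_est_match / protein_coding_genes
similarity_pct = 100 * genes_with_protein_similarity / protein_coding_genes

print(f"one protein-coding gene every {kb_per_gene:.0f} kb")
print(f"one transposable element every {kb_per_te:.0f} kb")
print(f"{est_pct:.0f}% of genes match an EST")
print(f"{similarity_pct:.0f}% of genes resemble proteins in other organisms")
```

Under the assumed region length, the rounded values agree with the figures stated in the abstract (13 kb, 171 kb, 44%, and 66%).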


2021 ◽  
Author(s):  
Xiang Li ◽  
Qiongyi Zhao ◽  
Ziqi Wang ◽  
Wei-Siang Liau ◽  
Dean Basic ◽  
...  

Long noncoding RNAs (lncRNAs) comprise a new class of genes that have been assigned key roles in development and disease. Many lncRNAs are specifically transcribed in the brain where they regulate the expression of protein-coding genes that underpin neuronal function; however, their role in learning and memory remains largely unexplored. We used RNA Capture-Seq to identify a large population of lncRNAs that are expressed in the infralimbic cortex of adult male mice in response to fear-related learning, with 14.5% of these annotated in the GENCODE database as lncRNAs with no known function. We combined these data with cell-type-specific ATAC-seq on neurons that had been selectively activated by fear-extinction learning, and revealed 434 lncRNAs derived from enhancer regions in the vicinity of protein-coding genes. In particular, we discovered an experience-induced lncRNA called ADRAM that acts as both a scaffold and a combinatorial guide to recruit the brain-enriched chaperone protein 14-3-3 to the promoter of the memory-associated immediate early gene Nr4a2. This leads to the expulsion of histone deacetylases 3 and 4, and the recruitment of the histone acetyltransferase CREB binding protein, which drives learning-induced Nr4a2 expression. Knockdown of ADRAM disrupts this interaction, blocks the expression of Nr4a2, and ultimately impairs the formation of fear-extinction memory. This study expands the lexicon of experience-dependent lncRNA activity in the brain, highlights enhancer-derived RNAs (eRNAs) as key players in the epigenetic regulation of gene expression associated with fear extinction, and suggests eRNAs, such as ADRAM, may constitute viable targets in developing novel treatments for fear-related anxiety disorders.


2018 ◽  
Vol 75 ◽  
pp. 3-12 ◽  
Author(s):  
Vincent Boivin ◽  
Gabrielle Deschamps-Francoeur ◽  
Michelle S Scott

2019 ◽  
Vol 8 (16) ◽  
Author(s):  
Sukjung Choi ◽  
Eun Bae Kim

Lactobacillus plantarum strain EBKLp545 was isolated from piglet feces in South Korea and sequenced using an Illumina HiSeq system. This draft genome of strain EBKLp545 consists of 3,306,513 bp with 3,049 protein-coding genes in 138 contigs (≥500 bp), 54 noncoding RNA genes, and a 44.3% G+C content.


2019 ◽  
Vol 36 (7) ◽  
pp. 2025-2032
Author(s):  
Yuwei Zhang ◽  
Tianfei Yi ◽  
Huihui Ji ◽  
Guofang Zhao ◽  
Yang Xi ◽  
...  

Motivation: Long noncoding RNA (lncRNA) has been verified to interact with other biomolecules, especially protein-coding genes (PCGs), thus playing essential regulatory roles in life activities and disease development. However, the inner mechanisms of most lncRNA–PCG relationships are still unclear. Our study investigated the characteristics of true lncRNA–PCG relationships and constructed a novel predictor with machine learning algorithms. Results: We obtained 307 true lncRNA–PCG pairs from a database and found significant differences in multiple characteristics between true and random lncRNA–PCG sets. In addition, 3-fold cross-validation and predictions on independent test sets yielded high AUC values for LR, SVM and RF, among which RF performed best, with an average AUC of 0.818 for cross-validation and 0.823 and 0.853 for the two independent test sets, respectively. In a case study, some candidate lncRNA–PCG relationships in colorectal cancer were found, and the HOTAIR–COMP interaction was specifically exemplified. The proportion of reported pairs was significantly higher in the predicted positive results than in the negative results (P < 0.05). Supplementary information: Supplementary data are available at Bioinformatics online.
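For readers unfamiliar with the evaluation reported in this abstract, the following is a minimal sketch of how an average AUC over 3-fold cross-validation can be computed. It is not the authors' pipeline: the toy data, the interleaved fold assignment, and the stand-in scorer `fit_mean_difference` are all hypothetical and merely take the place of the LR, SVM and RF models named in the abstract.

```python
def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fold_indices(n, k=3):
    """Assign indices 0..n-1 to k interleaved folds (an illustrative choice)."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_mean_difference(xs, ys):
    """Hypothetical stand-in model: score by the feature value, oriented so
    that the class with the higher training mean gets the higher scores."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    direction = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: direction * x


def cross_validated_auc(features, labels, fit, k=3):
    """Train on k-1 folds, score the held-out fold, and average the AUCs."""
    fold_aucs = []
    for held_out in fold_indices(len(labels), k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        scores = [model(features[i]) for i in held_out]
        fold_aucs.append(auc([labels[i] for i in held_out], scores))
    return sum(fold_aucs) / k


# Toy data (made up): one feature per candidate pair, label 1 = true pair.
toy_features = [0.9, 0.2, 0.8, 0.4, 0.7, 0.1, 0.3, 0.5, 0.95, 0.35, 0.6, 0.25]
toy_labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

mean_auc = cross_validated_auc(toy_features, toy_labels, fit_mean_difference)
print(f"mean 3-fold AUC on toy data: {mean_auc:.3f}")
```

Any function that takes training features and labels and returns a scoring callable can be passed as `fit`, so a real classifier could be substituted for the stand-in without changing the evaluation loop.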

