Genome-wide Prediction of DNase I Hypersensitivity Using Gene Expression

2016 ◽  
Author(s):  
Weiqiang Zhou ◽  
Ben Sherwood ◽  
Zhicheng Ji ◽  
Fang Du ◽  
Jiawei Bai ◽  
...  

We evaluate the feasibility of using a biological sample’s transcriptome to predict its genome-wide regulatory element activities measured by DNase I hypersensitivity (DH). We develop BIRD, Big Data Regression for predicting DH, to handle this high-dimensional problem. Applying BIRD to the Encyclopedia of DNA Elements (ENCODE) data, we found that gene expression to a large extent predicts DH, and that the information useful for prediction is contained in the whole transcriptome rather than limited to a regulatory element’s neighboring genes. We show that the predicted DH predicts transcription factor binding sites (TFBSs), that prediction models trained using ENCODE data can be applied to gene expression samples in the Gene Expression Omnibus (GEO) to predict the regulome, and that one can use predictions as pseudo-replicates to improve the analysis of high-throughput regulome profiling data. Besides improving our understanding of the regulome-transcriptome relationship, this study suggests that transcriptome-based prediction can provide a useful new approach for regulome mapping.


2017 ◽  
Vol 8 (1) ◽  
Author(s):  
Weiqiang Zhou ◽  
Ben Sherwood ◽  
Zhicheng Ji ◽  
Yingchao Xue ◽  
Fang Du ◽  
...  


1999 ◽  
Vol 152 (1-2) ◽  
pp. 147-159 ◽  
Author(s):  
Michelle Gaasenbeek ◽  
Birgit Gellersen ◽  
Gabriel E. DiMattia


2019 ◽  
Author(s):  
Zou Yutong ◽  
Bui Thuy Tien ◽  
Kumar Selvarajoo

Abstract Here we report a bio-statistical/informatics tool, ABioTrans, developed in R for gene expression analysis. The tool allows the user to directly read RNA-Seq data files deposited in the Gene Expression Omnibus (GEO) database. Operated using any web browser application, ABioTrans provides easy options for multiple statistical distribution fitting, Pearson and Spearman rank correlations, PCA, k-means and hierarchical clustering, differential expression analysis, Shannon entropy and noise (square of coefficient of variation) analyses, as well as Gene Ontology classifications.
Availability and implementation: ABioTrans is available at https://github.com/buithuytien/ABioTrans
Operating system(s): Platform independent (web browser)
Programming language: R (RStudio)
Other requirements: Bioconductor genome-wide annotation databases, R packages (shiny, LSD, fitdistrplus, actuar, entropy, moments, RUVSeq, edgeR, DESeq2, NOISeq, AnnotationDbi, ComplexHeatmap, circlize, clusterProfiler, reshape2, DT, plotly, shinycssloaders, dplyr, ggplot2). These packages will automatically be installed when ABioTrans.R is executed in RStudio.
No restriction of usage for non-academic users.
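The two summary statistics named in the abstract, Shannon entropy and noise (the squared coefficient of variation), can be illustrated with a minimal Python sketch. This is not ABioTrans code (the tool itself is written in R). The function names, the use of population variance, and the equal-width binning used for the entropy estimate are all assumptions made for illustration.

```python
import math
from statistics import mean, pvariance


def noise_cv2(values):
    """Noise as the squared coefficient of variation: variance / mean^2.

    Assumption: population variance is used; a tool may use sample variance.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("mean expression is zero; CV^2 is undefined")
    return pvariance(values) / (m * m)


def shannon_entropy(values, bins=4):
    """Shannon entropy (in bits) of expression values.

    Assumption: values are discretized into `bins` equal-width bins, and the
    entropy is computed from the resulting bin frequencies.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # all values identical: a single occupied bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls on the upper edge; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


if __name__ == "__main__":
    expression = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical expression values
    print("noise (CV^2):", noise_cv2(expression))
    print("entropy (bits):", shannon_entropy(expression))
```

The entropy estimate depends on the chosen number of bins, so values are only comparable across samples when the same binning is used.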



2019 ◽  
Vol 47 (20) ◽  
pp. 10580-10596 ◽  
Author(s):  
Karl J V Nordström ◽  
Florian Schmidt ◽  
Nina Gasparoni ◽  
Abdulrahman Salhab ◽  
Gilles Gasparoni ◽  
...  

Abstract Chromatin accessibility maps are important for the functional interpretation of the genome. Here, we systematically analysed assay-specific differences between DNase I-seq, ATAC-seq and NOMe-seq in a side-by-side experimental and bioinformatic setup. We observe that the most prominent nucleosome-depleted regions (NDRs, e.g. in promoters) are robustly called by all three or at least two assays. However, we also find a high proportion of assay-specific NDRs that are often ‘called’ by only one of the assays. We show evidence that these assay-specific NDRs are indeed genuine open chromatin sites and contribute important information for accurate gene expression prediction. While ATAC-seq and DNase I-seq technically provide a high NDR calling rate for relatively low sequencing costs in comparison to NOMe-seq, NOMe-seq stands out for its genome-wide coverage, which allows detection not only of NDRs but also of endogenous DNA methylation and, as we show here, genome-wide segmentation into heterochromatic B domains and local phasing of nucleosomes outside of NDRs. In summary, our comparisons strongly suggest that assay-specific differences should be considered in experimental design and in generalized and comparative functional interpretations.





Blood ◽  
2010 ◽  
Vol 116 (21) ◽  
pp. 1024-1024
Author(s):  
Ilaria Iacobucci ◽  
Annalisa Lonetti ◽  
Alberto Ferrarini ◽  
Marco Sazzini ◽  
Anna Ferrari ◽  
...  

Abstract 1024 Although treatment with tyrosine kinase inhibitors has revolutionized the management of adult patients with BCR-ABL1-positive ALL and significantly improved response rates, relapse is still an expected and early event in the majority of them. It is usually attributed to the emergence of resistant clones with mutations in the BCR-ABL1 kinase domain or to BCR-ABL1-independent pathways, but many questions remain unresolved about the totality of genetic abnormalities and about which alterations really matter for relapse. In an attempt to better understand the genetic mechanisms responsible for this phenomenon, we have analyzed matched diagnosis-relapse samples by three high-resolution approaches: genome-wide single nucleotide polymorphism (SNP) array (Affymetrix Genome-Wide Human SNP Array 6.0), gene expression profiling (Affymetrix Human Exon 1.0 ST Array) and whole transcriptome deep sequencing (RNA-Seq) by Illumina/Solexa technology. An additional subset of 30 adult BCR-ABL1-positive ALL cases were analyzed by SNP and gene-expression arrays and by candidate gene sequencing in order to validate specific alterations. For whole transcriptome deep sequencing (RNA-seq), poly(A) RNA from blast cells was used to prepare cDNA libraries for the Illumina/Solexa Genome Analyzer. Obtained sequence reads were mapped to the human genome reference sequence (UCSC hg18) to identify single nucleotide variants (SNVs). Reads that showed no match were mapped to a dataset of all possible splice junctions created in silico to identify alternative splicing (AS) events. The number of reads corresponding to RNA from known exons was also estimated and a normalized measure of gene expression level (RPKM) was computed. RNA-seq analysis generated 13.9 and 15.8 million reads from de novo and relapsed ALL samples, respectively, most of which successfully mapped to the reference sequence of the human genome. 
At diagnosis, 6 novel missense single nucleotide variations (SNVs) were detected after applying stringent criteria to reduce the SNV discovery false positive rate and validating novel substitutions with genomic DNA Sanger sequencing: 5 affected genes involved in metabolic processes (PDE4DIP, EIF2S3, DPEP1, ZC3H12D, TMEM46) and one affected a gene involved in transport (MVP). These mutations disappeared at relapse, when 3 novel missense mutations affecting genes involved in cell cycle regulation (CDC2L1) and catalytic activity (CTSZ, CXorf21) were found. Furthermore, the T315I mutation in the Bcr-Abl kinase domain was also identified. These differences in mutational patterns suggest that the leukemia clone from which the relapsed cells developed was not the predominant one at diagnosis and that relapse-specific variants were mutations acquired only during leukemia progression. Only the R20Q DPEP1 mutation was found in 1/30 additional BCR-ABL1-positive adult cases. By RNA-seq, 4,334 primary ALL and 3,651 relapse isoforms with at least one AS event were identified. 240 of these genes were known cancer-related alternatively spliced genes, of which kinases and transcription regulators were the most represented functional classes. An average of 1.5 and 1.3 AS events per isoform, respectively, was estimated. The well-known alternatively spliced IKZF1 isoform was also detected at both diagnosis and relapse. Finally, a detailed gene expression profile was obtained indicating that more than 60% of annotated human genes were transcribed in leukemia cells in both diagnosis and relapse phases. Approximately 23% of genes were up-regulated at relapse with respect to diagnosis. Many of these genes affect cell cycle progression (AURORA A, SURVIVIN, PLK1, CDK1, Cyclin A, Cyclin B), suggesting that the loss of cell cycle control and the subsequent increased proliferation play a role in disease progression. 
Conversely, only 9% of active genes in both samples were down-regulated at relapse with respect to diagnosis. In conclusion, this study provided, for the first time, a quite comprehensive overview of a BCR-ABL1-positive ALL transcriptome, identifying novel mutations, changes in gene expression levels and AS events potentially involved in ALL progression. Supported by: European LeukemiaNet, AIL, AIRC, Fondazione Del Monte di Bologna e Ravenna, FIRB 2006, Ateneo RFO grants, Project of integrated program (PIO), Programma di Ricerca Regione – Università 2007 – 2009. Disclosures: Baccarani: Novartis: Honoraria; Bristol Myers Squibb: Honoraria. Martinelli: Novartis: Consultancy, Honoraria; BMS: Honoraria; Pfizer: Consultancy.
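The normalized expression measure mentioned in this abstract, RPKM (reads per kilobase of exon model per million mapped reads), follows a standard formula that can be sketched in Python. This is a worked illustration of the general definition only. The function name and the example counts are hypothetical and are not taken from the study.

```python
def rpkm(reads_in_gene, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of exon model per Million mapped reads.

    RPKM = reads_in_gene / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6)
         = reads_in_gene * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return reads_in_gene * 1e9 / (gene_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical numbers: 500 reads on a 2,000 bp gene, 10 million mapped reads.
    print(rpkm(500, 2_000, 10_000_000))
```

Dividing by gene length and by sequencing depth is what makes values comparable between genes of different lengths and between libraries of different sizes, such as a diagnosis and a relapse sample.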



2019 ◽  
Vol 17 (04) ◽  
pp. 1950022
Author(s):  
Huiqing Wang ◽  
Chun Li ◽  
Jianhui Zhang ◽  
Jingjing Wang ◽  
Yue Ma ◽  
...  

Molecular biology combined with in silico machine learning and deep learning has facilitated the broad application of gene expression profiles for gene function prediction, optimal crop breeding, disease-related gene discovery, and drug screening. Although the acquisition cost of genome-wide expression profiles has been steadily declining, the cost of generating a compendium of expression profiles from thousands of samples remains high. The Library of Integrated Network-Based Cellular Signatures (LINCS) program used approximately 1000 landmark genes to predict the expression of the remaining target genes by linear regression; however, this approach ignored the nonlinear features influencing gene expression relationships, limiting the accuracy of the experimental results. We herein propose a gene expression prediction model, L-GEPM, based on long short-term memory (LSTM) neural networks, which captures the nonlinear features affecting gene expression and uses the learned features to predict the target genes. A comparison of the experimental errors and fitting effects of different prediction models shows that the LSTM neural network-based model, L-GEPM, achieves low error and a superior fitting effect.
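The linear-regression baseline described here, predicting a target gene from landmark genes, can be sketched in a few lines of Python. This is a toy ordinary-least-squares fit, not LINCS code and not L-GEPM. The function names and the data are hypothetical, and two landmark genes stand in for the roughly 1000 used in practice.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations.

    X: list of samples, each a list of landmark-gene expression values.
    y: list of target-gene expression values, one per sample.
    Returns [intercept, w_1, ..., w_p].
    """
    rows = [[1.0] + [float(v) for v in x] for x in X]
    p = len(rows[0])
    # Augmented system (A^T A | A^T y).
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; landmark genes are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(p):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] for i in range(p)]


def predict(coef, x):
    """Predict the target-gene expression for one sample of landmark values."""
    return coef[0] + sum(w * v for w, v in zip(coef[1:], x))


if __name__ == "__main__":
    # Hypothetical training data: target = 1 + 2 * gene_a - 1 * gene_b.
    landmarks = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
    target = [1, 4, 3, 6, 6]
    coef = fit_linear(landmarks, target)
    print(coef, predict(coef, [6, 2]))
```

A model of this form can only represent additive, linear effects of the landmark genes, which is the limitation the abstract cites as motivation for a nonlinear model.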



2017 ◽  
Vol 24 (10) ◽  
pp. 531-541 ◽  
Author(s):  
Masanori Murakami ◽  
Takanobu Yoshimoto ◽  
Kazuhiko Nakabayashi ◽  
Yujiro Nakano ◽  
Takahiro Fukaishi ◽  
...  

The pathophysiology of aldosterone-producing adenomas (APAs) has been investigated via genetic approaches and the pathogenic significance of a series of somatic mutations, including KCNJ5, has been uncovered. However, how the mutational status of an APA is associated with its molecular characteristics, including its transcriptome and methylome, has not been fully understood. This study was undertaken to explore the molecular characteristics of APAs, specifically focusing on APAs with KCNJ5 mutations as opposed to those without KCNJ5 mutations, by comparing their transcriptome and methylome status. Cortisol-producing adenomas (CPAs) were used as reference. We conducted transcriptome and methylome analyses of 29 APAs with KCNJ5 mutations, 8 APAs without KCNJ5 mutations and 5 CPAs. Genome-wide gene expression and CpG methylation profiles were obtained from RNA and DNA samples extracted from these 42 adrenal tumors. Cluster analysis of the transcriptome and methylome revealed molecular heterogeneity in APAs depending on their mutational status. DNA hypomethylation and gene expression changes in Wnt signaling and inflammatory response pathways were characteristic of APAs with KCNJ5 mutations. Comparisons between transcriptome data from our APAs and that from normal adrenal cortex obtained from the Gene Expression Omnibus suggested similarities between APAs with KCNJ5 mutations and zona glomerulosa. The present study, which is based on transcriptome and methylome analyses, indicates the molecular heterogeneity of APAs depends on their mutational status. Here, we report the unique characteristics of APAs with KCNJ5 mutations.



2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Damien J. Downes ◽  
Robert A. Beagrie ◽  
Matthew E. Gosden ◽  
Jelena Telenius ◽  
Stephanie J. Carpenter ◽  
...  

Abstract Chromosome conformation capture (3C) provides an adaptable tool for studying diverse biological questions. Current 3C methods generally provide either low-resolution interaction profiles across the entire genome, or high-resolution interaction profiles at limited numbers of loci. Due to technical limitations, generation of reproducible high-resolution interaction profiles has not been achieved at genome-wide scale. Here, to overcome this barrier, we systematically test each step of 3C and report two improvements over current methods. We show that up to 30% of reporter events generated using the popular in situ 3C method arise from ligations between two individual nuclei, but this noise can be almost entirely eliminated by isolating intact nuclei after ligation. Using Nuclear-Titrated Capture-C, we generate reproducible high-resolution genome-wide 3C interaction profiles by targeting 8055 gene promoters in erythroid cells. By pairing high-resolution 3C interaction calls with nascent gene expression we interrogate the role of promoter hubs and super-enhancers in gene regulation.


