DeeReCT-TSS: A novel meta-learning-based method annotates TSS in multiple cell types based on DNA sequences and RNA-seq data

2021 ◽  
Author(s):  
Juexiao Zhou ◽  
Bin Zhang ◽  
Haoyang Li ◽  
Longxi Zhou ◽  
Zhongxiao Li ◽  
...  

The accurate annotation of transcription start sites (TSSs) and their usage is critical for the mechanistic understanding of gene regulation under different biological contexts. To this end, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner. Various computational tools have also been developed for in silico prediction of TSSs based solely on genomic sequences. Most of these tools produce a large number of false positive predictions when applied at the genome scale. Here, we present DeeReCT-TSS, a deep-learning-based method that is capable of identifying TSSs across the whole genome based on DNA sequences and conventional RNA-seq data. We show that by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotation on 10 cell types, which enables the identification of cell-type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets from the ENCODE project by correlating our predicted TSSs with experimentally defined TSS chromatin states.

2021 ◽  
Author(s):  
Juexiao Zhou ◽  
Bin Zhang ◽  
Haoyang Li ◽  
Longxi Zhou ◽  
Zhongxiao Li ◽  
...  

The accurate annotation of transcription start sites (TSSs) and their usage is critical for the mechanistic understanding of gene regulation under different biological contexts. To this end, on one hand, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner. On the other hand, various computational tools have also been developed for in silico prediction of TSSs based solely on genomic sequences. Most of these computational tools cast the problem as a binary classification task on a balanced dataset and thus produce a large number of false positive predictions when applied at the genome scale. To address these issues, we present DeeReCT-TSS, a deep-learning-based method that is capable of identifying TSSs across the whole genome based on both DNA sequences and conventional RNA-seq data. We show that by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotation on 10 cell types, which enables the identification of cell-type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets from the ENCODE project by correlating our predicted TSSs with experimentally defined TSS chromatin states. Our application, pre-trained models and data are available at https://github.com/JoshuaChou2018/DeeReCT-TSS_release.
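The link between balanced training data and genome-scale false positives can be illustrated with a short calculation. This is a minimal sketch with purely hypothetical numbers: the sensitivity, specificity, and site counts below are assumptions chosen for illustration, not values reported for DeeReCT-TSS or any other tool.

```python
def precision(sensitivity, specificity, n_pos, n_neg):
    """Fraction of positive calls that are true positives."""
    true_pos = sensitivity * n_pos
    false_pos = (1.0 - specificity) * n_neg
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier with 95% sensitivity and 95% specificity,
# evaluated on a balanced set (as many negatives as positives).
balanced = precision(0.95, 0.95, n_pos=10_000, n_neg=10_000)

# Same hypothetical classifier scanning 100 million candidate positions,
# of which only 10,000 are assumed to be true sites.
genome_scale = precision(0.95, 0.95, n_pos=10_000, n_neg=100_000_000 - 10_000)

print(f"balanced: {balanced:.3f}, genome-scale: {genome_scale:.5f}")
```

Under these assumed numbers the same classifier that looks accurate on balanced data returns mostly false positives once negatives vastly outnumber true sites, which is the failure mode the abstract describes.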


2018 ◽  
Author(s):  
Sanju Sinha ◽  
Karina Barbosa Guerra ◽  
Kuoyuan Cheng ◽  
Mark DM Leiserson ◽  
David M Wilson ◽  
...  

Recent studies have reported that CRISPR-Cas9 gene editing induces a p53-dependent DNA damage response in primary cells, which may select for cells with oncogenic p53 mutations [11,12]. It is unclear whether these CRISPR-induced changes are applicable to different cell types, and whether CRISPR gene editing may select for other oncogenic mutations. Addressing these questions, we analyzed genome-wide CRISPR and RNAi screens to systematically chart the mutation selection potential of CRISPR knockouts across the whole exome. Our analysis suggests that CRISPR gene editing can select for mutants of KRAS and VHL, at a level comparable to that reported for p53. These predictions were further validated in a genome-wide manner by analyzing independent CRISPR screens and patients’ tumor data. Finally, we performed a new set of pooled and arrayed CRISPR screens to evaluate the competition between CRISPR-edited isogenic p53 WT and mutant cell lines, which further validated our predictions. In summary, our study systematically charts and points to the potential selection of specific cancer driver mutations during CRISPR-Cas9 gene editing.


2019 ◽  
Author(s):  
Qin Huang ◽  
Ken Y. Chan ◽  
Isabelle G. Tobey ◽  
Yujia Alina Chan ◽  
Tim Poterba ◽  
...  

The engineered AAV-PHP.B family of adeno-associated virus efficiently delivers genes throughout the mouse central nervous system. To guide their application across disease models, and to inspire the development of translational gene therapy vectors useful for targeting neurological diseases in humans, we sought to elucidate the host factors responsible for the CNS tropism of AAV-PHP.B vectors. Leveraging CNS tropism differences across mouse strains, we conducted a genome-wide association study, and rapidly identified and verified LY6A as an essential receptor for the AAV-PHP.B vectors in brain endothelial cells. Importantly, this newly discovered mode of AAV binding and transduction is independent of other known AAV receptors and can be imported into different cell types to confer enhanced transduction by the AAV-PHP.B vectors.


2019 ◽  
Author(s):  
Christopher Ritchie ◽  
Anthony F. Cordova ◽  
Lingyin Li

We previously reported that SLC19A1 is an importer of the immunotransmitter 2’3’-cyclic-GMP-AMP (cGAMP) [1] by performing a genome-wide screen in U937 cells. Soon after, Luteijn et al. reported similar findings by conducting a screen in THP-1 cells [2]. While the conclusions of these two studies largely overlap, we arrived at significantly different conclusions regarding how broadly SLC19A1 is used by different cell types. Our study suggests that in addition to SLC19A1, many cultured and primary cell types use alternative, unidentified transporters to import cGAMP and other cyclic dinucleotides (CDNs). This conclusion was based on our findings that inhibition of SLC19A1 did not significantly reduce extracellular cGAMP signaling in multiple cell types, including primary CD14+ peripheral blood mononuclear cells (PBMCs) from most donors. In contrast, Luteijn et al. concluded that SLC19A1 is the major CDN importer in humans, largely based on their use of a radiolabeled [32P] cGAMP uptake assay. Using this assay, they showed that inhibition of SLC19A1 abolishes [32P] uptake in total PBMCs. However, they did not test whether inhibition of SLC19A1 affects extracellular cGAMP signaling in these cells. Here, we highlight an important issue with the [32P] cGAMP uptake assay used by Luteijn et al. and demonstrate that measuring extracellular cGAMP signaling through the STING pathway is currently the best method for evaluating cGAMP import. We also show that inhibition of SLC19A1 has no effect on extracellular cGAMP signaling in total PBMCs, confirming that this cell type relies on other transport mechanisms for cGAMP import.


2020 ◽  
Author(s):  
Yuliang Wang

Single cell RNA-seq measures the transcriptomes of many cell types across diverse conditions. However, an emerging challenge is to uncover how different cell types communicate with each other to maintain tissue homeostasis, and how inter-cellular communications are perturbed in diseases. To address this problem, we developed talklr, an information theory-based approach to uncover potential ligand-receptor interactions involved in tissue homeostasis and diseases. Compared to existing approaches that analyze changes in each gene in each cell type separately, talklr uses a holistic approach to simultaneously consider expression changes in both ligands and receptors across multiple cell types and conditions. talklr outperformed existing approaches in identifying ligand-receptor interactions, including those known to be important for tissue-specific functions and diseases across diverse datasets. talklr can reveal important signaling events in many biological problems in an unbiased way, and will be a valuable tool in single cell RNA-seq analysis. talklr is available as both an interactive website and an R package.


Acta Naturae ◽  
2016 ◽  
Vol 8 (2) ◽  
pp. 79-86 ◽  
Author(s):  
P. V. Elizar’ev ◽  
D. V. Lomaev ◽  
D. A. Chetverina ◽  
P. G. Georgiev ◽  
M. M. Erokhin

Maintenance of the individual patterns of gene expression in different cell types is required for the differentiation and development of multicellular organisms. Expression of many genes is controlled by Polycomb (PcG) and Trithorax (TrxG) group proteins that act through association with chromatin. PcG/TrxG are assembled on the DNA sequences termed PREs (Polycomb Response Elements), the activity of which can be modulated and switched from repression to activation. In this study, we analyzed the influence of transcriptional read-through on PRE activity switch mediated by the yeast activator GAL4. We show that a transcription terminator inserted between the promoter and PRE does not prevent switching of PRE activity from repression to activation. We demonstrate that, independently of PRE orientation, high levels of transcription fail to dislodge PcG/TrxG proteins from PRE in the absence of a terminator. Thus, transcription is not the main factor required for PRE activity switch.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Karen R. Mifsud ◽  
Clare L. M. Kennedy ◽  
Silvia Salatino ◽  
Eshita Sharma ◽  
Emily M. Price ◽  
...  

Glucocorticoid hormones (GCs), acting through hippocampal mineralocorticoid receptors (MRs) and glucocorticoid receptors (GRs), are critical to physiological regulation and behavioural adaptation. We conducted genome-wide MR and GR ChIP-seq and Ribo-Zero RNA-seq studies on rat hippocampus to elucidate MR- and GR-regulated genes under circadian variation or acute stress. In a subset of genes, these physiological conditions resulted in enhanced MR and/or GR binding to DNA sequences and associated transcriptional changes. Binding of MR at a substantial number of sites, however, remained unchanged. MR and GR binding occur at overlapping as well as distinct loci. Moreover, although the GC response element (GRE) was the predominant motif, the transcription factor recognition site composition within MR and GR binding peaks shows marked differences. Pathway analysis uncovered that MR and GR regulate a substantial number of genes involved in synaptic/neuro-plasticity, cell morphology and development, behavior, and neuropsychiatric disorders. We find that MR, not GR, is the predominant receptor binding to >50 ciliary genes; and that MR function is linked to neuronal differentiation and ciliogenesis in human fetal neuronal progenitor cells. These results show that hippocampal MRs and GRs constitutively and dynamically regulate genomic activities underpinning neuronal plasticity and behavioral adaptation to changing environments.


2021 ◽  
Vol 22 (S2) ◽  
Author(s):  
Daniele D’Agostino ◽  
Pietro Liò ◽  
Marco Aldinucci ◽  
Ivan Merelli

Background: High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of DNA interactions and 3D chromosome folding at the genome-wide scale. Usually, these data are represented as matrices describing the binary contacts among the different chromosome regions. On the other hand, a graph-based representation can be advantageous to describe the complex topology achieved by the DNA in the nucleus of eukaryotic cells. Methods: Here we discuss the use of a graph database for storing and analysing data obtained from Hi-C experiments. The main issue is the size of the produced data and, when working with a graph-based representation, the consequent necessity of adequately managing a large number of edges (contacts) connecting nodes (genes), which represent the sources of information. For this, currently available graph visualisation tools and libraries fall short with Hi-C data. The use of graph databases, instead, supports both the analysis and the visualisation of the spatial pattern present in Hi-C data, in particular for comparing different experiments or for efficiently re-mapping omics data in a space-aware context. In particular, the possibility of describing graphs through statistical indicators and, even more, the capability of correlating them through statistical distributions make it possible to highlight similarities and differences among different Hi-C experiments, in different cell conditions or different cell types. Results: These concepts have been implemented in NeoHiC, an open-source and user-friendly web application for the progressive visualisation and analysis of Hi-C networks based on the use of the Neo4j graph database (version 3.5). Conclusion: With the accumulation of more experiments, the tool will provide invaluable support to compare neighbours of genes across experiments and conditions, helping in highlighting changes in functional domains and identifying new co-organised genomic compartments.
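The matrix-to-graph idea in this abstract can be sketched in a few lines. This is an illustrative assumption about the data layout, not NeoHiC's implementation or the Neo4j API: the function name `contacts_to_edges`, the region labels, and the contact counts are all hypothetical.

```python
def contacts_to_edges(matrix, labels, min_count=1):
    """Turn a symmetric contact matrix into (node_a, node_b, weight) edges."""
    edges = []
    for i in range(len(labels)):
        # Upper triangle only: contacts are undirected, so (i, j) == (j, i).
        for j in range(i + 1, len(labels)):
            if matrix[i][j] >= min_count:
                edges.append((labels[i], labels[j], matrix[i][j]))
    return edges


# Toy contact counts among three regions (hypothetical values).
labels = ["regionA", "regionB", "regionC"]
matrix = [
    [0, 5, 0],
    [5, 0, 2],
    [0, 2, 0],
]

edges = contacts_to_edges(matrix, labels)
print(edges)
```

An edge list of this kind is the usual starting point for loading contacts into a graph store; the `min_count` threshold is one simple way to limit the number of edges, which the abstract names as the main scaling concern.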


Metabolites ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 168
Author(s):  
John I. Hendry ◽  
Hoang V. Dinh ◽  
Debolina Sarkar ◽  
Lin Wang ◽  
Anindita Bandyopadhyay ◽  
...  

Nitrogen-fixing cyanobacteria can significantly improve the economic feasibility of cyanobacterial production processes by eliminating the requirement for reduced nitrogen. Anabaena sp. ATCC 33047 is a marine, heterocyst-forming, nitrogen-fixing cyanobacterium with a very short doubling time of 3.8 h. We developed a comprehensive genome-scale metabolic (GSM) model, iAnC892, for this organism using annotations and content obtained from multiple databases. iAnC892 describes both the vegetative and heterocyst cell types found in the filaments of Anabaena sp. ATCC 33047. iAnC892 includes 953 unique reactions and accounts for the annotation of 892 genes. Comparison of iAnC892 reaction content with the GSM of Anabaena sp. PCC 7120 revealed that there are 109 reactions, including uptake hydrogenase, pyruvate decarboxylase, and pyruvate-formate lyase, unique to iAnC892. iAnC892 enabled the analysis of energy production pathways in the heterocyst by allowing the cell-specific deactivation of the light-dependent electron transport chain and glucose-6-phosphate metabolizing pathways. The analysis revealed the importance of light-dependent electron transport in generating ATP and NADPH at the required ratio for optimal N2 fixation. When used alongside the strain design algorithm OptForce, iAnC892 recapitulated several of the experimentally successful genetic intervention strategies that overproduced valerolactam and caprolactam precursors.


2021 ◽  
Vol 14 (694) ◽  
pp. eabe0387
Author(s):  
Orna Ernst ◽  
Jing Sun ◽  
Bin Lin ◽  
Balaji Banoth ◽  
Michael G. Dorrington ◽  
...  

Noncanonical inflammasome activation by cytosolic lipopolysaccharide (LPS) is a critical component of the host response to Gram-negative bacteria. Cytosolic LPS recognition in macrophages is preceded by a Toll-like receptor (TLR) priming signal required to induce transcription of inflammasome components and facilitate the metabolic reprogramming that fuels the inflammatory response. Using a genome-scale arrayed siRNA screen to find inflammasome regulators in mouse macrophages, we identified the mitochondrial enzyme nucleoside diphosphate kinase D (NDPK-D) as a regulator of both noncanonical and canonical inflammasomes. NDPK-D was required for both mitochondrial DNA synthesis and cardiolipin exposure on the mitochondrial surface in response to inflammasome priming signals mediated by TLRs, and macrophages deficient in NDPK-D had multiple defects in LPS-induced inflammasome activation. In addition, NDPK-D was required for the recruitment of TNF receptor–associated factor 6 (TRAF6) to mitochondria, which was critical for reactive oxygen species (ROS) production and the metabolic reprogramming that supported the TLR-induced gene program. NDPK-D knockout mice were protected from LPS-induced shock, consistent with decreased ROS production and attenuated glycolytic commitment during priming. Our findings suggest that, in response to microbial challenge, NDPK-D–dependent TRAF6 mitochondrial recruitment triggers an energetic fitness checkpoint required to engage and maintain the transcriptional program necessary for inflammasome activation.

