HiC-Spector: A matrix library for spectral and reproducibility analysis of Hi-C contact maps

2016 ◽  
Author(s):  
Koon-Kiu Yan ◽  
Galip Gürkan Yardimci ◽  
William S. Noble ◽  
Mark Gerstein

Summary: Genome-wide proximity ligation based assays like Hi-C have opened a window to the 3D organization of the genome. In so doing, they present data structures that are different from conventional 1D signal tracks. To exploit the 2D nature of Hi-C contact maps, matrix techniques like spectral analysis are particularly useful. Here, we present HiC-spector, a collection of matrix-related functions for analyzing Hi-C contact maps. In particular, we introduce a novel reproducibility metric for quantifying the similarity between contact maps based on spectral decomposition. The metric successfully separates contact maps derived from Hi-C data coming from biological replicates, pseudo-replicates and different cell types.

Availability: Source code in Julia and the documentation of HiC-spector can be freely obtained at https://github.com/gersteinlab/[email protected]
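The abstract does not spell out the metric itself, so the following is only a minimal sketch of what a spectral comparison of two contact maps could look like, not the HiC-spector implementation (which is written in Julia). It assumes symmetric, non-negative contact matrices of equal size; the function names `normalized_laplacian` and `spectral_distance`, the choice of the normalized Laplacian, and the parameter `n_vectors` are all illustrative assumptions.

```python
import numpy as np


def normalized_laplacian(contact_map):
    """Return the symmetric normalized Laplacian I - D^(-1/2) W D^(-1/2)."""
    w = np.asarray(contact_map, dtype=float)
    degree = w.sum(axis=1)
    # Bins with no contacts get a zero scaling factor to avoid division by zero.
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    return np.eye(len(w)) - inv_sqrt[:, None] * w * inv_sqrt[None, :]


def spectral_distance(map_a, map_b, n_vectors=3):
    """Sum the distances between the leading Laplacian eigenvectors of two maps.

    Hypothetical illustration: smaller values mean more similar maps.
    """
    # eigh returns eigenvalues in ascending order with matching eigenvectors.
    _, vecs_a = np.linalg.eigh(normalized_laplacian(map_a))
    _, vecs_b = np.linalg.eigh(normalized_laplacian(map_b))
    total = 0.0
    for i in range(n_vectors):
        va, vb = vecs_a[:, i], vecs_b[:, i]
        # An eigenvector is only defined up to sign, so take the closer orientation.
        total += min(np.linalg.norm(va - vb), np.linalg.norm(va + vb))
    return total


if __name__ == "__main__":
    # Toy 4-bin maps: same bins, but the strong contacts pair up differently.
    map_a = [[0, 5, 1, 0], [5, 0, 0, 1], [1, 0, 0, 5], [0, 1, 5, 0]]
    map_b = [[0, 1, 5, 0], [1, 0, 0, 5], [5, 0, 0, 1], [0, 5, 1, 0]]
    print(spectral_distance(map_a, map_a, n_vectors=2))
    print(spectral_distance(map_a, map_b, n_vectors=2))
```

Comparing eigenvectors rather than raw entries makes the score sensitive to large-scale structure in the map instead of bin-by-bin noise, which is the general motivation for spectral approaches to 2D contact data.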


2020 ◽  
Author(s):  
Haizi Zheng ◽  
Michelle S Zhu ◽  
Yaping Liu

Summary: Circulating cell-free DNA (cfDNA) is a promising biomarker for the diagnosis and prognosis of many diseases, including cancer. The genome-wide non-random fragmentation patterns of cfDNA are associated with the nucleosomal protection, epigenetic environment, and gene expression in the cell types that contributed to cfDNA. However, progress on the development of computational methods and on understanding the molecular mechanisms behind cfDNA fragmentation patterns is significantly limited by the controlled access to cfDNA whole-genome sequencing (WGS) datasets. Here, we present FinaleDB (FragmentatIoN AnaLysis of cEll-free DNA DataBase), a comprehensive database to host thousands of uniformly processed and curated de-identified cfDNA WGS datasets across different pathological conditions. Furthermore, FinaleDB comes with a fragmentation genome browser, from which users can seamlessly integrate thousands of other omics data in different cell types to obtain a comprehensive view of both the gene-regulatory landscape and cfDNA fragmentation patterns.

Availability and implementation: FinaleDB service: http://finaledb.research.cchmc.org/ FinaleDB source code: https://github.com/epifluidlab/finaledb_portal and https://github.com/epifluidlab/[email protected]



2018 ◽  
Author(s):  
Sanju Sinha ◽  
Karina Barbosa Guerra ◽  
Kuoyuan Cheng ◽  
Mark DM Leiserson ◽  
David M Wilson ◽  
...  

Abstract: Recent studies have reported that CRISPR-Cas9 gene editing induces a p53-dependent DNA damage response in primary cells, which may select for cells with oncogenic p53 mutations [11,12]. It is unclear whether these CRISPR-induced changes are applicable to different cell types, and whether CRISPR gene editing may select for other oncogenic mutations. Addressing these questions, we analyzed genome-wide CRISPR and RNAi screens to systematically chart the mutation selection potential of CRISPR knockouts across the whole exome. Our analysis suggests that CRISPR gene editing can select for mutants of KRAS and VHL, at a level comparable to that reported for p53. These predictions were further validated in a genome-wide manner by analyzing independent CRISPR screens and patients’ tumor data. Finally, we performed a new set of pooled and arrayed CRISPR screens to evaluate the competition between CRISPR-edited isogenic p53 WT and mutant cell lines, which further validated our predictions. In summary, our study systematically charts and points to the potential selection of specific cancer driver mutations during CRISPR-Cas9 gene editing.



2018 ◽  
Author(s):  
Sourya Bhattacharyya ◽  
Vivek Chandra ◽  
Pandurangan Vijayanand ◽  
Ferhat Ay

Here we describe FitHiChIP (github.com/ay-lab/FitHiChIP), a computational method for identifying chromatin contacts among regulatory regions such as enhancers and promoters from HiChIP/PLAC-seq data. FitHiChIP jointly models the non-uniform coverage and genomic distance scaling of HiChIP data, captures previously validated enhancer interactions for several genes including MYC and TP53, and recovers contacts genome-wide that are supported by ChIA-PET, promoter capture Hi-C and Hi-C data. FitHiChIP also provides a framework for differential contact analysis as showcased in a comparison of HiChIP data we have generated for two distinct immune cell types.



2021 ◽  
Author(s):  
Juexiao Zhou ◽  
Bin Zhang ◽  
Haoyang Li ◽  
Longxi Zhou ◽  
Zhongxiao Li ◽  
...  

The accurate annotation of transcription start sites (TSSs) and their usage is critical for the mechanistic understanding of gene regulation under different biological contexts. To this end, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner. Various computational tools have also been developed for in silico prediction of TSSs solely based on genomic sequences. Most of these tools yield a large number of false-positive predictions when applied at genome scale. Here, we present DeeReCT-TSS, a deep-learning-based method that is capable of identifying TSSs across the whole genome based on DNA sequences and conventional RNA-seq data. We show that by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotation on 10 cell types, which enables the identification of cell-type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets from the ENCODE project by correlating our predicted TSSs with experimentally defined TSS chromatin states.



Author(s):  
Keunsoo Kang ◽  
Lothar Hennighausen

Abstract: The signal transducer and activator of transcription (STAT) family is activated by cytokines and conveys biochemical signals to the genome through binding to specific regulatory sequences, called IFN-γ-activated sequence (GAS) motifs. As common GAS motifs (TTCnnnGAA) contain only six conserved nucleotides, the mammalian genome harbors hundreds of thousands of copies of this sequence. However, it is not possible to predict which specific GAS motifs bind to STATs and are of functional significance. Here, we apply several layers of statistical, bioinformatics and experimental analyses to narrow down the number of GAS sites that might be of biological relevance. In particular, we determined the number of bona fide GAS motifs by utilizing publicly available genome-wide STAT5 ChIP-seq data sets. Fewer than 10% of GAS motifs within the mouse genome are recognized by STAT5 in vivo and only a small portion of them are shared across different cell types. However, even bona fide STAT5 binding did not predict that the respective gene was under cytokine-STAT control. Therefore, additional bioinformatics, genomic and epigenetic parameters, such as patterns of histone modifications, are required to more reliably predict the behavior of cytokine-STAT regulatory networks.
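To make concrete why a motif with only six conserved positions occurs so often, here is a minimal sketch of a sequence scan for the TTCnnnGAA consensus quoted above. It is an illustration only, not the authors' pipeline; the function name `find_gas_motifs` and the plain-string input are assumptions.

```python
import re

# TTCnnnGAA: six fixed bases around three unconstrained ones.
# The lookahead lets overlapping occurrences be reported as well.
GAS_PATTERN = re.compile(r"(?=(TTC[ACGT]{3}GAA))")


def find_gas_motifs(sequence):
    """Return (0-based start, matched 9-mer) for every GAS consensus site.

    The consensus is its own reverse complement, so scanning one strand
    is enough to locate every site on both strands.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in GAS_PATTERN.finditer(seq)]


if __name__ == "__main__":
    for start, site in find_gas_motifs("AATTCCTGGAATT"):
        print(start, site)
```

A scan like this only lists candidate sites. As the abstract stresses, most such candidates are not bound in vivo, so the hits would still need to be intersected with ChIP-seq peaks or other evidence.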



2019 ◽  
Author(s):  
Robert W Reid ◽  
Jacob W Ferrier ◽  
Jeremy J Jay

Summary: Databio is capable of providing fast and accurate annotation of gene-oriented data sets, coupled with an integrated identifier conversion service to empower downstream data mining and computational analysis. Databio is enabled by fast real-time data structures applied to over 137 million unique identifiers, and uses automated heuristics to permit accurate data provenance without highly specialized knowledge and bioinformatics training.

Availability and implementation: Freely available on the web at https://datab.io/. Source code and binaries are freely available for download at https://github.com/joiningdata/databio/, implemented in Go and supported on Linux, Windows, and macOS.



2021 ◽  
Author(s):  
Ruud HM Wijdeven ◽  
Birol Cabukusta ◽  
Xueer Qiu ◽  
Daniel M Borras ◽  
Yun Liang ◽  
...  

The PD-L1/2 - PD-1 immune checkpoint is essential for the proper induction of peripheral tolerance and limits autoimmunity, whereas tumor cells exploit PD-L1/2 expression to promote immune evasion. Many different cell types express PD-L1/2, either constitutively or upon stimulation, but the factors driving this expression are often not defined. Here, using genome-wide CRISPR-activation screening, we identified three factors that upregulate PD-L1 expression: GATA2, MBD6, and VGLL3. GATA2 and VGLL3 act as transcriptional regulators and their expression induced PD-L1 in many different cell types. Conversely, loss of VGLL3 impaired IFNγ-induced PD-L1/2 expression in keratinocytes. Mechanistically, by performing a second screen to identify proteins acting together with VGLL3, we found that VGLL3 forms a complex with TEAD1 and RUNX1/3 to drive expression of PD-L1/2. Collectively, our work identified a new transcriptional network controlling PD-L1/2 expression and suggests that VGLL3, in addition to its known role in the expression of pro-inflammatory genes, can balance inflammation by upregulating the anti-inflammatory factors PD-L1 and PD-L2.



2018 ◽  
Author(s):  
L. Carron ◽  
J.B. Morlot ◽  
Matthys V. ◽  
A. Lesne ◽  
J. Mozziconacci

Abstract: Genome-wide chromosomal contact maps are widely used to uncover the 3D organisation of genomes. They rely on the collection of millions of contacting pairs of genomic loci. Contact frequencies at short range are usually well measured in experiments, while much of the information about long-range contacts is missing.

We propose to use the sparse information contained in raw contact maps to determine high-confidence contact frequencies between all pairs of loci. Our algorithmic procedure, Boost-HiC, enables the detection of Hi-C patterns such as chromosomal compartments at a resolution that would otherwise be attainable only by sequencing the experimental Hi-C library a hundred times deeper. Boost-HiC can also be used to compare contact maps at an improved resolution.

Boost-HiC is available at https://github.com/LeopoldC/Boost-HiC



2019 ◽  
Author(s):  
Qin Huang ◽  
Ken Y. Chan ◽  
Isabelle G. Tobey ◽  
Yujia Alina Chan ◽  
Tim Poterba ◽  
...  

The engineered AAV-PHP.B family of adeno-associated virus efficiently delivers genes throughout the mouse central nervous system. To guide their application across disease models, and to inspire the development of translational gene therapy vectors useful for targeting neurological diseases in humans, we sought to elucidate the host factors responsible for the CNS tropism of AAV-PHP.B vectors. Leveraging CNS tropism differences across mouse strains, we conducted a genome-wide association study, and rapidly identified and verified LY6A as an essential receptor for the AAV-PHP.B vectors in brain endothelial cells. Importantly, this newly discovered mode of AAV binding and transduction is independent of other known AAV receptors and can be imported into different cell types to confer enhanced transduction by the AAV-PHP.B vectors.



2016 ◽  
Author(s):  
Elizabeth Baskin ◽  
Rick Farouni ◽  
Ewy A. Mathe

Summary: Regulatory elements regulate gene transcription, and their location and accessibility are cell-type specific, particularly for enhancers. Mapping and comparing chromatin accessibility between different cell types may identify mechanisms involved in cellular development and disease progression. To streamline and simplify differential analysis of regulatory elements genome-wide using chromatin accessibility data, such as DNase-seq and ATAC-seq, we developed ALTRE (ALTered Regulatory Elements), an R package and associated R Shiny web app. ALTRE makes such analysis accessible to a wide range of users – from novice to practiced computational biologists.

Availability: https://github.com/Mathelab/[email protected]


