Hi-C analyses with GENOVA: a case study with cohesin variants

2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Robin H van der Weide ◽  
Teun van den Brand ◽  
Judith H I Haarhuis ◽  
Hans Teunissen ◽  
Benjamin D Rowland ◽  
...  

Abstract Conformation capture approaches like Hi-C can elucidate chromosome structure at a genome-wide scale. Hi-C datasets are large and require specialised software. Here, we present GENOVA: a user-friendly software package to analyse and visualise chromosome conformation capture (3C) data. GENOVA is an R package that includes the most common Hi-C analyses, such as compartment and insulation score analysis. It can create annotated heatmaps to visualise the contact frequency at a specific locus and aggregate Hi-C signal over user-specified genomic regions such as ChIP-seq data. Finally, our package supports output from the major mapping pipelines. We demonstrate the capabilities of GENOVA by analysing Hi-C data from HAP1 cell lines in which the cohesin subunits SA1 and SA2 were knocked out. We find that ΔSA1 cells gain intra-TAD interactions and increase compartmentalisation. ΔSA2 cells have longer loops and a less compartmentalised genome. These results suggest that cohesinSA1 forms longer loops, while cohesinSA2 plays a role in forming and maintaining intra-TAD interactions. Our data support the model that the genome is given its 3D structure by the counterbalancing of loop formation on the one hand and compartmentalisation on the other. By differentially controlling loops, cohesinSA1 and cohesinSA2 therefore also affect nuclear compartmentalisation. We show that GENOVA is an easy-to-use R package that allows researchers to explore Hi-C data in great detail.
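The insulation score analysis mentioned in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not GENOVA's API or implementation: the function name, the window size and the toy matrix are all assumptions made for illustration. The idea is to slide a square window along the diagonal of a contact matrix and average the contacts that span each bin; bins where this average dips are candidate domain boundaries. Published variants usually add a log2 normalisation against the chromosome-wide mean, which is omitted here.

```python
import numpy as np


def insulation_score(contacts: np.ndarray, window: int) -> np.ndarray:
    """Mean contact frequency in a square window straddling each bin.

    For bin i the window covers rows i-window..i-1 (upstream bins) and
    columns i+1..i+window (downstream bins). Bins too close to the
    matrix edge to fit a full window are left as NaN.
    """
    n = contacts.shape[0]
    scores = np.full(n, np.nan)
    for i in range(window, n - window):
        square = contacts[i - window:i, i + 1:i + window + 1]
        scores[i] = square.mean()
    return scores


def toy_matrix() -> np.ndarray:
    """Hypothetical 10-bin matrix: two 5-bin domains, sparse contacts between."""
    m = np.full((10, 10), 1.0)
    m[:5, :5] = 10.0   # dense contacts inside the first domain
    m[5:, 5:] = 10.0   # dense contacts inside the second domain
    return m


scores = insulation_score(toy_matrix(), window=2)
# The lowest score marks the bin where the fewest contacts cross over.
boundary = int(np.nanargmin(scores))
print(scores)
print("lowest insulation score at bin", boundary)
```

In the toy matrix the score is high inside each domain and drops at the bins that flank the domain border, which is the pattern a boundary caller would look for.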

2021 ◽  
Author(s):  
Robin H. van der Weide ◽  
Teun van den Brand ◽  
Judith H.I. Haarhuis ◽  
Hans Teunissen ◽  
Benjamin D. Rowland ◽  
...  

Abstract Conformation capture approaches like Hi-C can elucidate chromosome structure at a genome-wide scale. Hi-C datasets are large and require specialised software. Here, we present GENOVA: a user-friendly software package to analyse and visualise conformation capture data. GENOVA is an R package that includes the most common Hi-C analyses, such as compartment and insulation score analysis. It can create annotated heatmaps to visualise the contact frequency at a specific locus and aggregate Hi-C signal over user-specified genomic regions such as ChIP-seq data. Finally, our package supports output from the major mapping pipelines. We demonstrate the capabilities of GENOVA by analysing Hi-C data from HAP1 cell lines in which the cohesin subunits SA1 and SA2 were knocked out. We find that ΔSA1 cells gain intra-TAD interactions and increase compartmentalisation. ΔSA2 cells have longer loops and a less compartmentalised genome. These results suggest that cohesinSA1 forms longer loops, while cohesinSA2 plays a role in forming and maintaining intra-TAD interactions. Our data support the model that the genome is given its 3D structure by the counterbalancing of loop formation on the one hand and compartmentalisation on the other. By differentially controlling loops, cohesinSA1 and cohesinSA2 therefore also affect nuclear compartmentalisation. We show that GENOVA is an easy-to-use R package that allows researchers to explore Hi-C data in great detail.


2021 ◽  
Vol 11 ◽  
Author(s):  
Matthew J. Rybin ◽  
Melina Ramic ◽  
Natalie R. Ricciardi ◽  
Philipp Kapranov ◽  
Claes Wahlestedt ◽  
...  

Genome instability is associated with myriad human diseases and is a well-known feature of both cancer and neurodegenerative disease. Until recently, the ability to assess DNA damage—the principal driver of genome instability—was limited to relatively imprecise methods or restricted to studying predefined genomic regions. Recently, new techniques for detecting DNA double strand breaks (DSBs) and single strand breaks (SSBs) with next-generation sequencing on a genome-wide scale with single nucleotide resolution have emerged. With these new tools, efforts are underway to define the “breakome” in normal aging and disease. Here, we compare the relative strengths and weaknesses of these technologies and their potential application to studying neurodegenerative diseases.


2019 ◽  
Vol 36 (5) ◽  
pp. 1509-1516
Author(s):  
Andrew W George ◽  
Arunas Verbyla ◽  
Joshua Bowden

Abstract Motivation: We present Eagle, a new method for multi-locus association mapping. The motivation for developing Eagle was to make multi-locus association mapping ‘easy’ and the method-of-choice. Eagle’s strengths are that it (i) is considerably more powerful than single-locus association mapping, (ii) does not suffer from multiple testing issues, (iii) gives results that are immediately interpretable and (iv) has a computational footprint comparable to single-locus association mapping. Results: By conducting a large simulation study, we show that Eagle finds true and avoids false single-nucleotide polymorphism trait associations better than competing single- and multi-locus methods. We also analyze data from a published mouse study. Eagle found over 50% more validated findings than the state-of-the-art single-locus method. Availability and implementation: Eagle has been implemented as an R package, with a browser-based Graphical User Interface for users less familiar with R. It is freely available via the CRAN website at https://cran.r-project.org. Videos, Quick Start guides, FAQs and Demos are available via the Eagle website http://eagle.r-forge.r-project.org. Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Yixin Guo ◽  
Ziwei Xue ◽  
Ruihong Yuan ◽  
Jingyi Jessica Li ◽  
William A Pastor ◽  
...  

Abstract Summary: With the advance of genomic sequencing techniques, chromatin accessible regions, transcription factor binding sites and epigenetic modifications can be identified at genome-wide scale. Conventional analyses focus on gene regulation at proximal regions; distal regions usually receive less attention, largely due to the lack of reliable tools to link these regions to coding genes. In this study, we introduce RAD (Region Associated Differentially expressed genes), a user-friendly web tool to identify both proximal and distal region associated differentially expressed genes (DEGs). With DEGs and genomic regions of interest (gROI) as input, RAD maps the up- and down-regulated genes associated with any gROI and helps researchers to infer the regulatory function of these regions based on the distance of gROI to differentially expressed genes. RAD includes visualization of the results and statistical inference for significance. Availability and implementation: RAD is implemented in Python 3.7 and runs on an Nginx server. RAD is freely available at https://labw.org/rad as an online web service. Supplementary information: Supplementary data are available at Bioinformatics online.
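The distance-based association that the abstract describes can be sketched roughly as follows. This is not RAD's code or interface: the function names, the record layout, the use of the transcription start site (TSS) as the gene coordinate and the 50 kb cut-off are all hypothetical choices made for illustration. The sketch counts up- and down-regulated genes whose TSS lies within a chosen distance of any region of interest.

```python
from collections import Counter


def distance_to_region(tss: int, start: int, end: int) -> int:
    """Distance in bp from a TSS to a region; 0 if the TSS is inside it."""
    if start <= tss <= end:
        return 0
    return min(abs(tss - start), abs(tss - end))


def count_degs_near_regions(regions, genes, max_distance):
    """Count DEGs by direction that fall within max_distance of any region.

    regions: list of (chrom, start, end) tuples (hypothetical gROI format)
    genes:   list of dicts with 'chrom', 'tss' and 'direction' keys
    """
    counts = Counter()
    for gene in genes:
        for chrom, start, end in regions:
            same_chrom = chrom == gene["chrom"]
            if same_chrom and distance_to_region(gene["tss"], start, end) <= max_distance:
                counts[gene["direction"]] += 1
                break  # count each gene once, even if several regions are close
    return counts


# Toy input, invented for this example.
regions = [("chr1", 1_000, 2_000)]
genes = [
    {"name": "geneA", "chrom": "chr1", "tss": 1_500, "direction": "up"},
    {"name": "geneB", "chrom": "chr1", "tss": 12_000, "direction": "down"},
    {"name": "geneC", "chrom": "chr1", "tss": 500_000, "direction": "up"},
    {"name": "geneD", "chrom": "chr2", "tss": 1_500, "direction": "down"},
]
counts = count_degs_near_regions(regions, genes, max_distance=50_000)
print(dict(counts))
```

Repeating the count for several distance cut-offs would give the kind of distance profile from which a regulatory role for the regions could be inferred; the statistical test the tool applies is not specified in the abstract and is not reproduced here.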


Author(s):  
Andrew W George ◽  
Arunas Verbyla ◽  
Joshua Bowden

Abstract Eagle is an R package for multi-locus association mapping on a genome-wide scale. It is unlike other multi-locus packages in that it is easy-to-use for R users and non-users alike. It has two modes of use, command line and GUI. Eagle is fully documented and has its own supporting website, http://eagle.r-forge.r-project.org/index.html. Eagle is a significant improvement over the method-of-choice, single-locus association mapping. It has greater power to detect SNP-trait associations. It is based on model selection, linear mixed models, and a clever idea on how random effects can be used to identify SNP-trait associations. Through an example with real mouse data, we demonstrate Eagle’s ability to bring clarity and increased insight to single-locus findings. Initially, we see Eagle complementing single-locus analyses. However, over time, we hope the community will make, increasingly, multi-locus association mapping their method-of-choice for the analysis of genome-wide association study data.


2020 ◽  
Vol 22 (1) ◽  
pp. 347
Author(s):  
Brandon Decker ◽  
Michal Liput ◽  
Hussam Abdellatif ◽  
Donald Yergeau ◽  
Yongho Bae ◽  
...  

During the development of mouse embryonic stem cells (ESC) to neuronal committed cells (NCC), coordinated changes in the expression of 2851 genes take place, mediated by the nuclear form of FGFR1. In this paper, widespread differences are demonstrated in the ESC and NCC inter- and intra-chromosomal interactions, chromatin looping, the formation of CTCF- and nFGFR1-linked Topologically Associating Domains (TADs) on a genome-wide scale and in exemplary HoxA-D loci. The analysis centered on HoxA cluster shows that blocking FGFR1 disrupts the loop formation. FGFR1 binding and genome locales are predictive of the genome interactions; likewise, chromatin interactions along with nFGFR1 binding are predictive of the genome function and correlate with genome regulatory attributes and gene expression. This study advances a topologically integrated genome archipelago model that undergoes structural transformations through the formation of nFGFR1-associated TADs. The makeover of the TAD islands serves to recruit distinct ontogenic programs during the development of the ESC to NCC.


2020 ◽  
Author(s):  
Yixin Guo ◽  
Ziwei Xue ◽  
Ruihong Yuan ◽  
William A. Pastor ◽  
Wanlu Liu

Abstract With the advance of genomic sequencing techniques, chromatin accessible regions, transcription factor binding sites and epigenetic modifications can be identified at genome-wide scale. Conventional analyses focus on gene regulation at proximal regions; distal regions are usually neglected, largely due to the lack of reliable tools to link the distal regions to coding genes. In this study, we introduce RAD (Region Associated Differentially expressed genes), a user-friendly web tool to identify both proximal and distal region associated differentially expressed genes. RAD maps the up- and down-regulated genes associated with any genomic regions of interest (gROI) and helps researchers to infer the regulatory function of these regions based on the distance of gROI to differentially expressed genes. RAD includes visualization of the results and statistical inference for significance. Availability: RAD is implemented in Python 3.7 and runs on an Nginx server. RAD is freely available at http://labw.org/rad as an online web service.


2021 ◽  
Vol 22 (22) ◽  
pp. 12383
Author(s):  
Alexander Kanapin ◽  
Mikhail Bankin ◽  
Tatyana Rozhmina ◽  
Anastasia Samsonova ◽  
Maria Samsonova

Modern flax cultivars are susceptible to many diseases; arguably, the most economically damaging of these is the Fusarium wilt fungal disease. Over the past decades international flax breeding initiatives resulted in the development of resistant cultivars. However, much remains to be learned about the mechanisms of resistance to Fusarium infection in flax. As a first step to uncover the genetic factors associated with resistance to Fusarium wilt disease, we performed a genome-wide association study (GWAS) using 297 accessions from the collection of the Federal Research Centre of the Bast Fiber Crops, Torzhok, Russia. These genotypes were infected with a highly pathogenic Fusarium oxysporum f.sp. lini MI39 strain; the wilt symptoms were documented in the course of three successive years. Six different single-locus models implemented in GAPIT3 R package were applied to a selected subset of 72,526 SNPs. A total of 15 QTNs (Quantitative Trait Nucleotides) were detected during at least two years of observation, while eight QTNs were found during all three years of the experiment. Of these, ten QTNs occupied a region of 640 Kb at the start of chromosome 1, while the remaining QTNs mapped to chromosomes 8, 11 and 13. All stable QTNs demonstrate a statistically significant allelic effect across 3 years of the experiment. Importantly, several QTNs spanned regions that harbored genes involved in the pathogen recognition and plant immunity response, including the KIP1-like protein (Lus10025717) and NBS-LRR protein (Lus10025852). Our results provide novel insights into the genetic architecture of flax resistance to Fusarium wilt and pinpoint potential candidate genes for further in-depth studies.


Nutrients ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1984
Author(s):  
Majid Nikpay ◽  
Sepehr Ravati ◽  
Robert Dent ◽  
Ruth McPherson

Here, we performed a genome-wide search for methylation sites that contribute to the risk of obesity. We integrated methylation quantitative trait locus (mQTL) data with BMI GWAS information through a SNP-based multiomics approach to identify genomic regions where mQTLs for a methylation site co-localize with obesity risk SNPs. We then tested whether the identified site contributed to BMI through Mendelian randomization. We identified multiple methylation sites causally contributing to the risk of obesity. We validated these findings through a replication stage. By integrating expression quantitative trait locus (eQTL) data, we noted that lower methylation at cg21178254 site upstream of CCNL1 contributes to obesity by increasing the expression of this gene. Higher methylation at cg02814054 increases the risk of obesity by lowering the expression of MAST3, whereas lower methylation at cg06028605 contributes to obesity by decreasing the expression of SLC5A11. Finally, we noted that rare variants within 2p23.3 impact obesity by making the cg01884057 site more susceptible to methylation, which consequently lowers the expression of POMC, ADCY3 and DNAJC27. In this study, we identify methylation sites associated with the risk of obesity and reveal the mechanism whereby a number of these sites exert their effects. This study provides a framework to perform an omics-wide association study for a phenotype and to understand the mechanism whereby a rare variant causes a disease.


2021 ◽  
Vol 22 (S2) ◽  
Author(s):  
Daniele D’Agostino ◽  
Pietro Liò ◽  
Marco Aldinucci ◽  
Ivan Merelli

Abstract Background: High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of DNA interactions and 3D chromosome folding at the genome-wide scale. Usually, these data are represented as matrices describing the binary contacts among the different chromosome regions. On the other hand, a graph-based representation can be advantageous to describe the complex topology achieved by the DNA in the nucleus of eukaryotic cells. Methods: Here we discuss the use of a graph database for storing and analysing data obtained from Hi-C experiments. The main issue is the size of the produced data and, working with a graph-based representation, the consequent necessity of adequately managing a large number of edges (contacts) connecting nodes (genes), which represent the sources of information. For this, currently available graph visualisation tools and libraries fall short with Hi-C data. The use of graph databases, instead, supports both the analysis and the visualisation of the spatial pattern present in Hi-C data, in particular for comparing different experiments or for re-mapping omics data in a space-aware context efficiently. In particular, the possibility of describing graphs through statistical indicators and, even more, the capability of correlating them through statistical distributions allows highlighting similarities and differences among different Hi-C experiments, in different cell conditions or different cell types. Results: These concepts have been implemented in NeoHiC, an open-source and user-friendly web application for the progressive visualisation and analysis of Hi-C networks based on the use of the Neo4j graph database (version 3.5). Conclusion: With the accumulation of more experiments, the tool will provide invaluable support to compare neighbours of genes across experiments and conditions, helping in highlighting changes in functional domains and identifying new co-organised genomic compartments.
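The move from a contact matrix to a graph that the abstract describes can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use Neo4j or any NeoHiC code, and the function name, node labels, threshold and toy matrix are hypothetical. Each matrix cell at or above a minimum contact count becomes a weighted, undirected edge, and node degree serves as one simple statistical indicator of the resulting network.

```python
def contacts_to_graph(labels, matrix, min_count):
    """Build an undirected weighted graph from a symmetric contact matrix.

    labels:    node names, one per matrix row (genes or genomic bins)
    matrix:    square list of lists with contact counts
    min_count: contacts below this value are not turned into edges
    Returns a dict mapping each node to {neighbour: contact count}.
    """
    graph = {label: {} for label in labels}
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):  # upper triangle, no self-loops
            weight = matrix[i][j]
            if weight >= min_count:
                graph[labels[i]][labels[j]] = weight
                graph[labels[j]][labels[i]] = weight
    return graph


# Toy symmetric matrix for three hypothetical nodes.
labels = ["geneA", "geneB", "geneC"]
matrix = [
    [0, 5, 1],
    [5, 0, 3],
    [1, 3, 0],
]
graph = contacts_to_graph(labels, matrix, min_count=3)
degree = {node: len(neighbours) for node, neighbours in graph.items()}
print(graph)
print(degree)
```

Thresholding is what keeps the edge count manageable, which is the scaling concern the abstract raises; comparing degree distributions between two such graphs is one simple way to contrast experiments.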

