3D reconstruction of genomic regions from sparse interaction data

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Julen Mendieta-Esteban ◽  
Marco Di Stefano ◽  
David Castillo ◽  
Irene Farabella ◽  
Marc A Marti-Renom

Abstract Chromosome conformation capture (3C) technologies measure the interaction frequency between pairs of chromatin regions within the nucleus in a cell or a population of cells. Some of these 3C technologies retrieve interactions involving non-contiguous sets of loci, resulting in sparse interaction matrices. One such 3C technology is Promoter Capture Hi-C (pcHi-C), which is tailored to probe only interactions involving gene promoters. As such, pcHi-C provides sparse interaction matrices that are suitable for characterizing short- and long-range enhancer–promoter interactions. Here, we introduce a new method to reconstruct the chromatin structural (3D) organization from sparse 3C-based datasets such as pcHi-C. Our method allows for data normalization, detection of significant interactions and reconstruction of the full 3D organization of the genomic region despite the data sparseness. Specifically, it builds, with as little as 2–3% of the data from the matrix, reliable 3D models of similar accuracy to those based on dense interaction matrices. Furthermore, the method is sensitive enough to detect cell-type-specific 3D organizational features such as the formation of different networks of active gene communities.

2020 ◽  
Author(s):  
Julen Mendieta-Esteban ◽  
Marco Di Stefano ◽  
David Castillo ◽  
Irene Farabella ◽  
Marc A Marti-Renom

Abstract Chromosome Conformation Capture (3C) technologies measure the interaction frequency between pairs of chromatin regions within the nucleus in a cell or a population of cells. Some of these 3C technologies retrieve interactions involving non-contiguous sets of loci, resulting in sparse interaction matrices. One such 3C technology is Promoter Capture Hi-C (pcHi-C), which is tailored to probe only interactions involving gene promoters. As such, pcHi-C provides sparse interaction matrices that are suitable for characterising short- and long-range enhancer-promoter interactions. Here, we introduce a new method to reconstruct the chromatin structural (3D) organisation from sparse 3C-based datasets such as pcHi-C. Our method allows for data normalisation, detection of significant interactions, and reconstruction of the full 3D organisation of the genomic region despite the data sparseness. Specifically, it produces reliable reconstructions, in line with the ones obtained from dense interaction matrices, with as little as 2-3% of the data from the matrix. Furthermore, the method is sensitive enough to detect cell-type-specific 3D organisational features such as the formation of different networks of active gene communities.


2019 ◽  
Author(s):  
Paula Soler-Vila ◽  
Pol Cuscó Pons ◽  
Irene Farabella ◽  
Marco Di Stefano ◽  
Marc A. Marti-Renom

ABSTRACT The rapid development of chromosome conformation capture (3C-based) techniques as well as super-resolution imaging together with bioinformatics analyses has been fundamental for unveiling that chromosomes are organized into the so-called topologically associating domains or TADs. While these TADs appear as nested patterns in the 3C-based interaction matrices, the vast majority of available computational methods are based on the hypothesis that TADs are individual and unrelated chromatin structures. Here we introduce TADpole, a computational tool designed to identify and analyze the entire hierarchy of TADs in intra-chromosomal interaction matrices. TADpole combines principal component analysis and constrained hierarchical clustering to provide an unsupervised set of significant partitions in a genomic region of interest. TADpole's identification of domains is robust to the data resolution, normalization strategy, and sequencing depth. TADpole domain borders are enriched in CTCF and cohesin binding proteins, while the domains are enriched in either H3K36me3 or H3K27me3 histone marks. We show TADpole's usefulness by applying it to capture Hi-C experiments in wild-type and mutant mouse strains to pinpoint statistically significant differences in their topological structure.


2020 ◽  
Vol 48 (7) ◽  
pp. e39-e39
Author(s):  
Paula Soler-Vila ◽  
Pol Cuscó ◽  
Irene Farabella ◽  
Marco Di Stefano ◽  
Marc A Marti-Renom

Abstract The rapid development of Chromosome Conformation Capture (3C)-based techniques, as well as imaging together with bioinformatics analyses, has been fundamental for unveiling that chromosomes are organized into the so-called topologically associating domains or TADs. While TADs appear as nested patterns in the 3C-based interaction matrices, the vast majority of available TAD callers are based on the hypothesis that TADs are individual and unrelated chromatin structures. Here we introduce TADpole, a computational tool designed to identify and analyze the entire hierarchy of TADs in intra-chromosomal interaction matrices. TADpole combines principal component analysis and constrained hierarchical clustering to provide a set of significant hierarchical chromatin levels in a genomic region of interest. TADpole is robust to data resolution, normalization strategy and sequencing depth. Domain borders defined by TADpole are enriched in main architectural proteins (CTCF and cohesin complex subunits) and in the histone mark H3K4me3, while their domain bodies, depending on their activation state, are enriched in either H3K36me3 or H3K27me3, highlighting that TADpole is able to distinguish functional TAD units. Additionally, we demonstrate that TADpole's hierarchical annotation, together with the new DiffT score, allows for detecting significant topological differences on Capture Hi-C maps between wild-type and genetically engineered mice.
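The combination of principal component analysis with constrained hierarchical clustering described in this abstract can be illustrated with a small sketch. This is not TADpole's implementation: the function names `pca_reduce` and `contiguous_clustering`, the Ward-style merge cost, and the fixed number of components are all assumptions chosen for illustration. The only constraint modelled is that clusters may merge only with their genomic neighbours, so every cluster stays a contiguous run of bins.

```python
import numpy as np

def pca_reduce(matrix, n_components=2):
    """Project the rows of an interaction matrix onto their top principal components."""
    matrix = np.asarray(matrix, dtype=float)
    centered = matrix - matrix.mean(axis=0)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def contiguous_clustering(features, n_domains):
    """Agglomerative clustering in which only adjacent clusters may merge.

    Returns a list of (first_bin, last_bin) tuples, one per domain.
    """
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > n_domains:
        best, best_cost = None, np.inf
        for k in range(len(clusters) - 1):
            a = features[clusters[k]]
            b = features[clusters[k + 1]]
            # Ward-style cost: increase in within-cluster variance after merging.
            cost = len(a) * len(b) / (len(a) + len(b)) * np.sum(
                (a.mean(axis=0) - b.mean(axis=0)) ** 2
            )
            if cost < best_cost:
                best, best_cost = k, cost
        # Merge the cheapest adjacent pair into one contiguous cluster.
        clusters[best:best + 2] = [clusters[best] + clusters[best + 1]]
    return [(c[0], c[-1]) for c in clusters]
```

Because the merges are greedy and deterministic, calling `contiguous_clustering` with decreasing values of `n_domains` yields nested partitions, which is one simple way to read off a hierarchy of domain levels.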


2015 ◽  
Author(s):  
Mahfuza Sharmin ◽  
Hector Corrada Bravo ◽  
Sridhar S. Hannenhalli

Background. Large mega base-pair genomic regions show robust alterations in DNA methylation levels in multiple cancers, a vast majority of which are hypo-methylated in cancers. These regions are generally bounded by CpG islands, overlap with Lamin Associated Domains and Large organized chromatin lysine modifications, and are associated with stochastic variability in gene expression. Given the size and consistency of hypo-methylated blocks (HMB) across cancer types, their immediate causes are likely to be encoded in the genomic region near HMB boundaries, in terms of specific genomic or epigenomic signatures. However, a detailed characterization of the HMB boundaries has not been reported. Method. Here, we focused on ~13k HMBs, encompassing approximately half the genome, identified in colon cancer. We analyzed a number of distinguishing features at the HMB boundaries including transcription factor (TF) binding motifs, various epigenomic marks, and chromatin structural features. Result. We found that the classical promoter epigenomic mark, H3K4me3, is highly enriched at HMB boundaries, as are CTCF bound sites. HMB boundaries harbor distinct combinations of TF motifs. Our Random Forest model based on TF motifs can accurately distinguish boundaries not only from regions inside and outside HMBs, but surprisingly, from active promoters as well. Interestingly, the distinguishing TFs and their interacting proteins are involved in chromatin modification. Finally, HMB boundaries significantly coincide with the boundaries of Topologically Associating Domains of the chromatin. Conclusion. Our analyses suggest that the overall architecture of HMBs is guided by pre-existing chromatin architecture, and is associated with aberrant activity of promoter-like sequences at the boundary.


Cells ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 266
Author(s):  
Shin-ichiro Takebayashi ◽  
Tyrone Ryba ◽  
Kelsey Wimbish ◽  
Takuya Hayakawa ◽  
Morito Sakaue ◽  
...  

Multiple epigenetic pathways underlie the temporal order of DNA replication (replication timing) in the contexts of development and disease. DNA methylation by DNA methyltransferases (Dnmts) and downstream chromatin reorganization and transcriptional changes are thought to impact DNA replication, yet this remains to be comprehensively tested. Using cell-based and genome-wide approaches to measure replication timing, we identified a number of genomic regions undergoing subtle but reproducible replication timing changes in various Dnmt-mutant mouse embryonic stem (ES) cell lines that included a cell line with a drug-inducible Dnmt3a2 expression system. Replication timing within pericentromeric heterochromatin (PH) was shown to be correlated with redistribution of H3K27me3 induced by DNA hypomethylation: later-replicating PH coincided with H3K27me3-enriched regions. In contrast, this relationship with H3K27me3 was not evident within chromosomal arm regions undergoing either early-to-late (EtoL) or late-to-early (LtoE) switching of replication timing upon loss of the Dnmts. Interestingly, Dnmt-sensitive transcriptional up- and downregulation frequently coincided with earlier and later shifts in replication timing of the chromosomal arm regions, respectively. Our study revealed the previously unrecognized complex and diverse effects of Dnmt loss on the mammalian DNA replication landscape.


Mathematics ◽  
2020 ◽  
Vol 8 (6) ◽  
pp. 1014
Author(s):  
Polly-Anne Jeffrey ◽  
Martín López-García ◽  
Mario Castro ◽  
Grant Lythe ◽  
Carmen Molina-París

Cellular receptors on the cell membrane can bind ligand molecules in the extra-cellular medium to form ligand-bound monomers. These interactions ultimately determine the fate of a cell through the resulting intra-cellular signalling cascades. Often, several receptor types can bind a shared ligand leading to the formation of different monomeric complexes, and in turn to competition for the common ligand. Here, we describe competition between two receptors which bind a common ligand in terms of a bi-variate stochastic process. The stochastic description is important to account for fluctuations in the number of molecules. Our interest is in computing two summary statistics—the steady-state distribution of the number of bound monomers and the time to reach a threshold number of monomers of a given kind. The matrix-analytic approach developed in this manuscript is exact, but becomes impractical as the number of molecules in the system increases. Thus, we present novel approximations which can work under low-to-moderate competition scenarios. Our results apply to systems with a larger number of population species (i.e., receptors) competing for a common resource (i.e., ligands), and to competition systems outside the area of molecular dynamics, such as Mathematical Ecology.
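To make the idea of an exact steady-state computation concrete, here is a minimal sketch for a very small system. It is not the matrix-analytic method of the paper: the mass-action rates, the parameter names (`R1`, `R2`, `L`, `a1`, `d1`, `a2`, `d2`) and the function `steady_state` are assumptions made for illustration. The state is the pair (n1, n2) of bound monomers of each receptor type, and the stationary distribution is obtained by solving πQ = 0 together with the normalization Σπ = 1.

```python
import numpy as np

def steady_state(R1, R2, L, a1, d1, a2, d2):
    """Stationary distribution of a two-receptor, shared-ligand binding model.

    Assumed rates (mass action): receptor type k binds at ak * free_receptors * free_ligands
    and a bound monomer of type k dissociates at dk per monomer.
    """
    states = [(n1, n2) for n1 in range(R1 + 1) for n2 in range(R2 + 1) if n1 + n2 <= L]
    index = {s: i for i, s in enumerate(states)}
    Q = np.zeros((len(states), len(states)))
    for (n1, n2), i in index.items():
        free = L - n1 - n2  # unbound ligand molecules
        moves = {
            (n1 + 1, n2): a1 * (R1 - n1) * free,
            (n1 - 1, n2): d1 * n1,
            (n1, n2 + 1): a2 * (R2 - n2) * free,
            (n1, n2 - 1): d2 * n2,
        }
        for target, rate in moves.items():
            if rate > 0 and target in index:
                Q[i, index[target]] = rate
        Q[i, i] = -Q[i].sum()  # rows of a generator matrix sum to zero
    # Solve pi Q = 0 with sum(pi) = 1 as one least-squares system.
    A = np.vstack([Q.T, np.ones(len(states))])
    b = np.zeros(len(states) + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return states, pi, Q
```

The number of states grows roughly with the product of the receptor counts, which gives a feel for why exact approaches of this kind become impractical as the number of molecules increases.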


2017 ◽  
Author(s):  
Tuan Anh Trieu

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Different cell types of an organism have the same DNA sequence, but they can function differently because differences in their 3D genome organization allow them to express different genes and carry out different cellular functions. Understanding the 3D organization of the genome is key to understanding the functions of the cell. Chromosome conformation capture techniques such as Hi-C and TCC, which capture interactions between proximal chromosome fragments, have allowed the study of 3D genome organization at high resolution and high throughput. My work focuses on developing computational methods to reconstruct 3D genome structures from Hi-C data. I presented three methods to reconstruct 3D genome and chromosome structures. The first method builds 3D genome models from soft constraints of contacts and non-contacts. It uses the concept of contact and non-contact to reconstruct 3D models without translating interaction frequencies into physical distances. This translation is commonly used by other methods even though it makes a strong assumption about the relationship between interaction frequencies and physical distances. On a synthetic dataset, where the relationship was known, my method performed comparably with other methods that assume the relationship. This shows the potential of my method for real Hi-C datasets, where the relationship is unknown. The limitation of the method is that it has parameters requiring manual adjustment. I developed the second method to reconstruct 3D genome models. This method uses a commonly used function to translate interaction frequencies into physical distances in order to build 3D models. I proposed a novel way to derive soft constraints to handle inconsistency in the data and to make the method robust. Building 3D models at high resolution is a more challenging problem, as the number of constraints is small and the feasible space is larger. I introduced a third method to build 3D chromosome models at high resolution. The method reconstructs models at low resolution and then uses them to guide the reconstruction of models at high resolution. The last part of my work is the development of a comprehensive tool with an intuitive graphical user interface to analyze Hi-C data and to reconstruct and analyze 3D models.
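The translation from interaction frequencies to physical distances mentioned in this abstract is often modelled as an inverse power law, d = 1 / IF^α. The sketch below applies that conversion and then embeds the distances in 3D with classical multidimensional scaling. Both choices are assumptions for illustration, not the thesis's methods: the exponent `alpha`, the function names, and the use of classical MDS (which needs a dense matrix with positive off-diagonal frequencies) are swapped in to keep the example short.

```python
import numpy as np

def frequencies_to_distances(freq, alpha=1.0):
    """Convert interaction frequencies to distances with d = IF ** (-alpha)."""
    freq = np.asarray(freq, dtype=float)
    off_diagonal = ~np.eye(len(freq), dtype=bool)
    if np.any(freq[off_diagonal] <= 0):
        raise ValueError("this sketch needs positive off-diagonal frequencies")
    dist = np.zeros_like(freq)  # a bin is at distance zero from itself
    dist[off_diagonal] = freq[off_diagonal] ** (-alpha)
    return dist

def classical_mds_3d(dist):
    """Embed a full distance matrix in 3D with classical (Torgerson) MDS."""
    dist = np.asarray(dist, dtype=float)
    n = len(dist)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (dist ** 2) @ J           # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:3]         # three largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
```

With real Hi-C data the distances are noisy and mutually inconsistent, so an exact embedding does not exist. That is the situation in which soft constraints, as described in the abstract, become attractive.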


2012 ◽  
Vol 78 (7) ◽  
pp. 2435-2442 ◽  
Author(s):  
Marie Foulongne-Oriol ◽  
Anne Rodier ◽  
Jean-Michel Savoie

ABSTRACT Dry bubble, caused by Lecanicillium fungicola, is one of the most detrimental diseases affecting button mushroom cultivation. In a previous study, we demonstrated that breeding for resistance to this pathogen is quite challenging due to its quantitative inheritance. A second-generation hybrid progeny derived from an intervarietal cross between a wild strain and a commercial cultivar was characterized for L. fungicola resistance under artificial inoculation in three independent experiments. Analysis of quantitative trait loci (QTL) was used to determine the locations, numbers, and effects of genomic regions associated with dry-bubble resistance. Four traits related to resistance were analyzed. Two to four QTL were detected per trait, depending on the experiment. Two genomic regions, on linkage group X (LGX) and LGVIII, were consistently detected in the three experiments. The genomic region on LGX was detected for three of the four variables studied. The total phenotypic variance accounted for by all QTL ranged from 19.3% to 42.1% over all traits in all experiments. For most of the QTL, the favorable allele for resistance came from the wild parent, but for some QTL, the allele that contributed to a higher level of resistance was carried by the cultivar. Comparative mapping with QTL for yield-related traits revealed five colocations between resistance and yield component loci, suggesting that the resistance results from both genetic factors and fitness expression. The consequences for mushroom breeding programs are discussed.


2017 ◽  
Vol 24 (s1) ◽  
pp. 174-181 ◽  
Author(s):  
Zygmunt Paszotta ◽  
Malgorzata Szumilo ◽  
Jakub Szulwic

Abstract This paper points out the possibility of using Internet photogrammetry to construct 3D models from images obtained by means of UAVs (Unmanned Aerial Vehicles). The solutions may be useful for the inspection of ports as to the content of cargo, transport safety or the assessment of the technical infrastructure of ports and quays. The solution can complement measurements made using laser scanning and traditional surveying methods. In this paper the authors recommend a solution useful for creating 3D models from images acquired by the UAV using non-metric images from digital cameras. The developed algorithms and the presented software allow 3D models to be generated through the Internet in two modes: anaglyph and display in shutter systems. The problem of 3D image generation in photogrammetry is solved by using epipolar images. The appropriate method was presented by Kreiling in 1976. However, it applies to photogrammetric images for which the internal orientation is known. In the case of digital images obtained with non-metric cameras it is necessary to use another solution based on the fundamental matrix concept, introduced by Luong in 1992. To determine the matrix that defines the relationship between the left and right digital images, at least eight homologous points are required. The solution is obtained using the SVD (singular value decomposition). The fundamental matrix is then used to determine the epipolar lines, which makes it possible to orient the images forming stereo pairs correctly. The appropriate mathematical bases and illustrations are included in the publication.
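The estimation of a fundamental matrix from at least eight homologous points via SVD, as outlined in this abstract, corresponds to the well-known eight-point algorithm. The sketch below is a generic normalized version of that algorithm and is not taken from the paper. The function name `fundamental_matrix_8pt`, the input layout (two N x 2 arrays of pixel coordinates) and the coordinate normalization step are assumptions for illustration.

```python
import numpy as np

def fundamental_matrix_8pt(left, right):
    """Estimate F with x_right^T F x_left = 0 from >= 8 homologous points."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if left.shape != right.shape or left.shape[0] < 8:
        raise ValueError("need at least eight homologous point pairs")

    def normalize(pts):
        # Shift to the centroid and scale so the mean distance is sqrt(2).
        centroid = pts.mean(axis=0)
        scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - centroid, axis=1))
        T = np.array([[scale, 0.0, -scale * centroid[0]],
                      [0.0, scale, -scale * centroid[1]],
                      [0.0, 0.0, 1.0]])
        homog = np.column_stack([pts, np.ones(len(pts))])
        return (T @ homog.T).T, T

    xl, Tl = normalize(left)
    xr, Tr = normalize(right)
    # Each pair contributes one linear equation in the nine entries of F.
    A = np.column_stack([xr[:, [0]] * xl, xr[:, [1]] * xl, xl])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # null vector of A, row-major
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    F = Tr.T @ F @ Tl                 # undo the normalization
    return F / np.linalg.norm(F)
```

Given `F`, the epipolar line in the right image for a left-image point `(x, y)` is `F @ [x, y, 1]`. These are the lines used to orient the images of a stereo pair.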


2021 ◽  
Vol 30 (1) ◽  
pp. 95-103
Author(s):  
Mohammad Shamimul Alam ◽  
Israt Jahan ◽  
Sadniman Rahman ◽  
Hawa Jahan ◽  
Kaniz Fatema

Tilapia is a hardy fish which can survive in water bodies polluted with heavy metals. Metal resistance is conferred by higher expression of the metallothionein gene (mt) in many organisms. The level, timing and tissue specificity of gene expression are regulated through transcription factor binding sites (TFBS), which may be present upstream, downstream, or even in the introns of a gene. So, as a candidate regulatory region, the 5′ upstream sequence of the mt gene in three tilapia species, Oreochromis aureus, O. niloticus and O. mossambicus, was studied. The targeted region was PCR-amplified and then sequenced using a pair of custom-designed primers. A total of only 2.7% variation was found in the sequenced genomic region among the three species. Metal-related TFBS were predicted from these sequences. A total of twenty-eight TFBS were found in O. aureus and twenty-nine in O. mossambicus and O. niloticus. The number of metal-related TFBS predicted in the targeted sequence was significantly higher than that found in other randomly selected genomic regions of the same size from the O. niloticus genome. Thus, the results suggest the presence of putative regulatory elements in the targeted upstream region which might have an important role in the regulation of mt gene function. Dhaka Univ. J. Biol. Sci. 30(1): 95-103, 2021 (January)

