Chromatin 3D structure reconstruction with consideration of adjacency relationship among genomic loci
Abstract
Chromatin 3D conformation plays an important role in regulating gene and protein function. High-throughput chromosome conformation capture (3C)-based technologies, such as Hi-C, have been used to measure contact frequencies among genomic loci at genome scale. Various computational tools have been proposed to recover the underlying chromatin 3D structure from in situ Hi-C contact map data. As connected residues in a polymer, neighboring genomic loci have intrinsic mutual dependencies in building a 3D conformation. However, current methods seldom take this feature into account. We present a method called ShNeigh, which combines the classical multidimensional scaling (MDS) technique with the local dependence of neighboring loci, modelled by a Gaussian formula, to infer the best 3D structure from noisy and incomplete contact frequency matrices. Results on simulated and real Hi-C data show that ShNeigh retains the speed of classical MDS while being more accurate and robust than existing methods, especially for sparse contact maps. A Matlab implementation of the proposed method is available at https://github.com/fangzhen-li/ShNeigh.

Author summary
We propose a new method to infer a consensus 3D genome structure from a Hi-C contact map. The novelty of our method is that it takes into account the adjacency of genomic loci along chromosomes. Specifically, the proposed method penalizes the optimization problem of classical multidimensional scaling with a smoothness constraint weighted by a function of the genomic distance between pairs of genomic loci. We demonstrate that this optimization problem can still be solved efficiently by a classical multidimensional scaling method. We then show that the method can recover stable structures in high-noise settings. We also show that it can reconstruct similar structures from data obtained using different restriction enzymes.
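
To make the two ingredients named above concrete, the following Python sketch shows classical MDS applied to a distance matrix, together with a Gaussian weight on genomic separation and the smoothness penalty such a weight would induce. It is an illustration under stated assumptions, not the ShNeigh objective or its Matlab implementation: the function names, the bandwidth `sigma`, the exact form of the penalty, and the toy helix are all hypothetical choices made for this example.

```python
import numpy as np


def classical_mds(dist, dim=3):
    """Classical (Torgerson) MDS: embed points from a full distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip tiny negative eigenvalues caused by noise or round-off.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def gaussian_neighbor_weights(n, sigma=2.0):
    """Assumed weight: Gaussian in the genomic separation |i - j| (in bins)."""
    idx = np.arange(n)
    sep = idx[:, None] - idx[None, :]
    weights = np.exp(-(sep ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    return weights


def smoothness_penalty(coords, weights):
    """Weighted sum of squared 3D distances, each pair counted once."""
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return 0.5 * (weights * sq_dist).sum()


# Toy example: a helix stands in for a chromosome sampled at 40 loci.
t = np.linspace(0.0, 4.0 * np.pi, 40)
truth = np.column_stack([np.cos(t), np.sin(t), 0.2 * t])
dist = np.linalg.norm(truth[:, None, :] - truth[None, :, :], axis=-1)

coords = classical_mds(dist)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

weights = gaussian_neighbor_weights(len(t))
penalty_ordered = smoothness_penalty(coords, weights)

# Shuffling the loci breaks adjacency, so the penalty should increase.
rng = np.random.default_rng(0)
penalty_shuffled = smoothness_penalty(coords[rng.permutation(len(t))], weights)
```

With noise-free distances, classical MDS reproduces the pairwise distances up to rotation and reflection. The penalty is small when loci that are close along the chromosome are also close in 3D, which is the kind of adjacency information the summary above describes adding to the MDS problem.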