GAMtools: an automated pipeline for analysis of Genome Architecture Mapping data

2017 ◽  
Author(s):  
Robert A. Beagrie ◽  
Markus Schueler

Abstract. Genome Architecture Mapping (GAM) is a recently developed method for mapping chromatin interactions genome-wide. GAM is based on sequencing genomic DNA extracted from thin cryosections of cell nuclei. Because GAM is a new approach, its datasets require specialized analytical tools. Here we present GAMtools, a pipeline for analysing GAM datasets. GAMtools covers the automated mapping of raw next-generation sequencing data generated by GAM, detection of genomic regions present in each nuclear slice, calculation of quality control metrics, generation of inferred proximity matrices, plotting of heatmaps and detection of genomic features for which chromatin interactions are enriched or depleted.
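As a rough illustration of how a proximity matrix can be derived from per-slice detection calls, the sketch below computes a raw co-segregation frequency: the fraction of nuclear slices in which two genomic windows are detected together. This is a minimal sketch under stated assumptions, not GAMtools' own code or API. The function name `cosegregation_matrix`, the input layout (one row per window, one column per slice) and the toy data are all hypothetical, and the pipeline itself may use a normalized measure rather than this raw frequency.

```python
def cosegregation_matrix(segregation):
    """Return the fraction of slices in which each pair of windows co-occurs.

    `segregation` is a list of rows, one per genomic window; each row holds
    one truthy/falsy value per nuclear slice (hypothetical layout).
    """
    n_windows = len(segregation)
    n_slices = len(segregation[0])
    matrix = [[0.0] * n_windows for _ in range(n_windows)]
    for i in range(n_windows):
        for j in range(n_windows):
            # Count slices where both window i and window j were detected.
            shared = sum(
                1 for a, b in zip(segregation[i], segregation[j]) if a and b
            )
            matrix[i][j] = shared / n_slices
    return matrix


# Toy example (made-up data): 3 windows across 4 slices.
toy_table = [
    [1, 1, 0, 0],  # window A
    [1, 1, 0, 1],  # window B, often found with A
    [0, 0, 1, 0],  # window C, never found with A or B
]
matrix = cosegregation_matrix(toy_table)
```

In this toy table, windows A and B share two of four slices, so their entry is higher than the A/C entry, which is zero. Windows that are close in nuclear space are expected to be captured in the same thin slice more often than distant ones.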

2018 ◽  
Vol 2018 ◽  
pp. 1-22 ◽  
Author(s):  
Cheng-Wei Li ◽  
Yu-Kai Chiu ◽  
Bor-Sen Chen

The prevalence of hepatocellular carcinoma (HCC) remains high worldwide because liver diseases can progress to HCC. Recent reports indicate that nonalcoholic fatty liver disease and nonalcoholic steatohepatitis (NAFLD&NASH) and primary biliary cirrhosis and primary sclerosing cholangitis (PBC&PSC) are significant risk factors for HCC. Therefore, understanding the cellular mechanisms of pathogenesis and hepatocarcinogenesis from normal liver cells to HCC through NAFLD&NASH or PBC&PSC is a priority for preventing the progression of liver damage and reducing the risk of further complications. Using genetic and epigenetic data mining and system identification on next-generation sequencing data and the corresponding DNA methylation profiles of liver cells from normal, NAFLD&NASH, PBC&PSC, and HCC patients, we identified the genome-wide real genetic and epigenetic networks (GENs) of each group. To gain insight into these genome-wide GENs, we then applied a principal network projection method to extract the corresponding core GENs for normal liver cells, NAFLD&NASH, PBC&PSC, and HCC. By comparing the signal transduction pathways involved in the identified core GENs, we found that hepatocarcinogenesis through NAFLD&NASH was induced through DNA methylation of HIST2H2BE, HSPB1, RPL30, and ALDOB and the regulation of miR-21 and miR-122, and that hepatocarcinogenesis through PBC&PSC was induced through DNA methylation of RPL23A, HIST2H2BE, TIMP1, IGF2, RPL30, and ALDOB and the regulation of miR-29a, miR-21, and miR-122. These genetic and epigenetic changes in pathogenesis and hepatocarcinogenesis may serve as diagnostic biomarkers and/or therapeutic targets.


2019 ◽  
Vol 36 (6) ◽  
pp. 1704-1711
Author(s):  
Artur Jaroszewicz ◽  
Jason Ernst

Abstract. Motivation: Chromatin interactions play an important role in genome architecture and gene regulation. The Hi-C assay generates such interaction maps genome-wide, but at relatively low resolutions (e.g. 5-25 kb), which is substantially coarser than the resolution of transcription factor binding sites or open chromatin sites that are potential sources of such interactions. Results: To predict the sources of Hi-C-identified interactions at a high resolution (e.g. 100 bp), we developed a computational method that integrates data from DNase-seq and ChIP-seq of TFs and histone marks. Our method, χ-CNN, uses these data to first train a convolutional neural network (CNN) to discriminate between called Hi-C interactions and non-interactions. χ-CNN then predicts the high-resolution source of each Hi-C interaction using a feature attribution method. We show that these predictions recover the original Hi-C peaks after being extended to a coarser resolution. We also show that χ-CNN predictions are enriched for evolutionarily conserved bases, eQTLs and CTCF motifs, supporting their biological significance. χ-CNN provides an approach for analyzing important aspects of genome architecture and gene regulation at a higher resolution than previously possible. Availability and implementation: χ-CNN software is available on GitHub (https://github.com/ernstlab/X-CNN). Supplementary information: Supplementary data are available at Bioinformatics online.


2012 ◽  
Vol 367 (1587) ◽  
pp. 354-363 ◽  
Author(s):  
S. Renaut ◽  
N. Maillet ◽  
E. Normandeau ◽  
C. Sauvage ◽  
N. Derome ◽  
...  

The nature, size and distribution of the genomic regions underlying divergence and promoting reproductive isolation remain largely unknown. Here, we summarize ongoing efforts using young (12 000 yr BP) species pairs of lake whitefish (Coregonus clupeaformis) to expand our understanding of the initial genomic patterns of divergence observed during speciation. Our results confirmed the predictions that: (i) on average, phenotypic quantitative trait loci (pQTL) show higher F_ST values and are more likely to be outliers (and therefore candidates for being targets of divergent selection) than non-pQTL markers; (ii) large islands of divergence rather than small independent regions under selection characterize the early stages of adaptive divergence of lake whitefish; and (iii) there is a general trend towards an increase in terms of numbers and size of genomic regions of divergence from the least (East L.) to the most differentiated species pair (Cliff L.). This is consistent with previous estimates of reproductive isolation between these species pairs being driven by the same selective forces responsible for environment specialization. Altogether, dwarf and normal whitefish species pairs represent a continuum of both morphological and genomic differentiation contributing to ecological speciation. Admittedly, much progress is still required to more finely map and circumscribe genomic islands of speciation. This will be achieved through the use of next generation sequencing data but also through a better quantification of phenotypic traits moulded by selection as organisms adapt to new environmental conditions.
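For readers unfamiliar with the fixation index (F_ST) referred to above, the following sketch shows one textbook way to compute it for a single biallelic marker in two populations, as the relative reduction in expected heterozygosity within populations compared with the pooled total. The abstract does not state which estimator the authors used, so this heterozygosity-based form, the equal weighting of the two populations, the function name and the allele frequencies are all assumptions made for illustration.

```python
def fst_two_populations(p1, p2):
    """Heterozygosity-based F_ST for one biallelic marker (illustrative only).

    p1, p2: frequency of the same allele in population 1 and population 2.
    Assumes equal weighting of the two populations.
    """
    # Mean expected heterozygosity within populations (H_S).
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population (H_T).
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Marker is fixed for the same allele everywhere: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


# Made-up allele frequencies for two hypothetical markers.
fst_divergent = fst_two_populations(0.9, 0.1)  # strongly differentiated
fst_shared = fst_two_populations(0.5, 0.5)     # identical frequencies
```

With these made-up frequencies, the strongly differentiated marker yields an F_ST of about 0.64, while the marker with identical frequencies yields 0. An outlier scan of the kind described would flag markers whose F_ST sits far above the genome-wide background.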


2017 ◽  
Author(s):  
Jeremiah Wala ◽  
Pratiti Bandopadhayay ◽  
Noah Greenwald ◽  
Ryan O’Rourke ◽  
Ted Sharpe ◽  
...  

Abstract. Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA’s performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs, and substantially improved detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (< 1,000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types, and found that templated-sequence insertions occur in ~4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized SVs.

