GAMtools: an automated pipeline for analysis of Genome Architecture Mapping data

Abstract: Genome Architecture Mapping (GAM) is a recently developed method for mapping chromatin interactions genome-wide. GAM is based on sequencing genomic DNA extracted from thin cryosections of cell nuclei. Because GAM is a new approach, its datasets require specialized analytical tools. Here we present GAMtools, a pipeline for analysing GAM datasets. GAMtools covers the automated mapping of raw next-generation sequencing data generated by GAM, detection of genomic regions present in each nuclear slice, calculation of quality control metrics, generation of inferred proximity matrices, plotting of heatmaps and detection of genomic features for which chromatin interactions are enriched or depleted.
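
The step from "regions present in each nuclear slice" to a proximity matrix can be illustrated with a small sketch. It assumes a binary segregation table (one 0/1 flag per window per slice) and uses plain co-segregation frequency as the proximity measure. The function name, the data layout and the toy window labels are hypothetical and are not GAMtools' actual interface; the abstract does not say which proximity measure GAMtools computes.

```python
from itertools import combinations


def cosegregation_matrix(segregation):
    """Return pairwise co-segregation frequencies.

    segregation: dict mapping a window label to a list of 0/1 flags,
    one flag per nuclear slice (1 = window detected in that slice).
    This layout is an assumption made for illustration only.
    """
    windows = list(segregation)
    n_slices = len(next(iter(segregation.values())))
    if any(len(flags) != n_slices for flags in segregation.values()):
        raise ValueError("every window needs one flag per slice")

    matrix = {w: {} for w in windows}
    for w in windows:
        # Diagonal: how often the window itself is detected.
        matrix[w][w] = sum(segregation[w]) / n_slices
    for a, b in combinations(windows, 2):
        # Count slices in which both windows are detected together.
        both = sum(x and y for x, y in zip(segregation[a], segregation[b]))
        matrix[a][b] = matrix[b][a] = both / n_slices
    return matrix


# Hypothetical toy data: three windows observed across four slices.
toy = {
    "chr1:0-50kb": [1, 1, 0, 0],
    "chr1:50-100kb": [1, 1, 0, 1],
    "chr1:100-150kb": [0, 0, 1, 0],
}
coseg = cosegregation_matrix(toy)
```

Windows that are frequently captured in the same thin slice tend to be close in the nucleus, which is why a higher co-segregation value is read here as closer proximity.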

A genomic approach to selecting robust and versatile SNP sets from next-generation sequencing data for genome-wide association study in citrus cultivars

Acta Horticulturae ◽

10.17660/actahortic.2016.1135.4 ◽

2016 ◽

pp. 23-32 ◽

Cited By ~ 7

Author(s):

T. Shimizu ◽

E. Kaminuma ◽

K. Nonaka ◽

T. Yoshioka ◽

S. Goto ◽

...

Keyword(s):

Next Generation Sequencing ◽

Association Study ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Genome Wide ◽

Genomic Approach ◽

Generation Sequencing

Investigating Pathogenic and Hepatocarcinogenic Mechanisms from Normal Liver to HCC by Constructing Genetic and Epigenetic Networks via Big Genetic and Epigenetic Data Mining and Genome-Wide NGS Data Identification

Disease Markers ◽

10.1155/2018/8635329 ◽

2018 ◽

Vol 2018 ◽

pp. 1-22 ◽

Cited By ~ 5

Author(s):

Cheng-Wei Li ◽

Yu-Kai Chiu ◽

Bor-Sen Chen

Keyword(s):

Data Mining ◽

Dna Methylation ◽

Normal Liver ◽

Liver Cells ◽

Next Generation Sequencing Data ◽

Valuable Insight ◽

Biliary Cirrhosis ◽

Sequencing Data ◽

Cellular Mechanisms ◽

Genome Wide

The prevalence of hepatocellular carcinoma (HCC) is still high worldwide because liver diseases can develop into HCC. Recent reports indicate that nonalcoholic fatty liver disease and nonalcoholic steatohepatitis (NAFLD&NASH) and primary biliary cirrhosis and primary sclerosing cholangitis (PBC&PSC) are significant risk factors for HCC. Therefore, understanding the cellular mechanisms of pathogenesis and hepatocarcinogenesis from normal liver cells to HCC through NAFLD&NASH or PBC&PSC is a priority for preventing the progression of liver damage and reducing the risk of further complications. Using genetic and epigenetic data mining and system identification on next-generation sequencing data and the corresponding DNA methylation profiles of liver cells from normal, NAFLD&NASH, PBC&PSC, and HCC patients, we identified the genome-wide real genetic and epigenetic networks (GENs) of each group. To gain insight into these genome-wide GENs, we then applied a principal network projection method to extract the corresponding core GENs for normal liver cells, NAFLD&NASH, PBC&PSC, and HCC. By comparing the signal transduction pathways involved in the identified core GENs, we found that hepatocarcinogenesis through NAFLD&NASH was induced through DNA methylation of HIST2H2BE, HSPB1, RPL30, and ALDOB and the regulation of miR-21 and miR-122, and that hepatocarcinogenesis through PBC&PSC was induced through DNA methylation of RPL23A, HIST2H2BE, TIMP1, IGF2, RPL30, and ALDOB and the regulation of miR-29a, miR-21, and miR-122. These genetic and epigenetic changes could serve as diagnostic biomarkers and/or therapeutic targets.

An integrative approach for fine-mapping chromatin interactions

Bioinformatics ◽

10.1093/bioinformatics/btz843 ◽

2019 ◽

Vol 36 (6) ◽

pp. 1704-1711

Author(s):

Artur Jaroszewicz ◽

Jason Ernst

Keyword(s):

Gene Regulation ◽

High Resolution ◽

Biological Significance ◽

Computational Method ◽

Supplementary Information ◽

Integrative Approach ◽

Genome Architecture ◽

Open Chromatin ◽

Chromatin Interactions ◽

Genome Wide

Abstract: Motivation. Chromatin interactions play an important role in genome architecture and gene regulation. The Hi-C assay generates such interaction maps genome-wide, but at relatively low resolutions (e.g. 5-25 kb), which are substantially coarser than the resolution of transcription factor binding sites or open chromatin sites that are potential sources of such interactions. Results. To predict the sources of Hi-C-identified interactions at a high resolution (e.g. 100 bp), we developed a computational method that integrates data from DNase-seq and ChIP-seq of TFs and histone marks. Our method, χ-CNN, uses these data to first train a convolutional neural network (CNN) to discriminate between called Hi-C interactions and non-interactions. χ-CNN then predicts the high-resolution source of each Hi-C interaction using a feature attribution method. We show that these predictions recover the original Hi-C peaks after being extended to a coarser resolution. We also show that χ-CNN predictions are enriched for evolutionarily conserved bases, eQTLs and CTCF motifs, supporting their biological significance. χ-CNN provides an approach for analyzing important aspects of genome architecture and gene regulation at a higher resolution than previously possible. Availability and implementation. χ-CNN software is available on GitHub (https://github.com/ernstlab/X-CNN). Supplementary information. Supplementary data are available at Bioinformatics online.

Genome-Wide Identification of Insertion and Deletion Markers in Chinese Commercial Rice Cultivars, Based on Next-Generation Sequencing Data

Agronomy ◽

10.3390/agronomy8040036 ◽

2018 ◽

Vol 8 (4) ◽

pp. 36 ◽

Cited By ~ 3

Author(s):

Kesavan Markkandan ◽

Seung-il Yoo ◽

Young-Chan Cho ◽

Dong Lee

Keyword(s):

Next Generation Sequencing ◽

Next Generation Sequencing Data ◽

Rice Cultivars ◽

Next Generation ◽

Sequencing Data ◽

Insertion And Deletion ◽

Genome Wide ◽

Generation Sequencing

Computational Approaches in Next-Generation Sequencing Data Analysis for Genome-Wide DNA Methylation Studies

Computational Methods for Next Generation Sequencing Data Analysis ◽

10.1002/9781119272182.ch9 ◽

2016 ◽

pp. 197-226

Author(s):

Jeong-Hyeon Choi ◽

Huidong Shi

Keyword(s):

Dna Methylation ◽

Data Analysis ◽

Next Generation Sequencing ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Computational Approaches ◽

Genome Wide ◽

Generation Sequencing ◽

Sequencing Data Analysis

Investigating core genetic-and-epigenetic cell cycle networks for stemness and carcinogenic mechanisms, and cancer drug design using big database mining and genome-wide next-generation sequencing data

Cell Cycle ◽

10.1080/15384101.2016.1198862 ◽

2016 ◽

Vol 15 (19) ◽

pp. 2593-2607 ◽

Cited By ~ 13

Author(s):

Cheng-Wei Li ◽

Bor-Sen Chen

Keyword(s):

Cell Cycle ◽

Next Generation Sequencing ◽

Drug Design ◽

Next Generation Sequencing Data ◽

Cancer Drug ◽

Database Mining ◽

Next Generation ◽

Sequencing Data ◽

Genome Wide ◽

Generation Sequencing

Genome-wide patterns of divergence during speciation: the lake whitefish case study

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2011.0197 ◽

2012 ◽

Vol 367 (1587) ◽

pp. 354-363 ◽

Cited By ~ 81

Author(s):

S. Renaut ◽

N. Maillet ◽

E. Normandeau ◽

C. Sauvage ◽

N. Derome ◽

...

Keyword(s):

Reproductive Isolation ◽

Genomic Islands ◽

Next Generation Sequencing Data ◽

Adaptive Divergence ◽

Phenotypic Traits ◽

Lake Whitefish ◽

Sequencing Data ◽

Species Pairs ◽

Selective Forces ◽

Genomic Regions

The nature, size and distribution of the genomic regions underlying divergence and promoting reproductive isolation remain largely unknown. Here, we summarize ongoing efforts using young (12 000 yr BP) species pairs of lake whitefish (Coregonus clupeaformis) to expand our understanding of the initial genomic patterns of divergence observed during speciation. Our results confirmed the predictions that: (i) on average, phenotypic quantitative trait loci (pQTL) show higher F_ST values and are more likely to be outliers (and therefore candidates for being targets of divergent selection) than non-pQTL markers; (ii) large islands of divergence rather than small independent regions under selection characterize the early stages of adaptive divergence of lake whitefish; and (iii) there is a general trend towards an increase in terms of numbers and size of genomic regions of divergence from the least (East L.) to the most differentiated species pair (Cliff L.). This is consistent with previous estimates of reproductive isolation between these species pairs being driven by the same selective forces responsible for environment specialization. Altogether, dwarf and normal whitefish species pairs represent a continuum of both morphological and genomic differentiation contributing to ecological speciation. Admittedly, much progress is still required to more finely map and circumscribe genomic islands of speciation. This will be achieved through the use of next generation sequencing data but also through a better quantification of phenotypic traits moulded by selection as organisms adapt to new environmental conditions.
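
For readers unfamiliar with the fixation index used above to compare pQTL and non-pQTL markers, the following is a minimal sketch of a textbook heterozygosity-based definition for a single biallelic locus in two equally weighted populations. The abstract does not state which estimator the authors used, so this is an illustration of the concept rather than their method; the function name is hypothetical.

```python
def fst_biallelic(p1, p2):
    """Fixation index for one biallelic locus, two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Uses (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled population and H_S is the mean within-population heterozygosity.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        # Both populations are fixed for the same allele: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t
```

Identical allele frequencies give 0, and populations fixed for alternative alleles give 1; markers with unusually high values relative to the genome-wide distribution are the kind of outliers the abstract refers to.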

GenoSeq: A genotyping tool for next-generation sequencing data in genome-wide association study

BioChip Journal ◽

10.1007/s13206-013-7406-2 ◽

2013 ◽

Vol 7 (4) ◽

pp. 353-360

Author(s):

Jinwoo Kim ◽

Jaeyoung Kim ◽

Miyoung Shin

Keyword(s):

Next Generation Sequencing ◽

Association Study ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Genome Wide ◽

Generation Sequencing

SvABA: Genome-wide detection of structural variants and indels by local assembly

10.1101/105080 ◽

2017 ◽

Cited By ~ 9

Author(s):

Jeremiah Wala ◽

Pratiti Bandopadhayay ◽

Noah Greenwald ◽

Ryan O’Rourke ◽

Ted Sharpe ◽

...

Keyword(s):

Variant Calling ◽

Accurate Method ◽

Structural Variants ◽

Sequencing Data ◽

Cancer Driver ◽

Insertion And Deletion ◽

Genome Wide ◽

Cancer Genomes ◽

Local Assembly ◽

Genomic Regions

Abstract: Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA’s performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs, and substantially improved detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (< 1,000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types, and found that templated-sequence insertions occur in ~4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized SVs.
