Prediction of genome-wide effects of single nucleotide variants on transcription factor binding

Sebastian Carrasco Pro; Katia Bulekova; Brian Gregor; Adam Labadorf; Juan Ignacio Fuxman Bass

doi:10.1038/s41598-020-74793-4

Scientific Reports ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Sebastian Carrasco Pro ◽

Katia Bulekova ◽

Brian Gregor ◽

Adam Labadorf ◽

Juan Ignacio Fuxman Bass

Keyword(s):

Binding Sites ◽

Cancer Type ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Regulatory Regions ◽

Genome Wide ◽

Transcriptional Regulatory ◽

Gene Regulatory ◽

The Impact ◽

The Relationship

Abstract Single nucleotide variants (SNVs) located in transcriptional regulatory regions can result in gene expression changes that lead to adaptive or detrimental phenotypic outcomes. Here, we predict gain or loss of binding sites for 741 transcription factors (TFs) across the human genome. We calculated ‘gainability’ and ‘disruptability’ scores for each TF that represent the likelihood of binding sites being created or disrupted, respectively. We found that functional cis-eQTL SNVs are more likely to alter TF binding sites than rare SNVs in the human population. In addition, we show that cancer somatic mutations have different effects on TF binding sites from different TF families on a cancer-type basis. Finally, we discuss the relationship between these results and cancer mutational signatures. Altogether, we provide a blueprint to study the impact of SNVs derived from genetic variation or disease association on TF binding to gene regulatory regions.
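
The abstract does not say how gain or loss of a binding site is called, so the following is only a minimal sketch of one common approach: score the reference and alternate alleles against a position frequency matrix and compare each with a threshold. The matrix `TOY_PFM`, the uniform background, the threshold, and all function names are hypothetical and not taken from the paper; only the forward strand is scanned.

```python
import math

BASES = "ACGT"
BACKGROUND = 0.25  # assumed uniform background base frequency

# Hypothetical 4-bp motif favouring "ACGT"; rows are motif positions,
# columns are probabilities of A, C, G, T. Not taken from the paper.
TOY_PFM = [
    [0.85, 0.05, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.05, 0.05, 0.05, 0.85],
]


def log_odds(window, pfm):
    """Sum of per-position log2(probability / background) for one window."""
    return sum(
        math.log2(pfm[i][BASES.index(base)] / BACKGROUND)
        for i, base in enumerate(window)
    )


def best_overlapping_score(seq, pos, pfm):
    """Best log-odds score among motif windows that cover position `pos`."""
    width = len(pfm)
    first = max(0, pos - width + 1)
    last = min(pos, len(seq) - width)
    return max(log_odds(seq[s:s + width], pfm) for s in range(first, last + 1))


def classify_snv(ref_seq, pos, alt_base, pfm, threshold):
    """Return 'gain', 'loss', or 'none' for a single-base substitution."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_hit = best_overlapping_score(ref_seq, pos, pfm) >= threshold
    alt_hit = best_overlapping_score(alt_seq, pos, pfm) >= threshold
    if ref_hit and not alt_hit:
        return "loss"
    if alt_hit and not ref_hit:
        return "gain"
    return "none"
```

Under these assumptions, `classify_snv("TTACGTTT", 3, "G", TOY_PFM, 5.0)` reports a loss, because the substitution breaks the only window that matches the toy motif. A per-TF score such as 'disruptability' could then be derived by aggregating such calls over many variants, but the aggregation used in the paper is not described here.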

Genome-wide allele-specific methylation is enriched at gene regulatory regions in a multi-generation pedigree from the Norfolk Island isolate

Epigenetics & Chromatin ◽

10.1186/s13072-019-0304-7 ◽

2019 ◽

Vol 12 (1) ◽

Cited By ~ 1

Author(s):

Miles C. Benton ◽

Rodney A. Lea ◽

Donia Macartney-Coxson ◽

Heidi G. Sutherland ◽

Nicole White ◽

...

Keyword(s):

Disease Risk ◽

Single Nucleotide Variants ◽

Regulatory Regions ◽

Fold Enrichment ◽

Genome Wide ◽

A Genome ◽

Norfolk Island ◽

Allele Specific ◽

Gene Regulatory ◽

Allele Specific Methylation

Abstract Background Allele-specific methylation (ASM) occurs when DNA methylation patterns exhibit asymmetry among alleles. ASM occurs at imprinted loci, but its presence elsewhere across the human genome is indicative of wider importance in terms of gene regulation and disease risk. Here, we studied ASM by focusing on blood-based DNA collected from 24 subjects comprising a 3-generation pedigree from the Norfolk Island genetic isolate. We applied a genome-wide bisulphite sequencing approach with a genotype-independent ASM calling method to map ASM across the genome. Regions of ASM were then tested for enrichment at gene regulatory regions using the Genomic Association Tester (GAT) tool. Results In total, we identified 1.12 M CpGs of which 147,170 (13%) exhibited ASM (P ≤ 0.05). When including contiguous ASM signal spanning ≥ 2 CpGs, this condensed to 12,761 ASM regions (AMRs). These AMRs tagged 79% of known imprinting regions and most (98.1%) co-localised with known single nucleotide variants. Notably, miRNA and lncRNA showed a 3.3- and 1.8-fold enrichment of AMRs, respectively (P < 0.005). Also, the 5′ UTR and start codons each showed a 3.5-fold enrichment of AMRs (P < 0.005). There was also enrichment of AMRs observed at subtelomeric regions of many chromosomes. Five out of 11 large AMRs localised to the protocadherin cluster on chromosome 5. Conclusions This study shows ASM extends far beyond genomic imprinting in humans and that gene regulatory regions are hotspots for ASM. Future studies of ASM in pedigrees should help to clarify transgenerational inheritance patterns in relation to genotype and disease phenotypes.
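
The step that condenses individual ASM CpGs into regions spanning at least 2 contiguous CpGs can be illustrated with a short sketch. It assumes that "contiguous" means consecutive assayed CpGs on one chromosome, which the abstract does not state; the function name and input layout are hypothetical.

```python
from itertools import groupby


def asm_regions(cpgs, min_cpgs=2):
    """Group consecutive ASM-positive CpGs into regions.

    `cpgs` is a list of (position, is_asm) tuples for one chromosome,
    sorted by position. Returns (start, end, n_cpgs) for every run of
    ASM-positive CpGs that contains at least `min_cpgs` sites.
    """
    regions = []
    for is_asm, run in groupby(cpgs, key=lambda cpg: cpg[1]):
        run = list(run)
        if is_asm and len(run) >= min_cpgs:
            regions.append((run[0][0], run[-1][0], len(run)))
    return regions
```

Isolated ASM CpGs, flanked on both sides by non-ASM sites, are dropped by this rule, which is one way a large set of single sites can condense to a much smaller set of regions.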

Combination of Genome-Wide Polymorphisms and Copy Number Variations of Pharmacogenes in Koreans

Journal of Personalized Medicine ◽

10.3390/jpm11010033 ◽

2021 ◽

Vol 11 (1) ◽

pp. 33

Author(s):

Nayoung Han ◽

Jung Mi Oh ◽

In-Wha Kim

Keyword(s):

Copy Number ◽

Genome Wide Association Study ◽

Copy Number Gain ◽

Copy Number Variations ◽

Gene Gain ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Haplotype Blocks ◽

Genome Wide ◽

Control And Prevention

For predicting phenotypes and executing precision medicine, combined analysis of single nucleotide variant (SNV) genotyping and copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common CNVs and to examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean Genome and Epidemiology Study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.

Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

Toxicology and Applied Pharmacology ◽

10.1016/j.taap.2004.09.024 ◽

2005 ◽

Vol 207 (2) ◽

pp. 84-90 ◽

Cited By ~ 70

Author(s):

X WANG ◽

D TOMSO ◽

X LIU ◽

D BELL

Keyword(s):

Single Nucleotide Polymorphism ◽

Nucleotide Polymorphism ◽

Single Nucleotide ◽

Regulatory Regions ◽

Transcriptional Regulatory ◽

Environmentally Responsive

Re-evaluation of single nucleotide variants and identification of structural variants in a cohort of 45 sudden unexplained death cases

International Journal of Legal Medicine ◽

10.1007/s00414-021-02580-5 ◽

2021 ◽

Author(s):

Jacqueline Neubauer ◽

Shouyu Wang ◽

Giancarlo Russo ◽

Cordula Haas

Keyword(s):

Sudden Death ◽

Cardiac Diseases ◽

Structural Variants ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Sudden Unexplained Death ◽

Unexplained Death ◽

Pathogenic Variants ◽

The Impact ◽

Death Cases

Abstract Sudden unexplained death (SUD) accounts for a considerable proportion of overall sudden death cases, especially in adolescents and young adults. During the past decade, many channelopathy- and cardiomyopathy-associated single nucleotide variants (SNVs) have been identified in SUD studies by means of postmortem molecular autopsy, yet the number of cases that remain inconclusive is still high. Recent studies have suggested that structural variants (SVs) might play an important role in SUD, but there is no consensus on the impact of SVs on inherited cardiac diseases. In this study, we searched for potentially pathogenic SVs in 244 genes associated with cardiac diseases. Whole-exome sequencing and appropriate data analysis were performed in 45 SUD cases. Re-analysis of the exome data according to the current ACMG guidelines identified 14 pathogenic or likely pathogenic variants in 10 (22.2%) out of the 45 SUD cases, whereof 2 (4.4%) individuals had variants with likely functional effects in the channelopathy-associated genes SCN5A and TRDN and 1 (2.2%) individual in the cardiomyopathy-associated gene DTNA. In addition, 18 SVs were identified in 15 out of the 45 individuals. Two SVs with likely functional impairment were found in the coding regions of PDSS2 and TRPM4 in 2 SUD cases (4.4%). Both were identified as heterozygous deletions, which were confirmed by multiplex ligation-dependent probe amplification. In conclusion, our findings support that SVs could contribute to the pathology of the sudden death event in some of the cases and therefore should be investigated on a routine basis in suspected SUD cases.

Impact of Pre and Post Variant Filtration Strategies on Imputation

10.21203/rs.3.rs-128366/v1 ◽

2020 ◽

Author(s):

Celine Charon ◽

Rodrigue Allodji ◽

Vincent Meyer ◽

Jean-François Deleuze

Keyword(s):

Quality Control ◽

Rare Variants ◽

Association Studies ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Direct Effects ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Genome Wide ◽

Conservative Post

Abstract Quality control methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in the loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the number of markers, their allele frequencies, imputation quality scores and post-filtration events. We pre-phased 1,031 genotyped individuals from diverse ethnicities and compared the imputed variants to 1,089 NCBI recorded individuals for additional validation. Without variant pre-filtration based on quality control (QC), we observed no impairment in the imputation of SNPs that failed QC, whereas with pre-filtration there was an overall loss of information. Significant differences between frequencies with and without pre-filtration were found only in the range of very rare (5E-04-1E-03) and rare variants (1E-03-5E-03) (p < 1E-04). Increasing the post-filtration imputation quality score from 0.3 to 0.8 reduced the number of single nucleotide variants (SNVs) <0.001 2.5-fold with or without QC pre-filtration and halved the number of very rare variants (5E-04). As a result, to maintain confidence and enough SNVs, we propose here a 2-step post-filtration approach to increase the number of very rare and rare variants compared to conservative post-filtration methods.

CSIG-22. CANCER-ASSOCIATED MISSENSE SINGLE NUCLEOTIDE VARIANTS REGULATE THE STABILITY AND SUBCELLULAR LOCALIZATION OF NF2/MERLIN

Neuro-Oncology ◽

10.1093/neuonc/noaa215.134 ◽

2020 ◽

Vol 22 (Supplement_2) ◽

pp. ii32-ii32

Author(s):

Charlotte Eaton ◽

Paola Bisignano ◽

David Raleigh

Keyword(s):

Tumor Suppressor ◽

Subcellular Localization ◽

Function Analysis ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Ferm Domain ◽

Binding Partners ◽

Subcellular Compartments ◽

The Stability ◽

The Impact

Abstract BACKGROUND Alterations in the NF2 tumor suppressor gene lead to meningiomas and schwannomas, but the tumor suppressor functions of the NF2 gene product, Merlin, are incompletely understood. To address this problem, we performed a structure-function analysis of Merlin by expressing cancer-associated missense single-nucleotide variants (mSNVs) in primary cancer cells for biochemical and cell biology experiments. METHODS All NF2 mSNVs were assembled from cBioPortal and COSMIC, and modelled on the FERM, α-helical, and C-terminal domains of Merlin (PDB 4ZRJ) using comparative structure prediction on the Robetta server and visually inspected using PyMOL. mSNV hotspots were defined from sliding windows with at least 10 mutations within 5 residues in either direction. mSNVs from hotspots in meningiomas, schwannomas, or both, were selected for in vitro mechanistic analyses using immunofluorescence and immunoblotting of whole cell, plasma membrane, cytoskeletal, cytoplasmic, nuclear, and chromatin subcellular fractions from M10G meningioma cells and HEI-193 schwannoma cells. RESULTS We identified the following cancer-associated hotspot mSNVs in NF2, which were over-expressed for mechanistic studies: L46R, S156N, W191R, A211D, V219M, R418C and R462K. Endogenous Merlin was detected in all subcellular compartments, but was enriched in the nucleus. L46R and A211D mapped to hydrophobic pockets in the FERM domain, destabilized Merlin, and excluded Merlin from all subcellular compartments except the cytoskeleton. S156N, W191R and V219M also mapped to the FERM domain, but did not affect Merlin stability, and V219M attenuated chromatin localization, suggesting this motif may be involved in binding events that regulate subcellular localization. R418C and R462K mapped to the α-helical domain, but only R418C destabilized Merlin. CONCLUSION Our results suggest that cancer-associated mSNVs inactivate the tumor suppressor functions of NF2 by altering the stability, subcellular localization, or binding partners of Merlin. Further work is required to identify and understand the impact of binding partners and subcellular localization on Merlin function.
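
The hotspot rule stated in METHODS (at least 10 mutations within 5 residues in either direction) can be sketched as a simple window count. This is one reading of that rule, not the authors' code: it assumes one input entry per observed mutation, centres a window on each mutated residue, and uses hypothetical names throughout.

```python
from collections import Counter


def hotspot_residues(mutation_positions, flank=5, min_mutations=10):
    """Return mutated residues whose +/- `flank` window holds enough mutations.

    `mutation_positions` lists one residue number per observed mutation,
    so a residue mutated in six samples appears six times.
    """
    counts = Counter(mutation_positions)
    hotspots = []
    for residue in sorted(counts):
        window_total = sum(
            counts[p] for p in range(residue - flank, residue + flank + 1)
        )
        if window_total >= min_mutations:
            hotspots.append(residue)
    return hotspots
```

With synthetic input such as six mutations at residue 20 and four at residue 22, both residues are flagged because each window contains all 10 mutations, while a residue with nine mutations and no close neighbours is not.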

HGG-41. STRUCTURAL VARIANT DRIVERS IN PEDIATRIC HIGH-GRADE GLIOMA

Neuro-Oncology ◽

10.1093/neuonc/noaa222.322 ◽

2020 ◽

Vol 22 (Supplement_3) ◽

pp. iii351-iii351

Author(s):

Frank Dubois ◽

Ofer Shapira ◽

Noah Greenwald ◽

Travis Zack ◽

Jessica W Tsai ◽

...

Keyword(s):

Copy Number ◽

High Grade Glioma ◽

High Grade ◽

Structural Variants ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Effector Domains ◽

Topologically Associating Domains ◽

Genome Wide ◽

Pediatric High Grade Glioma

Abstract BACKGROUND Driver single nucleotide variants (SNVs) and somatic copy number aberrations (SCNAs) in pediatric high-grade gliomas (pHGGs), including diffuse midline gliomas (DMGs), have been characterized. However, structural variants (SVs) in pHGGs and the mechanisms through which they contribute to glioma formation have not been systematically analyzed genome-wide. METHODS Using SvABA for SVs as well as the latest pipelines for SCNAs and SNVs, we analyzed whole-genome sequencing from 174 patients. This includes 60 previously unpublished samples, 43 of which are DMGs. Signature analysis allowed us to define pHGG groups with shared SV characteristics. Significantly recurring SV breakpoints and juxtapositions were identified with algorithms we recently developed, and the findings were correlated with RNAseq and H3K27ac ChIPseq. RESULTS The SV characteristics in pHGG showed three groups defined by either complex, intermediate or simple signature activities. These were associated with distinct combinations of known driver oncogenes. Our statistical analysis revealed recurring SVs in the topologically associating domains of MYCN, MYC, EGFR, PDGFRA & MET. These correlated with increased mRNA expression and amplification of H3K27ac peaks. Complex recurring amplifications showed characteristics of extrachromosomal amplicons and were enriched in coding SVs splitting protein regulatory from effector domains. Integrative analysis of all SCNAs, SNVs & SVs revealed patterns of characteristic combinations between potential drivers and signatures. This included two distinct groups of H3K27M DMGs with either complex or simple signatures and different combinations of associated variants. CONCLUSION Recurrent SVs associate with signatures shaped by an underlying process, which can lead to distinct mechanisms to activate the same oncogene.

Genome-wide effects of chromatin on vitamin D signaling

Journal of Molecular Endocrinology ◽

10.1530/jme-19-0246 ◽

2020 ◽

Vol 64 (4) ◽

pp. R45-R56 ◽

Cited By ~ 3

Author(s):

Andrea Hanel ◽

Henna-Riikka Malmberg ◽

Carsten Carlberg

Keyword(s):

Vitamin D ◽

Binding Sites ◽

Target Genes ◽

Chromatin Accessibility ◽

Transcription Start Sites ◽

Dihydroxyvitamin D ◽

Genome Wide ◽

Spatio Temporal ◽

Vitamin D Signaling ◽

The Impact

Molecular endocrinology of vitamin D is based on the activation of the transcription factor vitamin D receptor (VDR) by the vitamin D metabolite 1α,25-dihydroxyvitamin D3. This nuclear vitamin D-sensing process causes epigenome-wide effects, such as changes in chromatin accessibility as well as in the contact of VDR and its supporting pioneer factors with thousands of genomic binding sites, referred to as vitamin D response elements. VDR binding enhancer regions loop to transcription start sites of hundreds of vitamin D target genes resulting in changes of their expression. Thus, vitamin D signaling is based on epigenome- and transcriptome-wide shifts in VDR-expressing tissues. Monocytes are the most responsive cell type of the immune system and serve as a paradigm for uncovering the chromatin model of vitamin D signaling. In this review, an alternative approach for selecting vitamin D target genes is presented, which are most relevant for understanding the impact of vitamin D endocrinology on innate immunity. Different scenarios of the regulation of primary upregulated vitamin D target genes are presented, in which vitamin D-driven super-enhancers comprise a cluster of persistent (constant) and/or inducible (transient) VDR-binding sites. In conclusion, the spatio-temporal VDR binding in the context of chromatin is most critical for the regulation of vitamin D target genes.

Korean Genome Project: 1094 Korean personal genomes with clinical information

Science Advances ◽

10.1126/sciadv.aaz7835 ◽

2020 ◽

Vol 6 (22) ◽

pp. eaaz7835 ◽

Cited By ~ 2

Author(s):

Sungwon Jeon ◽

Youngjune Bhak ◽

Yeonsong Choi ◽

Yeonsu Jeon ◽

Seunghoon Kim ◽

...

Keyword(s):

Genome Wide Association Study ◽

Imputation Accuracy ◽

Clinical Information ◽

Genome Project ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Genome Wide ◽

A Genome ◽

Whole Genomes ◽

Personal Genomes

We present the initial phase of the Korean Genome Project (Korea1K), including 1094 whole genomes (sequenced at an average depth of 31×), along with data of 79 quantitative clinical traits. We identified 39 million single-nucleotide variants and indels of which half were singleton or doubleton and detected Korean-specific patterns based on several types of genomic variations. A genome-wide association study illustrated the power of whole-genome sequences for analyzing clinical traits, identifying nine more significant candidate alleles than previously reported from the same linkage disequilibrium blocks. Also, Korea1K, as a reference, showed better imputation accuracy for Koreans than the 1KGP panel. As proof of utility, germline variants in cancer samples could be filtered out more effectively when the Korea1K variome was used as a panel of normals compared to non-Korean variome sets. Overall, this study shows that Korea1K can be a useful genotypic and phenotypic resource for clinical and ethnogenetic studies.

Cytosine base editor 4 but not adenine base editor generates off-target mutations in mouse embryos

Communications Biology ◽

10.1038/s42003-019-0745-3 ◽

2020 ◽

Vol 3 (1) ◽

Cited By ~ 15

Author(s):

Hye Kyung Lee ◽

Harold E. Smith ◽

Chengyu Liu ◽

Michaela Willi ◽

Lothar Hennighausen

Keyword(s):

Point Mutations ◽

Mouse Embryos ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Base Editing ◽

Genome Wide ◽

Wide Range ◽

Family Based ◽

Correct Point ◽

Adenine Base

Abstract Deaminase base editing has emerged as a tool to install or correct point mutations in the genomes of living cells in a wide range of organisms. However, the genome-wide off-target effects introduced by base editors in the mammalian genome have been examined in only one study. Here, we have investigated the fidelity of cytosine base editor 4 (BE4) and adenine base editors (ABE) in mouse embryos using unbiased whole-genome sequencing of a family-based trio cohort. The same sgRNA was used for BE4 and ABE. We demonstrate that BE4-edited mice carry an excess of single-nucleotide variants and deletions compared to ABE-edited mice and controls. Therefore, optimization of cytosine base editors is required to improve their fidelity. While the remarkable fidelity of ABE has implications for a wide range of applications, the occurrence of rare aberrant C-to-T conversions at specific target sites needs to be addressed.
