Mapping short tandem repeats for liver gene expression traits helps prioritize potential causal variants for complex traits in pigs

2022 ◽  
Vol 13 (1) ◽  
Author(s):  
Zhongzi Wu ◽  
Huanfa Gong ◽  
Zhimin Zhou ◽  
Tao Jiang ◽  
Ziqi Lin ◽  
...  

Abstract Background: Short tandem repeats (STRs) were recently found to have significant impacts on gene expression and diseases in humans, but their roles in gene expression and complex traits in pigs remain unexplored. This study investigates the effects of STRs on gene expression in liver tissues based on the whole-genome sequences and RNA-Seq data of a discovery cohort of 260 F6 individuals and a validation population of 296 F7 individuals from a heterogeneous population generated from crosses among eight pig breeds. Results: We identified 5203 and 5868 significant expression STRs (eSTRs, FDR < 1%) in the F6 and F7 populations, respectively, most of which could be reciprocally validated (π1 = 0.92). The eSTRs explained 27.5% of the cis-heritability of gene expression traits on average. We further identified 235 and 298 fine-mapped STRs through the Bayesian fine-mapping approach in the F6 and F7 pigs, respectively, which were significantly enriched in intron, ATAC peak, compartment A and H3K4me3 regions. We identified 20 fine-mapped STRs located in 100 kb windows upstream and downstream of published complex trait-associated SNPs, which colocalized with epigenetic markers such as H3K27ac and ATAC peaks. These included eSTRs of the CLPB, PGLS, PSMD6 and DHDH genes, which are linked with genome-wide association study (GWAS) SNPs for blood-related traits, leg conformation, growth-related traits, and meat quality traits, respectively. Conclusions: This study provides insights into the effects of STRs on gene expression traits. The identified eSTRs are valuable resources for prioritizing causal STRs for complex traits in pigs.
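The abstract reports eSTRs at FDR < 1% but does not say which FDR procedure was used. As an illustration only, the sketch below applies the Benjamini-Hochberg adjustment, one common way to control FDR across many association tests. The function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is used here purely as an example of FDR control;
    the study's actual procedure is not stated in the abstract.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four STR-gene tests.
example_p = [0.001, 0.04, 0.03, 0.5]
example_q = benjamini_hochberg(example_p)
# Tests whose adjusted value falls below 0.01 would pass an FDR < 1% cutoff.
significant = [q < 0.01 for q in example_q]
```

Under this sketch, an STR-gene pair would be called significant at FDR < 1% when its adjusted value is below 0.01.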

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
David Jakubosky ◽  
◽  
Matteo D’Antonio ◽  
Marc Jan Bonder ◽  
Craig Smail ◽  
...  

2019 ◽  
Author(s):  
David Jakubosky ◽  
Matteo D’Antonio ◽  
Marc Jan Bonder ◽  
Craig Smail ◽  
Margaret K.R. Donovan ◽  
...  

Abstract Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we show that different SV classes and STRs differentially impact gene expression and complex traits. Functional differences between SV classes and STRs include their genomic locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We also identified a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and showed they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that impact gene expression and human traits.


2018 ◽  
Author(s):  
Stephanie Feupe Fotsing ◽  
Jonathan Margoliash ◽  
Catherine Wang ◽  
Shubham Saini ◽  
Richard Yanicky ◽  
...  

Abstract Short tandem repeats (STRs) have been implicated in a variety of complex traits in humans. However, genome-wide studies of the effects of STRs on gene expression thus far have had limited power to detect associations and provide insights into putative mechanisms. Here, we leverage whole genome sequencing and expression data for 17 tissues from the Genotype-Tissue Expression Project (GTEx) to identify STRs for which repeat number is associated with expression of nearby genes (eSTRs). Our analysis reveals more than 28,000 eSTRs. We employ fine-mapping to quantify the probability that each eSTR is causal and characterize a group of the top 1,400 fine-mapped eSTRs. We identify hundreds of eSTRs linked with published GWAS signals and implicate specific eSTRs in complex traits including height, schizophrenia, inflammatory bowel disease, and intelligence. Overall, our results support the hypothesis that eSTRs contribute to a range of human phenotypes and will serve as a valuable resource for future studies of complex traits.
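The core idea of an eSTR, an STR whose repeat number is associated with expression of a nearby gene, can be illustrated with a simple per-STR linear regression. The sketch below is a simplified stand-in, not the pipeline used in the study: it assumes each individual's STR genotype is summarized as a single dosage (for example, the sum of the two allele repeat counts), omits covariates, and uses made-up data. The function name is hypothetical.

```python
import math


def estr_association(dosages, expression):
    """Regress expression on STR dosage; return (slope, r, t_stat).

    Assumption: `dosages` holds one number per individual, such as the
    sum of the repeat counts of both alleles. Covariates are ignored.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage and expression must both vary")
    slope = sxy / sxx                  # expression change per repeat unit
    r = sxy / math.sqrt(sxx * syy)     # Pearson correlation
    residual = max(1.0 - r * r, 0.0)
    if residual == 0.0:
        t_stat = math.copysign(math.inf, r)
    else:
        # t statistic for the slope, with n - 2 degrees of freedom.
        t_stat = r * math.sqrt((n - 2) / residual)
    return slope, r, t_stat


# Made-up data: five individuals, dosage versus normalized expression.
toy_dosages = [20, 22, 24, 26, 28]
toy_expression = [1.0, 1.9, 3.2, 3.9, 5.1]
toy_slope, toy_r, toy_t = estr_association(toy_dosages, toy_expression)
```

In a genome-wide scan, a test of this kind would be repeated for every STR within a chosen window of each gene, followed by multiple-testing correction.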


2015 ◽  
Vol 48 (1) ◽  
pp. 22-29 ◽  
Author(s):  
Melissa Gymrek ◽  
Thomas Willems ◽  
Audrey Guilmatre ◽  
Haoyang Zeng ◽  
Barak Markus ◽  
...  

2018 ◽  
Author(s):  
Shubham Saini ◽  
Ileena Mitra ◽  
Nima Mousavi ◽  
Stephanie Feupe Fotsing ◽  
Melissa Gymrek

Abstract Short tandem repeats (STRs) are involved in dozens of Mendelian disorders and have been implicated in a variety of complex traits. However, existing technologies focusing on single nucleotide polymorphisms (SNPs) have not allowed for systematic STR association studies. Here, we leverage next-generation sequencing data from 479 families to create a SNP+STR reference haplotype panel for genome-wide imputation of STRs into SNP data. Imputation achieved an average of 97% concordance between genotyped and imputed STR genotypes in an external dataset compared to 63% expected under a random model. Performance varied widely across STRs, with near perfect concordance at bi-allelic STRs vs. 70% at highly polymorphic forensics markers. We demonstrate that imputation increases power over individual SNPs to detect STR associations using simulated phenotypes and gene expression data. This resource will enable the first large-scale STR association studies using existing SNP datasets, and will likely yield new insights into complex traits.
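Concordance between genotyped and imputed STR calls can be made concrete with a small example. The sketch below assumes one plausible definition, the fraction of samples whose unordered pair of allele repeat counts matches exactly; the abstract does not spell out the metric, so the authors' definition may differ. The function name and the calls are hypothetical.

```python
def genotype_concordance(true_calls, imputed_calls):
    """Fraction of samples whose imputed genotype matches the truth.

    Assumption: each call is a pair of allele repeat counts, or None
    when the sample has no call. Allele order is ignored.
    """
    if len(true_calls) != len(imputed_calls):
        raise ValueError("call lists must have the same length")
    compared = 0
    matches = 0
    for truth, imputed in zip(true_calls, imputed_calls):
        if truth is None or imputed is None:
            continue  # skip samples missing either call
        compared += 1
        if sorted(truth) == sorted(imputed):
            matches += 1
    if compared == 0:
        raise ValueError("no samples with both calls present")
    return matches / compared


# Made-up calls at one STR for four samples.
toy_truth = [(10, 12), (11, 11), (12, 10), None]
toy_imputed = [(12, 10), (11, 12), (12, 10), (9, 9)]
toy_concordance = genotype_concordance(toy_truth, toy_imputed)
```

Averaging such a per-locus value across STRs would give a panel-wide summary comparable in spirit to the 97% figure quoted above.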


2015 ◽  
Author(s):  
Melissa Gymrek ◽  
Thomas Willems ◽  
Haoyang Zeng ◽  
Barak Markus ◽  
Mark J Daly ◽  
...  

Expression quantitative trait loci (eQTLs) are a key tool to dissect cellular processes mediating complex diseases. However, little is known about the role of repetitive elements as eQTLs. We report a genome-wide survey of the contribution of Short Tandem Repeats (STRs), one of the most polymorphic and abundant repeat classes, to gene expression in humans. Our survey identified 2,060 significant expression STRs (eSTRs). These eSTRs were replicable in orthogonal populations and expression assays. We used variance partitioning to disentangle the contribution of eSTRs from linked SNPs and indels and found that eSTRs contribute 10%-15% of the cis-heritability mediated by all common variants. Functional genomic analyses showed that eSTRs are enriched in conserved regions, co-localize with regulatory elements, and are predicted to modulate histone modifications. Our results show that eSTRs provide a novel set of regulatory variants and highlight the contribution of repeats to the genetic architecture of quantitative human traits.


2020 ◽  
Author(s):  
Milad Mortazavi ◽  
Yangsu Ren ◽  
Shubham Saini ◽  
Danny Antaki ◽  
Celine St. Pierre ◽  
...  

Abstract C57BL/6J is the most widely used inbred mouse strain and is the basis for the mouse reference genome. In addition to C57BL/6J, several other C57BL/6 and C57BL/10 substrains exist. Previous studies have documented extensive phenotypic and genetic differences among these substrains, which are presumed to be due to the accumulation of new mutations. These differences can be used for genome-wide association studies. They can also have unintended consequences for reproducibility when substrain differences are not properly accounted for. In this paper, we performed genomic sequencing and RNA-sequencing in the hippocampus of 9 C57BL/6 and 5 C57BL/10 substrains. We identified 985,329 SNPs, 150,344 short tandem repeats (STRs) and 896 structural variants (SVs), out of which 330,178 SNPs and 14,367 STRs differentiated the C57BL/6 and C57BL/10 groups. We found several regions that contained dense polymorphisms. We also identified 578 differentially expressed genes for C57BL/6 substrains and 37 differentially expressed genes for C57BL/10 substrains (FDR < 0.01). We then identified nearby SNPs, STRs and SVs that matched the gene expression patterns. In so doing, we identified SVs in coding regions of Wdfy1, Ide, Fgfbp3 and Btaf1 that explain the expression patterns observed. We replicated several previously reported gene expression differences between substrains (Nnt, Gabra2) as well as many novel gene expression differences (e.g. Kcnc2). Our results illustrate the impact of new mutations on gene expression among these substrains and provide a resource for future mapping studies.


2021 ◽  
Vol 12 ◽  
Author(s):  
Claire P. Prowse-Wilkins ◽  
Jianghui Wang ◽  
Ruidong Xiang ◽  
Josie B. Garner ◽  
Michael E. Goddard ◽  
...  

Genetic variants which affect complex traits (causal variants) are thought to be found in functional regions of the genome. Identifying causal variants would be useful for predicting complex trait phenotypes in dairy cows; however, functional regions are poorly annotated in the bovine genome. Functional regions can be identified on a genome-wide scale by assaying for post-translational modifications to histone proteins (histone modifications) and proteins interacting with the genome (e.g., transcription factors) using a method called chromatin immunoprecipitation followed by sequencing (ChIP-seq). In this study ChIP-seq was performed to find functional regions in the bovine genome by assaying for four histone modifications (H3K4Me1, H3K4Me3, H3K27ac, and H3K27Me3) and one transcription factor (CTCF) in 6 tissues (heart, kidney, liver, lung, mammary and spleen) from 2 to 3 lactating dairy cows. Eighty-six ChIP-seq samples were generated in this study, identifying millions of functional regions in the bovine genome. Combinations of histone modifications and CTCF were found using ChromHMM and annotated by comparing with active and inactive genes across the genome. Functional marks differed between tissues, highlighting areas which might be particularly important to tissue-specific regulation. Supporting the cis-regulatory role of functional regions, the read counts in some ChIP peaks correlated with nearby gene expression. The functional regions identified in this study were enriched for putative causal variants, as seen in other species. Interestingly, regions which correlated with gene expression were particularly enriched for potential causal variants. This supports the hypothesis that complex traits are regulated by variants that alter gene expression. This study provides one of the largest ChIP-seq annotation resources in cattle including, for the first time, in the mammary gland of lactating cows. By linking regulatory regions to expression QTL and trait QTL we demonstrate a new strategy for identifying causal variants in cattle.

