CompMap: an allele-specific expression read-counter based on competitive mapping

2021 ◽  
Author(s):  
S. Sánchez-Ramírez ◽  
A. D. Cutter

Abstract Summary: Changes to regulatory sequences account for important phenotypic differences between species and populations. In heterozygous individuals, regulatory polymorphism typically manifests as allele-specific expression (ASE) of transcripts. ASE data from inter-species and inter-population hybrids, in conjunction with expression data from the parents, can be used to infer regulatory changes in cis and trans throughout the genome. Improper data handling, however, can create problems of mapping bias and excessive loss of information, which are prone to arise unintentionally from the cumbersome pipelines with multiple dependencies that are common among current methods. Here, we introduce a new, self-contained method implemented in Python that generates allele-specific expression counts from genotype-specific map alignments. Rather than assessing individual SNPs, our approach sorts and counts reads within a given homologous region by comparing individual read-mapping statistics from each parental alignment. Reads that are aligned ambiguously to both references are resolved proportionally to the allele-specific matching read counts or statistically using a binomial distribution. Using simulations, we show CompMap has low error rates in assessing regulatory divergence. Availability: The Python code with examples and installation instructions is available on the GitHub repository https://github.com/santiagosnchez/[email protected] information
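The competitive-mapping idea described in this abstract can be illustrated with a minimal sketch. This is not the CompMap code: the function name `count_ase`, the use of per-read mismatch counts as the mapping statistic, and the rule that a read found in only one alignment goes to that parent are all assumptions made for illustration.

```python
import random


def count_ase(nm_a, nm_b, mode="proportional", seed=0):
    """Assign reads from one homologous region to parent A or parent B.

    nm_a, nm_b: hypothetical dicts mapping read ID -> number of mismatches
    when the read is aligned to the parent A / parent B reference.
    """
    count_a = count_b = ambiguous = 0
    # Reads present in both alignments compete on their mismatch counts.
    for read in nm_a.keys() & nm_b.keys():
        if nm_a[read] < nm_b[read]:
            count_a += 1
        elif nm_b[read] < nm_a[read]:
            count_b += 1
        else:
            ambiguous += 1
    # Assumption: a read aligned to only one reference is given to that parent.
    count_a += len(nm_a.keys() - nm_b.keys())
    count_b += len(nm_b.keys() - nm_a.keys())

    # Share of unambiguous reads that matched parent A.
    total = count_a + count_b
    p_a = count_a / total if total else 0.5

    if mode == "proportional":
        # Split ambiguous reads in proportion to the unambiguous counts.
        extra_a = ambiguous * p_a
    else:
        # Draw each ambiguous read as a Bernoulli trial with probability p_a,
        # which gives a binomial number of extra reads for parent A.
        rng = random.Random(seed)
        extra_a = sum(rng.random() < p_a for _ in range(ambiguous))
    return count_a + extra_a, count_b + (ambiguous - extra_a)


# Toy region: r1 matches A better, r2 matches B better, r3 and r4 tie,
# and r5 aligned only to the parent A reference.
nm_a = {"r1": 0, "r2": 2, "r3": 1, "r4": 1, "r5": 0}
nm_b = {"r1": 1, "r2": 0, "r3": 1, "r4": 1}
print(count_ase(nm_a, nm_b))
print(count_ase(nm_a, nm_b, mode="binomial"))
```

The proportional mode returns fractional counts, while the binomial mode keeps counts as whole reads at the cost of sampling noise. Which is preferable likely depends on what the downstream expression analysis expects.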

2019 ◽  
Vol 10 ◽  
Author(s):  
Mazdak Salavati ◽  
Stephen J. Bush ◽  
Sergio Palma-Vera ◽  
Mary E. B. McCulloch ◽  
David A. Hume ◽  
...  

2020 ◽  
Vol 36 (19) ◽  
pp. 4955-4956
Author(s):  
Lili Dong ◽  
Jianan Wang ◽  
Guohua Wang

Abstract Summary: Allele-specific expression (ASE) is involved in many important biological mechanisms. We present a Python package, BYASE, and its graphical user interface (GUI) tool, BYASE-GUI, for the identification of ASE from single-end and paired-end RNA-seq data based on Bayesian inference, which can simultaneously report differences in gene-level and isoform-level expression. BYASE uses both phased SNPs and non-phased SNPs, and supports polyploid organisms. Availability and implementation: The source code of BYASE and BYASE-GUI is freely available at https://github.com/ncjllld/byase and https://github.com/ncjllld/byase_gui. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Junjie Shao ◽  
Sangang He ◽  
Xiangyu Pan ◽  
Zhirui Yang ◽  
Hojjat Asadollahpour Nanaei ◽  
...  

Abstract Background: The thin-tailed sheep breeds from Europe and the fat-tailed sheep breeds from China exhibit distinct phenotypic differences in fat deposition and meat production traits. However, the molecular mechanisms underlying gene expression related to these phenotypic differences are not well understood. Allele-specific expression (ASE) refers to a significant imbalance between the expression levels of the two parental alleles. Characterization of such events in F1 hybrid offspring generated from these two groups of sheep breeds can minimize the external factors influencing gene expression and reveal the variants with a cis-regulatory effect on gene expression. The aim of the present study was to investigate the genetic factors that influence different fat deposition and meat production traits between thin- and fat-tailed sheep. Results: Fifteen F1 hybrids were generated from crosses between Texel and Kazakh sheep as the representative phenotypes of thin- and fat-tailed breeds, respectively. In total, 33 whole genomes from F1 individuals and their parents were sequenced with an average depth of ~17.21× coverage per sample. ASE analysis of 70 RNA-seq samples of adipose and skeletal muscle tissues showed that 128 ASE candidate genes were related to fat deposition and meat production traits. A genome-wide scan of selective sweeps was also conducted between these two groups of sheep breeds in an effort to identify genomic regions related to fat deposition and meat production, respectively. We detected signatures of selection in ASE genes associated with fat deposition (e.g., PDGFD) and meat production traits (e.g., LRCC2). Further analysis suggested that PDGFD and LRCC2 may be causative genes for fat deposition and meat production traits in sheep, respectively. Furthermore, the AMPK signaling pathway was significantly enriched in ASE genes related to fatty acid biosynthesis in both adipose and skeletal muscle tissues, while the PPAR signaling pathway was significantly enriched in ASE genes related to lipid metabolism in adipose tissue. Conclusions: Our findings illustrate that the expression of the identified ASE genes could potentially lead to the differences in fat deposition and meat production traits between thin- and fat-tailed sheep. Keywords: allele-specific expression, phenotypic difference, thin- and fat-tailed sheep, whole-genome sequencing, transcriptome


2019 ◽  
Author(s):  
Xinwen Zhang ◽  
J.J. Emerson

Abstract Gene expression variation between alleles in a diploid cell is mediated by variation in cis regulatory sequences, which usually refers to the differences in DNA sequence between two alleles near the gene of interest. Expression differences caused by cis variation have been estimated by the ratio of the expression levels of the two alleles under a binomial model. However, the binomial model underestimates the variance among replicated experiments, which exaggerates the statistical significance of estimated cis effects and thus produces many false discoveries of cis-affected genes. Here we describe a beta-binomial model that estimates the cis effect for each gene while permitting overdispersion of variance among replicates. We demonstrated with simulated null data (data without a true cis effect) that the new model fits the true distribution better, yielding a false positive rate of approximately 5% at the 5% significance level in all null datasets, considerably better than the 6%-40% false positive rate of the binomial model. Additional replicates increase the performance of the beta-binomial model but not of the binomial model. We also collected new allele-specific expression data from an experiment comprising 20 replicates of a yeast hybrid (YPS128/RM11-1a). We eliminated the mapping bias problem with de novo assemblies of the two parental genomes. By applying the beta-binomial model to this dataset, we found that cis effects are ubiquitous, affecting around 70% of genes. However, most of these changes are small in magnitude. The high number of replicates enabled a better approximation of the cis landscape within species and also provides a resource for future exploration of better models.
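The contrast between the two models can be made concrete with a small sketch of their log-likelihoods. This is not the authors' implementation: the function names, the mean/overdispersion parameterization (`p`, `rho`), and the replicate counts are assumptions chosen for illustration.

```python
from math import lgamma, log


def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def binom_loglik(counts, p):
    """Log-likelihood of (allele-1 reads, total reads) pairs under a binomial."""
    return sum(
        log_choose(n, k) + k * log(p) + (n - k) * log(1 - p)
        for k, n in counts
    )


def betabinom_loglik(counts, p, rho):
    """Same data under a beta-binomial with mean p and overdispersion rho.

    Assumed parameterization: alpha = p(1-rho)/rho, beta = (1-p)(1-rho)/rho,
    so rho -> 0 recovers the binomial and larger rho allows more
    replicate-to-replicate variance.
    """
    a = p * (1 - rho) / rho
    b = (1 - p) * (1 - rho) / rho
    return sum(
        log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)
        for k, n in counts
    )


# Made-up replicates of one gene with no true cis effect (p = 0.5) but
# more scatter among replicates than binomial sampling alone would give.
replicates = [(30, 100), (70, 100), (45, 100), (60, 100), (52, 100)]
print(binom_loglik(replicates, 0.5))
print(betabinom_loglik(replicates, 0.5, 0.05))
```

For these toy replicates the beta-binomial log-likelihood is higher, because the extra parameter absorbs variance that the binomial would otherwise have to treat as signal. In practice `p` and `rho` would be estimated per gene rather than fixed by hand.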


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
M. Joseph Tomlinson ◽  
Shawn W. Polson ◽  
Jing Qiu ◽  
Juniper A. Lake ◽  
William Lee ◽  
...  

Abstract Differential abundance of allelic transcripts in a diploid organism, commonly referred to as allele-specific expression (ASE), is a biologically significant phenomenon and can be examined using single nucleotide polymorphisms (SNPs) from RNA-seq. Quantifying ASE aids in our ability to identify and understand cis-regulatory mechanisms that influence gene expression, and thereby assists in identifying causal mutations. This study examines ASE in breast muscle, abdominal fat, and liver of commercial broiler chickens using variants called from a large subset of the samples (n = 68). ASE analysis was performed using custom software called VCF ASE Detection Tool (VADT), which detects ASE of biallelic SNPs using a binomial test. On average ~174,000 SNPs in each tissue passed our filtering criteria and were considered informative, of which ~24,000 (~14%) showed ASE. Of all ASE SNPs, only 3.7% exhibited ASE in all three tissues, with ~83% showing ASE specific to a single tissue. When ASE genes (genes containing ASE SNPs) were compared between tissues, the overlap among all three tissues increased to 20.1%. Our results indicate that ASE genes show tissue-specific enrichment patterns, but all three tissues showed enrichment for pathways involved in translation.
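A binomial test on a biallelic SNP, as mentioned in this abstract, can be sketched in a few lines. This is not the VADT code: the function name, the two-sided definition of the p-value, and the null allelic ratio of 0.5 are assumptions, and the abstract does not state the tool's filtering or significance thresholds.

```python
from math import exp, lgamma, log


def binom_test_two_sided(ref, alt, p=0.5):
    """Exact two-sided binomial test for allelic imbalance at one SNP.

    ref, alt: read counts supporting the reference and alternate allele.
    p: expected reference-allele fraction under no ASE (assumed 0.5).
    Returns the summed probability of every outcome that is no more
    likely than the observed one.
    """
    n = ref + alt

    def pmf(k):
        # Work in log space so that high-coverage SNPs do not overflow.
        log_prob = (
            lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p)
        )
        return exp(log_prob)

    probs = [pmf(k) for k in range(n + 1)]
    observed = probs[ref]
    # A small relative tolerance guards against floating-point ties.
    p_value = sum(x for x in probs if x <= observed * (1 + 1e-7))
    return min(1.0, p_value)


# Hypothetical SNPs: a balanced one and a fully imbalanced one.
print(binom_test_two_sided(5, 5))
print(binom_test_two_sided(10, 0))
```

When many SNPs are tested per tissue, the resulting p-values would normally need a multiple-testing correction before any SNP is called as showing ASE.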


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Asia Mendelevich ◽  
Svetlana Vinogradova ◽  
Saumya Gupta ◽  
Andrey A. Mironov ◽  
Shamil R. Sunyaev ◽  
...  

Abstract A sensitive approach to quantitative analysis of transcriptional regulation in diploid organisms is analysis of allelic imbalance (AI) in RNA sequencing (RNA-seq) data. A near-universal practice in such studies is to prepare and sequence only one library per RNA sample. We present theoretical and experimental evidence that data from a single RNA-seq library are insufficient for reliable quantification of the contribution of technical noise to the observed AI signal; consequently, reliance on a one-replicate experimental design can lead to unaccounted-for variation in error rates in allele-specific analysis. We develop a computational approach, Qllelic, that accurately accounts for technical noise by making use of replicate RNA-seq libraries. Testing on new and existing datasets shows that application of Qllelic greatly decreases the false positive rate in allele-specific analysis while conserving appropriate signal, and thus greatly improves reproducibility of AI estimates. We explore sources of technical overdispersion in the observed AI signal and conclude by discussing the design of RNA-seq studies addressing two biologically important questions: quantification of transcriptome-wide AI in one sample, and differential analysis of allele-specific expression between samples.


Genetics ◽  
2013 ◽  
Vol 195 (3) ◽  
pp. 1157-1166 ◽  
Author(s):  
Sandrine Lagarrigue ◽  
Lisa Martin ◽  
Farhad Hormozdiari ◽  
Pierre-François Roux ◽  
Calvin Pan ◽  
...  
