MPRAdecoder: Processing of the Raw MPRA Data With a priori Unknown Sequences of the Region of Interest and Associated Barcodes

2021 ◽  
Vol 12 ◽  
Author(s):  
Anna E. Letiagina ◽  
Evgeniya S. Omelina ◽  
Anton V. Ivankin ◽  
Alexey V. Pindyurin

Massively parallel reporter assays (MPRAs) enable high-throughput functional evaluation of numerous DNA regulatory elements and/or their mutant variants. The assays are based on the construction of reporter plasmid libraries containing two variable parts, a region of interest (ROI) and a barcode (BC), located outside and within the transcription unit, respectively. Importantly, each plasmid molecule in such a highly diverse library is characterized by a unique BC–ROI association. The reporter constructs are delivered to target cells, and expression of BCs at the transcript level is assayed by RT-PCR followed by next-generation sequencing (NGS). The obtained values are normalized to the abundance of BCs in the plasmid DNA sample. Altogether, this allows the regulatory potential of the associated ROI sequences to be evaluated. However, depending on the MPRA library construction design, the BC and ROI sequences as well as their associations can be a priori unknown. In such a case, the BC and ROI sequences, their possible mutant variants, and unambiguous BC–ROI associations have to be identified, whereas all uncertain cases have to be excluded from the analysis. Besides the preparation of additional “mapping” samples for NGS, this also requires specific bioinformatics tools. Here, we present a pipeline for processing raw MPRA data obtained by NGS for reporter construct libraries with a priori unknown sequences of BCs and ROIs. The pipeline robustly identifies unambiguous (so-called genuine) BCs and ROIs associated with them, calculates the normalized expression level for each BC and the averaged values for each ROI, and provides a graphical visualization of the processed data.
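The processing steps described in this abstract can be illustrated with a minimal Python sketch. It is not the MPRAdecoder implementation: the function names, the `min_reads` cutoff, the rule that a barcode is genuine only when all of its mapping reads agree on one ROI, and the toy read counts are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def genuine_associations(mapping_reads, min_reads=2):
    """Return {barcode: roi} for barcodes seen with exactly one ROI.

    Assumption: a barcode is kept only if every mapping read agrees on
    the same ROI and it has at least `min_reads` reads; all other
    barcodes are treated as uncertain and dropped.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for barcode, roi in mapping_reads:
        counts[barcode][roi] += 1
    genuine = {}
    for barcode, rois in counts.items():
        if len(rois) == 1 and sum(rois.values()) >= min_reads:
            genuine[barcode] = next(iter(rois))
    return genuine


def normalized_expression(genuine, rna_counts, dna_counts):
    """Per-barcode RNA/DNA ratio of read fractions, then the mean per ROI."""
    rna_total = sum(rna_counts.get(bc, 0) for bc in genuine)
    dna_total = sum(dna_counts.get(bc, 0) for bc in genuine)
    per_bc = {}
    if rna_total == 0:
        return per_bc, {}
    for bc in genuine:
        dna = dna_counts.get(bc, 0)
        if dna == 0:
            continue  # cannot normalize without plasmid DNA counts
        per_bc[bc] = (rna_counts.get(bc, 0) / rna_total) / (dna / dna_total)
    by_roi = defaultdict(list)
    for bc, value in per_bc.items():
        by_roi[genuine[bc]].append(value)
    return per_bc, {roi: mean(values) for roi, values in by_roi.items()}


# Hypothetical toy data: (barcode, ROI) pairs from a "mapping" sample.
mapping_reads = [
    ("AAAA", "roi1"), ("AAAA", "roi1"),
    ("CCCC", "roi1"), ("CCCC", "roi1"),
    ("GGGG", "roi2"), ("GGGG", "roi2"),
    ("TTTT", "roi1"), ("TTTT", "roi2"),  # ambiguous, will be excluded
]
rna_counts = {"AAAA": 30, "CCCC": 10, "GGGG": 60, "TTTT": 50}
dna_counts = {"AAAA": 10, "CCCC": 10, "GGGG": 20, "TTTT": 5}

genuine = genuine_associations(mapping_reads)
per_bc, per_roi = normalized_expression(genuine, rna_counts, dna_counts)
print(per_bc)
print(per_roi)
```

Normalizing read fractions rather than raw counts keeps the ratio comparable between samples sequenced to different depths; a real pipeline would also need to handle sequencing errors in the barcodes, which this sketch ignores.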

2015 ◽  
Author(s):  
Ilias Georgakopoulos-Soares ◽  
Naman Jain ◽  
Jesse Gray ◽  
Martin Hemberg

DNA regulatory elements contain short motifs where transcription factors (TFs) can bind to modulate gene expression. Although the broad principles of TF regulation are well understood, the rules that dictate how combinatorial TF binding translates into transcriptional activity remain largely unknown. With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as Massively Parallel Reporter Assays (MPRAs) and similar methods remains challenging. We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the user to investigate the rules that govern TF occupancy. MPRA SNP design can be used to investigate the functional effects of single SNPs or combinations of SNPs at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs.
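The four negative-control operations named at the end of this abstract (scrambling, reversing, complementing, and random mutation) can be sketched in a few lines of Python. This is an illustration only and does not reproduce MPRAnator's interface: the function names are hypothetical, and the code assumes uppercase sequences over the alphabet A, C, G, T.

```python
import random

# Translation table for base complementation (assumes uppercase ACGT).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def scramble(seq, rng):
    """Shuffle the bases, preserving base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def reverse(seq):
    """Reverse the sequence without complementing it."""
    return seq[::-1]


def complement(seq):
    """Complement each base without reversing the sequence."""
    return seq.translate(COMPLEMENT)


def mutate(seq, n, rng):
    """Substitute exactly n distinct positions with a different base."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n):
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)


rng = random.Random(0)  # fixed seed so the controls are reproducible
motif = "TGACGTCA"
controls = {
    "scrambled": scramble(motif, rng),
    "reversed": reverse(motif),
    "complemented": complement(motif),
    "mutated": mutate(motif, 3, rng),
}
print(controls)
```

Passing an explicit `random.Random` instance keeps the generated controls reproducible across runs, which matters when the designed library has to be re-created later.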


Genetics ◽  
2020 ◽  
Vol 217 (1) ◽  
Author(s):  
Jaclyn M Noshay ◽  
Alexandre P Marand ◽  
Sarah N Anderson ◽  
Peng Zhou ◽  
Maria Katherine Mejia Guerra ◽  
...  

Transposable elements (TEs) have the potential to create regulatory variation both through the disruption of existing DNA regulatory elements and through the creation of novel DNA regulatory elements. In a species with a large genome, such as maize, many TEs interspersed with genes create opportunities for significant allelic variation due to TE presence/absence polymorphisms among individuals. We used information on putative regulatory elements in combination with knowledge about TE polymorphisms in maize to identify TE insertions that interrupt existing accessible chromatin regions (ACRs) in B73 as well as examples of polymorphic TEs that contain ACRs among four inbred lines of maize including B73, Mo17, W22, and PH207. The TE insertions in three other assembled maize genomes (Mo17, W22, or PH207) that interrupt ACRs that are present in the B73 genome can trigger changes to the chromatin, suggesting the potential for both genetic and epigenetic influences of these insertions. Nearly 20% of the ACRs located over 2 kb from the nearest gene are located within an annotated TE. These are regions of unmethylated DNA that show evidence for functional importance similar to ACRs that are not present within TEs. Using a large panel of maize genotypes, we tested if there is an association between the presence of TE insertions that interrupt, or carry, an ACR and the expression of nearby genes. While most TE polymorphisms are not associated with expression for nearby genes, the TEs that carry ACRs exhibit enrichment for being associated with higher expression of nearby genes, suggesting that these TEs may contribute novel regulatory elements. These analyses highlight the potential for a subset of TEs to rewire transcriptional responses in eukaryotic genomes.


1993 ◽  
Vol 296 (3) ◽  
pp. 663-670 ◽  
Author(s):  
M F Wilkemeyer ◽  
E R Andrews ◽  
F D Ledley

Methylmalonyl-CoA mutase (MCM) is a nuclear-encoded mitochondrial matrix enzyme. We have reported characterization of murine MCM and cloning of a murine MCM cDNA and now describe the murine Mut locus, its promoter and evidence for tissue-specific variation in MCM mRNA, enzyme and holoenzyme levels. The Mut locus spans 30 kb and contains 13 exons constituting a unique transcription unit. A B1 repeat element was found in the 3′ untranslated region (exon 13). The transcription initiation site was identified and upstream sequences were shown to direct expression of a reporter gene in cultured cells. The promoter contains sequence motifs characteristic of: (1) TATA-less housekeeping promoters; (2) enhancer elements purportedly involved in co-ordinating expression of nuclear-encoded mitochondrial proteins; and (3) regulatory elements including CCAAT boxes, cyclic AMP-response elements and potential AP-2-binding sites. Northern blots demonstrate a greater than 10-fold variation in steady-state mRNA levels, which correlate with tissue levels of enzyme activity. However, the ratio of holoenzyme to total enzyme varies among different tissues, and there is no correlation between steady-state mRNA levels and holoenzyme activity. These results suggest that, although there may be regulation of MCM activity at the level of mRNA, the significance of genetic regulation is unclear owing to the presence of epigenetic regulation of holoenzyme formation.


eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Andrew R Bassett ◽  
Asifa Akhtar ◽  
Denise P Barlow ◽  
Adrian P Bird ◽  
Neil Brockdorff ◽  
...  

Although a small number of the vast array of animal long non-coding RNAs (lncRNAs) have known effects on cellular processes examined in vitro, the extent of their contributions to normal cell processes throughout development, differentiation and disease for the most part remains less clear. Phenotypes arising from deletion of an entire genomic locus cannot be unequivocally attributed either to the loss of the lncRNA per se or to the associated loss of other overlapping DNA regulatory elements. The distinction between cis- or trans-effects is also often problematic. We discuss the advantages and challenges associated with the current techniques for studying the in vivo function of lncRNAs in the light of different models of lncRNA molecular mechanism, and reflect on the design of experiments to mutate lncRNA loci. These considerations should assist in the further investigation of these transcriptional products of the genome.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Julius Judd ◽  
Hayley Sanderson ◽  
Cédric Feschotte

Background: Transposable elements are increasingly recognized as a source of cis-regulatory variation. Previous studies have revealed that transposons are often bound by transcription factors and some have been co-opted into functional enhancers regulating host gene expression. However, the process by which transposons mature into complex regulatory elements, like enhancers, remains poorly understood. To investigate this process, we examined the contribution of transposons to the cis-regulatory network controlling circadian gene expression in the mouse liver, a well-characterized network serving an important physiological function. Results: ChIP-seq analyses reveal that transposons and other repeats contribute ~14% of the binding sites for core circadian regulators (CRs) including BMAL1, CLOCK, PER1/2, and CRY1/2, in the mouse liver. RSINE1, an abundant murine-specific SINE, is the only transposon family enriched for CR binding sites across all datasets. Sequence analyses and reporter assays reveal that the circadian regulatory activity of RSINE1 stems from the presence of imperfect CR binding motifs in the ancestral RSINE1 sequence. These motifs matured into canonical motifs through point mutations after transposition. Furthermore, maturation occurred preferentially within elements inserted in the proximity of ancestral CR binding sites. RSINE1 also acquired motifs that recruit nuclear receptors known to cooperate with CRs to regulate circadian gene expression specifically in the liver. Conclusions: Our results suggest that the birth of enhancers from transposons is predicated both by the sequence of the transposon and by the cis-regulatory landscape surrounding their genomic integration site.


Circulation ◽  
2012 ◽  
Vol 125 (suppl_10) ◽  
Author(s):  
Christy L Avery ◽  
Praveen Sethupathy ◽  
Steven Buyske ◽  
Q. C He ◽  
Dan Y Lin ◽  
...  

The QT interval (QT) is a heritable trait and its prolongation is an established risk factor for ventricular tachyarrhythmia and sudden cardiac death. Most genetic studies of QT have examined populations of European ancestry, although the increased genetic diversity in populations of African descent provides opportunity for fine-mapping, which can help narrow association signals and identify candidates for functional characterization. We examined whether eleven previously identified QT loci comprising 6,681 variants on the Illumina Metabochip array were associated with QT in 7,516 African American participants from the Atherosclerosis Risk in Communities study and Women’s Health Initiative clinical trial. Among associated loci, we used conditional analyses and queried bioinformatics databases to identify and functionally categorize signals. We identified nine of the eleven QT loci in African American populations (P < 0.0045 under an additive genetic model adjusting for ancestry and demographic characteristics: NOS1AP, ATP1B1, SCN5A, SLC35F1, KCNH2, KCNQ1, LITAF, NDRG4, and RFFL). We also identified two independent secondary signals in NOS1AP and ATP1B1 (P < 7.4 × 10⁻⁶). Conditional analyses adjusting for published loci in European populations demonstrated that eight of these eleven SNPs (nine primary; two secondary) were independent of previously reported SNPs. We then performed the first bioinformatics-based functional characterization of QT loci using the eleven primary and secondary variants and SNPs in strong LD (r² > 0.5) among these African American participants. Only the SCN5A locus included a non-synonymous coding variant (rs1805124, H558R, r² = 0.7 with primary SNP rs9871385, P = 4.7 × 10⁻⁴). The remaining ten loci harbored variants located exclusively within non-coding regions. Specifically, three contained SNPs within candidate long-range regulatory elements in human cardiomyocytes, five were in or near annotated promoter regions, and the remaining two were in un-annotated, but highly conserved non-coding elements. Several of the QT risk alleles at these SNPs significantly alter the predicted binding affinity for transcription factors, such as TBX5 and AhR, which have been previously implicated in cardiac formation and function. In summary, the findings provide compelling evidence that the same genes influence variation in QT across global populations and that additional, independent signals exist in African Americans. Moreover, of those SNPs identified as strong candidates for functional evaluation, the majority implicate gene regulatory dysfunction in QT prolongation.
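The abstract does not say how its two significance thresholds were chosen. As a hedged reading, both are numerically close to a Bonferroni correction of a 0.05 significance level, once for the eleven loci and once for the 6,681 variants. The short Python check below shows that arithmetic; the 0.05 level and the Bonferroni interpretation are assumptions, not statements from the source.

```python
# Assumption: a family-wise significance level of 0.05 (not stated in the abstract).
alpha = 0.05
n_loci = 11        # previously identified QT loci (from the abstract)
n_variants = 6681  # Metabochip variants across those loci (from the abstract)

# Bonferroni correction divides alpha by the number of tests.
locus_threshold = alpha / n_loci
variant_threshold = alpha / n_variants

print(f"per-locus threshold:   {locus_threshold:.4g}")
print(f"per-variant threshold: {variant_threshold:.3g}")
```

Under this reading, the primary signals were tested at the locus level and the secondary signals at the stricter variant level.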


2000 ◽  
Vol 20 (16) ◽  
pp. 6040-6050 ◽  
Author(s):  
Jorge A. Iñiguez-Lluhí ◽  
David Pearce

DNA regulatory elements frequently harbor multiple recognition sites for several transcriptional activators. The response mounted from such compound response elements is often more pronounced than the simple sum of effects observed at single binding sites. The determinants of such transcriptional synergy and its control, however, are poorly understood. Through a genetic approach, we have uncovered a novel protein motif that limits the transcriptional synergy of multiple DNA-binding regulators. Disruption of these conserved synergy control motifs (SC motifs) selectively increases activity at compound, but not single, response elements. Although isolated SC motifs do not regulate transcription when tethered to DNA, their transfer to an activator lacking them is sufficient to impose limits on synergy. Mechanistic analysis of the two SC motifs found in the glucocorticoid receptor N-terminal region reveals that they function irrespective of the arrangement of the receptor binding sites or their distance from the transcription start site. Proper function, however, requires the receptor's ligand-binding domain and an engaged dimer interface. Notably, the motifs are not functional in yeast and do not alter the effect of p160 coactivators, suggesting that they require other nonconserved components to operate. Many activators across multiple classes harbor seemingly unrelated negative regulatory regions. The presence of SC motifs within them, however, suggests a common function and identifies SC motifs as critical elements of a general mechanism to modulate higher-order interactions among transcriptional regulators.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Sinisa Hrvatin ◽  
Christopher P Tzeng ◽  
M Aurel Nagy ◽  
Hume Stroud ◽  
Charalampia Koutsioumpa ◽  
...  

Enhancers are the primary DNA regulatory elements that confer cell type specificity of gene expression. Recent studies characterizing individual enhancers have revealed their potential to direct heterologous gene expression in a highly cell-type-specific manner. However, it has not yet been possible to systematically identify and test the function of enhancers for each of the many cell types in an organism. We have developed PESCA, a scalable and generalizable method that leverages ATAC- and single-cell RNA-sequencing protocols, to characterize cell-type-specific enhancers that should enable genetic access and perturbation of gene function across mammalian cell types. Focusing on the highly heterogeneous mammalian cerebral cortex, we apply PESCA to find enhancers and generate viral reagents capable of accessing and manipulating a subset of somatostatin-expressing cortical interneurons with high specificity. This study demonstrates the utility of this platform for developing new cell-type-specific viral reagents, with significant implications for both basic and translational research.


2020 ◽  
Vol 2 (Supplement_4) ◽  
pp. iv3-iv14
Author(s):  
Niha Beig ◽  
Kaustav Bera ◽  
Pallavi Tiwari

Neuro-oncology largely consists of malignancies of the brain and central nervous system, including both primary and metastatic tumors. Currently, a significant clinical challenge in neuro-oncology is to tailor therapies for patients based on a priori knowledge of their survival outcome or treatment response to conventional or experimental therapies. Radiomics, or the quantitative extraction of subvisual data from conventional radiographic imaging, has recently emerged as a powerful data-driven approach to offer insights into clinically relevant questions related to diagnosis, prediction, prognosis, and assessment of treatment response. Furthermore, radiogenomic approaches provide a mechanism to establish statistical correlations of radiomic features with point mutations and next-generation sequencing data to further leverage the potential of routine MRI scans to serve as “virtual biopsy” maps. In this review, we provide an introduction to radiomic and radiogenomic approaches in neuro-oncology, including a brief description of the workflow involving preprocessing, tumor segmentation, and extraction of “hand-crafted” features from the segmented region of interest, as well as identifying radiogenomic associations that could ultimately lead to the development of reliable prognostic and predictive models in neuro-oncology applications. Lastly, we discuss the promise of radiomic and radiogenomic approaches in personalizing treatment decisions in neuro-oncology, as well as the challenges with clinical adoption, which will rely heavily on their demonstrated resilience to nonstandardization in imaging protocols across sites and scanners, as well as on their ability to demonstrate reproducibility across large multi-institutional cohorts.


2021 ◽  
Vol 8 (3) ◽  
pp. 741-748
Author(s):  
Farah Afiqah Baharuddin ◽  
Zhan Xuan Khong ◽  
Zamri Zainal ◽  
Noor Liyana Sukiran

Auxin Binding Protein 57 (ABP57) is one of the molecular components involved in the rice response to abiotic stress. The ABP57 gene encodes an auxin receptor that functions in activating the plasma membrane H+-ATPase. The biochemical properties of ABP57 have been characterized; however, knowledge of the function of ABP57, particularly in stress and hormone responses, is still limited. This study was conducted to understand the regulation of ABP57 expression under abiotic stress. To this end, in silico identification of cis-acting regulatory elements (CAREs) in the promoter region of ABP57 was performed. Several motifs and transcription factor binding sites (TFBSs) that are involved in abiotic stress, such as ABRE, DRE, AP2/EREBP, WRKY and NAC, were identified. Next, expression analysis of ABP57 under drought, salt, auxin (IAA) and abscisic acid (ABA) treatments was conducted by reverse transcription-PCR (RT-PCR) to verify the effect of these treatments on the ABP57 transcript level. ABP57 was expressed at different levels in the shoot and root under drought conditions, and its expression was increased under IAA and ABA treatments. Moreover, our results showed that ABP57 expression in the root was more responsive to drought, auxin and ABA treatments than its expression in the shoot. This finding suggests that ABP57 is a drought-responsive gene that is possibly regulated by IAA and ABA.
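The in silico identification of cis-acting elements described here amounts to scanning a promoter sequence for known motif patterns. The Python sketch below illustrates the idea and is not the tool used in the study. The core patterns (ACGTG for ABRE, [AG]CCGAC for DRE, TTGAC for the W-box bound by WRKY factors) are commonly cited consensus cores chosen as assumptions, the promoter string is a made-up toy sequence rather than the ABP57 promoter, and only the forward strand is scanned.

```python
import re

# Assumed consensus cores for three stress-related elements (illustrative only).
MOTIFS = {
    "ABRE": "ACGTG",
    "DRE": "[AG]CCGAC",
    "W-box (WRKY)": "TTGAC",
}


def scan_promoter(promoter):
    """Return (motif name, 0-based start, matched text) for every hit.

    A lookahead pattern is used so that overlapping matches are reported.
    """
    seq = promoter.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence, not the real ABP57 promoter.
toy_promoter = "TTACGTGGCAACCGACTTTGACC"
hits = scan_promoter(toy_promoter)
for name, start, text in hits:
    print(name, start, text)
```

A fuller analysis would also scan the reverse complement and use a curated motif database rather than a hand-written dictionary.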

