Statistical considerations for the analysis of massively parallel reporter assays data

2020 ◽  
Vol 44 (7) ◽  
pp. 785-794
Author(s):  
Dandi Qiao ◽  
Corwin M. Zigler ◽  
Michael H. Cho ◽  
Edwin K. Silverman ◽  
Xiaobo Zhou ◽  
...  
PLoS ONE ◽  
2019 ◽  
Vol 14 (6) ◽  
pp. e0218073 ◽  
Author(s):  
Rajiv Movva ◽  
Peyton Greenside ◽  
Georgi K. Marinov ◽  
Surag Nair ◽  
Avanti Shrikumar ◽  
...  

2019 ◽  
Vol 40 (9) ◽  
pp. 1299-1313 ◽  
Author(s):  
Anat Kreimer ◽  
Zhongxia Yan ◽  
Nadav Ahituv ◽  
Nir Yosef

2019 ◽  
Vol 15 ◽  
pp. P628-P628
Author(s):  
Karen Nuytemans ◽  
Derek J. van Booven ◽  
Natalia K. Hofmann ◽  
Farid Rajabli ◽  
Anthony J. Griswold ◽  
...  

2018 ◽  
Author(s):  
Rajiv Movva ◽  
Peyton Greenside ◽  
Georgi K. Marinov ◽  
Surag Nair ◽  
Avanti Shrikumar ◽  
...  

Abstract: The relationship between noncoding DNA sequence and gene expression is not well-understood. Massively parallel reporter assays (MPRAs), which quantify the regulatory activity of large libraries of DNA sequences in parallel, are a powerful approach to characterize this relationship. We present MPRA-DragoNN, a convolutional neural network (CNN)-based framework to predict and interpret the regulatory activity of DNA sequences as measured by MPRAs. While our method is generally applicable to a variety of MPRA designs, here we trained our model on the Sharpr-MPRA dataset that measures the activity of ~500,000 constructs tiling 15,720 regulatory regions in human K562 and HepG2 cell lines. MPRA-DragoNN predictions were moderately correlated (Spearman ρ = 0.28) with measured activity and were within range of replicate concordance of the assay. State-of-the-art model interpretation methods revealed high-resolution predictive regulatory sequence features that overlapped transcription factor (TF) binding motifs. We used the model to investigate the cell type and chromatin state preferences of predictive TF motifs. We explored the ability of our model to predict the allelic effects of regulatory variants in an independent MPRA experiment and fine map putative functional SNPs in loci associated with lipid traits. Our results suggest that interpretable deep learning models trained on MPRA data have the potential to reveal meaningful patterns in regulatory DNA sequences and prioritize regulatory genetic variants, especially as larger, higher-quality datasets are produced.
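The abstract above reports agreement between predicted and measured activity as a Spearman rank correlation. As a minimal sketch of what that statistic computes (not the authors' evaluation code), the hypothetical helpers `rank` and `spearman` below rank both vectors, averaging ranks over ties, and then take the Pearson correlation of the ranks. The example values are made up for illustration.

```python
import numpy as np

def rank(values):
    """Return 1-based ranks of `values`, averaging ranks over tied entries."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks of tied values with their mean rank.
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks of x and y."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])

# Illustrative (made-up) measured and predicted activities.
measured = [0.1, 0.4, 0.35, 0.8]
predicted = [1.0, 8.0, 27.0, 64.0]
rho = spearman(measured, predicted)
```

Because only ranks enter the calculation, any monotone rescaling of the predictions leaves the value unchanged, which is one reason rank correlation is a common choice for noisy assay readouts.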


2019 ◽  
Author(s):  
Daniel Esposito ◽  
Jochen Weile ◽  
Jay Shendure ◽  
Lea M Starita ◽  
Anthony T Papenfuss ◽  
...  

Abstract: Multiplex Assays of Variant Effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here we present MaveDB, a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first of these applications, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.


2021 ◽  
Author(s):  
Anat Kreimer ◽  
Tal Ashuach ◽  
Fumitaka Inoue ◽  
Alex Khodaverdian ◽  
Nir Yosef ◽  
...  

Abstract: Gene regulatory elements play a key role in orchestrating gene expression during cellular differentiation, but what determines their function over time remains largely unknown. Here, we performed perturbation-based massively parallel reporter assays at seven early time points of neural differentiation to systematically characterize how regulatory elements and motifs within them guide cellular differentiation. By perturbing over 2,000 putative DNA binding motifs in active regulatory regions, we delineated four categories of functional elements, and observed that activity direction is mostly determined by the sequence itself, while the magnitude of effect depends on the cellular environment. We also find that fine-tuning transcription rates is often achieved by a combined activity of adjacent activating and repressing elements. Our work provides a blueprint for the sequence components needed to induce different transcriptional patterns in general and specifically during neural differentiation.


2017 ◽  
Author(s):  
Joe Paggi ◽  
Andrew Lamb ◽  
Kevin Tian ◽  
Irving Hsu ◽  
Pierre-Louis Cedoz ◽  
...  

Abstract: Massively parallel reporter assays (MPRAs) are a method to probe the effects of short sequences on transcriptional regulatory activity. In an MPRA, short sequences are extracted from suspected regulatory regions, inserted into reporter plasmids, and transfected into cell types of interest, and the transcriptional activity of each reporter is then assayed. Recently, Ernst et al. presented MPRA data covering 15750 putative regulatory regions. We trained a multitask convolutional neural network on these sequence expression readouts to predict expression levels across four combinations of cell types and promoters. The model allows importance scores to be assigned to each base through in silico mutagenesis, and the resulting importance scores correlated well with regions enriched for conservation and transcription factor binding.
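The in silico mutagenesis step mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: `toy_model` is a hypothetical stand-in for a trained network (it simply scores the best match to a fixed motif, assumed here to be `GATA`), and `one_hot`, `in_silico_mutagenesis`, and `importance` are illustrative names. The idea is to substitute every base at every position, re-score the sequence, and record the change relative to the unmutated sequence.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot array (columns A, C, G, T)."""
    out = np.zeros((len(seq), 4))
    out[np.arange(len(seq)), [BASES.index(b) for b in seq]] = 1.0
    return out

def toy_model(x):
    """Hypothetical scorer standing in for a trained CNN:
    the best per-window match count against the assumed motif 'GATA'."""
    motif = one_hot("GATA")
    return max(np.sum(x[i:i + 4] * motif) for i in range(len(x) - 3))

def in_silico_mutagenesis(seq, model):
    """Return a (length, 4) array of score changes for every single-base substitution."""
    x = one_hot(seq)
    ref_score = model(x)
    deltas = np.zeros((len(seq), 4))
    for pos in range(len(seq)):
        for b in range(4):
            mutant = x.copy()
            mutant[pos] = 0.0      # clear the original base
            mutant[pos, b] = 1.0   # substitute base b
            deltas[pos, b] = model(mutant) - ref_score
    return deltas

def importance(deltas):
    """Per-base importance: the largest absolute score change at each position."""
    return np.abs(deltas).max(axis=1)

deltas = in_silico_mutagenesis("CCGATACC", toy_model)
scores = importance(deltas)
```

With this toy scorer, only the four positions covering the `GATA` match receive non-zero importance, since substitutions in the flanks leave the best window untouched. A real model would be a trained network called in place of `toy_model`, and other summaries of `deltas` (for example the mean change per position) are equally plausible choices.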


2019 ◽  
Author(s):  
Jason Klein ◽  
Vikram Agarwal ◽  
Fumitaka Inoue ◽  
Aidan Keith ◽  
Beth Martin ◽  
...  

Abstract: Massively parallel reporter assays (MPRAs) functionally screen thousands of sequences for regulatory activity in parallel. Although MPRAs have been applied to address diverse questions in gene regulation, there has been no systematic comparison of how differences in experimental design influence findings. Here, we screen a library of 2,440 sequences, representing candidate liver enhancers and controls, in HepG2 cells for regulatory activity using nine different approaches (including conventional episomal, STARR-seq, and lentiviral MPRA designs). We identify subtle but significant differences in the resulting measurements that correlate with epigenetic and sequence-level features. We also test this library in both orientations with respect to the promoter, validating en masse that enhancer activity is robustly independent of orientation. Finally, we develop and apply a novel method to assemble and functionally test libraries of the same putative enhancers as 192-mers, 354-mers, and 678-mers, and observe surprisingly large differences in functional activity. This work provides a framework for the experimental design of high-throughput reporter assays, suggesting that the extended sequence context of tested elements, and to a lesser degree the precise assay, influence MPRA results.

