In vivo mRNA display enables large-scale proteomics by next generation sequencing

2020 ◽  
Vol 117 (43) ◽  
pp. 26710-26718 ◽  
Author(s):  
Panos Oikonomou ◽  
Roberto Salatino ◽  
Saeed Tavazoie

Large-scale proteomic methods are essential for the functional characterization of proteins in their native cellular context. However, proteomics has lagged far behind genomic approaches in scalability, standardization, and cost. Here, we introduce in vivo mRNA display, a technology that converts a variety of proteomics applications into a DNA sequencing problem. In vivo-expressed proteins are coupled with their encoding messenger RNAs (mRNAs) via a high-affinity stem-loop RNA binding domain interaction, enabling high-throughput identification of proteins with high sensitivity and specificity by next generation DNA sequencing. We have generated a high-coverage in vivo mRNA display library of the Saccharomyces cerevisiae proteome and demonstrated its potential for characterizing subcellular localization and interactions of proteins expressed in their native cellular context. In vivo mRNA display libraries promise to circumvent the limitations of mass spectrometry-based proteomics and leverage the exponentially improving cost and throughput of DNA sequencing to systematically characterize native functional proteomes.

2020 ◽  
Author(s):  
Michael W J Hall ◽  
David Shorthouse ◽  
Philip H Jones ◽  
Benjamin A Hall

Abstract The recent development of highly sensitive DNA sequencing techniques has enabled the detection of large numbers of missense mutations in genes, including NOTCH1 and NOTCH2, in ageing normal tissues. Driver mutations persist and propagate in the tissue through a selective advantage over both wild-type cells and alternative mutations. This process of selection can be considered as a large-scale, in vivo screen for mutations that increase clone fitness. It follows that the specific missense mutations that are observed in individual genes may offer us insights into structure-function relationships. Here we show that the positively selected missense mutations in NOTCH1 and NOTCH2 in human oesophageal epithelium cause inactivation predominantly through protein misfolding. Once these mutations are excluded, we further find statistically significant evidence for selection at the ligand binding interface and calcium binding sites. In this, we observe stronger evidence of selection at the ligand interface on EGF12 over EGF11, suggesting that in this tissue EGF12 may play a more important role in ligand interaction. Finally, we show how a mutation hotspot in the NOTCH1 transmembrane helix arises through the intersection of both a high mutation rate and residue conservation. Together these insights offer a route to understanding the mechanism of protein function through in vivo mutant selection.


2020 ◽  
Vol 48 (8) ◽  
pp. 4507-4520 ◽  
Author(s):  
Smriti Pandey ◽  
Chandra M Gravel ◽  
Oliver M Stockert ◽  
Clara D Wang ◽  
Courtney L Hegner ◽  
...  

Abstract The FinO-domain-protein ProQ is an RNA-binding protein that has been known to play a role in osmoregulation in proteobacteria. Recently, ProQ has been shown to act as a global RNA-binding protein in Salmonella and Escherichia coli, binding to dozens of small RNAs (sRNAs) and messenger RNAs (mRNAs) to regulate mRNA-expression levels through interactions with both 5′ and 3′ untranslated regions (UTRs). Despite excitement around ProQ as a novel global RNA-binding protein, and its potential to serve as a matchmaking RNA chaperone, significant gaps remain in our understanding of the molecular mechanisms ProQ uses to interact with RNA. In order to apply the tools of molecular genetics to this question, we have adapted a bacterial three-hybrid (B3H) assay to detect ProQ’s interactions with target RNAs. Using domain truncations, site-directed mutagenesis and an unbiased forward genetic screen, we have identified a group of highly conserved residues on ProQ’s NTD as the primary face for in vivo recognition of two RNAs, and propose that the NTD structure serves as an electrostatic scaffold to recognize the shape of an RNA duplex.


2019 ◽  
Vol 48 (1) ◽  
pp. 1-18 ◽  
Author(s):  
Celia Blanco ◽  
Evan Janzen ◽  
Abe Pressman ◽  
Ranajay Saha ◽  
Irene A. Chen

The function of fitness (or molecular activity) in the space of all possible sequences is known as the fitness landscape. Evolution is a random walk on the fitness landscape, with a bias toward climbing hills. Mapping the topography of real fitness landscapes is fundamental to understanding evolution, but previous efforts were hampered by the difficulty of obtaining large, quantitative data sets. The accessibility of high-throughput sequencing (HTS) has transformed this study, enabling large-scale enumeration of fitness for many mutants and even complete sequence spaces in some cases. We review the progress of high-throughput studies in mapping molecular fitness landscapes, both in vitro and in vivo, as well as opportunities for future research. Such studies are rapidly growing in number. HTS is expected to have a profound effect on the understanding of real molecular fitness landscapes.
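The picture of evolution as a random walk biased toward climbing hills can be made concrete with a minimal sketch. Everything here is a toy assumption for illustration and is not taken from the review: the four-letter RNA alphabet, the hypothetical `toy_fitness` function (number of positions matching an arbitrary target), and the `accept_downhill` parameter that controls how strongly the walk is biased uphill.

```python
import random

ALPHABET = "ACGU"  # assumed RNA alphabet for this toy example


def toy_fitness(seq, target="GGAUCC"):
    """Hypothetical fitness: number of positions that match an arbitrary target."""
    return sum(a == b for a, b in zip(seq, target))


def biased_walk(start, steps, accept_downhill=0.1, seed=0):
    """Random walk over sequence space with a bias toward higher fitness.

    Each step proposes one random point mutation. Uphill and neutral moves
    are always accepted; downhill moves are accepted with probability
    `accept_downhill`. Returns the list of visited sequences.
    """
    rng = random.Random(seed)
    seq = start
    path = [seq]
    for _ in range(steps):
        pos = rng.randrange(len(seq))
        base = rng.choice([b for b in ALPHABET if b != seq[pos]])
        mutant = seq[:pos] + base + seq[pos + 1:]
        not_worse = toy_fitness(mutant) >= toy_fitness(seq)
        if not_worse or rng.random() < accept_downhill:
            seq = mutant
        path.append(seq)
    return path


if __name__ == "__main__":
    walk = biased_walk("AAAAAA", steps=200)
    print(walk[0], toy_fitness(walk[0]))
    print(walk[-1], toy_fitness(walk[-1]))
```

Setting `accept_downhill=0.0` gives a strict adaptive walk that never loses fitness, while `accept_downhill=1.0` gives an unbiased random walk; real landscapes measured by HTS would replace `toy_fitness` with experimentally determined values.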


2020 ◽  
Author(s):  
DL Demy ◽  
ML Campanari ◽  
R Munoz-Ruiz ◽  
HD Durham ◽  
BJ Gentil ◽  
...  

Abstract Neurofilaments (NFs), a major cytoskeletal component of motor neurons, play a key role in their differentiation, establishment and maintenance of their morphology and mechanical strength. The de novo assembly of these neuronal intermediate filaments requires the presence of the neurofilament light subunit, NEFL, whose expression is reduced in motor neurons in Amyotrophic Lateral Sclerosis (ALS). This study used zebrafish as a model to characterize the NEFL homologue neflb, which encodes two different isoforms via splicing of the primary transcript (neflbE4 and neflbE3). In vivo imaging showed that neflb is crucial for proper neuronal development, and that disrupting the balance between its two isoforms specifically affects NF assembly and motor axon growth, with resulting motor deficits. This equilibrium is also disrupted upon partial depletion of TDP-43, an RNA-binding protein that is mislocalized into cytoplasmic inclusions in ALS. The study supports interaction of NEFL expression and splicing with TDP-43 in a common pathway, both biologically and pathogenetically.


Cells ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 1238
Author(s):  
Doris Lou Demy ◽  
Maria Letizia Campanari ◽  
Raphael Munoz-Ruiz ◽  
Heather D. Durham ◽  
Benoit J. Gentil ◽  
...  

Neurofilaments (NFs), a major cytoskeletal component of motor neurons, play a key role in the differentiation, establishment and maintenance of their morphology and mechanical strength. The de novo assembly of these neuronal intermediate filaments requires the presence of the neurofilament light subunit (NEFL), whose expression is reduced in motor neurons in amyotrophic lateral sclerosis (ALS). This study used zebrafish as a model to characterize the NEFL homologue neflb, which encodes two different isoforms via splicing of the primary transcript (neflbE4 and neflbE3). In vivo imaging showed that neflb is crucial for proper neuronal development, and that disrupting the balance between its two isoforms specifically affects NF assembly and motor axon growth, with resultant motor deficits. This equilibrium is also disrupted upon the partial depletion of TDP-43 (TAR DNA-binding protein 43), an RNA-binding protein encoded by the gene TARDBP that is mislocalized into cytoplasmic inclusions in ALS. The study supports the interaction of NEFL expression and splicing with TDP-43 in a common pathway, both biologically and pathogenetically.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Shitao Zhao ◽  
Michiaki Hamada

Abstract

Background: Protein-RNA interactions play key roles in many processes regulating gene expression. To understand the underlying binding preference, ultraviolet cross-linking and immunoprecipitation (CLIP)-based methods have been used to identify the binding sites for hundreds of RNA-binding proteins (RBPs) in vivo. Using these large-scale experimental data to infer RNA binding preference and predict missing binding sites has become a great challenge. Some existing deep-learning models have demonstrated high prediction accuracy for individual RBPs. However, it remains difficult to avoid significant bias due to the experimental protocol. The DeepRiPe method was recently developed to solve this problem by introducing multi-task or multi-label learning into this field. However, this method has not reached an ideal level of prediction power due to the weak neural network architecture.

Results: Compared to the DeepRiPe approach, our Multi-resBind method demonstrated substantial improvements using the same large-scale PAR-CLIP dataset with respect to an increase in the area under the receiver operating characteristic curve and average precision. We conducted extensive experiments to evaluate the impact of various types of input data on the final prediction accuracy. The same approach was used to evaluate the effect of loss functions. Finally, a modified integrated gradient was employed to generate attribution maps. The patterns disentangled from relative contributions according to context offer biological insights into the underlying mechanism of protein-RNA interactions.

Conclusions: Here, we propose Multi-resBind as a new multi-label deep-learning approach to infer protein-RNA binding preferences and predict novel interactions. The results clearly demonstrate that Multi-resBind is a promising tool to predict unknown binding sites in vivo and gain biological insights into why the neural network makes a given prediction.
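The two evaluation metrics named in the abstract, area under the receiver operating characteristic curve (AUROC) and average precision, can be sketched in plain Python. This is a minimal illustration of the standard definitions for a single binary label, not the authors' evaluation code; the function names and the example labels and scores are made up. In a multi-label setting such as one label per RBP, the same functions would presumably be applied per label and then averaged.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive.

    Examples are ranked by descending score; tied scores are not treated specially.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


if __name__ == "__main__":
    # Made-up binding labels (1 = bound) and predicted scores for one RBP.
    y_true = [1, 0, 1, 0]
    y_score = [0.9, 0.8, 0.7, 0.1]
    print(auroc(y_true, y_score))
    print(average_precision(y_true, y_score))
```

The pairwise AUROC loop is quadratic in the number of examples, which is fine for a toy case; on CLIP-scale data a rank-based implementation from an established library would be the more practical choice.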


Author(s):  
Pierre Taberlet ◽  
Aurélie Bonin ◽  
Lucie Zinger ◽  
Eric Coissac

The emergence of eDNA analysis is tightly linked to the development of next-generation sequencing. Chapter 7 “DNA sequencing” gives an overview of the characteristics and limitations of the main next-generation sequencing platforms. It focuses particularly on the Illumina platform, which is the only technology currently suitable for large-scale analysis with hundreds to thousands of samples. More specifically, Chapter 7 describes the Illumina library preparation process, the generation of sequencing clusters by bridge PCR on the flow cell, and the sequencing reaction itself, based on sequencing by synthesis. Finally, detailed information is provided on the meaning and coding of quality scores of the sequencing reads.
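The meaning and coding of quality scores can be illustrated with a short sketch. It assumes the Phred+33 ASCII encoding commonly used in FASTQ files from current Illumina instruments; the offset of 33 and the function names are assumptions for illustration and are not taken from the chapter. A Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10).

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes each character encodes one score as ASCII code minus `offset`.
    """
    return [ord(ch) - offset for ch in quality_string]


def error_probabilities(scores):
    """Convert Phred scores Q into error probabilities 10 ** (-Q / 10)."""
    return [10 ** (-q / 10) for q in scores]


if __name__ == "__main__":
    quality = "II5#"  # made-up quality string for a four-base read
    scores = phred_scores(quality)
    for ch, q, p in zip(quality, scores, error_probabilities(scores)):
        print(ch, q, p)
```

Older Illumina pipelines used a different offset, so the `offset` parameter should be checked against the instrument and software version before decoding real reads.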

