Resolving Missing Protein Problems Using Functional Class Scoring

Author(s):  
Bertrand Jernhan Wong ◽  
Weijia Kong ◽  
Limsoon Wong ◽  
Wilson Wen Bin Goh

Abstract Despite technological advances in proteomics, incomplete coverage and inconsistency issues persist, resulting in “data holes”. These data holes cause the missing protein problem (MPP), where relevant proteins are persistently unobserved, or sporadically observed across samples. This hinders biomarker and drug discovery from proteomics data. Network-based approaches are powerful: the Functional Class Scoring (FCS) method using protein complexes was able to easily recover missed proteins with weak or partial support. However, there are limitations: the verification approach (in determining missing protein recovery) is potentially biased, as the test data was based on relatively outdated Data-Dependent Acquisition (DDA) proteomics, and FCS does not provide a scoring scheme for individual protein components (in significant complexes). To address these issues, we first devised a more rigorous evaluation of FCS based on same-sample technical replicates. Second, we evaluated FCS using data from more recent Data-Independent Acquisition (DIA) technologies (viz. SWATH). Although cross-replicate examination reveals some inconsistencies amongst same-class samples, tissue-differentiating signal is nonetheless strongly conserved. This confirms FCS as a viable method that selects biologically meaningful networks. We also report that predicted missing proteins are statistically significant based on FCS p-values. Although cross-replicate verification rates are not spectacular, the predicted missing proteins, as a whole, have higher peptide support than non-predicted proteins. FCS also has the capacity to predict missing proteins that are often lost due to weak specific peptide support. As an as-yet unresolved limitation, we find that FCS cannot assign meaningful probabilities to individual protein components (there is no relationship between the actual probability of verification and the FCS-assigned probability), as it only provides a p-value at the level of complexes.

2019 ◽  
Vol 17 (02) ◽  
pp. 1950013
Author(s):  
Yaxing Zhao ◽  
Andrew Chi-Hau Sue ◽  
Wilson Wen Bin Goh

Functional Class Scoring (FCS) is a network-based approach previously demonstrated to be powerful in missing protein prediction (MPP). We updated its performance evaluation using data derived from new proteomics technology (SWATH) and also checked for reproducibility using two independent datasets profiling the kidney tissue proteome. We also evaluated the objectivity of the FCS p-value, and followed up on the value of MPP from predicted complexes. Our results suggest that (1) FCS p-values are non-objective, and are confounded strongly by complex size; (2) the best recovery performance does not necessarily lie at standard p-value cutoffs; (3) while predicted complexes may be used for augmenting MPP, they are inferior to real complexes, and are further confounded by issues relating to network coverage and quality; and (4) moderate-sized complexes of size 5 to 10 still exhibit considerable instability, and FCS works best with big complexes. While FCS is a powerful approach, blind reliance on its non-objective p-value is ill-advised.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Yalan Xu ◽  
Xiuyue Song ◽  
Dong Wang ◽  
Yin Wang ◽  
Peifeng Li ◽  
...  

Abstract Chemical synapses in the brain connect neurons to form neural circuits, providing the structural and functional bases for neural communication. Disrupted synaptic signaling is closely related to a variety of neurological and psychiatric disorders. In the past two decades, proteomics has blossomed as a versatile tool in biological and biomedical research, rendering a wealth of information toward decoding the molecular machinery of life. There is enormous interest in employing proteomic approaches for the study of synapses, and substantial progress has been made. Here, we review the findings of proteomic studies of chemical synapses in the brain, with special attention paid to the key players in synaptic signaling, i.e., the synaptic protein complexes and their post-translational modifications. Looking toward the future, we discuss the technological advances in proteomics such as data-independent acquisition mass spectrometry (DIA-MS), cross-linking in combination with mass spectrometry (CXMS), and proximity proteomics, along with their potential to untangle the mystery of how the brain functions at the molecular level. Last but not least, we introduce the newly developed synaptomic methods. These methods and their successful applications marked the beginnings of the synaptomics era.


2021 ◽  
Vol 7 (1) ◽  
pp. 11 ◽  
Author(s):  
André P. Gerber

RNA–protein interactions frame post-transcriptional regulatory networks and modulate transcription and epigenetics. While the technological advances in RNA sequencing have significantly expanded the repertoire of RNAs, recently developed biochemical approaches combined with sensitive mass-spectrometry have revealed hundreds of previously unrecognized and potentially novel RNA-binding proteins. Nevertheless, a major challenge remains to understand how the thousands of RNA molecules and their interacting proteins assemble and control the fate of each individual RNA in a cell. Here, I review recent methodological advances to approach this problem through systematic identification of proteins that interact with particular RNAs in living cells. Thereby, a specific focus is given to in vivo approaches that involve crosslinking of RNA–protein interactions through ultraviolet irradiation or treatment of cells with chemicals, followed by capture of the RNA under study with antisense-oligonucleotides and identification of bound proteins with mass-spectrometry. Several recent studies defining interactomes of long non-coding RNAs, viral RNAs, as well as mRNAs are highlighted, and short reference is given to recent in-cell protein labeling techniques. These recent experimental improvements could open the door for broader applications and to study the remodeling of RNA–protein complexes upon different environmental cues and in disease.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Leyla A. Erozenci ◽  
Sander R. Piersma ◽  
Thang V. Pham ◽  
Irene V. Bijnsdorp ◽  
Connie R. Jimenez

Abstract The protein content of urinary extracellular vesicles (EVs) is considered to be an attractive non-invasive biomarker source. However, little is known about the consistency and variability of urinary EV proteins within and between individuals over a longer time-period. Here, we evaluated the stability of the urinary EV proteomes of 8 healthy individuals at 9 timepoints over 6 months using data-independent-acquisition mass spectrometry. The 1802 identified proteins had a high correlation amongst all samples, with 40% of the proteome detected in every sample and 90% detected in more than 1 individual at all timepoints. Unsupervised analysis of the top 10% most variable proteins yielded person-specific profiles. The core EV-protein-interaction network of 516 proteins detected in all measured samples revealed sub-clusters involved in the biological processes of G-protein signaling, cytoskeletal transport, cellular energy metabolism and immunity. Furthermore, gender-specific expression patterns were detected in the urinary EV proteome. Our findings indicate that the urinary EV proteome is stable in longitudinal samples of healthy subjects over a prolonged time-period, further underscoring its potential for reliable non-invasive diagnostic/prognostic biomarkers.


Author(s):  
Farkhanda Manzoor ◽  
Rooma Adalat ◽  
Tallat Anwar Faridi ◽  
Wafa Fatima ◽  
Muhammad Moazzam ◽  
...  

Dengue fever is an arboviral infection that is widespread all over the world. In the 21st century, there is still no safe, affordable and effective vaccine available; vector control is the most effective method for controlling the disease. Objective: To determine the susceptibility status of larvae and adults of wild and laboratory strains of Aedes aegypti against different groups of insecticides in Lahore city. Method: For the wild strain, larvae were collected from sites in Lahore where insecticides are used at high frequency and quantity. For the laboratory strain, adults and larvae were collected from the Insectary of the National Institute of Malaria Research and Training (NIMRT), Lahore, Pakistan. The laboratory strains were used for larval bioassays. Indoor and outdoor mosquito populations were collected in 2009 and reared from larvae into adults in the insectary in Lahore, Pakistan. Four major insecticide groups were used in this study: Pyrethroids (Deltamethrin 2.5% SC), Neonicotinoids (Imidacloprid 5% SC), Phenyl-pyrazoles (Fipronil 2.5% EC) and Organophosphates (dichlorvos, DDVP 50% EC). For data analysis, Minitab statistical software (Version 13.20) was used, with bioassay data expressed as mean ± S.E.M. LC50 was estimated with 95% confidence using EPA Probit. The threshold for statistical significance was p <= 0.05. To compare the concentrations of insecticides, Duncan's multiple range test was used at the 5% significance level in New Costat. Results: Across samples from different locations in Lahore, Imidacloprid was the most toxic to wild strains of Aedes aegypti, while Fipronil was also active against wild larval samples. Deltamethrin showed the least activity against both adult and larval strains. The susceptibility of the field strains was lower than that of the laboratory strains; the resistance ratio varied from insecticide to insecticide. The results indicate that the mosquito population was resistant because of infrequent and incomplete coverage.
Conclusions: This study concluded that Pyrethroids and agricultural pest control play a role in the indirect growth of insecticides' classes. Based on this study, it is suggested that new strategies to prevent and delay the growth of insecticides would be helpful in Lahore city, Pakistan.


2021 ◽  
Author(s):  
Sonja Walter ◽  
Jeong-Dong Lee

This research aims to investigate the link between human capital depreciation and job tasks, with an emphasis on potential differences between education levels. We estimate an extended Mincer equation based on Neumann and Weiss’s (1995) model using data from the German Socio-Economic Panel. The results show that human capital gained from higher education levels depreciates at a faster rate than other human capital. Moreover, the productivity-enhancing value of education diminishes faster in jobs with a high share of non-routine analytical, non-routine manual, and routine cognitive tasks. These jobs are characterized by more frequent changes in core-skill or technology-skill requirements. The key implication of this research is that education should focus on equipping workers with more general skills at all education levels. With ongoing technological advances, work environments, and with them skill demands, will change, increasing the importance of providing educational and lifelong learning policies to counteract the depreciation of skills. The study contributes by incorporating a task perspective based on the classification used in works on job polarization. This allows a comparison with studies on job obsolescence due to labor-replacing technologies and enables combined education and labor market policies to address the challenges imposed by the Fourth Industrial Revolution.


2021 ◽  
Author(s):  
Alejandro Fernandez-Vega ◽  
Federica Farabegoli ◽  
Maria Mercedes Alonso-Martinez ◽  
Ignacio Ortea

Data-independent acquisition (DIA) methods have gained great popularity in bottom-up quantitative proteomics, as they overcome the irreproducibility and under-sampling limitations of data-dependent acquisition (DDA). diaPASEF, recently developed for the timsTOF Pro mass spectrometers, has brought improvements to DIA, providing additional ion separation (in the ion mobility dimension) and increasing sensitivity. Several studies have benchmarked different workflows for DIA quantitative proteomics, but mostly using instruments from Sciex and Thermo, and therefore the results cannot be extrapolated to diaPASEF data. In this work, using a real-life sample set like the one that can be found in any proteomics experiment, we compared the results of analyzing PASEF data with different combinations of library-based and library-free analysis, combining the tools of the FragPipe suite and DIA-NN, including MS1-level LFQ with DDA-PASEF data, and also comparing with the workflows possible in Spectronaut. We verified that library-independent workflows, which were not so efficient until recently, have greatly improved in the recent versions of the software tools, and now perform as well as or even better than library-based ones. We report here information so that a user who is going to conduct a relative quantitative proteomics study using a timsTOF Pro mass spectrometer can make an informed decision on how to acquire the samples (diaPASEF for DIA analysis, or DDA-PASEF for MS1-level LFQ), and on what can be expected depending on the data analysis tool used, among the different alternatives offered by the recently optimized tools for TIMS-PASEF data analysis.


2020 ◽  
Vol 52 (4) ◽  
Author(s):  
Nisa Fauziah ◽  
Arie Galih Mohamad ◽  
Naufal Fakhri Nugraha ◽  
Lia Faridah ◽  
Jontari Hutagalung

More than half of the areas in East Nusa Tenggara province, a province in the eastern part of Indonesia, are planned to be free from malaria by the end of 2030. However, one of the critical indicators for malaria elimination is still lacking, i.e. vectors’ environment and breeding place indicators. South Central Timor (SCT) District is one of the areas with the highest Annual Parasite Incidence (API) >2‰, with the majority of the population working as farmers. The purpose of this study was to capture the relationship between environmental factors and the prevalence of malaria. This study was a cross-sectional analytic retrospective study using data from a previous malaria study conducted from August 2013 to September 2014 in 5 sub-districts of SCT district. All respondents were selected using the systematic random sampling approach from the population of healthy people. Data were collected using a standard questionnaire and an environmental observation form. Malaria was confirmed through microscopic and Polymerase Chain Reaction (PCR) examinations. Data were then analyzed using bivariate and multivariate analysis with 95% CI and α = 0.05. Of the 357 records collected, 35% (125/357) were malaria positive based on PCR examination. Two variables (living near a lagoon and near a rice field) were significant (p-value<0.05) as vector shelters for Anopheles sp. Thus, these have to be included as inputs to formulate effective and efficient malaria elimination strategies and programs in 2030.

