What We Have Learned and Will Learn from Sequence Databases

2018 ◽  
pp. 21-31 ◽  
Author(s):  
Russell F. Doolittle
2019 ◽  
Vol 102 (5) ◽  
pp. 1263-1270 ◽  
Author(s):  
Weili Xiong ◽  
Melinda A McFarland ◽  
Cary Pirone ◽  
Christine H Parker

Abstract
Background: To effectively safeguard the food-allergic population and support compliance with food-labeling regulations, the food industry and regulatory agencies require reliable methods for food allergen detection and quantification. MS-based detection of food allergens relies on the systematic identification of robust and selective target peptide markers. The selection of proteotypic peptide markers, however, relies on the availability of high-quality protein sequence information, a bottleneck for the analysis of many plant-based proteomes.
Method: In this work, data were compiled for reference tree nut ingredients and evaluated using a parsimony-driven global proteomics workflow.
Results: The utility of supplementing existing incomplete protein sequence databases with translated genomic sequencing data was evaluated for English walnut and provided enhanced selection of candidate peptide markers and differentiation between closely related species.
Highlights: Future improvements of protein databases and release of genomics-derived sequences are expected to facilitate the development of robust and harmonized LC–tandem MS-based methods for food allergen detection.


1991 ◽  
Vol 198 (2) ◽  
pp. 330-333 ◽  
Author(s):  
Peter R. Sibbald ◽  
Hubert Sommerfeldt ◽  
Patrick Argos

1999 ◽  
Vol 24 (7) ◽  
pp. 276-280 ◽  
Author(s):  
Akhilesh Pandey ◽  
Fran Lewitter

PLoS ONE ◽  
2013 ◽  
Vol 8 (12) ◽  
pp. e82981 ◽  
Author(s):  
Alessandro Tanca ◽  
Antonio Palomba ◽  
Massimo Deligios ◽  
Tiziana Cubeddu ◽  
Cristina Fraumene ◽  
...  

Author(s):  
Vivek Kumar Chaturvedi ◽  
Divya Mishra ◽  
Aprajita Tiwari ◽  
V. P. Snijesh ◽  
Noor Ahmad Shaik ◽  
...  

2012 ◽  
Vol 10 (1) ◽  
pp. 51 ◽  
Author(s):  
Md. Rezaul Karim ◽  
Md. Mamunur Rashid ◽  
Byeong-Soo Jeong ◽  
Ho-Jin Choi

2021 ◽  
Author(s):  
Cassidy Moeke

The greenshell mussel Perna canaliculus is considered a suitable biomonitor for heavy metal pollution because of its ability to accumulate and tolerate heavy metals in its tissues. These characteristics make it useful for identifying protein biomarkers of heavy metal pollution, as well as proteins associated with heavy metal detoxification and homeostasis. However, the identification of such proteins is restricted because the greenshell mussel is poorly represented in sequence databases. Several strategies have previously been used to identify proteins in unsequenced species, but only one of these strategies has been applied to the greenshell mussel. The objective of this thesis was to examine different protein identification strategies using a combined two-dimensional gel electrophoresis and MALDI-TOF/TOF mass spectrometry approach. The protein identification strategies used include a Mascot database search, as well as de novo sequencing approaches using PEAKS DB and SPIDER homology searches. In total, 155 protein spots were excised and 68 were identified. Fifty-six proteins were identified using a Mascot search against the Mollusca, NCBInr and Invertebrate EST databases, with seven single-peptide identifications. De novo sequencing strategies identified additional proteins, with two from a PEAKS DB search and 10 from an error-tolerant SPIDER homology search. The most noticeable protein groups identified were cytoskeletal proteins, stress response proteins and those involved in protein biosynthesis. Actin and tubulin made up the bulk of the identifications, accounting for 39% of all proteins identified. This multifaceted approach was shown to be useful for identifying proteins in the greenshell mussel Perna canaliculus. Mascot and PEAKS DB performed equally well, while the error-tolerant functionality of SPIDER was useful for identifying additional proteins.
A subsequent search against the Invertebrate EST database was also found to be useful for identifying additional proteins. Despite this, more than half of all proteins remained unidentified. Most of these proteins either failed to produce good-quality MS spectra or did not match a sequence in the database. Future research should first focus on obtaining quality MS spectra for all proteins concerned and then examine other strategies that may be more suitable for identifying proteins in species with poor representation in sequence databases.

