Graphical illustration for explaining mass spectrum fingerprinting in microbial identification

Author(s):  
Wenfa Ng

Pattern recognition is commonly used for identifying an unknown entity from a set of known objects curated in a database, and finds use in various applications such as fingerprint matching and microbial identification. Mass spectrometry is increasingly used for identifying microbes in research and clinical settings via species- or strain-specific mass spectrum signatures. Although the existence of unique biomarkers (such as ribosomal proteins) underpins mass spectrometry-based microbial identification, the absence of corresponding genome or proteome information in public databases for a large fraction of extant microbes significantly hampers biomarker (and species) assignment. However, the reproducible generation of species-specific mass spectra across different growth and environmental conditions opens up the possibility of identifying unknown microbes, without biomarker identities, by comparing peak positions between mass spectra. Thus, the mass spectrum fingerprinting (pattern recognition) approach circumvents the need for biomarker information: alignment of as many mass peaks as possible (particularly those of phylogenetic significance) between spectra is the basis for identification. In contrast, because gene expression and metabolism vary with environmental and nutritional factors, alignment of peak intensities, though desirable, is not a strict requirement in species annotation. With a large diversity of biomolecules present in each microbial species, mass spectrometry-based microbial identification is inherently data-intensive, which necessitates statistical tools and computers for implementation. However, relegating algorithmic details to the backend of software obscures the approach’s conceptual underpinnings and hinders understanding.
More importantly, mathematics-centric approaches for explaining the conceptual basis of pattern recognition, though useful, are generally less pedagogically accessible to students than visual illustration techniques. This short primer describes a simple graphical illustration that explains the conceptual underpinnings of mass spectrum fingerprinting and highlights caveats for avoiding misidentifications. It may find use as a supplement in a microbiology or bioinformatics course for introducing the conceptual basis of pattern recognition-based microbial identification by mass spectrometric analysis.
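The peak-position comparison described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the primer's own method or any instrument vendor's algorithm: the function names, the 500 ppm tolerance, and all m/z values are assumptions chosen for the example. Peak intensities are deliberately ignored, in line with the point that intensity alignment is not a strict requirement.

```python
from bisect import bisect_left


def count_aligned_peaks(query_mz, reference_mz, tol_ppm=500.0):
    """Count query peaks that have a reference peak within +/- tol_ppm.

    Only peak positions (m/z) are compared; intensities are ignored.
    The tolerance value is an assumption for illustration.
    """
    ref = sorted(reference_mz)
    matched = 0
    for mz in query_mz:
        tol = mz * tol_ppm / 1e6           # absolute window for this peak
        i = bisect_left(ref, mz - tol)     # first reference peak >= lower bound
        if i < len(ref) and ref[i] <= mz + tol:
            matched += 1
    return matched


def fingerprint_score(query_mz, reference_mz, tol_ppm=500.0):
    """Fraction of query peaks aligned to the reference (0.0 to 1.0)."""
    if not query_mz:
        return 0.0
    return count_aligned_peaks(query_mz, reference_mz, tol_ppm) / len(query_mz)


# Hypothetical peak lists (m/z values are made up for illustration).
unknown = [4365.0, 5381.0, 6255.0, 7274.0]
reference_a = [4364.5, 5380.5, 6254.8, 7273.9, 9000.0]
reference_b = [4400.0, 5381.2, 8000.0]

print(fingerprint_score(unknown, reference_a))  # all four peaks align
print(fingerprint_score(unknown, reference_b))  # only one peak aligns
```

A single shared peak, as with `reference_b`, gives a low score, which illustrates why aligning as many peaks as possible matters for avoiding misidentification.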

2017 ◽  
Author(s):  
Wenfa Ng

Pattern recognition is commonly used for identifying an unknown entity from a set of known objects curated in a database, and finds use in various applications such as fingerprint matching and microbial identification. Mass spectrometry is increasingly used for identifying microbes in research and clinical settings via species- or strain-specific mass spectrum signatures. Although the existence of unique biomarkers (such as ribosomal proteins) underpins mass spectrometry-based microbial identification, the absence of corresponding genome or proteome information in public databases for a large fraction of extant microbes significantly hampers biomarker (and species) assignment. However, the reproducible generation of species-specific mass spectra across different growth and environmental conditions opens up the possibility of identifying unknown microbes, without biomarker identities, by comparing peak positions between mass spectra. Thus, the mass spectrum fingerprinting (pattern recognition) approach circumvents the need for biomarker information: alignment of as many mass peaks as possible (particularly those of phylogenetic significance) between spectra is the basis for identification. In contrast, because gene expression and metabolism vary with environmental and nutritional factors, alignment of peak intensities, though desirable, is not a strict requirement in species annotation. With a large diversity of biomolecules present in each microbial species, mass spectrometry-based microbial identification is inherently data-intensive, which necessitates statistical tools and computers for implementation. However, relegating algorithmic details to the backend of software obscures the approach’s conceptual underpinnings and hinders understanding.
More importantly, mathematics-centric approaches for explaining the conceptual basis of pattern recognition, though useful, are generally less pedagogically accessible to students than visual illustration techniques. This short primer describes a simple graphical illustration that explains the conceptual underpinnings of mass spectrum fingerprinting and highlights caveats for avoiding misidentifications. It may find use as a supplement in a microbiology or bioinformatics course for introducing the conceptual basis of pattern recognition-based microbial identification by mass spectrometric analysis.


2014 ◽  
Author(s):  
Wenfa Ng

Pattern recognition is a common approach for identifying an unknown entity from a set of known objects curated in a database, and finds use in various data processing applications such as microbial identification. Whether by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) or electrospray ionization tandem mass spectrometry (ESI MS/MS), mass spectrometry techniques are increasingly used for identifying microbes in research and clinical settings via species- or strain-specific mass spectrum signatures. Although the existence of unique biomarkers (such as ribosomal proteins) underpins mass spectrometry-enabled microbial identification, the lack of corresponding genome or proteome information in publicly accessible databases for a large fraction of extant microbes significantly hampers biomarker (and species) assignment. Nevertheless, the reproducible generation of species-specific mass spectra across different growth and environmental conditions opens up the possibility of identifying unknown microbes by comparing peak positions between mass spectra, without requiring knowledge of biomarker molecular identities. Thus, the mass spectrum fingerprinting (or pattern recognition) approach circumvents the need for biomarker information. Alignment of as many mass peaks as possible (particularly those of phylogenetic significance) between spectra is the basis of mass spectrum fingerprinting. In contrast, because gene expression and metabolism (and hence biomolecules’ abundances) vary with environmental and nutritional factors, alignment of peak intensities, though desirable, is not a strict requirement for identification.
With a large diversity of biomolecules present in each microbial species, mass spectrometry-based microbial identification is inherently data-intensive, and thus requires statistical tools and a computational implementation of the pattern recognition approach, which is incorporated in the software packages of microbial typing instruments. Nevertheless, relegating the algorithmic details of pattern recognition to the backend of software obscures the approach’s conceptual underpinnings and hinders students’ understanding. More importantly, mathematics-centric approaches for explaining the conceptual basis of pattern recognition, though useful, are generally less pedagogically accessible to life science students than visual illustration techniques. This short primer describes a simple graphical illustration (featuring three examples common in mass spectrometry-based biotyping workflows) that attempts to explain the conceptual underpinnings of mass spectrum fingerprinting, and highlights caveats for avoiding misidentification.


2015 ◽  
Author(s):  
Wenfa Ng

Pattern recognition is commonly used for identifying an unknown entity from a set of known objects curated in a database, and finds use in various applications such as fingerprint matching and microbial identification. Whether by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) or electrospray ionization tandem mass spectrometry (ESI MS/MS), mass spectrometry is increasingly used for identifying microbes in research and clinical settings via species- or strain-specific mass spectrum signatures. Although the existence of unique biomarkers (such as ribosomal proteins) underpins mass spectrometry-based microbial identification, the absence of corresponding genome or proteome information in publicly accessible databases for a large fraction of extant microbes significantly hampers biomarker (and species) assignment. Nevertheless, the reproducible generation of species-specific mass spectra across different growth and environmental conditions opens up the possibility of identifying unknown microbes by comparing peak positions between mass spectra, without biomarker identities. Thus, the mass spectrum fingerprinting (pattern recognition) approach circumvents the need for biomarker information: alignment of as many mass peaks as possible (particularly those of phylogenetic significance) between spectra is the basis for identification. In contrast, because gene expression and metabolism (and biomolecules’ abundances) vary with environmental and nutritional factors, alignment of peak intensities, though desirable, is not a strict requirement in species annotation. With a large diversity of biomolecules present in each microbial species, mass spectrometry-based microbial identification is inherently data-intensive, which requires statistical tools and computers for implementing pattern recognition.
Nevertheless, relegating algorithmic details to the backend of software obscures the approach’s conceptual underpinnings and hinders students’ understanding. More importantly, mathematics-centric approaches for explaining the conceptual basis of pattern recognition, though useful, are generally less pedagogically accessible to students than visual illustration techniques. This short primer describes a simple graphical illustration (featuring three examples common in mass spectrometry-based biotyping workflows) that attempts to explain the conceptual underpinnings of mass spectrum fingerprinting, and highlights caveats for avoiding misidentifications.


Author(s):  
Vera Solntceva ◽  
Markus Kostrzewa ◽  
Gerald Larrouy-Maumus

MALDI-TOF mass spectrometry has revolutionized clinical microbiology diagnostics by delivering accurate, fast, and reliable identification of microorganisms. It is conventionally based on the detection of intracellular molecules, mainly ribosomal proteins, for identification at the species and/or genus level. Nevertheless, for some microorganisms (e.g., mycobacteria) extensive protocols are necessary to extract intracellular proteins, and in some cases a protein-based approach cannot provide sufficient evidence to accurately distinguish closely related microorganisms (e.g., Shigella sp. vs E. coli, and the species of the M. tuberculosis complex). Consequently, lipids, along with proteins, are also molecules of interest. Lipids are ubiquitous, and their structural diversity delivers information complementary to the conventional protein-based matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) approaches currently used in clinical microbiology. Lipid modifications, such as those found on lipid A and related to polymyxin resistance in Gram-negative pathogens (e.g., phosphoethanolamine and aminoarabinose), not only play a role in the detection of microorganisms by routine MALDI-TOF mass spectrometry but can also be used as a read-out of drug susceptibility. In this review, we will demonstrate that, in combination with proteins, lipids are a game-changer in both the rapid detection of pathogens and the determination of their drug susceptibility using routine MALDI-TOF mass spectrometry systems.


2020 ◽  
Author(s):  
C Nabet ◽  
S Imbert ◽  
A C Normand ◽  
D Blanchet ◽  
R Chanlin ◽  
...  

Abstract New mold species are increasingly reported in invasive fungal infections. However, these fungi are often misdiagnosed or undiagnosed due to the use of inappropriate laboratory diagnostic tools. Tropical countries, such as French Guiana, harbor a vast diversity of environmental fungi representing a potential source of emerging pathogens. To assess the impact of this diversity on the accuracy of mold-infection diagnoses, we identified mold clinical isolates in French Guiana during a five-month follow-up using both microscopy and matrix-assisted laser desorption ionization time-of-flight mass spectrometry. In total, 38.8% of the 98 mold isolates obtained could not be identified and required DNA-based identification. Fungal diversity was high, including 46 species, 26 genera, and 13 orders. Fungal ecology was unusual, as Aspergillus species accounted for only 27% of all isolates, and the Nigri section was the most abundant of the six detected Aspergillus sections. Macromycetes (orders Agaricales, Polyporales, and Russulales) and endophytic fungi accounted for 11% and 14% of all isolates, respectively. Thus, in tropical areas with high fungal diversity, such as French Guiana, routine mold identification tools are inadequate. Molecular identifications, as well as morphological descriptions, are necessary for the construction of region-specific mass spectrum databases. These advances will improve the diagnosis and clinical management of new fungal infections.
Lay summary In French Guiana, environmental fungal diversity may be a source of emerging pathogens. We evaluated microscopy and mass spectrometry to identify mold clinical isolates. With 39% of isolates unidentified, a region-specific mass spectrum database would improve the diagnosis of new fungal infections.


2021 ◽  
Author(s):  
Wenfa Ng

The existence of a theoretical ribosomal protein mass fingerprint, as well as the utility of ribosomal proteins as biomarkers in mass spectrometry microbial identification, suggests phylogenetic significance for this class of proteins. To serve these two functions, a facile means of identifying and extracting important attributes of ribosomal proteins from the proteome data file of a microbial species must be found. Additionally, there is a need to calculate important properties of ribosomal proteins, such as molecular weight and nucleotide sequence, based on amino acid sequence information from a FASTA proteome file. This work sought to support the above endeavour by developing MATLAB software that extracts the amino acid sequence information of all ribosomal proteins from the FASTA proteome data file of a microbial species downloaded from UniProt. Built-in functions in MATLAB are subsequently employed to calculate important properties of the extracted ribosomal proteins, such as number of amino acid residues, molecular weight and nucleotide sequence. All of this information is output, as a database, to an Excel file for ease of storage and retrieval. Analysis of an Escherichia coli K-12 proteome revealed that the bacterium possesses a total of 59 ribosomal proteins distributed between the large and small ribosome subunits. The ribosomal proteins range in sequence length from 38 residues (50S ribosomal protein L36) to 557 residues (30S ribosomal protein S1). In terms of molecular weight, the profiled ribosomal proteins range from 4364.305 Da (50S ribosomal protein L36) to 61157.66 Da (30S ribosomal protein S1). More importantly, analysis of the distribution of the molecular weights of different ribosomal proteins in E. coli reveals a smooth curve that suggests strong co-evolution of ribosomal protein sequence and mass, given the tight constraints that a functional ribosome presents.
Finally, cluster analysis reveals a preponderance of small ribosomal proteins compared to larger ones, which remains a mystery to evolutionary biologists. Overall, the information encapsulated in the ribosomal protein database should find use in gaining a better appreciation of the molecular weight distribution of ribosomal proteins in a species, as well as delivering information for using ribosomal protein biomarkers to identify particular microbial species in mass spectrometry microbial identification.
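The workflow in this abstract (filter ribosomal proteins out of a FASTA proteome, then compute sequence length and molecular weight) can be sketched in Python. This is not the author's MATLAB software: the function names and the toy FASTA records are hypothetical, and the header filter simply assumes that the words "ribosomal protein" appear in the description line. Masses are average (not monoisotopic) residue masses plus one water, with no allowance for post-translational modifications such as N-terminal methionine removal.

```python
# Average residue masses in Da (standard values for the 20 common amino acids).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def ribosomal_proteins(records):
    """Keep records whose header mentions 'ribosomal protein' (assumed filter)."""
    return [(h, s) for h, s in records if "ribosomal protein" in h.lower()]


def average_mass(sequence):
    """Average molecular mass (Da) of an unmodified linear peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence) + WATER


# Toy input: headers and sequences are made up, not real UniProt entries.
TOY_FASTA = """>toy1 50S ribosomal protein (made-up sequence)
MKV
GA
>toy2 Hypothetical kinase (made-up sequence)
MKKL
"""

for header, seq in ribosomal_proteins(parse_fasta(TOY_FASTA)):
    print(header, len(seq), round(average_mass(seq), 3))
```

A sequence containing a non-standard residue code (for example `X` or `U`) raises a `KeyError` here; a fuller tool would need to decide how to handle such entries.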


2018 ◽  
Author(s):  
Wenfa Ng

Ribosomes are highly conserved given the importance of protein synthesis to cell survival. Although small differences in structure and function exist in ribosomes from different species of bacteria, archaea and eukaryotes, the general structure and function remain conserved across species in the same domain of life. Thus, are the ribosomal proteins that constitute ribosomes highly conserved between species in the same domain, or do they possess sufficient sequence variation to help identify individual species? Differentiated sequences would mean that ribosomal proteins might account for differences in the structure and function of ribosomes in different species. Using ribosomal protein amino acid sequence information from the Ribosomal Protein Gene Database to calculate the molecular mass of ribosomal proteins, this study sought to determine whether the molecular masses of a set of ribosomal proteins from a species could constitute a unique ribosomal protein mass fingerprint. In addition, the question of whether unique ribosomal protein mass fingerprints exist between different species in the three domains of life was also examined. Results revealed that the distinct molecular masses of individual ribosomal proteins could aggregate into a unique ribosomal protein mass fingerprint for individual bacterial, archaeal and eukaryotic species. Such ribosomal protein mass fingerprints could potentially find use in microbial identification through gel-free matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) profiling of solubilized ribosomal proteins. The ribosomal protein mass spectrum obtained could be compared with those catalogued in a reference database of known microorganisms, where pattern recognition algorithms could determine a match.
Additionally, the existence of theoretical ribosomal protein mass fingerprints across species in the three domains of life also pointed to the presence of small differences in the structure and function of both the large and small ribosome subunits. Such differences could reveal possible differentiated ribosomal structure and function in different species even though the general structure and function of the ribosome are conserved across species. Collectively, the distinct molecular masses of individual ribosomal proteins in each species pointed to a unique ribosomal protein mass fingerprint that could find use in microbial identification through gel-free mass spectrometry analysis of solubilized ribosomal proteins.
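Whether a set of ribosomal protein masses is "unique" depends on how finely masses can be told apart. The Python sketch below is one way to frame that question and is not the study's actual analysis: the function names, the mass tolerance values, and all three fingerprints are hypothetical. Two fingerprints are treated as indistinguishable when every mass in each has a counterpart in the other within the tolerance.

```python
from itertools import combinations


def shared_fraction(masses_a, masses_b, tol_da=1.0):
    """Fraction of masses in A with a counterpart in B within tol_da (Da)."""
    if not masses_a:
        return 0.0
    hits = sum(any(abs(a - b) <= tol_da for b in masses_b) for a in masses_a)
    return hits / len(masses_a)


def indistinguishable_pairs(fingerprints, tol_da=1.0):
    """List species pairs whose fingerprints fully overlap in both directions."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(sorted(fingerprints.items()), 2):
        if shared_fraction(a, b, tol_da) == 1.0 and shared_fraction(b, a, tol_da) == 1.0:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical fingerprints: masses (Da) are made up for illustration only.
fingerprints = {
    "species_A": [4364.3, 7273.5, 9190.6],
    "species_B": [4364.9, 7273.0, 9191.1],
    "species_C": [4410.2, 7301.8, 9250.4],
}

print(indistinguishable_pairs(fingerprints, tol_da=1.0))  # A and B overlap fully
print(indistinguishable_pairs(fingerprints, tol_da=0.2))  # all three are distinct
```

In this toy data the same two species are unique at a 0.2 Da tolerance but not at 1.0 Da, so any claim of uniqueness is only meaningful alongside a stated mass accuracy.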


1994 ◽  
Vol 40 (2) ◽  
pp. 216-220 ◽  
Author(s):  
A H Wu ◽  
D Ostheimer ◽  
M Cremese ◽  
E Forte ◽  
D Hill

Abstract Interference by substances coeluting with targeted drugs is a general problem for gas chromatographic/mass spectrometric analysis of urine. To characterize these interferences, we examined human urine samples containing benzoylecgonine and fluconazole, and other drug combinations including deuterated internal standards that coelute (ISd,c) with target drugs, by selected-ion monitoring (SIM) and full-scan mass spectrometry. We show that, by SIM analysis, detecting the presence of an interferent is dependent on the specific IS used for the assay. When an ISd,c is used, the presence of another coeluting substance (interferent) suggests that the intensity of IS ions is substantially diminished, because the interferent affects both the ISd,c and target drug. When a noncoeluting IS (ISnc) is used, the interferent cannot be discerned unless it coincidently contains one or more of the ions monitored for either the target drug or ISnc. Under full-scan analysis, a coeluting interferent is directly discernable by examining the total ion gas chromatogram.

