SMITER—A Python Library for the Simulation of LC-MS/MS Experiments

Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 396
Author(s):  
Manuel Kösters ◽  
Johannes Leufken ◽  
Sebastian A. Leidel

SMITER (Synthetic mzML writer) is a Python-based command-line tool designed to simulate liquid-chromatography-coupled tandem mass spectrometry (LC-MS/MS) runs. It enables the simulation of any biomolecule amenable to mass spectrometry (MS) since all calculations are based on chemical formulas. SMITER features a modular design, allowing for an easy implementation of different noise and fragmentation models. By default, SMITER uses an established noise model and offers several methods for peptide fragmentation, two models for nucleoside fragmentation, and one for lipid fragmentation. Due to the rich Python ecosystem, other modules, e.g., for retention time (RT) prediction, can easily be implemented for the tailored simulation of any molecule of choice. This facilitates the generation of defined gold-standard LC-MS/MS datasets for any type of experiment. Such gold standards, where the ground truth is known, are required in computational mass spectrometry to test new algorithms and to improve parameters of existing ones. Similarly, gold-standard datasets can be used to evaluate analytical challenges, e.g., by predicting co-elution and co-fragmentation of molecules. As these challenges hinder the detection or quantification of co-eluents, a comprehensive simulation can identify and thus prevent such difficulties before actual MS experiments are performed. SMITER allows such datasets to be created easily, quickly, and efficiently.
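To illustrate what "all calculations are based on chemical formulas" can mean in practice, here is a minimal sketch that derives a monoisotopic mass and an m/z value from a flat formula string. It does not use or reproduce SMITER's API; the function names (`parse_formula`, `monoisotopic_mass`, `mz`) are hypothetical, only a handful of elements are covered, and only simple protonated ions are assumed.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "P": 30.97376199842,
    "S": 31.9720711744,
}
PROTON = 1.007276466621  # proton mass in Da


def parse_formula(formula):
    """Turn a flat formula such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic element masses weighted by their counts."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def mz(formula, charge=1):
    """m/z of the protonated ion [M + zH]^z+ (hypothetical helper)."""
    return (monoisotopic_mass(formula) + charge * PROTON) / charge


glucose_mass = monoisotopic_mass("C6H12O6")
glucose_mz = mz("C6H12O6", charge=1)
```

Because the input is only a formula, the same functions apply equally to a peptide, a nucleoside, or a lipid, which is the property the abstract highlights.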

Viruses ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 730
Author(s):  
Magda Rybicka ◽  
Ewa Miłosz ◽  
Krzysztof Piotr Bielawski

At present, the RT-PCR test remains the gold standard for early diagnosis of SARS-CoV-2. Nevertheless, there is growing evidence that this technique may generate false-negative results. Here, we aimed to compare the new mass spectrometry-based assay MassARRAY® SARS-CoV-2 Panel with the RT-PCR diagnostic test approved for clinical use. The study group consisted of 168 suspected patients with symptoms of a respiratory infection. After simultaneous analysis by RT-PCR and mass spectrometry methods, we obtained discordant results for 17 samples (10.12%). Of the 15 samples officially reported as presumptive positive, 13 were positive according to the MS-based assay. Moreover, four samples reported as negative by the officially approved RT-PCR were positive in at least one MS assay. We have successfully demonstrated superior sensitivity of the MS-based assay in SARS-CoV-2 detection, showing that MALDI-TOF MS seems to be ideal for the detection as well as discrimination of mutations within the viral genome.


Author(s):  
Xuhai Xu ◽  
Ebrahim Nemati ◽  
Korosh Vatanparvar ◽  
Viswam Nathan ◽  
Tousif Ahmed ◽  
...  

The prevalence of ubiquitous computing enables new opportunities for lung health monitoring and assessment. In the past few years, there have been extensive studies on cough detection using passively sensed audio signals. However, the generalizability of a cough detection model when applied to external datasets, especially in real-world implementation, is questionable and not explored adequately. Beyond detecting coughs, researchers have looked into how cough sounds can be used in assessing lung health. However, due to the challenges in collecting both cough sounds and lung health condition ground truth, previous studies have been hindered by limited datasets. In this paper, we propose Listen2Cough to address these gaps. We first build an end-to-end deep learning architecture using public cough sound datasets to detect coughs within raw audio recordings. We employ a pre-trained MobileNet and integrate a number of augmentation techniques to improve the generalizability of our model. Without additional fine-tuning, our model achieves an F1 score of 0.948 when tested against a new clean dataset, and 0.884 on another in-the-wild noisy dataset, leading to an advantage of 5.8% and 8.4% on average over the best baseline model, respectively. Then, to mitigate the issue of limited lung health data, we propose to transform the cough detection task to lung health assessment tasks so that the rich cough data can be leveraged. Our hypothesis is that these tasks extract and utilize similar effective representations from cough sounds. We embed the cough detection model into a multi-instance learning framework with the attention mechanism and further tune the model for lung health assessment tasks. Our final model achieves an F1 score of 0.912 on healthy vs. unhealthy, 0.870 on obstructive vs. non-obstructive, and 0.813 on COPD vs. asthma classification, outperforming the baseline by 10.7%, 6.3%, and 3.7%, respectively. Moreover, the weight value in the attention layer can be used to identify important coughs highly correlated with lung health, which can potentially provide interpretability for expert diagnosis in the future.
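The role of the attention weights in a multi-instance setting can be sketched as follows. This is not the Listen2Cough architecture: the embeddings and scores below are made-up toy values, and in a real model the scores would come from a learned attention layer rather than being supplied by hand. The sketch only shows how softmax weights pool a bag of per-cough embeddings into one vector and how the largest weight points at the most influential cough.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings, scores):
    """Return the attention-weighted sum of instance embeddings and the weights."""
    weights = softmax(scores)
    dim = len(embeddings[0])
    pooled = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return pooled, weights


# Hypothetical bag: three coughs from one subject, each a 2-D embedding.
coughs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical attention scores (would be produced by a trained layer).
scores = [0.0, 0.0, 2.0]

bag_vector, weights = attention_pool(coughs, scores)
# Index of the cough with the largest attention weight.
most_important = max(range(len(weights)), key=weights.__getitem__)
```

The pooled `bag_vector` would feed a subject-level classifier, while `weights` is the quantity the abstract suggests inspecting for interpretability.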


Amino Acids ◽  
2021 ◽  
Author(s):  
Magdalena Widgren Sandberg ◽  
Jakob Bunkenborg ◽  
Stine Thyssen ◽  
Martin Villadsen ◽  
Thomas Kofoed

Granulocyte-macrophage colony-stimulating factor (GM-CSF) is a cytokine and a white blood cell growth factor that has found usage as a therapeutic protein. During analysis of different fermentation batches of GM-CSF recombinantly expressed in E. coli, a covalent modification was identified on the protein by intact mass spectrometry. The modification gave a mass shift of +70 Da, and peptide mapping analysis demonstrated that it was located at the protein N-terminus and on lysine side chains. The chemical composition C4H6O was found to be the best candidate by peptide fragmentation using tandem mass spectrometry. The modification likely contains a carbonyl group, since its mass increased by 2 Da upon reduction with borane pyridine complex and it reacted with 2,4-dinitrophenylhydrazine. On the basis of chemical and tandem mass spectrometry fragmentation behavior, the modification could be attributed to crotonaldehyde, a reactive compound formed during lipid peroxidation. A low recorded oxygen pressure in the reactor during protein expression could be linked to the formation of this compound. This study shows the importance of maintaining full control over all reaction parameters during recombinant protein production.
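The arithmetic behind the +70 Da assignment can be checked with a short calculation. The sketch below is an illustration, not the authors' workflow: it computes the monoisotopic mass of C4H6O and the gain expected when a carbonyl is reduced, under the assumption that reduction adds two hydrogen atoms.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503223, "O": 15.99491461957}


def mass(composition):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[el] * n for el, n in composition.items())


modification = {"C": 4, "H": 6, "O": 1}  # composition proposed in the abstract
reduced = {"C": 4, "H": 8, "O": 1}       # assumed: carbonyl reduction adds 2 H

shift = mass(modification)               # expected mass shift of the adduct
reduction_gain = mass(reduced) - shift   # expected extra mass after reduction
```

The computed shift of about 70.04 Da matches the observed +70 Da, and the gain of about 2.02 Da matches the 2 Da increase reported after borane pyridine reduction.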


eLife ◽  
2015 ◽  
Vol 4 ◽  
Author(s):  
Pierre Barbier de Reuille ◽  
Anne-Lise Routier-Kierzkowska ◽  
Daniel Kierzkowski ◽  
George W Bassel ◽  
Thierry Schüpbach ◽  
...  

Morphogenesis emerges from complex multiscale interactions between genetic and mechanical processes. To understand these processes, the evolution of cell shape, proliferation and gene expression must be quantified. This quantification is usually performed either in full 3D, which is computationally expensive and technically challenging, or on 2D planar projections, which introduces geometrical artifacts on highly curved organs. Here we present MorphoGraphX (www.MorphoGraphX.org), a software that bridges this gap by working directly with curved surface images extracted from 3D data. In addition to traditional 3D image analysis, we have developed algorithms to operate on curved surfaces, such as cell segmentation, lineage tracking and fluorescence signal quantification. The software's modular design makes it easy to include existing libraries, or to implement new algorithms. Cell geometries extracted with MorphoGraphX can be exported and used as templates for simulation models, providing a powerful platform to investigate the interactions between shape, genes and growth.


Author(s):  
John Chiverton ◽  
Kevin Wells

This chapter applies a Bayesian formulation of the Partial Volume (PV) effect, based on the Benford distribution, to the statistical classification of nuclear medicine imaging data: specifically Positron Emission Tomography (PET) acquired as part of a PET-CT phantom imaging procedure. The Benford distribution is a discrete probability distribution of great interest for medical imaging, because it describes the probabilities of occurrence of single digits in many sources of data. The chapter thus describes the PET-CT imaging and post-processing process to derive a gold standard. Moreover, this chapter uses it as a ground truth for the assessment of a Benford classifier formulation. The use of this gold standard shows that the classification of both the simulated and real phantom imaging data is well described by the Benford distribution.
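For readers unfamiliar with the Benford distribution, the following sketch computes its probability mass function, P(d) = log10(1 + 1/d) for leading digits d = 1..9, and compares it with observed leading-digit frequencies. It is a generic illustration, not the chapter's partial-volume classifier; the helper names are hypothetical and powers of two are used purely as a convenient test sequence.

```python
import math
from collections import Counter


def benford_pmf():
    """Benford probabilities for leading digits 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a nonzero number, via scientific notation."""
    return int(f"{abs(float(x)):.15e}"[0])


def leading_digit_frequencies(values):
    """Relative frequency of each leading digit among the nonzero values."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


expected = benford_pmf()
# Illustrative data only: the first 100 powers of two.
observed = leading_digit_frequencies([2 ** n for n in range(1, 101)])
```

Comparing `observed` with `expected` digit by digit is the basic check on which a Benford-based model of the data rests.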


Stroke ◽  
2020 ◽  
Vol 51 (Suppl_1) ◽  
Author(s):  
Benjamin Zahneisen ◽  
Matus Straka ◽  
Shalini Bammer ◽  
Greg Albers ◽  
Roland Bammer

Introduction: Ruling out hemorrhage (stroke or traumatic) prior to administration of thrombolytics is critical for Code Strokes. Triage software that identifies hemorrhages on head CTs and alerts radiologists would help to streamline patient care and increase diagnostic confidence and patient safety. ML approach: We trained a deep convolutional network with a hybrid 3D/2D architecture on unenhanced head CTs of 805 patients. Our training dataset comprised 348 positive hemorrhage cases (IPH=245, SAH=67, Sub/Epi-dural=70, IVH=83) (128 female) and 457 normal controls (217 female). Lesion outlines were drawn by experts and stored as binary masks that were used as ground truth data during the training phase (random 80/20 train/test split). Diagnostic sensitivity and specificity were defined on a per-patient study level, i.e., a single binary decision for presence/absence of a hemorrhage on a patient’s CT scan. Final validation was performed in 380 patients (167 positive). Tool: The hemorrhage detection module was prototyped in Python/Keras. It runs on a local Linux server (4 CPUs, no GPUs) and is embedded in a larger image processing platform dedicated to stroke. Results: Processing time for a standard whole-brain CT study (3-5 mm slices) was around 2 min. Upon completion, an instant notification (by email and/or mobile app) was sent to users to alert them about the suspected presence of a hemorrhage. Relative to neuroradiologist gold-standard reads, the algorithm’s sensitivity and specificity were 90.4% and 92.5%, respectively (95% CI: 85%-94% for both). Detection of acute intracranial hemorrhage can be automated by deploying deep learning. It yielded very high sensitivity/specificity when compared to gold-standard reads by a neuroradiologist. Volumes as small as 0.5 mL could be detected reliably in the test dataset. The software can be deployed in busy practices to prioritize worklists and alert health care professionals to speed up therapeutic decision processes and interventions.
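The per-patient metrics can be reproduced with a few lines of arithmetic. The abstract gives only the validation cohort size (380 patients, 167 positive) and the resulting percentages; the confusion-matrix counts below are back-calculated for illustration and are not reported in the source. The interval method is likewise an assumption: a Wilson score interval is shown, and the abstract does not state which method the authors used.

```python
import math


def sensitivity(tp, fn):
    """Fraction of truly positive studies flagged as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly negative studies flagged as negative."""
    return tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (95% for z=1.96)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical counts chosen to be consistent with the reported rates:
# 167 positive and 213 negative studies in the 380-patient validation set.
tp, fn = 151, 16
tn, fp = 197, 16

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
```

With these assumed counts the point estimates round to the reported 90.4% and 92.5%; the intervals depend on the method chosen and need not match the published ones exactly.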


Data Mining ◽  
2013 ◽  
pp. 1794-1818
Author(s):  
William H. Horsthemke ◽  
Daniela S. Raicu ◽  
Jacob D. Furst ◽  
Samuel G. Armato

Evaluating the success of computer-aided decision support systems depends upon a reliable reference standard, a ground truth. The ideal gold standard is expected to result from the marking, labeling, and rating by domain experts of the image of interest. However, experts often disagree, and this lack of agreement challenges the development and evaluation of image-based feature prediction of expert-defined “truth.” The following discussion addresses the successes and limitations of developing computer-aided models to characterize suspicious pulmonary nodules based upon ratings provided by multiple expert radiologists. These prediction models attempt to bridge the semantic gap between images and medically meaningful, descriptive opinions about visual characteristics of nodules. The resultant computer-aided diagnostic characterizations (CADc) are directly usable for indexing and retrieving in content-based medical image retrieval and for supporting computer-aided diagnosis. The predictive performance of CADc models is directly related to the extent of agreement between radiologists; the models better predict radiologists’ opinions when radiologists agree more with each other about the characteristics of nodules.


Author(s):  
Haipeng Wang

Protein identification (sequencing) by tandem mass spectrometry is a fundamental technique for proteomics, which studies the structures and functions of proteins on a large scale and acts as a complement to genomics. Analysis and interpretation of the vast amounts of spectral data generated in proteomics experiments present unprecedented challenges and opportunities for data mining in areas such as data preprocessing, peptide-spectrum matching, results validation, peptide fragmentation pattern discovery and modeling, and post-translational modification (PTM) analysis. This article introduces the basic concepts and terms of protein identification and briefly reviews the state-of-the-art relevant data mining applications. It also outlines challenges and future potential hot spots in this field.


2019 ◽  
Vol 6 (Supplement_2) ◽  
pp. S731-S731
Author(s):  
Carlos Correa-Martinez ◽  
Evgeny A Idelevich ◽  
Karsten Becker

Background: The accurate identification of carbapenem resistance mechanisms is decisive for the appropriate selection of antibiotic regimens. Numerous methods can detect carbapenemase-producing carbapenem-resistant bacteria (CPCR). However, non-CPCR (NCPCR) are routinely assumed to display porin loss as a diagnosis of exclusion. No further confirmatory tests are performed since the gold standard (sodium dodecylsulfate polyacrylamide gel electrophoresis, SDS–PAGE) is laborious and time consuming. We propose a test for rapid and easy detection of porin loss by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). Methods: Clinical meropenem-resistant Enterobacterales strains (10 CPCR, 10 NCPCR) and control strains recommended by EUCAST (5 carbapenemase-producing, one with porin loss, one negative control) were analyzed. Membrane proteins were extracted by successive centrifugation of bacterial suspensions (McFarland 0.5) and addition of ethanol, formic acid and acetonitrile. MALDI-TOF MS of the protein extracts was performed on a 96-spot target (Bruker Daltonics, Germany). Peaks between 35 and 40 kDa were analyzed for the presence of porins and compared with the bands observed in the SDS–PAGE of the protein extracts. Results: Within the molecular weight range of 35–40 kDa, the MALDI-TOF MS-based method revealed peaks in all CPCR isolates corresponding to those observed in the carbapenemase-producing control strains. In contrast, the control strain with porin loss as well as all NCPCR isolates showed fewer peaks in this range. All peaks observed correlated with the bands observed in the SDS–PAGE of the protein extracts at the corresponding molecular weight (Figure 1). Conclusion: Yielding results that reliably correspond to the current gold standard, we propose a method for accelerated detection of porin loss as an alternative to the diagnosis of exclusion usually made in routine settings. With a processing time of approximately 20 minutes, the method can be easily implemented in the clinical setting. This MALDI-TOF MS-based approach provides valuable information about a resistance mechanism that otherwise remains unexplained. Disclosures: All authors: No reported disclosures.

