LC–QTOFMS Presumptive Identification of Synthetic Cannabinoids without Reference Chromatographic Retention/Mass Spectral Information. II. Evaluation of a Computational Approach for Predicting and Identifying Unknown High-Resolution Product Ion Mass Spectra

Author(s):  
Aldo E Polettini ◽  
Johannes Kutzler ◽  
Christoph Sauer ◽  
Susanne Guber ◽  
Wolfgang Schultis

Abstract Although liquid chromatography–high-resolution tandem mass spectrometry (MS2) enables untargeted acquisition, data processing in toxicological screening is almost invariably performed in targeted mode. We developed a computational approach based on open-source chemometrics software that, starting from the determined formula of a suspected synthetic cannabinoid (SC), searches for isomers in different new psychoactive substances web databases, predicts the retention time (RT) and high-resolution MS2 spectrum of each isomer, and compares them with those of the unknown to provide a rank-ordered list of candidates. R was applied to measured data for 105 SC to develop and validate a multiple linear regression quantitative structure–activity relationship model predicting RT. The Competitive Fragmentation Modeling for Metabolite Identification (CFM-ID) freeware was used to predict spectra and to compare them using the Jaccard similarity index. Data-dependent acquisition was performed with an Agilent Infinity 1290 LC-6550 iFunnel Q-TOF MS with a ZORBAX Eclipse-Plus C18 column (100 × 2.1 mm, 1.8 µm) in a water/acetonitrile/ammonium formate gradient. The ability of the combined RT/MS2 prediction to identify unknowns was evaluated on SC standards (with leave-one-out from the RT model) and on unexpected SC encountered in real cases. RT prediction reduced the number of isomers retrieved from a group of new psychoactive substances web databases to about one-third (from 2,792 ± 3,358 to 845 ± 983) and differentiated between SC isomers when spectra were not selective (4F-MDMB-BUTINACA, 4F-MDMB-BUTINACA 2ʹ-indazole isomer) or unavailable (4CN-Cumyl-B7AICA, 4CN-Cumyl-BUTINACA). When 30/40 eV measured spectra of 99 SC were compared against the RT-selected, CFM-ID-predicted spectra of isomers, the right candidate ranked 1st on median and 4th on average; the right match ranked 1st in 54% of cases and within the first 5 matches in 88% of cases. To our knowledge, this is the first extensive application of chemometrics to toxicological screening. In most cases, presumptive identification of unexpected SC (presumptive because, being based on computation, it requires further information for confirmation) was achieved without measured reference information. This method is currently the closest possible approach to true unbiased/untargeted screening. The bottleneck of the method is the processing time required to predict mass spectra (ca. 30–35 s/compound using a 64-bit 2.50-GHz Intel® Core™ i5-7200U CPU). However, strategies can be implemented to reduce prediction processing time.
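The ranking step described in the abstract (scoring predicted spectra against a measured one with the Jaccard similarity index) can be illustrated with a minimal sketch. This is not CFM-ID's own comparison routine: the binning of m/z values by rounding to two decimals, the function names, the candidate names and all m/z values below are hypothetical, chosen only to show how a Jaccard score yields a rank-ordered candidate list.

```python
def bin_peaks(mz_values, decimals=2):
    """Round m/z values so that nearly identical peaks share a bin.

    Rounding to a fixed number of decimals is an assumed, simplified
    stand-in for a proper mass-tolerance match.
    """
    return {round(mz, decimals) for mz in mz_values}


def jaccard(measured, predicted, decimals=2):
    """Jaccard index: shared peaks divided by all distinct peaks."""
    a = bin_peaks(measured, decimals)
    b = bin_peaks(predicted, decimals)
    union = a | b
    if not union:
        return 0.0  # two empty spectra: define similarity as 0
    return len(a & b) / len(union)


def rank_candidates(measured, candidates, decimals=2):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [
        (name, jaccard(measured, peaks, decimals))
        for name, peaks in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical product-ion m/z lists (illustrative values only).
measured = [145.04, 213.10, 232.12, 304.15]
candidates = {
    "candidate_A": [145.04, 213.10, 304.15, 320.20],
    "candidate_B": [109.03, 145.04, 250.11],
}

ranking = rank_candidates(measured, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy case candidate_A shares three of five distinct peaks with the measured spectrum and candidate_B one of six, so candidate_A ranks first. A real workflow would presumably restrict the candidates to those passing the RT filter before scoring.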

Yakhak Hoeji ◽  
2017 ◽  
Vol 61 (2) ◽  
pp. 65-74
Author(s):  
Jaesuk Yun ◽  
Kyung Sik Yoon ◽  
Yong-seop Lee ◽  
Kyoung moon Han ◽  
...  

2016 ◽  
Vol 408 (16) ◽  
pp. 4297-4309 ◽  
Author(s):  
Iria González-Mariño ◽  
Emma Gracia-Lor ◽  
Renzo Bagnati ◽  
Claudia P. B. Martins ◽  
Ettore Zuccato ◽  
...  

Author(s):  
So Yeon Lee ◽  
Sang Tak Lee ◽  
Sungill Suh ◽  
Bum Jun Ko ◽  
Han Bin Oh

Abstract High-resolution liquid chromatography (LC)–tandem mass spectrometry (MS-MS)-based machine learning models are constructed to address the analytical challenge of identifying unknown controlled substances and new psychoactive substances (NPSs). Using a training set composed of 770 LC–MS-MS barcode spectra (with binary entries 0 or 1) obtained generally by high-resolution mass spectrometers, three classification machine learning models were generated and evaluated: an artificial neural network (ANN), a support vector machine (SVM) and a k-nearest neighbor (k-NN) model. In these models, controlled substances and NPSs were classified into 13 subgroups (benzylpiperazine, opiate, benzodiazepine, amphetamine, cocaine, methcathinone, classical cannabinoid, fentanyl, 2C series, indazole carbonyl compound, indole carbonyl compound, phencyclidine and others). Using 193 LC–MS-MS barcode spectra as an external test set, the accuracies of the ANN, SVM and k-NN models were evaluated as 72.5%, 90.0% and 94.3%, respectively. The hybrid similarity search (HSS) algorithm was also evaluated to examine whether it can successfully identify unknown controlled substances and NPSs whose data are unavailable in the database. When only 24 representative LC–MS-MS spectra of controlled substances and NPSs were selectively included in the database, HSS was found to identify compounds with high reliability. The machine learning models and the HSS algorithm are incorporated into our home-coded standalone software, the artificial intelligence screener for narcotic drugs and psychotropic substances, which is equipped with a graphical user interface. The use of this software allows unknown controlled substances and NPSs to be identified in a convenient manner.
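To make the k-NN classification of binary barcode spectra concrete, the following is a minimal sketch under stated assumptions. The abstract does not specify the barcode length, distance metric or value of k, so the 8-bit barcodes, the Hamming distance, k = 3 and the toy training data below are all hypothetical and do not reproduce the authors' model.

```python
from collections import Counter


def hamming(a, b):
    """Count positions where two equal-length binary barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Majority vote among the k training barcodes closest to the query.

    `train` is a list of (barcode, label) pairs; the distance metric and
    k are illustrative assumptions, not values taken from the abstract.
    """
    neighbours = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 8-bit barcodes: 1 = a peak is present in that m/z bin.
train = [
    ([1, 1, 0, 0, 1, 0, 0, 0], "fentanyl"),
    ([1, 1, 0, 0, 0, 0, 0, 0], "fentanyl"),
    ([1, 0, 0, 0, 1, 0, 0, 0], "fentanyl"),
    ([0, 0, 1, 1, 0, 0, 1, 0], "amphetamine"),
    ([0, 0, 1, 1, 0, 0, 0, 1], "amphetamine"),
    ([0, 0, 0, 1, 0, 0, 1, 1], "amphetamine"),
]

query = [1, 1, 0, 0, 1, 0, 0, 1]
predicted_class = knn_predict(train, query, k=3)
print(predicted_class)
```

Here the three nearest training barcodes all carry the same label, so the vote is unanimous. With real barcode spectra the vectors would be far longer and the 13 subgroups listed in the abstract would replace the two toy labels.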

