Toward a Comprehensive Treatment of Tautomerism in Chemoinformatics Including in InChI V2

Author(s):  
Devendra K. Dhaked ◽  
Wolf Ihlenfeldt ◽  
Hitesh Patel ◽  
Marc Nicklaus

<p>We have collected 86 different transforms of tautomeric interconversions. Of these, 54 are for prototropic (non-ring-chain) tautomerism, 21 for ring-chain tautomerism, and 11 for valence tautomerism. The majority of these rules have been extracted from experimental literature. Twenty rules, covering the most well-known types of tautomerism such as keto-enol tautomerism, were taken from the default handling of tautomerism by the chemoinformatics toolkit CACTVS. The rules were analyzed against nine different databases totaling over 400 million (non-unique) structures as to their occurrence rates, mutual overlap in coverage, and recapitulation of the rules’ enumerated tautomer sets by InChI V.1.05, both in InChI’s Standard version and in a Non-Standard version with the increased tautomer-handling options 15T and KET turned on. These results and the background of this study are discussed in the context of the IUPAC InChI Project tasked with the redesign of handling of tautomerism for an InChI version 2. Applying the rules presented in this paper would approximately triple the number of compounds in typical small-molecule databases that InChI V2 would treat as affected by tautomeric interconversion. A web tool has been created to test these rules at https://cactus.nci.nih.gov/tautomerizer.</p>

2019 ◽  
Author(s):  
Devendra K. Dhaked ◽  
Wolf Ihlenfeldt ◽  
Hitesh Patel ◽  
Marc Nicklaus



2021 ◽  
Author(s):  
Devendra Kumar Dhaked ◽  
Marc Nicklaus

We have analyzed forty different databases ranging in size from a few thousand to nearly 100 million molecules, comprising a total of over 200 million structures, for their tautomeric conflicts. A tautomeric conflict is defined as an occurrence of two or more structures within a data set that the applied tautomeric rules identify as tautomers of each other. We tested a total of 119 detailed tautomeric transform rules expressed as SMIRKS, of which 79 yielded at least one conflict. The databases analyzed spanned a wide variety of types, including large aggregating databases, drug collections, and experimentally based structure collections. Almost all databases analyzed showed intra-database tautomeric conflicts. Conflict rates, expressed as a percentage of the database, were typically in the range of a few tenths of a percent, which for the largest databases amounts to more than 100,000 cases per database.
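The definition of a tautomeric conflict given above can be illustrated with a minimal sketch. It assumes that each structure has already been assigned a tautomer-invariant key by some rule-based procedure; that key, the record identifiers, and the function names are hypothetical and do not come from the study. Counting each group of mutually tautomeric structures as one conflict case is likewise an assumption, not the authors' stated method.

```python
from collections import defaultdict


def find_conflicts(records):
    """Group (structure_id, tautomer_key) pairs by key.

    Returns only the groups with two or more members, i.e. sets of
    structures that share a (hypothetical) tautomer-invariant key.
    """
    groups = defaultdict(list)
    for structure_id, tautomer_key in records:
        groups[tautomer_key].append(structure_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}


def conflict_rate(records, conflicts):
    """Conflict cases divided by the number of records.

    Assumption: one conflict case per group of tautomers.
    """
    if not records:
        return 0.0
    return len(conflicts) / len(records)


# Hypothetical toy data: mol-1 and mol-2 stand for two tautomeric
# forms of the same compound, so they carry the same key.
records = [
    ("mol-1", "key-A"),
    ("mol-2", "key-A"),
    ("mol-3", "key-B"),
    ("mol-4", "key-C"),
]

conflicts = find_conflicts(records)
rate = conflict_rate(records, conflicts)
print(conflicts, rate)
```

On the scale described in the abstract, a rate of 0.1% in a database of 100 million structures would correspond to 100,000 conflict cases under this counting convention.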


2018 ◽  
Vol 28 (9) ◽  
pp. 685-698 ◽  
Author(s):  
Rikin D. Patel ◽  
Sivakumar Prasanth Kumar ◽  
Himanshu A. Pandya ◽  
Hitesh A. Solanki

2015 ◽  
Vol 7 (1) ◽  
Author(s):  
Saber A. Akhondi ◽  
Sorel Muresan ◽  
Antony J. Williams ◽  
Jan A. Kors

2007 ◽  
Vol 3 (1) ◽  
pp. 107-113 ◽  
Author(s):  
Maxwell Cummings ◽  
Alan Maxwell ◽  
Renee DesJarlais

2020 ◽  
Vol 66 ◽  
pp. 102499
Author(s):  
Zheng-Fei Yang ◽  
Ran Xiao ◽  
Fei-Jun Luo ◽  
Qin-Lu Lin ◽  
Defang Ouyang ◽  
...  

2016 ◽  
Vol 8 (1) ◽  
Author(s):  
Jakub Galgonek ◽  
Tomáš Hurt ◽  
Vendula Michlíková ◽  
Petr Onderka ◽  
Jan Schwarz ◽  
...  
