atom pair
Recently Published Documents


TOTAL DOCUMENTS

117
(FIVE YEARS 20)

H-INDEX

25
(FIVE YEARS 2)

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Alice Capecchi ◽  
Jean-Louis Reymond

Abstract: Natural products (NPs) represent one of the most important resources for discovering new drugs. Here we asked whether NP origin can be assigned from their molecular structure in a subset of 60,171 NPs in the recently reported Collection of Open Natural Products (COCONUT) database assigned to plants, fungi, or bacteria. Visualizing this subset in an interactive tree-map (TMAP) calculated using MAP4 (MinHashed atom pair fingerprint) clustered NPs according to their assigned origin (https://tm.gdb.tools/map4/coconut_tmap/), and a support vector machine (SVM) trained with MAP4 correctly assigned the origin for 94% of plant, 89% of fungal, and 89% of bacterial NPs in this subset. An online tool based on an SVM trained with the entire subset correctly assigned the origin of further NPs with similar performance (https://np-svm-map4.gdb.tools/). Origin information might be useful when searching for biosynthetic genes of NPs isolated from plants but produced by endophytic microorganisms.




2021 ◽  
Vol 271 ◽  
pp. 115247
Author(s):  
Debashis Roy ◽  
Md Kamal Hossain ◽  
Syed Mahedi Hasan ◽  
Shamima Khanom ◽  
Md. Abul Hossain ◽  
...  

2021 ◽  
Vol 140 (5) ◽  
Author(s):  
Saurabh Khodia ◽  
Shouvik Halder ◽  
Saibalendu Sarkar ◽  
Surajit Maity

2021 ◽  
Author(s):  
Alla P. Toropova ◽  
Andrey A. Toropov ◽  
Emilio Benfenati

Abstract: Atom-pair proportions are a transparent property of a molecule: if a molecule has two oxygen atoms and three nitrogen atoms, the atom pair atom1-atom2 can be expressed as a code 'atom1-atom2-n1-n2', indicating the different atoms and their numbers. These codes for a group of atoms (nitrogen, oxygen, sulfur, phosphorus, fluorine, chlorine, bromine, as well as double and triple covalent bonds) are used to build the so-called optimal molecular descriptor, calculated with special coefficients named correlation weights of the corresponding pairs. The numerical values of the correlation weights are calculated by the Monte Carlo technique using the CORAL software (http://www.insilico.eu/coral). The one-variable model for melting points of 8653 various organic compounds is characterized by the following statistical quality: n = 6483, r² = 0.6452, RMSE = 61.9 °C; n = 2170, r² = 0.7941, RMSE = 39.2 °C.
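The coding scheme in this abstract can be illustrated with a minimal Python sketch. It is not the CORAL implementation: the feature symbols, the fixed ordering of features, the rule that a code is emitted only when both features are present, and the correlation-weight value are all assumptions made for illustration.

```python
from itertools import combinations

# Features listed in the abstract; "=" and "#" are this sketch's own labels
# for double and triple bonds (an assumption, not CORAL notation).
FEATURES = ["N", "O", "S", "P", "F", "Cl", "Br", "=", "#"]


def atom_pair_codes(counts):
    """Return 'atom1-atom2-n1-n2' codes for every pair of features present.

    `counts` maps a feature symbol to how often it occurs in the molecule.
    Pairs are emitted in the fixed FEATURES order (an assumption).
    """
    present = [f for f in FEATURES if counts.get(f, 0) > 0]
    return [
        f"{a}-{b}-{counts[a]}-{counts[b]}"
        for a, b in combinations(present, 2)
    ]


def descriptor(codes, weights):
    """Sum the correlation weights of the codes; unknown codes contribute 0."""
    return sum(weights.get(code, 0.0) for code in codes)


# The abstract's example: two oxygen atoms and three nitrogen atoms.
counts = {"O": 2, "N": 3}
codes = atom_pair_codes(counts)

# Hypothetical correlation weight; real values would come from the
# Monte Carlo optimization described in the abstract.
weights = {"N-O-3-2": 0.5}
print(codes, descriptor(codes, weights))
```

In this sketch the descriptor is simply the sum of the weights of the codes found in the molecule, which matches the abstract's description of a descriptor "calculated with" correlation weights only in outline.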


2020 ◽  
Vol 11 (15) ◽  
pp. 6320-6329 ◽  
Author(s):  
Ting Deng ◽  
Chao Cen ◽  
Hujun Shen ◽  
Shuyi Wang ◽  
Jingdong Guo ◽  
...  

2020 ◽  
Author(s):  
Alice Capecchi ◽  
Daniel Probst ◽  
Jean-Louis Reymond

Background: Molecular fingerprints are essential cheminformatics tools for virtual screening and mapping chemical space. Among the different types of fingerprints, substructure fingerprints perform best for small molecules such as drugs, while atom-pair fingerprints are preferable for large molecules such as peptides. However, no available fingerprint achieves good performance on both classes of molecules.

Results: Here we set out to design a new fingerprint suitable for both small and large molecules by combining substructure and atom-pair concepts. Our quest resulted in a new fingerprint called MinHashed atom-pair fingerprint up to a diameter of four bonds (MAP4). In this fingerprint the circular substructures with radii of r = 1 and r = 2 bonds around each atom in an atom-pair are written as two pairs of SMILES, each pair being combined with the topological distance separating the two central atoms. These so-called atom-pair molecular shingles are hashed, and the resulting set of hashes is MinHashed to form the MAP4 fingerprint. MAP4 significantly outperforms all other fingerprints on an extended benchmark that combines the Riniker and Landrum small molecule benchmark with a peptide benchmark recovering BLAST analogs from either scrambled or point mutation analogs. MAP4 furthermore produces well-organized chemical space tree-maps (TMAPs) for databases as diverse as DrugBank, ChEMBL, SwissProt and the Human Metabolome Database (HMDB), and differentiates between all metabolites in HMDB, over 70% of which are indistinguishable from their nearest neighbor using substructure fingerprints.

Conclusion: MAP4 is a new molecular fingerprint suitable for drugs, biomolecules, and the metabolome and can be adopted as a universal fingerprint to describe and search chemical space.

The source code is available at https://github.com/reymond-group/map4 and interactive MAP4 similarity search tools and TMAPs for various databases are accessible at http://map-search.gdb.tools/ and http://tm.gdb.tools/map4/.
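The last step described in this abstract, hashing the atom-pair shingles and MinHashing the resulting set, can be sketched in plain Python. This is a generic MinHash illustration, not the code from the MAP4 repository: the shingle strings are hand-written placeholders rather than output computed from a real molecule, and the "SMILES|distance|SMILES" layout, the SHA-1 based hash, the permutation scheme, and the 128-value fingerprint length are all assumptions.

```python
import hashlib
import random

MERSENNE_PRIME = (1 << 61) - 1
MAX_HASH = (1 << 32) - 1


def shingle_hash(shingle):
    """Map a shingle string to a 32-bit integer (SHA-1 is an arbitrary choice)."""
    digest = hashlib.sha1(shingle.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little")


def make_permutations(n, seed=42):
    """Create n random (a, b) pairs for hash functions h -> (a*h + b) mod p."""
    rng = random.Random(seed)
    return [
        (rng.randrange(1, MERSENNE_PRIME), rng.randrange(0, MERSENNE_PRIME))
        for _ in range(n)
    ]


def minhash(shingles, perms):
    """Return the MinHash fingerprint: one minimum per hash function."""
    hashes = [shingle_hash(s) for s in shingles]
    if not hashes:
        raise ValueError("at least one shingle is required")
    return [
        min(((a * h + b) % MERSENNE_PRIME) & MAX_HASH for h in hashes)
        for a, b in perms
    ]


def estimated_jaccard(fp1, fp2):
    """Fraction of matching positions, which estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(fp1, fp2)) / len(fp1)


# Placeholder shingles in an assumed "SMILES|distance|SMILES" layout.
shingles_a = {"CN|1|CC", "C(C)N|1|CC(C)=O", "CN|2|C=O", "C(C)N|2|C(C)(=O)O"}
shingles_b = {"CN|1|CC", "C(C)N|1|CC(C)=O", "CN|3|CO", "C(C)N|3|C(C)O"}

perms = make_permutations(128)
fp_a = minhash(shingles_a, perms)
fp_b = minhash(shingles_b, perms)
print(estimated_jaccard(fp_a, fp_b))
```

Because each fingerprint position keeps only the minimum over the whole shingle set, the result does not depend on the order in which shingles are enumerated, and two molecules sharing many shingles tend to agree in many positions.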



