<p>Microbial natural
products (NPs) are an important source of drugs. However, their structural
diversity remains poorly understood. Here we used our recently reported MinHashed
Atom Pair fingerprint with a diameter of four bonds (MAP4), a fingerprint suitable
for molecules of very different sizes, to analyze the Natural Products Atlas (NPAtlas), a
database of 25,523 NPs of bacterial or fungal origin downloaded from <a href="https://www.npatlas.org/joomla/">https://www.npatlas.org/joomla/</a>.
To visualize NPAtlas by MAP4 similarity, we used the dimensionality reduction
method tree map (TMAP) (<a href="http://tmap.gdb.tools/">http://tmap.gdb.tools</a>).
The resulting interactive map (<a href="https://tm.gdb.tools/map4/npatlas_map_tmap/">https://tm.gdb.tools/map4/npatlas_map_tmap/</a>)
organizes molecules by physico-chemical properties and compound families such
as peptides, glycosides, polyphenols or terpenoids. Remarkably, the map separates
bacterial and fungal NPs from one another, revealing that these two compound
families are intrinsically different despite their related biosynthetic
pathways. We used these differences to train a machine learning model capable
of distinguishing between NPs of bacterial and fungal origin.</p>
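<p>The MinHash idea behind a fingerprint such as MAP4 can be illustrated with a minimal sketch. It is not the MAP4 implementation: the feature strings below are hypothetical stand-ins for atom-pair shingles, and the function names, the salted BLAKE2b hashing scheme and the signature length are all assumptions chosen for illustration. The sketch only shows how the fraction of matching MinHash positions approximates the Jaccard similarity between two feature sets.</p>

```python
import hashlib


def minhash_signature(features, num_perm=256):
    """Return a MinHash signature for a non-empty set of feature strings.

    For each of num_perm salted hash functions, keep the smallest hash
    value observed over all features.
    """
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")  # one salt per hash function
        smallest = min(
            int.from_bytes(
                hashlib.blake2b(f.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for f in features
        )
        signature.append(smallest)
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions where both signatures agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, used here only as a reference value."""
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical feature strings of the form "environment|distance|environment".
# They are toy examples, not output of any real fingerprint software.
mol_a = {"C|1|C", "C|2|O", "C|3|N", "O|1|C", "N|2|C", "C|4|C"}
mol_b = {"C|1|C", "C|2|O", "C|3|N", "O|1|C", "S|2|C", "C|5|C"}

sig_a = minhash_signature(mol_a)
sig_b = minhash_signature(mol_b)

print("exact Jaccard:    ", exact_jaccard(mol_a, mol_b))
print("estimated Jaccard:", estimated_jaccard(sig_a, sig_b))
```

<p>Because the signatures have a fixed length regardless of how many features a molecule produces, this kind of comparison applies equally to small and large molecules. The estimate becomes more accurate as the signature length grows.</p>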