Benchmarking Molecular Feature Attribution Methods with Activity Cliffs

Author(s):  
José Jiménez-Luna ◽  
Miha Skalic ◽  
Nils Weskamp
2021 ◽  

Feature attribution techniques are popular choices within the explainable artificial intelligence toolbox, as they can help elucidate which parts of the inputs to a supervised-learning method are considered relevant for a specific prediction. In the context of molecular design, these approaches typically involve the coloring of molecular graphs, whose presentation to medicinal chemists can help them decide which compounds to synthesize or prioritize. Consistency between the highlighted moieties and expert background knowledge is expected to contribute to the understanding of machine-learning models in drug design. Quantitative evaluation of such coloring approaches, however, has so far been limited to substructure identification tasks. Here we present an approach based on maximum common substructure algorithms applied to experimentally determined activity cliffs. Using the proposed benchmark, we found that molecule coloring approaches in conjunction with classical machine-learning models tend to outperform more modern, deep-learning-based alternatives. However, none of the tested feature attribution methods sufficiently and consistently generalized when confronted with unseen examples.


Spine ◽  
2017 ◽  
Vol 42 (5) ◽  
pp. 291-297 ◽  
Author(s):  
Shixin Gu ◽  
Wentao Gu ◽  
Jiajun Shou ◽  
Ji Xiong ◽  
Xiaodong Liu ◽  
...  

2019 ◽  
Author(s):  
Juan C. Villada ◽  
Maria F. Duran ◽  
Patrick K. H. Lee

Codon usage bias exerts control over a wide variety of molecular processes. The positioning of synonymous codons within coding sequences (CDSs) dictates protein expression by mechanisms such as local translation efficiency, mRNA Gibbs free energy, and protein co-translational folding. In this work, we explore how codon variants affect the position-dependent content of hydrogen bonding, which in turn influences energy requirements for unwinding double-stranded DNA. By analyzing over 14,000 bacterial, archaeal, and fungal ORFeomes, we found that Bacteria and Archaea exhibit an exponential ramp of hydrogen bonding at the 5′-end of CDSs, while a similar ramp was not found in Fungi. The ramp develops within the first 20 codon positions in prokaryotes, eventually reaching a steady carrying capacity of hydrogen bonding that does not differ from Fungi. Selection against uniformity tests proved that selection acts against synonymous codons with high content of hydrogen bonding at the 5′-end of prokaryotic ORFeomes. Overall, this study provides novel insights into the molecular feature of hydrogen bonding that is governed by the genetic code at the 5′-end of CDSs. A web-based application to analyze the position-dependent hydrogen bonding of ORFeomes has been developed and is publicly available (https://juanvillada.shinyapps.io/hbonds/).
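As a rough illustration of what a position-dependent hydrogen-bonding profile could look like, the following minimal sketch averages the number of Watson-Crick hydrogen bonds per codon at each codon position across a set of coding sequences. It is not the authors' pipeline or the code behind their web application: the function names `codon_h_bonds` and `positional_h_bond_profile`, the `max_codons` parameter, and the toy input sequences are all hypothetical. The only fact relied on is that A-T base pairs form two hydrogen bonds and G-C base pairs form three.

```python
from typing import List

# Watson-Crick hydrogen bonds per base pair: A-T pairs form 2, G-C pairs form 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def codon_h_bonds(codon: str) -> int:
    """Total hydrogen bonds contributed by the three base pairs of one codon.

    Raises KeyError on ambiguous bases such as 'N' (an assumption of this
    sketch; a real pipeline would need an explicit policy for them).
    """
    return sum(H_BONDS[base] for base in codon.upper())


def positional_h_bond_profile(cds_list: List[str], max_codons: int = 20) -> List[float]:
    """Mean hydrogen bonds per codon at each codon position (5' end first).

    Hypothetical helper: only the first `max_codons` codons of each CDS are
    considered, and positions that no sequence reaches are omitted.
    """
    totals = [0] * max_codons
    counts = [0] * max_codons
    for cds in cds_list:
        n_codons = min(len(cds) // 3, max_codons)
        for i in range(n_codons):
            totals[i] += codon_h_bonds(cds[3 * i:3 * i + 3])
            counts[i] += 1
    # Counts never increase with position, so dropping empty positions
    # only trims the tail and keeps indices aligned with codon positions.
    return [t / c for t, c in zip(totals, counts) if c > 0]


if __name__ == "__main__":
    # Toy sequences for illustration only, not real ORFeome data.
    toy_cds = ["ATGGCC", "ATGAAA"]
    print(positional_h_bond_profile(toy_cds, max_codons=3))
```

Plotting such a profile against codon position for many CDSs is one simple way to look for a ramp of the kind the abstract describes, although the statistical tests mentioned there are outside the scope of this sketch.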

