A heuristic method for simulating open-data of arbitrary complexity that can be used to compare and evaluate machine learning methods

Author(s):  
Jason H. Moore ◽  
Maksim Shestov ◽  
Peter Schmitt ◽  
Randal S. Olson
2020 ◽  
Author(s):  
Edwin Tse ◽  
Laksh Aithani ◽  
Mark Anderson ◽  
Jonathan Cardoso-Silva ◽  
Giovanni Cincilla ◽  
...  

<p>The discovery of new antimalarial medicines with novel mechanisms of action is key to combating the problem of increasing resistance to our frontline treatments. The Open Source Malaria (OSM) consortium has been developing compounds ("Series 4") that have potent activity against <i>Plasmodium falciparum</i> <i>in vitro</i> and <i>in vivo</i> and that have been suggested to act through the inhibition of <i>Pf</i>ATP4, an essential membrane ion pump that regulates the parasite’s intracellular Na<sup>+</sup> concentration. The structure of <i>Pf</i>ATP4 is yet to be determined. In the absence of structural information about this target, a public competition was created to develop a model that would allow the prediction of anti-<i>Pf</i>ATP4 activity among Series 4 compounds, thereby reducing the project costs associated with synthesising inactive compounds.</p>In the first round, in 2016, six participants used the open data collated by OSM to develop moderately predictive models using diverse methods. Notably, all submitted models were available to all other participants in real time. Since then, further bioactivity data have been acquired and machine learning methods have developed rapidly, so a second round of the competition was held in 2019, again with freely donated models that other participants could see. The best-performing models from this second round were used to predict novel inhibitory molecules, several of which were synthesised and evaluated against the parasite. One such compound, containing a motif that chemists familiar with this series would have dismissed as ill-advised, was active. The project demonstrated the ability of new machine learning methods to predict active compounds where there is no biological target structure, frequently the central problem in phenotypic drug discovery.
Since all data and participant interactions remain in the public domain, this research project “lives” and may be improved by others.

