tree edit distance
Recently Published Documents


Total documents: 89 (last five years: 12)
H-index: 16 (last five years: 1)

Author(s): Samuel Genheden, Ola Engkvist, Esben Jannik Bjerrum

We expand the recent work on clustering of synthetic routes and train a deep learning model to predict the distances between arbitrary routes. The model is based on a long short-term memory (LSTM) representation of a synthetic route and is trained as a twin network to reproduce the tree edit distance (TED) between two routes. The ML approach is approximately two orders of magnitude faster than the TED approach and makes it possible to cluster many more routes from a retrosynthesis route prediction. The clusters are highly similar to those given by the TED-based approach and are accordingly intuitive and explainable. We provide the developed model as open source.
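The twin-network idea in this abstract can be sketched in a few lines. This is an illustration under loud assumptions, not the authors' model: a shared linear encoder stands in for the LSTM, the route feature vectors and target distances are made up, and the predicted distance is taken to be the squared Euclidean distance between the two encodings to keep the gradient short.

```python
# Toy twin (Siamese) regressor: one shared encoder, two inputs, one distance.
# ASSUMPTIONS: linear encoder instead of an LSTM; hypothetical features/targets.

def encode(W, x):
    """Shared encoder: both branches of the twin use the same weights W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def predict(W, a, b):
    """Predicted distance: squared Euclidean distance between the encodings."""
    ea, eb = encode(W, a), encode(W, b)
    return sum((p - q) ** 2 for p, q in zip(ea, eb))

def mean_loss(W, pairs):
    """Mean squared error between predicted and target distances."""
    return sum((predict(W, a, b) - t) ** 2 for a, b, t in pairs) / len(pairs)

def train(W, pairs, lr=0.01, epochs=500):
    """Full-batch gradient descent on the mean squared error."""
    for _ in range(epochs):
        grad = [[0.0] * len(W[0]) for _ in W]
        for a, b, t in pairs:
            delta = [p - q for p, q in zip(a, b)]
            u = encode(W, delta)  # linear encoder, so f(a) - f(b) = W(a - b)
            err = sum(v * v for v in u) - t
            for i, ui in enumerate(u):
                for j, dj in enumerate(delta):
                    grad[i][j] += 4.0 * err * ui * dj / len(pairs)
        W = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W, grad)]
    return W

# Hypothetical route features and "TED" targets for each pair of routes.
FEATURES = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
PAIRS = [
    (FEATURES[0], FEATURES[1], 2.0),
    (FEATURES[0], FEATURES[2], 1.0),
    (FEATURES[1], FEATURES[2], 1.0),
]
W0 = [[0.5, 0.0], [0.0, 0.5]]
W_TRAINED = train(W0, PAIRS)
```

Because both inputs pass through the same weights, the prediction is symmetric in its arguments and zero for identical routes. Once trained, each route needs to be encoded only once, which is where a speed-up over pairwise TED computation would come from.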


2021
Author(s): Samuel Genheden, Ola Engkvist, Esben Jannik Bjerrum

We expand our recent work on clustering of synthesis routes and train a deep learning model to predict the distances between arbitrary routes. The model is based on a long short-term memory (LSTM) representation of a synthesis route and is trained as a twin network to reproduce the tree edit distance (TED) between two routes. The ML approach is approximately two orders of magnitude faster than the TED approach and makes it possible to cluster many more routes from a retrosynthesis route prediction. The clusters are highly similar to those given by the TED-based approach and are accordingly intuitive and explainable. We provide the developed model as open source (https://github.com/MolecularAI/route-distances).


2020
Author(s): Samuel Genheden, Ola Engkvist, Esben Jannik Bjerrum

We present a novel algorithm to compute the distance between synthesis routes based on a tree edit distance calculation. Such distances can be used to cluster synthesis routes from a retrosynthesis prediction tool. We show that clustering the routes from a retrosynthesis analysis takes less than ten seconds on average and constitutes only seven percent of the total time (prediction + clustering). Furthermore, we show that representative routes from each cluster can be used to reduce the set of predicted routes. Finally, we show with a number of examples that the algorithm gives intuitive clusters that can be easily rationalized. The algorithm is included in the latest version of the open-source AiZynthFinder software.
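To illustrate how pairwise route distances can drive clustering and the choice of representative routes, here is a small sketch. It assumes a precomputed distance matrix (the values below are hypothetical), groups routes by single linkage under a distance threshold, and picks the medoid of each cluster as its representative. These choices are assumptions for illustration; the clustering method used in AiZynthFinder may differ.

```python
# Cluster routes from a distance matrix and pick one representative each.
# ASSUMPTIONS: single-linkage threshold clustering; hypothetical distances.

def cluster_by_threshold(dist, threshold):
    """Put routes i and j in one cluster when a chain of pairs, each no
    farther apart than `threshold`, connects them (single linkage)."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

def medoid(cluster, dist):
    """Route with the smallest total distance to the rest of its cluster."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))

# Hypothetical symmetric distances between five predicted routes.
DIST = [
    [0, 1, 2, 8, 9],
    [1, 0, 1, 7, 8],
    [2, 1, 0, 9, 7],
    [8, 7, 9, 0, 2],
    [9, 8, 7, 2, 0],
]
CLUSTERS = cluster_by_threshold(DIST, threshold=2)
REPRESENTATIVES = [medoid(c, DIST) for c in CLUSTERS]
```

Keeping only the representatives reduces the five routes to two, which is the kind of reduction of the predicted route set the abstract describes.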


2020, Vol 18 (4), pp. 31-50
Author(s): Vinay Vachharajani, Jyoti Pareek

The demand for higher education keeps increasing. Information technology and e-learning have, to a large extent, solved the problem of the shortage of skilled and qualified teachers, but this alone does not guarantee high-quality learning. With large numbers of students, delivering learning materials and tests has become easy, since they can be uploaded to the web, yet assessment can still be tedious. Tools and technologies for fully automated assessment are therefore needed. In this paper, an innovative algorithm is proposed for matching the structures of two use-case diagrams, one drawn by a student and one by an expert, so that the student's diagram can be assessed automatically. Zhang and Shasha's tree edit distance algorithm is extended to assess use-case diagrams. Results from 445 students' answers based on 14 different scenarios are analyzed to evaluate the performance of the proposed algorithm. No comparable study has been reported for any other diagram-assessment algorithm in the research literature.
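For readers unfamiliar with the quantity involved, the sketch below computes the ordered tree edit distance with unit costs, the same distance that Zhang and Shasha's algorithm computes. It uses a plain memoised forest recursion rather than their keyroot dynamic programme, and it does not include the paper's extension for use-case diagrams. The diagram labels are hypothetical.

```python
from functools import lru_cache

# Ordered tree edit distance with unit costs (insert, delete, relabel).
# ASSUMPTION: simple memoised recursion, not Zhang and Shasha's keyroot DP.

def tree(label, *children):
    """A tree is a hashable (label, children) pair so it can be memoised."""
    return (label, tuple(children))

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0
    if not f:
        _, kids = g[-1]
        return forest_distance(f, g[:-1] + kids) + 1  # insert rightmost root
    if not g:
        _, kids = f[-1]
        return forest_distance(f[:-1] + kids, g) + 1  # delete rightmost root
    (label_f, kids_f), (label_g, kids_g) = f[-1], g[-1]
    return min(
        # Delete the rightmost root of f; its children take its place.
        forest_distance(f[:-1] + kids_f, g) + 1,
        # Insert the rightmost root of g.
        forest_distance(f, g[:-1] + kids_g) + 1,
        # Match the two rightmost roots, relabelling if they differ.
        forest_distance(kids_f, kids_g)
        + forest_distance(f[:-1], g[:-1])
        + (label_f != label_g),
    )

def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))

# Hypothetical expert and student diagrams reduced to labelled trees.
EXPERT = tree("System", tree("Login"), tree("Checkout", tree("Pay")))
STUDENT = tree("System", tree("Login"), tree("Pay"))
DISTANCE = tree_edit_distance(EXPERT, STUDENT)
```

Here the student's tree differs from the expert's by one missing intermediate node, so a single deletion turns one into the other. A grading scheme could map such a distance to a score, though how the paper does this is not stated in the abstract.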


2020, Vol 16 (4), pp. 1-22
Author(s): Karl Bringmann, Paweł Gawrychowski, Shay Mozes, Oren Weimann
