<p>Several recent reports have shown that long short-term memory (LSTM)
generative neural networks of the type used for grammar learning efficiently learn
to write SMILES of drug-like compounds when trained with SMILES from a database
of bioactive compounds such as ChEMBL, and can later produce focused sets upon
transfer learning with compounds of specific bioactivity profiles. Here we
trained an LSTM using molecules taken from ChEMBL, DrugBank, commercially
available fragments, or FDB-17 (a database of fragments of up to 17 atoms) and
performed transfer learning to a single known drug to obtain new analogs of
this drug. We found that this approach readily generates hundreds of relevant and
diverse new drug analogs and works best with training sets of around 40,000
compounds as simple as commercial fragments. These data suggest that
fragment-based LSTMs offer a promising method for new molecule generation.</p>