Background:
Drug development requires a lot of money and time, and the outcome of the challenge is unknown. So, there is an urgent need for researchers to find a new approach that can reduce costs. Therefore, the identification of drug-target interactions (DTIs) has been a critical step in the early stages of drug discovery. These computational methods aim to narrow the search space for novel DTIs and to elucidate the functional background of drugs. Most of the methods developed so far use binary classification to predict the presence or absence of interactions between the drug and the target. However, it is more informative, but also more challenging, to predict the strength of the binding between a drug and its target. If the strength is not strong enough, such a DTI may not be useful. Hence, the development of methods to predict drug-target affinity (DTA) is of significant importance.
Method:
We have improved the Graph DTA model from a dual-channel model to a triple-channel model. We interpreted the target/protein sequences as time series and extracted their features using the LSTM network. For the drug, we considered both the molecular structure and the local chemical background, retaining the four variant networks used in Graph DTA to extract the topological features of the drug and capturing the local chemical background of the atoms in the drug by using BiGRU. Thus, we obtained the latent features of the target and two latent features of the drug. The connection of these three feature vectors is then input into a 2-layer FC network, and a valuable binding affinity is output.
Result:
We use the Davis and Kiba datasets, using 80% of the data for training and 20% of the data for validation. Our model shows better performance by comparing it with the experimental results of Graph DTA.
Conclusion:
In this paper, we altered the Graph DTA model to predict drug-target affinity. It represents the drug as a graph, and extracts the two-dimensional drug information using a graph convolutional neural network. Simultaneously, the drug and protein targets are represented as a word vector, and the convolutional neural network is used to extract the time series information of the drug and the target. We demonstrate that our improved method has better performance than the original method. In particular, our model has better performance in the evaluation of benchmark databases.