Prediction of Drug-target interactions from heterogeneous information network based on LINE embedding model

Abstract Background: The prediction of potential drug-protein target interactions (DTIs) not only provides a better comprehension of biological processes but also is critical for identifying new drugs. However, due to the disadvantages of costly and high time-consuming traditional experiments, only a small section of interactions between drugs and targets in the database is verified experimentally. Therefore, it is meaningful and important to develop new computational methods with good performance for predicting DTIs. At present, many existing computational methods only utilize a single type of molecule without paying attention to the interactions and influences between other types of molecules. Methods: In this work, we developed a novel network embedding-based heterogeneous information integration model to predict potential DTIs. Firstly, a heterogeneous information network is built by combining the known associations among protein, drug, lncRNA, disease, and miRNA. Secondly, the Large-scale Information Network Embedding (LINE) model is used to learn behavior information of nodes in the network. Hence, the known drug-protein interaction pairs can be represented as a combination of attribute information (e.g. protein sequences information and drug molecular fingerprints) and behavior information of themselves. Thirdly, the Random Forest classifier is used for training and predicting. Results: In the results, under the 5-fold cross validation, our method obtained 85.83% prediction accuracy with 80.47% sensitivity at the AUC of 92.33%. Moreover, in the case studies of three common drugs, the top 10 candidate targets have 8 (Caffeine), 7 (Clozapine) and 6 (Pioglitazone) are respectively verified to be associated with corresponding drugs. Conclusions: In short, these results indicate that our method can be a powerful tool for predicting drug-protein interactions and finding unknown targets for certain drugs or unknown drugs for certain targets.

Download Full-text

Prediction of Drug-target interactions from heterogeneous information network based on LINE embedding model

10.21203/rs.3.rs-16492/v2 ◽

2020 ◽

Author(s):

Bo-Ya Ji ◽

Zhu-Hong You ◽

Han-Jing Jiang ◽

Zhen-Hao Guo ◽

Kai Zheng

Keyword(s):

Computational Methods ◽

Drug Target ◽

Large Scale ◽

New Drugs ◽

Single Type ◽

Information Network ◽

Potential Drug ◽

Network Embedding ◽

Heterogeneous Information Network ◽

Heterogeneous Information

Abstract Background: The prediction of potential drug-protein target interactions (DTIs) not only provides a better comprehension of biological processes but also is critical for identifying new drugs. However, due to the disadvantages of expensive and high time-consuming traditional experiments, only a small section of interactions between drugs and targets in the database were verified experimentally. Therefore, it is meaningful and important to develop new computational methods with good performance for DTIs prediction. At present, many existing computational methods only utilize the single type of interactions between drugs and proteins without paying attention to the associations and influences with other types of molecules. Methods: In this work, we developed a novel network embedding-based heterogeneous information integration model to predict potential drug-target interactions. Firstly, a heterogeneous information network is built by combining the known associations among protein, drug, lncRNA, disease, and miRNA. Secondly, the Large-scale Information Network Embedding (LINE) model is used to learn behavior information (associations with other nodes) of drugs and proteins in the network. Hence, the known drug-protein interaction pairs can be represented as a combination of attribute information (e.g. protein sequences information and drug molecular fingerprints) and behavior information of themselves. Thirdly, the Random Forest classifier is used for training and prediction. Results: In the results, under the 5-fold cross validation, our method obtained 85.83% prediction accuracy with 80.47% sensitivity at the AUC of 92.33%. Moreover, in the case studies of three common drugs, the top 10 candidate targets have 8 (Caffeine), 7 (Clozapine) and 6 (Pioglitazone) are respectively verified to be associated with corresponding drugs. Conclusions: In short, these results indicate that our method can be a powerful tool for predicting potential drug-protein interactions and finding unknown targets for certain drugs or unknown drugs for certain targets.

Download Full-text

Prediction of drug-target interactions from multi-molecular network based on LINE network representation method

Journal of Translational Medicine ◽

10.1186/s12967-020-02490-x ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Bo-Ya Ji ◽

Zhu-Hong You ◽

Han-Jing Jiang ◽

Zhen-Hao Guo ◽

Kai Zheng

Keyword(s):

Computational Methods ◽

Information Integration ◽

Drug Target ◽

Large Scale ◽

New Drugs ◽

Single Type ◽

Information Network ◽

Potential Drug Target ◽

Potential Drug ◽

Network Embedding

Abstract Background The prediction of potential drug-target interactions (DTIs) not only provides a better comprehension of biological processes but also is critical for identifying new drugs. However, due to the disadvantages of expensive and high time-consuming traditional experiments, only a small section of interactions between drugs and targets in the database were verified experimentally. Therefore, it is meaningful and important to develop new computational methods with good performance for DTIs prediction. At present, many existing computational methods only utilize the single type of interactions between drugs and proteins without paying attention to the associations and influences with other types of molecules. Methods In this work, we developed a novel network embedding-based heterogeneous information integration model to predict potential drug-target interactions. Firstly, a heterogeneous multi-molecuar information network is built by combining the known associations among protein, drug, lncRNA, disease, and miRNA. Secondly, the Large-scale Information Network Embedding (LINE) model is used to learn behavior information (associations with other nodes) of drugs and proteins in the network. Hence, the known drug-protein interaction pairs can be represented as a combination of attribute information (e.g. protein sequences information and drug molecular fingerprints) and behavior information of themselves. Thirdly, the Random Forest classifier is used for training and prediction. Results In the results, under the five-fold cross validation, our method obtained 85.83% prediction accuracy with 80.47% sensitivity at the AUC of 92.33%. Moreover, in the case studies of three common drugs, the top 10 candidate targets have 8 (Caffeine), 7 (Clozapine) and 6 (Pioglitazone) are respectively verified to be associated with corresponding drugs. Conclusions In short, these results indicate that our method can be a powerful tool for predicting potential drug-target interactions and finding unknown targets for certain drugs or unknown drugs for certain targets.

Download Full-text

W-MMP2Vec: Topic-driven network embedding model for link prediction in content-based heterogeneous information network

Intelligent Data Analysis ◽

10.3233/ida-205168 ◽

2021 ◽

Vol 25 (3) ◽

pp. 711-738

Author(s):

Phu Pham ◽

Phuc Do

Keyword(s):

Link Prediction ◽

Representation Learning ◽

Information Network ◽

Network Embedding ◽

Heterogeneous Information Network ◽

Heterogeneous Information ◽

Learning Framework ◽

Novel Approach ◽

Proposed Model ◽

Meta Path

Link prediction on heterogeneous information network (HIN) is considered as a challenge problem due to the complexity and diversity in types of nodes and links. Currently, there are remained challenges of meta-path-based link prediction in HIN. Previous works of link prediction in HIN via network embedding approach are mainly focused on exploiting features of node rather than existing relations in forms of meta-paths between nodes. In fact, predicting the existence of new links between non-linked nodes is absolutely inconvincible. Moreover, recent HIN-based embedding models also lack of thorough evaluations on the topic similarity between text-based nodes along given meta-paths. To tackle these challenges, in this paper, we proposed a novel approach of topic-driven multiple meta-path-based HIN representation learning framework, namely W-MMP2Vec. Our model leverages the quality of node representations by combining multiple meta-paths as well as calculating the topic similarity weight for each meta-path during the processes of network embedding learning in content-based HINs. To validate our approach, we apply W-TMP2Vec model in solving several link prediction tasks in both content-based and non-content-based HINs (DBLP, IMDB and BlogCatalog). The experimental outputs demonstrate the effectiveness of proposed model which outperforms recent state-of-the-art HIN representation learning models.

Download Full-text