TransformerCPI: improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments

2020 ◽  
Vol 36 (16) ◽  
pp. 4406-4414 ◽  
Author(s):  
Lifan Chen ◽  
Xiaoqin Tan ◽  
Dingyan Wang ◽  
Feisheng Zhong ◽  
Xiaohong Liu ◽  
...  

Abstract

Motivation: Identifying compound–protein interactions (CPIs) is a crucial task in drug discovery and chemogenomics studies. Proteins without three-dimensional structures account for a large share of potential biological targets, which calls for methods that predict CPI from protein sequence information alone. However, sequence-based CPI models face specific pitfalls, including inappropriate datasets, hidden ligand bias and inappropriate dataset splitting, all of which lead to overestimated prediction performance.

Results: To address these issues, we constructed new datasets specific to CPI prediction, proposed a novel transformer neural network named TransformerCPI, and introduced a more rigorous label reversal experiment to test whether a model learns true interaction features. TransformerCPI achieved much improved performance on the new experiments, and it can be deconvolved to highlight important interacting regions of protein sequences and compound atoms, which may provide chemical biology studies with useful guidance for further ligand structural optimization.

Availability and implementation: https://github.com/lifanchen-simm/transformerCPI.
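The abstract names the label reversal experiment without spelling it out. As we understand the idea, a compound appears with only one interaction label in the training set and with only the opposite label in the test set, so a model that merely memorizes ligands should fail. The sketch below is one possible way to build such a split and is not taken from the TransformerCPI repository: the function name `label_reversal_split`, the toy identifiers, and the choice to keep single-label compounds in the training set are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def label_reversal_split(pairs, seed=0):
    """Split (compound, protein, label) triples into train and test lists.

    Every compound that ends up in the test list carries the opposite
    label from the one it carries in the training list.
    This is an illustrative sketch, not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group the triples by compound, then by label (0 = negative, 1 = positive).
    by_compound = defaultdict(lambda: {0: [], 1: []})
    for compound, protein, label in pairs:
        by_compound[compound][label].append((compound, protein, label))

    train, test = [], []
    for compound in sorted(by_compound):
        groups = by_compound[compound]
        if groups[0] and groups[1]:
            # Compound observed with both labels: train on one class and
            # hold out the opposite class for testing.
            train_label = rng.choice([0, 1])
            train.extend(groups[train_label])
            test.extend(groups[1 - train_label])
        else:
            # Assumption: compounds seen with a single label cannot be
            # reversed, so they stay in the training set.
            train.extend(groups[0] + groups[1])
    return train, test


# Toy data with hypothetical compound and protein identifiers.
toy_pairs = [
    ("cmpd1", "protA", 1),
    ("cmpd1", "protB", 0),
    ("cmpd2", "protA", 1),
    ("cmpd3", "protC", 0),
    ("cmpd3", "protD", 1),
]
train_pairs, test_pairs = label_reversal_split(toy_pairs, seed=0)
print(len(train_pairs), "training pairs,", len(test_pairs), "test pairs")
```

Splitting per compound rather than per pair is what removes the ligand shortcut: the label a compound had during training is never the right answer for that compound at test time.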

2010 ◽  
Vol 20 (1) ◽  
pp. 37-45
Author(s):  
Mohammad Shoyaib ◽  
M. Abdullah-Al-Wadud ◽  
Syed Murtuza Baker ◽  
Mohammad Nurul Islam ◽  
Oksam Chae

An improved computational approach is presented that implements a protein-protein interaction prediction system based on the sequence information of a protein. A Support Vector Machine (SVM) is trained with this sequence information to predict the interactions. This technique achieves 79.81% accuracy over a wide range of data, which is a significant improvement over other conventional computational protein-protein interaction prediction methods.

Key words: Protein-protein interaction, Amino acid sequence, Computational approach

DOI: 10.3329/ptcb.v20i1.5963

Plant Tissue Cult. & Biotech. 20(1): 37-45, 2010 (June)
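The entry above does not say how the sequence information is encoded before it reaches the SVM. One simple and widely used option, shown here purely as an assumption, is amino-acid composition: each protein becomes a 20-dimensional vector of residue frequencies, and a candidate pair is the concatenation of the two vectors. The helper names `composition` and `pair_features` and the example sequences are hypothetical, and the resulting vectors could be passed to any SVM implementation.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Non-standard letters are ignored; an empty sequence maps to all zeros.
    """
    seq = sequence.upper()
    counts = [seq.count(residue) for residue in AMINO_ACIDS]
    total = sum(counts)
    return [count / total if total else 0.0 for count in counts]


def pair_features(seq_a, seq_b):
    """Concatenate two composition vectors into one 40-dimensional
    feature vector describing a candidate protein pair."""
    return composition(seq_a) + composition(seq_b)


# Hypothetical short sequences, used only to show the call pattern.
features = pair_features("MKTAYIAKQR", "GSHMLEDPVA")
print(len(features))
```

Note that simple concatenation depends on the order of the two proteins; a symmetric encoding would be needed if the pair (A, B) must be treated the same as (B, A).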


Author(s):  
Ananthan Nambiar ◽  
Simon Liu ◽  
Mark Hopkins ◽  
Maeve Heflin ◽  
Sergei Maslov ◽  
...  

Abstract

The scientific community is rapidly generating protein sequence information, but only a fraction of these proteins can be experimentally characterized. While promising deep learning approaches for protein prediction tasks have emerged, they have computational limitations or are designed to solve a specific task. We present a Transformer neural network that pre-trains task-agnostic sequence representations. This model is fine-tuned to solve two different protein prediction tasks: protein family classification and protein interaction prediction. Our method is comparable to existing state-of-the-art approaches for protein family classification, while being much more general than other architectures. Further, our method outperforms all other approaches for protein interaction prediction. These results offer a promising framework for fine-tuning the pre-trained sequence representations for other protein prediction tasks.


Methods ◽  
2016 ◽  
Vol 110 ◽  
pp. 64-72 ◽  
Author(s):  
Kai Tian ◽  
Mingyu Shao ◽  
Yang Wang ◽  
Jihong Guan ◽  
Shuigeng Zhou
