protein interaction prediction
Recently Published Documents


TOTAL DOCUMENTS

189
(FIVE YEARS 53)

H-INDEX

25
(FIVE YEARS 5)

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jingping Yuan ◽  
Changwei Shen ◽  
Ranghua Yuan ◽  
Huaixia Zhang ◽  
Yan Xiao ◽  
...  

Abstract Background Tipburn, also known as leaf tip necrosis, is a severe issue in Chinese cabbage production. One known cause is that plants are unable to provide adequate Ca2+ to rapidly expanding leaves. Bacterial infection is also a contributing factor. Different cultivars have varying degrees of tolerance to tipburn. Two inbred lines of Chinese cabbage were employed as resources in this research. Results We determined that the inbred line ‘J39290’ was the tipburn-resistant material and the inbred line ‘J95822’ was the tipburn-sensitive material based on the severity of tipburn and the integrity of cell membrane structure. Ca2+ concentration measurements revealed no significant difference in Ca2+ concentration between the inner leaves of the two materials. Transcriptome sequencing was also used to find the differentially expressed genes (DEGs) between ‘J95822’ and ‘J39290’, and there was no significant difference between the two materials in the previously reported genes related to Ca2+ uptake and transport. However, DEG screening and classification showed that 23 genes are highly linked to plant-pathogen interactions, and they encode three different types of proteins: CaM/CML, Rboh, and CDPK. Based on KEGG pathway analysis, protein interaction prediction, and quantitative real-time PCR (qRT-PCR) of key genes, these 23 genes mainly function through the Ca2+-CaM/CML-CDPK signal pathway. Conclusions By analyzing the Ca2+ concentration in the two materials, the transcription of previously reported genes related to Ca2+ uptake and transport, and the functional annotation and KEGG pathways of the DEGs, it was found that Ca2+ deficiency was not the main cause of tipburn in ‘J95822’; tipburn was probably caused by bacterial infection instead. This study lays a theoretical foundation for exploring the molecular mechanism of resistance to tipburn in Chinese cabbage, and has important guiding significance for genetics and breeding.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Thi Ngan Dong ◽  
Graham Brogden ◽  
Gisa Gerold ◽  
Megha Khosla

Abstract Background Viral infections are causing significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. However, the task of predicting protein–protein interactions between a new virus and human cells is extremely challenging due to scarce data on virus-human interactions and the fast mutation rates of most viruses. Results We developed a multitask transfer learning approach that exploits the information of around 24 million protein sequences and the interaction patterns from the human interactome to counter the problem of small training datasets. Instead of using hand-crafted protein features, we utilize statistically rich protein representations learned by a deep language modeling approach from a massive source of protein sequences. We also employ an additional objective that aims to maximize the probability of observing human protein–protein interactions. This additional task objective acts as a regularizer and also allows domain knowledge to be incorporated to inform the virus-human protein–protein interaction prediction model. Conclusions Our approach achieved competitive results on 13 benchmark datasets and in the case study for the SARS-CoV-2 virus receptor. Experimental results show that our proposed model works effectively for both virus-human and bacteria-human protein–protein interaction prediction tasks. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/multitask-transfer.
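
As an illustration of how an auxiliary objective can act as a regularizer, the following is a minimal sketch and not the authors' implementation. It assumes both tasks are scored with binary cross-entropy and combined through a hypothetical weight `aux_weight`; the function names and the weighting scheme are assumptions made for this example.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def multitask_loss(main_labels, main_probs, aux_labels, aux_probs, aux_weight=0.5):
    """Main-task loss (e.g. virus-human pairs) plus a weighted auxiliary loss
    (e.g. human protein-protein pairs). aux_weight is a hypothetical
    hyperparameter; setting it to 0 recovers single-task training."""
    main = binary_cross_entropy(main_labels, main_probs)
    aux = binary_cross_entropy(aux_labels, aux_probs)
    return main + aux_weight * aux

# Toy values only: two virus-human pairs and two human-human pairs.
loss = multitask_loss([1, 0], [0.8, 0.3], [1, 1], [0.9, 0.6], aux_weight=0.5)
print(round(loss, 4))
```

Minimizing the auxiliary term is equivalent to maximizing the log-probability of the observed human interactions, which is why it can pull the shared representation toward solutions consistent with the human interactome.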


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Liqian Zhou ◽  
Qi Duan ◽  
Xiongfei Tian ◽  
He Xu ◽  
Jianxin Tang ◽  
...  

Abstract Background Long noncoding RNAs (lncRNAs) have dense linkages with a plethora of important cellular activities. lncRNAs exert functions by linking with corresponding RNA-binding proteins. Since experimental techniques to detect lncRNA-protein interactions (LPIs) are laborious and time-consuming, a few computational methods have been reported for LPI prediction. However, computation-based LPI identification methods have the following limitations: (1) Most methods were evaluated on a single dataset, and researchers may thus fail to measure their generalization ability. (2) The majority of methods were validated under cross validation on lncRNA-protein pairs and did not investigate the performance under other cross validations, especially cross validation on independent lncRNAs and independent proteins. (3) lncRNAs and proteins have abundant biological information, and how to select informative features needs further investigation. Results Under a hybrid framework (LPI-HyADBS) integrating feature selection based on AdaBoost and classification models including a deep neural network (DNN), extreme gradient boosting (XGBoost), and an SVM with a penalty coefficient of misclassification (C-SVM), this work focuses on finding new LPIs. First, five datasets are arranged. Each dataset contains lncRNA sequences, protein sequences, and an LPI network. Second, biological features of lncRNAs and proteins are acquired based on Pyfeat. Third, the obtained features of lncRNAs and proteins are selected based on AdaBoost and concatenated to depict each LPI sample. Fourth, DNN, XGBoost, and C-SVM are used to classify lncRNA-protein pairs based on the concatenated features. Finally, a hybrid framework is developed to integrate the classification results from the above three classifiers.
LPI-HyADBS is compared to six classical LPI prediction approaches (LPI-SKF, LPI-NRLMF, Capsule-LPI, LPI-CNNCP, LPLNP, and LPBNI) on five datasets under 5-fold cross validations on lncRNAs, proteins, lncRNA-protein pairs, and independent lncRNAs and independent proteins. The results show that LPI-HyADBS has the best LPI prediction performance under the four different cross validations. In particular, LPI-HyADBS obtains better classification ability than the other six approaches on the constructed independent dataset. Case analyses suggest that there is relevance between ZNF667-AS1 and Q15717. Conclusions By integrating a feature selection approach based on AdaBoost with three classification techniques (DNN, XGBoost, and C-SVM), this work develops a hybrid framework to identify new linkages between lncRNAs and proteins.
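
The difference between cross validation on lncRNA-protein pairs and cross validation on lncRNAs or proteins can be sketched as follows. This is a minimal illustration, not the procedure used in LPI-HyADBS: the helper names, the deterministic round-robin fold assignment, and the toy pairs are all assumptions made for this example.

```python
def kfold_by_group(pairs, key, k=5):
    """Split (lncRNA, protein) pairs into k folds so that every pair sharing
    the same key value (e.g. the same lncRNA) lands in the same fold."""
    groups = sorted({key(p) for p in pairs})
    # Round-robin assignment of groups to folds (deterministic, no shuffling).
    fold_of = {g: i % k for i, g in enumerate(groups)}
    folds = [[] for _ in range(k)]
    for p in pairs:
        folds[fold_of[key(p)]].append(p)
    return folds

def train_test_splits(folds):
    """Yield (train, test) lists, holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        yield train, test

# Toy data with hypothetical identifiers.
pairs = [("lnc1", "P1"), ("lnc1", "P2"), ("lnc2", "P1"),
         ("lnc3", "P3"), ("lnc4", "P2"), ("lnc5", "P4")]

# Cross validation on lncRNAs: group by the first element of each pair.
lnc_folds = kfold_by_group(pairs, key=lambda p: p[0], k=5)
# Cross validation on proteins: group by the second element.
prot_folds = kfold_by_group(pairs, key=lambda p: p[1], k=4)
# Cross validation on pairs: each pair is its own group.
pair_folds = kfold_by_group(pairs, key=lambda p: p, k=5)

for train, test in train_test_splits(lnc_folds):
    held_out = {lnc for lnc, _ in test}
    print(sorted(held_out), "never appear in the training fold")
```

Grouping by lncRNA (or protein) tests whether a model generalizes to entities it has never seen, which is a harder setting than holding out individual pairs whose lncRNA and protein both still appear in training.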


2021 ◽  
Author(s):  
Isak Johansson-Åkhe ◽  
Björn Wallner

Protein interactions are key in vital biological processes. In many cases, particularly often in regulation, this interaction is between a protein and a shorter peptide fragment. Such peptides are often part of larger disordered regions of other proteins. The flexible nature of peptides enables rapid, yet specific, regulation of important functions in the cell, such as the cell cycle. Because of this, understanding the molecular details of these interactions is crucial to understanding and altering their function, and many specialized computational methods have been developed to study them. The recent release of AlphaFold and now AlphaFold-Multimer has caused a leap in accuracy for computational modeling of proteins. Additionally, AlphaFold has proven generalizable enough that it can be adapted to a number of specialized protein modeling challenges outside of the original single-chain protein modeling it was trained for. In this paper, the ability of AlphaFold to predict which peptides and proteins interact, as well as its accuracy in modeling the resulting interaction complexes, is benchmarked against established methods in the fields of peptide-protein interaction prediction and modeling. We find that AlphaFold-Multimer consistently produces predicted interaction complexes with the best DockQ scores, with a mean DockQ of 0.49 for all 247 complexes investigated. Additionally, it can be used to separate interacting from non-interacting pairs of peptides and proteins with ROC-AUC and PR-AUC of 0.75 and 0.54, respectively, the best among the methods in the benchmark. However, there is still room for improvement: at a decent precision of 0.8 it recalls only 0.2 of the positive examples (FPR=0.01), which means that it will miss many true interactions. By combining AlphaFold-Multimer with InterPep2, the model quality for interacting proteins is increased, but the separation of interacting from non-interacting pairs is not improved.
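
The trade-off between precision, recall, and false positive rate (FPR) quoted above follows from the standard confusion-matrix definitions. The sketch below computes the three quantities at a chosen score threshold; the function names, labels, and scores are toy assumptions for illustration and are unrelated to the benchmark data.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN when a pair is called interacting if score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def precision_recall_fpr(labels, scores, threshold):
    """Return (precision, recall, FPR); a ratio with an empty denominator is 0.0."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, fpr

# Toy example: 4 interacting pairs (label 1) and 6 non-interacting pairs (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.85, 0.2, 0.1, 0.1, 0.05, 0.0]
print(precision_recall_fpr(labels, scores, threshold=0.7))
```

Raising the threshold generally lowers the FPR and raises precision at the cost of recall, which is the regime described in the abstract: few false alarms, but many true interactions left undetected.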


PLoS Genetics ◽  
2021 ◽  
Vol 17 (11) ◽  
pp. e1009869
Author(s):  
Jiajun Qiu ◽  
Kui Chen ◽  
Chunlong Zhong ◽  
Sihao Zhu ◽  
Xiao Ma

The perturbations of protein-protein interactions (PPIs) were found to be the main cause of cancer. Previous PPI prediction methods, which were trained with non-disease general PPI data, were not suitable for mapping the PPI network in cancer. Therefore, we established a novel cancer-specific PPI prediction method dubbed NECARE, which is based on a relational graph convolutional network (R-GCN) with knowledge-based features. It achieved the best performance compared with other methods, with a Matthews correlation coefficient (MCC) = 0.84±0.03 and an F1 = 91±2%. With NECARE, we mapped the cancer interactome atlas and revealed that the perturbations of PPIs were enriched on 1362 genes, which were named cancer hub genes. Mutations occurring at protein-macromolecule binding interfaces were found to be over-represented in those genes. Furthermore, over 56% of cancer treatment-related genes belonged to hub genes, and they were significantly related to the prognosis of 32 types of cancers. Finally, by coimmunoprecipitation, we confirmed that the NECARE prediction method was highly reliable, with a 90% accuracy. Overall, we provide a novel network-based cancer protein-protein interaction prediction method and map the perturbation of the cancer interactome. NECARE is available at: https://github.com/JiajunQiu/NECARE.
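
For readers unfamiliar with the two scores reported above, the following sketch computes the Matthews correlation coefficient and F1 from confusion-matrix counts using their standard definitions. The counts are invented for illustration and do not come from the NECARE evaluation.

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f1(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical counts: 45 true positives, 5 false positives,
# 45 true negatives, 5 false negatives.
print(mcc(45, 5, 45, 5), f1(45, 5, 5))
```

Unlike F1, MCC also uses the true negatives, so it stays informative when the positive and negative classes are heavily imbalanced.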


2021 ◽  
Vol 22 (S5) ◽  
Author(s):  
Ermal Elbasani ◽  
Soualihou Ngnamsie Njimbouom ◽  
Tae-Jin Oh ◽  
Eung-Hee Kim ◽  
Hyun Lee ◽  
...  

Abstract Background Compound–protein interaction prediction is necessary to investigate health regulatory functions and promotes drug discovery. Machine learning is becoming increasingly important in bioinformatics for applications such as analyzing protein-related data to achieve successful solutions. Modeling the properties and functions of proteins is important but challenging, especially when dealing with predictions of the sequence type. Result We propose a method to model compounds and proteins for compound–protein interaction prediction. A graph neural network is used to represent the compounds, and a convolutional layer extended with a bidirectional recurrent neural network framework, Long Short-Term Memory, and Gated Recurrent Unit is used for protein sequence vectorization. The convolutional layer captures regulatory protein functions, while the recurrent layer captures long-term dependencies between protein functions, thus improving the accuracy of interaction prediction with compounds. A database of 7000 sets of annotated compound–protein interactions, containing 1000 base length proteins, is taken into consideration for the implementation. The results indicate that the proposed model performs effectively and can yield satisfactory accuracy in compound–protein interaction prediction. Conclusion The performance of GCRNN is based on classification according to a binary class of interactions between proteins and compounds. The architectural design of the GCRNN model integrates a Bi-Recurrent layer on top of a CNN to learn dependencies of motifs on protein sequences and improve the accuracy of the predictions.

