DeepInterface: Protein-protein interface validation using 3D Convolutional Neural Networks

2019 ◽  
Author(s):  
A.T. Balci ◽  
C. Gumeli ◽  
A. Hakouz ◽  
D. Yuret ◽  
O. Keskin ◽  
...  

Abstract Motivation: Protein–protein interactions are crucial in almost all biological processes. Proteins interact through their interfaces. It is important to determine how proteins interact through interfaces to understand protein binding mechanisms and to predict new protein–protein interactions. Results: We present DeepInterface, a deep-learning-based method that predicts, for a given protein complex, whether the interface between the proteins of the complex is a true interface. The model is a 3-dimensional convolutional neural network; the positive datasets are obtained from all complexes in the Protein Data Bank, and the negative datasets are the incorrect solutions among docking decoys. The model analyzes a given interface structure and outputs the probability that the structure is a true interface. The accuracy of the model on several interface datasets, including PIFACE, PPI4DOCK, and DOCKGROUND, is approximately 88% on the validation dataset and 75% on the test dataset. The method can be used to improve the accuracy of template-based PPI predictions.

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4750 ◽  
Author(s):  
Afshine Amidi ◽  
Shervine Amidi ◽  
Dimitrios Vlachakis ◽  
Vasileios Megalooikonomou ◽  
Nikos Paragios ◽  
...  

During the past decade, with the significant progress of computational power as well as ever-rising data availability, deep learning techniques became increasingly popular due to their excellent performance on computer vision problems. The size of the Protein Data Bank (PDB) has increased more than 15-fold since 1999, which enabled the expansion of models that aim at predicting enzymatic function via their amino acid composition. Amino acid sequence, however, is less conserved in nature than protein structure and therefore considered a less reliable predictor of protein function. This paper presents EnzyNet, a novel 3D convolutional neural networks classifier that predicts the Enzyme Commission number of enzymes based only on their voxel-based spatial structure. The spatial distribution of biochemical properties was also examined as complementary information. The two-layer architecture was investigated on a large dataset of 63,558 enzymes from the PDB and achieved an accuracy of 78.4% by exploiting only the binary representation of the protein shape. Code and datasets are available at https://github.com/shervinea/enzynet.
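The binary shape representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes atom coordinates arrive as (x, y, z) tuples and maps them into a cubic occupancy grid. The function name `voxelize`, the grid size, and the centering and scaling scheme are illustrative assumptions, not EnzyNet's actual preprocessing.

```python
def voxelize(coords, grid_size=8):
    """Map 3D points to a binary occupancy grid with grid_size^3 voxels.

    Hypothetical helper: centers the points on their centroid, scales them
    so the farthest coordinate lands on the grid boundary, and marks each
    occupied voxel with 1.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("coords must not be empty")
    # Centroid of the point cloud.
    center = [sum(p[i] for p in coords) / n for i in range(3)]
    centered = [[p[i] - center[i] for i in range(3)] for p in coords]
    # Largest absolute coordinate; falls back to 1.0 if all points coincide.
    radius = max(max(abs(c) for c in p) for p in centered) or 1.0
    grid = [[[0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)]
    for p in centered:
        # Map [-radius, radius] onto indices 0 .. grid_size - 1.
        idx = [min(grid_size - 1, int((c / radius + 1.0) / 2.0 * grid_size)) for c in p]
        grid[idx[0]][idx[1]][idx[2]] = 1
    return grid
```

A grid produced this way could be fed to a 3D convolutional classifier; a finer grid preserves more shape detail at a cubic cost in memory.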


2019 ◽  
Vol 21 (5) ◽  
pp. 1798-1805 ◽  
Author(s):  
Kai Yu ◽  
Qingfeng Zhang ◽  
Zekun Liu ◽  
Yimeng Du ◽  
Xinjiao Gao ◽  
...  

Abstract Protein lysine acetylation regulation is an important molecular mechanism for regulating cellular processes and plays critical physiological and pathological roles in cancers and diseases. Although massive acetylation sites have been identified through experimental identification and high-throughput proteomics techniques, their enzyme-specific regulation remains largely unknown. Here, we developed the deep learning-based protein lysine acetylation modification prediction (Deep-PLA) software for histone acetyltransferase (HAT)/histone deacetylase (HDAC)-specific acetylation prediction based on deep learning. Experimentally identified substrates and sites of several HATs and HDACs were curated from the literature to generate enzyme-specific data sets. We integrated various protein sequence features with deep neural network and optimized the hyperparameters with particle swarm optimization, which achieved satisfactory performance. Through comparisons based on cross-validations and testing data sets, the model outperformed previous studies. Meanwhile, we found that protein–protein interactions could enrich enzyme-specific acetylation regulatory relations and visualized this information in the Deep-PLA web server. Furthermore, a cross-cancer analysis of acetylation-associated mutations revealed that acetylation regulation was intensively disrupted by mutations in cancers and heavily implicated in the regulation of cancer signaling. These prediction and analysis results might provide helpful information to reveal the regulatory mechanism of protein acetylation in various biological processes to promote the research on prognosis and treatment of cancers. Therefore, the Deep-PLA predictor and protein acetylation interaction networks could provide helpful information for studying the regulation of protein acetylation. The web server of Deep-PLA could be accessed at http://deeppla.cancerbio.info.


Author(s):  
Sagnik Banerjee ◽  
Valeria Velásquez-Zapata ◽  
Gregory Fuerst ◽  
J. Mitch Elmore ◽  
Roger P. Wise

ABSTRACT Mapping protein-protein interactions at a proteome scale is critical to understanding how cellular signaling networks respond to stimuli. Since eukaryotic genomes encode thousands of proteins, testing their interactions one-by-one is a challenging prospect. High-throughput yeast two-hybrid (Y2H) assays that employ next-generation sequencing to interrogate cDNA libraries represent an alternative approach that optimizes scale, cost, and effort. We present NGPINT, a robust and scalable software pipeline to identify all putative interactors of a protein using Y2H in batch culture. NGPINT combines diverse tools to align sequence reads to target genomes, reconstruct prey fragments, and compute gene enrichment under reporter selection. Central to this pipeline is the identification of fusion reads containing sequences derived from both the Y2H expression plasmid and the cDNA of interest. To reduce false positives, these fusion reads are evaluated as to whether the cDNA fragment forms an in-frame translational fusion with the Y2H transcription factor. NGPINT successfully recognized 95% of interactions in simulated test runs. As proof of concept, NGPINT was tested using published data sets and recognized all validated interactions. NGPINT can be used in any organism with an available reference, thus facilitating the discovery of protein-protein interactions in non-model organisms.
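The in-frame criterion described above reduces to a modulo-3 comparison, sketched below. This is not NGPINT's implementation: the function name `is_in_frame` and both parameters are hypothetical, and the sketch assumes the junction position and the gene's annotated coding sequence are already known.

```python
def is_in_frame(upstream_coding_len, cds_offset):
    """Return True if the prey cDNA continues the upstream reading frame.

    upstream_coding_len: number of bases from the transcription factor's
        start codon up to (not including) the first cDNA base in the read.
    cds_offset: 0-based position of that first cDNA base within the
        annotated coding sequence of the prey gene.
    Both inputs are hypothetical; a real pipeline would derive them from
    read alignments and gene annotation.
    """
    if upstream_coding_len < 0 or cds_offset < 0:
        raise ValueError("offsets must be non-negative")
    # In the fusion, the first cDNA base lands on codon position
    # upstream_coding_len % 3; in its native frame it sits on codon
    # position cds_offset % 3. The fusion is in-frame when these agree.
    return upstream_coding_len % 3 == cds_offset % 3
```

For example, with 300 upstream coding bases, a fragment starting at CDS position 0 or 3 would pass this check, while one starting at position 1 would not.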


Author(s):  
Abeer Al-Hyari ◽  
Shawki Areibi

This paper proposes a framework for design space exploration of Convolutional Neural Networks (CNNs) using Genetic Algorithms (GAs). CNNs have many hyperparameters that need to be tuned carefully in order to achieve favorable results when used for image classification tasks or similar vision applications. Genetic Algorithms are adopted to efficiently traverse the huge search space of CNNs hyperparameters, and generate the best architecture that fits the given task. Some of the hyperparameters that were tested include the number of convolutional and fully connected layers, the number of filters for each convolutional layer, and the number of nodes in the fully connected layers. The proposed approach was tested using MNIST dataset for handwritten digit classification and results obtained indicate that the proposed approach is able to generate a CNN architecture with validation accuracy up to 96.66% on average.
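A genetic search over the hyperparameters listed above might look like the following minimal sketch. The search space, the selection scheme, and the operators are illustrative assumptions rather than the paper's configuration, and `surrogate_fitness` is a stand-in for validation accuracy: a real run would train and evaluate a CNN for each genome.

```python
import random

# Hypothetical search space; the ranges are illustrative, not from the paper.
SPACE = {
    "conv_layers": [1, 2, 3, 4],
    "fc_layers": [1, 2, 3],
    "filters": [8, 16, 32, 64],
    "fc_nodes": [32, 64, 128, 256],
}
KEYS = list(SPACE)


def surrogate_fitness(genome):
    """Stand-in for validation accuracy; counts genes matching an arbitrary target."""
    target = {"conv_layers": 3, "fc_layers": 2, "filters": 32, "fc_nodes": 128}
    return sum(genome[k] == target[k] for k in KEYS)


def random_genome(rng):
    return {k: rng.choice(SPACE[k]) for k in KEYS}


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one of the two parents.
    return {k: rng.choice((a[k], b[k])) for k in KEYS}


def mutate(genome, rng, rate=0.2):
    # Resample each gene from the search space with probability `rate`.
    return {k: rng.choice(SPACE[k]) if rng.random() < rate else v
            for k, v in genome.items()}


def evolve(fitness, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), refill with offspring.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)
```

Calling `evolve(surrogate_fitness)` returns the best genome found. Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.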


2021 ◽  
Author(s):  
Jielu Yan ◽  
Bob Zhang ◽  
Mingliang Zhou ◽  
Hang Fai Kwok ◽  
Shirley W.I. Siu

Ligand peptides that have high affinity for ion channels are critical for regulating ion flux across the plasma membrane. These peptides are now being considered as potential drug candidates for many diseases, such as cardiovascular disease and cancers. Several studies have sought to identify ion channel interacting peptides computationally, but, to the best of our knowledge, none of them has published an available tool for prediction. To provide a solution, we present Multi-Branch-CNN, a parallel convolutional neural network (CNN) method for identifying three types of ion channel peptide binders (sodium, potassium, and calcium). Our experiment shows that the Multi-Branch-CNN method performs comparably to thirteen traditional ML algorithms (TML13) on the test sets of three ion channels. To evaluate the predictive power of our method with respect to novel sequences, as is the case in real-world applications, we created an additional test set for each ion channel, called the novel-test set, which has little or no similarity to the sequences in either the training set or the test set. In the novel-test experiment, Multi-Branch-CNN performs significantly better than TML13, showing an improvement in accuracy of 6%, 14%, and 15% for sodium, potassium, and calcium channels, respectively. We confirmed the effectiveness of Multi-Branch-CNN by comparing it to the standard CNN method with one input branch (Single-Branch-CNN) and an ensemble method (TML13-Stack). To facilitate applications, the data sets, script files to reproduce the experiments, and the final predictive models are freely available at https://github.com/jieluyan/Multi-Branch-CNN.


2021 ◽  
Vol 13 (18) ◽  
pp. 3770
Author(s):  
Mark A. Lundine ◽  
Arthur C. Trembanis

Carolina Bays are oriented and sandy-rimmed depressions that are ubiquitous throughout the Atlantic Coastal Plain (ACP). Their origin has been a highly debated topic since the 1800s and remains unsolved. Past population estimates of Carolina Bays have varied vastly, ranging from as few as 10,000 to as many as 500,000. With such a large uncertainty around the actual population size, mapping these enigmatic features is a problem that requires an automated detection scheme. Using publicly available LiDAR-derived digital elevation models (DEMs) of the ACP as training images, various types of convolutional neural networks (CNNs) were trained to detect Carolina Bays. The detection results were assessed for accuracy and scalability, as well as analyzed for various morphologic, land-use and land cover, and hydrologic characteristics. Overall, the detector found over 23,000 Carolina Bays from southern New Jersey to northern Florida, with highest densities along interfluves. Carolina Bays in Delmarva were found to be smaller and shallower than Bays in the southeastern ACP. At least a third of Carolina Bays have been converted to agricultural lands and almost half of all Carolina Bays are forested. Few Carolina Bays are classified as open water basins, yet almost all of the detected Bays were within 2 km of a water body. In addition, field investigations based upon detection results were performed to describe the sedimentology of Carolina Bays. Sedimentological investigations showed that Bays typically have 1.5 m to 2.5 m thick sand rims that show a gradient in texture, with coarser sand at the bottom and finer sand and silt towards the top. Their basins were found to be 0.5 m to 2 m thick and showed a mix of clayey, silty, and sandy deposits. Last, the results compiled during this study were compared to similar depressional features (i.e., playa-lunette systems) to pinpoint any similarities in origin processes. Altogether, this study shows that CNNs are valuable tools for automated geomorphic feature detection and can lead to new insights when coupled with various forms of remotely sensed and field-based datasets.


2021 ◽  
Vol 7 ◽  
pp. e497
Author(s):  
Shakeel Shafiq ◽  
Tayyaba Azim

Deep neural networks have been widely explored and utilised as a useful tool for feature extraction in computer vision and machine learning. It is often observed that the last fully connected (FC) layers of a convolutional neural network possess higher discrimination power than the convolutional and maxpooling layers, whose goal is to preserve local and low-level information of the input image and downsample it to avoid overfitting. Inspired by the functionality of the local binary pattern (LBP) operator, this paper proposes to induce discrimination into the mid layers of a convolutional neural network by introducing a discriminatively boosted alternative to pooling (DBAP) layer, which has been shown to serve as a favourable replacement for the early maxpooling layer in a convolutional neural network (CNN). A thorough review of the related work shows that the proposed change in the neural architecture is novel and has not been proposed before to bring enhanced discrimination and feature visualisation power achieved from the mid-layer features. The empirical results reveal that the introduction of the DBAP layer in popular neural architectures such as AlexNet and LeNet produces competitive classification results in comparison to their baseline models as well as other ultra-deep models on several benchmark data sets. In addition, better visualisation of intermediate features can allow one to seek understanding and interpretation of the black-box behaviour of convolutional neural networks, which are used widely by the research community.
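For readers unfamiliar with the LBP operator that motivates the DBAP layer, the basic 3x3 variant can be sketched as follows. This shows only the classical operator on a 2D list of intensities, not the DBAP layer itself; the neighbour ordering and the function names are conventions chosen for this sketch.

```python
def lbp_code(image, r, c):
    """Basic 3x3 local binary pattern code for the interior pixel at (r, c).

    Each of the 8 neighbours contributes one bit: 1 if its value is >= the
    centre value, else 0. Neighbours are read clockwise from the top-left;
    this ordering is an assumption of the sketch.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_map(image):
    """LBP codes for all interior pixels of a 2D list of intensities."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]
```

The resulting codes lie in the range 0 to 255 and summarise local texture around each pixel, which is the kind of discriminative local encoding the abstract refers to.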

