VoroCNN: Deep convolutional neural network built on 3D Voronoi tessellation of protein structures

Author(s):  
Ilia Igashov ◽  
Kliment Olechnovič ◽  
Maria Kadukova ◽  
Česlovas Venclovas ◽  
Sergei Grudinin

Abstract Motivation Effective use of evolutionary information has recently led to tremendous progress in computational prediction of three-dimensional (3D) structures of proteins and their complexes. Despite the progress, the accuracy of predicted structures tends to vary considerably from case to case. Since the utility of computational models depends on their accuracy, reliable estimates of deviation between predicted and native structures are of utmost importance. Results For the first time, we present a deep convolutional neural network (CNN) constructed on a Voronoi tessellation of 3D molecular structures. Despite the irregular data domain, our data representation allows us to efficiently introduce both convolution and pooling operations and train the network in an end-to-end fashion without precomputed descriptors. The resultant model, VoroCNN, predicts local qualities of 3D protein folds. The prediction results are competitive with the state of the art and superior to previous 3D CNN architectures built for the same task. We also discuss practical applications of VoroCNN, for example, in recognition of protein binding interfaces. Availability The model, data, and evaluation tests are available at https://team.inria.fr/nano-d/software/vorocnn/. Supplementary information Supplementary data are available at Bioinformatics online.
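
The central idea above, convolution and pooling defined directly on a Voronoi tessellation rather than a regular grid, amounts to message passing on an atom contact graph followed by atom-to-residue aggregation. Below is a minimal, hypothetical sketch of that idea in PyTorch; the layer names, feature sizes, and the mean-aggregation and pooling choices are illustrative assumptions, not the published VoroCNN architecture.

```python
# Minimal sketch of convolution and pooling on an irregular contact graph,
# in the spirit of a Voronoi-tessellation-based CNN. This is NOT the authors'
# VoroCNN code; layer names, feature sizes and the atom-to-residue pooling
# scheme are illustrative assumptions.
import torch
import torch.nn as nn

class ContactGraphConv(nn.Module):
    """One convolution step: each atom aggregates features of its Voronoi neighbours."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin_self = nn.Linear(in_dim, out_dim)
        self.lin_neigh = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x:   (n_atoms, in_dim) per-atom features
        # adj: (n_atoms, n_atoms) contact weights, e.g. Voronoi face areas
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1e-6)
        neigh = adj @ x / deg                      # mean over tessellation contacts
        return torch.relu(self.lin_self(x) + self.lin_neigh(neigh))

def pool_atoms_to_residues(x, residue_index, n_res):
    """Pooling step: average atom features within each residue."""
    out = torch.zeros(n_res, x.shape[1])
    out.index_add_(0, residue_index, x)
    counts = torch.bincount(residue_index, minlength=n_res).clamp(min=1).unsqueeze(1)
    return out / counts

# Toy usage: 10 atoms with 8 features each, mapped onto 3 residues.
x = torch.randn(10, 8)
adj = (torch.rand(10, 10) > 0.7).float()
residue_index = torch.tensor([0] * 4 + [1] * 3 + [2] * 3)
conv = ContactGraphConv(8, 16)
per_residue = pool_atoms_to_residues(conv(x, adj), residue_index, n_res=3)
print(per_residue.shape)  # torch.Size([3, 16]) -> per-residue quality features
```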


Author(s):  
Amira Ahmad Al-Sharkawy ◽  
Gehan A. Bahgat ◽  
Elsayed E. Hemayed ◽  
Samia Abdel-Razik Mashali

Object classification is essential in many applications today. Humans can easily classify objects in unconstrained environments, whereas classical classification techniques fall far short of human performance. Researchers have therefore tried to mimic the human visual system, ultimately arriving at deep neural networks. This chapter reviews and analyzes the use of deep convolutional neural networks for object classification in constrained and unconstrained environments. It gives a brief review of classical object classification techniques and the development of bio-inspired computational models, from neuroscience up to the creation of deep neural networks. It then reviews the issues of constrained environments: hardware computing resources and memory, object appearance and background, and training and processing time. Finally, the datasets used to evaluate performance are analyzed according to their environmental conditions, and dataset bias is discussed.


2017 ◽  
Author(s):  
Evangelia I Zacharaki

Background. The availability of large databases containing high-resolution three-dimensional (3D) models of proteins, in conjunction with functional annotation, allows the exploitation of advanced supervised machine learning techniques for automatic protein function prediction. Methods. In this work, novel shape features are extracted that represent protein structure as local (per amino acid) distributions of angles and amino acid distances. Each of the multi-channel feature maps is introduced into a deep convolutional neural network (CNN) for function prediction, and the outputs are fused through Support Vector Machines (SVM) or a correlation-based k-nearest neighbor classifier. Two different architectures are investigated, employing either one CNN per multi-channel feature set or one CNN per image channel. Results. Cross-validation experiments on enzymes (n = 44,661) from the PDB database achieved 90.1% correct classification, demonstrating the effectiveness of the proposed method for automatic function annotation of protein structures. Discussion. The automatic prediction of protein function can provide quick annotations on extensive datasets, opening the path for relevant applications such as pharmacological target identification.
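
The two-stage design described above (a CNN extracting descriptors from multi-channel feature maps, fused by an SVM) can be sketched as follows. This is an illustrative assumption of the pipeline, not the published configuration: the two-channel 64x64 input maps, the encoder widths, and the toy labels are all placeholders.

```python
# Illustrative sketch of a CNN-then-SVM fusion pipeline for protein function
# prediction. Architecture sizes and the two-channel input (angle and distance
# maps) are assumptions, not the published configuration.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

class MapEncoder(nn.Module):
    def __init__(self, in_channels=2, feat_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),       # global pooling -> fixed-size descriptor
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, x):                  # x: (batch, channels, H, W)
        return self.fc(self.net(x).flatten(1))

# Toy data: 20 proteins, 2-channel 64x64 feature maps, 3 enzyme classes.
maps = torch.randn(20, 2, 64, 64)
labels = np.arange(20) % 3

encoder = MapEncoder()
with torch.no_grad():                      # descriptors only; no end-to-end training here
    descriptors = encoder(maps).numpy()

svm = SVC(kernel="rbf").fit(descriptors, labels)  # SVM fuses the CNN descriptors
print(svm.predict(descriptors[:5]))
```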


Author(s):  
Nazanin Fouladgar ◽  
Marjan Alirezaie ◽  
Kary Främling

Abstract Affective computing solutions, in the literature, mainly rely on machine learning methods designed to accurately detect human affective states. Nevertheless, many of the proposed methods are based on handcrafted features, requiring sufficient expert knowledge in the realm of signal processing. With the advent of deep learning methods, attention has turned toward reduced feature engineering and more end-to-end machine learning. However, most of the proposed models rely on late fusion in a multimodal context. Meanwhile, addressing interrelations between modalities for intermediate-level data representation has been largely neglected. In this paper, we propose a novel deep convolutional neural network, called CN-Waterfall, consisting of two modules: Base and General. While the Base module focuses on the low-level representation of data from each single modality, the General module provides further information, indicating relations between modalities in the intermediate- and high-level data representations. The latter module has been designed based on theoretically grounded concepts in the Explainable AI (XAI) domain, consisting of four different fusions. These fusions are mainly tailored to correlation- and non-correlation-based modalities. To validate our model, we conduct an exhaustive experiment on WESAD and MAHNOB-HCI, two publicly and academically available datasets in the context of multimodal affective computing. We demonstrate that our proposed model significantly improves the performance of physiological-based multimodal affect detection.
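
As a rough illustration of the Base/General split described above, the sketch below encodes each physiological modality with its own small 1-D CNN and then fuses the representations at an intermediate level. The two fusion variants shown (concatenation versus an element-wise product intended for correlated modalities) are simplified stand-ins, not the four fusions of CN-Waterfall; all sizes are assumptions.

```python
# A minimal sketch of per-modality encoding followed by intermediate-level
# fusion. The fusion variants are illustrative assumptions, not CN-Waterfall.
import torch
import torch.nn as nn

class BaseEncoder(nn.Module):
    """Low-level representation of one physiological signal (e.g. EDA or ECG)."""
    def __init__(self, feat_dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, x):                     # x: (batch, 1, time)
        return self.fc(self.conv(x).flatten(1))

class FusionClassifier(nn.Module):
    def __init__(self, feat_dim=32, n_classes=3, correlated=True):
        super().__init__()
        self.enc_a, self.enc_b = BaseEncoder(feat_dim), BaseEncoder(feat_dim)
        self.correlated = correlated
        fused_dim = feat_dim if correlated else 2 * feat_dim
        self.head = nn.Linear(fused_dim, n_classes)

    def forward(self, sig_a, sig_b):
        fa, fb = self.enc_a(sig_a), self.enc_b(sig_b)
        # Intermediate fusion: element-wise product for correlated modalities,
        # plain concatenation otherwise.
        fused = fa * fb if self.correlated else torch.cat([fa, fb], dim=1)
        return self.head(fused)

model = FusionClassifier(correlated=True)
logits = model(torch.randn(8, 1, 256), torch.randn(8, 1, 256))
print(logits.shape)  # torch.Size([8, 3]) -> affect-state scores
```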


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Hua Xie ◽  
Minghua Zhang ◽  
Jiaming Ge ◽  
Xinfang Dong ◽  
Haiyan Chen

A sector is a basic unit of airspace whose operation is managed by air traffic controllers. The operation complexity of a sector plays an important role in air traffic management tasks such as airspace reconfiguration, air traffic flow management, and allocation of air traffic controller resources. Therefore, accurate evaluation of the sector operation complexity (SOC) is crucial. Considering there are numerous factors that can influence SOC, researchers have recently proposed several machine learning methods to evaluate SOC by mining the relationship between factors and complexity. However, existing studies rely on hand-crafted factors, which are computationally demanding, require a specialized background, and may limit the evaluation performance of the model. To overcome these problems, this paper proposes, for the first time, an end-to-end SOC learning framework based on a deep convolutional neural network (CNN) that requires no hand-crafted factors. A new data representation, the multichannel traffic scenario image (MTSI), is proposed to represent the overall air traffic scenario. An MTSI is generated by splitting the airspace into a two-dimensional grid map and filling it with navigation information. Motivated by applications of deep learning networks, a specific CNN model is introduced to automatically extract high-level traffic features from MTSIs and learn the SOC pattern. The model input is thus determined by combining multiple image channels composed of air traffic information, which describe the traffic scenario, and the model output is the SOC level of the target sector. Experimental results using a real dataset from the Guangzhou airspace sector in China show that our model can effectively extract traffic complexity information from MTSIs and achieves better performance than traditional machine learning methods. In practice, our work can be flexibly and conveniently applied to SOC evaluation without the additional calculation of hand-crafted factors.
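
The MTSI construction described above can be illustrated with a short sketch: the sector is rasterized into a 2-D grid and each channel holds one kind of navigation information. The grid size, channel choice (count, mean altitude, mean speed), and coordinate ranges below are assumptions for illustration, not the paper's exact specification.

```python
# Sketch of building a multichannel traffic scenario image (MTSI) from a list
# of aircraft states. Grid size, channels and value ranges are assumptions.
import numpy as np

def build_mtsi(aircraft, lat_range, lon_range, grid=32):
    """aircraft: iterable of (lat, lon, altitude_m, speed_mps) tuples."""
    counts = np.zeros((grid, grid))
    alt_sum = np.zeros((grid, grid))
    spd_sum = np.zeros((grid, grid))
    for lat, lon, alt, spd in aircraft:
        i = int((lat - lat_range[0]) / (lat_range[1] - lat_range[0]) * (grid - 1))
        j = int((lon - lon_range[0]) / (lon_range[1] - lon_range[0]) * (grid - 1))
        if 0 <= i < grid and 0 <= j < grid:
            counts[i, j] += 1
            alt_sum[i, j] += alt
            spd_sum[i, j] += spd
    occupied = np.maximum(counts, 1)
    # Channels: traffic density, mean altitude, mean ground speed per cell.
    return np.stack([counts, alt_sum / occupied, spd_sum / occupied])

traffic = [(23.1, 113.3, 9800.0, 230.0), (23.4, 113.1, 10400.0, 245.0),
           (23.4, 113.1, 8600.0, 210.0)]
mtsi = build_mtsi(traffic, lat_range=(22.5, 24.0), lon_range=(112.5, 114.0))
print(mtsi.shape)  # (3, 32, 32): one image per channel, ready for a CNN classifier
```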


2020 ◽  
Vol 15 (7) ◽  
pp. 767-777
Author(s):  
Lin Guo ◽  
Qian Jiang ◽  
Xin Jin ◽  
Lin Liu ◽  
Wei Zhou ◽  
...  

Background: Protein secondary structure prediction (PSSP) is a fundamental task in bioinformatics that is helpful for understanding the three-dimensional structure and biological function of proteins. Many neural network-based methods have been developed to predict protein secondary structure. Deep learning and multiple features are two obvious means to improve prediction accuracy. Objective: To promote the development of PSSP, a deep convolutional neural network-based method is proposed to predict both eight-state and three-state protein secondary structure. Methods: In this model, sequence and evolutionary information of proteins are combined as multiple input features after preprocessing. A deep convolutional neural network with no pooling or connection layers is then constructed to predict the secondary structure of proteins. L2 regularization, batch normalization, and dropout are employed to avoid over-fitting and obtain better prediction performance, and an improved cross-entropy is used as the loss function. Results: Our proposed model obtains Q3 prediction results of 86.2%, 84.5%, 87.8%, and 84.7% on the CullPDB, CB513, CASP10, and CASP11 datasets, respectively, with corresponding Q8 prediction results of 74.1%, 70.5%, 74.9%, and 71.3%. Conclusion: We have proposed DCNN-SS, a deep convolutional network-based PSSP method, and experimental results show that DCNN-SS performs competitively with other methods.
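
A pooling-free convolutional predictor with the regularization ingredients named above (batch normalization, dropout, and L2 via weight decay) might look like the following sketch. The layer counts, widths, 42-dimensional input, and the use of a standard (rather than the authors' improved) cross-entropy are assumptions, not the published DCNN-SS configuration.

```python
# Minimal sketch of a pooling-free 1-D CNN for per-residue secondary-structure
# prediction. All sizes are illustrative assumptions, not DCNN-SS itself.
import torch
import torch.nn as nn

class NoPoolPSSPNet(nn.Module):
    def __init__(self, in_dim=42, hidden=64, n_states=8, p_drop=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, hidden, kernel_size=11, padding=5),
            nn.BatchNorm1d(hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Conv1d(hidden, hidden, kernel_size=11, padding=5),
            nn.BatchNorm1d(hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Conv1d(hidden, n_states, kernel_size=11, padding=5),  # no pooling: sequence length preserved
        )

    def forward(self, x):              # x: (batch, in_dim, seq_len), e.g. one-hot + PSSM
        return self.net(x)             # (batch, n_states, seq_len) per-residue logits

model = NoPoolPSSPNet()
x = torch.randn(4, 42, 200)
target = torch.randint(0, 8, (4, 200))
loss = nn.CrossEntropyLoss()(model(x), target)                          # standard cross-entropy here
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)  # weight decay acts as L2
loss.backward()
opt.step()
print(loss.item())
```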


2019 ◽  
Vol 35 (21) ◽  
pp. 4222-4228 ◽  
Author(s):  
Tong Liu ◽  
Zheng Wang

Abstract Motivation High-resolution Hi-C data are indispensable for studies of three-dimensional (3D) genome organization at the kilobase level. However, generating high-resolution Hi-C data (e.g. 5 kb) by conducting Hi-C experiments requires millions of mammalian cells, which may eventually generate billions of paired-end reads with a high sequencing cost. Therefore, it would be important and helpful to enhance the resolution of Hi-C data by computational methods. Results We developed a new computational method named HiCNN that uses a 54-layer very deep convolutional neural network to enhance the resolution of Hi-C data. The network contains both global and local residual learning, with multiple speedup techniques included, resulting in fast convergence. We used mean squared errors and Pearson's correlation coefficients between real high-resolution and computationally predicted high-resolution Hi-C data to evaluate the method. The evaluation results show that HiCNN consistently outperforms HiCPlus, the only existing tool in the literature, when training and testing data are extracted from the same cell type (i.e. GM12878) and from two different cell types in the same or different species (i.e. GM12878 as training with K562 as testing, and GM12878 as training with CH12-LX as testing). We further found that the HiCNN-enhanced high-resolution Hi-C data are more consistent with real experimental high-resolution Hi-C data than HiCPlus-enhanced data in terms of indicating statistically significant interactions. Moreover, HiCNN can efficiently enhance low-resolution Hi-C data, which eventually helps recover two chromatin loops that were confirmed by 3D-FISH. Availability and implementation HiCNN is freely available at http://dna.cs.miami.edu/HiCNN/. Supplementary information Supplementary data are available at Bioinformatics online.
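
Global plus local residual learning, as described above, can be sketched as follows: residual blocks inside the network (local) and a skip connection that adds the low-resolution input back to the predicted correction (global). The depth and widths are placeholders; this is not the 54-layer HiCNN architecture.

```python
# Small sketch of residual learning for Hi-C resolution enhancement.
# Depth and widths are illustrative assumptions, not HiCNN itself.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)          # local residual connection

class HiCEnhancer(nn.Module):
    def __init__(self, ch=64, n_blocks=4):
        super().__init__()
        self.head = nn.Conv2d(1, ch, kernel_size=3, padding=1)
        self.blocks = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.tail = nn.Conv2d(ch, 1, kernel_size=3, padding=1)

    def forward(self, low_res):          # low_res: (batch, 1, N, N) contact submatrix
        correction = self.tail(self.blocks(self.head(low_res)))
        return low_res + correction      # global residual: predict (high-res minus low-res)

model = HiCEnhancer()
low = torch.randn(2, 1, 40, 40)          # toy 40x40 contact submatrices
high_pred = model(low)
loss = nn.MSELoss()(high_pred, torch.randn(2, 1, 40, 40))  # trained against real high-res data
print(high_pred.shape, loss.item())
```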


2019 ◽  
Vol 36 (13) ◽  
pp. 4038-4046 ◽  
Author(s):  
Lei Wang ◽  
Zhu-Hong You ◽  
Yu-An Huang ◽  
De-Shuang Huang ◽  
Keith C C Chan

Abstract Motivation Emerging evidence indicates that circular RNA (circRNA) plays a crucial role in human disease. Using circRNA as a biomarker offers a new perspective on diagnosing diseases and understanding disease pathogenesis. However, detection of circRNA–disease associations by biological experiments alone is often blind, limited in scale, costly, and time consuming. Therefore, there is an urgent need for reliable computational methods to rapidly infer potential circRNA–disease associations on a large scale and to provide the most promising candidates for biological experiments. Results In this article, we propose an efficient computational method based on multi-source information combined with a deep convolutional neural network (CNN) to predict circRNA–disease associations. The method first fuses multi-source information including disease semantic similarity, disease Gaussian interaction profile kernel similarity and circRNA Gaussian interaction profile kernel similarity, then extracts hidden deep features through the CNN, and finally sends them to an extreme learning machine classifier for prediction. The 5-fold cross-validation results show that the proposed method achieves 87.21% prediction accuracy with 88.50% sensitivity and an area under the curve of 86.67% on the CIRCR2Disease dataset. In comparison with the state-of-the-art SVM classifier and other feature extraction methods on the same dataset, the proposed model achieves the best results. In addition, we obtained experimental support for the prediction results by searching the published literature: 7 of the top 15 circRNA–disease pairs with the highest scores were confirmed by the literature. These results demonstrate that the proposed model is a suitable method for predicting circRNA–disease associations and can provide reliable candidates for biological experiments. Availability and implementation The source code and datasets explored in this work are available at https://github.com/look0012/circRNA-Disease-association. Supplementary information Supplementary data are available at Bioinformatics online.
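
The fuse-extract-classify pipeline described above can be sketched as below: similarity-derived descriptors of a (circRNA, disease) pair are stacked as channels, a small CNN extracts deep features, and a minimal extreme learning machine (random hidden layer with a least-squares readout) performs classification. All sizes and the pairing scheme are illustrative assumptions, not the published model.

```python
# Illustrative sketch: CNN feature extraction over stacked similarity profiles
# of a (circRNA, disease) pair, followed by an extreme learning machine.
import numpy as np
import torch
import torch.nn as nn

class PairFeatureCNN(nn.Module):
    def __init__(self, feat_dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(2, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.fc = nn.Linear(16 * 8, feat_dim)

    def forward(self, x):                 # x: (batch, 2, n) similarity profiles of the pair
        return self.fc(self.conv(x).flatten(1))

class ELM:
    """Minimal extreme learning machine: random projection + least-squares readout."""
    def __init__(self, n_hidden=128, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        H = np.tanh(X @ self.W)
        self.beta = np.linalg.pinv(H) @ y  # closed-form output weights
        return self

    def predict(self, X):
        return np.tanh(X @ self.W) @ self.beta

# Toy data: 50 pairs, each described by circRNA- and disease-similarity profiles.
pairs = torch.randn(50, 2, 100)
labels = np.random.default_rng(0).integers(0, 2, size=(50, 1)).astype(float)

cnn = PairFeatureCNN()
with torch.no_grad():
    feats = cnn(pairs).numpy()
elm = ELM().fit(feats, labels)
print((elm.predict(feats) > 0.5).astype(int)[:5].ravel())
```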

