Prediction of protein function using a deep convolutional neural network ensemble

Author(s):  
Evangelia I Zacharaki

Background. The availability of large databases containing high-resolution three-dimensional (3D) models of proteins in conjunction with functional annotation allows the exploitation of advanced supervised machine learning techniques for automatic protein function prediction. Methods. In this work, novel shape features are extracted that represent protein structure as local (per amino acid) distributions of angles and of amino acid distances. Each of the multi-channel feature maps is fed into a deep convolutional neural network (CNN) for function prediction, and the outputs are fused through Support Vector Machines (SVM) or a correlation-based k-nearest neighbor classifier. Two different architectures are investigated, employing either one CNN per multi-channel feature set or one CNN per image channel. Results. Cross-validation experiments on enzymes (n = 44,661) from the PDB database achieved 90.1% correct classification, demonstrating the effectiveness of the proposed method for automatic function annotation of protein structures. Discussion. The automatic prediction of protein function can provide quick annotations on extensive datasets, opening the path for relevant applications such as pharmacological target identification.


2017 ◽  
Vol 3 ◽  
pp. e124 ◽  
Author(s):  
Evangelia I. Zacharaki

Background. The availability of large databases containing high-resolution three-dimensional (3D) models of proteins in conjunction with functional annotation allows the exploitation of advanced supervised machine learning techniques for automatic protein function prediction. Methods. In this work, novel shape features are extracted that represent protein structure as local (per amino acid) distributions of angles and of amino acid distances. Each of the multi-channel feature maps is fed into a deep convolutional neural network (CNN) for function prediction, and the outputs are fused through support vector machines or a correlation-based k-nearest neighbor classifier. Two different architectures are investigated, employing either one CNN per multi-channel feature set or one CNN per image channel. Results. Cross-validation experiments on single-functional enzymes (n = 44,661) from the PDB database achieved 90.1% correct classification, demonstrating an improvement over previous results on the same dataset when sequence similarity was not considered. Discussion. The automatic prediction of protein function can provide quick annotations on extensive datasets, opening the path for relevant applications such as pharmacological target identification. The proposed method shows promise for structure-based protein function prediction, but sufficient data may not yet be available to properly assess the method’s performance on non-homologous proteins and thus reduce the confounding factor of evolutionary relationships.
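As an illustration of the correlation-based k-nearest neighbor fusion described above, the following sketch classifies a query feature vector by its Pearson correlation to labeled reference vectors. This is a minimal, generic illustration, not the authors' implementation; the reference data and label names are hypothetical.

```python
from collections import Counter
from math import sqrt

def pearson(a, b):
    # Pearson correlation coefficient of two equal-length vectors.
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sqrt(sum((x - ma) ** 2 for x in a))
    sb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb) if sa and sb else 0.0

def correlation_knn(query, references, labels, k=3):
    # Rank reference vectors by correlation with the query
    # (higher correlation = more similar), then majority-vote
    # among the k most correlated neighbors.
    ranked = sorted(zip(references, labels),
                    key=lambda rl: pearson(query, rl[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In the paper's setting, the CNN output vectors for the training proteins would play the role of `references`, with enzyme classes as labels.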


Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 102
Author(s):  
Michele Lo Giudice ◽  
Giuseppe Varone ◽  
Cosimo Ieracitano ◽  
Nadia Mammone ◽  
Giovanbattista Gaspare Tripodi ◽  
...  

The differential diagnosis of epileptic seizures (ES) and psychogenic non-epileptic seizures (PNES) may be difficult due to the lack of distinctive clinical features. The interictal electroencephalographic (EEG) signal may also be normal in patients with ES. Innovative diagnostic tools that exploit non-linear EEG analysis and deep learning (DL) could provide important support to physicians for clinical diagnosis. In this work, 18 patients with new-onset ES (12 males, 6 females) and 18 patients with video-recorded PNES (2 males, 16 females) with normal interictal EEG at visual inspection were enrolled. None of them was taking psychotropic drugs. A convolutional neural network (CNN) scheme was designed to classify the two categories of subjects (ES vs. PNES). The proposed architecture performs an EEG time-frequency transformation followed by a classification step with a CNN. The CNN classified the EEG recordings of subjects with ES vs. subjects with PNES with 94.4% accuracy, outperforming standard learning algorithms (multi-layer perceptron, support vector machine, linear discriminant analysis, and quadratic discriminant analysis) on this binary classification task. To interpret how the CNN achieved this performance, an information-theoretic analysis was carried out: the permutation entropy (PE) of the feature maps was evaluated and compared between the two classes. The achieved results, although preliminary, encourage the use of these innovative techniques to support neurologists in early diagnoses.
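The permutation entropy used above to probe the CNN's feature maps has a compact definition: embed the signal into windows of a given length, map each window to its ordinal (rank) pattern, and take the normalized Shannon entropy of the pattern distribution. A minimal sketch, for illustration only (the paper's embedding parameters are assumptions here):

```python
from math import factorial, log2

def permutation_entropy(signal, order=3, delay=1):
    # Count ordinal patterns (rank orderings) of each embedded window.
    counts = {}
    n = len(signal) - (order - 1) * delay
    for i in range(n):
        window = [signal[i + j * delay] for j in range(order)]
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] = counts.get(pattern, 0) + 1
    # Normalized Shannon entropy of the pattern distribution:
    # 0 = fully predictable, 1 = all order! patterns equally likely.
    probs = [c / n for c in counts.values()]
    h = -sum(p * log2(p) for p in probs)
    return h / log2(factorial(order))
```

A monotonic signal contains a single ordinal pattern and so has entropy 0; irregular signals approach 1.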


2021 ◽  
pp. 1-13
Author(s):  
R. Bhuvaneswari ◽  
S. Ganesh Vaidyanathan

Diabetic Retinopathy (DR) is one of the most common diabetic diseases that affect the retina’s blood vessels. Excess glucose in the blood leads to blockage of the retinal blood vessels, weakening and damaging the retina. Automatic classification of diabetic retinopathy is a challenging task in medical research. This work proposes a Mixture of Ensemble Classifiers (MEC) to classify and grade diabetic retinopathy images using hierarchical features. We use an ensemble of support vector machine, random forest, and AdaBoost classifiers, trained on the hierarchical feature maps obtained at every pooling layer of a convolutional neural network (CNN). The feature maps are generated by applying the filters to the output of the previous layer. Lastly, we predict the class label or the grade for a given test diabetic retinopathy image by considering the class labels of all the ensembled classifiers. We tested our approach on the E-ophtha dataset for the classification task and the Messidor dataset for the grading task, achieving accuracies of 95.8% and 96.2%, respectively. A comparison among prominent convolutional neural network architectures and the proposed approach is provided.
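The final prediction step, combining the class labels of all ensemble members, can be sketched as a plurality vote. This is a generic illustration rather than the paper's code; the optional per-classifier weights are a hypothetical extension.

```python
from collections import defaultdict

def weighted_vote(predictions, weights=None):
    # predictions: one class label per ensemble member, e.g. the SVM,
    # random forest, and AdaBoost classifiers trained on different
    # pooling-layer feature maps. With no weights this is a plain
    # majority vote; ties resolve to the first label reaching the max.
    weights = weights or [1.0] * len(predictions)
    score = defaultdict(float)
    for label, w in zip(predictions, weights):
        score[label] += w
    return max(score, key=score.get)
```

With equal weights this reduces to majority voting over the classifiers' labels; weights could, for instance, reflect each member's validation accuracy.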


2019 ◽  
Vol 17 (01) ◽  
pp. 1950004 ◽  
Author(s):  
Chun Fang ◽  
Yoshitaka Moriwaki ◽  
Aikui Tian ◽  
Caihong Li ◽  
Kentaro Shimizu

Molecular recognition features (MoRFs) are key functional regions of intrinsically disordered proteins (IDPs), which play important roles in the molecular interaction network of cells and are implicated in many serious human diseases. Identifying MoRFs is essential for both functional studies of IDPs and drug design. This study applies deep learning to build an improved model for MoRF prediction. We propose a method named en_DCNNMoRF (ensemble deep convolutional neural network-based MoRF predictor). It combines the outcomes of two independent deep convolutional neural network (DCNN) classifiers that take advantage of different features. The first, DCNNMoRF1, employs a position-specific scoring matrix (PSSM) and 22 types of amino acid-related factors to describe protein sequences. The second, DCNNMoRF2, employs a PSSM and 13 types of amino acid indexes. For both classifiers, a DCNN with a novel two-dimensional attention mechanism was adopted, and an averaging step was added to further process the output probabilities of each DCNN model. Finally, en_DCNNMoRF combines the two models by averaging their final scores. When compared with other well-known tools applied to the same datasets, the accuracy of the proposed method was comparable with that of state-of-the-art methods. The related web server can be accessed freely via http://vivace.bi.a.u-tokyo.ac.jp:8008/fang/en_MoRFs.php.
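The final fusion step of en_DCNNMoRF, averaging the two classifiers' per-residue scores, is simple to sketch. The threshold value below is a hypothetical choice for illustration, not taken from the paper.

```python
def fuse_scores(scores1, scores2, threshold=0.5):
    # Average the per-residue MoRF probabilities of the two DCNN
    # classifiers, then threshold to obtain binary MoRF / non-MoRF labels.
    fused = [(a + b) / 2 for a, b in zip(scores1, scores2)]
    return [1 if s >= threshold else 0 for s in fused]
```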


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Wentao Wu ◽  
Daning Li ◽  
Jiaoyang Du ◽  
Xiangyu Gao ◽  
Wen Gu ◽  
...  

Brain tumor segmentation methods based on traditional image processing and machine learning do not perform well enough, so deep learning-based methods are widely used. Among these, convolutional network models achieve good segmentation results, but deep convolutional network models suffer from a large number of parameters and a large loss of information during encoding and decoding. This paper proposes a deep convolutional neural network fusion support vector machine algorithm (DCNN-F-SVM). The proposed brain tumor segmentation model is divided into three stages. In the first stage, a deep convolutional neural network is trained to learn the mapping from image space to tumor-marker space. In the second stage, the predicted labels obtained from the deep convolutional neural network are fed into the integrated support vector machine classifier together with the test images. In the third stage, the deep convolutional neural network and the integrated support vector machine are connected in series to train a deep classifier. Each model was run on the BraTS dataset and a self-made dataset to segment brain tumors. The segmentation results show that the proposed model performs significantly better than either the deep convolutional neural network or the integrated SVM classifier alone.
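The three-stage structure can be sketched as a composition of two predictors, where the second-stage classifier sees both the raw image and the CNN's predicted labels. The callables below are stand-ins; the actual models, feature construction, and training in the paper are far richer.

```python
def dcnn_f_svm_pipeline(image, cnn_predict, svm_predict):
    # Stage 1: the CNN maps the image to a per-pixel tumor-label map.
    label_map = cnn_predict(image)
    # Stage 2: the SVM classifies each pixel from its raw intensity
    # paired with the CNN's prediction, refining the segmentation.
    features = list(zip(image, label_map))
    return svm_predict(features)
```

Connecting the two in series (stage 3) would then mean training `svm_predict` on the CNN-augmented features end to end.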


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6008 ◽  
Author(s):  
Misbah Farooq ◽  
Fawad Hussain ◽  
Naveed Khan Baloch ◽  
Fawad Riasat Raja ◽  
Heejung Yu ◽  
...  

Speech emotion recognition (SER) plays a significant role in human–machine interaction. Emotion recognition from speech and its precise classification is a challenging task because a machine cannot understand the context of an utterance. For accurate emotion classification, emotionally relevant features must be extracted from the speech data. Traditionally, handcrafted features were used for emotional classification from speech signals; however, they are not efficient enough to accurately depict the emotional states of the speaker. In this study, the benefits of a deep convolutional neural network (DCNN) for SER are explored. For this purpose, a pretrained network is used to extract features from state-of-the-art speech emotional datasets. Subsequently, a correlation-based feature selection technique is applied to the extracted features to select the most appropriate and discriminative features for SER. For the classification of emotions, we utilize support vector machines, random forests, the k-nearest neighbors algorithm, and neural network classifiers. Experiments are performed for speaker-dependent and speaker-independent SER using four publicly available datasets: the Berlin Dataset of Emotional Speech (Emo-DB), Surrey Audio Visual Expressed Emotion (SAVEE), Interactive Emotional Dyadic Motion Capture (IEMOCAP), and the Ryerson Audio Visual Dataset of Emotional Speech and Song (RAVDESS). Our proposed method achieves an accuracy of 95.10% for Emo-DB, 82.10% for SAVEE, 83.80% for IEMOCAP, and 81.30% for RAVDESS in speaker-dependent SER experiments. Moreover, our method yields the best results for speaker-independent SER when compared with existing handcrafted-feature-based SER approaches.
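The correlation-based selection step, keeping the DCNN features most related to the emotion labels, can be illustrated with a simple filter that ranks features by absolute Pearson correlation with a numeric label. Full correlation-based feature selection also penalizes inter-feature redundancy, which this sketch omits; data and parameters are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    # Pearson correlation of two equal-length numeric sequences.
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sqrt(sum((x - ma) ** 2 for x in a))
    sb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb) if sa and sb else 0.0

def select_features(X, y, top_k):
    # X: list of samples, each a list of feature values; y: numeric labels.
    # Keep the indices of the top_k features most correlated with y.
    scored = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scored.append((abs(pearson(column, y)), j))
    scored.sort(reverse=True)
    return sorted(j for _, j in scored[:top_k])
```

A constant feature has zero variance and therefore scores 0, so it is discarded first.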


2017 ◽  
Vol 3 (2) ◽  
pp. 103-107 ◽  
Author(s):  
Jirapong Manit ◽  
Achim Schweikard ◽  
Floris Ernst

Abstract. In this paper, we present a deep convolutional neural network (CNN) approach for forehead tissue thickness estimation. We use down-sampled NIR laser backscattering images acquired from a novel marker-less near-infrared laser-based head tracking system, combined with the beam’s incident angle parameter. These two-channel augmented images were constructed as the CNN input, while a single-node output layer represents the estimated forehead tissue thickness. The models were trained and tested separately for each subject on datasets acquired from 30 subjects (high-resolution MRI data served as ground truth). To speed up training, we used a pre-trained network from the first subject to bootstrap training for each of the other subjects. We observed a clear improvement in tissue thickness estimation (mean RMSE of 0.096 mm). The proposed CNN model outperformed previous support vector regression (mean RMSE of 0.155 mm) and Gaussian process learning approaches (mean RMSE of 0.114 mm) and eliminated their restrictions for future research.
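Constructing the two-channel CNN input, the backscattering image plus a constant channel carrying the beam's incident angle, can be sketched as follows. The shapes and the angle encoding are assumptions for illustration; the paper does not specify this exact layout.

```python
def make_two_channel_input(backscatter, angle_deg):
    # backscatter: 2-D list of pixel values (the down-sampled NIR image).
    # The second channel repeats the beam's incident angle at every pixel,
    # so the CNN receives both image and scalar parameter in one tensor.
    h, w = len(backscatter), len(backscatter[0])
    angle_channel = [[float(angle_deg)] * w for _ in range(h)]
    return [backscatter, angle_channel]  # channel-first: (2, h, w)
```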


2020 ◽  
Author(s):  
Yuwei Sun ◽  
Hideya Ochiai ◽  
Hiroshi Esaki

Abstract. This article presents a method for visualizing network traffic in a LAN based on the Hilbert curve structure together with array exchange and projection, using the communication frequencies of nine protocol types as discriminators; we call the resulting images feature maps of network events. Several known scan cases were simulated in LANs and network traffic was collected to generate feature maps for each case. To solve this multi-label classification task, we adopted and trained a deep convolutional neural network (DCNN) in two different network environments, with feature maps as the input data and the scan cases as the labels. We split each dataset 4:1 into a training set and a validation set. Based on the micro and macro scores of the validation, we evaluated the performance of the scheme, achieving macro-F-measure scores of 0.982 and 0.975 and micro-F-measure scores of 0.976 and 0.965, respectively, in the two LANs.
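The Hilbert curve underlying the feature maps converts a 1-D index to 2-D coordinates while preserving locality, so values that are close numerically land close together in the image. Below is a standard index-to-coordinate routine for illustration; the paper's array exchange and projection steps are not reproduced here.

```python
def hilbert_d2xy(n, d):
    # Map index d (0 <= d < n*n) along a Hilbert curve to (x, y)
    # on an n x n grid, where n is a power of two.
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:          # rotate the quadrant if needed
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y
```

Consecutive indices always map to grid cells one step apart, which is the locality property that makes the resulting traffic images coherent.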

