Deep learning methods for protein prediction problem

2017 ◽  
Author(s):  
◽  
Son Phong Nguyen

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Computational protein structure prediction is very important for many applications in bioinformatics. Many prediction methods have been developed, including Modeller, HHpred, I-TASSER, Robetta, and MUFOLD. In the process of predicting protein structures, it is essential to accurately assess the quality of generated models. Consensus quality assessment (QA) methods, such as Pcons-net and MULTICOM-refine, which are based on structure similarity, performed well on QA tasks. The drawback of consensus QA methods is that they require a pool of diverse models to work well, which is not always available. More importantly, they cannot evaluate the quality of a single protein model, which is a very common task in protein predictions and other applications. Although many single-model quality assessment methods, such as ProQ2, MQAPmulti, OPUS-CA, DOPE, DFIRE, and RW, etc. have been developed to address that problem, their accuracy is not good enough for most real applications. In this dissertation, based on the idea of using C-[alpha] atoms distance matrix and deep learning methods, two methods have been proposed for assessing quality of protein structures. First, a novel algorithm based on deep learning techniques, called DL-Pro, is proposed. From training examples of distance matrices corresponding to good and bad models, DL-Pro learns a stacked autoencoder network as a classifier. In experiments on selected targets from the Critical Assessment of Structure Prediction (CASP) competition, DL-Pro obtained promising results, outperforming state-of-the-art energy/scoring functions, including OPUS-CA, DOPE, DFIRE, and RW. Second, a new method DeepCon-QA is developed to predict quality of single protein model. Based on the idea of using protein vector representation and distance matrix, DeepCon-QA was able to achieve comparable performance with the best state-of-the-art QA method in our experiments. It also takes advantage the strength of deep convolutional neural networks to “learn” and “understand” the input data to be able to predict output data precisely. On the other hand, this dissertation also proposes several new methods for solving loop modeling problem. Five new loop modeling methods based on machine learning techniques, called NearLooper, ConLooper, ResLooper, HyLooper1 and HyLooper2 are proposed. NearLooper is based on the nearest neighbor technique; ConLooper applies deep convolutional neural networks to predict Cα atoms distance matrix as an orientation-independent representation of protein structure; ResLooper uses residual neural networks instead of deep convolutional neural networks; HyLooper1 combines the results of NearLooper and ConLooper while HyLooper2 combines NearLooper and ResLooper. Three commonly used benchmarks for loop modeling are used to compare the performance between these methods and existing state-of-the-art methods. The experiment results show promising performance in which our best method improves existing state-of-the-art methods by 28% and 54% of average RMSD on two datasets while being comparable on the other one.

2019 ◽  
Vol 277 ◽  
pp. 02024 ◽  
Author(s):  
Lincan Li ◽  
Tong Jia ◽  
Tianqi Meng ◽  
Yizhe Liu

In this paper, an accurate two-stage deep learning method is proposed to detect vulnerable plaques in ultrasonic images of cardiovascular. Firstly, a Fully Convonutional Neural Network (FCN) named U-Net is used to segment the original Intravascular Optical Coherence Tomography (IVOCT) cardiovascular images. We experiment on different threshold values to find the best threshold for removing noise and background in the original images. Secondly, a modified Faster RCNN is adopted to do precise detection. The modified Faster R-CNN utilize six-scale anchors (122,162,322,642,1282,2562) instead of the conventional one scale or three scale approaches. First, we present three problems in cardiovascular vulnerable plaque diagnosis, then we demonstrate how our method solve these problems. The proposed method in this paper apply deep convolutional neural networks to the whole diagnostic procedure. Test results show the Recall rate, Precision rate, IoU (Intersection-over-Union) rate and Total score are 0.94, 0.885, 0.913 and 0.913 respectively, higher than the 1st team of CCCV2017 Cardiovascular OCT Vulnerable Plaque Detection Challenge. AP of the designed Faster RCNN is 83.4%, higher than conventional approaches which use one-scale or three-scale anchors. These results demonstrate the superior performance of our proposed method and the power of deep learning approaches in diagnose cardiovascular vulnerable plaques.


Author(s):  
Sheng Shen ◽  
M. K. Sadoughi ◽  
Xiangyi Chen ◽  
Mingyi Hong ◽  
Chao Hu

Over the past two decades, safety and reliability of lithium-ion (Li-ion) rechargeable batteries have been receiving a considerable amount of attention from both industry and academia. To guarantee safe and reliable operation of a Li-ion battery pack and build failure resilience in the pack, battery management systems (BMSs) should possess the capability to monitor, in real time, the state of health (SOH) of the individual cells in the pack. This paper presents a deep learning method, named deep convolutional neural networks, for cell-level SOH assessment based on the capacity, voltage, and current measurements during a charge cycle. The unique features of deep convolutional neural networks include the local connectivity and shared weights, which enable the model to estimate battery capacity accurately using the measurements during charge. To our knowledge, this is the first attempt to apply deep learning to online SOH assessment of Li-ion battery. 10-year daily cycling data from implantable Li-ion cells are used to verify the performance of the proposed method. Compared with traditional machine learning methods such as relevance vector machine and shallow neural networks, the proposed method is demonstrated to produce higher accuracy and robustness in capacity estimation.


BMC Genomics ◽  
2019 ◽  
Vol 20 (S9) ◽  
Author(s):  
Yang-Ming Lin ◽  
Ching-Tai Chen ◽  
Jia-Ming Chang

Abstract Background Tandem mass spectrometry allows biologists to identify and quantify protein samples in the form of digested peptide sequences. When performing peptide identification, spectral library search is more sensitive than traditional database search but is limited to peptides that have been previously identified. An accurate tandem mass spectrum prediction tool is thus crucial in expanding the peptide space and increasing the coverage of spectral library search. Results We propose MS2CNN, a non-linear regression model based on deep convolutional neural networks, a deep learning algorithm. The features for our model are amino acid composition, predicted secondary structure, and physical-chemical features such as isoelectric point, aromaticity, helicity, hydrophobicity, and basicity. MS2CNN was trained with five-fold cross validation on a three-way data split on the large-scale human HCD MS2 dataset of Orbitrap LC-MS/MS downloaded from the National Institute of Standards and Technology. It was then evaluated on a publicly available independent test dataset of human HeLa cell lysate from LC-MS experiments. On average, our model shows better cosine similarity and Pearson correlation coefficient (0.690 and 0.632) than MS2PIP (0.647 and 0.601) and is comparable with pDeep (0.692 and 0.642). Notably, for the more complex MS2 spectra of 3+ peptides, MS2PIP is significantly better than both MS2PIP and pDeep. Conclusions We showed that MS2CNN outperforms MS2PIP for 2+ and 3+ peptides and pDeep for 3+ peptides. This implies that MS2CNN, the proposed convolutional neural network model, generates highly accurate MS2 spectra for LC-MS/MS experiments using Orbitrap machines, which can be of great help in protein and peptide identifications. The results suggest that incorporating more data for deep learning model may improve performance.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Kai Kiwitz ◽  
Christian Schiffer ◽  
Hannah Spitzer ◽  
Timo Dickscheid ◽  
Katrin Amunts

AbstractThe distribution of neurons in the cortex (cytoarchitecture) differs between cortical areas and constitutes the basis for structural maps of the human brain. Deep learning approaches provide a promising alternative to overcome throughput limitations of currently used cytoarchitectonic mapping methods, but typically lack insight as to what extent they follow cytoarchitectonic principles. We therefore investigated in how far the internal structure of deep convolutional neural networks trained for cytoarchitectonic brain mapping reflect traditional cytoarchitectonic features, and compared them to features of the current grey level index (GLI) profile approach. The networks consisted of a 10-block deep convolutional architecture trained to segment the primary and secondary visual cortex. Filter activations of the networks served to analyse resemblances to traditional cytoarchitectonic features and comparisons to the GLI profile approach. Our analysis revealed resemblances to cellular, laminar- as well as cortical area related cytoarchitectonic features. The networks learned filter activations that reflect the distinct cytoarchitecture of the segmented cortical areas with special regard to their laminar organization and compared well to statistical criteria of the GLI profile approach. These results confirm an incorporation of relevant cytoarchitectonic features in the deep convolutional neural networks and mark them as a valid support for high-throughput cytoarchitectonic mapping workflows.


Electronics ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 190 ◽  
Author(s):  
Zhiwei Huang ◽  
Jinzhao Lin ◽  
Liming Xu ◽  
Huiqian Wang ◽  
Tong Bai ◽  
...  

The application of deep convolutional neural networks (CNN) in the field of medical image processing has attracted extensive attention and demonstrated remarkable progress. An increasing number of deep learning methods have been devoted to classifying ChestX-ray (CXR) images, and most of the existing deep learning methods are based on classic pretrained models, trained by global ChestX-ray images. In this paper, we are interested in diagnosing ChestX-ray images using our proposed Fusion High-Resolution Network (FHRNet). The FHRNet concatenates the global average pooling layers of the global and local feature extractors—it consists of three branch convolutional neural networks and is fine-tuned for thorax disease classification. Compared with the results of other available methods, our experimental results showed that the proposed model yields a better disease classification performance for the ChestX-ray 14 dataset, according to the receiver operating characteristic curve and area-under-the-curve score. An ablation study further confirmed the effectiveness of the global and local branch networks in improving the classification accuracy of thorax diseases.


2016 ◽  
Vol 21 (9) ◽  
pp. 998-1003 ◽  
Author(s):  
Oliver Dürr ◽  
Beate Sick

Deep learning methods are currently outperforming traditional state-of-the-art computer vision algorithms in diverse applications and recently even surpassed human performance in object recognition. Here we demonstrate the potential of deep learning methods to high-content screening–based phenotype classification. We trained a deep learning classifier in the form of convolutional neural networks with approximately 40,000 publicly available single-cell images from samples treated with compounds from four classes known to lead to different phenotypes. The input data consisted of multichannel images. The construction of appropriate feature definitions was part of the training and carried out by the convolutional network, without the need for expert knowledge or handcrafted features. We compare our results against the recent state-of-the-art pipeline in which predefined features are extracted from each cell using specialized software and then fed into various machine learning algorithms (support vector machine, Fisher linear discriminant, random forest) for classification. The performance of all classification approaches is evaluated on an untouched test image set with known phenotype classes. Compared to the best reference machine learning algorithm, the misclassification rate is reduced from 8.9% to 6.6%.


Author(s):  
Mohammed Abdulla Salim Al Husaini ◽  
Mohamed Hadi Habaebi ◽  
Teddy Surya Gunawan ◽  
Md Rafiqul Islam ◽  
Elfatih A. A. Elsheikh ◽  
...  

AbstractBreast cancer is one of the most significant causes of death for women around the world. Breast thermography supported by deep convolutional neural networks is expected to contribute significantly to early detection and facilitate treatment at an early stage. The goal of this study is to investigate the behavior of different recent deep learning methods for identifying breast disorders. To evaluate our proposal, we built classifiers based on deep convolutional neural networks modelling inception V3, inception V4, and a modified version of the latter called inception MV4. MV4 was introduced to maintain the computational cost across all layers by making the resultant number of features and the number of pixel positions equal. DMR database was used for these deep learning models in classifying thermal images of healthy and sick patients. A set of epochs 3–30 were used in conjunction with learning rates 1 × 10–3, 1 × 10–4 and 1 × 10–5, Minibatch 10 and different optimization methods. The training results showed that inception V4 and MV4 with color images, a learning rate of 1 × 10–4, and SGDM optimization method, reached very high accuracy, verified through several experimental repetitions. With grayscale images, inception V3 outperforms V4 and MV4 by a considerable accuracy margin, for any optimization methods. In fact, the inception V3 (grayscale) performance is almost comparable to inception V4 and MV4 (color) performance but only after 20–30 epochs. inception MV4 achieved 7% faster classification response time compared to V4. The use of MV4 model is found to contribute to saving energy consumed and fluidity in arithmetic operations for the graphic processor. The results also indicate that increasing the number of layers may not necessarily be useful in improving the performance.


Sign in / Sign up

Export Citation Format

Share Document