Accurate refinement of docked protein complexes using evolutionary information and deep learning

2016 ◽  
Vol 14 (03) ◽  
pp. 1642002 ◽  
Author(s):  
Bahar Akbal-Delibas ◽  
Roshanak Farhoodi ◽  
Marc Pomplun ◽  
Nurit Haspel

One of the major challenges for protein docking methods is to accurately discriminate native-like structures from false positives. Docking methods are often inaccurate, and their results have to be refined and re-ranked to obtain native-like complexes and remove outliers. In previous work, we introduced AccuRefiner, a machine learning-based tool for refining protein–protein complexes. Given a docked complex, the refinement tool produces a small set of refined versions of the input complex with lower root-mean-square deviation (RMSD) of atomic positions with respect to the native structure. The method employs a unique ranking tool that accurately predicts the RMSD of docked complexes with respect to the native structure. In this work, we use a deep learning network with a similar set of features and five layers. We show that a properly trained deep learning network can predict the RMSD of a docked complex with an average error margin of 1.40 Å, by approximating the complex relationship between a wide set of scoring function terms and the RMSD of a docked structure. The network was trained on 35,000 unbound docking complexes generated by RosettaDock. We tested our method on 25 putative docked complexes, also produced by RosettaDock, for five proteins that were not included in the training data. The results demonstrate that the high accuracy of the ranking tool enables AccuRefiner to consistently choose refinement candidates with lower RMSD values than the coarsely docked input structures.
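As a hedged illustration of the ranking component described above, the sketch below builds a five-layer feed-forward regressor in Keras that maps a vector of scoring-function terms to a predicted RMSD. The feature count, layer widths, and training settings are illustrative assumptions, not values reported by the authors.

```python
# Minimal sketch of a five-layer feed-forward RMSD regressor, assuming the
# input is a fixed-length vector of scoring-function terms (layer widths and
# feature count are illustrative, not taken from the paper).
import numpy as np
from tensorflow.keras import layers, models

NUM_FEATURES = 20  # hypothetical number of scoring-function terms per complex

model = models.Sequential([
    layers.Input(shape=(NUM_FEATURES,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1),  # predicted RMSD (in Å) w.r.t. the native structure
])

# Mean absolute error corresponds to the ~1.40 Å average error quoted above.
model.compile(optimizer="adam", loss="mae")

# X: (n_complexes, NUM_FEATURES) scoring terms; y: (n_complexes,) true RMSDs.
X, y = np.random.rand(1000, NUM_FEATURES), np.random.rand(1000)  # placeholder data
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.1)
```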

2019 ◽  
Vol 59 (1) ◽  
pp. 426
Author(s):  
James Lowell ◽  
Jacob Smith

The interpretation of key horizons on seismic data is an essential but time-consuming part of the subsurface workflow. This is compounded when surfaces need to be re-interpreted on variations of the same data, such as angle stacks, 4D data, or reprocessed data. Deep learning networks, a subset of machine learning, have the potential to automate this reinterpretation process and significantly increase the efficiency of the subsurface workflow. This study investigates whether a deep learning network can learn from a single horizon interpretation in order to identify that event in a different version of the same data. The results were largely successful: the target horizon was correctly identified in an alternative offset stack and correctly repositioned in areas of misalignment between the training data and the test data.
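A rough sketch of one way such single-interpretation learning could be set up: treat windows centred on the picked horizon as positive training examples and learn a classifier that re-scores candidate depths in the alternative stack. The patch-based scheme and the small 1D CNN below are assumptions; the study does not disclose its architecture.

```python
# Illustrative patch-based sketch: learn a horizon from one interpretation and
# re-pick it on another version of the same data.
import numpy as np
from tensorflow.keras import layers, models

PATCH = 32  # vertical window (samples) around a candidate pick

def extract_windows(seismic, horizon_times, patch=PATCH):
    """Cut a vertical window centred on the picked horizon in each trace.
    seismic: (n_traces, n_samples); horizon_times: picked sample per trace.
    Returns (n_traces, patch, 1) positives for training."""
    windows = [seismic[i, int(t) - patch // 2 : int(t) + patch // 2]
               for i, t in enumerate(horizon_times)]
    return np.stack(windows)[..., np.newaxis]

# A small 1D CNN that scores whether a window is centred on the target event.
model = models.Sequential([
    layers.Input(shape=(PATCH, 1)),
    layers.Conv1D(16, 5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(32, 5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# At inference, slide the window down each trace of the alternative stack and
# take the depth with the highest score as the re-picked horizon position.
```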


2020 ◽  
Vol 14 ◽  
pp. 174830262097352
Author(s):  
Anis Theljani ◽  
Ke Chen

Different from image segmentation, developing a deep learning network for image registration is less straightforward because training data cannot be prepared or supervised by humans unless they are trivial (e.g. pre-designed affine transforms). One approach for an unsupervised deep learning model is to self-train the deformation fields by a network based on a loss function with an image similarity metric and a regularisation term, just as with traditional variational methods. Such a function consists of a smoothing constraint on the derivatives and a constraint on the determinant of the transformation, in order to obtain a spatially smooth and plausible solution. Although any variational model may be used with a deep learning algorithm, the challenge lies in achieving robustness. The proposed algorithm is first trained on a new and robust variational model and tested on synthetic and real mono-modal images. The results show how it deals with large-deformation registration problems and leads to a real-time solution with no folding. It is then generalised to multi-modal images. Experiments and comparisons with learning and non-learning models demonstrate that this approach delivers good performance and simultaneously generates an accurate diffeomorphic transformation.
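The kind of loss described above (an image similarity metric plus derivative smoothing plus a determinant constraint) can be sketched directly. The weights and exact penalty forms below are illustrative choices, not the authors' variational model.

```python
# Minimal sketch of an unsupervised registration loss: similarity + smoothness
# of the displacement derivatives + a penalty keeping the Jacobian determinant
# positive (weights alpha and beta are illustrative).
import tensorflow as tf

def registration_loss(fixed, warped, disp, alpha=0.01, beta=0.1):
    # fixed, warped: (B, H, W, 1) images; disp: (B, H, W, 2) displacement field.
    similarity = tf.reduce_mean(tf.square(fixed - warped))  # SSD metric

    # First derivatives of the displacement field (forward differences).
    du_dy = disp[:, 1:, :, :] - disp[:, :-1, :, :]   # (B, H-1, W, 2)
    du_dx = disp[:, :, 1:, :] - disp[:, :, :-1, :]   # (B, H, W-1, 2)
    smoothness = tf.reduce_mean(tf.square(du_dy)) + tf.reduce_mean(tf.square(du_dx))

    # Jacobian determinant of phi(x) = x + u(x), cropped so the two derivative
    # tensors align; det < 0 indicates folding of the transformation.
    j11 = 1.0 + du_dx[:, :-1, :, 0]   # d(u_x)/dx
    j12 = du_dy[:, :, :-1, 0]         # d(u_x)/dy
    j21 = du_dx[:, :-1, :, 1]         # d(u_y)/dx
    j22 = 1.0 + du_dy[:, :, :-1, 1]   # d(u_y)/dy
    det = j11 * j22 - j12 * j21
    folding = tf.reduce_mean(tf.nn.relu(-det))  # penalise negative determinants

    return similarity + alpha * smoothness + beta * folding
```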


2019 ◽  
Author(s):  
Sambit K. Mishra ◽  
Sarah J. Cooper ◽  
Jerry M. Parks ◽  
Julie C. Mitchell

Protein-protein interactions play a key role in mediating numerous biological functions, with more than half the proteins in living organisms existing as either homo- or hetero-oligomeric assemblies. Protein subunits that form oligomers minimize the free energy of the complex, but exhaustive computational search-based docking methods have not comprehensively addressed the protein docking challenge of distinguishing a natively bound complex from non-native forms. In this study, we propose a scoring function, KFC-E, that accounts for both conservation and coevolution of putative binding hotspot residues at protein-protein interfaces. For a benchmark set of 53 bound complexes, KFC-E identifies a near-native binding mode as the top-scoring pose in 38% of the complexes and within the top 5 poses in 55%. For a set of 17 unbound complexes, KFC-E identifies a near-native pose in the top 10 ranked poses in more than 50% of the cases. By contrast, a scoring function that incorporates information on coevolution at predicted non-hotspots performs poorly. Our study highlights the importance of coevolution at hotspot residues in forming natively bound complexes and suggests a novel approach for coevolutionary scoring in protein docking.

Author Summary: A fundamental problem in biology is to distinguish between the native and non-native bound forms of protein-protein complexes. Experimental methods are often used to detect the native bound forms of proteins but are demanding in terms of time and resources. Computational approaches have proven to be a useful alternative; they sample the different binding configurations for a pair of interacting proteins and then use a heuristic or physical model to score them. In this study, we propose a new scoring approach, KFC-E, which focuses on the evolutionary contributions of a subset of key interface residues (hotspots) to identify native bound complexes. KFC-E capitalizes on the wealth of information in protein sequence databases by incorporating residue-level conservation and coevolution of putative binding hotspots. As hotspot residues mediate the binding energetics of protein-protein interactions, we hypothesize that knowledge of putative hotspots, coupled with their evolutionary information, should help identify native bound protein-protein complexes.
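As a hedged illustration of hotspot-focused scoring in the spirit of KFC-E, the sketch below sums conservation and coevolution terms over interface residue pairs restricted to predicted hotspots. The functional form, weights, and input structures are assumptions; the paper's actual formulation is not reproduced here.

```python
# Illustrative sketch of hotspot-restricted coevolutionary pose scoring.
def score_pose(interface_pairs, hotspots, coevolution, conservation,
               w_coev=1.0, w_cons=1.0):
    """
    interface_pairs: iterable of (i, j) residue pairs in contact across the pose
    hotspots: set of residue indices predicted to be binding hotspots
    coevolution: dict mapping (i, j) -> coevolution score (e.g. from an MSA)
    conservation: dict mapping i -> per-residue conservation score
    """
    score = 0.0
    for i, j in interface_pairs:
        if i in hotspots and j in hotspots:  # only hotspot pairs contribute
            score += w_coev * coevolution.get((i, j), 0.0)
            score += w_cons * (conservation.get(i, 0.0) + conservation.get(j, 0.0))
    return score  # higher score -> pose ranked closer to native

# Candidate poses would then be ranked by this score and the top poses kept.
```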


In the last few years, Deep Learning has been one of the top research areas in academia as well as in industry, and every industry is now looking for deep learning-based solutions to the problems at hand. For a researcher, learning Deep Learning through practical experiments can be a very challenging task. In particular, training a deep learning network with a huge amount of training data is impractical on a normal desktop computer or laptop. Even a small-scale computer vision application using deep learning techniques can require several days of training the deep network model on very high-end Graphical Processing Unit (GPU) or Tensor Processing Unit (TPU) clusters, which makes it impractical to do that research on a conventional laptop. In this work, we address the possibility of training a deep learning network with a significantly small dataset, by which we mean a dataset with only a few images (<10) per class. Since we are designing a prototype drone detection system, which is a single-class classification problem, we try to train the deep learning network with only a few drone images (2 images only). Our research question is: is it possible to train a YOLO deep learning network model with only two images and achieve decent detection accuracy on a constrained test dataset of drones? This paper addresses that question, and our results show that it is possible to train a deep learning network with only two images and achieve good performance under constrained application environments.
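The abstract does not name a specific framework or YOLO variant, but a minimal sketch of the two-image fine-tuning experiment could look like the following, using the ultralytics package as one concrete option and aggressive augmentation to squeeze variety out of only two training images. The file names and settings are assumptions.

```python
# Sketch of fine-tuning a YOLO detector on a two-image, single-class drone set.
from ultralytics import YOLO

# drone.yaml (hypothetical) points at the two labelled images and declares one
# class:
#   train: datasets/drone/images
#   val: datasets/drone/images
#   names: {0: drone}

model = YOLO("yolov8n.pt")  # start from pretrained weights, not from scratch
model.train(
    data="drone.yaml",
    epochs=200,
    imgsz=640,
    degrees=15.0,    # rotation augmentation
    translate=0.2,   # shift augmentation
    scale=0.5,       # zoom augmentation
    fliplr=0.5,      # horizontal flips
    mosaic=1.0,      # mosaic mixing stretches two images a long way
)

results = model.predict("test_drone.jpg", conf=0.25)  # hypothetical test image
```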


2021 ◽  
Vol 11 (1) ◽  
pp. 339-348
Author(s):  
Piotr Bojarczak ◽  
Piotr Lesiak

The article uses images from Unmanned Aerial Vehicles (UAVs) for rail diagnostics. The main advantage of such a solution compared to traditional surveys performed with measuring vehicles is that train traffic does not need to be reduced. In this study, the authors limit themselves to the diagnosis of hazardous split defects in rails. An algorithm has been proposed to detect them with an efficiency rate of about 81% for defects no smaller than 6.9% of the rail head width. It uses the FCN-8 deep learning network, implemented in the Tensorflow environment, to extract the rail head by image segmentation. Using this type of network for segmentation increases the robustness of the algorithm to changes in the brightness of the recorded rail image, which is of fundamental importance under the variable conditions of image recording by UAVs. The detection of defects in the rail head is performed by an algorithm written in Python using the OpenCV library. To locate a defect, it uses the contour of the extracted rail head together with a rectangle circumscribed around it. The use of UAVs together with artificial intelligence to detect split defects is an important element of novelty in this work.
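The post-segmentation localisation step described above (contour of the extracted rail head plus a circumscribed rectangle) might be sketched with OpenCV as follows; file names and the defect-search detail are illustrative assumptions.

```python
# Sketch: take the FCN-8 rail-head mask, find its contour, and use the
# circumscribed rectangle to localise the head for defect search.
import cv2

def locate_rail_head(mask):
    """mask: uint8 binary segmentation of the rail head from the FCN-8 network."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None, None
    head = max(contours, key=cv2.contourArea)   # largest blob = rail head
    x, y, w, h = cv2.boundingRect(head)         # circumscribed rectangle
    return head, (x, y, w, h)

mask = cv2.imread("rail_head_mask.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
contour, box = locate_rail_head(mask)
if box is not None:
    x, y, w, h = box
    roi = mask[y:y + h, x:x + w]
    # Defect candidates would then be searched inside this rectangle, e.g. by
    # comparing local intensity against the surrounding head surface.
```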


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ryoya Shiode ◽  
Mototaka Kabashima ◽  
Yuta Hiasa ◽  
Kunihiro Oka ◽  
Tsuyoshi Murase ◽  
...  

The purpose of this study was to develop a deep learning network for estimating and constructing highly accurate 3D bone models directly from actual X-ray images, and to verify its accuracy. The data used were 173 computed tomography (CT) scans and 105 actual X-ray images of a healthy wrist joint. To compensate for the small size of the dataset, digitally reconstructed radiography (DRR) images generated from CT were used as training data instead of actual X-ray images. At test time, DRR-like images were generated from the actual X-ray images and fed to the network, making high-accuracy estimation of a 3D bone model from a small dataset possible. The 3D shapes of the radius and ulna were estimated from actual X-ray images with accuracies of 1.05 ± 0.36 and 1.45 ± 0.41 mm, respectively.
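As a hedged sketch of the DRR-as-training-data idea, the snippet below produces a crude DRR-style image from a CT volume by parallel-ray attenuation projection. The study's actual DRR generation and geometry handling are considerably more sophisticated; the conversion constants here are illustrative.

```python
# Crude DRR generation: project attenuation along one axis to mimic an X-ray.
import numpy as np

def simple_drr(ct_volume, axis=1, mu=0.02):
    """ct_volume: 3D array of CT intensities in Hounsfield units (HU)."""
    # Convert HU to a rough water-relative linear attenuation coefficient.
    attenuation = mu * np.clip((ct_volume + 1000.0) / 1000.0, 0.0, None)
    line_integrals = attenuation.sum(axis=axis)   # parallel-ray sums
    drr = 1.0 - np.exp(-line_integrals)           # Beer-Lambert falloff
    # Normalise to 8-bit for use as a network training image.
    rng = drr.max() - drr.min()
    return (255.0 * (drr - drr.min()) / (rng + 1e-8)).astype(np.uint8)

# Placeholder volume standing in for a wrist CT scan.
ct = np.random.randint(-1000, 2000, size=(128, 128, 128)).astype(float)
drr_image = simple_drr(ct)
```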


2021 ◽  
Vol 11 (13) ◽  
pp. 5880
Author(s):  
Paloma Tirado-Martin ◽  
Raul Sanchez-Reillo

Nowadays, Deep Learning tools are widely applied in biometrics, and electrocardiogram (ECG) biometrics is no exception. However, algorithm performance relies heavily on a representative training dataset. ECGs suffer constant temporal variations, so it is especially important to collect databases that represent these conditions; nonetheless, restrictions on database publication obstruct further research on this topic. This work was developed with the help of a database that represents potential scenarios in biometric recognition, as data was acquired on different days and under different physical activities and positions. Classification was implemented with a Deep Learning network, BioECG, avoiding complex and time-consuming signal transformations. Exhaustive tuning was performed, including variations in enrollment length, improving ECG verification under more complex and realistic biometric conditions. Finally, this work studied one-day and two-day enrollments and their effects. Two-day enrollments resulted in large overall improvements, even when verification was performed with more unstable signals. The EER improved by 63% when including a change of position, by up to almost 99% when visits were on a different day, and by up to 91% if the user experienced a heartbeat increase after exercise.
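For reference, the Equal Error Rate (EER) figures quoted above measure the point where false accepts and false rejects balance; a minimal sketch of computing EER from genuine and impostor verification scores (placeholder data, not from the study):

```python
# EER: the operating point where false accept rate equals false reject rate.
import numpy as np

def compute_eer(genuine_scores, impostor_scores):
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    eer, best_gap = 1.0, np.inf
    for t in thresholds:
        far = np.mean(impostor_scores >= t)   # false accept rate
        frr = np.mean(genuine_scores < t)     # false reject rate
        if abs(far - frr) < best_gap:         # track the crossing point
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer

genuine = np.random.normal(0.8, 0.10, 500)   # placeholder same-user scores
impostor = np.random.normal(0.4, 0.15, 500)  # placeholder different-user scores
print(f"EER ~ {compute_eer(genuine, impostor):.3f}")
```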

