DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding

Author(s):  
Hyungro Lee ◽  
Matteo Turilli ◽  
Shantenu Jha ◽  
Debsindhu Bhowmik ◽  
Heng Ma ◽  
...  
Science ◽  
2021 ◽  
Vol 373 (6557) ◽  
pp. 866.2-866
Author(s):  
Valda Vinson

Author(s):  
Mu Gao ◽  
Jeffrey Skolnick

Abstract

Motivation: From evolutionary inference and function annotation to structural prediction, protein sequence comparison has provided crucial biological insights. While many sequence alignment algorithms have been developed, existing approaches often cannot detect hidden structural relationships in the ‘twilight zone’ of low sequence identity. To address this critical problem, we introduce a computational algorithm that performs protein Sequence Alignments from deep-Learning of Structural Alignments (SAdLSA, silent ‘d’). The key idea is to implicitly learn the protein folding code from many thousands of structural alignments using experimentally determined protein structures.

Results: To demonstrate that the folding code was learned, we first show that SAdLSA trained on pure α-helical proteins successfully recognizes pairs of structurally related pure β-sheet protein domains. Subsequent training and benchmarking on larger, highly challenging datasets show significant improvement over established approaches. For challenging cases, SAdLSA is ∼150% better than HHsearch for generating pairwise alignments and ∼50% better for identifying the proteins with the best alignments in a sequence library. The time complexity of SAdLSA is O(N) thanks to GPU acceleration.

Availability and implementation: Datasets and source code of SAdLSA are available free of charge for academic users at http://sites.gatech.edu/cssb/sadlsa/.

Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Stefan Doerr ◽  
Maciej Majewski ◽  
Adrià Pérez ◽  
Andreas Krämer ◽  
Cecilia Clementi ◽  
...  

2019 ◽  
Vol 10 (32) ◽  
pp. 7503-7515 ◽  
Author(s):  
Ryan S. DeFever ◽  
Colin Targonski ◽  
Steven W. Hall ◽  
Melissa C. Smith ◽  
Sapna Sarupria

We demonstrate a PointNet-based deep learning approach to classify local structure in molecular simulations, learning features directly from atomic coordinates.
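
The core idea behind a PointNet-style model, learning from raw coordinates in a way that does not depend on atom ordering, can be sketched as follows. This is a minimal illustration and not the authors' architecture: the layer sizes, the random untrained weights, and the function name `pointnet_features` are all assumptions made for the example. A shared two-layer network is applied to every point, and a max over points produces a permutation-invariant feature vector.

```python
import numpy as np

def pointnet_features(points, w1, b1, w2, b2):
    """Map an (n_points, 3) coordinate array to one order-independent feature vector.

    The same two-layer ReLU network is applied to every point (shared weights),
    then a max over the point axis removes any dependence on point order.
    """
    h = np.maximum(points @ w1 + b1, 0.0)   # per-point hidden layer
    h = np.maximum(h @ w2 + b2, 0.0)        # per-point feature layer
    return h.max(axis=0)                    # symmetric pooling over points

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(3, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 32)), np.zeros(32)

# Stand-in for the coordinates of 12 neighboring atoms around a central atom.
neighbors = rng.normal(size=(12, 3))
feat = pointnet_features(neighbors, w1, b1, w2, b2)

# Shuffling the atom order leaves the pooled feature unchanged.
shuffled = neighbors[rng.permutation(len(neighbors))]
feat_shuffled = pointnet_features(shuffled, w1, b1, w2, b2)
```

In a real classifier the pooled vector would feed further layers that output a structure label, and the weights would be trained on labeled local environments.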


2019 ◽  
Vol 116 (34) ◽  
pp. 16856-16865 ◽  
Author(s):  
Jinbo Xu

Direct coupling analysis (DCA) for protein folding has made very good progress, but it is not effective for proteins that lack many sequence homologs, even when coupled with time-consuming conformation sampling with fragments. We show that we can accurately predict the interresidue distance distribution of a protein by deep learning, even for proteins with ∼60 sequence homologs. Using only the geometric constraints given by the resulting distance matrix, we may construct 3D models without involving extensive conformation sampling. Our method successfully folded 21 of the 37 CASP12 hard targets with a median family size of 58 effective sequence homologs within 4 h on a Linux computer with 20 central processing units. In contrast, DCA-predicted contacts cannot be used to fold any of these hard targets in the absence of extensive conformation sampling, and the best CASP12 group folded only 11 of them by integrating DCA-predicted contacts into fragment-based conformation sampling. Rigorous experimental validation in CASP13 shows that our distance-based folding server successfully folded 17 of 32 hard targets (with a median family size of 36 sequence homologs) and obtained 70% precision on the top L/5 long-range predicted contacts. The latest experimental validation in CAMEO shows that our server predicted correct folds for 2 membrane proteins while all of the other servers failed. These results demonstrate that it is now feasible to predict the correct fold for many more proteins that lack similar structures in the Protein Data Bank, even on a personal computer.
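
The "precision on the top L/5 long-range predicted contacts" metric quoted above can be computed along the following lines. This is a sketch under stated assumptions rather than the evaluation code used in the paper: it assumes the common CASP conventions that a contact is a residue pair closer than 8 Å and that long-range means a sequence separation of at least 24 residues. The function name and the toy data are hypothetical.

```python
import numpy as np

def top_l5_long_range_precision(pred_prob, true_dist, min_sep=24, cutoff=8.0):
    """Fraction of the top L/5 predicted long-range pairs that are true contacts.

    pred_prob : (L, L) array of predicted contact probabilities.
    true_dist : (L, L) array of native interresidue distances in angstroms.
    min_sep   : minimum sequence separation for a long-range pair (assumed 24).
    cutoff    : distance below which a pair counts as a contact (assumed 8 A).
    """
    L = pred_prob.shape[0]
    i, j = np.triu_indices(L, k=min_sep)        # pairs with j - i >= min_sep
    if i.size == 0:
        raise ValueError("protein too short to have long-range pairs")
    k = min(max(1, L // 5), i.size)             # number of pairs to evaluate
    top = np.argsort(-pred_prob[i, j], kind="stable")[:k]
    return float((true_dist[i[top], j[top]] < cutoff).mean())

# Toy example: a 50-residue protein, so L/5 = 10 pairs are evaluated.
L = 50
pred = np.zeros((L, L))
dist = np.full((L, L), 20.0)
for a in range(10):
    pred[a, a + 30] = 0.9          # ten confident long-range predictions
    if a < 7:
        dist[a, a + 30] = 5.0      # seven of them are real contacts

precision = top_l5_long_range_precision(pred, dist)
print(f"top L/5 long-range precision: {precision:.2f}")
```

Only the upper triangle of each matrix is read, so each residue pair is counted once.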

