Bagging MSA Learning: Enhancing Low-Quality PSSM with Deep Learning for Accurate Protein Structure Property Prediction

Comprehensive Study on Enhancing Low-Quality Position-Specific Scoring Matrix with Deep Learning for Accurate Protein Structure Property Prediction: Using Bagging Multiple Sequence Alignment Learning

Journal of Computational Biology ◽

10.1089/cmb.2020.0416 ◽

2021 ◽

Vol 28 (4) ◽

pp. 346-361

Author(s):

Yuzhi Guo ◽

Jiaxiang Wu ◽

Hehuan Ma ◽

Sheng Wang ◽

Junzhou Huang

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Position Specific Scoring Matrix ◽

Structure Property ◽

Multiple Sequence ◽

Property Prediction ◽

Scoring Matrix ◽

Comprehensive Study

WeightAln: Weighted Homologous Alignment for Protein Structure Property Prediction

2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm49941.2020.9313340 ◽

2020 ◽

Author(s):

Yuzhi Guo ◽

Jiaxiang Wu ◽

Hehuan Ma ◽

Jinyu Yang ◽

Xinliang Zhu ◽

...

Keyword(s):

Protein Structure ◽

Structure Property ◽

Property Prediction ◽

Homologous Alignment

Towards an Instant Structure-Property Prediction Quality Control Tool for Additive Manufactured Steel using a Crystal Plasticity Trained Deep Learning Surrogate

Materials & Design ◽

10.1016/j.matdes.2021.110345 ◽

2021 ◽

pp. 110345

Author(s):

Yuhui Tu ◽

Zhongzhou Liu ◽

Luiz Carneiro ◽

Caitriona M. Ryan ◽

Andrew C. Parnell ◽

...

Keyword(s):

Quality Control ◽

Deep Learning ◽

Crystal Plasticity ◽

Structure Property ◽

Property Prediction ◽

Prediction Quality ◽

Quality Control Tool ◽

Control Tool

RaptorX-Property: a web server for protein structure property prediction

Nucleic Acids Research ◽

10.1093/nar/gkw306 ◽

2016 ◽

Vol 44 (W1) ◽

pp. W430-W435 ◽

Cited By ~ 193

Author(s):

Sheng Wang ◽

Wei Li ◽

Shiwang Liu ◽

Jinbo Xu

Keyword(s):

Protein Structure ◽

Web Server ◽

Structure Property ◽

Property A ◽

Property Prediction

Deep learning techniques have significantly impacted protein structure prediction and protein design

Current Opinion in Structural Biology ◽

10.1016/j.sbi.2021.01.007 ◽

2021 ◽

Vol 68 ◽

pp. 194-207

Author(s):

Robin Pearce ◽

Yang Zhang

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Protein Structure Prediction ◽

Protein Design ◽

Structure Prediction ◽

Learning Techniques

Template-based prediction of protein structure with deep learning

BMC Genomics ◽

10.1186/s12864-020-07249-8 ◽

2020 ◽

Vol 21 (S11) ◽

Author(s):

Haicang Zhang ◽

Yufeng Shen

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Structure Prediction ◽

Tertiary Structure ◽

Query Sequence ◽

Dynamic Programming Algorithm ◽

Tertiary Structure Prediction ◽

Protein Tertiary Structure ◽

Protein Threading ◽

Protein Tertiary Structure Prediction

Abstract

Background: Accurate prediction of protein structure is fundamentally important for understanding the biological function of proteins. Template-based modeling, including protein threading and homology modeling, is a popular method for protein tertiary structure prediction. However, accurate template-query alignment and template selection are still very challenging, especially for proteins with only distant homologs available.

Results: We propose a new template-based modeling method called ThreaderAI to improve protein tertiary structure prediction. ThreaderAI formulates the task of aligning a query sequence with a template as the classical pixel classification problem in computer vision and naturally applies a deep residual neural network to the prediction. ThreaderAI first employs deep learning to predict a residue-residue aligning probability matrix by integrating sequence profile, predicted sequential structural features, and predicted residue-residue contacts, and then builds the template-query alignment by applying a dynamic programming algorithm to the probability matrix. We evaluated our method both in generating accurate template-query alignments and in protein threading. Experimental results show that ThreaderAI outperforms the currently popular template-based modeling methods HHpred and CNFpred and the latest contact-assisted method CEthreader, especially on proteins that do not have close homologs with known structures. In particular, in terms of alignment accuracy measured with TM-score, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 56, 13, and 11%, respectively, on template-query pairs at the fold level of similarity from SCOPe data. On CASP13’s TBM-hard data, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 16, 9, and 8% in terms of TM-score, respectively.

Conclusions: These results demonstrate that with the help of deep learning, ThreaderAI can significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins.
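
The abstract above says that the alignment is built by running dynamic programming over a predicted residue-residue aligning probability matrix, but it does not give the scoring scheme. The following is a minimal sketch of that general idea, not ThreaderAI's actual implementation: a Needleman-Wunsch-style global alignment is swapped in, and the function name `align_from_probabilities`, the linear `gap_penalty`, and the `offset` subtracted from each probability are all assumptions made for illustration.

```python
from typing import List, Tuple


def align_from_probabilities(
    prob: List[List[float]],
    gap_penalty: float = 0.1,  # assumed value, not taken from the paper
    offset: float = 0.5,       # assumed: pairs below this probability score negatively
) -> Tuple[float, List[Tuple[int, int]]]:
    """Global alignment over a query-by-template probability matrix.

    prob[i][j] is the predicted probability that query residue i aligns
    with template residue j. Returns the total score and the aligned
    (query_index, template_index) pairs.
    """
    n = len(prob)
    m = len(prob[0]) if n else 0

    # score[i][j]: best score for the first i query and first j template residues.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - gap_penalty
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap_penalty

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + prob[i - 1][j - 1] - offset,  # align i with j
                score[i - 1][j] - gap_penalty,                      # skip a query residue
                score[i][j - 1] - gap_penalty,                      # skip a template residue
            )

    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + prob[i - 1][j - 1] - offset:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] - gap_penalty:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy matrix (made-up numbers): query residue k is most likely aligned
# with template residue k + 1.
toy = [
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.9],
]
total, aligned = align_from_probabilities(toy)
print(aligned)
```

Subtracting an offset from each probability lets low-confidence pairs lose to a gap. A real threading method would more likely use affine gap penalties and position-dependent scores.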

Deep Learning Insights into Lanthanides Complexation Chemistry

Molecules ◽

10.3390/molecules26113237 ◽

2021 ◽

Vol 26 (11) ◽

pp. 3237

Author(s):

Artem A. Mitrofanov ◽

Petr I. Matveev ◽

Kristina V. Yakubova ◽

Alexandru Korotcov ◽

Boris Sattarov ◽

...

Keyword(s):

Deep Learning ◽

Stability Constants ◽

Black Box ◽

Lanthanide Ions ◽

Structure Property ◽

Target Property ◽

Mutual Location ◽

Molecule Structure ◽

Complexation Chemistry ◽

Main Influence

Modern structure–property models are widely used in chemistry; however, in many cases they remain a kind of “black box” with no clear path from molecule structure to target property. Here we present an example of using deep learning not only to build a model but also to identify the key structural fragments of ligands that influence metal complexation. For a series of chemically similar lanthanide ions, we collected data on the stability of their complexes, built models that predict stability constants, and decoded the models to obtain the key fragments responsible for complexation efficiency. The results correlate well with experimental data and with modern theories of complexation. The mutual location of the binding centers was shown to have the main influence on the constants.

Predicting carbon nanotube forest attributes and mechanical properties using simulated images and deep learning

npj Computational Materials ◽

10.1038/s41524-021-00603-8 ◽

2021 ◽

Vol 7 (1) ◽

Cited By ~ 1

Author(s):

Taher Hajilounezhad ◽

Rina Bao ◽

Kannappan Palaniappan ◽

Filiz Bunyak ◽

Prasad Calyam ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Carbon Nanotube ◽

High Throughput ◽

Self Assembly ◽

Mechanical Performance ◽

Physical Parameters ◽

Structure Property ◽

Processing Parameter ◽

Physics Based Simulation

Abstract

Understanding and controlling the self-assembly of vertically oriented carbon nanotube (CNT) forests is essential for realizing their potential in myriad applications. The governing process–structure–property mechanisms are poorly understood, and the processing parameter space is far too vast to exhaustively explore experimentally. We overcome these limitations by using a physics-based simulation as a high-throughput virtual laboratory and image-based machine learning to relate CNT forest synthesis attributes to their mechanical performance. Using CNTNet, our image-based deep learning classifier module trained with synthetic imagery, combinations of CNT diameter, density, and population growth rate classes were labeled with an accuracy of >91%. The CNTNet regression module predicted CNT forest stiffness and buckling load properties with a lower root-mean-square error than that of a regression predictor based on CNT physical parameters. These results demonstrate that image-based machine learning trained using only simulated imagery can distinguish subtle CNT forest morphological features to predict physical material properties with high accuracy. CNTNet paves the way to incorporate scanning electron microscope imagery for high-throughput material discovery.

Improved protein structure prediction by deep learning irrespective of co-evolution information

Nature Machine Intelligence ◽

10.1038/s42256-021-00348-5 ◽

2021 ◽

Author(s):

Jinbo Xu ◽

Matthew McPartlon ◽

Jin Li

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction

Mol-BERT: An Effective Molecular Representation with BERT for Molecular Property Prediction

Wireless Communications and Mobile Computing ◽

10.1155/2021/7181815 ◽

2021 ◽

Vol 2021 ◽

pp. 1-7

Author(s):

Juncai Li ◽

Xiaofei Jiang

Keyword(s):

Deep Learning ◽

Language Processing ◽

Large Scale ◽

Molecular Data ◽

Molecular Property ◽

Property Prediction ◽

Learning Framework ◽

Learning Techniques ◽

Potential Benefits ◽

Current Sequence

Molecular property prediction is an essential task in drug discovery. Most computational approaches based on deep learning focus either on designing novel molecular representations or on combining them with advanced models. However, researchers pay less attention to the potential benefits of massive unlabeled molecular data (e.g., ZINC). The task becomes increasingly challenging because labeled data are limited in scale. Motivated by recent advances in pretrained models for natural language processing, we note that a drug molecule can, to some extent, be naturally viewed as language. In this paper, we investigate how to adapt the pretrained BERT model to extract useful molecular substructure information for molecular property prediction. We present a novel end-to-end deep learning framework, named Mol-BERT, that combines an effective molecular representation with a pretrained BERT model tailored for molecular property prediction. Specifically, a large-scale BERT model is pretrained to generate embeddings of molecular substructures using four million unlabeled drug SMILES (from ZINC 15 and ChEMBL 27). The pretrained BERT model can then be fine-tuned on various molecular property prediction tasks. To examine the performance of Mol-BERT, we conduct several experiments on four widely used molecular datasets. In comparison with traditional and state-of-the-art baselines, the results show that Mol-BERT can outperform current sequence-based methods and achieve an improvement of at least 2% in ROC-AUC score on the Tox21, SIDER, and ClinTox datasets.
