domain boundary prediction Latest Research Papers

DomBpred: protein domain boundary predictor using inter-residue distance and domain-residue level clustering

10.1101/2021.11.19.469204 ◽

2021 ◽

Author(s):

Zhongze Yu ◽

Chunxiang Peng ◽

Jun Liu ◽

Biao Zhang ◽

Xiaogen Zhou ◽

...

Keyword(s):

Clustering Algorithm ◽

State Of The Art ◽

Domain Boundary ◽

Residue Level ◽

Single Domain ◽

Protein Domain ◽

Test Set ◽

Cut Points ◽

Domain Boundary Prediction ◽

Domain Protein

Domain boundary prediction is one of the most important problems in the study of protein structure and function, especially for large proteins. At present, most domain boundary prediction methods have low accuracy and are limited in their handling of multi-domain proteins. In this study, we develop a sequence-based protein domain boundary predictor named DomBpred. In DomBpred, the input sequence is first classified as either a single-domain or a multi-domain protein using a sequence metric built on a constructed single-domain sequence library. For a multi-domain protein, a domain-residue level clustering algorithm inspired by the Ising model clusters spatially close residues according to inter-residue distance. The unclassified residues and the residues at the edges of the clusters are then adjusted using the secondary structure to form potential cut points. Finally, a domain boundary scoring function recursively evaluates the potential cut points to generate the domain boundaries. DomBpred is tested on the large-scale FUpred test set comprising 2549 proteins. Experimental results show that DomBpred performs better than the state-of-the-art methods in classifying whether protein sequences are composed of single or multiple domains, with a Matthews correlation coefficient of 0.882. Moreover, on 849 multi-domain proteins, the domain boundary distance and normalised domain overlap scores of DomBpred are 0.523 and 0.824, which are 5.0% and 4.2% higher, respectively, than those of the best comparison method. Comparison with other methods on the given test set shows that DomBpred outperforms most state-of-the-art sequence-based methods and even achieves better results than the top-level template-based method.
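For readers unfamiliar with the Matthews correlation coefficient quoted for the single-domain versus multi-domain classification step, the sketch below computes it from confusion-matrix counts using the standard formula. The function name and the example counts are illustrative assumptions; they are not taken from the DomBpred paper or its test set.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Here a "positive" could mean "multi-domain protein"; that labelling
    is an assumption made for illustration only.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    print(round(matthews_corrcoef(tp=6, tn=3, fp=1, fn=2), 3))
```

The coefficient ranges from -1 (total disagreement) to 1 (perfect agreement), which makes it a common choice when the two classes differ in size.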

preciseTAD: A machine learning framework for precise 3D domain boundary prediction at base-level resolution

10.1101/2020.09.03.282186 ◽

2020 ◽

Author(s):

Spiro C. Stilianoudakis ◽

Mikhail G. Dozmorov

Keyword(s):

Domain Boundary ◽

Model Performance ◽

Learning Framework ◽

Base Level ◽

Topologically Associating Domains ◽

Annotation Data ◽

Domain Boundaries ◽

Under Sampling ◽

Domain Boundary Prediction ◽

Genome Annotations

The low resolution of high-throughput chromatin conformation capture data limits the precise mapping of boundaries of topologically associating domains and chromatin loops. We developed preciseTAD, an optimized random forest model trained on high-resolution genome annotation data (e.g., CTCF ChIP-seq) to predict the location of domain boundaries at base-level resolution. Distance between boundaries and annotations, random under-sampling, and transcription factor binding sites resulted in the best model performance. preciseTAD boundaries were more enriched for CTCF, RAD21, SMC3, and ZNF143, and conserved across cell lines. Using genome annotations, pre-trained models can detect boundaries in cells without Hi-C data. preciseTAD is available at https://bioconductor.org/packages/preciseTAD
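Random under-sampling, which the abstract names as one ingredient of the best-performing model, can be sketched in a few lines: keep every example of the rarer class and draw an equally sized random subset of the more common class. The function below is a generic illustration in plain Python; it is not the preciseTAD API, and the 0/1 labelling of genomic bins is an assumption.

```python
import random


def random_undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset of `labels`.

    `labels` is a sequence of 0/1 values (for example, 1 = bin overlaps a
    domain boundary; this encoding is assumed here). All minority-class
    indices are kept, and the majority class is randomly down-sampled
    to the same size.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((positives, negatives), key=len)
    kept_majority = rng.sample(majority, len(minority))
    return sorted(minority + kept_majority)


if __name__ == "__main__":
    # Hypothetical labels: 3 boundary bins among 100 bins.
    labels = [1] * 3 + [0] * 97
    print(random_undersample(labels))
```

Fixing the seed makes the selected subset reproducible, which helps when comparing models trained on the same balanced sample.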

DNN-Dom: predicting protein domain boundary from sequence alone by deep neural network

Bioinformatics ◽

10.1093/bioinformatics/btz464 ◽

2019 ◽

Vol 35 (24) ◽

pp. 5128-5136 ◽

Cited By ~ 3

Author(s):

Qiang Shi ◽

Weiya Chen ◽

Siqi Huang ◽

Fanglin Jin ◽

Yinghao Dong ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Structure Prediction ◽

Domain Boundary ◽

Protein Domain ◽

Supplementary Information ◽

High Dimensions ◽

Long Range Interactions ◽

Domain Boundary Prediction ◽

And Function

Motivation: Accurate delineation of protein domain boundaries plays an important role in protein engineering and structure prediction. Although machine-learning methods are widely used to predict domain boundaries, these approaches often ignore long-range interactions among residues, which have been shown to improve prediction performance. However, simultaneously modelling local and global interactions to further improve domain boundary prediction remains a challenging problem. Results: This article employs a hybrid deep learning method that combines convolutional neural network and gated recurrent unit models for domain boundary prediction. It not only captures local and non-local interactions but also fuses these features for prediction. Additionally, we adopt a balanced Random Forest for classification to deal with the high imbalance of samples and the high dimensionality of the deep features. Experimental results show that our proposed approach (DNN-Dom) outperforms existing machine-learning-based methods for boundary prediction. We expect that DNN-Dom can be useful for assisting protein structure and function prediction. Availability and implementation: The method is available as DNN-Dom Server at http://isyslab.info/DNN-Dom/. Supplementary information: Supplementary data are available at Bioinformatics online.

ConDo: protein domain boundary prediction using coevolutionary information

Bioinformatics ◽

10.1093/bioinformatics/bty973 ◽

2018 ◽

Vol 35 (14) ◽

pp. 2411-2417 ◽

Cited By ~ 2

Author(s):

Seung Hwan Hong ◽

Keehyoung Joo ◽

Jooyoung Lee

Keyword(s):

Long Range ◽

Domain Boundary ◽

Prediction Method ◽

Protein Domain ◽

Supplementary Information ◽

Sequence Information ◽

Multiple Sequence ◽

Domain Boundary Prediction ◽

Correlation Information ◽

And Function

Motivation: Domain boundary prediction is one of the most important problems in the study of protein structure and function. Many sequence-based domain boundary prediction methods are either template-based or machine learning (ML) based. ML-based methods often perform poorly because they use only local (i.e. short-range) features. These conventional features, such as sequence profiles, secondary structures and solvent accessibilities, are typically restricted to within 20 residues of the domain boundary candidate. Results: To improve the performance of ML-based methods, we developed a new protein domain boundary prediction method (ConDo) that utilizes novel long-range features such as coevolutionary information in addition to the aforementioned local window features as inputs for ML. Toward this purpose, two types of coevolutionary information were extracted from multiple sequence alignment using direct coupling analysis: (i) partially aligned sequences, and (ii) correlated mutation information. Both the partially aligned sequence information and the modularity of residue–residue couplings possess long-range correlation information. Availability and implementation: https://github.com/gicsaw/ConDo.git Supplementary information: Supplementary data are available at Bioinformatics online.
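The local window features described above are restricted to residues near a boundary candidate. As a minimal sketch of what such a window looks like, the function below extracts the residues within a fixed distance of a candidate position and pads the result at the chain termini. The function name, the padding character and the default half-width of 20 are assumptions for illustration; this is not code from ConDo.

```python
def local_window(sequence, center, half_width=20, pad="-"):
    """Return the residues within `half_width` positions of `center`.

    The result always has length 2 * half_width + 1; positions that fall
    outside the sequence are filled with `pad` (a hypothetical choice).
    """
    if not 0 <= center < len(sequence):
        raise IndexError("center is outside the sequence")
    left = max(0, center - half_width)
    right = min(len(sequence), center + half_width + 1)
    left_pad = pad * (half_width - (center - left))
    right_pad = pad * (half_width - (right - 1 - center))
    return left_pad + sequence[left:right] + right_pad


if __name__ == "__main__":
    # Toy sequence and a small window so the padding is visible.
    print(local_window("ACDEFGHIK", center=0, half_width=2))
```

Any per-residue feature (profile column, secondary-structure state, solvent accessibility) could be windowed the same way; information from residues beyond the window is invisible to the model, which is the limitation that long-range coevolutionary features are meant to address.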

Interdomain conformational flexibility underpins the activity of UGGT, the eukaryotic glycoprotein secretion checkpoint

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1703682114 ◽

2017 ◽

Vol 114 (32) ◽

pp. 8544-8549 ◽

Cited By ~ 28

Author(s):

Pietro Roversi ◽

Lucia Marti ◽

Alessandro T. Caputo ◽

Dominic S. Alonzi ◽

Johan C. Hill ◽

...

Keyword(s):

Crystal Structures ◽

Secretory Pathway ◽

Domain Boundary ◽

Conformational Flexibility ◽

Structural Determination ◽

Disulfide Bridges ◽

Er Retention ◽

Domain Boundary Prediction ◽

Glycoprotein Secretion ◽

Single Enzyme

Glycoproteins traversing the eukaryotic secretory pathway begin life in the endoplasmic reticulum (ER), where their folding is surveyed by the 170-kDa UDP-glucose:glycoprotein glucosyltransferase (UGGT). The enzyme acts as the single glycoprotein folding quality control checkpoint: it selectively reglucosylates misfolded glycoproteins, promotes their association with ER lectins and associated chaperones, and prevents premature secretion from the ER. UGGT has long resisted structural determination and sequence-based domain boundary prediction. Questions remain on how this single enzyme can flag misfolded glycoproteins of different sizes and shapes for ER retention and how it can span variable distances between the site of misfold and a glucose-accepting N-linked glycan on the same glycoprotein. Here, crystal structures of a full-length eukaryotic UGGT reveal four thioredoxin-like (TRXL) domains arranged in a long arc that terminates in two β-sandwiches tightly clasping the glucosyltransferase domain. The fold of the molecule is topologically complex, with the first β-sandwich and the fourth TRXL domain being encoded by nonconsecutive stretches of sequence. In addition to the crystal structures, a 15-Å cryo-EM reconstruction reveals interdomain flexibility of the TRXL domains. Double cysteine point mutants that engineer extra interdomain disulfide bridges rigidify the UGGT structure and exhibit impaired activity. The intrinsic flexibility of the TRXL domains of UGGT may therefore endow the enzyme with the promiscuity needed to recognize and reglucosylate its many different substrates and/or enable reglucosylation of N-linked glycans situated at variable distances from the site of misfold.

PDP-RF: Protein Domain Boundary Prediction Using Random Forest Classifier

Lecture Notes in Computer Science - Pattern Recognition and Machine Intelligence ◽

10.1007/978-3-319-19941-2_42 ◽

2015 ◽

pp. 441-450 ◽

Cited By ~ 2

Author(s):

Piyali Chatterjee ◽

Subhadip Basu ◽

Julian Zubek ◽

Mahantapas Kundu ◽

Mita Nasipuri ◽

...

Keyword(s):

Random Forest ◽

Domain Boundary ◽

Random Forest Classifier ◽

Protein Domain ◽

Domain Boundary Prediction

Protein Domain Boundary Prediction Using Multiple Protein Properties

Journal of Bionanoscience ◽

10.1166/jbns.2013.1095 ◽

2013 ◽

Vol 7 (1) ◽

pp. 104-109

Author(s):

Jiaxin Wang ◽

Jiafeng Wang ◽

Wei Du ◽

Chong Yu ◽

Yanchun Liang

Keyword(s):

Domain Boundary ◽

Protein Domain ◽

Multiple Protein ◽

Protein Properties ◽

Domain Boundary Prediction

DoBo: Protein domain boundary prediction by integrating evolutionary signals and machine learning

BMC Bioinformatics ◽

10.1186/1471-2105-12-43 ◽

2011 ◽

Vol 12 (1) ◽

pp. 43 ◽

Cited By ~ 38

Author(s):

Jesse Eickholt ◽

Xin Deng ◽

Jianlin Cheng

Keyword(s):

Machine Learning ◽

Domain Boundary ◽

Protein Domain ◽

Domain Boundary Prediction

An Improved Profile-Level Domain Linker Propensity Index for Protein Domain Boundary Prediction.

Protein and Peptide Letters ◽

10.2174/092986611794328717 ◽

2011 ◽

Vol 18 (1) ◽

pp. 7-16 ◽

Cited By ~ 14

Author(s):

Yanfeng Zhang ◽

Bin Liu ◽

Qiwen Dong ◽

Victor X. Jin

Keyword(s):

Domain Boundary ◽

Protein Domain ◽

Domain Boundary Prediction

Protein Domain Boundary Prediction

Algorithms in Computational Molecular Biology ◽

10.1002/9780470892107.ch23 ◽

2010 ◽

pp. 501-519 ◽

Cited By ~ 1

Author(s):

Paul D. Yoo ◽

Bing Bing Zhou ◽

Albert Y. Zomaya

Keyword(s):

Domain Boundary ◽

Protein Domain ◽

Domain Boundary Prediction
