homology information Latest Research Papers

Shape-restrained modelling of protein-small molecule complexes with HADDOCK

Small molecule docking remains one of the most valuable computational techniques for the structure prediction of protein-small molecule complexes. It allows us to study the interactions between compounds and the protein receptors they target in atomic detail, in a timely and efficient manner. Here we present a new protocol in HADDOCK, our integrative modelling platform, which incorporates homology information for both receptor and compounds. It makes use of HADDOCK's unique ability to integrate information in the simulation to drive it toward conformations that agree with the provided data. The focal point is the use of shape restraints derived from homologous compounds bound to the target receptors. We have developed two protocols: in the first, the shape is composed of fake atom beads based on the positions of the heavy atoms of the homologous template compound, whereas in the second the shape is additionally annotated with pharmacophore data for some or all beads. For both protocols, ambiguous distance restraints are subsequently defined between those beads and the heavy atoms of the ligand to be docked. We have benchmarked the performance of these protocols with a fully unbound version of the widely used DUD-E dataset. In this unbound docking scenario, our template/shape-based docking protocol reaches an overall success rate of 81% on 99 complexes, which is close to the best results reported for bound docking on the DUD-E dataset.
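To make the idea of an ambiguous distance restraint between a shape bead and the ligand heavy atoms more concrete, here is a minimal sketch. It assumes the r^-6 summed effective distance that is commonly used for ambiguous restraints; the abstract does not state the exact functional form, and the function names and the `upper_bound` value are hypothetical, not part of the HADDOCK interface.

```python
import math

def effective_distance(bead, ligand_atoms):
    """Effective distance from one bead to a set of ligand heavy atoms.

    Assumption: r^-6 summation, so the restraint is satisfied as soon as
    any one ligand atom comes close to the bead.
    """
    total = 0.0
    for atom in ligand_atoms:
        d = max(math.dist(bead, atom), 1e-6)  # guard against zero distance
        total += d ** -6
    return total ** (-1.0 / 6.0)

def violated_beads(beads, ligand_atoms, upper_bound=1.0):
    """Return the beads whose effective distance exceeds a hypothetical upper bound (in angstrom)."""
    return [b for b in beads if effective_distance(b, ligand_atoms) > upper_bound]

# Beads placed on the heavy-atom positions of a made-up template compound.
beads = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(violated_beads(beads, ligand))
```

With this form the effective distance is never larger than the shortest bead-atom distance, which is what makes the restraint "ambiguous": any ligand heavy atom may satisfy it.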

A Deep Semi-Supervised Framework for Accurate Modelling of Orphan Sequences

10.1101/2020.07.13.201459 ◽ 2020

Author(s): Lewis Moffat ◽ David T. Jones

Keyword(s): Structure Prediction ◽ Secondary Structure Prediction ◽ Supervised Machine Learning ◽ Evolutionary Information ◽ Predictive Methods ◽ Homology Information ◽ Single Method ◽ Mature Field ◽ New Generation ◽ Single Sequence

Abstract

Accurate modelling of a single orphan protein sequence in the absence of homology information has remained a challenge for several decades. Although not as performant as their homology-based counterparts, single-sequence bioinformatic methods are not constrained by the requirement of evolutionary information and so have a swathe of applications and uses. By taking a bioinformatics approach to semi-supervised machine learning we develop Profile Augmentation of Single Sequences (PASS), a simple but powerful framework for developing accurate single-sequence methods. To demonstrate the effectiveness of PASS we apply it to the mature field of secondary structure prediction. In doing so we develop S4PRED, the successor to the open-source PSIPRED-Single method, which achieves an unprecedented Q3 score of 75.3% on the standard CB513 test. PASS provides a blueprint for the development of a new generation of predictive methods, advancing our ability to model individual protein sequences.
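For readers unfamiliar with the Q3 score quoted above, the following sketch shows the standard definition: the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The function name and the toy strings are illustrative assumptions and are not taken from S4PRED.

```python
def q3_score(predicted, observed):
    """Percentage of residues whose three-state label matches the reference."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Toy example: 6 of 8 residues agree.
print(q3_score("HHHECCCC", "HHHHCCCE"))  # 75.0
```

On a benchmark set, the score is typically computed over all residues of all test proteins rather than averaged per protein; which convention a given paper uses should be checked against its methods section.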

FunGeCo: a web-based tool for estimation of functional potential of bacterial genomes and microbiomes using gene context information

Bioinformatics ◽ 10.1093/bioinformatics/btz957 ◽ 2019 ◽ Vol 36 (8) ◽ pp. 2575-2577

Author(s): Swadha Anand ◽ Bhusan K Kuntal ◽ Anwesha Mohapatra ◽ Vineet Bhatt ◽ Sharmila S Mande

Keyword(s): Supplementary Information ◽ Valuable Resource ◽ Bacterial Genomes ◽ Web Based ◽ Microbial Genomes ◽ Functional Potential ◽ Genomic Location ◽ Homology Information ◽ Functional Inference ◽ Analysis Platform

Abstract

Motivation: The functional potential of genomes and metagenomes inferred using homology-based methods is often subject to certain limitations, especially for proteins with homologs that function in multiple pathways. Augmenting the homology information with the genomic location of the constituent genes can significantly improve the accuracy of estimated functions. This can help distinguish the cognate homolog belonging to a candidate pathway from its other homologs that function in different pathways.

Results: In this article, we present a web-based analysis platform, 'FunGeCo', to enable gene-context-based functional inference for microbial genomes and metagenomes. It is expected to be a valuable resource and to complement the existing tools for understanding the functional potential of microbes that reside in an environment.

Availability and implementation: https://web.rniapps.net/fungeco [Freely available for academic use].

Supplementary information: Supplementary data are available at Bioinformatics online.
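The idea of using genomic location to pick the cognate homolog can be illustrated with a small sketch. This is not FunGeCo's algorithm, which the abstract does not describe; it only assumes that a candidate surrounded by other genes of the same pathway is the more plausible one. The function names, the gene-index representation and the `window` size are all hypothetical.

```python
def context_support(candidate_index, pathway_hits, window=5):
    """Count other pathway genes located within `window` gene positions of the candidate."""
    return sum(
        1
        for idx in pathway_hits
        if idx != candidate_index and abs(idx - candidate_index) <= window
    )

def pick_cognate(candidates, pathway_hits, window=5):
    """Among equally good homology hits, prefer the one with most pathway neighbours."""
    return max(candidates, key=lambda idx: context_support(idx, pathway_hits, window))

# Two homologs of the same enzyme at gene positions 12 and 340;
# other genes of the candidate pathway sit at positions 10, 11, 13 and 500.
print(pick_cognate([12, 340], [10, 11, 13, 500]))  # 12
```

Here the homolog at position 12 has three pathway neighbours and the one at 340 has none, so homology alone cannot separate them but gene context can.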

GOLabeler: Improving Sequence-based Large-scale Protein Function Prediction by Learning to Rank

10.1101/145763 ◽ 2017

Author(s): Ronghui You ◽ Zihan Zhang ◽ Yi Xiong ◽ Fengzhu Sun ◽ Hiroshi Mamitsuka ◽ ...

Keyword(s): Protein Function ◽ Large Scale ◽ Protein Function Prediction ◽ Learning To Rank ◽ Classification Problem ◽ Function Prediction ◽ New Paradigm ◽ Effective Manner ◽ Homology Information ◽ Significant Performance

Abstract

Motivation: Gene Ontology (GO) has been widely used to annotate functions of proteins and understand their biological roles. Currently only <1% of more than 70 million proteins in UniProtKB have experimental GO annotations, implying a strong need for automated function prediction (AFP) of proteins. AFP is a hard multi-label classification problem because a single protein can be associated with a varying number of GO terms. Most of these proteins have only sequences as input information, indicating the importance of sequence-based AFP (SAFP: sequences are the only input). Furthermore, homology-based SAFP tools are competitive in AFP competitions, but they do not necessarily work well for so-called difficult proteins, which have <60% sequence identity to proteins that already have annotations. Thus, the vital and challenging problem now is to develop a method for SAFP, particularly for difficult proteins.

Methods: The key to this method is to extract not only homology information but also diverse, deep-rooted information/evidence from sequence inputs and to integrate them into a predictor efficiently and effectively. We propose GOLabeler, which integrates five component classifiers, trained from different features, including GO term frequency, sequence alignment, amino acid trigram, domains and motifs, and biophysical properties, in the framework of learning to rank (LTR), a new paradigm of machine learning that is especially powerful for multi-label classification.

Results: Extensive evaluation of GOLabeler on large-scale datasets revealed numerous favorable aspects, including a significant performance advantage over state-of-the-art AFP methods.

Contact: [email protected]

Prediction of Protein Secondary Structure Using the Weighted Combination of Homology Information of Protein Sequences

The Journal of the Korean Institute of Information and Communication Engineering ◽ 10.6109/jkiice.2016.20.9.1816 ◽ 2016 ◽ Vol 20 (9) ◽ pp. 1816-1821

Author(s): Sang-mun Chi

Keyword(s): Secondary Structure ◽ Protein Secondary Structure ◽ Protein Sequences ◽ Homology Information ◽ Weighted Combination

Exploiting Homology Information in Nontemplate Based Prediction of Protein Structures

Journal of Chemical Theory and Computation ◽ 10.1021/acs.jctc.5b00371 ◽ 2015 ◽ Vol 11 (10) ◽ pp. 5045-5051 ◽ Cited By ~ 1

Author(s): Alfredo Iacoangeli ◽ Paolo Marcatili ◽ Anna Tramontano

Keyword(s): Protein Structures ◽ Homology Information

Searching combinatorial optimality using graph-based homology information

Applicable Algebra in Engineering Communication and Computing ◽ 10.1007/s00200-014-0248-x ◽ 2015 ◽ Vol 26 (1-2) ◽ pp. 103-120 ◽ Cited By ~ 3

Author(s): Pedro Real ◽ Helena Molina-Abril ◽ Aldo Gonzalez-Lorenzo ◽ Alexandra Bac ◽ Jean-Luc Mari

Keyword(s): Homology Information

SnowyOwl: accurate prediction of fungal genes by using RNA-Seq and homology information to select among ab initio models

BMC Bioinformatics ◽ 10.1186/1471-2105-15-229 ◽ 2014 ◽ Vol 15 (1) ◽ Cited By ~ 23

Author(s): Ian Reid ◽ Nicholas O’Toole ◽ Omar Zabaneh ◽ Reza Nourzadeh ◽ Mahmoud Dahdouli ◽ ...

Keyword(s): Ab Initio ◽ Accurate Prediction ◽ Rna Seq ◽ Homology Information ◽ Fungal Genes ◽ Ab Initio Models

EXIA2: Web Server of Accurate and Rapid Protein Catalytic Residue Prediction

BioMed Research International ◽ 10.1155/2014/807839 ◽ 2014 ◽ Vol 2014 ◽ pp. 1-12 ◽ Cited By ~ 1

Author(s): Chih-Hao Lu ◽ Chin-Sheng Yu ◽ Yu-Tung Chien ◽ Shao-Wei Huang

Keyword(s): Prediction Method ◽ Side Chain ◽ Catalytic Residue ◽ Chain Orientation ◽ Catalytic Residues ◽ Special Orientation ◽ Homology Information ◽ Benchmark Datasets ◽ Side Chain Orientation ◽ Better Than

We propose a method (EXIA2) for catalytic residue prediction that is based on protein structure and does not need homology information. The method exploits the special side chain orientation of catalytic residues: we found that the side chain of a catalytic residue usually points to the center of the catalytic site. This orientation is usually observed in catalytic residues but not in noncatalytic residues, which usually have random side chain orientation. When combined with PSI-BLAST sequence conservation, the method is shown to be the most accurate catalytic residue prediction method currently available. It performs better than other competing methods on several benchmark datasets that include over 1,200 enzyme structures. The areas under the ROC curve (AUC) on these benchmark datasets range from 0.934 to 0.968.
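The notion of a side chain "pointing to" the center of a site can be expressed as an angle between two vectors. The sketch below is one plausible way to measure it and is an assumption, not EXIA2's published definition: it takes the C-alpha position, a side-chain centroid and a site center as given, and all names are hypothetical.

```python
import math

def orientation_angle(ca, side_chain_centroid, site_center):
    """Angle in degrees between the CA->side-chain vector and the CA->site-center vector."""
    v = [s - c for s, c in zip(side_chain_centroid, ca)]
    w = [t - c for t, c in zip(site_center, ca)]
    norm_v = math.sqrt(sum(x * x for x in v))
    norm_w = math.sqrt(sum(x * x for x in w))
    if norm_v == 0.0 or norm_w == 0.0:
        raise ValueError("vectors must have non-zero length")
    cos_angle = sum(a * b for a, b in zip(v, w)) / (norm_v * norm_w)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # clamp rounding error
    return math.degrees(math.acos(cos_angle))

# A side chain aimed straight at the site center gives an angle of 0 degrees,
# one aimed directly away gives 180 degrees.
print(orientation_angle((0, 0, 0), (1, 0, 0), (5, 0, 0)))
print(orientation_angle((0, 0, 0), (-1, 0, 0), (5, 0, 0)))
```

Under this reading, small angles would be typical of catalytic residues, while noncatalytic residues would show no preferred angle.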

The Candida Genome Database: The new homology information page highlights protein similarity and phylogeny

Nucleic Acids Research ◽ 10.1093/nar/gkt1046 ◽ 2013 ◽ Vol 42 (D1) ◽ pp. D711-D716 ◽ Cited By ~ 28

Author(s): Jonathan Binkley ◽ Martha B. Arnaud ◽ Diane O. Inglis ◽ Marek S. Skrzypek ◽ Prachi Shah ◽ ...

Keyword(s): Information Page ◽ Homology Information

homology information
Recently Published Documents

Shape-restrained modelling of protein-small molecule complexes with HADDOCK

A Deep Semi-Supervised Framework for Accurate Modelling of Orphan Sequences

FunGeCo: a web-based tool for estimation of functional potential of bacterial genomes and microbiomes using gene context information

GOLabeler: Improving Sequence-based Large-scale Protein Function Prediction by Learning to Rank

Prediction of Protein Secondary Structure Using the Weighted Combination of Homology Information of Protein Sequences

Exploiting Homology Information in Nontemplate Based Prediction of Protein Structures

Searching combinatorial optimality using graph-based homology information

SnowyOwl: accurate prediction of fungal genes by using RNA-Seq and homology information to select among ab initio models

EXIA2: Web Server of Accurate and Rapid Protein Catalytic Residue Prediction

The Candida Genome Database: The new homology information page highlights protein similarity and phylogeny
