A New Family of Similarity Measures for Scoring Confidence of Protein Interactions using Gene Ontology

10.1101/459107 ◽

2018 ◽

Cited By ~ 1

Author(s):

Madhusudan Paul ◽

Ashish Anand

Keyword(s):

Gene Ontology ◽

Protein Interactions ◽

Large Scale ◽

Similarity Measures ◽

Confidence Score ◽

False Positives ◽

Positive Interactions ◽

Relative Depth ◽

New Family ◽

Edge Based

Abstract: Large-scale protein-protein interaction (PPI) data has the potential to play a significant role in understanding cellular processes. However, the presence of a considerable fraction of false positives is a bottleneck in realizing this potential. There have been continuous efforts to utilize complementary resources for scoring the confidence of PPIs in such a manner that false positive interactions get a low confidence score. Gene Ontology (GO), a taxonomy of biological terms to represent the properties of gene products and their relations, has been widely used for this purpose. We utilize GO to introduce a new set of specificity measures: Relative Depth Specificity (RDS), Relative Node-based Specificity (RNS), and Relative Edge-based Specificity (RES), leading to a new family of similarity measures. We use these similarity measures to obtain a confidence score for each PPI. We evaluate the new measures using four different benchmarks. We show that all three measures are quite effective. Notably, RNS and RES more effectively distinguish true PPIs from false positives than the existing alternatives. RES also shows a robust set-discriminating power and can be useful for protein functional clustering as well.
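The general idea of scoring a PPI from ontology annotations can be sketched as follows. This is a minimal illustration only: the abstract does not give the formulas for RDS, RNS, or RES, so the sketch swaps in a generic depth-based (Wu-Palmer-style) term similarity combined with a best-match average. The toy ontology, the term names, and the function names are all hypothetical.

```python
# Hypothetical toy ontology: child -> set of parents (not real GO terms).
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "protein_binding": {"binding"},
    "dna_binding": {"binding"},
    "kinase": {"catalysis"},
}

def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def depth(term):
    """Length of the longest path from the root to the term (root = 0)."""
    parents = PARENTS[term]
    return 0 if not parents else 1 + max(depth(p) for p in parents)

def term_similarity(a, b):
    """Wu-Palmer-style score: 2 * depth(deepest common ancestor) / (depth(a) + depth(b))."""
    common = ancestors(a) & ancestors(b)
    deepest = max(depth(t) for t in common)
    denom = depth(a) + depth(b)
    return 1.0 if denom == 0 else 2.0 * deepest / denom

def confidence(terms_a, terms_b):
    """Best-match average of term similarities between two annotation sets."""
    best_a = [max(term_similarity(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_similarity(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))

# Two proteins annotated with sibling terms score higher than two proteins
# whose annotations only share the root.
print(confidence(["protein_binding"], ["dna_binding"]))
print(confidence(["protein_binding"], ["kinase"]))
```

Under this kind of scheme, an interaction whose partners share only very general ancestors receives a low score, which is the behavior a confidence measure needs in order to flag likely false positives.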

Impact of the Continuous Evolution of Gene Ontology on the Performance of Similarity Measures for Scoring Confidence of Protein Interactions

SN Computer Science ◽

10.1007/s42979-020-00350-5 ◽

2020 ◽

Vol 1 (6) ◽

Author(s):

Madhusudan Paul ◽

Ashish Anand ◽

Saptarshi Pyne

Keyword(s):

Gene Ontology ◽

Protein Interactions ◽

Similarity Measures ◽

Continuous Evolution

Gene Ontology-driven inference of protein–protein interactions using inducers

Bioinformatics ◽

10.1093/bioinformatics/btr610 ◽

2011 ◽

Vol 28 (1) ◽

pp. 69-75 ◽

Cited By ~ 43

Author(s):

Stefan R. Maetschke ◽

Martin Simonsen ◽

Melissa J. Davis ◽

Mark A. Ragan

Keyword(s):

Gene Ontology ◽

Protein Interactions ◽

Protein Protein Interactions

gProt: Annotating Protein Interactions Using Google and Gene Ontology

Lecture Notes in Computer Science - Knowledge-Based Intelligent Information and Engineering Systems ◽

10.1007/11553939_166 ◽

2005 ◽

pp. 1195-1203 ◽

Cited By ~ 2

Author(s):

Rune Sætre ◽

Amund Tveit ◽

Martin Thorsen Ranang ◽

Tonje S. Steigedal ◽

Liv Thommesen ◽

...

Keyword(s):

Gene Ontology ◽

Protein Interactions

Unraveling the Mysteries of Phospholipid Scrambling

Thrombosis and Haemostasis ◽

10.1055/s-0037-1616224 ◽

2001 ◽

Vol 86 (07) ◽

pp. 266-275 ◽

Cited By ~ 135

Author(s):

Therese Wiedmer ◽

Peter Sims

Keyword(s):

Plasma Membrane ◽

Protein Interactions ◽

Cell Activation ◽

Phosphorylation Site ◽

Reticuloendothelial System ◽

Phospholipid Scramblase ◽

New Family ◽

Receptor Sites ◽

Cell Clearance

Summary: Plasma membrane phospholipid asymmetry is maintained by an aminophospholipid translocase that transports phosphatidylserine (PS) and phosphatidylethanolamine (PE) from the outer to the inner membrane leaflet. Cell activation or injury leads to redistribution of all major lipid classes within the plasma membrane, resulting in surface exposure of PS and PE. Cell surface-exposed PS can serve as receptor sites for coagulation enzyme complexes, and contributes to cell clearance by the reticuloendothelial system. The mechanism(s) by which this PL "scrambling" occurs is poorly understood. A protein called phospholipid scramblase (PLSCR1) has been cloned that exhibits Ca2+-activated PL scrambling activity in vitro. PLSCR1 belongs to a new family of proteins with no apparent homology to other known proteins. PLSCR1 is palmitoylated and contains a potential protein kinase C phosphorylation site. It further contains multiple PxxP and PPxY motifs, representing potential binding motifs for SH3 and WW domains implicated in mediating protein-protein interactions. Although at least two proteins have been shown to associate with PLSCR1, the functional significance of such interactions remains to be elucidated. Evidence that PLSCR1 may serve functions other than its proposed activity as PL scramblase is also presented.

GSAn: an alternative to enrichment analysis for annotating gene sets

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqaa017 ◽

2020 ◽

Vol 2 (2) ◽

Cited By ~ 5

Author(s):

Aaron Ayllon-Benitez ◽

Romain Bourqui ◽

Patricia Thébault ◽

Fleur Mougin

Keyword(s):

Gene Ontology ◽

Semantic Similarity ◽

A Priori ◽

Similarity Measures ◽

Enrichment Analysis ◽

Biological Information ◽

Underlying Structure ◽

Gene Set ◽

Sequencing Technologies ◽

Gene Coverage

Abstract: The revolution in new sequencing technologies is leading to new understandings of the relations between genotype and phenotype. To interpret and analyze data that are grouped according to a phenotype of interest, methods based on statistical enrichment have become a standard in biology. However, these methods synthesize the biological information by a priori selecting the over-represented terms and may suffer from focusing on the most studied genes, which represent a limited coverage of annotated genes within a gene set. Semantic similarity measures have shown great results in pairwise gene comparison by taking advantage of the underlying structure of the Gene Ontology. We developed GSAn, a novel gene set annotation method that uses semantic similarity measures to synthesize a priori Gene Ontology annotation terms. The originality of our approach is to identify the best compromise between the number of retained annotation terms, which has to be drastically reduced, and the number of related genes, which has to be as large as possible. Moreover, GSAn offers interactive visualization facilities dedicated to the multi-scale analysis of gene set annotations. Compared to enrichment analysis tools, GSAn has shown excellent results in terms of maximizing the gene coverage while minimizing the number of terms.
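The trade-off between few annotation terms and high gene coverage can be illustrated with a greedy set-cover heuristic. This is not GSAn's algorithm, which the abstract does not specify; it is a standard textbook technique swapped in purely for illustration, and the term identifiers, gene identifiers, and function name are hypothetical.

```python
def greedy_term_cover(term_to_genes, gene_set):
    """Pick terms one at a time, each time taking the term that covers
    the most still-uncovered genes. Returns (chosen terms, uncovered genes)."""
    uncovered = set(gene_set)
    chosen = []
    while uncovered:
        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(term_to_genes),
                   key=lambda t: len(term_to_genes[t] & uncovered))
        gained = term_to_genes[best] & uncovered
        if not gained:
            break  # no remaining term annotates any uncovered gene
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered

# Hypothetical annotations: term -> genes annotated with that term.
annotations = {
    "t1": {"g1", "g2", "g3"},
    "t2": {"g3", "g4"},
    "t3": {"g4"},
}
terms, missed = greedy_term_cover(annotations, {"g1", "g2", "g3", "g4", "g5"})
print(terms, missed)
```

In this toy input, two terms suffice to cover four of the five genes, and the gene with no annotation is reported as uncovered rather than forcing extra terms into the result.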

Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

BMC Genomics ◽

10.1186/1471-2164-10-288 ◽

2009 ◽

Vol 10 (1) ◽

pp. 288 ◽

Cited By ~ 71

Author(s):

Stefanie De Bodt ◽

Sebastian Proost ◽

Klaas Vandepoele ◽

Pierre Rouzé ◽

Yves Van de Peer

Keyword(s):

Arabidopsis Thaliana ◽

Gene Ontology ◽

Protein Interactions ◽

Protein Protein Interactions

Exploring information from the topology beneath the Gene Ontology terms to improve semantic similarity measures

Gene ◽

10.1016/j.gene.2016.04.024 ◽

2016 ◽

Vol 586 (1) ◽

pp. 148-157 ◽

Cited By ~ 3

Author(s):

Shu-Bo Zhang ◽

Jian-Huang Lai

Keyword(s):

Gene Ontology ◽

Semantic Similarity ◽

Similarity Measures

Gene Ontology Similarity Measures Based on Linear Order Statistics

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488506004254 ◽

2006 ◽

Vol 14 (06) ◽

pp. 639-661 ◽

Cited By ~ 7

Author(s):

James M. Keller ◽

James C. Bezdek ◽

Mihail Popescu ◽

Nikhil R. Pal ◽

Joyce A. Mitchell ◽

...

Keyword(s):

Gene Ontology ◽

Order Statistics ◽

Gene Product ◽

Linear Order ◽

Similarity Measures ◽

Product Family ◽

Amino Acid Sequences ◽

Gene Products ◽

Multiple Sources ◽

Similarity Relations

The standard method for comparing gene products (proteins or RNA) is to compare their DNA or amino acid sequences. Additional information about some gene products may come from multiple sources, including the set of Gene Ontology (GO) annotations and the set of journal abstracts related to each gene product. Gene product similarity measures can be based on evaluating sets of descriptor terms found in the GO taxonomy, and/or the index term sets of the related documents (MeSH annotations). While our techniques can be applied to term sets from any taxonomy, we restrict our examples in this article to GO annotations. We investigate the use of linear order statistics (LOS) to build similarity relations on pairs of terms that are used in the GO as linguistic descriptors of genes and gene products. One of our objectives is to investigate the construction and utility of visual assessments of relational data (in this case, dissimilarity matrices) for discovering tendencies of groups of gene products to "cluster together". We use gene product data derived from a group of 194 gene products representing three protein families extracted from ENSEMBL. Our examples suggest that LOS similarity measures are more effective than traditional sequence-based similarity measures at capturing relationships between pairs of gene products in ENSEMBL families when annotation information is available. We show examples of how these similarity measures can assist in knowledge discovery and gene product family validation.
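A linear order statistic is a weighted sum of the sorted values of a sample, which makes the maximum, minimum, mean, and median special cases of one operator. The sketch below shows that operator applied to a list of pairwise term similarities. Using it to aggregate term-pair scores into a single gene-product similarity is one plausible reading of the abstract, not a confirmed reproduction of the authors' method; the function name and the sample scores are hypothetical.

```python
def linear_order_statistic(values, weights):
    """Weighted sum of the values sorted in ascending order.

    weights[i] multiplies the i-th smallest value; the weights are
    expected to be non-negative and to sum to 1.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values)
    return sum(w * v for w, v in zip(weights, ordered))

# Hypothetical pairwise term similarities between two gene products.
scores = [0.2, 0.9, 0.5]

los_max = linear_order_statistic(scores, [0.0, 0.0, 1.0])      # largest value
los_min = linear_order_statistic(scores, [1.0, 0.0, 0.0])      # smallest value
los_median = linear_order_statistic(scores, [0.0, 1.0, 0.0])   # middle value
los_mean = linear_order_statistic(scores, [1 / 3, 1 / 3, 1 / 3])
print(los_max, los_min, los_median, los_mean)
```

Choosing the weight vector controls how optimistic the aggregate is: weight concentrated on the largest values rewards a single strong term match, while uniform weights reward consistent agreement across all term pairs.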

Predicting shrimp protein-protein interactions and gene ontology terms using association rule and semantic similarity calculation

2014 International Computer Science and Engineering Conference (ICSEC) ◽

10.1109/icsec.2014.6978208 ◽

2014 ◽

Author(s):

Sirintra Vaiwsri ◽

Anuphap Prachumwat ◽

Sudsanguan Ngamsuriyaroj ◽

Ananta Srisuphab

Keyword(s):

Gene Ontology ◽

Semantic Similarity ◽

Protein Interactions ◽

Association Rule ◽

Protein Protein Interactions ◽

Similarity Calculation
