GO-Based Term Semantic Similarity

Author(s):  
Marco A. Alvarez ◽  
Xiaojun Qi ◽  
Changhui Yan

As the Gene Ontology (GO) plays an increasingly important role in bioinformatics research, there has been great interest in developing objective and accurate methods for calculating semantic similarity between GO terms. In this chapter, the authors first introduce the basic concepts related to the GO and then briefly review the current advances and challenges in the development of such methods. The authors then introduce a semantic similarity method that does not rely on external data sources. Using this method as an example, they show how different properties of the GO can be exploited to calculate semantic similarities between pairs of GO terms. The chapter concludes with some thoughts on directions for future research in this field.

2013 ◽  
pp. 93-104


2004 ◽  
Vol 36 (5) ◽  
pp. 365-370 ◽  
Author(s):  
Shun-Liang Cao ◽  
Lei Qin ◽  
Wei-Zhong He ◽  
Yang Zhong ◽  
Yang-Yong Zhu ◽  
...  

Abstract Semantic search is a key issue in integration of heterogeneous biological databases. In this paper, we present a methodology for implementing semantic search in BioDW, an integrated biological data warehouse. Two tables are presented: the DB2GO table to correlate Gene Ontology (GO) annotated entries from BioDW data sources with GO, and the semantic similarity table to record similarity scores derived from any pair of GO terms. Based on the two tables, multifarious ways for semantic search are provided and the corresponding entries in heterogeneous biological databases in semantic terms can be expediently searched.


2015 ◽  
Vol 12 (4) ◽  
pp. 1235-1253 ◽  
Author(s):  
Shu-Bo Zhang ◽  
Jian-Huang Lai

Measuring the semantic similarity between pairs of terms in Gene Ontology (GO) can help to compare genes that cannot be compared by other computational methods. In this study, we proposed an integrated information-based similarity measurement (IISM) to calculate the semantic similarity between two GO terms by taking into account the multiple common ancestors that they share, and aggregating the semantic information and depth information of the non-redundant common ancestors. Our method searches for non-redundant common ancestors in an effective way. Validation experiments were conducted on both a gene expression dataset and a pathway dataset, and the experimental results suggest the superiority of our method over some existing methods.


2011 ◽  
Vol 09 (06) ◽  
pp. 681-695 ◽  
Author(s):  
MARCO A. ALVAREZ ◽  
CHANGHUI YAN

Existing methods for calculating semantic similarities between pairs of Gene Ontology (GO) terms and gene products often rely on external databases like Gene Ontology Annotation (GOA) that annotate gene products using the GO terms. This dependency leads to some limitations in real applications. Here, we present a semantic similarity algorithm (SSA) that relies exclusively on the GO. When calculating the semantic similarity between a pair of input GO terms, SSA takes into account the shortest path between them, the depth of their nearest common ancestor, and a novel similarity score calculated between the definitions of the involved GO terms. In our work, we use SSA to calculate semantic similarities between pairs of proteins by combining pairwise semantic similarities between the GO terms that annotate the involved proteins. The reliability of SSA was evaluated by comparing the resulting semantic similarities between proteins with the functional similarities between proteins derived from expert annotations or sequence similarity. Comparisons with existing state-of-the-art methods showed that SSA is highly competitive with the other methods. SSA provides a reliable measure for semantic similarity independent of external databases of functional-annotation observations.
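
The two structural quantities named in this abstract, the shortest path between two terms and the depth of their nearest common ancestor, can be illustrated on a toy graph. The following is a minimal sketch, not the SSA implementation: the ontology `PARENTS`, the term names, and all function names are hypothetical; "nearest common ancestor" is assumed here to mean the deepest shared ancestor; depth is assumed to be the shortest distance to the root. The definition-based score and the way SSA combines the three quantities are not given in the abstract and are left out.

```python
from collections import deque

# Hypothetical toy ontology (not real GO data): child -> list of parents.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C"],
}

def ancestors(term):
    """Return the term itself plus every term reachable via parent edges."""
    seen = {term}
    queue = deque([term])
    while queue:
        for parent in PARENTS[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

def depth(term):
    """Shortest number of parent edges from term up to a root (assumed definition)."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        if not PARENTS[node]:  # a term without parents is a root
            return dist[node]
        for parent in PARENTS[node]:
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    raise ValueError("no root reachable")

def shortest_path_length(a, b):
    """Number of edges on the shortest path between a and b, ignoring edge direction."""
    neighbours = {t: set(p) for t, p in PARENTS.items()}
    for child, parents in PARENTS.items():
        for parent in parents:
            neighbours[parent].add(child)
    dist = {a: 0}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return dist[node]
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("terms are not connected")

def deepest_common_ancestor_depth(a, b):
    """Depth of the deepest ancestor shared by a and b (assumed reading of 'nearest')."""
    common = ancestors(a) & ancestors(b)
    return max(depth(t) for t in common)
```

For the toy terms `E` and `D`, the shortest path runs through `C` and `A`, and the only shared ancestors are `A` and `root`.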


2019 ◽  
Author(s):  
Dat Duong ◽  
Ankith Uppunda ◽  
Lisa Gai ◽  
Chelsea Ju ◽  
James Zhang ◽  
...  

Abstract Protein functions can be described by the Gene Ontology (GO) terms, allowing us to compare the functions of two proteins by measuring the similarity of the terms assigned to them. Recent works have applied neural network models to derive the vector representations for GO terms and compute similarity scores for these terms by comparing their vector embeddings. There are two typical ways to embed GO terms into vectors; a model can either embed the definitions of the terms or the topology of the terms in the ontology. In this paper, we design three tasks to critically evaluate the GO embeddings of two recent neural network models, and further introduce additional models for embedding GO terms, adapted from three popular neural network frameworks: Graph Convolution Network (GCN), Embeddings from Language Models (ELMo), and Bidirectional Encoder Representations from Transformers (BERT), which have not yet been explored in previous works. Task 1 studies edge cases where the GO embeddings may not provide meaningful similarity scores for GO terms. We find that all neural network based methods fail to produce high similarity scores for related terms when these terms have low Information Content values. Task 2 is a canonical task which estimates how well GO embeddings can compare functions of two orthologous genes or two interacting proteins. The best neural network methods for this task are those that embed GO terms using their definitions, and the differences among such methods are small. Task 3 evaluates how GO embeddings affect the performance of GO annotation methods, which predict whether a protein should be labeled by certain GO terms. When the annotation datasets contain many samples for each GO label, GO embeddings do not improve the classification accuracy. Machine learning GO annotation methods often remove rare GO labels from the training datasets so that the model parameters can be efficiently trained. We evaluate whether GO embeddings can improve prediction of rare labels unseen in the training datasets, and find that GO embeddings based on the BERT framework achieve the best results in this setting. We present our embedding methods and three evaluation tasks as the basis for future research on this topic.
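
Comparing two GO terms through their vector embeddings can be sketched with cosine similarity. This is an illustration under stated assumptions only: the abstract does not say which comparison function the evaluated models use, cosine similarity is simply one common choice, and the function name and example vectors below are hypothetical rather than outputs of any of the models mentioned.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical embeddings for two GO terms (made-up numbers, not model output).
term_a = [0.2, 0.7, 0.1]
term_b = [0.1, 0.8, 0.3]
score = cosine_similarity(term_a, term_b)
```

A score near 1 means the two vectors point in nearly the same direction; whether that corresponds to functional relatedness depends on how the embeddings were trained.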


2016 ◽  
Vol 35 (3) ◽  
pp. 1-32 ◽  
Author(s):  
Roger Simnett ◽  
Elizabeth Carson ◽  
Ann Vanstraelen

SUMMARY We present a comprehensive review of the 130 international archival auditing and assurance research articles that were published in eight leading accounting and auditing journals for 1995–2014. In order to support evidence-based international standard setting and regulation, and to identify what has been learned to date, we map this research to the International Auditing and Assurance Standards Board's (IAASB) Framework for Audit Quality. For the areas that have been well researched, we provide a summary of the findings and outline how they can inform standard setters and regulators. We also observe a significant evolution in international archival research over the 20 years of our study, as evidenced by the measures of audit quality, data sources used, and approaches used to address endogeneity concerns. Finally, we identify some challenges in undertaking international archival auditing and assurance research and identify opportunities for future research. Our review is of interest to researchers, practitioners, and standard setters/regulators involved in international auditing and assurance activities.


2019 ◽  
Vol 26 (1) ◽  
pp. 38-52 ◽  
Author(s):  
Dat Duong ◽  
Wasi Uddin Ahmad ◽  
Eleazar Eskin ◽  
Kai-Wei Chang ◽  
Jingyi Jessica Li

Urban Studies ◽  
2020 ◽  
Vol 57 (16) ◽  
pp. 3217-3235
Author(s):  
Martijn van den Hurk ◽  
Tuna Tasan-Kok

Urban regeneration projects involve complex contractual deals between public- and private-sector actors. Critics contend that contracts hamper opportunities for flexibility and change in these projects due to strict provisions that are incorporated in legal agreements. This article offers contrary empirical insights based on a study of contractual arrangements for urban regeneration projects in the Netherlands, including an analysis of interviews and confidential documents. It zooms in on provisions on safeguarding and adaptation, finding that urban regeneration projects remain receptive to flexibility and change. Public-sector actors use their room to manoeuvre while operating contracts, seeking to secure social relations and keep projects going. This article taps into data sources that are difficult to access, addressing what is included in contracts and how they are used by practitioners, and presents questions for future research on contracts in the urban built environment.


2008 ◽  
Vol 17 (2) ◽  
pp. 119-136 ◽  
Author(s):  
Mohamed Kohia ◽  
John Brackle ◽  
Kenny Byrd ◽  
Amanda Jennings ◽  
William Murray ◽  
...  

Objective: To analyze research literature that has examined the effectiveness of various physical therapy interventions on lateral epicondylitis. Data Sources: Evidence was compiled with data located using the PubMed, EBSCO, The Cochrane Library, and the Hooked on Evidence databases from 1994 to 2006 using the key words lateral epicondylitis, tennis elbow, modalities, intervention, management of, treatment for, radiohumeral bursitis, and experiment. Study Selection: The literature used included peer-reviewed studies that evaluated the effectiveness of physical therapy treatments on lateral epicondylitis. Future research is needed to provide a better understanding of beneficial treatment options for people living with this condition. Data Synthesis: Shockwave therapy and Cyriax therapy protocol are effective physical therapy interventions. Conclusions: There are numerous treatments for lateral epicondylitis and no single intervention has been proven to be the most efficient. Therefore, future research is needed to provide a better understanding of beneficial treatment options for people living with this condition.

