Sematch: Semantic similarity framework for Knowledge Graphs

2017 ◽  
Vol 130 ◽  
pp. 30-32 ◽  
Author(s):  
Ganggao Zhu ◽  
Carlos A. Iglesias
Author(s):  
Camilo Morales ◽  
Diego Collarana ◽  
Maria-Esther Vidal ◽  
Sören Auer

2021 ◽  
pp. 016555152110205
Author(s):  
Majed A Alkhamees ◽  
Mohammed A Alnuem ◽  
Saleh M Al-Saleem ◽  
Abdulrakeeb M Al-Ssulami

Semantic similarity between concepts expresses, in a computational model, how close two concepts are in meaning. The problem has recently attracted considerable attention from researchers seeking to automate the understanding of word meanings and thereby speed up the classification of users’ opinions and attitudes embedded in text. This article presents a semantic similarity metric, weighted information-content (wic), which exploits the information content of the least common subsumer of the two compared concepts together with depth information in knowledge graphs such as DBpedia and YAGO. The two similarity components are combined using calibrated cooperative contributions. A statistical test using Spearman correlations on well-known human-judgement word-similarity data sets showed that the wic metric produced similarities more highly correlated with human judgements than state-of-the-art metrics. In addition, an evaluation on a real-world aspect category classification task showed increased accuracy and recall.
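The abstract does not give the wic formula, so the following is only a minimal sketch of the general idea of combining an information-content score on the least common subsumer with a depth-based score. It assumes a Lin-style IC similarity, a Wu-Palmer-style depth similarity, and a simple linear weight `alpha`; the toy taxonomy, the counts, and all function names are hypothetical and are not taken from the paper.

```python
import math

# Toy taxonomy (child -> parent); hypothetical data for illustration only.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical occurrence counts used to estimate information content.
COUNTS = {"entity": 0, "animal": 2, "vehicle": 1, "dog": 5, "cat": 4, "car": 8}


def ancestors(concept):
    """Return the path from the concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(concept))


def lcs(a, b):
    """Least common subsumer: the deepest ancestor shared by a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("concepts share no ancestor")


def subtree_count(concept):
    """Occurrences of the concept plus all of its descendants."""
    return sum(c for node, c in COUNTS.items() if concept in ancestors(node))


def ic(concept):
    """Information content: -log of the concept's relative frequency."""
    return -math.log(subtree_count(concept) / subtree_count("entity"))


def lin_sim(a, b):
    """Lin-style similarity based on the IC of the least common subsumer."""
    denom = ic(a) + ic(b)
    return 2 * ic(lcs(a, b)) / denom if denom > 0 else 1.0


def wup_sim(a, b):
    """Wu-Palmer-style similarity based on depths in the taxonomy."""
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))


def combined_sim(a, b, alpha=0.5):
    """Assumed linear blend of the two components (not the paper's formula)."""
    return alpha * lin_sim(a, b) + (1 - alpha) * wup_sim(a, b)
```

Under these assumptions, `combined_sim("dog", "cat")` is larger than `combined_sim("dog", "car")`, because the first pair shares the more specific subsumer `animal` while the second shares only the root.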


2021 ◽  
Author(s):  
Alexandros Vassiliades ◽  
Theodore Patkos ◽  
Vasilis Efthymiou ◽  
Antonis Bikakis ◽  
Nick Bassiliades ◽  
...  

Infusing autonomous artificial systems with knowledge about the physical world they inhabit is of utmost importance and a long-standing goal in Artificial Intelligence (AI) research. Training systems with relevant data is a common approach; yet it is not always feasible to find the data needed, especially since a large portion of this knowledge is commonsense. In this paper, we propose a novel method for extracting and evaluating relations between objects and actions from knowledge graphs such as ConceptNet and WordNet. We present a complete methodology for locating, enriching, evaluating, cleaning and exposing knowledge from such resources, taking semantic similarity methods into consideration. One important aspect of our method is its flexibility in deciding how to deal with the noise that exists in the data. We compare our method with typical approaches found in the relevant literature, such as embeddings and methods that exploit the topology or the semantic information in a knowledge graph. We test the performance of these methods on the Something-Something Dataset.


2015 ◽  
Vol 4 (2) ◽  
pp. 471-492 ◽  
Author(s):  
Andrea Ballatore ◽  
Michela Bertolotto ◽  
David Wilson

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Rita T. Sousa ◽  
Sara Silva ◽  
Catia Pesquita

Abstract

Background: In recent years, biomedical ontologies have become important for describing existing biological knowledge in the form of knowledge graphs. Data mining approaches that work with knowledge graphs have been proposed, but they are based on vector representations that do not capture the full underlying semantics. An alternative is to use machine learning approaches that explore semantic similarity. However, since ontologies can model multiple perspectives, semantic similarity computations for a given learning task need to be fine-tuned to account for this. Obtaining the best combination of semantic similarity aspects for each learning task is not trivial and typically depends on expert knowledge.

Results: We have developed a novel approach, evoKGsim, that applies Genetic Programming over a set of semantic similarity features, each based on a semantic aspect of the data, to obtain the best combination for a given supervised learning task. The approach was evaluated on several benchmark datasets for protein-protein interaction prediction using the Gene Ontology as the knowledge graph to support semantic similarity, and it outperformed competing strategies, including manually selected combinations of semantic aspects emulating expert knowledge. evoKGsim was also able to learn species-agnostic models with different combinations of species for training and testing, effectively addressing the limitations of predicting protein-protein interactions for species with fewer known interactions.

Conclusions: evoKGsim can overcome one of the limitations in knowledge graph-based semantic similarity applications: the need to expertly select which aspects should be taken into account for a given application. Applying this methodology to protein-protein interaction prediction proved successful, paving the way to broader applications.
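The idea of choosing a combination of per-aspect similarities by its performance on labelled pairs can be illustrated with a small sketch. Note that this is not Genetic Programming and not the evoKGsim implementation: it swaps in an exhaustive search over a few hand-written candidate expressions. The scores, labels, threshold, and all names are hypothetical.

```python
# Hypothetical per-aspect similarity scores for protein pairs, one value per
# Gene Ontology aspect (BP, CC, MF), with a label: 1 = interacting, 0 = not.
PAIRS = [
    ((0.9, 0.2, 0.3), 1),
    ((0.8, 0.7, 0.1), 1),
    ((0.2, 0.9, 0.1), 0),
    ((0.3, 0.3, 0.8), 0),
    ((0.7, 0.1, 0.6), 1),
    ((0.1, 0.4, 0.4), 0),
]

# Candidate combination expressions over the three aspect scores. A genetic
# programming system would evolve such expressions; here they are fixed.
CANDIDATES = {
    "average": lambda bp, cc, mf: (bp + cc + mf) / 3,
    "maximum": lambda bp, cc, mf: max(bp, cc, mf),
    "bp_only": lambda bp, cc, mf: bp,
    "bp_times_max": lambda bp, cc, mf: bp * max(cc, mf),
}


def accuracy(combine, threshold=0.5):
    """Fraction of pairs whose thresholded combined score matches the label."""
    hits = sum(
        (combine(*feats) >= threshold) == bool(label) for feats, label in PAIRS
    )
    return hits / len(PAIRS)


def best_combination():
    """Name of the candidate expression with the highest accuracy."""
    return max(CANDIDATES, key=lambda name: accuracy(CANDIDATES[name]))
```

On this toy data the fixed combinations `average` and `maximum` misclassify some pairs, which loosely mirrors the abstract's point that manually selected combinations are not always the best fit for a given task.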


2021 ◽  
Author(s):  
Rita T. Sousa ◽  
Sara Silva ◽  
Catia Pesquita

Abstract

Semantic similarity between concepts in knowledge graphs is essential for several bioinformatics applications, including the prediction of protein-protein interactions and the discovery of associations between diseases and genes. Although knowledge graphs describe entities in terms of several perspectives (or semantic aspects), state-of-the-art semantic similarity measures are general-purpose. This can represent a challenge since different use cases for the application of semantic similarity may need different similarity perspectives and ultimately depend on expert knowledge for manual fine-tuning.

We present a new approach that uses supervised machine learning to tailor aspect-oriented semantic similarity measures to fit a particular view on biological similarity or relatedness. We implement and evaluate it using different combinations of representative semantic similarity measures and machine learning methods with four biological similarity views: protein-protein interaction, protein function similarity, protein sequence similarity and phenotype-based gene similarity. The results demonstrate that our approach outperforms non-supervised methods, producing semantic similarity models that fit different biological perspectives significantly better than the commonly used manual combinations of semantic aspects. Moreover, although black-box machine learning models produce the best results, approaches such as genetic programming and linear regression still produce improved results while generating models that are interpretable.
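As a rough illustration of why a linear model counts as interpretable in this setting, the sketch below fits one weight per semantic aspect to a target similarity score using plain gradient descent. It is an assumption-laden toy and not the authors' pipeline: the two-aspect data, the synthetic targets, the learning rate, and the function name `fit_linear` are all hypothetical.

```python
# Hypothetical training data: per-aspect semantic similarities for entity
# pairs (two aspects here), used to approximate a target similarity view.
X = [
    (0.9, 0.1),
    (0.5, 0.5),
    (0.2, 0.8),
    (0.7, 0.3),
    (0.1, 0.2),
]
# Synthetic targets generated as 0.8 * first + 0.2 * second, so the fitted
# weights can be checked against known values.
Y = [0.8 * a + 0.2 * b for a, b in X]


def fit_linear(xs, ys, lr=0.1, epochs=10000):
    """Least-squares linear regression fitted by batch gradient descent."""
    n_features = len(xs[0])
    n = len(xs)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for feats, target in zip(xs, ys):
            # Prediction error for this pair under the current model.
            err = sum(wi * xi for wi, xi in zip(w, feats)) + b - target
            for i, xi in enumerate(feats):
                grad_w[i] += 2 * err * xi / n
            grad_b += 2 * err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b
```

Calling `fit_linear(X, Y)` recovers weights close to the generating values, and each weight can be read directly as the contribution of one aspect to the learned similarity, which is the kind of interpretability the abstract contrasts with black-box models.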

