CO-graph: A new graph-based technique for cross-lingual word sense disambiguation

2015 ◽  
Vol 21 (5) ◽  
pp. 743-772 ◽  
Author(s):  
ANDRES DUQUE ◽  
LOURDES ARAUJO ◽  
JUAN MARTINEZ-ROMO

Abstract: In this paper, we present a new method based on co-occurrence graphs for performing Cross-Lingual Word Sense Disambiguation (CLWSD). The proposed approach comprises the automatic generation of bilingual dictionaries and a new technique for constructing a co-occurrence graph, which is used to select the most suitable translations from the dictionary. Different algorithms that combine the dictionary and the co-occurrence graph are then used to select the final translations: techniques based on sub-graphs (communities) containing clusters of words with related meanings, techniques based on distances between nodes representing words, and techniques based on the relative importance of each node in the whole graph. The initial output of the system is enhanced with translation probabilities provided by a statistical bilingual dictionary. The system is evaluated using datasets from two competitions: task 3 of SemEval 2010 and task 10 of SemEval 2013. Results obtained by the different disambiguation techniques are analysed and compared to those obtained by the systems participating in the competitions. Our system offers the best results among unsupervised systems in most of the experiments, and even outperforms supervised systems in some cases.
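
The general idea of selecting a translation with a co-occurrence graph can be illustrated with a minimal sketch. This is not the authors' CO-graph construction or any of their disambiguation algorithms: the toy documents, the function names `build_cooccurrence_graph` and `pick_translation`, and the scoring rule (summed edge weight between a candidate translation and the context words) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Undirected graph: words are nodes, and an edge weight counts
    the documents in which both words appear together."""
    graph = defaultdict(lambda: defaultdict(int))
    for doc in documents:
        for a, b in combinations(sorted(set(doc)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def pick_translation(graph, candidates, context):
    """Return the candidate with the largest total edge weight
    to the context words (a hypothetical scoring rule)."""
    def score(candidate):
        return sum(graph[candidate].get(word, 0) for word in context)
    return max(candidates, key=score)


# Toy target-language corpus (hypothetical data).
docs = [
    ["banco", "dinero", "cuenta"],
    ["banco", "dinero", "credito"],
    ["orilla", "rio", "agua"],
]
graph = build_cooccurrence_graph(docs)

# Candidate translations of an ambiguous source word, with
# target-language context words taken from the sentence.
best = pick_translation(graph, ["banco", "orilla"], ["dinero", "cuenta"])
```

In this toy setting the candidate list stands in for the bilingual dictionary entries, and the context words stand in for translations of the words surrounding the ambiguous term.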

2015 ◽  
Vol 81 ◽  
pp. 65-75 ◽  
Author(s):  
Andres Duque ◽  
Juan Martinez-Romo ◽  
Lourdes Araujo

Author(s):  
Mohamed Biniz ◽  
Rachid El Ayachi ◽  
Mohamed Fakir

Ontology matching is a discipline that covers two things: first, the process of discovering correspondences between two different ontologies, and second, the result of this process, that is, the expression of those correspondences. It is a crucial task for solving the problems of merging and evolving heterogeneous ontologies in Semantic Web applications. The domain poses several challenges, among them the selection of appropriate similarity measures for discovering the correspondences. In this article, we study algorithms that calculate the semantic similarity between two ontologies, using the Adapted Lesk algorithm, the Wu & Palmer algorithm, the Resnik algorithm, the Leacock and Chodorow algorithm, and similarity flooding, with BabelNet as the reference ontology; we implement them and compare them experimentally. Overall, the most effective methods are Wu & Palmer and Adapted Lesk, which is widely used for Word Sense Disambiguation (WSD) in the field of Natural Language Processing (NLP).
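
As an illustration of one of the measures named above, the Wu & Palmer similarity of two concepts is commonly defined as twice the depth of their deepest shared ancestor divided by the sum of their own depths. The sketch below applies that definition to a toy taxonomy; the `parent` map, the helper names, and the convention that the root has depth 1 are assumptions for this example, and nothing here reflects the article's BabelNet-based implementation.

```python
def path_to_root(node, parent):
    """List of nodes from `node` up to the taxonomy root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def wu_palmer(a, b, parent):
    """Wu & Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)),
    where lcs is the deepest common ancestor and the root has depth 1."""
    path_a = path_to_root(a, parent)
    ancestors_b = set(path_to_root(b, parent))
    # The first node on a's upward path that is also an ancestor of b
    # is the deepest common ancestor.
    lcs = next(n for n in path_a if n in ancestors_b)

    def depth(n):
        return len(path_to_root(n, parent))

    return 2 * depth(lcs) / (depth(a) + depth(b))


# Toy taxonomy as a child -> parent map (hypothetical data).
parent = {"animal": "entity", "dog": "animal", "cat": "animal", "car": "entity"}

sim_dog_cat = wu_palmer("dog", "cat", parent)
sim_dog_car = wu_palmer("dog", "car", parent)
```

Here `dog` and `cat` share the ancestor `animal`, so they score higher than `dog` and `car`, whose only shared ancestor is the root.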


2017 ◽  
Vol 41 ◽  
pp. 128-145 ◽  
Author(s):  
Ivan Lopez-Arevalo ◽  
Victor J. Sosa-Sosa ◽  
Franco Rojas-Lopez ◽  
Edgar Tello-Leal

2016 ◽  
Vol 56 ◽  
pp. 61-87 ◽  
Author(s):  
Nasrin Taghizadeh ◽  
Hesham Faili

Wordnets are an effective resource for natural language processing and information retrieval, especially for semantic processing and meaning-related tasks. So far, wordnets have been constructed for many languages. However, the automatic development of wordnets for low-resource languages has not been well studied. In this paper, an Expectation-Maximization algorithm is used to create high-quality, large-scale wordnets for low-resource languages. The proposed method makes use of cross-lingual word sense disambiguation and develops a wordnet using only a bilingual dictionary and a monolingual corpus. The method has been applied to the Persian language, and the resulting wordnet has been evaluated through several experiments. The results show that the induced wordnet has a precision of 90% and a recall of 35%.

