Verb sense disambiguation based on dual distributional similarity

1999 ◽  
Vol 5 (2) ◽  
pp. 157-170
Author(s):  
Jeong-Mi Cho ◽  
Jungyun Seo ◽  
Gil Chang Kim

This paper presents a system for automatic verb sense disambiguation in Korean using a small corpus and a Machine-Readable Dictionary (MRD). For each sense of a polysemous verb in the MRD definitions, the system learns the typical uses listed in the MRD usage examples, using verb-object co-occurrences acquired from the corpus. The paper addresses the problem of data sparseness in two ways. First, by extending word similarity measures from direct co-occurrences to co-occurrences of co-occurring words, it computes similarities between words that never co-occur, via clusters of their co-occurring words. Second, it acquires IS-A relations of nouns from the MRD definitions, which allows the nouns to be roughly clustered. With these methods, two words may be considered similar even if they share no word elements. Experiments show that the method can learn from a very small training corpus, achieving over 86% disambiguation accuracy without any restriction on a word's senses.
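
The "dual" similarity idea can be illustrated with a second-order co-occurrence computation: a word is described not only by its direct co-occurrences but by the co-occurrence profiles of the words it co-occurs with. The sketch below is an illustration of that general idea, not the authors' exact formulation; the matrix layout, normalisation, and function name are assumptions.

```python
import numpy as np

def second_order_similarities(counts):
    """Second-order distributional similarity (illustrative sketch).

    `counts` is a square word-by-word co-occurrence count matrix.
    A word's second-order profile aggregates the first-order profiles
    of the words it co-occurs with, so two verbs can come out similar
    even when their observed objects never overlap directly.
    """
    def normalize(m):
        n = np.linalg.norm(m, axis=1, keepdims=True)
        return m / np.where(n == 0.0, 1.0, n)

    first = normalize(counts.astype(float))  # direct co-occurrence profiles
    second = normalize(counts @ first)       # mix in neighbours' profiles
    return second @ second.T                 # pairwise cosine similarities
```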

2009 ◽  
Vol 35 (3) ◽  
pp. 435-461 ◽  
Author(s):  
Maayan Zhitomirsky-Geffet ◽  
Ido Dagan

This article presents a novel bootstrapping approach for improving the quality of feature vector weighting in distributional word similarity. The method was motivated by attempts to utilize distributional similarity for identifying the concrete semantic relationship of lexical entailment. Our analysis revealed that a major reason for the rather loose semantic similarity obtained by distributional similarity methods is the insufficient quality of the word feature vectors, caused by deficient feature weighting. This observation led to the definition of a bootstrapping scheme which yields improved feature weights, and hence higher-quality feature vectors. The underlying idea of our approach is that features which are common to similar words are also the most characteristic of their meanings, and thus should be promoted. This idea is realized via a bootstrapping step applied to an initial standard approximation of the similarity space. The superior performance of the bootstrapping method was assessed in two different experiments, one based on direct human gold-standard annotation and the other based on an automatically created disambiguation dataset. These results are further supported by a novel quantitative measurement of the quality of feature weighting functions. Improved feature weighting also allows massive feature reduction, which indicates that the most characteristic features of a word are indeed concentrated at the top ranks of its vector. Finally, experiments with three prominent similarity measures and two feature weighting functions showed that the bootstrapping scheme is robust and independent of the original functions over which it is applied.
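
As a rough illustration of the bootstrapping idea (features shared with a word's distributionally similar neighbours get promoted), consider the sketch below. The data layout, the `similar` callback, and the single-pass design are assumptions made for illustration, not the authors' implementation.

```python
def bootstrap_feature_weights(vectors, similar, top_n=100):
    """One bootstrapping pass over feature weights (rough sketch).

    `vectors` maps each word to a dict {feature: weight} and
    `similar(word)` returns its nearest neighbours under the initial
    similarity space. A feature is scored by how many of the word's
    neighbours also carry it, following the intuition that features
    shared with similar words are the most characteristic ones.
    """
    new_vectors = {}
    for word, feats in vectors.items():
        neighbours = similar(word)
        support = {
            f: sum(1 for n in neighbours if f in vectors.get(n, {}))
            for f in feats
        }
        # keep only the best-supported features (massive feature reduction)
        top = sorted(support, key=support.get, reverse=True)[:top_n]
        new_vectors[word] = {f: support[f] for f in top}
    return new_vectors
```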


2017 ◽  
Vol 43 (3) ◽  
pp. 593-617 ◽  
Author(s):  
Sascha Rothe ◽  
Hinrich Schütze

We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings that incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The obtained embeddings live in the same vector space as the input word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet, GermaNet, and Freebase as semantic resources. AutoExtend achieves state-of-the-art performance on Word-in-Context Similarity and Word Sense Disambiguation tasks.
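
AutoExtend's core modelling constraint, that a word's embedding decomposes into the sum of the embeddings of its senses, can be sketched as a least-squares fit. The toy version below uses an ordinary least-squares solve in place of the paper's encoder/decoder and sparse tensor formalization; the matrix names and solver choice are assumptions.

```python
import numpy as np

def synset_embeddings(word_vecs, membership):
    """Toy version of the AutoExtend constraint (not the paper's solver).

    Given word vectors W (n x d) and a binary word-synset membership
    matrix B (n x m), fit synset vectors S (m x d) minimising
    ||B @ S - W||, so each word is approximately the sum of the
    embeddings of the synsets it belongs to. The resulting synset
    vectors live in the same space as the input word vectors.
    """
    S, *_ = np.linalg.lstsq(membership, word_vecs, rcond=None)
    return S
```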


Author(s):  
Mohamed Biniz ◽  
Rachid El Ayachi ◽  
Mohamed Fakir

Ontology matching is a discipline that denotes two things: first, the process of discovering correspondences between two different ontologies, and second, the result of that process, that is, the expression of the correspondences. This discipline is crucial for merging and evolving heterogeneous ontologies in Semantic Web applications. The domain poses several challenges, among them the selection of appropriate similarity measures for discovering correspondences. In this article, we study algorithms that compute semantic similarity between two ontologies, using the Adapted Lesk algorithm, the Wu & Palmer algorithm, the Resnik algorithm, the Leacock and Chodorow algorithm, and similarity flooding, with BabelNet as the reference ontology; we implement them and compare them experimentally. Overall, the most effective methods are Wu & Palmer and Adapted Lesk, which are widely used for Word Sense Disambiguation (WSD) in Natural Language Processing (NLP).
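
For concreteness, the Wu & Palmer measure scores two concepts by the depth of their least common subsumer (LCS): sim = 2·depth(LCS) / (depth(s1) + depth(s2)). Below is a minimal word-level illustration over WordNet via NLTK; the article instead applies such measures to ontology concepts with BabelNet, and the helper name here is ours.

```python
from nltk.corpus import wordnet as wn  # run nltk.download("wordnet") once

def wup(word1, word2):
    """Best Wu & Palmer score over all synset pairs of two words."""
    pairs = ((s1, s2) for s1 in wn.synsets(word1)
                      for s2 in wn.synsets(word2))
    scores = (s1.wup_similarity(s2) for s1, s2 in pairs)
    # wup_similarity returns None for incomparable (e.g. cross-POS) pairs
    return max((s for s in scores if s is not None), default=0.0)

print(wup("car", "automobile"))  # 1.0: the words share a synset
```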


Author(s):  
Niloofer Shanavas ◽  
Hui Wang ◽  
Zhiwei Lin ◽  
Glenn Hawe

Automatic text classification using machine learning is significantly affected by the text representation model. Structural information in text is necessary for natural language understanding, yet it is usually ignored in vector-based representations. In this paper, we present a graph kernel-based text classification framework which utilises the structural information in text effectively through the weighting and enrichment of a graph-based representation. We introduce weighted co-occurrence graphs to represent text documents, which weight the terms and their dependencies based on their relevance to text classification. We propose a novel method to automatically enrich the weighted graphs using semantic knowledge in the form of a word similarity matrix. The similarity between enriched graphs, knowledge-driven graph similarity, is calculated using a graph kernel. The semantic knowledge in the enriched graphs ensures that the graph kernel goes beyond exact matching of terms and patterns to compute the semantic similarity of documents. In the experiments on sentiment classification and topic classification tasks, our knowledge-driven similarity measure significantly outperforms the baseline text similarity measures on five benchmark text classification datasets.
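
A minimal version of the co-occurrence graph, before the task-based weighting and semantic enrichment the paper adds on top, could look like the sketch below. The sliding-window size and the networkx representation are assumptions made for illustration.

```python
import networkx as nx

def cooccurrence_graph(tokens, window=3):
    """Weighted co-occurrence graph for one document (simplified sketch).

    Nodes are terms; an edge connects two terms appearing within a
    sliding window, with the edge weight counting their co-occurrences.
    The paper additionally weights terms and dependencies by relevance
    to the classification task and enriches the graph from a word
    similarity matrix, both of which this illustration omits.
    """
    g = nx.Graph()
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            u, v = tokens[i], tokens[j]
            if u == v:
                continue
            w = g.get_edge_data(u, v, {"weight": 0})["weight"]
            g.add_edge(u, v, weight=w + 1)
    return g
```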


2017 ◽  
Vol 43 (1) ◽  
pp. 31-70 ◽  
Author(s):  
Rocco Tripodi ◽  
Marcello Pelillo

This article presents a new model for word sense disambiguation formulated in terms of evolutionary game theory, where each word to be disambiguated is represented as a node on a graph whose edges represent word relations, and senses are represented as classes. The words simultaneously update their class membership preferences according to the senses that neighboring words are likely to choose. We use distributional information to weigh the influence that each word has on the decisions of the others, and semantic similarity information to measure the strength of compatibility among the choices. With this information we can formulate the word sense disambiguation problem as a constraint satisfaction problem and solve it using tools derived from game theory, maintaining textual coherence. The model is based on two ideas: similar words should be assigned to similar classes, and the meaning of a word depends not on all the words in a text but just on some of them. The article provides an in-depth motivation for modeling the word sense disambiguation problem in terms of game theory, illustrated by an example. It concludes with an extensive analysis of the combination of similarity measures to use in the framework and a comparison with state-of-the-art systems. The results show that our model outperforms state-of-the-art algorithms and can be applied to different tasks and in different scenarios.
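
The game-dynamic update behind this kind of model is replicator dynamics: strategies (here, candidate senses) that earn above-average payoff gain probability mass. The sketch below shows the bare update for one player's mixed strategy; it is a generic textbook form, not the authors' full system, which couples one such player per word through distributional and sense-similarity payoffs.

```python
import numpy as np

def replicator_dynamics(payoff, x, iters=100):
    """Discrete-time replicator dynamics (generic sketch).

    `payoff` is a square payoff matrix over pure strategies, assumed
    positive, and `x` a probability vector over them. Each step applies
    x_i <- x_i * (A x)_i / (x . A x), promoting strategies whose payoff
    against the current mix exceeds the average.
    """
    for _ in range(iters):
        fitness = payoff @ x          # payoff of each pure strategy
        x = x * fitness / (x @ fitness)  # renormalised proportional update
    return x
```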


1989 ◽  
Vol 23 (SI) ◽  
pp. 127-136 ◽  
Author(s):  
R. Krovetz ◽  
W. B. Croft

2005 ◽  
Vol 31 (4) ◽  
pp. 439-475 ◽  
Author(s):  
Julie Weeds ◽  
David Weir

Techniques that exploit knowledge of distributional similarity between words have been proposed in many areas of Natural Language Processing. For example, in language modeling, the sparse data problem can be alleviated by estimating the probabilities of unseen co-occurrences of events from the probabilities of seen co-occurrences of similar events. In other applications, distributional similarity is taken to be an approximation to semantic similarity. However, due to the wide range of potential applications and the lack of a strict definition of the concept of distributional similarity, many methods of calculating distributional similarity have been proposed or adopted. In this work, a flexible, parameterized framework for calculating distributional similarity is proposed. Within this framework, the problem of finding distributionally similar words is cast as one of co-occurrence retrieval (CR) for which precision and recall can be measured by analogy with the way they are measured in document retrieval. As will be shown, a number of popular existing measures of distributional similarity are simulated with parameter settings within the CR framework. In this article, the CR framework is then used to systematically investigate three fundamental questions concerning distributional similarity. First, is the relationship of lexical similarity necessarily symmetric, or are there advantages to be gained from considering it as an asymmetric relationship? Second, are some co-occurrences inherently more salient than others in the calculation of distributional similarity? Third, is it necessary to consider the difference in the extent to which each word occurs in each co-occurrence type? Two application-based tasks are used for evaluation: automatic thesaurus generation and pseudo-disambiguation. It is possible to achieve significantly better results on both these tasks by varying the parameters within the CR framework rather than using other existing distributional similarity measures; it will also be shown that any single unparameterized measure is unlikely to be able to do better on both tasks. This is due to an inherent asymmetry in lexical substitutability and therefore also in lexical distributional similarity.
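
The retrieval analogy at the heart of the CR framework can be made concrete with a small sketch: treat the co-occurrence features of the target word as "retrieved" and those of the candidate neighbour as "relevant". The dict layout and min-based overlap below are illustrative choices; the actual framework parameterizes these definitions, which is how it simulates existing similarity measures.

```python
def cr_precision_recall(u_feats, v_feats):
    """Co-occurrence retrieval (CR) scores (bare-bones sketch).

    u_feats and v_feats map co-occurrence types to positive weights.
    Precision measures how much of u's feature weight is shared with v,
    recall how much of v's is shared with u; the asymmetry between the
    two is what lets the framework model asymmetric substitutability.
    """
    shared = set(u_feats) & set(v_feats)
    overlap = sum(min(u_feats[f], v_feats[f]) for f in shared)
    precision = overlap / sum(u_feats.values()) if u_feats else 0.0
    recall = overlap / sum(v_feats.values()) if v_feats else 0.0
    return precision, recall
```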

