Extracting Concepts’ Relations and Users’ Preferences for Personalizing Query Disambiguation

Author(s):  
Yan Chen ◽  
Yan-Qing Zhang

In most Web search applications, queries are commonly ambiguous because words usually have several meanings. Traditional Word Sense Disambiguation (WSD) methods use statistical models or ontology-based knowledge models to find the most appropriate sense for an ambiguous word. Since queries are usually short, their contexts may not always provide enough information for disambiguation, so more than one interpretation may be found for an ambiguous query. In this paper, we propose a cluster-based WSD method that finds all appropriate interpretations for the query. Because some senses of an ambiguous word often have very close semantic relations, we group those similar senses together so that they explain the ambiguous word in a single interpretation. If the cluster-based WSD method generates several contradictory interpretations for an ambiguous query, we extract users’ preferences from clickthrough data and determine the concepts or concept clusters that match users’ interests for explaining the ambiguous query.
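The sense-grouping step described above can be illustrated with a minimal sketch. It assumes pairwise sense similarities are already available and merges senses whose similarity reaches a threshold; the function name `cluster_senses`, the threshold value, and the toy similarity scores are hypothetical and are not taken from the paper, which does not state its clustering procedure here.

```python
def cluster_senses(senses, similarity, threshold=0.5):
    """Group senses into clusters by linking pairs with similarity >= threshold.

    `similarity` maps (sense_a, sense_b) pairs to a score; all values here are
    illustrative assumptions, not data from the paper.
    """
    parent = {s: s for s in senses}

    def find(s):
        # Follow parent links to the cluster representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for s in senses:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(c) for c in clusters.values())


# Toy example: two closely related senses of "bass" and one unrelated sense.
senses = ["bass_fish_sea", "bass_fish_freshwater", "bass_instrument"]
similarity = {
    ("bass_fish_sea", "bass_fish_freshwater"): 0.9,
    ("bass_fish_sea", "bass_instrument"): 0.1,
    ("bass_fish_freshwater", "bass_instrument"): 0.1,
}
clusters = cluster_senses(senses, similarity)
```

In this toy run the two fish senses end up in one cluster and the instrument sense in another, which corresponds to two candidate interpretations of the query.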

2002 ◽  
Vol 8 (4) ◽  
pp. 359-373 ◽  
Author(s):  
BERNARDO MAGNINI ◽  
CARLO STRAPPARAVA ◽  
GIOVANNI PEZZULO ◽  
ALFIO GLIOZZO

This paper explores the role of domain information in word sense disambiguation. The underlying hypothesis is that domain labels, such as MEDICINE, ARCHITECTURE and SPORT, provide a useful way to establish semantic relations among word senses, which can be profitably used during the disambiguation process. Results obtained at the SENSEVAL-2 initiative confirm that for a significant subset of words domain information can be used to disambiguate with a very high level of precision.


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Xin Wang ◽  
Wanli Zuo ◽  
Ying Wang

Word sense disambiguation (WSD) is a fundamental problem in natural language processing, the objective of which is to identify the most appropriate sense for an ambiguous word in a given context. Although WSD has been researched for years, the performance of existing algorithms in terms of accuracy and recall is still unsatisfactory. In this paper, we propose a novel approach to word sense disambiguation based on topical and semantic association. For a given document, supposing that its topic category is accurately discriminated, the correct sense of the ambiguous term is identified through the corresponding topic and semantic contexts. We first extract topic-discriminative terms from the document and construct a topical graph based on topic span intervals to implement topic identification. We then exploit syntactic features, topic span features, and semantic features to disambiguate nouns and verbs in the context of the ambiguous word. Finally, we conduct experiments on the standard data set SemCor to evaluate the performance of the proposed method, and the results indicate that our approach achieves relatively better performance than existing approaches.


Every year tens of millions of people suffer from depression, and few of them get proper treatment in time. It is therefore crucial to detect human stress and relaxation automatically and promptly via social media, so that stress can be managed before it becomes a severe problem. A huge number of informal messages are posted every day on social networking sites, blogs, and discussion forums. This paper presents a method to detect expressions of stress and relaxation in a Twitter dataset, i.e., it applies sentiment analysis to find emotions or feelings about daily life. Sentiment analysis is the automatic extraction of sentiment-related information from text. We use the TensiStrength framework for sentiment strength detection on social networking sites to extract sentiment strength from informal English text. TensiStrength is a system that detects the strength of stress and relaxation expressed in social media text messages. It uses a lexical approach and a set of rules to detect direct and indirect expressions of stress or relaxation, and it classifies both positive and negative emotions on a strength scale from -5 to +5. Sentences from the conversation are categorised as stress or relaxation. TensiStrength is robust and can be applied to a wide variety of social web contexts, although its effectiveness depends on the nature of the tweets. Human beings have an inborn capability to differentiate the multiple senses of an ambiguous word in a particular context, but a machine executes only according to its instructions. Word sense disambiguation is a major drawback of machine translation, because a single word can have multiple meanings or "senses." In the pre-processing stage, part-of-speech disambiguation is analysed, and the proposed method overcomes the WSD drawback by using unigrams, bigrams, and trigrams to give better results on ambiguous words. Here, SVM with n-grams gives the better result, with a precision of 65% and a recall of 67%. The main objective of this technique, however, is to find the explicit and implicit amounts of stress and relaxation expressed in tweets.
Keywords: Stress Detection, Data Mining, TensiStrength, word sense disambiguation.
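As an illustration of the unigram, bigram, and trigram features mentioned in the abstract above, the following minimal sketch extracts word n-grams from a short message. The helper names `ngrams` and `ngram_features` and the whitespace tokenisation are assumptions made for the example; the abstract does not describe the authors' actual pre-processing, and the SVM training step is not shown.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=3):
    """Collect unigram through max_n-gram strings from lowercased, whitespace-split text."""
    tokens = text.lower().split()
    features = []
    for n in range(1, max_n + 1):
        features.extend(" ".join(gram) for gram in ngrams(tokens, n))
    return features


features = ngram_features("So stressed today")
```

Feature lists of this kind would typically be converted to count vectors before being passed to a classifier such as an SVM.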


Author(s):  
Roberto Navigli

This chapter is about ontologies: that is, knowledge models of a domain of interest. We introduce ontologies, their building blocks and sections, view them from the perspective of several fields of knowledge (computer science, philosophy, software engineering, etc.), and present existing ontologies and the different tasks of ontology building, learning, matching, mapping, and merging. We also review interfaces for building ontologies and the knowledge representation languages used to implement them. Finally, we discuss the different ways of evaluating an ontology and the applications in which it can be used, including word sense disambiguation, reasoning, question answering, semantic information retrieval, and machine translation.


2021 ◽  
Vol 11 (6) ◽  
pp. 2488
Author(s):  
Jinfeng Cheng ◽  
Weiqin Tong ◽  
Weian Yan

Word sense disambiguation (WSD) is one of the core problems in natural language processing (NLP); its goal is to map an ambiguous word to its correct meaning in a specific context. Recent studies have shown lively interest in incorporating sense definitions (glosses) into neural networks, which has contributed greatly to improving WSD performance. However, disambiguating polysemous words with rare senses is still hard. In this paper, while taking glosses into consideration, we further improve the performance of the WSD system from the perspective of semantic representation. We encode the context and the sense glosses of the target polysemous word independently, using encoders with the same structure. To obtain a better representation in each encoder, we leverage a capsule network to capture the different important information contained in multi-head attention. We finally choose the gloss representation closest to the context representation of the target word as its correct sense. We run experiments on the English all-words WSD task. Experimental results show that our method achieves good performance and is especially encouraging for disambiguating words with rare senses.
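The final selection step described above, choosing the gloss representation closest to the context representation, can be sketched as follows. This is only a toy illustration: cosine similarity is assumed as the closeness measure (the abstract does not name one), the vectors are hand-made stand-ins for encoder outputs, and `cosine` and `pick_sense` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pick_sense(context_vec, gloss_vecs):
    """Return the sense whose gloss vector is most similar to the context vector."""
    return max(gloss_vecs, key=lambda sense: cosine(context_vec, gloss_vecs[sense]))


# Toy vectors standing in for encoder outputs (assumed values).
context = [0.9, 0.1, 0.0]
glosses = {
    "bank_river": [0.1, 0.9, 0.2],
    "bank_finance": [0.8, 0.2, 0.1],
}
chosen = pick_sense(context, glosses)
```

With these toy values the financial sense is selected, because its gloss vector points in nearly the same direction as the context vector.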


Author(s):  
Saeed Rahmani ◽  
Seyed Mostafa Fakhrahmad ◽  
Mohammad Hadi Sadreddini

Word sense disambiguation (WSD) is the task of selecting the correct sense for an ambiguous word in its context. Since WSD is one of the most challenging tasks in various text processing systems, improving its accuracy can be very beneficial. In this article, we propose a new unsupervised method based on a co-occurrence graph created from a monolingual corpus, without any dependency on the structure and properties of the language itself. In the proposed method, the context of an ambiguous word is represented as a sub-graph extracted from a large word co-occurrence graph built from a corpus. Most of the words are connected in this graph. To clarify the exact sense of an ambiguous word, its senses and relations are added to the context graph, and various similarity functions are employed based on the senses and the context graph. In the disambiguation process, we select the senses with the highest similarity to the context graph. Unlike other WSD methods, the proposed method does not use any language-dependent resources (e.g. WordNet); it uses only a monolingual corpus. Therefore, the proposed method can be employed for other languages. Moreover, increasing the size of the corpus can enhance the accuracy of WSD. Experimental results on English and Persian datasets show that the proposed method is competitive with existing supervised and unsupervised WSD approaches.
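A minimal sketch of the two graph structures named above, a corpus-wide word co-occurrence graph and a context sub-graph extracted from it, is given below. The sliding-window definition of co-occurrence, the window size, and the function names `build_cooccurrence_graph` and `context_subgraph` are assumptions for illustration; the abstract does not specify how edges are defined or weighted, and the sense-similarity functions are not shown.

```python
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=2):
    """Build an undirected weighted graph: edge weight = number of times two
    distinct words appear within `window` positions of each other."""
    graph = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                other = tokens[j]
                if other != word:
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph


def context_subgraph(graph, context_words):
    """Keep only the nodes in `context_words` and the edges among them."""
    keep = set(context_words)
    return {
        w: {n: c for n, c in graph[w].items() if n in keep}
        for w in keep
        if w in graph
    }


# Toy tokenised corpus (assumed data).
corpus = [
    ["river", "bank", "water"],
    ["bank", "loan", "money"],
    ["river", "water", "flow"],
]
graph = build_cooccurrence_graph(corpus)
sub = context_subgraph(graph, ["river", "water", "bank"])
```

Here `sub` contains only the edges among the three context words, which is the kind of structure a sense-to-context similarity function would then be applied to.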


2011 ◽  
Vol 135-136 ◽  
pp. 160-166 ◽  
Author(s):  
Xin Hua Fan ◽  
Bing Jun Zhang ◽  
Dong Zhou

This paper presents a word sense disambiguation method that reconstructs the context using the correlation between words. First, we compute the relevance between words through statistical quantities (co-occurrence frequency, average distance, and information entropy) drawn from the corpus. Second, we treat the context words that have a larger correlation value with the ambiguous word than the other words as the important words, use these words to reconstruct the context, and then use the reconstructed context as the new context of the ambiguous word. Finally, we use the sememe co-occurrence data method [10] for word sense disambiguation. The experimental results demonstrate the feasibility of this method.


Author(s):  
A. S. Bolshina ◽  
N. V. Loukachevitch

The best approaches in Word Sense Disambiguation (WSD) are supervised and rely on large amounts of hand-labelled data, which are not always available and are costly to create. For the Russian language, there is no sense-tagged resource large enough to train supervised word sense disambiguation algorithms. In our work we describe an approach for creating an automatically labelled collection based on monosemous relatives (related unambiguous entries). The main contribution of our work is that we extracted monosemous relatives that can be located at relatively long distances from a target ambiguous word and ranked them according to a similarity measure to the target sense. The selected candidates are then used to extract training samples from a news corpus. We evaluated word sense disambiguation models based on nearest neighbor classification over BERT and ELMo embeddings. Our work relies on the Russian wordnet RuWordNet.
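Nearest neighbor classification over embeddings, as mentioned above, can be sketched in a few lines. The example assumes Euclidean distance and majority voting among the k closest labelled samples; the abstract does not state the distance measure or the value of k, the toy two-dimensional vectors merely stand in for BERT or ELMo embeddings, and `knn_sense` is a hypothetical helper name.

```python
import math
from collections import Counter


def knn_sense(query_vec, labelled, k=3):
    """Return the majority sense label among the k labelled vectors nearest to query_vec.

    `labelled` is a list of (vector, sense_label) pairs.
    """
    nearest = sorted(labelled, key=lambda item: math.dist(query_vec, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labelled samples (assumed values, as if collected automatically).
training = [
    ([0.0, 1.0], "sense_A"),
    ([0.1, 0.9], "sense_A"),
    ([1.0, 0.0], "sense_B"),
    ([0.9, 0.2], "sense_B"),
]
predicted = knn_sense([0.2, 0.8], training, k=3)
```

In this toy case two of the three nearest samples carry the label `sense_A`, so that label is returned.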


Nowadays digital documents play a major role in all areas of the web, as information is increasingly digitised. Search engines use queries to retrieve information. The query plays a major role in an information retrieval system, since it determines which relevant and non-relevant documents are retrieved. Query expansion techniques improve the performance of an information retrieval system. Our proposed query expansion technique is based on Word Sense Disambiguation (WSD), which finds the correct sense of an ambiguous word in the regional Telugu language. In query expansion, if the added query term is an ambiguous word, the accuracy of the retrieved relevant documents will be very low. To avoid this, the proposed method uses WSD, which is related to Natural Language Processing (NLP) and Artificial Intelligence (AI). WSD improves the accuracy of the information retrieval system.

