The Algorithm of Sense Disambiguation Based on Bayesian Model

2013 ◽  
Vol 427-429 ◽  
pp. 1879-1882
Author(s):  
Chun Xiang Zhang ◽  
Xue Yao Gao ◽  
Zhi Mao Lu

Sense disambiguation is an important problem in pattern recognition. In this paper, a new sense disambiguation algorithm is proposed, in which the part-of-speech tags of the words immediately to the left and right of the ambiguous word are extracted as discriminative features. A Bayesian model, built on these discriminative features, is selected as the sense disambiguation classifier. The architecture of sense classification is given. The new algorithm is trained on a sense-annotated corpus and is then used to determine the sense category of the ambiguous word. Experimental results show that the disambiguation accuracy reaches 60%.
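The idea in this abstract can be illustrated with a minimal sketch of a naive Bayes classifier over two features, the part-of-speech tags of the left and right neighbours. The abstract does not give the exact model, so the conditional-independence assumption, the add-one smoothing, the function names and the toy training data below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(examples):
    """Count senses and per-sense POS tags from ((left_pos, right_pos), sense) pairs."""
    sense_counts = Counter()
    feature_counts = defaultdict(Counter)  # (position, sense) -> tag counts
    tags = set()
    for (left_pos, right_pos), sense in examples:
        sense_counts[sense] += 1
        feature_counts[("left", sense)][left_pos] += 1
        feature_counts[("right", sense)][right_pos] += 1
        tags.update((left_pos, right_pos))
    return sense_counts, feature_counts, tags


def classify(model, left_pos, right_pos):
    """Return the sense that maximises log P(sense) + sum of log P(tag | sense)."""
    sense_counts, feature_counts, tags = model
    total = sum(sense_counts.values())
    vocab = len(tags)
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        for position, tag in (("left", left_pos), ("right", right_pos)):
            tag_count = feature_counts[(position, sense)][tag]
            # Add-one smoothing (an assumption) keeps unseen tags from zeroing the product.
            score += math.log((tag_count + 1) / (count + vocab))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical sense-annotated examples for the ambiguous word "bank".
TRAINING = [
    (("DET", "PREP"), "river_bank"),
    (("DET", "PREP"), "river_bank"),
    (("ADJ", "VERB"), "financial_bank"),
    (("NOUN", "VERB"), "financial_bank"),
    (("ADJ", "NOUN"), "financial_bank"),
]
MODEL = train_naive_bayes(TRAINING)
```

Calling `classify(MODEL, "DET", "PREP")` selects the sense whose training contexts most often had a determiner on the left and a preposition on the right.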

Author(s):  
Siti Nurmaini ◽  
Ahmad Zarkasi ◽  
Deris Stiawan ◽  
Bhakti Yudho Suprapto ◽  
Sri Desy Siswanti ◽  
...  

In terms of movement, mobile robots are equipped with various navigation techniques. One of the navigation techniques used is facial pattern recognition. However, mobile robot hardware usually uses embedded platforms, which have limited resources. In this study, a new navigation technique is proposed by combining a face detection system with a RAM-based artificial neural network. In this technique, the face detection area is divided into five frame areas, namely top, bottom, right, left, and neutral. The value of each detection area is grouped into the RAM discriminator. Training and testing are then carried out to determine which detection value is closest to the true value; this value is compared with the output value in the output pattern to obtain the winning discriminator, which is used as the navigation value. In tests on 63 face samples, the accuracy rate is 95% for the upper and lower frame areas and 93% for the right and left frame areas. In tests of the RAM-based neural network pattern algorithm, the memory-capacity efficiency of the RAM discriminator is 50%, assuming a 16-bit input pattern reduced to 8 bits. The execution time from the input vector to the winning class is under milliseconds (ms).
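How a winning discriminator is chosen in a RAM-based (weightless) neural network can be sketched as follows. This is a generic WiSARD-style illustration, not the authors' embedded implementation: the contiguous 4-bit tuples (classic designs usually map input bits to tuples at random), the class names and the 16-bit toy patterns are assumptions.

```python
class RamDiscriminator:
    """One discriminator: a bank of RAM nodes, each addressed by a slice of the input bits."""

    def __init__(self, input_bits=16, tuple_size=4):
        if input_bits % tuple_size != 0:
            raise ValueError("input_bits must be a multiple of tuple_size")
        self.input_bits = input_bits
        self.tuple_size = tuple_size
        # Each RAM node is modelled as the set of addresses that store a 1.
        self.rams = [set() for _ in range(input_bits // tuple_size)]

    def _addresses(self, pattern):
        """Yield (ram_index, address) pairs for a bit string such as '1010...'."""
        if len(pattern) != self.input_bits:
            raise ValueError("pattern has the wrong number of bits")
        for index in range(len(self.rams)):
            start = index * self.tuple_size
            yield index, pattern[start:start + self.tuple_size]

    def train(self, pattern):
        """Write a 1 at the address selected in every RAM node."""
        for index, address in self._addresses(pattern):
            self.rams[index].add(address)

    def response(self, pattern):
        """Count the RAM nodes that recognise their slice of the pattern."""
        return sum(address in self.rams[index] for index, address in self._addresses(pattern))


def winning_class(discriminators, pattern):
    """Return the name of the discriminator with the highest response."""
    return max(discriminators, key=lambda name: discriminators[name].response(pattern))


# Hypothetical 16-bit patterns for two of the five frame areas.
discriminators = {name: RamDiscriminator() for name in ("top", "bottom")}
discriminators["top"].train("1111111100000000")
discriminators["bottom"].train("0000000011111111")
```

A noisy input such as `"1111110100000000"` still matches three of the four RAM nodes of the "top" discriminator, so `winning_class` returns `"top"`.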


Author(s):  
Saeed Rahmani ◽  
Seyed Mostafa Fakhrahmad ◽  
Mohammad Hadi Sadreddini

Word sense disambiguation (WSD) is the task of selecting the correct sense for an ambiguous word in its context. Since WSD is one of the most challenging tasks in various text processing systems, improving its accuracy can be very beneficial. In this article, we propose a new unsupervised method based on a co-occurrence graph created from a monolingual corpus, without any dependency on the structure and properties of the language itself. In the proposed method, the context of an ambiguous word is represented as a sub-graph extracted from a large word co-occurrence graph built from a corpus. Most of the words are connected in this graph. To clarify the exact sense of an ambiguous word, its senses and relations are added to the context graph, and various similarity functions are employed based on the senses and context graph. In the disambiguation process, we select the senses with the highest similarity to the context graph. Unlike other WSD methods, the proposed method does not use any language-dependent resources (e.g. WordNet); it uses only a monolingual corpus. Therefore, the proposed method can be employed for other languages. Moreover, by increasing the size of the corpus, it is possible to enhance the accuracy of WSD. Experimental results on English and Persian datasets show that the proposed method is competitive with existing supervised and unsupervised WSD approaches.


2011 ◽  
Vol 135-136 ◽  
pp. 160-166
Author(s):  
Xin Hua Fan ◽  
Bing Jun Zhang ◽  
Dong Zhou

This paper presents a word sense disambiguation method that reconstructs the context using the correlation between words. First, we compute the relevance between words through statistical quantities (co-occurrence frequency, average distance and information entropy) from the corpus. Second, we treat the context words that have a larger correlation value with the ambiguous word than the other words as the important words, use them to reconstruct the context, and then use the reconstructed context as the new context of the ambiguous word. Finally, we use the method of sememe co-occurrence data [10] for word sense disambiguation. The experimental results demonstrate the feasibility of this method.


2022 ◽  
Vol 12 (2) ◽  
pp. 732
Author(s):  
Abderrahim Lakehal ◽  
Adel Alti ◽  
Philippe Roose

This paper aims at ensuring efficient recommendation. It proposes a new context-aware, semantic-based probabilistic situation injection and adaptation approach using an ontology and a Bayesian classifier. The idea is to predict the relevant situations for recommending the right services. Situations are correlated with the user’s context; they can therefore be considered when designing a recommendation approach, to enhance relevancy while reducing the execution time. In the proposed solution, four probability-based context rule situation items (user’s location and time, user’s role, their preferences and experiences) are chosen as inputs to predict the user’s situations. Subsequently, a weighted linear combination is applied to calculate the similarity of rule items. The higher scores among the selected items are used to identify the relevant user’s situations. Three context parameters (CPU speed, sensor availability and RAM size) of the current devices are used to ensure adaptive service recommendation. Experimental results show that the proposed approach enhances the accuracy rate with a high number of situation rules. A comparison with existing recommendation approaches shows that the proposed approach is more efficient and decreases the execution time.


Author(s):  
Farza Nurifan ◽  
Riyanarto Sarno ◽  
Cahyaningtyas Sekar Wahyuni

Word Sense Disambiguation (WSD) is one of the most difficult problems in artificial intelligence, often described as AI-hard or AI-complete. Many problems can be addressed using word sense disambiguation approaches, such as sentiment analysis, machine translation, search engine relevance, coherence, anaphora resolution, and inference. In this paper, we address the WSD problem using two small corpora. We propose the use of Word2vec and Wikipedia to develop the corpora. After developing the corpora, we measure the similarity between the sentence and the corpora using cosine similarity to determine the meaning of the ambiguous word. Lastly, to improve accuracy, we use Lesk algorithms and Wu Palmer similarity to handle cases where no word from the sentence appears in the corpora (we call this semantic similarity). The results of our research show an 86.94% accuracy rate, and the semantic similarity improves the accuracy rate by 12.96% in determining the meaning of ambiguous words.
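The cosine-similarity step can be sketched as below. The abstract does not say how sentences and corpora are vectorised, so the bag-of-words count vectors, the function names and the two toy sense corpora are assumptions for illustration only.

```python
from collections import Counter
import math


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty token list shares nothing with anything
    return dot / (norm_a * norm_b)


def pick_sense(sentence_tokens, sense_corpora):
    """Return the sense whose corpus is most similar to the sentence."""
    return max(
        sense_corpora,
        key=lambda sense: cosine_similarity(sentence_tokens, sense_corpora[sense]),
    )


# Hypothetical miniature corpora, one per sense of "bank".
SENSE_CORPORA = {
    "financial": ["money", "loan", "account", "deposit", "money"],
    "river": ["water", "shore", "river", "fishing"],
}
```

With these toy corpora, `pick_sense(["deposit", "money", "bank"], SENSE_CORPORA)` favours the "financial" sense. When a sentence shares no word with either corpus, every similarity is 0, which is the situation the abstract handles with Lesk and Wu Palmer similarity.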


2014 ◽  
Vol 981 ◽  
pp. 157-160
Author(s):  
Chun Xiang Zhang ◽  
Li Li Guo ◽  
Xue Yao Gao

Word sense disambiguation is widely applied to information retrieval, semantic comprehension and automatic summarization. It is an important research problem in natural language processing. In this paper, the center window is determined from the target ambiguous word. The words in the center window are extracted as discriminative features. At the same time, a new method of word sense disambiguation is proposed and the disambiguation classifier is given. The classifier is optimized and tested on the SemEval-2007 Task #5 corpus. Experimental results show that the disambiguation accuracy reaches 64.2%.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Chun-Xiang Zhang ◽  
Rui Liu ◽  
Xue-Yao Gao ◽  
Bo Yu

Word sense disambiguation (WSD) is an important research topic in natural language processing, which is widely applied to text classification, machine translation, and information retrieval. In order to improve disambiguation accuracy, this paper proposes a WSD method based on the graph convolutional network (GCN). Word, part of speech, and semantic category are extracted from the contexts of the ambiguous word as discriminative features. The discriminative features and the sentence containing the ambiguous word are used as nodes to construct the WSD graph. The Word2Vec and Doc2Vec tools, pointwise mutual information (PMI), and TF-IDF are applied to compute the embeddings of nodes and the edge weights. GCN is used to fuse the features of a node and its neighbors, and the softmax function is applied to determine the semantic category of the ambiguous word. The training corpus of SemEval-2007: Task #5 is adopted to optimize the proposed WSD classifier. The test corpus of SemEval-2007: Task #5 is used to test the performance of the WSD classifier. Experimental results show that the average accuracy of the proposed method is improved.
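Of the quantities named in this abstract, pointwise mutual information is the simplest to show in isolation: PMI(a, b) = log(p(a, b) / (p(a) p(b))). The sketch below estimates the probabilities from document-level co-occurrence and keeps only positive values as edge weights. The counting unit, the positive-only filter, the function name and the toy documents are assumptions; the abstract does not specify how the authors compute their edge weights.

```python
from collections import Counter
from itertools import combinations
import math


def pmi_edge_weights(documents):
    """Map each co-occurring word pair to its PMI, keeping only positive values."""
    total = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in documents:
        words = sorted(set(tokens))  # count each word once per document
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))  # pairs are alphabetically ordered
    weights = {}
    for (a, b), joint in pair_counts.items():
        p_joint = joint / total
        p_a = word_counts[a] / total
        p_b = word_counts[b] / total
        pmi = math.log(p_joint / (p_a * p_b))
        if pmi > 0:  # assumption: negative or zero PMI yields no edge
            weights[(a, b)] = pmi
    return weights


# Hypothetical tokenised contexts.
DOCUMENTS = [
    ["bank", "money"],
    ["bank", "money"],
    ["bank", "river"],
    ["tree", "river"],
]
EDGE_WEIGHTS = pmi_edge_weights(DOCUMENTS)
```

Here "bank" and "money" co-occur more often than chance and receive an edge, while "bank" and "river" co-occur less often than their individual frequencies would predict and are dropped.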


2013 ◽  
Vol 333-335 ◽  
pp. 1106-1109
Author(s):  
Wei Wu

Palm vein pattern recognition is one of the newest biometric techniques researched today. This paper proposes projecting the palm vein image matrix directly using independent component analysis, then calculating the Euclidean distance between projection matrices and classifying by the nearest distance. The experiment was carried out on a self-built palm vein database. Experimental results show that the independent component analysis algorithm is suitable for palm vein recognition and that the recognition performance is practical.
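The classification step described here, nearest neighbour under Euclidean distance between projection matrices, can be sketched without the projection itself. The independent component analysis stage is omitted, and the small matrices, labels and function names below are hypothetical stand-ins for already projected palm vein images.

```python
import math


def matrix_distance(a, b):
    """Euclidean (Frobenius) distance between two equally sized matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def nearest_label(query, gallery):
    """Return the label of the gallery matrix closest to the query."""
    label, _ = min(gallery, key=lambda item: matrix_distance(query, item[1]))
    return label


# Hypothetical projected features for two enrolled subjects.
GALLERY = [
    ("subject_a", [[0.0, 1.0], [1.0, 0.0]]),
    ("subject_b", [[5.0, 5.0], [5.0, 5.0]]),
]
```

A query such as `[[0.1, 0.9], [1.1, 0.0]]` lies much closer to the first gallery entry, so `nearest_label` assigns it to `"subject_a"`.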


Author(s):  
Necva Bölücü ◽  
Burcu Can

Part of speech (PoS) tagging is one of the fundamental syntactic tasks in Natural Language Processing, as it assigns a syntactic category (such as noun, verb, adjective, etc.) to each word within a given sentence or context. Those syntactic categories could be used to further analyze the sentence-level syntax (e.g., dependency parsing) and thereby extract the meaning of the sentence (e.g., semantic parsing). Various methods have been proposed for learning PoS tags in an unsupervised setting without using any annotated corpora. One of the widely used methods for the tagging problem is log-linear models. Initialization of the parameters in a log-linear model is crucial for inference. Different initialization techniques have been used so far. In this work, we present a log-linear model for PoS tagging that uses another fully unsupervised Bayesian model to initialize the parameters of the model in a cascaded framework. Therefore, we transfer some knowledge between two different unsupervised models to leverage the PoS tagging results, where a log-linear model benefits from a Bayesian model’s expertise. We present results for Turkish as a morphologically rich language and for English as a comparably morphologically poor language in a fully unsupervised framework. The results show that our framework outperforms other unsupervised models proposed for PoS tagging.


Author(s):  
Frank Rehm ◽  
Roland Winkler ◽  
Rudolf Kruse

A well-known issue with prototype-based clustering is the user’s obligation to know the right number of clusters in a dataset in advance or to determine it as a part of the data analysis process. There are different approaches to cope with this non-trivial problem. This chapter addresses the problem as an integrated part of the clustering process. An extension to repulsive fuzzy c-means clustering is proposed, equipping non-Euclidean prototypes with repulsive properties. Experimental results are presented that demonstrate the feasibility of the authors’ technique.

