A COMPARATIVE STUDY OF STATISTICAL AND NATURAL LANGUAGE PROCESSING TECHNIQUES FOR SENTIMENT ANALYSIS

2015 ◽  
Vol 77 (18) ◽  
Author(s):  
Wai-Howe Khong ◽  
Lay-Ki Soon ◽  
Hui-Ngo Goh

Sentiment analysis has emerged as one of the most powerful tools in business intelligence. With the aim of proposing an effective sentiment analysis technique, we have performed experiments on analyzing the sentiments of 3,424 tweets using both statistical and natural language processing (NLP) techniques as part of our background study. For the statistical technique, machine learning algorithms such as Support Vector Machines (SVMs), decision trees and Naïve Bayes were explored. The results show that SVM consistently outperformed the rest in both classification tasks. For sentiment analysis using NLP techniques, we used two different tagging methods for part-of-speech (POS) tagging. The output was then used for word sense disambiguation (WSD) using WordNet, followed by sentiment identification using SentiWordNet. Our experimental results indicate that adjectives and adverbs alone are sufficient to infer the sentiment of tweets, compared to other POS combinations. Overall, the statistical approach records approximately 17% higher accuracy than the NLP approach.
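
A minimal sketch of the NLP route described above (not the authors' exact pipeline): POS-tag a tweet, keep only adjectives and adverbs, disambiguate each with WordNet via the simplified Lesk algorithm, and score the chosen sense with SentiWordNet. The example tweet is hypothetical, and NLTK's 'punkt', 'averaged_perceptron_tagger', 'wordnet' and 'sentiwordnet' data are assumed to be installed.

```python
# Sketch only: adjective/adverb-based tweet polarity via WordNet + SentiWordNet.
from nltk import word_tokenize, pos_tag
from nltk.corpus import sentiwordnet as swn
from nltk.wsd import lesk

def tweet_polarity(tweet: str) -> float:
    tokens = word_tokenize(tweet)
    score = 0.0
    for word, tag in pos_tag(tokens):
        # Map Penn Treebank tags to WordNet POS: keep adjectives and adverbs only.
        wn_pos = 'a' if tag.startswith('JJ') else 'r' if tag.startswith('RB') else None
        if wn_pos is None:
            continue
        synset = lesk(tokens, word, wn_pos)        # word sense disambiguation
        if synset is None:
            continue
        senti = swn.senti_synset(synset.name())    # sentiment of the chosen sense
        score += senti.pos_score() - senti.neg_score()
    return score  # > 0 positive, < 0 negative, 0 neutral

print(tweet_polarity("The new phone is surprisingly good but terribly overpriced"))
```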

2018 ◽  
Vol 10 (10) ◽  
pp. 3729 ◽  
Author(s):  
Hei Wang ◽  
Yung Chi ◽  
Ping Hsin

With the advent of the knowledge economy, firms often compete for intellectual property rights. Being the first to acquire high-potential patents can assist firms in achieving future competitive advantages. To identify patents capable of being developed, firms often search for a focus by using existing patent documents. Because of the rapid development of technology, the number of patent documents is immense. A prominent topic among current firms is how to use this large number of patent documents to discover new business opportunities while avoiding conflicts with existing patents. In the search for technological opportunities, a crucial task is to present results in the form of an easily understood visualization. Currently, natural language processing can help in achieving this goal. In natural language processing, word sense disambiguation (WSD) is the problem of determining which "sense" (meaning) of a word is activated in a given context. Given a word and its possible senses, as defined by a dictionary, we classify the occurrence of a word in context into one or more of its sense classes. The features of the context (such as neighboring words) provide evidence for these classifications. Current methods for patent document analysis warrant improvement in areas such as multidimensional analysis and the development of recommendation methods. This study proposes a visualization method that supports semantics, reduces the number of dimensions formed by terms, and can easily be understood by users. Since polysemous words occur frequently in patent documents, we also propose a WSD method to decrease the calculated degrees of distortion between terms. An analysis of outlier distributions is used to construct a patent map capable of distinguishing similar patents. During the development of new strategies, the constructed patent map can assist firms in understanding patent distributions in commercial areas, thereby preventing patent infringement caused by the development of similar technologies. Subsequently, technological opportunities can be recommended according to the patent map, aiding firms in assessing relevant patents in commercial areas early and in sustainably achieving future competitive advantages.
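
As a generic illustration of the patent-map idea (not the authors' method), the sketch below embeds patent texts as TF-IDF term vectors, reduces them to two dimensions, and plots the result so that similar patents fall near one another. The WSD adjustment and outlier analysis described in the paper are omitted, and the sample documents are hypothetical.

```python
# Toy "patent map": term vectors -> 2-D projection -> scatter plot.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
import matplotlib.pyplot as plt

patents = [
    "battery cell thermal management for electric vehicles",
    "lithium battery electrode coating process",
    "image sensor pixel layout for low light photography",
    "camera autofocus using phase detection pixels",
]

vectors = TfidfVectorizer().fit_transform(patents)             # term-document matrix
coords = TruncatedSVD(n_components=2).fit_transform(vectors)   # 2-D projection

plt.scatter(coords[:, 0], coords[:, 1])
for i, (x, y) in enumerate(coords):
    plt.annotate(f"patent {i}", (x, y))
plt.title("Toy patent map")
plt.show()
```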


Author(s):  
Marina Sokolova ◽  
Stan Szpakowicz

This chapter presents applications of machine learning techniques to traditional problems in natural language processing, including part-of-speech tagging, entity recognition and word-sense disambiguation. People usually solve such problems without difficulty, or at least do a very good job. Linguistics may suggest labour-intensive ways of manually constructing rule-based systems. It is, however, the easy availability of large collections of texts that has made machine learning the method of choice for processing volumes of data well above human capacity. One of the main purposes of text processing is all manner of information extraction and knowledge extraction from such large text collections. The machine learning methods discussed in this chapter have stimulated wide-ranging research in natural language processing and helped build applications with serious deployment potential.


The volume of information on the WWW has grown enormously, spurring nascent research on analyzing sentiments using Artificial Intelligence. Sentiment Analysis deals with the computational study of sentiments, opinions and subjectivity. In this paper, multilingual tweets are analyzed to identify the polarities of various political parties such as AAP, BJP, Samajwadi Party, BSP and Congress, so that users can decide which party to vote for. The data are analyzed using Natural Language Processing. Noise is removed from the data using different smoothing techniques, the data are classified using machine learning algorithms, and the accuracy of the system is then gauged using various evaluation measures. The central aim of this research is to benefit both ordinary citizens and politicians. Citizens are helped in deciding where to cast their precious vote, for the good of themselves and the nation. Politicians, after seeing the polarities of the different parties, gain an idea of which parties are viewed favourably and which are not, so that they can adjust their work accordingly. The system compares the VADER and SVM algorithms; the SVM algorithm achieved 90% accuracy.
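
A minimal sketch of the kind of comparison described above, assuming NLTK's VADER lexicon and scikit-learn are available: a lexicon-based VADER score versus a linear SVM trained on TF-IDF features. The tiny labeled dataset is hypothetical, and the paper's multilingual preprocessing, smoothing and evaluation steps are omitted (VADER itself covers English only).

```python
# Sketch: lexicon-based VADER vs. a supervised linear SVM on toy tweets.
from nltk.sentiment.vader import SentimentIntensityAnalyzer  # needs 'vader_lexicon'
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

tweets = ["great work by the party", "worst policies ever",
          "promises kept, well done", "no development at all"]
labels = ["pos", "neg", "pos", "neg"]

# Lexicon-based: VADER's compound score lies in [-1, 1].
vader = SentimentIntensityAnalyzer()
print([vader.polarity_scores(t)["compound"] for t in tweets])

# Supervised: linear SVM over TF-IDF features.
svm = make_pipeline(TfidfVectorizer(), LinearSVC())
svm.fit(tweets, labels)
print(svm.predict(["well done on development"]))
```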


2016 ◽  
Vol 13 (10) ◽  
pp. 6929-6934
Author(s):  
Junting Chen ◽  
Liyun Zhong ◽  
Caiyun Cai

Word sense disambiguation (WSD) in natural language text is a fundamental semantic understanding task at the lexical level in natural language processing (NLP) applications. Kernel methods such as the support vector machine (SVM) have been successfully applied to WSD, mainly because of their relatively high classification accuracy and their ability to handle high-dimensional, sparse data. A significant challenge in WSD is to reduce the need for labeled training data while maintaining acceptable performance. In this paper, we present a semi-supervised technique using the exponential kernel for WSD. Specifically, the semantic similarities between terms are first determined from both labeled and unlabeled training data by means of a diffusion process on a graph defined by lexicon and co-occurrence information, and the exponential kernel is then constructed from the learned semantic similarity. Finally, the SVM classifier trains a model for each class during the training phase, and this model is applied to all test examples in the test phase. The main feature of this approach is that it takes advantage of the exponential kernel to reveal the semantic similarities between terms in an unsupervised manner, which provides a kernel framework for semi-supervised learning. Experiments on several SENSEVAL benchmark data sets demonstrate that the proposed approach is sound and effective.
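
A small sketch of the exponential (diffusion) kernel idea under toy assumptions: start from a term co-occurrence adjacency matrix, take its matrix exponential to propagate similarity along the graph, lift the term kernel to a context kernel, and pass it to an SVM with a precomputed kernel. The matrices and sense labels below are toy values, not the paper's SENSEVAL setup.

```python
# Sketch: exponential kernel from a term graph, used by an SVM for WSD.
import numpy as np
from scipy.linalg import expm
from sklearn.svm import SVC

# Toy symmetric term-term co-occurrence graph over four terms.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

beta = 0.5
K_terms = expm(beta * A)           # diffusion/exponential kernel between terms

# Each training context is a bag-of-terms vector; lift the term kernel to a
# context kernel: K_ctx = X @ K_terms @ X.T
X_train = np.array([[1, 1, 0, 0],
                    [0, 0, 1, 1],
                    [1, 0, 1, 0]], dtype=float)
y_train = [0, 1, 0]                # toy sense labels

K_train = X_train @ K_terms @ X_train.T
clf = SVC(kernel="precomputed").fit(K_train, y_train)

X_test = np.array([[0, 1, 1, 0]], dtype=float)
K_test = X_test @ K_terms @ X_train.T
print(clf.predict(K_test))
```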


2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Yuntong Liu ◽  
Hua Sun

In order to use semantics more effectively in natural language processing, a word sense disambiguation method for Chinese based on semantic computation is proposed. Word sense disambiguation for a Chinese clause is achieved by solving the semantic model of the natural language; each step of the word sense disambiguation process is discussed in detail, and the computational complexity of the process is analyzed. Finally, experiments were conducted to verify the effectiveness of the method.


2012 ◽  
Vol 182-183 ◽  
pp. 2109-2112
Author(s):  
Lin Lin Yu ◽  
Deng Feng Xu ◽  
Li Fang Song ◽  
Guo Jie Li ◽  
Xu Dong Song

Word sense disambiguation (WSD) is a critical and difficult issue in natural language processing (NLP), and it is of great significance to a wide range of NLP research areas. This paper presents a multi-strategy word sense disambiguation method that combines a method based on a matching-word corpus with methods based on similarity and relatedness. The similarity and relatedness calculations make full use of the sememe-tree information from HowNet. Experiments show that the proposed WSD method obtains better results.
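
Since HowNet data are not reproduced here, the following toy sketch stands in for the sememe-based similarity component: a tiny hand-made sememe tree replaces HowNet, and the similarity between two sememes is computed with a common HowNet-style formula, alpha / (alpha + distance), where distance is the path length through their lowest common ancestor. The tree, constant and example sememes are all illustrative assumptions.

```python
# Sketch: path-based sememe similarity over a toy sememe tree (stand-in for HowNet).
ALPHA = 1.6  # smoothing constant commonly used in HowNet-style similarity

# child -> parent edges of a tiny sememe tree
parent = {
    "human": "animate",
    "animal": "animate",
    "animate": "entity",
    "artifact": "entity",
}

def path_to_root(sememe):
    path = [sememe]
    while sememe in parent:
        sememe = parent[sememe]
        path.append(sememe)
    return path

def sememe_similarity(s1, s2):
    p1, p2 = path_to_root(s1), path_to_root(s2)
    common = set(p1) & set(p2)
    # distance = steps from each sememe up to their lowest common ancestor
    dist = min(p1.index(c) + p2.index(c) for c in common) if common else len(p1) + len(p2)
    return ALPHA / (ALPHA + dist)

print(sememe_similarity("human", "animal"))    # share parent "animate"
print(sememe_similarity("human", "artifact"))  # only share the root "entity"
```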


Author(s):  
Mr. Prashant Y. Itankar ◽  
Dr. Nikhat Raza

Word Sense Disambiguation (WSD) is one of the difficult tasks in the domain of Natural Language Processing (NLP). Generating sense-annotated corpora for multilingual WSD is out of reach for most languages, even when resources are available. In this paper we propose an unsupervised technique using word and sense embeddings to improve the performance of WSD systems with untagged corpora; we construct two bags, namely a context bag and a wiki sense bag, to select the senses with the highest similarity. The wiki sense bag provides the external knowledge the system needs to improve disambiguation accuracy. We explore the Word2Vec model to generate the sense bag and observe a significant performance gain on our dataset.
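
A rough sketch of the bag-comparison idea under stated assumptions: average Word2Vec vectors for the words around the target word (the context bag) and for the words describing each candidate sense (the sense bag; hand-written here, standing in for Wikipedia text), then choose the sense whose bag is closest by cosine similarity. The toy corpus, sense descriptions and target word are hypothetical.

```python
# Sketch: unsupervised sense selection by comparing a context bag to sense bags.
import numpy as np
from gensim.models import Word2Vec

corpus = [
    ["deposit", "money", "bank", "account", "loan"],
    ["river", "bank", "water", "fishing", "shore"],
    ["bank", "interest", "savings", "credit"],
    ["walk", "along", "river", "bank", "shore"],
]
model = Word2Vec(corpus, vector_size=50, min_count=1, epochs=200, seed=1)

def bag_vector(words):
    vecs = [model.wv[w] for w in words if w in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

context_bag = ["deposit", "money", "account"]            # words around "bank"
sense_bags = {
    "bank.finance": ["money", "loan", "savings", "credit"],
    "bank.river":   ["river", "water", "shore", "fishing"],
}
scores = {sense: cosine(bag_vector(context_bag), bag_vector(words))
          for sense, words in sense_bags.items()}
print(max(scores, key=scores.get), scores)
```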


2019 ◽  
Author(s):  
Qasem Al-Tashi

In natural language, many words have different meanings, and the correct sense of a word depends upon the context in which it occurs. Word sense disambiguation is the process of selecting the most appropriate sense of a word in a given sentence. Furthermore, many natural language processing applications, such as information extraction, machine translation, and content analysis, rely on word sense disambiguation, which can be an essential pre-processing step for them.
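
As a minimal illustration of picking a sense from context, assuming NLTK and its WordNet data are installed, the simplified Lesk algorithm selects the sense of "bank" whose dictionary gloss best overlaps the surrounding sentence:

```python
# Sketch: choosing a word sense from context with NLTK's simplified Lesk.
from nltk import word_tokenize
from nltk.wsd import lesk

sentence = word_tokenize("I went to the bank to deposit my salary")
sense = lesk(sentence, "bank", "n")
print(sense, "-", sense.definition())
```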

