Sarcasm Detection For Sentiment Analysis in Indonesian Tweets

Author(s):  
Yessi Yunitasari ◽  
Aina Musdholifah ◽  
Anny Kartika Sari

Twitter is one of the most widely used social media platforms at the moment. Tweet conversations can be classified according to their sentiments. Sarcasm in a tweet sometimes causes its sentiment to be determined incorrectly, because sarcasm is difficult to analyze automatically and is sometimes hard even for humans to recognize. Hence, sarcasm detection needs to be conducted, which is expected to improve the results of sentiment analysis. The effect of sarcasm detection on sentiment analysis can be seen in terms of accuracy, precision and recall. In this paper, detection of sarcasm is applied to Indonesian tweets. The feature extraction for sarcasm detection uses unigrams and 4 Boazizi feature sets, which consist of sentiment-related features, punctuation-related features, lexical and syntactic features, and top word features. Detection of sarcasm uses the Random Forest algorithm. The feature extraction for sentiment analysis uses TF-IDF, while the classification uses the Naïve Bayes algorithm. The evaluation shows that sentiment analysis with sarcasm detection improves the accuracy of sentiment analysis by about 5.49%. The accuracy of the model is 80.4%, while the precision is 83.2% and the recall is 91.3%.
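The three evaluation measures named in this abstract can be computed from a binary confusion matrix. The sketch below uses the standard definitions; the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for illustration only.
accuracy, precision, recall = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Comparing these values for a sentiment classifier with and without a preceding sarcasm-detection step is one way to quantify the kind of improvement the abstract reports.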

Author(s):  
Farrikh Alzami ◽  
Erika Devi Udayanti ◽  
Dwi Puji Prabowo ◽  
Rama Aria Megantara

Sentiment analysis in the form of polarity classification is very important in everyday life: knowing the polarity lets people find out whether a given document carries positive or negative sentiment, which helps them choose and make decisions. Sentiment analysis is usually done manually; therefore, an automatic sentiment classification process is needed. However, it is rare to find studies that discuss which feature extraction methods and learning models are suitable for unstructured sentiment analysis in the case of Amazon food reviews. This research explores several feature extraction methods, namely Bag of Words, TF-IDF, Word2Vec, and a combination of TF-IDF and Word2Vec, together with several machine learning models, namely Random Forest, SVM, KNN and Naïve Bayes, to find a combination of feature extraction and learning model that suits polarity sentiment analysis. With document preprocessing such as removing HTML tags, punctuation and special characters, and with Snowball stemming, TF-IDF combined with SVM proved suitable for polarity classification in unstructured sentiment analysis for the Amazon food review case, with a performance of 87.3 percent.
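As a rough illustration of the TF-IDF feature extraction mentioned above, the following sketch computes unsmoothed TF-IDF weights with the standard library only. The whitespace tokenizer and the exact weighting formula are assumptions; the abstract does not state which variant was used, and libraries typically apply smoothing and normalization on top of this.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = len(tokens)
        vectors.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy reviews, not from the Amazon data set.
reviews = ["good tasty food", "bad food"]
for review, vector in zip(reviews, tf_idf(reviews)):
    print(review, vector)
```

Terms that occur in every document (here "food") receive weight zero, which is why TF-IDF tends to emphasize the words that distinguish one review from another.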


The World Wide Web has greatly expanded its content in recent years; it holds a vast amount of multimedia resources that grow continuously, particularly textual data. One of the major sources of such content is the social media platform Facebook. Netizens on Facebook actively share their opinions about topics or posts, whether or not these relate to them directly. The huge amount of accessible textual data on social media opens research opportunities in the field of opinion mining. A netizen’s comment on a particular post can be either negative or positive. This study examines whether a netizen’s opinion or comment is positive or negative, that is, how she or he feels about a specific topic posted on Facebook; this can be measured using Sentiment Analysis. Sentiment Analysis combines Natural Language Processing and text analytics to extract useful information from data. This study is based on product reviews written by Filipinos in Filipino, English and Taglish (mixed Filipino and English). To categorize comments effectively, the Naïve Bayes algorithm was implemented in the developed web system.
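A minimal sketch of a Naïve Bayes comment classifier is given below, assuming a multinomial model with add-one (Laplace) smoothing and whitespace tokenization. The class name and the English toy comments are hypothetical; the abstract does not describe the variant or the training data used in the web system.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training comments.
model = NaiveBayesClassifier().fit(
    ["good product fast delivery", "love it great quality",
     "bad product slow delivery", "hate it poor quality"],
    ["positive", "positive", "negative", "negative"],
)
print(model.predict("great product"))
print(model.predict("poor delivery"))
```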


Author(s):  
Normi Sham Awang Abu Bakar ◽  
Ros Aziehan Rahmat ◽  
Umar Faruq Othman

The popularity of social media channels has increased interest among researchers in the sentiment analysis (SA) area. One aspect of SA research is determining the polarity of comments in social media, i.e. positive, negative, or neutral. However, Malay sentiment analysis tools are scarce because most of the work in the literature discusses polarity classification tools for English. This paper presents the development of a polarity classification tool called the Malay Polarity Classification Tool (MaCT). The tool is based on the AFINN sentiment lexicon for the English language. We translated each word in AFINN to its Malay equivalent and then used the lexicon to collect sentiment data from Twitter. The Twitter data are then classified into positive, negative, and neutral. For validation, we collected 400 positive tweets, 400 negative tweets, and 200 neutral tweets, ran the tweets through our sentiment lexicon, and obtained a score of 90% for precision, recall and accuracy. Our main contributions are the new AFINN translation for the Malay language and the classification of the sentiment data.
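Lexicon-based polarity classification of the kind described above can be sketched as follows. This assumes that word scores are summed and that the sign of the total decides the class, which is a common way to use AFINN-style lexicons but is not confirmed by the abstract. The four-word lexicon and its scores are hypothetical and are not taken from the MaCT translation.

```python
# Hypothetical Malay entries with AFINN-style integer valences.
TOY_LEXICON = {
    "bagus": 3,    # good
    "suka": 2,     # like
    "buruk": -3,   # bad
    "benci": -3,   # hate
}


def classify_polarity(text, lexicon):
    """Sum the lexicon scores of all tokens and map the total to a label."""
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for tweet in ["saya suka makanan ini", "servis buruk", "hari ini hujan"]:
    print(tweet, "->", classify_polarity(tweet, TOY_LEXICON))
```

Tweets with no lexicon hits, or with positive and negative scores that cancel out, fall into the neutral class under this assumption.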


2020 ◽  
Vol 222 (2) ◽  
pp. 978-988
Author(s):  
Yury Meshalkin ◽  
Anuar Shakirov ◽  
Evgeniy Popov ◽  
Dmitry Koroteev ◽  
Irina Gurbatova

Rock thermal conductivity is an essential input parameter for the design and optimization of enhanced oil recovery methods and for basin and petroleum system modelling. The absence of any effective technique for direct in situ measurement of rock thermal conductivity makes the development of well-log based methods for determining it highly desirable. Most existing solutions are regression model-based approaches. A literature review revealed that only a few studies have assessed the applicability of neural network-based algorithms to predicting rock thermal conductivity from well-logging data. In this research, we aim to identify the most effective machine-learning algorithms for well-log based determination of rock thermal conductivity. Well-logging data acquired at a heavy oil reservoir, together with results of thermal logging on cores extracted from two wells, were the basis for our research. Eight different regression models were developed and tested to predict vertical variations of rock thermal conductivity from well-logging data. Additionally, rock thermal conductivity was determined based on the Lichtenecker–Asaad model. A comparison of the regression-based and theoretical approaches was performed. Among the machine learning techniques considered, the Random Forest algorithm was found to be the most accurate for well-log based determination of rock thermal conductivity. From a comparison of the thermal conductivity versus depth profile predicted from well-logging data with the experimental data, it can be concluded that thermal conductivity can be determined with a total relative error of 12.54 per cent. The results show that, based on the Random Forest algorithm, rock thermal conductivity can be inferred from well-logging data for wells drilled in a similar geological setting with an accuracy sufficient for industrial needs.
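For orientation, the theoretical approach mentioned above can be sketched as a weighted geometric mean of matrix and pore-fluid conductivities. The form below, with an empirical correction factor applied to porosity, is an assumption about how the Lichtenecker–Asaad model is commonly written; the abstract does not give the formula, and all numeric values are hypothetical.

```python
def lichtenecker_asaad(matrix_conductivity, fluid_conductivity,
                       porosity, correction=1.0):
    """Effective conductivity as a weighted geometric mean.

    With correction == 1.0 this reduces to the plain Lichtenecker
    geometric-mean mixing law. The correction factor is an assumed
    empirical parameter, not a value from the paper.
    """
    exponent = correction * porosity
    return (matrix_conductivity ** (1.0 - exponent)
            * fluid_conductivity ** exponent)


# Hypothetical inputs in W/(m*K): rock matrix 3.0, pore fluid 0.6.
for porosity in (0.0, 0.1, 0.2, 0.3):
    value = lichtenecker_asaad(3.0, 0.6, porosity)
    print(f"porosity={porosity:.1f} conductivity={value:.3f}")
```

A model of this kind needs porosity and component conductivities as inputs, whereas the regression models in the study learn the mapping directly from well-log measurements.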


E-commerce companies maintain public pages on social network sites (Facebook, Twitter, etc.) with the intention of improving their business strategy. They gauge public mood about a page from its total likes, its total shares and the sentiment of all comments on the page. Similarly, celebrities maintain public pages on social network sites to increase their fame. We have developed an assorted model for publicly available Facebook pages. This assorted model combines a data extractor model, a language convertor and cleaning model, and a sentiment analyzer model. Our data extractor model extracts the comments on all posts of a public Facebook page in a short span of time. The language convertor and cleaning model converts text written in different Indian languages to English, after which the English text is cleaned by the cleaning model. The language convertor is built by implementing the CILTEL model, which converts comments written in Indian languages into English. The cleaning model cleans all comments on all posts of the Facebook page. Finally, the sentiment extraction model extracts the sentiment of all comments on the Facebook page. We have implemented classification using three machine learning algorithms, namely the naïve Bayes, perceptron and Rocchio algorithms, to check the performance of our sentiment analysis model. Our assorted sentiment analysis model is beneficial to users such as the marketing industry, political parties and celebrities.
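Of the three classifiers named above, the Rocchio algorithm is perhaps the least familiar. In its text-classification form it is a nearest-centroid method: each class is represented by the average of its training vectors, and a new comment is assigned to the class with the most similar centroid. The sketch below assumes raw word counts and cosine similarity; the function names and toy comments are hypothetical and do not reflect the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words vector as a term -> count mapping."""
    return Counter(text.lower().split())


def train_rocchio(texts, labels):
    """Return one centroid (mean bag-of-words vector) per class."""
    sums = defaultdict(Counter)
    class_sizes = Counter(labels)
    for text, label in zip(texts, labels):
        sums[label].update(vectorize(text))
    return {
        label: {term: value / class_sizes[label] for term, value in vec.items()}
        for label, vec in sums.items()
    }


def cosine(a, b):
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, centroids):
    """Assign the class whose centroid is most similar to the comment."""
    vector = vectorize(text)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Hypothetical comments on a public page.
centroids = train_rocchio(
    ["great page love it", "nice post great work",
     "terrible page hate it", "awful post bad work"],
    ["positive", "positive", "negative", "negative"],
)
print(classify("great work", centroids))
print(classify("hate it", centroids))
```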


Author(s):  
Panchal Mayuriben ◽  
Dr. Priyanka Sharma ◽  
Jatin Patel

Analyzing people's behavioral patterns using social media data has become a trend in the last couple of years. Among these networks, Twitter, Facebook and Instagram have become increasingly popular, which is why these platforms attract many researchers who try to predict sentiment regarding major events and topics such as elections, product brands, movies, the stock market and recent trends. By identifying the attitude associated with a text as positive, negative or neutral, we are able to analyze the opinion behind user-generated content, and these opinions are very helpful for organizations, political parties and other entities. Sentiment analysis is conducted by identifying the polarity associated with a word, sentence or document. This paper presents research designed to improve the accuracy of the model by improving the Naïve Bayes algorithm; I also worked on improving the 3-gram method during my research.
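For readers unfamiliar with the term, a 3-gram (trigram) feature is a sequence of three consecutive tokens. The sketch below shows plain word-level n-gram extraction only; it is not the improved 3-gram method the abstract refers to, whose details are not given.

```python
def ngrams(tokens, n=3):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical example sentence.
tokens = "the movie was very good".split()
for gram in ngrams(tokens):
    print(gram)
```

Trigrams capture short phrases such as negations ("was not good") that single-word features miss, at the cost of a much larger and sparser feature space.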


2019 ◽  
Author(s):  
Bruno Tavares Padovese ◽  
Linilson Rodrigues Padovese

Avian survey is a time-consuming and challenging task, often conducted in remote and sometimes inhospitable locations. In this context, the development of automated acoustic landscape monitoring systems for bird surveys is essential. We conducted a comparative study of two machine learning methods for the detection and identification of 2 endangered Brazilian bird species from the Psittacidae family, Amazona brasiliensis and Amazona vinacea. Specifically, we focus on the identification of these 2 species in an acoustic landscape where similar vocalizations from other Psittacidae species are present. A 3-step approach is presented, composed of signal segmentation and filtering, feature extraction, and classification. In the feature extraction step, Mel-Frequency Cepstrum Coefficient features were extracted and fed to the Random Forest algorithm and the Multilayer Perceptron for training and classifying acoustic samples. The experiments showed promising results, particularly for the Random Forest algorithm, which achieved an accuracy of up to 99%. Applying signal segmentation and filtering before the feature extraction step greatly improved the experimental results. Additionally, the results show that the proposed approach is robust and flexible enough to be adopted in passive acoustic monitoring systems.
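Mel-Frequency Cepstrum Coefficients are built on the mel scale, which spaces frequencies according to perceived pitch. The sketch below shows one widely used hertz-to-mel conversion and how it yields evenly spaced filter centers; the abstract does not say which mel formula or filter-bank settings were used, so the constants and the 8 kHz upper limit here are assumptions.

```python
import math


def hz_to_mel(frequency_hz):
    """Convert hertz to mels using a common logarithmic formula."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(low_hz, high_hz, n_filters):
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * i) for i in range(1, n_filters + 1)]


# Hypothetical band limits and filter count.
for center in mel_filter_centers(0.0, 8000.0, 5):
    print(f"{center:.1f} Hz")
```

The resulting centers are dense at low frequencies and sparse at high ones; a full MFCC pipeline would then apply these filters to a power spectrum, take logarithms and apply a discrete cosine transform.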

