An Enhancement of Malay Social Media Text Normalization for Lexicon-Based Sentiment Analysis

In today's digitized world, the Internet has become a prominent source of many kinds of information. People increasingly prefer virtual interaction to one-to-one communication, and the majority of the population uses social networking sites to express themselves through posts, blogs, comments, likes, and dislikes. Their sentiments can be traced using opinion mining, also called sentiment analysis. Sentiment analysis of social media text is a useful technique for identifying people's positive, negative, or neutral emotions, sentiments, and opinions, and it has gained special attention from researchers over the last few years. Traditionally, machine learning algorithms such as naive Bayes and Support Vector Machines were used to implement it. To overcome the drawbacks of ML in terms of complex classification algorithms, deep learning-based algorithms such as CNN, RNN, and HNN have been introduced. In this paper, we study different deep learning algorithms and intend to propose a deep learning-based model that analyzes the behavior of an individual using social media text. The results given by the proposed model can be used in a range of fields such as business, education, industry, politics, psychology, and security.

NITS-Hinglish-SentiMix at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text Using an Ensemble Model

10.18653/v1/2020.semeval-1.175 ◽ 2020

Author(s): Subhra Jyoti Baroi, Nivedita Singh, Ringki Das, Thoudam Doren Singh

Keyword(s): Social Media, Sentiment Analysis, Ensemble Model, Social Media Text

Sentiment Analysis on Hindi–English Code-Mixed Social Media Text

Innovations in Computer Science and Engineering - Lecture Notes in Networks and Systems ◽ 10.1007/978-981-33-4543-0_65 ◽ 2021 ◽ pp. 615-622

Author(s): T. Tulasi Sasidhar, B. Premjith, K. Sreelakshmi, K. P. Soman

Keyword(s): Social Media, Sentiment Analysis, Social Media Text

Roman to Gurmukhi Social Media Text Normalization

International Journal of Intelligent Computing and Cybernetics ◽ 10.1108/ijicc-08-2020-0096 ◽ 2020 ◽ Vol 13 (4) ◽ pp. 407-435

Author(s): Jagroop Kaur, Jaswinder Singh

Keyword(s): Social Media, Language Processing, Research Work, Target Language, Translation System, Second Phase, Content Type, Social Media Text, Practical Implications, Text Normalization

Purpose: Normalization is an important step in all natural language processing applications that handle social media text. Text from social media poses kinds of problems that are not present in regular text. Recently, a considerable amount of work has been done in this direction, but mostly for the English language. People who do not speak English code-mix text with their native language and post it on social media using the Roman script. This kind of text further aggravates the problem of normalization. This paper aims to discuss the concept of normalization with respect to code-mixed social media text, and a model is proposed to normalize such text.

Design/methodology/approach: The system is divided into two phases: candidate generation and most probable sentence selection. The candidate generation task is treated as a machine translation task in which the Roman text is the source language and the Gurmukhi text is the target language. A character-based translation system is proposed to generate candidate tokens. Once candidates are generated, the second phase uses the beam search method to select the most probable sentence based on a hidden Markov model.

Findings: Character error rate (CER) and bilingual evaluation understudy (BLEU) score are reported. The proposed system has been compared with Akhar software and the RB_R2G system, which are also capable of transliterating Roman text to Gurmukhi. The proposed system outperforms Akhar software. The CER and BLEU scores are 0.268121 and 0.6807939, respectively, for ill-formed text.

Research limitations/implications: It was observed that the system produces dialectal variations of a word, or the word with minor errors such as a missing diacritic. A spell checker can improve the output of the system by correcting these minor errors. Extensive experimentation is needed to optimize the language identifier, which will further help in improving the output. The language model also needs further exploration. Inclusion of wider context, particularly from social media text, is an important area that deserves further investigation.

Practical implications: The practical implications of this study are: (1) development of a parallel dataset containing Roman and Gurmukhi text; (2) development of a dataset annotated with language tags; (3) development of the normalizing system, which is the first of its kind and proposes a translation-based solution for normalizing noisy social media text from Roman to Gurmukhi, and which can be extended to any pair of scripts; (4) the proposed system can be used for better analysis of social media text. Theoretically, the study helps in better understanding text normalization in the social media context and opens the door to further research in multilingual social media text normalization.

Originality/value: Existing research work focuses on normalizing monolingual text. This study contributes towards the development of a normalization system for multilingual text.
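As a rough illustration of the second phase and of the CER metric described in this abstract, the following minimal Python sketch runs a beam search over per-token candidate lists and computes a character error rate. It is not the authors' implementation: the function names, the placeholder tokens (`x1`, `y1`, ...), the probabilities, and the simple bigram transition table standing in for the hidden Markov model are all assumptions made for the example.

```python
import math

def beam_search(candidates, transition_logprob, beam_width=3):
    """Return the highest-scoring token sequence found by beam search.

    candidates: one list per position of (token, candidate_logprob) pairs,
        as a hypothetical candidate generator might produce.
    transition_logprob: function (previous_token, token) -> log-probability.
    """
    beams = [([], 0.0)]  # (tokens so far, cumulative log-probability)
    for options in candidates:
        expanded = []
        for tokens, score in beams:
            prev = tokens[-1] if tokens else "<s>"
            for token, cand_lp in options:
                new_score = score + cand_lp + transition_logprob(prev, token)
                expanded.append((tokens + [token], new_score))
        # Keep only the best `beam_width` partial sentences.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
    return beams[0][0]

def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]

def cer(hypothesis, reference):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(hypothesis, reference) / len(reference)

# Toy data (assumed): placeholder tokens stand in for Gurmukhi candidates.
candidates = [
    [("x1", math.log(0.6)), ("x2", math.log(0.4))],
    [("y1", math.log(0.5)), ("y2", math.log(0.5))],
]
BIGRAMS = {
    ("<s>", "x1"): 0.5, ("<s>", "x2"): 0.5,
    ("x1", "y1"): 0.1, ("x1", "y2"): 0.2,
    ("x2", "y1"): 0.9, ("x2", "y2"): 0.1,
}

def transition_logprob(prev, token):
    # Unseen bigrams get a small floor probability (an assumption).
    return math.log(BIGRAMS.get((prev, token), 1e-6))

print(beam_search(candidates, transition_logprob, beam_width=2))  # ['x2', 'y1']
print(beam_search(candidates, transition_logprob, beam_width=1))  # ['x1', 'y2']
print(cer("abcd", "abce"))  # 0.25
```

With a beam width of 1 the search is greedy and commits to the locally best first token, while a width of 2 keeps the alternative alive long enough for the transition scores to favor it. This is the usual reason for preferring beam search over greedy selection in this kind of sentence-level decoding.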

A customizable pipeline for social media text normalization

Social Network Analysis and Mining ◽ 10.1007/s13278-017-0464-z ◽ 2017 ◽ Vol 7 (1) ◽ Cited By ~ 5

Author(s): Abeed Sarker

Keyword(s): Social Media, Social Media Text, Text Normalization

Sentiment analysis of Social Media Text-Emoticon Post with Machine learning Models Contribution Title

Journal of Physics Conference Series ◽ 10.1088/1742-6596/2070/1/012079 ◽ 2021 ◽ Vol 2070 (1) ◽ pp. 012079

Author(s): V Jagadishwari, A Indulekha, Kiran Raghu, P Harshini

Keyword(s): Machine Learning, Social Media, Sentiment Analysis, Online Social Networks, Data Sets, Learning Models, Twitter Data, The Social, Social Media Text, Machine Learning Models

Abstract: Social media has recently become an arena for people to share their perspectives on a variety of topics, and most social interactions now take place through it. Although online social networks allow users to express their views and opinions in many forms, such as audio, video, and text, the most popular forms of expression are text, emoticons, and emojis. The work presented in this paper aims at detecting the sentiments expressed in social media posts. The machine learning models Bernoulli Bayes, Multinomial Bayes, Regression, and SVM were implemented. All of these models were trained and tested with Twitter data sets. Users on Twitter express their opinions in the form of tweets with a limited number of characters. Tweets also contain emoticons and emojis, so Twitter data sets are well suited to sentiment analysis. The effect of the emoticons present in a tweet is also analyzed: the models are first trained only with the text, and then with both the text and the emoticons in the tweet. The performance of all four models in both cases is tested, and the results are presented in the paper.
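The two training conditions described in this abstract (text only, then text plus emoticons) can be pictured as two feature-extraction variants. The minimal Python sketch below is only an assumed illustration: the small emoticon set, the `EMO_` prefix, and the whitespace tokenization are choices made for the example, not details taken from the paper, and emojis are left out for brevity.

```python
# A small, assumed set of emoticons; a real system would use a larger list.
EMOTICONS = {":)", ":-)", ":(", ":-(", ":D", ";)"}

def features(tweet, use_emoticons):
    """Turn a tweet into a list of feature tokens.

    With use_emoticons=False, emoticons are dropped (text-only condition).
    With use_emoticons=True, each emoticon is kept as an "EMO_"-prefixed token.
    """
    feats = []
    for token in tweet.split():
        if token in EMOTICONS:
            if use_emoticons:
                feats.append("EMO_" + token)
        else:
            # Lowercase words and strip surrounding punctuation.
            word = token.lower().strip(".,!?\"'")
            if word:
                feats.append(word)
    return feats

print(features("Great day :)", use_emoticons=False))  # ['great', 'day']
print(features("Great day :)", use_emoticons=True))   # ['great', 'day', 'EMO_:)']
```

Either feature list could then be fed to the same classifier, so that any difference in accuracy between the two runs can be attributed to the emoticon tokens.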

A Cascaded Approach for Social Media Text Normalization of Turkish

10.3115/v1/w14-1308 ◽ 2014 ◽ Cited By ~ 5

Author(s): Dilara Torunoğlu, Gülsen Eryiğit

Keyword(s): Social Media, Social Media Text, Text Normalization
