A Study on Feature Subsumption for Sentiment Classification in Social Networks using Natural Language Processing

2012 ◽  
Vol 53 (18) ◽  
pp. 29-33
Author(s):  
B. Jayanag ◽  
K. Vineela ◽  
S. Vasavi

Sentiment classification is one of the best-known and most popular domains of machine learning and natural language processing. An algorithm is developed to understand the opinion of an entity in a way similar to human beings. This research article presents one such algorithm. Natural language processing concepts are used for text representation, and a novel word embedding model is then proposed for effective classification of the data. TF-IDF and common BoW representation models were considered for representing the text data. The importance of these models is discussed in the respective sections. The proposed model is tested using IMDB datasets. A split of 50% training and 50% testing, with three random shufflings of the datasets, is used to evaluate the model.
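The representation and evaluation steps named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the `tf * log(N / df)` weighting variant, the function names, and the toy reviews are all hypothetical choices, and the proposed word embedding model is not reproduced.

```python
import math
import random
from collections import Counter


def bag_of_words(doc):
    """Count how often each lowercase whitespace token occurs in a document."""
    return Counter(doc.lower().split())


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    counts = [bag_of_words(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for c in counts for term in c)
    weighted = []
    for c in counts:
        total = sum(c.values())
        weighted.append(
            {t: (n / total) * math.log(n_docs / doc_freq[t]) for t, n in c.items()}
        )
    return weighted


def half_splits(items, n_shuffles=3, seed=0):
    """Yield (train, test) halves for several independent random shuffles."""
    rng = random.Random(seed)
    for _ in range(n_shuffles):
        shuffled = list(items)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        yield shuffled[:mid], shuffled[mid:]


# Hypothetical toy reviews, not taken from the IMDB data.
reviews = ["a great great film", "a dull film", "great acting", "dull plot"]
print(bag_of_words(reviews[0]))
print(tf_idf(reviews)[0])
for train, test in half_splits(reviews):
    print(len(train), len(test))
```

Averaging a model's score over the three splits gives a less shuffle-dependent estimate than a single 50/50 split.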


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Julián Ramírez Sánchez ◽  
Alejandra Campo-Archbold ◽  
Andrés Zapata Rozo ◽  
Daniel Díaz-López ◽  
Javier Pastor-Galindo ◽  
...  

Among the myriad applications of natural language processing (NLP), assisting law enforcement agencies (LEA) in detecting and preventing cybercrimes is one of the most recent and promising. The promotion of violence or hate by digital means is considered a cybercrime, as it leverages cyberspace to support illegal activities in the real world. The paper at hand proposes a solution that uses neural network (NN) based NLP to monitor suspicious activities in social networks, allowing us to identify and prevent related cybercrimes. An LEA can find similar posts grouped in clusters, then determine their level of polarity, and identify a subset of user accounts that promote violent activities to be reviewed extensively as part of an effort to prevent crimes and specifically hostile social manipulation (HSM). Different experiments were also conducted to prove the feasibility of the proposal.


2021 ◽  
Author(s):  
Vrushang Patel

Text classification is a classical machine learning application in Natural Language Processing, which aims to assign labels to textual units such as documents, sentences, paragraphs, and queries. Applications of text classification include sentiment classification and news categorization. Sentiment classification identifies the polarity of text, such as positive, negative, or neutral, based on textual features. In this thesis, we implemented a modified form of a tolerance-based algorithm (TSC) to classify sentiment polarities of tweets as well as news categories from text. The TSC algorithm is a supervised algorithm that was designed to perform short text classification with tolerance near sets (TNS). The proposed TSC algorithm uses vectors from a pre-trained SBERT model to create tolerance classes. The effectiveness of the TSC algorithm has been demonstrated by testing it on ten well-researched datasets. One of the datasets (Covid-Sentiment) was hand-crafted from tweets expressing opinions related to COVID. Experiments demonstrate that, using a weighted F1-score measure, TSC outperforms five classical ML algorithms on one dataset and is comparable to them on all other datasets.
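The weighted F1-score used as the evaluation measure here can be computed as below. This is a minimal sketch of the standard definition (per-class F1 averaged with weights equal to each class's share of the true labels); the function name and the toy labels are hypothetical, and the TSC algorithm itself is not reproduced.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average per-class F1 scores, weighting each class by its true support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: items of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += (n_true / total) * f1
    return score


# Hypothetical toy labels for illustration only.
truth = ["pos", "pos", "neg", "neg"]
guess = ["pos", "neg", "neg", "neg"]
print(weighted_f1(truth, guess))
```

Weighting by support keeps a rare class from dominating the score, which matters when sentiment labels are imbalanced.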


2021 ◽  
pp. 233-252
Author(s):  
Upendar Rao Rayala ◽  
Karthick Seshadri

Sentiment analysis is perceived to be a multi-disciplinary research domain composed of machine learning, artificial intelligence, deep learning, image processing, and social networks. Sentiment analysis can be used to determine opinions of the public about products and to find the customers' interest and their feedback through social networks. To perform any natural language processing task, the input text/comments should be represented in a numerical form. Word embeddings represent the given text/sentences/words as a vector that can be employed in performing subsequent natural language processing tasks. In this chapter, the authors discuss different techniques that can improve the performance of sentiment analysis using concepts and techniques like traditional word embeddings, sentiment embeddings, emoticons, lexicons, and neural networks. This chapter also traces the evolution of word embedding techniques with a chronological discussion of the recent research advancements in word embedding techniques.
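As a rough illustration of representing text as a vector, the sketch below averages word vectors into a sentence vector and compares sentences by cosine similarity. The three-dimensional embedding table is entirely hypothetical and hand-written for the example; averaging is only one simple way to combine word vectors and is not claimed to be the technique the chapter recommends.

```python
import math

# Hypothetical 3-dimensional embeddings; real models learn far more dimensions.
EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "movie": [0.0, 0.9, 0.3],
}


def sentence_vector(sentence, embeddings=EMBEDDINGS):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        # No known word: fall back to a zero vector of the right length.
        return [0.0] * len(next(iter(embeddings.values())))
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


same = cosine(sentence_vector("good movie"), sentence_vector("great movie"))
opposite = cosine(sentence_vector("good movie"), sentence_vector("bad movie"))
print(same, opposite)
```

With these toy values, sentences of the same polarity end up closer together than sentences of opposite polarity, which is the property a downstream sentiment classifier relies on.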


2020 ◽  
Author(s):  
Esra Kahya Özyirmidokuz ◽  
Kumru Uyar ◽  
Raian Ali ◽  
Eduard Alexandru Stoica ◽  
Betül Karakaş

BACKGROUND Measuring online Turkish happiness requires a Turkish happiness dictionary that reflects cultural and linguistic norms and social values, rather than a translation-oriented method. Analyzing data while neglecting cultural characteristics will not be reliable. The Turkish translation of an English word in the Affective Norms for English Words (ANEW) dictionary does not express the same feeling as a Turkish word. In addition, existing emotional dictionaries were not developed specifically for social networks with emoticons. OBJECTIVE This research presents the Turkish Happiness Index (THI), a set of psychological normative happiness scores for measuring the average level of happiness in large-scale unstructured online Turkish data. A well-being informatics analysis is also carried out using the THI. METHODS The Turkish Happiness Index was generated entirely from social networks. 20000 words were extracted from social networks with web text mining. Natural language processing algorithms were applied. After data reduction, a quantitative research methodology was applied. The happiness scores were determined from 667 participants’ subjective happiness levels and their thoughts about the 1874 Turkish words. An alexithymia scale was also used to identify the emotional awareness of the participants. The words were evaluated on the valence dimension using the Self-Assessment Manikin on an online platform. NLP was used to measure the happiness of online Turkish data. Data was collected from Facebook with the negative #war and positive #family hashtags over one month using a third-party software tool. After the data was converted to documents, natural language processing algorithms including tokenization, transformation, filtering, and stemming were applied. The happiness levels of the documents based on hashtags were determined using the Turkish Happiness Index dictionary.
RESULTS The THI, which contains 345 words and their happiness scores in the Turkish language, was developed. The THI is given in Appendix 1. We also compare words across dictionaries to understand the cultural differences. CONCLUSIONS The THI provides researchers with standard materials through which they can automatically measure the online happiness of large-scale Turkish data. The THI can be used in real-time big data analytics.
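The dictionary-based scoring described under METHODS can be sketched as a small pipeline: tokenize, lowercase, filter stop words, stem, then average the scores of the words found in the dictionary. Everything specific here is an assumption made for illustration: the lexicon entries and their scores are invented and are not THI values, the words are English rather than Turkish, the stop-word list is arbitrary, and the suffix-stripping stemmer is a toy placeholder for a real stemmer.

```python
import re

# Hypothetical lexicon with invented valence scores; NOT the THI dictionary.
LEXICON = {"family": 8.0, "peace": 7.5, "war": 2.0}
STOP_WORDS = {"the", "a", "and", "of", "with"}


def stem(token):
    """Toy stemmer: undo simple English plurals only."""
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def preprocess(text):
    """Tokenize on letter runs, lowercase, drop stop words, then stem."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def happiness_score(text, lexicon=LEXICON):
    """Mean score of the lexicon words in a document, or None if none match."""
    scores = [lexicon[t] for t in preprocess(text) if t in lexicon]
    return sum(scores) / len(scores) if scores else None


print(preprocess("Dinner with the families"))
print(happiness_score("Families and peace"))
print(happiness_score("Wars"))
```

Returning `None` for documents with no dictionary words keeps them from being counted as neutral when document scores are later aggregated per hashtag.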


2021 ◽  
Vol 30 (01) ◽  
pp. 257-263
Author(s):  
Natalia Grabar ◽  
Cyril Grouin ◽  

Summary Objectives: To analyze the content of publications within the medical NLP domain in 2020. Methods: Automatic and manual preselection of publications to be reviewed, and selection of the best NLP papers of the year. Analysis of the important issues. Results: Three best papers have been selected in 2020. We also propose an analysis of the content of the NLP publications in 2020, all topics included. Conclusion: The two main issues addressed in 2020 are related to the investigation of COVID-related questions and to the further adaptation and use of transformer models. In addition, the trends from past years continue, such as the diversification of languages processed and the use of information from social networks.


2018 ◽  
Vol 9 (3) ◽  
pp. 1
Author(s):  
Francisco Albernaz Machado Valério ◽  
Tatiane Gomes Guimarães ◽  
Raquel Oliveira Prates ◽  
Heloisa Candello

Recently, text-based chatbots have risen in popularity, possibly due to new APIs for online social networks and messenger services, and to development platforms that help deal with all the necessary Natural Language Processing. However, because chatbots use natural language as their interface, users may struggle to discover which sentences the chatbots will understand and what they can do. It is therefore important to support designers in deciding how to convey the chatbots’ features, as this might determine whether the user will continue chatting or not. In this work, our goal is to analyze the communicative strategies used by popular chatbots when conveying their features to users. We used the Semiotic Inspection Method (SIM) to that end, and as a result we were able to identify a series of strategies used by the analyzed chatbots for conveying their features to users. We then consolidated these findings by analyzing other chatbots. Finally, we discuss the use of these strategies, as well as challenges for designing such interfaces and limitations of using SIM on them.


Author(s):  
Rafael Jiménez ◽  
Vicente García ◽  
Karla Olmos-Sánchez ◽  
Alan Ponce ◽  
Jorge Rodas-Osollo

Social networks have evolved from online sites for interacting with friends into platforms where people, artists, brands, and even presidents interact with crowds of people daily. Airlines are among the companies that use social networks such as Twitter to communicate with their clients through messages with offers, travel recommendations, videos of collaborations with YouTubers, and surveys. Among the many responses to airline tweets, there are users' suggestions on how to improve the airlines' services or processes. These recommendations are essential, since the success of many companies is based on offering what the client wants or needs. A database of tweets was created from user tweets sent to airline accounts on Twitter between July 30, 2019, and August 8, 2019. Natural language processing techniques were used to preprocess the data in the database. The latest classification results using Naive Bayes show an accuracy of 72.44%.
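For readers unfamiliar with the classifier named in this abstract, a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing can be sketched as follows. This is a generic textbook version, not the authors' pipeline: the class and function names, the whitespace tokenizer, the two labels, and the example tweets are all hypothetical, and the sketch makes no attempt to reproduce the reported 72.44% accuracy.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word in self.vocab:
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the given label."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Hypothetical tweets and labels, invented for illustration.
tweets = [
    "please add more legroom",
    "you should offer vegan meals",
    "thanks for the great flight",
    "love the new video",
]
labels = ["suggestion", "suggestion", "other", "other"]
model = NaiveBayes().fit(tweets, labels)
print(model.predict("please offer more meals"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing keeps a word unseen in one class from zeroing out that class entirely.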

