A ML and NLP based Framework for Sentiment Analysis on Bigdata

Big data has multiple sources, and social media is one of them. Such data is rich in people's opinions and needs an automated approach based on Natural Language Processing (NLP) and Machine Learning (ML) to obtain and summarize social feedback. With ML as an integral part of Artificial Intelligence (AI), machines can demonstrate intelligence of the kind exhibited by humans, and ML is widely used in different domains. With the proliferation of Online Social Networks (OSNs), people from all walks of life exchange their views instantly. OSNs have thus become platforms where people's opinions, including social feedback on products and services, are available. For instance, Twitter produces large volumes of such data, which enterprises can use to garner Business Intelligence (BI) for making expert decisions. In addition to traditional feedback systems, the feedback (opinions) on social networks adds depth to the intelligence used to revise strategies and policies. Sentiment analysis is the technique employed to analyze opinions and classify them as positive, negative, or neutral. Existing studies have usually treated overall sentiment analysis and aspect-based sentiment analysis in isolation, introducing a variety of methods to analyse either overall sentiments or aspect-level sentiments, but not both. The use of a probabilistic topic model is a novel approach in sentiment analysis. In this paper, we propose a framework for comprehensive analysis of overall and aspect-based sentiments. The framework is realized with aspect-based topic modelling for sentiment analysis and ensemble learning algorithms. It also employs several supervised ML algorithms. Benchmark datasets from the international SemEval conferences are used for the empirical study. Experimental results revealed the efficiency of the proposed framework compared with the state of the art.

2021 ◽  
Vol 16 (6) ◽  
pp. 2031-2050
Author(s):  
Alireza Mohammadi ◽  
Seyyed Alireza Hashemi Golpayegani

Online social networks, as popular media and communication tools with extensive uses of their own, play key roles in public opinion polls, politics, the economy, and even governance. An important issue regarding these networks is the use of multiple sources for publishing or re-publishing news and propositions, which can influence audiences depending on how much users trust these sources. Estimating the level of trust between users in social networks can therefore predict the extent of the networks' impact on news and on different publication and re-publication sources, and correspondingly provide effective strategies for disseminating news, advertisements, and other diverse content for trustees. Trust is accordingly introduced and interpreted in the present study. A large portion of interactions in social networks is based on sending and receiving texts, which can be analysed with natural language processing techniques. An efficient model based on a Hidden Markov Model (HMM), namely SenseTrust, was designed to estimate the level of trust between users in social networks.
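The abstract does not describe the internals of SenseTrust, so the following is only a minimal sketch of how an HMM could track trust from message text, assuming two hidden states (`trust`, `distrust`) and a per-message sentiment label as the observation. The state names, the probabilities, and the function `forward_filter` are all hypothetical and are not taken from the paper.

```python
def forward_filter(observations, states, start_p, trans_p, emit_p):
    """Return P(hidden state | all observations so far) using the HMM forward pass."""
    if not observations:
        raise ValueError("at least one observation is required")

    # Initial belief: prior times the likelihood of the first observation.
    unnorm = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    norm = sum(unnorm.values())
    belief = {s: v / norm for s, v in unnorm.items()}

    for obs in observations[1:]:
        # Predict step: push the belief through the transition model.
        predicted = {
            s: sum(belief[r] * trans_p[r][s] for r in states) for s in states
        }
        # Update step: weight by the emission likelihood and renormalise.
        unnorm = {s: predicted[s] * emit_p[s][obs] for s in states}
        norm = sum(unnorm.values())
        belief = {s: v / norm for s, v in unnorm.items()}
    return belief


# Hypothetical parameters, chosen only for illustration.
STATES = ("trust", "distrust")
START = {"trust": 0.5, "distrust": 0.5}
TRANS = {
    "trust": {"trust": 0.9, "distrust": 0.1},
    "distrust": {"trust": 0.1, "distrust": 0.9},
}
EMIT = {
    "trust": {"pos": 0.8, "neg": 0.2},
    "distrust": {"pos": 0.3, "neg": 0.7},
}

belief = forward_filter(["pos", "pos", "neg", "pos"], STATES, START, TRANS, EMIT)
```

In this sketch the sentiment labels would come from an NLP step applied to the messages exchanged between two users, and `belief["trust"]` would serve as the estimated trust level.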


2021 ◽  
pp. 1-13
Author(s):  
C S Pavan Kumar ◽  
L D Dhinesh Babu

Sentiment analysis is widely used to retrieve the hidden sentiments in medical discussions on Online Social Networking platforms such as Twitter, Facebook, and Instagram. People often convey their feelings about their medical problems on social media platforms. Practitioners and health care workers have started to observe these discussions to assess the impact of health-related issues among the people. This helps in providing better care to improve the quality of life. Dementia is a serious disease in western countries like the United States of America and the United Kingdom, and the respective governments are providing facilities to the affected people. There is much chatter on social media platforms concerning patients' care, healthy measures to follow to avoid the disease, and checks for early indications. This chatter has to be carefully monitored to help officials take the necessary precautions for the betterment of the affected. A novel feature engineering architecture involving feature-split is proposed for sentiment analysis of medical chatter over online social networks, together with a pipeline that can be used with any machine learning model. The proposed model uses a fuzzy membership function to refine the outputs. The sentiment score obtained by the machine learning model is subjected to fuzzification and defuzzification using the trapezoid membership function and the center of sums method, respectively. Three datasets are considered for comparing the proposed and the regular model. The proposed approach delivered better results than the regular approach and proved to be effective for sentiment analysis of medical discussions over online social networks.
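The fuzzification and defuzzification step can be illustrated with a small sketch. It assumes a sentiment score on the axis [-1, 1] and three fuzzy sets whose trapezoid breakpoints are invented here for illustration; the paper's actual sets, breakpoints, and rule base are not given in the abstract. Center of sums is implemented in its discretised form, where the clipped sets are summed (overlaps counted once per set) before taking the weighted average.

```python
# Hypothetical trapezoid breakpoints (a, b, c, d) on a sentiment axis of [-1, 1].
SETS = {
    "negative": (-1.0, -1.0, -0.6, -0.1),
    "neutral": (-0.4, -0.1, 0.1, 0.4),
    "positive": (0.1, 0.6, 1.0, 1.0),
}


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a <= b <= c <= d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


def fuzzify(score):
    """Membership degree of a crisp score in each fuzzy set."""
    return {name: trapezoid(score, *abcd) for name, abcd in SETS.items()}


def defuzzify_center_of_sums(degrees, steps=200):
    """Discretised center of sums over the output sets clipped at their degrees."""
    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        x = -1.0 + 2.0 * i / steps
        # Sum the clipped memberships of all sets at this grid point.
        summed = sum(
            min(degrees[name], trapezoid(x, *abcd)) for name, abcd in SETS.items()
        )
        numerator += x * summed
        denominator += summed
    # If no set fires (score outside [-1, 1]), fall back to a neutral 0.0.
    return numerator / denominator if denominator > 0 else 0.0


def refine(score):
    """Fuzzify a raw model score, then defuzzify it back to a crisp value."""
    return defuzzify_center_of_sums(fuzzify(score))
```

Calling `refine(raw_score)` on the output of any classifier that emits a score in [-1, 1] gives the refined score; in practice the breakpoints would need to be tuned to the score distribution of the chosen model.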


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-13 ◽  
Author(s):  
Yanni Liu ◽  
Dongsheng Liu ◽  
Yuwei Chen

With the rapid development of the mobile Internet, the social network has become an important platform for users to receive, release, and disseminate information. In order to obtain more valuable information and implement effective supervision of public opinion, it is necessary to study public opinion, sentiment tendency, and the evolution of hot events in the social networks of a smart city. In view of social networks' characteristics, such as short text, rich topics, diverse sentiments, and timeliness, this paper conducts text modeling with word co-occurrence based on the topic model. In addition, sentiment computing and the time factor are incorporated to construct the dynamic topic-sentiment mixture model (TSTS). Four hot events were then randomly selected from the microblog as datasets to evaluate the TSTS model in terms of topic feature extraction, sentiment analysis, and time change. The results show that the TSTS model is better than the traditional models in topic extraction and sentiment analysis. Meanwhile, by fitting the time curve of hot events, the change rules of comments in the social network are obtained.


2022 ◽  
pp. 255-263
Author(s):  
Chirag Visani ◽  
Vishal Sorathiya ◽  
Sunil Lavadiya

The popularity of the internet has increased the use of e-commerce websites and news channels. Fake news has been around for many years, but with the arrival of social media and modern-day news at its peak, easy access to e-platforms and the exponential growth of the information available on social media networks have made it difficult to differentiate between right and wrong information, which has already had large effects on offline society. A crucial goal in improving the trustworthiness of data in online social networks is to spot fake news, so the detection of spam news becomes important. For sentiment mining, the authors specialise in leveraging Facebook, Twitter, and WhatsApp, the most prominent microblogging platforms. They illustrate how to assemble a corpus automatically for sentiment analysis and opinion mining. Using the corpus, they create a sentiment classifier that can distinguish between fake, real, and neutral opinions in a document.


Author(s):  
Athanasios Kokkos ◽  
Theodoros Tzouramanis

Online social networking services have come to dominate the dot com world: countless online communities coexist on the social Web. Some characteristic user attributes, such as gender, age group, and sexual orientation, are not automatically part of the profile information. In some cases user attributes can even be deliberately and maliciously falsified. This paper examines automated inference of gender on online social networks by analyzing written text with a combination of natural language processing and classification techniques. Extensive experimentation on LinkedIn and Twitter has yielded an accuracy of up to 98.4 percent for this gender identification technique.


2022 ◽  
Vol 29 (1) ◽  
pp. 11-27
Author(s):  
Alan Keller Gomes ◽  
Kaique Matheus Rodrigues Cunha ◽  
Guilherme Augusto da Silva Ferreira

We present in this paper a novel approach for measuring Bourdieusian Social Capital (BSC) within Institutional Pages and Profiles. We analyse Facebook's Institutional Pages and Twitter's Institutional Profiles. Supported by Pierre Bourdieu's theory, we search for directions to identify and capture data related to sociability practices, i.e. actions performed such as Like, Comment and Share. The system of symbolic exchanges and mutual recognition treated by Pierre Bourdieu is represented and extracted automatically from these data in the form of generalized sequential patterns. In this format, the social interactions captured from each page are represented as sequences of actions. Next, we use these data to measure the frequency of occurrence of each sequence. From such frequencies, we compute the effective mobilization capacity. Finally, the volume of BSC is computed based on the capacity of effective mobilization, the number of social interactions captured and the number of followers on each page. The results are aligned with Bourdieu's theory. The approach can be generalized to institutional pages or profiles in Online Social Networks.
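The step of measuring how often each sequence of actions occurs can be sketched as follows. This is only a plain frequency count over whole sequences, not the generalized sequential pattern mining the authors use, and the abstract does not give the formulas for effective mobilization capacity or BSC volume, so those are not attempted. The function name and the sample data are hypothetical.

```python
from collections import Counter


def sequence_frequencies(interactions):
    """Relative frequency of each distinct action sequence among the captured interactions."""
    # Lists are not hashable, so each sequence is converted to a tuple before counting.
    counts = Counter(tuple(seq) for seq in interactions)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


# Hypothetical interactions captured from one page, each a sequence of actions.
captured = [
    ["Like"],
    ["Like", "Comment"],
    ["Like"],
    ["Like", "Share"],
]
frequencies = sequence_frequencies(captured)
```

The resulting dictionary maps each sequence, such as `("Like", "Comment")`, to its share of all captured interactions on the page.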


Author(s):  
Gabriel Tavares ◽  
Saulo Mastelini ◽  
Sylvio Jr.

This paper proposes a technique for classifying user accounts to detect fraud in Online Social Networks (OSNs). The main purpose of our classification is to recognize the patterns of human, bot, and cyborg users. Classic and consolidated Text Mining approaches employ textual features from Natural Language Processing (NLP) for classification, but drawbacks such as computational cost and the huge amount of data can arise in real-life scenarios. This work uses an approach based on statistical frequency parameters of user posting to distinguish the types of users without textual content. We perform the experiment on a Twitter dataset and, as learning-based algorithms for the classification task, compare Random Forest (RF), Support Vector Machine (SVM), k-nearest Neighbors (k-NN), Gradient Boosting Machine (GBM) and Extreme Gradient Boosting (XGBoost). Using the standard parameters of each algorithm, we achieved accuracy results of 88% and 84% with RF and XGBoost, respectively.
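As an illustration of content-free features, the sketch below derives a few posting-frequency statistics from post timestamps alone. The abstract does not list the exact parameters used, so the three features here (posts per day, mean gap, and standard deviation of the gap between posts) and the function name `posting_features` are assumptions.

```python
from statistics import mean, pstdev

SECONDS_PER_DAY = 86400


def posting_features(timestamps):
    """Frequency features from Unix timestamps (seconds) of one account's posts."""
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("at least two posts are required")

    # Time between consecutive posts, in seconds.
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    span_days = (ts[-1] - ts[0]) / SECONDS_PER_DAY

    return {
        # If all posts share one timestamp, fall back to the raw post count.
        "posts_per_day": len(ts) / span_days if span_days > 0 else float(len(ts)),
        "mean_gap_s": mean(gaps),
        # A very low deviation suggests machine-like regularity.
        "std_gap_s": pstdev(gaps),
    }


# Hypothetical account that posts exactly once per hour for a day.
hourly = posting_features([i * 3600 for i in range(25)])
```

Feature dictionaries of this kind, one per account, could then be turned into rows of a feature matrix and passed to any of the classifiers compared in the paper.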

