Real-Time Streaming Data Analysis Using a Three-Way Classification Method for Sentimental Analysis

Author(s):  
Srinidhi Hiriyannaiah ◽  
G.M. Siddesh ◽  
K.G. Srinivasa

This article describes how recent advances in computing have increased the volume of data generated in fields such as social media, medicine, and power. With the rapid growth in internet users, social media has become a major source for sentiment analysis, or opinion mining. Storing, querying, and analyzing such data is highly challenging. This article aims to provide a solution for storing, querying, and analyzing streaming data, using Apache Kafka as the platform and Twitter data as the example for analysis. A three-way classification method is proposed for sentiment analysis of Twitter data that combines the knowledge-based and machine-learning approaches in three stages: emotion classification, word classification, and sentiment classification. The hybrid three-way classification approach was evaluated on a sample of five Twitter query strings and compared with an existing emotion classifier, polarity classifier, and Naïve Bayes classifier for sentiment analysis. The proposed approach achieved higher accuracy than the existing approaches.

2020 ◽  
pp. 1377-1390


Author(s):  
Yufang Wang ◽  
Kuai Xu ◽  
Yun Kang ◽  
Haiyan Wang ◽  
Feng Wang ◽  
...  

The large volume of geotagged Twitter streaming data on flu epidemics gives researchers the opportunity to explore, model, and predict trends in flu cases in a timely manner. However, the explosive growth of data from social media makes data sampling a natural choice. In this paper, we develop a method for influenza prediction based on real-time tweet data from social media; the method supports real-time prediction and is applicable to sampled data. Specifically, we first simulate the sampling process of flu tweets, and then develop a specific partial differential equation (PDE) model to characterize and predict the aggregated flu tweet volumes. Our PDE model incorporates the effects of flu spreading, flu recovery, and active human interventions for reducing flu. Our extensive simulation results show that this PDE model can almost eliminate the data reduction effects of the sampling process: it requires less historical data but achieves stronger prediction results, with a relative accuracy of over 90% on 1% sampled data. Even for more aggressive sampling ratios such as 0.1% and 0.01%, our model still achieves relative accuracies of 85% and 83%, respectively. These promising results highlight the ability of our mechanistic PDE model to predict temporal–spatial patterns of flu trends even with sparsely sampled Twitter data.


With the growth of web technology, an enormous volume of information is available to internet users, and much more is generated every day. The internet has emerged as a place for exchanging ideas, sharing opinions and political views, and learning online. Social networking sites such as Facebook and Twitter are growing rapidly because they allow users to post and reveal their views on various topics, hold discussions with different groups and communities, and post messages across the world. A large number of researchers are working in the area of sentiment analysis. The main focus is on Twitter data, where sentiment analysis helps examine the information in tweets, in which opinions are heterogeneous, highly unstructured, and in many cases positive, negative, or neutral. In this paper, we provide a study and comparative analysis of existing machine learning techniques for opinion mining, namely Naive Bayes and Support Vector Machine, applied to Twitter data.
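As a rough illustration of the Naive Bayes technique this abstract names, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the authors' implementation; the function names `train_nb` and `predict_nb`, the whitespace tokenizer, and the four training sentences are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Count documents per label and word occurrences per label.

    samples: list of (text, label) pairs (hypothetical toy data below).
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothed log likelihood
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled tweets, for illustration only
training = [
    ("love this phone", "positive"),
    ("great service today", "positive"),
    ("hate the delay", "negative"),
    ("terrible service again", "negative"),
]
model = train_nb(training)
label = predict_nb(model, "love the great phone")
```

A real comparison of the kind described would use a labelled tweet corpus, proper tokenization, and a held-out test set; the sketch only shows the counting and scoring steps.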


Author(s):  
Navonil Majumder ◽  
Soujanya Poria ◽  
Devamanyu Hazarika ◽  
Rada Mihalcea ◽  
Alexander Gelbukh ◽  
...  

Emotion detection in conversations is a necessary step for a number of applications, including opinion mining over chat history, social media threads, debates, argumentation mining, understanding consumer feedback in live conversations, and so on. Current systems do not treat the parties in the conversation individually by adapting to the speaker of each utterance. In this paper, we describe a new method based on recurrent neural networks that keeps track of the individual party states throughout the conversation and uses this information for emotion classification. Our model outperforms the state of the art by a significant margin on two different datasets.


2020 ◽  
Vol 8 (5) ◽  
pp. 4219-4224

Social media has emerged as one of the key channels for reaching disaster-affected people, as it supplements planning and operational coordination. Sentiment analysis is used to identify, extract, or characterize subjective information, such as opinions, expressed in a tweet. Typically, the sentiment expressed is classified as positive or negative, which is not versatile enough to capture the exact sentiment conveyed by the user. Opinion mining is a machine learning process used to extract information conveyed by the user in the form of text. In this paper, a lexical approach to sentiment analysis of Twitter data is employed. Conventionally, sentiment is conveyed using the polarity of the data, but in this paper sentiment intensity is used instead. Performing sentiment analysis on tweets yields the sentiment intensity conveyed by the user, which in turn is used to calculate the severity of the disaster event the user describes and to classify the tweets by severity. This paper proposes a methodology to extract relevant sentiment information from a Location Based Social Network (LBSN) and suggests a unique scale for classifying this information to help disaster management authorities.
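The idea of scoring sentiment intensity from a lexicon and mapping it to a severity class could be sketched as follows. This is only an illustration under stated assumptions: the lexicon entries, their numeric values, the averaging rule, and the severity thresholds are all hypothetical and are not taken from the paper, whose actual scale is not given in the abstract.

```python
# Hypothetical intensity lexicon: words and values are illustrative only.
LEXICON = {
    "destroyed": -0.9,
    "trapped": -0.8,
    "flooded": -0.6,
    "damaged": -0.5,
    "safe": 0.6,
    "rescued": 0.7,
}

def sentiment_intensity(tweet):
    """Average the lexicon scores of the words found in the tweet."""
    tokens = [t.strip(".,!?#").lower() for t in tweet.split()]
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def severity(intensity):
    """Map an intensity to a severity class (thresholds are assumptions)."""
    if intensity <= -0.7:
        return "high"
    if intensity <= -0.3:
        return "medium"
    return "low"

# Made-up example tweets
tweets = [
    "Family trapped, house destroyed!",
    "Roads flooded and damaged",
    "Everyone is safe",
]
classified = [(t, severity(sentiment_intensity(t))) for t in tweets]
```

Using a continuous intensity rather than a binary polarity is what lets tweets be ranked by severity instead of merely split into positive and negative.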


2018 ◽  
Vol 7 (4.5) ◽  
pp. 374
Author(s):  
Yazala Ritika Siril Paul ◽  
Dilipkumar A. Borikar

Sentiment analysis is the process of identifying people’s attitudes and emotional states from the language they use on social websites or other sources. The main aim is to identify a set of potential features in a review and extract the opinion expressions about those features by making full use of their associations. Posting on Twitter has become routine for people around the world, who share thousands of reactions and opinions on every topic, every second of every day. It is like one big psychological database that is constantly updated and can be used to analyze people’s sentiments. Hadoop is one of the best options available for sentiment analysis of Twitter data, and it also works for distributed big data, streaming data, text data, etc. This paper provides an efficient mechanism for performing sentiment analysis, or opinion mining, on Twitter data over the Hortonworks Data Platform, which provides Hadoop on Windows, with the assistance of Apache Flume, Apache HDFS, and Apache Hive.


2021 ◽  
Vol 4 (1) ◽  
pp. 64-72
Author(s):  
Ashif Dzilfiqar Thayyibi ◽  
Juliana Mansur ◽  

The growth in internet users has been accompanied by the development of applications that support interaction among users, known as social media. One of the most popular social media platforms today is Twitter. Twitter data can be visualized as a graph in which nodes represent actors and edges represent relationships between actors. To find the most influential actors and the actors who interacted the most in spreading the Natuna topic on Twitter, this study applies Social Network Analysis using the Degree Centrality approach. The data were collected from December 20, 2019 at 00.00 WIB to January 7, 2020 at 10.00 WIB and consist of 71,477 nodes and 147,066 edges. The results show that the @susipudjiastuti account is the most influential actor and plays an important role in the network, because it is the most linked account, with 29,755 links. Meanwhile, the @shaktia704 account was the most active account during the data collection period, with 259 links.
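Degree centrality, the measure named in this abstract, can be illustrated with a short sketch. It assumes a simple undirected graph in which repeated links count once, and normalizes each node's degree by n - 1; the study itself may have counted directed or repeated links differently. The function name and the toy accounts are hypothetical.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical toy interaction graph (accounts are made up)
edges = [("@a", "@b"), ("@a", "@c"), ("@a", "@d"), ("@b", "@c")]
scores = degree_centrality(edges)
most_central = max(scores, key=scores.get)
```

In this toy graph `@a` is linked to every other account, so it has the highest centrality, which mirrors how the study identifies its most linked account.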


Infoman's ◽  
2018 ◽  
Vol 12 (2) ◽  
pp. 115-124
Author(s):  
Yopi Hidayatul Akbar ◽  
Muhammad Agreindra Helmiawan

Social media is one of the information channels now widely used by companies and individuals to convey information. With social media, companies no longer need to distribute offers through print media; they can use information technology tools, in this case social media, to present the products they sell to users globally. Social media marketing is the process of gaining visits from internet users to certain sites, or public attention, through social media sites. Marketing activities on social media usually center on a company's efforts to create content that attracts attention and encourages readers to share it through their own social networks. The QMS method is applied not only by submitting a site through search engine webmaster tools; keywords related to the site's content must also be applied on the website itself, because those keywords will attract visitors to the university website based on the keyword phrases they type into the search engine. Social Media Marketing (SMM) is one of the techniques that must be applied in sales promotion, especially at car dealers in Bandung. It is considered important because each product's price, features, and convenience need to be publicized through social media so that sales traffic can increase. Each dealer should be able to apply SMM techniques well so that car sales reach the expected target and generate profits for the salespeople in the field.


2018 ◽  
Vol 6 (1) ◽  
pp. 60
Author(s):  
Ranny Rastati

In 2017, internet users in Indonesia were predominantly aged 19-34, accounting for 49.52% of users (APJI, 2017). These data show that almost half of Indonesia's internet users are digital natives, that is, native speakers of digital technology born after 1980: Generation Y (1980-1995) and Generation Z (1996-2009). This research focuses on Generation Z, regarded as the true internet generation. Generation Z was born when the internet was already available, in contrast to Generation Y, which experienced the technological transition toward the internet. The purpose of this research is to find an effective way of providing information about media literacy to Generation Z. The study used a descriptive qualitative method, with in-depth interviews and observation of 12 university students in Jakarta. The results show four effective ways of providing information about media literacy: i) videos distributed on social media such as YouTube and Instagram, ii) interesting memes in easily understood language, iii) selebgram, or Instagram micro-celebrities, who are considered role models and have a positive image, and iv) roadside billboards. Another interesting finding is that male informants tend to prefer media literacy information delivered through videos and memes distributed on social media, while female informants prefer campaigns conducted by selebgram with a positive image and billboards.


2019 ◽  
Vol 23 (1) ◽  
pp. 346-357
Author(s):  
Vithya G ◽  
Naren J ◽  
Varun V
