Classification Connection of Twitter Data using K-Means Clustering

The rise of social media platforms such as Twitter, and their growing adoption by people who want to stay connected, provides a large source of data for analyzing trends, events, and even individual personalities. Such analysis also gives real-time insight into a person’s likes and inclinations, independent of the data size. Several techniques have been developed to retrieve such data; however, the most efficient technique is clustering. This paper gives an overview of the algorithms behind the various clustering methods and examines their efficiency in determining trending information. The clustered data may be further classified by topic for real-time analysis of a large dynamic data set. In this paper, data classification is performed and analyzed for flaws, followed by another classification on the same data set.
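The clustering step described above can be illustrated with a minimal k-means sketch. This is not the paper's implementation: the whitespace tokenizer, the term-frequency vectors, and the function names `vectorize` and `kmeans` are assumptions made only for illustration.

```python
import random
from collections import Counter


def vectorize(texts):
    """Turn each text into a term-frequency vector over a shared vocabulary."""
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        v = [0.0] * len(vocab)
        for word, count in Counter(toks).items():
            v[index[word]] = float(count)
        vectors.append(v)
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means: alternate nearest-centroid assignment and mean update."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy tweets on two unrelated topics.
tweets = [
    "election vote poll",
    "vote election debate",
    "football goal match",
    "goal football league",
]
labels = kmeans(vectorize(tweets), k=2)
```

On real tweets, a weighting such as TF-IDF and a better initialization would likely matter; both are left out here to keep the two k-means steps visible.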

Author(s):  
Ritesh Srivastava ◽  
M.P.S. Bhatia

Twitter behaves as a social sensor of the world. The tweets provided by the Twitter Firehose exhibit the properties of big data (i.e., volume, variety, and velocity). With millions of users on Twitter, its virtual communities now replicate real-world communities. Consequently, real-world events are very often discussed on Twitter. This work performs real-time analysis of the tweets related to a targeted event (e.g., an election) to identify potential sub-events that occurred in the real world, were discussed on Twitter, and caused a significant change over time in the aggregated sentiment score of the targeted event. This type of analysis can enrich the real-time decision-making ability of the event bearer. The proposed approach uses a three-step process: (1) real-time sentiment analysis of tweets; (2) application of Bayesian change point detection to determine the sentiment change points; and (3) detection of the major sub-events that influenced the sentiment of the targeted event. The approach was evaluated on Twitter data from the 2015 Delhi election.
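Step (2) can be illustrated with a deliberately simplified sketch. It assumes the aggregated sentiment scores are already available as a list, that there is exactly one change point, and that each segment has a Gaussian mean with prior N(0, tau2) and known noise variance sigma2. The abstract does not specify the authors' detector, so this offline single-change-point model and the names `segment_log_evidence` and `change_point_posterior` are stand-ins for illustration only.

```python
import math


def segment_log_evidence(xs, sigma2=1.0, tau2=1.0):
    """Log marginal likelihood of one segment with its unknown mean integrated out.

    Model assumption: x_i = mu + noise, mu ~ N(0, tau2), noise ~ N(0, sigma2).
    """
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    quad = (ss - tau2 * s * s / (sigma2 + n * tau2)) / sigma2
    return (
        -0.5 * n * math.log(2 * math.pi * sigma2)
        - 0.5 * math.log(1 + n * tau2 / sigma2)
        - 0.5 * quad
    )


def change_point_posterior(scores, sigma2=1.0, tau2=1.0):
    """Posterior over the split position t (1 <= t < n) under a uniform prior.

    Entry i of the result is the probability that the change happens between
    scores[i] and scores[i + 1].
    """
    n = len(scores)
    log_post = [
        segment_log_evidence(scores[:t], sigma2, tau2)
        + segment_log_evidence(scores[t:], sigma2, tau2)
        for t in range(1, n)
    ]
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical aggregated sentiment scores that drop halfway through.
scores = [0.5] * 10 + [-0.5] * 10
posterior = change_point_posterior(scores, sigma2=0.05)
most_likely_split = max(range(len(posterior)), key=posterior.__getitem__) + 1
```

A streaming setting with several change points would need an online method; the sketch only shows how a posterior over one change location follows from segment evidences.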


Author(s):  
Rodrigo Martínez-Castaño ◽  
Juan C. Pichel ◽  
David E. Losada 

In this paper we propose a scalable platform for real-time processing of Social Media data. The platform ingests huge amounts of content, such as Social Media posts or comments, and can support Public Health surveillance tasks. The processing and analytical needs of multiple screening tasks can easily be handled by incorporating user-defined execution graphs. The design is modular and supports different processing elements, such as crawlers to extract relevant content or classifiers to categorise Social Media. We describe here an implementation of a use case built on the platform that monitors Social Media users and detects early signs of depression.
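The idea of a user-defined execution graph can be sketched as follows. This is an assumption-laden illustration, not the platform's API: the function `run_graph`, the stage names, and the toy keyword stages are all hypothetical, and the batch-style execution stands in for what would be a streaming system.

```python
from graphlib import TopologicalSorter


def run_graph(stages, depends_on, inputs):
    """Run processing elements in dependency order.

    stages:     name -> callable taking a list of items and returning a list
    depends_on: name -> set of upstream stage names (empty set for sources)
    inputs:     items handed to every stage that has no upstream stage
    """
    outputs = {}
    for name in TopologicalSorter(depends_on).static_order():
        upstream = depends_on.get(name, ())
        if upstream:
            # A stage with several parents receives their outputs concatenated.
            items = [item for up in sorted(upstream) for item in outputs[up]]
        else:
            items = list(inputs)
        outputs[name] = stages[name](items)
    return outputs


# Hypothetical three-stage graph: crawler -> keyword filter -> toy classifier.
stages = {
    "crawler": lambda items: [p.strip() for p in items],
    "filter": lambda items: [p for p in items if "sleep" in p.lower()],
    "classifier": lambda items: [
        (p, "first_person" if p.lower().startswith("i ") else "other")
        for p in items
    ],
}
depends_on = {"crawler": set(), "filter": {"crawler"}, "classifier": {"filter"}}
posts = [" I cannot sleep lately ", "Great match today", "Sleep tips for students"]
results = run_graph(stages, depends_on, posts)
```

Keeping each stage a plain callable is what makes the design modular in this sketch: a crawler or classifier can be swapped without touching the graph runner.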


2014 ◽  
Vol 2 (45) ◽  
pp. 19338-19346 ◽  
Author(s):  
Alice E. Williams ◽  
Peter J. Holliman ◽  
Matthew J. Carnie ◽  
Matthew L. Davies ◽  
David A. Worsley ◽  
...  

A real-time analysis by STA-FTIR of changes occurring and volatiles evolved during processing of perovskites for PV technology. Solvent retention, presence of chemical species and decomposition of materials can be evaluated to gain insight into material composition.


2017 ◽  
Vol 53 (5) ◽  
pp. 854-856 ◽  
Author(s):  
Eric Janusson ◽  
Harmen S. Zijlstra ◽  
Peter P. T. Nguyen ◽  
Landon MacGillivray ◽  
Julio Martelino ◽  
...  

Real-time UV-Vis/ESI-MS monitoring of Pd2(dba)3 activation provides insight into active species and the effect of activation protocol on their formation.


Author(s):  
JP Kelly

This article examines recent innovations in how television audiences are measured, paying particular attention to the industry's growing efforts to utilize the large bodies of data generated through social media platforms – a paradigm of research known as Big Data. Although Big Data is considered by many in the television industry as a more veracious model of audience research, this essay uses Boyd and Crawford's (2011) ‘Six Provocations of Big Data’ to problematize and interrogate this prevailing industrial consensus. In doing so, this article explores both the affordances and the limitations of this emerging research paradigm – the latter having largely been ignored by those in the industry – and considers the consequences of these developments for the production culture of television more broadly. Although the full impact of the television industry's adoption of Big Data remains unclear, this article traces some preliminary connections between the introduction of these new measurement practices and the production culture of contemporary television. First, I demonstrate how the design of Big Data privileges real-time analysis, which, in turn, encourages increased investment in ‘live’ and/or ‘event’ television. Second, I argue that despite its potential to produce real-time insights, the scale of Big Data actually limits its utility in the context of the creative industries. Third, building on this discussion of the debatable value and applicability of Big Data, I describe how the introduction of social media metrics is further contributing to a ‘data divide’ in which access to these new information data sets is highly uneven, generally favouring institutions over individuals. Taken together, these three different but overlapping developments provide evidence that the introduction of Big Data is already having a notable effect on the television industry in a number of interesting and unexpected ways.


Data ◽  
2020 ◽  
Vol 5 (1) ◽  
pp. 20
Author(s):  
Amir Haghighati ◽  
Kamran Sedig

Through social media platforms, massive amounts of data are being produced. As a microblogging social media platform, Twitter enables its users to post short updates as “tweets” on an unprecedented scale. Once analyzed in aggregate using machine learning (ML) techniques, Twitter data can be an invaluable resource for gaining insight into different domains of discussion and public opinion. However, when existing ML approaches are applied to real-time data streams, covariate shifts in the data (i.e., changes in the distributions of the inputs of ML algorithms) introduce different types of bias and make the outputs uncertain. In this paper, we describe VARTTA (Visual Analytics for Real-Time Twitter datA), a visual analytics system that combines data visualizations, human-data interaction, and ML algorithms to help users monitor, analyze, and make sense of streams of tweets in real time. As a case study, we demonstrate the use of VARTTA in political discussions. VARTTA not only provides users with powerful analytical tools, but also enables them to diagnose and heuristically suggest fixes for errors in the outcome, resulting in a more detailed understanding of the tweets. Finally, we outline several issues to be considered when designing similar visual analytics systems.
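Covariate shift in a tweet stream can be made concrete with a small sketch that compares the token distribution of a reference window against a current window. This is not part of VARTTA as described; the whitespace tokenizer, the choice of Jensen-Shannon divergence, and the names `token_distribution` and `js_divergence` are assumptions for illustration.

```python
import math
from collections import Counter


def token_distribution(texts):
    """Relative frequency of each whitespace-separated token across the texts."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two token distributions.

    The value is 0 for identical distributions and 1 for distributions
    that share no tokens.
    """
    vocab = set(p) | set(q)
    mix = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl_to_mix(a):
        # Kullback-Leibler divergence from a to the mixture distribution.
        return sum(pa * math.log2(pa / mix[w]) for w, pa in a.items() if pa > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


# Hypothetical windows: the topic of discussion changes between them.
reference_window = ["budget vote tonight", "vote on the budget"]
current_window = ["storm warning tonight", "storm closes roads"]
drift = js_divergence(
    token_distribution(reference_window),
    token_distribution(current_window),
)
```

A monitoring loop could flag a window when `drift` exceeds a threshold chosen on historical data; the threshold itself is application-specific and is not suggested here.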

