Cluster-discovery of Twitter messages for event detection and trending

2021 ◽
Author(s):
Shakira Banu Kaleel

Social media data carry abundant hidden traces of real-time events around the world, which raises the demand for an efficient event detection and trending system. The Locality Sensitive Hashing (LSH) technique is capable of processing large-scale datasets. In this thesis, a novel framework is proposed for detecting and trending events from tweet clusters in a Twitter dataset that are discovered using LSH. The experimental results showed that LSH took only 12.99% of the running time required by K-means to find all of the tweet clusters. Key challenges include: 1) constructing a dictionary using incremental TF-IDF in high-dimensional data in order to create tweet feature vectors; 2) leveraging LSH to find truly interesting events; 3) trending the behavior of events based on time, geo-location and cluster size; and 4) speeding up the cluster-discovery process while retaining cluster quality.
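
The idea of hashing TF-IDF tweet vectors into LSH buckets can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a batch (not incremental) TF-IDF with a smoothed IDF, uses random-hyperplane signatures as the LSH family, and the names `tfidf_vectors` and `lsh_buckets` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized doc.

    Assumption: batch TF-IDF with smoothed IDF, not the incremental variant.
    """
    n = len(docs)
    # Document frequency: in how many docs each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = max(len(doc), 1)  # guard against empty tweets
        vectors.append({
            term: (count / length) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def lsh_buckets(vectors, num_planes=8, seed=0):
    """Group vectors by the sign pattern of their projections onto random hyperplanes."""
    rng = random.Random(seed)
    vocab = sorted({term for vec in vectors for term in vec})
    # One Gaussian normal vector per hyperplane, stored sparsely by term.
    planes = [{term: rng.gauss(0.0, 1.0) for term in vocab} for _ in range(num_planes)]
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        signature = tuple(
            sum(weight * plane[term] for term, weight in vec.items()) >= 0.0
            for plane in planes
        )
        buckets[signature].append(idx)
    return dict(buckets)


tweets = [
    ["earthquake", "hits", "city"],
    ["earthquake", "hits", "city"],
    ["new", "phone", "launch"],
]
vectors = tfidf_vectors(tweets)
buckets = lsh_buckets(vectors, num_planes=8, seed=1)
```

Tweets with identical vectors always share a bucket, and similar vectors share one with high probability, so candidate clusters are found without comparing every pair of tweets. This is the property that lets an LSH approach run faster than K-means on large collections.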


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yasmeen George ◽  
Shanika Karunasekera ◽  
Aaron Harwood ◽  
Kwan Hui Lim

A key challenge in mining social media data streams is to identify events that are actively discussed by a group of people in a specific local or global area. Such events are useful for early warning of accidents, protests, elections or breaking news. However, neither the list of events nor the resolution of event time and space is fixed or known beforehand. In this work, we propose an online spatio-temporal event detection system using social media that is able to detect events at different time and space resolutions. First, to address the unknown spatial resolution of events, a quad-tree method is used to split the geographical space into multiscale regions based on the density of social media data. Then, a statistical unsupervised approach involving the Poisson distribution and a smoothing method highlights regions with an unexpected density of social posts. Further, event duration is estimated by merging events that happen in the same region at consecutive time intervals. A post-processing stage is introduced to filter out events that are spam, fake or wrong. Finally, we incorporate simple semantics by using social media entities to assess the integrity and accuracy of detected events. The proposed method is evaluated on Twitter and Flickr datasets for four cities: Melbourne, London, Paris and New York. To verify the effectiveness of the proposed method, we compare our results with two baseline algorithms based on a fixed split of the geographical space and on a clustering method. For performance evaluation, we manually compute recall and precision. We also propose a new quality measure named strength index, which automatically measures how accurate the reported event is.
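
The two core steps described above, density-driven quad-tree splitting and Poisson-based detection of unexpected post counts, can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the split criterion (`max_points`), the significance level `alpha`, and the function names are all hypothetical, the smoothing step is omitted, and points are assumed to lie inside the half-open bounding box.

```python
import math


def quadtree_split(points, bounds, max_points=4, max_depth=6):
    """Recursively split bounds=(x0, y0, x1, y1) until each leaf holds <= max_points.

    Returns a list of (bounds, points) leaves; dense areas get smaller regions.
    """
    x0, y0, x1, y1 = bounds
    if len(points) <= max_points or max_depth == 0:
        return [(bounds, points)]
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [(x0, y0, xm, ym), (xm, y0, x1, ym), (x0, ym, xm, y1), (xm, ym, x1, y1)]
    leaves = []
    for qx0, qy0, qx1, qy1 in quadrants:
        inside = [p for p in points if qx0 <= p[0] < qx1 and qy0 <= p[1] < qy1]
        leaves.extend(quadtree_split(inside, (qx0, qy0, qx1, qy1), max_points, max_depth - 1))
    return leaves


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def is_unexpected(count, expected, alpha=0.01):
    """Flag a region whose post count is improbably high given its expected rate."""
    return poisson_tail(count, expected) < alpha


posts = [(0.1, 0.1), (0.12, 0.11), (0.3, 0.3), (0.4, 0.1), (0.1, 0.4), (0.9, 0.9)]
leaves = quadtree_split(posts, (0.0, 0.0, 1.0, 1.0), max_points=2)
```

Here `expected` would come from a region's historical post rate for the same time window; a region is reported as a candidate event when the observed count falls in the upper tail of that Poisson distribution.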


2021 ◽  
Author(s):  
Hansi Hettiarachchi ◽  
Mariam Adedoyin-Olowe ◽  
Jagdev Bhogal ◽  
Mohamed Medhat Gaber

Social media is becoming a primary medium for discussing what is happening around the world. The data generated by social media platforms therefore contain rich information describing ongoing events, and their timeliness can support immediate insights. However, given the dynamic nature and high volume of social media data streams, it is impractical to filter events manually, so automated event detection mechanisms are invaluable to the community. Apart from a few notable exceptions, most previous research on automated event detection has focused only on statistical and syntactical features of the data and has not involved the underlying semantics, which are important for effective information retrieval from text because they represent the connections between words and their meanings. In this paper, we propose a novel method termed Embed2Detect for event detection in social media that combines the characteristics of word embeddings with hierarchical agglomerative clustering. The adoption of word embeddings gives Embed2Detect the capability to incorporate powerful semantic features into event detection and to overcome a major limitation of previous approaches. We evaluated our method on two recent real social media data sets, representing the sports and political domains, and compared the results to several state-of-the-art methods. The results show that Embed2Detect is capable of effective and efficient event detection and that it outperforms recent event detection methods. For the sports data set, Embed2Detect achieved a 27% higher F-measure than the best-performing baseline, and for the political data set the increase was 29%.
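
As a rough illustration of how hierarchical agglomerative clustering groups words by embedding similarity, the sketch below merges word vectors with average linkage over cosine distance. It is not Embed2Detect itself: the two-dimensional vectors are made-up toy values rather than trained embeddings, and the stopping `threshold` and function names are assumptions.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative_clusters(vectors, threshold):
    """Average-linkage merging until the closest pair of clusters exceeds threshold."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy "embeddings" (hypothetical values): two sports words, two political words.
words = ["goal", "score", "vote", "election"]
toy_vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
groups = agglomerative_clusters(toy_vectors, threshold=0.2)
```

With these toy values the sports words and the political words end up in separate clusters. In a streaming setting, a change in such word groupings between consecutive time windows is the kind of signal an embedding-based detector can use to flag an event.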


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 114851-114861 ◽  
Author(s):  
Zhiguang Zhou ◽  
Xinlong Zhang ◽  
Xiaoyun Zhou ◽  
Yuhua Liu

2019 ◽  
Vol 38 (5) ◽  
pp. 633-650 ◽  
Author(s):  
Josh Pasek ◽  
Colleen A. McClain ◽  
Frank Newport ◽  
Stephanie Marken

Researchers hoping to make inferences about social phenomena using social media data need to answer two critical questions: What is it that a given social media metric tells us? And who does it tell us about? Drawing from prior work on these questions, we examine whether Twitter sentiment about Barack Obama tells us about Americans’ attitudes toward the president, the attitudes of particular subsets of individuals, or something else entirely. Specifically, using large-scale survey data, this study assesses how patterns of approval among population subgroups compare to tweets about the president. The findings paint a complex picture of the utility of digital traces. Although attention to subgroups improves the extent to which survey and Twitter data can yield similar conclusions, the results also indicate that sentiment surrounding tweets about the president is no proxy for presidential approval. Instead, after adjusting for demographics, these two metrics tell similar macroscale, long-term stories about presidential approval but very different stories at a more granular level and over shorter time periods.


Author(s):  
Suppawong Tuarob ◽  
Conrad S. Tucker

The authors of this work propose a Knowledge Discovery in Databases (KDD) model for predicting product market adoption and longevity using large-scale social media data. Social media data, available through sites such as Twitter® and Facebook®, have been shown to be leading indicators and predictors of events ranging from influenza spread to financial stock market prices and movie revenues. The ubiquitous and colloquial nature of social media allows users to express their opinions honestly in a unified, dynamic manner. This makes social media a relatively new data-gathering source that can appeal to designers and enterprise decision makers aiming to understand consumers' response to their upcoming or newly launched products. Existing design methodologies for leveraging large-scale data have traditionally relied on product reviews available on the internet to mine product information. However, such web reviews often come from disparate sources, making the aggregation and knowledge discovery process quite cumbersome, especially for reviews of poorly received products. Furthermore, such web reviews have not been shown to be strong indicators of new product market adoption. In this paper, the authors demonstrate how social media can be used to predict and mine information relating to product features, product competition and market adoption. In particular, the authors analyze the sentiment in tweets and use the results to predict product sales. The authors present a mathematical model that quantifies the correlations between social media sentiment and product market adoption in an effort to compute the ability of individual products to stay in the market. The proposed technique involves computing the Subjectivity, Polarity, and Favorability of the product. Finally, the authors utilize Information Retrieval techniques to mine users’ opinions about strong, weak, and controversial features of a given product model. The authors evaluate their approaches using real-world smartphone data obtained from www.statista.com and www.gsmarena.com.


Author(s):  
Xiaomo Liu ◽  
Armineh Nourbakhsh ◽  
Quanzhi Li ◽  
Sameena Shah ◽  
Robert Martin ◽  
...  

2020 ◽  
Vol 376 ◽  
pp. 244-255 ◽  
Author(s):  
Zhiguang Zhou ◽  
Xinlong Zhang ◽  
Zhiyong Guo ◽  
Yuhua Liu
