Embed2Detect: temporally clustered embedded words for event detection in social media

2021 ◽  
Author(s):  
Hansi Hettiarachchi ◽  
Mariam Adedoyin-Olowe ◽  
Jagdev Bhogal ◽  
Mohamed Medhat Gaber

Abstract Social media is becoming a primary medium for discussing what is happening around the world. Therefore, the data generated by social media platforms contain rich information which describes ongoing events. Further, the timeliness associated with these data is capable of facilitating immediate insights. However, considering the dynamic nature and high volume of data production in social media data streams, it is impractical to filter the events manually; therefore, automated event detection mechanisms are invaluable to the community. Apart from a few notable exceptions, most previous research on automated event detection has focused only on statistical and syntactical features in data and lacked the involvement of underlying semantics, which are important for effective information retrieval from text since they represent the connections between words and their meanings. In this paper, we propose a novel method termed Embed2Detect for event detection in social media by combining the characteristics of word embeddings and hierarchical agglomerative clustering. The adoption of word embeddings gives Embed2Detect the capability to incorporate powerful semantic features into event detection and overcome a major limitation inherent in previous approaches. We evaluated our method on two recent real social media data sets, which represent the sports and political domains, and compared the results to several state-of-the-art methods. The obtained results show that Embed2Detect is capable of effective and efficient event detection and that it outperforms the recent event detection methods. For the sports data set, Embed2Detect achieved a 27% higher F-measure than the best-performing baseline, and for the political data set, the increase was 29%.
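The pairing this abstract describes, word vectors grouped by hierarchical agglomerative clustering, can be illustrated with a minimal sketch. It is not the authors' implementation. The two-dimensional vectors in `toy_vectors` are invented for illustration rather than learned from social media text. The function names, the cosine distance, the average linkage and the `threshold` value are all assumptions that the abstract does not specify.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the words of two clusters."""
    total = sum(cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def agglomerative(vectors, threshold):
    """Repeatedly merge the two closest clusters until the closest
    remaining pair is farther apart than `threshold`."""
    clusters = [[word] for word in vectors]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], vectors)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical toy vectors: two "sports" words and two "politics" words.
toy_vectors = {
    "goal": (1.0, 0.1),
    "score": (0.9, 0.2),
    "vote": (0.1, 1.0),
    "election": (0.2, 0.9),
}
clusters = agglomerative(toy_vectors, threshold=0.2)
```

With these toy vectors the sports words and the politics words end up in separate clusters. In a real setting the vectors would come from an embedding model trained on the data stream, and the distance threshold would need tuning.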

2018 ◽  
Author(s):  
Anika Oellrich ◽  
George Gkotsis ◽  
Richard James Butler Dobson ◽  
Tim JP Hubbard ◽  
Rina Dutta

BACKGROUND Dementia is a growing public health concern with approximately 50 million people affected worldwide in 2017 and this number is expected to reach more than 131 million by 2050. The toll on caregivers and relatives cannot be underestimated as dementia changes family relationships, leaves people socially isolated, and affects the finances of all those involved. OBJECTIVE The aim of this study was to explore using automated analysis (i) the age and gender of people who post to the social media forum Reddit about dementia diagnoses, (ii) the affected person and their diagnosis, (iii) relevant subreddits authors are posting to, (iv) the types of messages posted and (v) the content of these posts. METHODS We analysed Reddit posts concerning dementia diagnoses. We used a previously developed text analysis pipeline to determine attributes of the posts as well as their authors to characterise online communications about dementia diagnoses. The posts were also examined by manual curation for the diagnosis provided and the person affected. Furthermore, we investigated the communities these people engage in and assessed the contents of the posts with an automated topic gathering technique. RESULTS Our results indicate that the majority of posters in our data set are women, and it is mostly close relatives such as parents and grandparents that are mentioned. Both the communities frequented and topics gathered reflect not only the sufferer's diagnosis but also potential outcomes, e.g. hardships experienced by the caregiver. The trends observed from this dataset are consistent with findings based on qualitative review, validating the robustness of social media automated text processing. 
CONCLUSIONS This work demonstrates the value of social media data sources as a resource for in-depth studies of those affected by a dementia diagnosis and the potential to develop novel support systems based on their real time processing in line with the increasing digitalisation of medical care.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yasmeen George ◽  
Shanika Karunasekera ◽  
Aaron Harwood ◽  
Kwan Hui Lim

Abstract A key challenge in mining social media data streams is to identify events which are actively discussed by a group of people in a specific local or global area. Such events are useful for early warning of accidents, protests, elections or breaking news. However, neither the list of events nor the resolution of both event time and space is fixed or known beforehand. In this work, we propose an online spatio-temporal event detection system using social media that is able to detect events at different time and space resolutions. First, to address the challenge related to the unknown spatial resolution of events, a quad-tree method is exploited in order to split the geographical space into multiscale regions based on the density of social media data. Then, a statistical unsupervised approach is performed that involves the Poisson distribution and a smoothing method for highlighting regions with an unexpected density of social posts. Further, event duration is precisely estimated by merging events happening in the same region at consecutive time intervals. A post-processing stage is introduced to filter out events that are spam, fake or wrong. Finally, we incorporate simple semantics by using social media entities to assess the integrity and accuracy of detected events. The proposed method is evaluated using different social media datasets, Twitter and Flickr, for different cities: Melbourne, London, Paris and New York. To verify the effectiveness of the proposed method, we compare our results with two baseline algorithms based on a fixed split of geographical space and a clustering method. For performance evaluation, we manually compute recall and precision. We also propose a new quality measure named strength index, which automatically measures how accurate the reported event is.
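The first two steps described in this abstract, density-based quad-tree splitting followed by a Poisson test for unexpectedly dense regions, can be sketched roughly as follows. This is an illustration under stated assumptions and not the authors' system. The names `Cell`, `build_quadtree`, `leaves`, `poisson_tail` and `is_unusual`, the `max_points` and `max_depth` limits, and the `alpha` cut-off are all hypothetical. The smoothing, merging and post-processing stages are omitted.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Cell:
    """A rectangular region and the number of posts that fall inside it."""
    x0: float
    y0: float
    x1: float
    y1: float
    count: int
    children: list = field(default_factory=list)

def build_quadtree(points, x0, y0, x1, y1, max_points=4, max_depth=6, depth=0):
    """Split a region into four quadrants while it holds more than
    `max_points` posts, so dense areas get finer cells."""
    cell = Cell(x0, y0, x1, y1, len(points))
    if len(points) <= max_points or depth >= max_depth:
        return cell
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    buckets = [[], [], [], []]
    for x, y in points:
        # Index 0..3: bit 0 = right half, bit 1 = upper half.
        buckets[(x >= mx) + 2 * (y >= my)].append((x, y))
    bounds = [(x0, y0, mx, my), (mx, y0, x1, my), (x0, my, mx, y1), (mx, my, x1, y1)]
    for pts, (bx0, by0, bx1, by1) in zip(buckets, bounds):
        cell.children.append(
            build_quadtree(pts, bx0, by0, bx1, by1, max_points, max_depth, depth + 1)
        )
    return cell

def leaves(cell):
    """Return the undivided cells of the tree."""
    if not cell.children:
        return [cell]
    return [leaf for child in cell.children for leaf in leaves(child)]

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def is_unusual(count, expected, alpha=0.01):
    """Flag a cell whose post count is improbably high under its expected rate."""
    return poisson_tail(count, expected) < alpha

# Hypothetical posts in the unit square: a dense patch plus one stray post.
posts = [(0.10 + 0.01 * i, 0.10 + 0.01 * i) for i in range(10)] + [(0.9, 0.9)]
root = build_quadtree(posts, 0.0, 0.0, 1.0, 1.0)
```

Here `expected` stands in for a historical average count for the same cell and time interval. How that baseline is estimated and smoothed is not stated in the abstract and is left open.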


2021 ◽  
Vol 20 (3) ◽  
pp. 402-416
Author(s):  
Amirhossein Teimouri

Abstract Social media platforms have been increasingly reinvigorating extreme movements, especially rightist movements. Utilizing unique Google Plus data, the author shows the rise and fall of the 2015 rightist anti-Nuclear Deal movement in Iran. He argues that the Google Plus platform in 2015 provided the new generation of revolutionary Islamist rightist activists with a contentious space of mobilization, enabling them to develop a new revolutionary rightist identity. This revolutionary identity and its corresponding language and discourse did not fully unfold in Iranian mainstream rightist media, even though rightist groups, compared to liberal groups, are not censored and repressed. The new generation of rightist activists perceived the Nuclear Deal as an existential threat to revolutionary principles of the country, and thus played out their outrage and identity anxieties on Google Plus. The author contends that this online outrage, due to the activists’ identity bond with the regime and the 1979 Iranian Revolution, however, did not translate into any massive offline mobilization against the Nuclear Deal. He also discusses the methodological implications of using social media data, especially the discontinuation of Google Plus.


2021 ◽  
Author(s):  
Shishuo Xu

Small-scale events involve interactive human movement in limited space and time. Social media platforms can generate large amounts of geospatially referenced information related to small-scale events. Individuals, management departments, and urban systems all benefit if small-scale events can be detected from social media platforms in a timely manner. Two tasks are key to this: measuring abnormal patterns of human movement to discover events, and analyzing the associated texts to interpret the reasons behind the abnormal movement. By investigating how people move as different events occur and measuring the patterns on social media platforms, small-scale events can be broadly classified into two types: type I events with abrupt patterns, and type II events with random occurrence of key factors. Social events and traffic events are representative of the two types, respectively.
Although many studies have been conducted to detect social events and traffic events using geosocial media data, some unanswered questions still require further research. Most existing studies did not identify occurring events from a full coverage of spatial, temporal, and semantic perspectives. Studies concerning social event detection lack efficient semantic analysis that summarizes event content to infer the reasons driving the abnormal movement. The typical classification-based method for traffic event detection does not investigate how the spatiotemporal distribution of traffic-relevant posts is associated with the occurring traffic events, and simply assigns the detected events to predefined categories, missing events that indicate traffic anomalies but go beyond the predetermined categories.
In this thesis, spatial-temporal-semantic approaches are proposed to measure spatiotemporal patterns of posts and users of social media platforms to capture abnormal human movement, and to analyze the content of associated posts to mine the reasons driving the movement. A variety of techniques including machine learning, natural language processing, and spatiotemporal analysis are adopted to realize effective detection. Based on one year of Twitter data collected in Toronto, the 2014 Toronto International Film Festival and traffic anomaly detection are selected as two case studies to evaluate the performance of the proposed approaches. Comparison with the ground truth data reveals that more than 80% of the detected events do refer to real-world events, which illustrates the feasibility and efficiency of the proposed approaches.
Keywords: Small-scale event, Event detection, Geosocial media data, Traffic event, Social event, Twitter, Spatiotemporal clustering


2012 ◽  
Vol 7 (1) ◽  
pp. 174-197 ◽  
Author(s):  
Heather Small ◽  
Kristine Kasianovitz ◽  
Ronald Blanford ◽  
Ina Celaya

Social networking sites and other social media have enabled new forms of collaborative communication and participation for users, and created additional value as rich data sets for research. Research based on accessing, mining, and analyzing social media data has risen steadily over the last several years and is increasingly multidisciplinary; researchers from the social sciences, humanities, computer science and other domains have used social media data as the basis of their studies. The broad use of this form of data has implications for how curators address preservation, access and reuse for an audience with divergent disciplinary norms related to privacy, ownership, authenticity and reliability. In this paper, we explore how the characteristics of the Twitter platform, coupled with an ambiguous and evolving understanding of privacy in networked communication, and divergent disciplinary understandings of the resulting data, combine to create complex issues for curators trying to ensure broad-based and ethical reuse of Twitter data. We provide a case study of a specific data set to illustrate how data curators can engage with the topics and questions raised in the paper. While some initial suggestions are offered to librarians and other information professionals who are beginning to receive social media data from researchers, our larger goal is to stimulate discussion and prompt additional research on the curation and preservation of social media data.


2018 ◽  
Vol 7 (4.38) ◽  
pp. 939
Author(s):  
Nur Atiqah Sia Abdullah ◽  
Hamizah Binti Anuar

Facebook and Twitter are the most popular social media platforms among netizens. People now express their opinions, perceptions, and emotions more aggressively through social media platforms. These massive data provide great value for data analysts seeking to understand patterns and emotions related to a certain issue. Mining the data requires techniques and time; therefore, data visualization is becoming a popular way to represent these types of information. This paper aims to review data visualization studies that involved data from social media postings. Past literature used node-link diagrams, node-link trees, directed graphs, line graphs, heatmaps, and stream graphs to represent the data collected from social media platforms. An analysis comparing the social media data types, representations, and data visualization techniques is carried out based on the previous studies. This paper critically discusses the comparison and suggests suitable data visualizations based on the type of social media data in hand.


2019 ◽  
Vol 10 (2) ◽  
pp. 57-70 ◽  
Author(s):  
Vikas Kumar ◽  
Pooja Nanda

With the amplification of social media platforms, the importance of social media analytics has increased exponentially for many brands and organizations across the world. Tracking and analyzing social media data has contributed to the success of such organizations; however, the data is being poorly harnessed. Therefore, the ethical implications of social media analytics need to be identified and explored for both the organizations and the targeted users of social media data. The present work is an exploratory study to identify the various techno-ethical concerns of social media engagement, as well as social media analytics. The impact of these concerns on individuals, organizations, and society as a whole is discussed. Ethical engagement for the most common social media platforms has been outlined with a number of specific examples to understand the prominent techno-ethical concerns. Both the individual and organizational perspectives have been taken into account to identify the implications of social media analytics.


2018 ◽  
Vol 4 (3) ◽  
pp. 205630511878780 ◽  
Author(s):  
Luci Pangrazio ◽  
Neil Selwyn

Young people’s engagements with social media now generate large quantities of personal data, with “big social data” becoming an increasingly important “currency” in the digital economy. While using social media platforms is ostensibly “free,” users nevertheless “pay” for these services through their personal data—enabling advertisers, content developers, and other third parties to profile, predict, and position individuals. Such developments have prompted calls for social media users to adopt more informed and critical stances toward how and why their data are being used—that is, to build “critical data literacies.” This article reports on research that explores young social media users’ understandings of their personal data and its attendant issues. Drawing on research with groups of young people (aged 13–17 years), the article investigates the consequences of making third party (re)uses of personal data openly available for social media users to interpret and make critical sense of. The findings provide valuable insights into young people’s understandings of the technical, social, and cultural issues that underpin their ability to engage with, and make sense of, social media data. The article concludes by considering how research into critical data literacies might connect in more meaningful and effective ways with everyday lived experiences of social media use.


2021 ◽  
pp. 227797522110118
Author(s):  
Amit K. Srivastava ◽  
Rajhans Mishra

Social media platforms have become very popular among individuals and organizations. On the one hand, organizations use social media as a potential tool to create awareness of their products among consumers; on the other hand, social media data is useful for predicting national crises, election polls, stock trends, etc. However, there is an ongoing debate about the quality of data generated on social media platforms and whether it is relevant for prediction and generalization. The article discusses the relevance and quality of data obtained from social media in the context of research and development. Social media data quality issues may impact the generalizability and reproducibility of the results of a study. The paper explores possible reasons for quality issues in the data generated on social media platforms, along with suggested measures to minimize them using the proposed social media data quality framework.


2021 ◽  
Author(s):  
Elizabeth Dubois ◽  
Anatoliy Gruzd ◽  
Jenna Jacobson

Journalists increasingly use social media data to infer and report public opinion by quoting social media posts, identifying trending topics, and reporting general sentiment. In contrast to traditional approaches of inferring public opinion, citizens are often unaware of how their publicly available social media data is being used and how public opinion is constructed using social media analytics. In this exploratory study based on a census-weighted online survey of Canadian adults (N=1,500), we examine citizens’ perceptions of journalistic use of social media data. We demonstrate that: (1) people find it more appropriate for journalists to use aggregate social media data rather than personally identifiable data; (2) people who use more social media are more likely to positively perceive journalistic use of social media data to infer public opinion; and (3) the frequency of political posting is positively related to acceptance of this emerging journalistic practice, which suggests some citizens want to be heard publicly on social media while others do not. We provide recommendations for journalists on the ethical use of social media data and for social media platforms on opt-in functionality.

