Performance Evaluation of Fuzzy C Mean Clustering on Social Media Data Set

2018 ◽  
Vol 6 (6) ◽  
pp. 1376-1380
Author(s):  
Kothapalli Revathi ◽  
Chalumuri Avinash


2018 ◽  
Author(s):  
Anika Oellrich ◽  
George Gkotsis ◽  
Richard James Butler Dobson ◽  
Tim JP Hubbard ◽  
Rina Dutta

BACKGROUND Dementia is a growing public health concern with approximately 50 million people affected worldwide in 2017 and this number is expected to reach more than 131 million by 2050. The toll on caregivers and relatives should not be underestimated, as dementia changes family relationships, leaves people socially isolated, and affects the finances of all those involved. OBJECTIVE The aim of this study was to explore, using automated analysis, (i) the age and gender of people who post to the social media forum Reddit about dementia diagnoses, (ii) the affected person and their diagnosis, (iii) relevant subreddits authors are posting to, (iv) the types of messages posted and (v) the content of these posts. METHODS We analysed Reddit posts concerning dementia diagnoses. We used a previously developed text analysis pipeline to determine attributes of the posts as well as their authors to characterise online communications about dementia diagnoses. The posts were also examined by manual curation for the diagnosis provided and the person affected. Furthermore, we investigated the communities these people engage in and assessed the contents of the posts with an automated topic gathering technique. RESULTS Our results indicate that the majority of posters in our data set are women, and it is mostly close relatives such as parents and grandparents that are mentioned. Both the communities frequented and topics gathered reflect not only the sufferer's diagnosis but also potential outcomes, e.g. hardships experienced by the caregiver. The trends observed from this dataset are consistent with findings based on qualitative review, validating the robustness of social media automated text processing.
CONCLUSIONS This work demonstrates the value of social media data sources as a resource for in-depth studies of those affected by a dementia diagnosis and the potential to develop novel support systems based on their real time processing in line with the increasing digitalisation of medical care.


2021 ◽  
Author(s):  
Hansi Hettiarachchi ◽  
Mariam Adedoyin-Olowe ◽  
Jagdev Bhogal ◽  
Mohamed Medhat Gaber

Social media is becoming a primary medium to discuss what is happening around the world. Therefore, the data generated by social media platforms contain rich information which describes the ongoing events. Further, the timeliness associated with these data is capable of facilitating immediate insights. However, considering the dynamic nature and high volume of data production in social media data streams, it is impractical to filter the events manually and therefore, automated event detection mechanisms are invaluable to the community. Apart from a few notable exceptions, most previous research on automated event detection has focused only on statistical and syntactical features in data and neglected the underlying semantics, which are important for effective information retrieval from text since they represent the connections between words and their meanings. In this paper, we propose a novel method termed Embed2Detect for event detection in social media by combining the characteristics in word embeddings and hierarchical agglomerative clustering. The adoption of word embeddings gives Embed2Detect the capability to incorporate powerful semantical features into event detection and overcome a major limitation inherent in previous approaches. We evaluated our method on two recent real social media data sets which represent the sports and political domains and also compared the results to several state-of-the-art methods. The obtained results show that Embed2Detect is capable of effective and efficient event detection and it outperforms the recent event detection methods. For the sports data set, Embed2Detect achieved 27% higher F-measure than the best-performing baseline and for the political data set, it was an increase of 29%.


2012 ◽  
Vol 7 (1) ◽  
pp. 174-197 ◽  
Author(s):  
Heather Small ◽  
Kristine Kasianovitz ◽  
Ronald Blanford ◽  
Ina Celaya

Social networking sites and other social media have enabled new forms of collaborative communication and participation for users, and created additional value as rich data sets for research. Research based on accessing, mining, and analyzing social media data has risen steadily over the last several years and is increasingly multidisciplinary; researchers from the social sciences, humanities, computer science and other domains have used social media data as the basis of their studies. The broad use of this form of data has implications for how curators address preservation, access and reuse for an audience with divergent disciplinary norms related to privacy, ownership, authenticity and reliability. In this paper, we explore how the characteristics of the Twitter platform, coupled with an ambiguous and evolving understanding of privacy in networked communication, and divergent disciplinary understandings of the resulting data, combine to create complex issues for curators trying to ensure broad-based and ethical reuse of Twitter data. We provide a case study of a specific data set to illustrate how data curators can engage with the topics and questions raised in the paper. While some initial suggestions are offered to librarians and other information professionals who are beginning to receive social media data from researchers, our larger goal is to stimulate discussion and prompt additional research on the curation and preservation of social media data.


Author(s):  
F. O. Ostermann ◽  
H. Huang ◽  
G. Andrienko ◽  
N. Andrienko ◽  
C. Capineri ◽  
...  

Increasing availability of Geo-Social Media (e.g. Facebook, Foursquare and Flickr) has led to the accumulation of large volumes of social media data. These data, especially geotagged ones, contain information about perception of and experiences in various environments. Harnessing these data can provide a better understanding of the semantics of places. We are interested in the similarities or differences between different Geo-Social Media in the description of places. This extended abstract presents the results of a first step towards a more in-depth study of semantic similarity of places. In particular, we took places extracted through spatio-temporal clustering from one data source (Twitter) and examined whether their structure is reflected semantically in another data set (Flickr). Based on that, we analyse how the semantic similarity between places varies over space and scale, and how Tobler's first law of geography holds with regard to scale and places.


2021 ◽  
Author(s):  
J. Bradford Jensen ◽  
Lisa Singh ◽  
Pamela Davis-Kean ◽  
Katharine Abraham ◽  
Paul Beatty ◽  
...  

This is the fifth in a series of white papers providing a summary of the discussions and future directions that are derived from these topical meetings. This paper focuses on issues related to analysis and visual analytics. While these two topics are distinct, there are clear overlaps between the two. It is common to use different visualizations during analysis and given the sheer volume of social media data, visual analytic tools can be important during analysis, as well as during other parts of the research lifecycle. Choices about analysis may be informed by visualization plans and vice versa - both are key in communicating about a data set and what it means. We also recognized that each field of research has different analysis techniques and different levels of familiarity with visual analytics. Putting these two topics into the same meeting provided us with the opportunity to think about analysis and visual analytics/visualization in new, synergistic ways.


2019 ◽  
Vol 3 (3) ◽  
pp. 38 ◽  
Author(s):  
Stefan Spettel ◽  
Dimitrios Vagianos

Social media are heavily used to shape political discussions. Thus, it is valuable for corporations and political parties to be able to analyze the content of those discussions. This is exemplified by the work of Cambridge Analytica, in support of the 2016 presidential campaign of Donald Trump. One of the most straightforward metrics is the sentiment of a message, whether it is considered as positive or negative. There are many commercial and/or closed-source tools available which make it possible to analyze social media data, including sentiment analysis (SA). However, to our knowledge, not many publicly available tools have been developed that allow for analyzing social media data and help researchers around the world to enter this quickly expanding field of study. In this paper, we provide a thorough description of implementing a tool that can be used for performing sentiment analysis on tweets. In an effort to underline the necessity for open tools and additional monitoring on the Twittersphere, we propose an implementation model based exclusively on publicly available open-source software. The resulting tool is capable of downloading Tweets in real-time based on hashtags or account names and stores the sentiment for replies to specific tweets. It is therefore capable of measuring the average reaction to one tweet by a person or a hashtag, which can be represented with graphs. Finally, we tested our open-source tool within a case study based on a data set of Twitter accounts and hashtags referring to the Syrian war, covering a short time window of one week in the spring of 2018. The results show that while high accuracy of commercial or other complicated tools may not be achieved, our proposed open source tool makes it possible to get a good overview of the overall replies to specific tweets, as well as a practical perception of tweets, related to specific hashtags, identifying them as positive or negative.


2019 ◽  
Author(s):  
Matthew Andreotta ◽  
Robertus Nugroho ◽  
Mark Hurlstone ◽  
Fabio Boschetti ◽  
Simon Farrell ◽  
...  

To qualitative researchers, social media offers a novel opportunity to harvest a massive and diverse range of content, without the need for intrusive or intensive data collection procedures. However, performing a qualitative analysis across a massive social media data set is cumbersome and impractical. Instead, researchers often extract a subset of content to analyze, but a framework to facilitate this process is currently lacking. We present a four-phased framework for improving this extraction process, which blends the capacities of data science techniques to compress large data sets into smaller spaces, with the capabilities of qualitative analysis to address research questions. We demonstrate this framework by investigating the topics of Australian Twitter commentary on climate change, using quantitative (Non-Negative Matrix inter-joint Factorization; Topic Alignment) and qualitative (Thematic Analysis) techniques. Our approach is useful for researchers seeking to perform qualitative analyses of social media, or researchers wanting to supplement their quantitative work with a qualitative analysis of broader social context and meaning.


Author(s):  
Thiago R. C. de Lima

Social media comprises platforms that have surpassed their initial goal of connecting people just for the sake of socializing and currently provide powerful tools for businesses to reach millions of views worldwide, increasing their chances of gaining new customers. This short paper utilizes the Buzz in Social Media data set available at the UCI Machine Learning Repository to identify the attributes in social media content that have the highest correlation to the amount of repercussion it gained. To achieve this, several linear regression models are constructed, then ranked based on their respective model fit measure (R-squared) and accuracy when tested against unseen data.
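
The ranking step described above can be illustrated with a minimal sketch. It assumes one single-predictor least-squares model per attribute and ranks the attributes by R-squared only; the attribute names and numbers are hypothetical and are not taken from the Buzz in Social Media data set, and the paper's additional evaluation on unseen data is not reproduced here.

```python
def r_squared(x, y):
    """R-squared of a simple least-squares line fitted to predict y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares for the fitted line
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical target (amount of buzz) and candidate attributes
buzz = [2.0, 4.0, 6.0, 8.0]
attributes = {
    "attr_a": [1.0, 2.0, 3.0, 4.0],
    "attr_b": [1.0, 3.0, 2.0, 4.0],
}

# Fit one model per attribute and rank by model fit, best first
ranked = sorted(
    ((name, r_squared(values, buzz)) for name, values in attributes.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: R^2 = {score:.2f}")
```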


2020 ◽  
Vol 48 (5) ◽  
pp. 612-621
Author(s):  
Nicholas Joseph Adams-Cohen

This article uses Twitter data and machine-learning methods to analyze the causal impact of the Supreme Court’s legalization of same-sex marriage at the federal level in the United States on political sentiment and discourse toward gay rights. Relying on social media text data, this project constructs a large data set of expressed political opinions in the short time frame before and after the Obergefell v. Hodges decision. Due to the variation in state laws regarding the legality of same-sex marriage prior to the Supreme Court’s decision, I use a difference-in-differences estimator to show that, in those states where the Court’s ruling produced a policy change, there was relatively more negative movement in public opinion toward same-sex marriage and gay rights issues as compared with other states. This confirms previous studies that show Supreme Court decisions polarize public opinion in the short term, extends previous results by demonstrating opinion becomes relatively more negative in states where policy is overturned, and demonstrates how to use social media data to engage in causal analyses.
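
The estimator named above can be sketched in its simplest two-group, two-period form: the change in mean sentiment in states where the ruling changed policy, minus the change in states where it did not. This is only an illustration of the general technique, not the article's actual model; the function name and all sentiment scores are hypothetical.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences estimate from group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical sentiment scores, invented for illustration only
treated_pre = [0.20, 0.30, 0.25]   # policy-change states, before the ruling
treated_post = [0.10, 0.15, 0.05]  # policy-change states, after the ruling
control_pre = [0.30, 0.35, 0.40]   # other states, before the ruling
control_post = [0.30, 0.30, 0.45]  # other states, after the ruling

estimate = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"Difference-in-differences estimate: {estimate:.2f}")
```

A negative estimate in this toy setup would correspond to sentiment moving relatively more negatively in the policy-change states, under the usual assumption that both groups would otherwise have followed parallel trends.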


2019 ◽  
Vol 32 (1) ◽  
pp. 152-169 ◽  
Author(s):  
Wu He ◽  
Weidong Zhang ◽  
Xin Tian ◽  
Ran Tao ◽  
Vasudeva Akula

Purpose Customer knowledge from social media can become an important organizational asset. The purpose of this paper is to identify useful customer knowledge, including knowledge for customers, knowledge about customers and knowledge from customers, from social media data and to facilitate social media-based customer knowledge management. Design/methodology/approach The authors conducted a case study to analyze people’s online discussion on Twitter regarding laptop brands and manufacturers. After collecting relevant tweets using Twitter search APIs, the authors applied statistical analysis, text mining and sentiment analysis techniques to analyze the social media data set and visualize relevant insights and patterns in order to identify customer knowledge. Findings The paper identifies useful insights and knowledge from customers and knowledge about customers from social media data. Furthermore, the paper shows how the authors can use knowledge from customers and knowledge about customers to help companies develop knowledge for customers. Originality/value This is an original social media analytics study that discusses how to transform large-scale social media data into useful customer knowledge, including knowledge for customers, knowledge about customers and knowledge from customers.

