Towards a General Architecture for Social Media Data Capture from a Multi-Domain Perspective

The increasing availability of huge volumes of social media ‘Big Data’ from Facebook, Flickr, Instagram, Twitter and other social network platforms, combined with the development of software designed to operate at web scale, has fuelled the growth of computational social science. Often analysed by ‘data scientists’, social media data differ substantially from the datasets officially disseminated as by-products of government-sponsored activity, such as population censuses or administrative data, which have long been analysed by professional statisticians. This chapter outlines the characteristics of social media data and identifies key data sources and methods of data capture, introducing several of the technologies used to acquire, store, query, visualise and augment social media data. Unrepresentativeness of, and lack of (geo)demographic control in, social media data are problematic for population-based research. These limitations, alongside wider epistemological and ethical concerns surrounding data validity, inadvertent co-option into research and protection of user privacy, suggest that caution should be exercised when analysing social media datasets. While care must be taken to respect personal privacy and sample assiduously, this chapter concludes that statisticians, who may be unfamiliar with some of the programmatic steps involved in accessing social media data, must play a pivotal role in analysing them.

Location Integration and Data Markets

Cultural Economies of Locative Media ◽

10.1093/oso/9780190234911.003.0004 ◽

2019 ◽

pp. 66-88

Author(s):

Rowan Wilken

Keyword(s):

Social Media ◽

Social Networking ◽

Case Studies ◽

Data Capture ◽

Social Networking Service ◽

Social Media Data ◽

Locative Media ◽

Mobile Social Networking ◽

Revenue Models ◽

Media Data

This chapter explores the still-evolving business and revenue models and geolocation data capture efforts of two commercial businesses now central to the contemporary settlement of locative media: Foursquare and Facebook. Foursquare underwent a dramatic series of transformations, evolving from a check-in-based mobile social networking service into a search and recommendation service, and it now also operates as a firm offering location-intelligence enterprise services. Facebook, for its part, set about further strengthening its grip on social media data markets by adding geolocation functionalities and geodata capture capabilities to its social networking operations. These two case studies provide a rich composite picture of the business ecologies of locational information. The aim in selecting these cases is to develop a clearer understanding of how both firms accrue location data and how they extract location value: that is, how this information is shared, harvested, valued, reused, and commodified.

Post, Mine, and Be Disturbed: Social Media Data Mining

PsycCRITIQUES ◽

10.1037/a0040619 ◽

2016 ◽

Vol 61 (51) ◽

Author(s):

Daniel Keyes

Keyword(s):

Data Mining ◽

Social Media ◽

Social Media Data ◽

Media Data

Understanding the Interrelationships between Infrastructure Resilience and Social Equity Using Social Media Data

Construction Research Congress 2020 ◽

10.1061/9780784482858.065 ◽

2020 ◽

Author(s):

Sunil Dhakal ◽

Lu Zhang

Keyword(s):

Social Media ◽

Social Equity ◽

Social Media Data ◽

Infrastructure Resilience ◽

Media Data

Psychological Stress Detection from Social Media Data using a Novel Hybrid Model

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i8.853862 ◽

2018 ◽

Vol 6 (8) ◽

pp. 853-862

Author(s):

Shaikha Hajera ◽

Mohammed Mahmood Ali

Keyword(s):

Social Media ◽

Psychological Stress ◽

Hybrid Model ◽

Stress Detection ◽

Social Media Data ◽

Media Data

Smart De-Identification of Social Media Data

10.21236/ada608548 ◽

2014 ◽

Author(s):

Kathleen M. Carley ◽

L. R. Carley ◽

Jonathan Storrick

Keyword(s):

Social Media ◽

Social Media Data ◽

Media Data

The Psychology of Job Loss: Using Social Media Data to Characterize and Predict Unemployment

SSRN Electronic Journal ◽

10.2139/ssrn.2783520 ◽

2016 ◽

Cited By ~ 1

Author(s):

Davide Proserpio ◽

Scott Counts ◽

Apurv Jain

Keyword(s):

Social Media ◽

Job Loss ◽

Social Media Data ◽

Media Data

Mining Social Media Data to Study the Consequences of Dementia Diagnosis on Caregivers and Relatives (Preprint)

10.2196/preprints.10506 ◽

2018 ◽

Author(s):

Anika Oellrich ◽

George Gkotsis ◽

Richard James Butler Dobson ◽

Tim JP Hubbard ◽

Rina Dutta

Keyword(s):

Social Media ◽

Family Relationships ◽

Text Processing ◽

Automated Analysis ◽

Health Concern ◽

Dementia Diagnosis ◽

Data Set ◽

Social Media Data ◽

Real Time Processing ◽

Media Data

BACKGROUND: Dementia is a growing public health concern, with approximately 50 million people affected worldwide in 2017, a number expected to reach more than 131 million by 2050. The toll on caregivers and relatives should not be underestimated, as dementia changes family relationships, leaves people socially isolated, and affects the finances of all those involved.
OBJECTIVE: The aim of this study was to explore, using automated analysis, (i) the age and gender of people who post to the social media forum Reddit about dementia diagnoses, (ii) the affected person and their diagnosis, (iii) the relevant subreddits authors are posting to, (iv) the types of messages posted and (v) the content of these posts.
METHODS: We analysed Reddit posts concerning dementia diagnoses. We used a previously developed text analysis pipeline to determine attributes of the posts as well as their authors to characterise online communications about dementia diagnoses. The posts were also examined by manual curation for the diagnosis provided and the person affected. Furthermore, we investigated the communities these people engage in and assessed the contents of the posts with an automated topic gathering technique.
RESULTS: Our results indicate that the majority of posters in our data set are women, and that it is mostly close relatives such as parents and grandparents who are mentioned. Both the communities frequented and the topics gathered reflect not only the sufferer's diagnosis but also potential outcomes, e.g. hardships experienced by the caregiver. The trends observed in this data set are consistent with findings based on qualitative review, validating the robustness of automated text processing of social media.
CONCLUSIONS: This work demonstrates the value of social media data sources as a resource for in-depth studies of those affected by a dementia diagnosis and the potential to develop novel support systems based on their real-time processing, in line with the increasing digitalisation of medical care.

Citizens, Elites, and Social Media: Methodological Challenges and Opportunities in the Study of Persuasion and Mobilization

The Oxford Handbook of Electoral Persuasion ◽

10.1093/oxfordhb/9780190860806.013.27 ◽

2019 ◽

pp. 1036-1058

Author(s):

Philip Habel ◽

Yannis Theocharis

Keyword(s):

Social Media ◽

Big Data ◽

Supply And Demand ◽

Political Process ◽

The Political ◽

The Novel ◽

Complete Understanding ◽

Social Media Data ◽

Challenges And Opportunities ◽

Media Data

In the last decade, big data, and social media in particular, have seen increased popularity among citizens, organizations, politicians, and other elites, which in turn has created new and promising avenues for scholars studying long-standing questions of communication flows and influence. Studies of social media play a prominent role in our evolving understanding of the supply and demand sides of the political process, including the novel strategies adopted by elites to persuade and mobilize publics, as well as the ways in which citizens react, interact with elites and others, and utilize platforms to persuade audiences. While recognizing some challenges, this chapter speaks to the myriad of opportunities that social media data afford for evaluating questions of mobilization and persuasion, ultimately bringing us closer to a more complete understanding of Lasswell’s (1948) famous maxim: “who, says what, in which channel, to whom, [and] with what effect.”
