Making Inferences Using Incidentally Collected Data

2016 ◽  
Author(s):  
Jonathan Mellon

This chapter discusses the use of large quantities of incidentally collected data (ICD) to make inferences about politics. This type of data is sometimes referred to as “big data”, but I avoid this term because of its conflicting definitions (Monroe, 2012; Ward & Barker, 2013). ICD is data that was created or collected primarily for a purpose other than analysis. Within this broad definition, this chapter focuses particularly on data generated through user interactions with websites. While ICD has been around for at least half a century, the Internet greatly expanded the availability and reduced the cost of ICD. Examples of ICD include data on Internet searches, social media data, and user data from civic platforms. This chapter briefly explains some sources and uses of ICD and then discusses some of the potential issues of analysis and interpretation that arise when using ICD, including the different approaches to inference that researchers can use.

Author(s):  
Philip Habel ◽  
Yannis Theocharis

In the last decade, big data, and social media in particular, have seen increased popularity among citizens, organizations, politicians, and other elites, which in turn has created new and promising avenues for scholars studying long-standing questions of communication flows and influence. Studies of social media play a prominent role in our evolving understanding of the supply and demand sides of the political process, including the novel strategies adopted by elites to persuade and mobilize publics, as well as the ways in which citizens react, interact with elites and others, and utilize platforms to persuade audiences. While recognizing some challenges, this chapter speaks to the myriad opportunities that social media data afford for evaluating questions of mobilization and persuasion, ultimately bringing us closer to a more complete understanding of Lasswell’s (1948) famous maxim: “who, says what, in which channel, to whom, [and] with what effect.”


2018 ◽  
Vol 03 (03) ◽  
pp. 1850003 ◽  
Author(s):  
Jared Oliverio

Big Data is a very popular term today. Everywhere you turn, companies and organizations are talking about their Big Data solutions and Analytic applications. The source of the data used in these applications varies. However, one type of data is of great interest to most organizations: Social Media Data. Social Media applications are used by a large percentage of the world’s population. The ability to instantly connect with and reach other people and companies across distances is an important part of today’s society. Social Media applications allow users to share comments, opinions, ideas, and media with friends, family, businesses, and organizations. The data contained in these comments, ideas, and media are valuable to many types of organizations. Through Data Mining and Analysis, it is possible to predict specific behavior in users of the applications. Currently, several technologies aid in collecting, analyzing, and displaying this data. These technologies allow users to apply this data to solve different problems in different organizations, including those in the finance, medicine, environmental, education, and advertising industries. This paper aims to highlight the current technologies used in mining and analyzing Social Media data, the industries using this data, and the future of this field.


2018 ◽  
Vol 5 (2) ◽  
pp. 205395171880773 ◽  
Author(s):  
Cheryl Cooky ◽  
Jasmine R Linabary ◽  
Danielle J Corple

Social media offers an attractive site for Big Data research. Access to big social media data, however, is controlled by companies that privilege corporate, governmental, and private research firms. Additionally, Institutional Review Boards’ regulative practices and slow adaptation to emerging ethical dilemmas in online contexts create challenges for Big Data researchers. We examine these challenges in the context of a feminist qualitative Big Data analysis of the hashtag event #WhyIStayed. We argue that power, context, and subjugated knowledges must each be central considerations in conducting Big Data social media research. In doing so, this paper offers a feminist practice of holistic reflexivity in order to help social media researchers navigate and negotiate this terrain.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Fengjun Tian ◽  
Yang Yang ◽  
Zhenxing Mao ◽  
Wenyue Tang

Purpose: This paper aims to compare the forecasting performance of different models with and without big data predictors from search engines and social media. Design/methodology/approach: Using daily tourist arrival data to Mount Longhu, China in 2018 and 2019, the authors estimated ARMA, ARMAX, Markov-switching auto-regression (MSAR), lasso model, elastic net model and post-lasso and post-elastic net models to conduct one- to seven-days-ahead forecasting. Search engine data and social media data from WeChat, Douyin and Weibo were incorporated to improve forecasting accuracy. Findings: Results show that search engine data can substantially reduce forecasting error, whereas social media data has very limited value. Compared to the ARMAX/MSAR model without big data predictors, the corresponding post-lasso model reduced forecasting error by 39.29% based on mean square percentage error, 33.95% based on root mean square percentage error, 46.96% based on root mean squared error and 45.67% based on mean absolute scaled error. Practical implications: Results highlight the importance of incorporating big data predictors into daily demand forecasting for tourism attractions. Originality/value: This study represents a pioneering attempt to apply the regularized regression (e.g. lasso model and elastic net) in tourism forecasting and to explore various daily big data indicators across platforms as predictors.
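The four error measures named in this abstract can be illustrated with a minimal Python sketch. It uses the common textbook definitions, which is an assumption: the abstract does not give the authors' exact formulas. The function names (`mspe`, `rmspe`, `rmse`, `mase`, `pct_reduction`) and the sample numbers are hypothetical and not taken from the paper.

```python
from math import sqrt


def mspe(actual, forecast):
    """Mean square percentage error: mean of ((a - f) / a)^2."""
    return sum(((a - f) / a) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmspe(actual, forecast):
    """Root mean square percentage error: square root of MSPE."""
    return sqrt(mspe(actual, forecast))


def rmse(actual, forecast):
    """Root mean squared error, in the units of the series."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mase(actual, forecast, train):
    """Mean absolute scaled error.

    Assumption: the forecast MAE is scaled by the in-sample MAE of a
    one-step naive forecast (previous value) over the training series.
    """
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_mae = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train))) / (len(train) - 1)
    return mae / naive_mae


def pct_reduction(baseline_error, improved_error):
    """Percentage by which a model lowers error relative to a baseline."""
    return 100.0 * (baseline_error - improved_error) / baseline_error


if __name__ == "__main__":
    # Made-up illustrative numbers, not data from the study.
    train = [100.0, 110.0, 130.0]
    actual = [100.0, 200.0]
    forecast = [90.0, 220.0]
    print("MSPE :", mspe(actual, forecast))
    print("RMSPE:", rmspe(actual, forecast))
    print("RMSE :", rmse(actual, forecast))
    print("MASE :", mase(actual, forecast, train))
```

Under these definitions, a statement such as "reduced forecasting error by 46.96% based on root mean squared error" corresponds to `pct_reduction(rmse_baseline, rmse_post_lasso)`, where both arguments are computed on the same hold-out period.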


2019 ◽  
Vol 97 (3) ◽  
pp. 811-834 ◽  
Author(s):  
Lei Guo ◽  
Kate Mays ◽  
Sha Lai ◽  
Mona Jalal ◽  
Prakash Ishwar ◽  
...  

Crowdcoding, a method that outsources “coding” tasks to numerous people on the internet, has emerged as a popular approach for annotating texts and visuals. However, the performance of this approach for analyzing social media data in the context of journalism and mass communication research has not been systematically assessed. This study evaluated the validity and efficiency of crowdcoding based on the analysis of 4,000 tweets about the 2016 U.S. presidential election. The results show that compared with the traditional quantitative content analysis, crowdcoding yielded comparably valid results and was superior in efficiency, but was more expensive under most circumstances.


2021 ◽  
Vol 12 ◽  
Author(s):  
Muhammad Usman Tariq ◽  
Muhammad Babar ◽  
Marc Poulin ◽  
Akmal Saeed Khattak ◽  
Mohammad Dahman Alshehri ◽  
...  

Intelligent big data analysis is an evolving pattern in the age of big data science and artificial intelligence (AI). Analysis of organized data has been very successful, but analyzing human behavior using social media data remains challenging. Social media data comprises vast, unstructured data sources that can include likes, comments, tweets, shares, and views. Analytics of social media data has become a challenging task for companies, such as Dailymotion, that have billions of daily users and vast numbers of comments, likes, and views. Social media data is created in significant amounts and at a tremendous pace. A very high volume of data must be stored, sorted, processed, and carefully studied to support decisions. This article proposes an architecture using a big data analytics mechanism to efficiently and logically process huge social media datasets. The proposed architecture is composed of three layers. The main objective of the project is to demonstrate Apache Spark parallel processing and distributed framework technologies with other storage and processing mechanisms. The social media data generated from Dailymotion is used in this article to demonstrate the benefits of this architecture. The project utilized the application programming interface (API) of Dailymotion, allowing it to incorporate functions suitable to fetch and view information. The API key is generated to fetch information of public channel data in the form of text files. Hive storage machinist is utilized with Apache Spark for efficient data processing. The effectiveness of the proposed architecture is also highlighted.

