Computational Communication Research
Latest Publications

Total documents: 23 (five years: 23)
H-index: 2 (five years: 2)
Published by Amsterdam University Press
ISSN: 2665-9085

2021 · Vol 3 (2) · pp. 1-28
Author(s): Annie Waldherr, Stephanie Geise, Merja Mahrt, Christian Katzenbach, Christian Nuernbergk

Abstract Computational communication science (CCS) is embraced by many as a fruitful methodological approach to studying communication in the digital era. However, theoretical advances have not been considered equally important in CCS. Specifically, we observe an emphasis on mid-range and micro theories that misses a larger discussion on how macro-theoretical frameworks can serve CCS scholarship. With this article, we aim to stimulate such a discussion. Although macro frameworks might not point directly to specific questions and hypotheses, they shape our research through influencing which kinds of questions we ask, which kinds of hypotheses we formulate, and which methods we find adequate and useful. We showcase how three selected theoretical frameworks might advance CCS scholarship in this way: (1) complexity theory, (2) theories of the public sphere, and (3) mediatization theory. Using online protest as an example, we discuss how the focus (and the blind spots) of our research designs shifts with each framework.


2021 · Vol 3 (1) · pp. 61-89
Author(s): Stefan Geiß

Abstract This study uses Monte Carlo simulation techniques to estimate the minimum required levels of intercoder reliability in content analysis data for testing correlational hypotheses, depending on sample size, effect size, and coder behavior under uncertainty. The ensuing procedure is analogous to power calculations for experimental designs. In most widespread sample size/effect size settings, the rule of thumb that chance-adjusted agreement should be ≥ .800 or ≥ .667 corresponds to the simulation results, yielding acceptable α and β error rates. However, this simulation allows precise power calculations that consider the specifics of each study’s context, moving beyond one-size-fits-all recommendations. Studies with low sample sizes and/or low expected effect sizes may need coder agreement above .800 to test a hypothesis with sufficient statistical power. In studies with high sample sizes and/or high expected effect sizes, coder agreement below .667 may suffice. Such calculations can help both in evaluating and in designing studies. Particularly in pre-registered research, higher sample sizes may be used to compensate for low expected effect sizes and/or borderline coding reliability (e.g., when constructs are hard to measure). I supply equations, easy-to-use tables, and R functions to facilitate use of this framework, along with example code as an online appendix.
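The logic of such a simulation can be sketched in a few lines. The following is a minimal stand-alone illustration, not the author's R implementation: it treats reliability as the share of true-score variance retained in the coded variable (a simplification of chance-adjusted agreement) and approximates the two-sided p-value of a correlation test via the Fisher z transformation.

```python
import math
import random

def pearson(xs, ys):
    # Plain Pearson correlation, no external dependencies.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def correlation_power(n, true_r, reliability, n_sims=1000, alpha=0.05, seed=1):
    # Monte Carlo estimate of the power of a two-sided correlation test
    # when the coded variable retains only `reliability` of the
    # true-score variance (illustrative simplification).
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        xs, ys = [], []
        for _ in range(n):
            x_true = rng.gauss(0, 1)
            y = true_r * x_true + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
            x_coded = (math.sqrt(reliability) * x_true
                       + math.sqrt(1 - reliability) * rng.gauss(0, 1))
            xs.append(x_coded)
            ys.append(y)
        r = pearson(xs, ys)
        # Fisher z transformation -> approximate two-sided p-value.
        z = math.atanh(r) * math.sqrt(n - 3)
        p = math.erfc(abs(z) / math.sqrt(2))
        if p < alpha:
            hits += 1
    return hits / n_sims
```

Running the function across a grid of sample sizes, effect sizes, and reliability levels reproduces the trade-off the abstract describes: lower reliability attenuates the observed correlation and must be compensated by a larger sample.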


2021 · Vol 3 (1) · pp. 29-59
Author(s): Yair Fogel-Dror, Shaul R. Shenhav, Tamir Sheafer

Abstract The collaborative effort of theory-driven content analysis can benefit significantly from the use of topic analysis methods, which allow researchers to add more categories while developing or testing a theory. This additive approach enables the reuse of previous analyses or even the merging of separate research projects, thereby making these methods more accessible and increasing the discipline’s ability to create and share content analysis capabilities. This paper proposes a weakly supervised topic analysis method that combines a low-cost unsupervised procedure for compiling a training set with supervised deep learning as an additive and accurate text classification step. We test the validity of the method, specifically its additivity, by comparing the method’s results after adding 200 categories to an initial set of 450. We show that the suggested method provides a foundation for a low-cost solution for large-scale topic analysis.
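The first, low-cost step of such a pipeline — compiling a weakly labeled training set — can be sketched as follows. The seed-keyword scheme and the tie-breaking rule here are illustrative assumptions, not the authors' actual procedure; a deep-learning classifier would then be trained on the output.

```python
def weak_label(docs, seed_words):
    # Assign each document the topic whose seed words it mentions most
    # often; documents with no match or a tied best score stay unlabeled.
    labeled = []
    for doc in docs:
        tokens = doc.lower().split()
        scores = {topic: sum(t in words for t in tokens)
                  for topic, words in seed_words.items()}
        best = max(scores, key=scores.get)
        # Keep only documents with a unique, non-zero best-matching topic.
        if scores[best] > 0 and list(scores.values()).count(scores[best]) == 1:
            labeled.append((doc, best))
    return labeled
```

Because new categories only require new seed-word lists, the labeling step itself is additive: categories can be appended without re-annotating the existing training set.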


2021 · Vol 3 (1) · pp. 1-27
Author(s): Chung-hong Chan, Joseph Bajjalieh, Loretta Auvil, Hartmut Wessler, Scott Althaus, ...

Abstract We examined the validity of 37 sentiment scores based on dictionary-based methods using a large news corpus, and we demonstrated the risk of generating a spectrum of results with different levels of statistical significance by presenting an analysis of relationships between news sentiment and U.S. presidential approval. We summarize our findings into four best practices: 1) use a suitable sentiment dictionary; 2) do not assume that the validity and reliability of the dictionary are ‘built-in’; 3) check for the influence of content length; and 4) do not use multiple dictionaries to test the same statistical hypothesis.
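Best practice 3 — checking for content-length effects — already suggests normalizing scores by document length. A toy dictionary scorer with that normalization, using hypothetical word lists rather than any published dictionary, might look like:

```python
def dictionary_sentiment(text, positive, negative):
    # Toy dictionary-based sentiment score in [-1, 1], normalized by
    # token count so longer texts do not mechanically score higher.
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    pos = sum(t in positive for t in tokens)
    neg = sum(t in negative for t in tokens)
    return (pos - neg) / len(tokens)
```

Swapping in a different `positive`/`negative` word list changes the score distribution, which is exactly why testing the same hypothesis against multiple dictionaries (practice 4) invites cherry-picking.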


2021 · Vol 3 (1) · pp. 117-132
Author(s): Tom Willaert, Paul Van Eecke, Jeroen Van Soest, Katrien Beuls

Abstract The data-driven study of cultural information diffusion in online (social) media is currently an active area of research. The availability of data from the web thereby generates new opportunities to examine how words propagate through online media and communities, as well as how these diffusion patterns are intertwined with the materiality and culture of social media platforms. In support of such efforts, this paper introduces an online tool for tracking the consecutive occurrences of words across subreddits on Reddit between 2005 and 2017. By processing the full Pushshift.io Reddit comment archive for this period (Baumgartner et al., 2020), we are able to track the first occurrences of 76 million words, allowing us to visualize which subreddits subsequently adopt any of those words over time. We illustrate this approach by addressing the spread of terms referring to famous internet controversies, and the percolation of alt-right terminology. By making our instrument and the processed data publicly available, we aim to facilitate a range of exploratory analyses in computational social science, the digital humanities, and related fields.
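The core bookkeeping behind such a tracker is simple. The sketch below records, for each word, the first comment in which it appears and the order in which subreddits adopt it; the `(timestamp, subreddit, text)` tuple format is a simplified stand-in for the Pushshift.io comment schema, not the tool's actual interface.

```python
from collections import defaultdict

def track_adoption(comments):
    # `comments` is an iterable of (timestamp, subreddit, text) tuples,
    # assumed sorted by timestamp.
    first_seen = {}               # word -> (timestamp, subreddit) of first use
    adopters = defaultdict(list)  # word -> subreddits in adoption order
    for ts, subreddit, text in comments:
        for word in set(text.lower().split()):
            if word not in first_seen:
                first_seen[word] = (ts, subreddit)
            if subreddit not in adopters[word]:
                adopters[word].append(subreddit)
    return first_seen, adopters
```

At archive scale the adopter lists would be kept in an on-disk index rather than in memory, but the diffusion question — which community used a term first, and who picked it up next — reduces to these two lookups.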


2021 · Vol 3 (1) · pp. 91-115
Author(s): Austin Hubner, Jessica McKnight, Matthew Sweitzer, Robert Bond

Abstract Digital trace data enable researchers to study communication processes at a scale previously impossible. We combine social network analysis and automated content analysis to examine source and message factors’ impact on ratings of user-shared content. We found that the expertise of the author, the network position that the author occupies, and characteristics of the content the author creates have a significant impact on how others respond to that content. By observationally examining a large-scale online community, we provide a real-world test of how message consumers react to source and message characteristics. Our results show that it is important to think of online communication as occurring interactively between networks of individuals, and that the network positions people inhabit may inform their behavior.


2021 · Vol 3 (2) · pp. 1-27
Author(s): Stiene Praet, Peter Van Aelst, Walter Daelemans, Tim Kreutz, Jeroen Peeters, ...

Abstract Party competition in Western Europe is increasingly focused on “issue competition”, which is the selective emphasis on issues by parties. The aim of this paper is to contribute methodologically to the increasing number of studies that deal with different aspects of parties’ issue competition and communication. We systematically compare the value and shortcomings of three exploratory text representation approaches to study the issue communication of parties on Twitter. More specifically, we analyze which issues separate the online communication of one party from that of the other parties, and how consistent party communication is. Our analysis was performed on two years of Twitter data from six Belgian political parties, comprising over 56,000 political tweets. The results indicate that our exploratory approach is useful for studying how political parties profile themselves on Twitter and which strategies are at play. In addition, our method allows us to analyze the communication of individual politicians, which contributes to the classical literature on party unity and party discipline. A comparison of the three methods shows a clear trade-off between interpretability and discriminative power, with a combination of all three simultaneously providing the best insights.


2021 · Vol 3 (2) · pp. 1-16
Author(s): Kasper Welbers, Wouter van Atteveldt, Jan Kleinnijenhuis

Abstract Most common methods for automatic text analysis in communication science ignore syntactic information, focusing on the occurrence and co-occurrence of individual words, and sometimes n-grams. This is remarkably effective for some purposes, but poses a limitation for fine-grained analyses into semantic relations such as who does what to whom and according to what source. One tested, effective method for moving beyond this bag-of-words assumption is to use a rule-based approach for labeling and extracting syntactic patterns in dependency trees. Although this method can be used for a variety of purposes, its application is hindered by the lack of dedicated and accessible tools. In this paper we introduce the rsyntax R package, which is designed to make working with dependency trees easier and more intuitive for R users, and provides a framework for combining multiple rules for reliably extracting useful semantic relations.
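To illustrate the kind of rule such a package encodes — in Python rather than R, and hand-rolled rather than the rsyntax API itself — here is a minimal extractor of who-does-what-to-whom triples from a dependency parse annotated with Universal Dependencies labels:

```python
def extract_svo(tokens):
    # `tokens` is a list of dicts with keys id, word, head, deprel,
    # e.g. from any dependency parser emitting Universal Dependencies.
    by_head = {}
    for t in tokens:
        by_head.setdefault(t["head"], []).append(t)
    triples = []
    for verb in tokens:
        if verb["deprel"] == "root":
            deps = by_head.get(verb["id"], [])
            # Rule: a root verb with both a nominal subject and an object
            # yields a (subject, verb, object) triple.
            subj = next((d["word"] for d in deps if d["deprel"] == "nsubj"), None)
            obj = next((d["word"] for d in deps if d["deprel"] == "obj"), None)
            if subj and obj:
                triples.append((subj, verb["word"], obj))
    return triples
```

Real rules must also handle passives, clausal complements, and quote attribution, which is precisely the combinatorial burden a dedicated framework like rsyntax is meant to manage.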


2021 · Vol 3 (2) · pp. 1-20
Author(s): Leandro Calcagnotto, Richard Huskey, Gerald M. Kosicki

Abstract Measurement noise differs by instrument and limits the validity and reliability of findings. Researchers collecting reaction time data introduce noise in the form of response time latency from hardware and software, even when collecting data on standardized computer-based experimental equipment. Reaction time is a measure with broad application for studying cognitive processing in communication research, and it is vulnerable to response latency noise. In this study, we utilized an Arduino microcontroller to generate a ground-truth value of average response time latency in Asteroid Impact, an open-source, naturalistic, experimental video game stimulus. We tested whether response time latency differed across computer operating system, software, and trial modality. Here we show that reaction time measurements collected using Asteroid Impact exhibited response latency variability on par with that of other response-latency measurement software. These results demonstrate that Asteroid Impact is a valid and reliable stimulus for measuring reaction time data. Moreover, we provide researchers with a low-cost, open-source tool for evaluating response time latency in their own labs. Our results highlight the importance of validating measurement tools and support the philosophy of contributing methodological improvements in communication science.
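Once an external device supplies ground-truth event times, quantifying latency reduces to descriptive statistics on the per-event differences. A minimal sketch with hypothetical variable names and millisecond timestamps (not the authors' analysis code):

```python
import statistics

def latency_summary(ground_truth_ms, recorded_ms):
    # Per-event latency = software-recorded time minus the
    # microcontroller-measured ground-truth time, in milliseconds.
    latencies = [rec - truth for truth, rec in zip(ground_truth_ms, recorded_ms)]
    return {
        "mean": statistics.mean(latencies),  # average added delay
        "sd": statistics.stdev(latencies),   # trial-to-trial variability
        "max": max(latencies),               # worst-case delay
    }
```

The standard deviation is the quantity that matters most for reaction time research: a constant offset cancels out in within-subject comparisons, but variable latency inflates measurement noise.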


2020 · Vol 2 (2) · pp. 175-202
Author(s): Tobias R. Keller

Abstract Politicians use social media platforms such as Twitter to connect with the public. However, it remains largely unknown who constitutes the public sphere to which politicians actually connect, talk, and listen. Focusing on the Twitter network of all Swiss MPs, I identified 129,063 Twitter users with whom politicians connected (i.e., their follower-followee network) or with whom they interacted (e.g., whom they replied to or retweeted). I qualitatively analyzed the top connected, talking, and listening MPs, and conducted a semi-automated content analysis to classify the Twitter users (N = 70,589). Politicians’ audience consists primarily of ordinary citizens, who also react most often to the politicians’ messages. However, politicians listen more often to actors close to politics and the media than to ordinary citizens. Thus, politicians navigate between engaging with everyone without losing control of the communication situation and addressing key multipliers such as journalists to get their messages out.

