Data analysis requires translating higher-level questions and hypotheses into computable statistical models. We present a mixed-methods study aimed at identifying the steps, considerations, and challenges involved in operationalizing hypotheses into statistical models, a process we refer to as hypothesis formalization. In a formative content analysis of 50 research papers, we find that researchers highlight decomposing a hypothesis into sub-hypotheses, selecting proxy variables, and formulating statistical models based on data collection design as key steps. In a lab study, we find that analysts fixated on implementation and shaped their analyses to fit familiar approaches, even if suboptimal. In an analysis of software tools, we find that tools provide inconsistent, low-level abstractions that may limit the statistical models analysts use to formalize hypotheses. Based on these observations, we characterize hypothesis formalization as a dual-search process that balances conceptual and statistical considerations, constrained by data and computation, and we discuss implications for future tools.
Although individuals increasingly use mobile applications (apps) in their daily lives, uncertainty exists regarding how apps will use the information they request, and users need protection from privacy-invasive apps. Recent literature has paid considerable attention to privacy issues in the context of mobile apps. However, little attention has been given to designing the permission request interface to reduce individuals’ perceived uncertainty and to support their informed decisions. Drawing on the principal–agent perspective, our study examines the effects of permission justification, certification, and permission relevance on users’ perceived uncertainty, which in turn influences their permission authorization. Two vignette-based studies were conducted. Our results show that certification and permission relevance indeed reduce users’ perceived uncertainty. Moreover, permission relevance moderates the relationship between permission justification and perceived uncertainty. Implications for theory and practice are discussed.
Navigating conception, pregnancy, and loss is challenging for lesbian, gay, bisexual, transgender, and queer (LGBTQ) people, who experience stigma due to LGBTQ identity, other identities (e.g., loss), and intersections thereof. We conducted interviews with 17 LGBTQ people with recent pregnancy loss experiences. Taking LGBTQ identity and loss as a starting point, we used an intracategorical intersectional lens to uncover the benefits and challenges of LGBTQ-specific and non-LGBTQ-specific pregnancy and loss-related online spaces. Participants used LGBTQ-specific online spaces to enact individual, interpersonal, and collective resilience. However, those with multiple marginalized identities (e.g., people of color and non-partnered individuals) faced barriers in finding support within LGBTQ-specific spaces compared to those holding privileged identities (e.g., White and married). Non-LGBTQ spaces were beneficial for some informational needs, but not for community and emotional needs, due to pervasive heteronormativity, cisnormativity, and a perceived need to educate. We conceptualize experiences of exclusion as symbolic annihilation and intersectional invisibility, and discuss clinical implications and design directions.
We advocate for the use of hotkeys on touch-based devices by capitalising on soft keyboards, supported by four studies. First, we evaluated visual designs and recommend icons with command names for novices and letters with command names for experts. Second, we investigated discoverability by asking crowdworkers to use our prototype, with some tasks completable only by discovering the technique. Discovery rates were high regardless of conditions varying the familiarity and saliency of modifier keys; however, familiarity with desktop hotkeys boosted discoverability. Our third study focused on how prior knowledge of hotkeys could be leveraged; it showed a 5% improvement in selection time and identified the role of spatial memory in retention. Finally, we compared our soft keyboard layout with a grid layout similar to FastTap. The latter offered a 12–16% gain in selection speed, but at a high cost in screen real estate and spatial stability.
Modern experiments in many disciplines generate large quantities of network (graph) data. Researchers require aesthetic layouts of these networks that clearly convey the domain knowledge and meaning. However, the problem remains challenging due to multiple conflicting aesthetic criteria and complex domain-specific constraints. In this article, we present a strategy for generating visualizations that can help network biologists understand the protein interactions that underlie processes that take place in the cell. Specifically, we have developed Flud, a crowd-powered system that allows people without domain expertise to design biologically meaningful graph layouts with the help of algorithmically generated suggestions. Furthermore, we propose a novel hybrid approach to graph layout wherein crowd workers and a simulated annealing algorithm build on each other’s progress. A study of about 2,000 crowd workers on Amazon Mechanical Turk showed that the hybrid crowd–algorithm approach outperforms the crowd-only approach and state-of-the-art techniques when workers were asked to lay out complex networks that represent signaling pathways. Another study of seven participants with biological training showed that Flud layouts are more effective than those created by state-of-the-art techniques. We also found that the algorithmically generated suggestions guided workers when they were stuck and helped them improve their scores. Finally, we discuss broader implications for mixed-initiative interactions in layout design tasks beyond biology.
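The abstract above pairs crowd workers with a simulated annealing algorithm for graph layout. As a rough illustration of the algorithmic half only, the sketch below shows generic simulated annealing over node positions; the edge-length energy function, linear cooling schedule, and step sizes are illustrative assumptions, not Flud's actual scoring or schedule.

```python
import math
import random

def layout_energy(pos, edges, ideal=1.0):
    """Toy aesthetic score: penalize edges whose drawn length
    deviates from an ideal length (an assumed, simplistic criterion)."""
    e = 0.0
    for u, v in edges:
        (x1, y1), (x2, y2) = pos[u], pos[v]
        d = math.hypot(x1 - x2, y1 - y2)
        e += (d - ideal) ** 2
    return e

def anneal_layout(nodes, edges, steps=5000, t0=1.0, seed=0):
    """Simulated annealing: perturb one node at a time, keeping the
    move if it lowers the energy or passes the Metropolis test."""
    rng = random.Random(seed)
    pos = {n: (rng.random(), rng.random()) for n in nodes}
    energy = layout_energy(pos, edges)
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9  # linear cooling toward ~0
        n = rng.choice(nodes)
        old = pos[n]
        pos[n] = (old[0] + rng.uniform(-0.1, 0.1),
                  old[1] + rng.uniform(-0.1, 0.1))
        new_energy = layout_energy(pos, edges)
        worse = new_energy > energy
        if worse and rng.random() >= math.exp((energy - new_energy) / t):
            pos[n] = old  # reject the worsening move
        else:
            energy = new_energy  # accept (improving, or lucky Metropolis)
    return pos, energy
```

In a crowd-algorithm hybrid of the kind described, a loop like this would run between human editing sessions, each side resuming from the other's best layout.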
We clarify fundamental aspects of end-user elicitation, enabling such studies to be run and analyzed with confidence, correctness, and scientific rigor. To this end, our contributions are multifold. We introduce a formal model of end-user elicitation in HCI and identify three types of agreement analysis. We show that agreement is a mathematical tolerance relation, generating a tolerance space over the set of elicited proposals, and we review current measures of agreement, showing that all can be computed from this representation. In response to recent criticisms, we show that chance agreement represents an issue solely for inter-rater reliability studies and not for end-user elicitation. We conduct extensive simulations of 16 statistical tests for agreement rates, and report Type I error rates and statistical power. Based on our findings, we provide recommendations for practitioners and introduce a five-level hierarchy for elicitation studies.
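To make the notion of an agreement rate concrete, here is a short sketch of the widely used agreement rate formula for elicitation studies, AR = Σᵢ |Pᵢ|(|Pᵢ|−1) / (|P|(|P|−1)), where the Pᵢ are groups of identical proposals for one referent (Vatavu and Wobbrock's formulation). This is one common measure, not necessarily every measure the article reviews.

```python
from collections import Counter

def agreement_rate(proposals):
    """Agreement rate for one referent: the probability that two
    distinct participants, drawn at random, made identical proposals."""
    n = len(proposals)
    if n < 2:
        return 1.0  # a single proposal trivially agrees with itself
    counts = Counter(proposals)
    # Ordered pairs of distinct participants who agree, over all
    # ordered pairs of distinct participants.
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

# Example: 10 participants propose gestures for one referent.
signs = ["swipe"] * 6 + ["tap"] * 3 + ["pinch"]
print(agreement_rate(signs))  # (6*5 + 3*2) / (10*9) = 0.4
```

Statistical tests for agreement rates, such as those simulated in the study above, compare values like this across referents or between participant groups.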
We present a long-term study of the Messaging Kettle, an Internet of Things (IoT) research prototype that augments an everyday kettle with both sensing and messaging capability and a beautiful light display, in order to investigate connecting geographically distant loved ones to their family through the routine of boiling the kettle. Connection at a distance has been of sustained interest to the CHI community, and the social connection of older people is of increasing importance given globally ageing populations. However, very few novel designs in this domain have been investigated in situ or over the long term to examine whether their use is sustained and, if so, how they affect communication in a relationship. The Messaging Kettle was trialled with four pairs of dispersed older mothers and adult daughters over timeframes ranging from two months to more than two years. We observed a phenomenon wherein each party creatively made the technology work for them both through the gradual transformation of their everyday practices, arrangements, and living. Through developing these joint practices over time, participants expressed feelings of connection that nurtured their relationship at a distance. Three of the four pairs continued to use the prototype for years beyond the initial trial. We reflect on the artful integration of the Messaging Kettle's features and the way in which these features supported connection at a distance, and we draw lessons and implications for the design of such relational technologies.
Autistic teenagers are suspected to be more vulnerable to privacy and safety threats on social networking sites (SNS) than the general population. However, there are no studies comparing these users’ privacy and safety concerns and protective strategies online with those reported by non-autistic teenagers. Furthermore, researchers have yet to identify possible explanations for autistic teenagers’ increased risk of online harms. To address these research gaps, we conducted semi-structured interviews with 12 autistic and 16 non-autistic teenagers, assessing their privacy- and safety-related attitudes and behaviors on SNS and the factors affecting them. We used videos demonstrating relevant SNS scenarios as prompts to engage participants in conversation. Through our thematic analyses, we found evidence that autistic teenagers may be more averse to taking risks on SNS than non-autistic teenagers. Yet, several personal, social, and SNS design factors may make autistic teenagers more vulnerable to cyberbullying and social exclusion online. We provide recommendations for making SNS safer for autistic teenagers. Our research highlights the need for more inclusive usable privacy and security research with this population.
Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often attend only to the code and neglect to create or update documentation during quick iterations. Inspired by human documentation practices learned from 80 highly voted Kaggle notebooks, we design and implement Themisto, an automated documentation generation system, to explore how human-centered AI systems can support data scientists in documenting machine learning code. Themisto facilitates the creation of documentation via three approaches: a deep-learning-based approach that generates documentation for source code, a query-based approach that retrieves online API documentation for source code, and a user prompt approach that nudges users to write documentation. We evaluated Themisto in a within-subjects experiment with 24 data science practitioners, and found that automated documentation generation techniques reduced the time for writing documentation, reminded participants to document code they would otherwise have ignored, and improved participants’ satisfaction with their computational notebooks.
The standard definition of a “physicalization” is “a physical artifact whose geometry or material properties encode data.” While this working definition provides the fundamental groundwork for conceptualizing physicalization, in practice many physicalization systems go beyond its scope, as they consist of distributed physical and digital elements that involve complex interaction mechanisms. In this article, we examine how “physicalization” is part of a broader ecology—the “physecology”—with properties that go beyond the scope of the working definition. Through analyzing 60 representative physicalization papers, we derived six design dimensions of a physecology: (i) represented data type, (ii) way of information communication, (iii) interaction mechanisms, (iv) spatial input–output coupling, (v) physical setup, and (vi) audiences involved. Our contribution is the extension of the definition of physicalization to the broader concept of “physecology,” providing conceptual clarity on the design of physicalizations for future work.