Visualizing data quality: tools and views

2011 ◽  
Vol 4 (0) ◽  
Author(s):  
Ian Painter ◽  
Julie Eaton ◽  
Don Olson ◽  
William Lober ◽  
Debra Revere
2019 ◽  
Vol 214 ◽  
pp. 01030
Author(s):  
Juraj Smiesko

The integrated system for data quality and conditions assessment for the ATLAS Tile Calorimeter is known within the ATLAS Tile Calorimeter community as Tile-in-One. It is a platform that combines all of the ATLAS Tile Calorimeter offline data quality tools in one unified web interface. It achieves this by using a simple main web server as a central hub, together with a group of small web applications, called plugins, which provide the data quality assessment tools. Every plugin runs in its own virtual machine, both to prevent interference between plugins and to increase the stability of the platform.


Author(s):  
Thomas N. Herzog ◽  
Fritz J. Scheuren ◽  
William E. Winkler

Author(s):  
Thomas Foken ◽  
Mathias Göckede ◽  
Johannes Lüers ◽  
Lukas Siebicke ◽  
Corinna Rebmann ◽  
...  

2021 ◽  
Author(s):  
Aaron J Moss ◽  
Cheskie Rosenzweig ◽  
Shalom Noach Jaffe ◽  
Richa Gautam ◽  
Jonathan Robinson ◽  
...  

Online data collection has become indispensable to the social sciences, polling, marketing, and corporate research. However, in recent years, online data collection has been inundated with low-quality data. Low-quality data threatens the validity of online research and, at times, invalidates entire studies. It is often assumed that random, inconsistent, and fraudulent data in online surveys comes from ‘bots.’ But little is known about whether bad data is caused by bots or by ill-intentioned or inattentive humans. We examined this issue on Mechanical Turk (MTurk), a popular online data collection platform. In the summer of 2018, researchers noticed a sharp increase in the number of data quality problems on MTurk, problems that were commonly attributed to bots. Despite this assumption, few studies have directly examined whether problematic data on MTurk come from bots or from inattentive humans, even though identifying the source of bad data has important implications for creating the right solutions. Using CloudResearch’s data quality tools to identify problematic participants in 2018 and 2020, we provide evidence that many of the data quality problems on MTurk can be tied to fraudulent users from outside of the U.S. who pose as American workers. Hence, our evidence strongly suggests that the source of low-quality data is real humans, not bots. We additionally present evidence that these fraudulent users are behind data quality problems on other platforms.

