Emerging problems of data quality in citizen science

2016 ◽  
Vol 30 (3) ◽  
pp. 447-449 ◽  
Author(s):  
Roman Lukyanenko ◽  
Jeffrey Parsons ◽  
Yolanda F. Wiersma
Mathematics ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 875
Author(s):  
Jesus Cerquides ◽  
Mehmet Oğuz Mülâyim ◽  
Jerónimo Hernández-González ◽  
Amudha Ravi Shankar ◽  
Jose Luis Fernandez-Marquez

Over the last decade, hundreds of thousands of volunteers have contributed to science by collecting or analyzing data. This public participation in science, also known as citizen science, has contributed to significant discoveries and led to publications in major scientific journals. However, little attention has been paid to data quality issues. In this work we argue that determining the accuracy of data obtained by crowdsourcing is a fundamental question, and we point out that, for many real-life scenarios, mathematical tools and processes for the evaluation of data quality are missing. We propose a probabilistic methodology for evaluating the accuracy of labeling data obtained by crowdsourcing in citizen science. The methodology builds on an abstract probabilistic graphical model formalism, which is shown to generalize some existing label aggregation models. We show how to make practical use of the methodology through a comparison of data obtained from different citizen science communities analyzing the earthquake that took place in Albania in 2019.
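The family of label aggregation models the abstract refers to can be illustrated with a minimal "one-coin" sketch, a simplification in the spirit of Dawid–Skene in which each volunteer has a single unknown accuracy estimated by EM. The data format, volunteer names, and uniform priors below are illustrative assumptions, not the authors' actual methodology.

```python
# Minimal one-coin label aggregation via EM for binary labels.
# votes[i] maps volunteer -> 0/1 label for item i (hypothetical format).

def aggregate(labels, n_iter=50):
    volunteers = {v for item in labels for v in item}
    # Initialise each item's posterior P(true label = 1) with the mean vote.
    post = [sum(item.values()) / len(item) for item in labels]
    acc = {v: 0.8 for v in volunteers}  # initial guess at each accuracy
    for _ in range(n_iter):
        # M-step: re-estimate each volunteer's accuracy from soft labels.
        for v in volunteers:
            num, den = 0.0, 0.0
            for q, item in zip(post, labels):
                if v in item:
                    num += q if item[v] == 1 else (1 - q)
                    den += 1
            acc[v] = num / den
        # E-step: recompute the posterior for each item (uniform prior).
        new_post = []
        for item in labels:
            p1, p0 = 1.0, 1.0
            for v, y in item.items():
                p1 *= acc[v] if y == 1 else (1 - acc[v])
                p0 *= (1 - acc[v]) if y == 1 else acc[v]
            new_post.append(p1 / (p1 + p0))
        post = new_post
    return post, acc

# Example: three items labelled by volunteers "a", "b", "c";
# volunteer "c" disagrees with the others on the first item.
votes = [
    {"a": 1, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 1},
]
posteriors, accuracy = aggregate(votes)
```

Unlike simple majority voting, this jointly infers the consensus labels and each contributor's reliability, so a persistently disagreeing volunteer ends up down-weighted.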


2021 ◽  
Vol 444 ◽  
pp. 109453
Author(s):  
Camille Van Eupen ◽  
Dirk Maes ◽  
Marc Herremans ◽  
Kristijn R.R. Swinnen ◽  
Ben Somers ◽  
...  

2021 ◽  
Vol 10 (4) ◽  
pp. 207
Author(s):  
Annie Gray ◽  
Colin Robertson ◽  
Rob Feick

Citizen science initiatives span a wide range of topics, designs, and research needs. Despite this heterogeneity, there are several common barriers to the uptake and sustainability of citizen science projects and the information they generate. One key barrier often cited in the citizen science literature is data quality. Open-source tools for the analysis, visualization, and reporting of citizen science data hold promise for addressing the challenge of data quality, while providing other benefits such as technical capacity-building, increased user engagement, and reinforced data sovereignty. We developed an operational citizen science tool called the Community Water Data Analysis Tool (CWDAT), an R/Shiny-based web application designed for community-based water quality monitoring. Surveys and facilitated user engagement were conducted among stakeholders during the development of CWDAT. Targeted recruitment was used to gather feedback on the initial CWDAT prototype’s interface, features, and potential to support capacity building in the context of community-based water quality monitoring. Fourteen of thirty-two invited individuals (response rate 44%) contributed feedback via a survey or through facilitated interaction with CWDAT, with eight individuals interacting directly with CWDAT. Overall, CWDAT was received favourably. Participants requested updates and modifications, such as water quality thresholds and indices, that reflected well-known barriers to citizen science initiatives related to data quality assurance and the generation of actionable information. Our findings support calls to engage end-users directly in citizen science tool design and highlight how design can contribute to users’ understanding of data quality. Enhanced citizen participation in water resource stewardship, facilitated by tools such as CWDAT, may promote greater community engagement with and acceptance of water resource management and policy-making.
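The threshold feature participants requested might look like the following sketch. CWDAT itself is an R/Shiny application; this Python fragment, its parameter names, and its guideline values are purely illustrative assumptions, not CWDAT's actual thresholds.

```python
# Hypothetical guideline thresholds: parameter -> (max acceptable value, unit).
THRESHOLDS = {
    "turbidity": (5.0, "NTU"),
    "nitrate": (10.0, "mg/L"),
}

def flag_sample(sample):
    """Return human-readable flags for parameters exceeding their threshold."""
    flags = []
    for name, value in sample.items():
        if name in THRESHOLDS and value > THRESHOLDS[name][0]:
            limit, unit = THRESHOLDS[name]
            flags.append(f"{name}: {value} exceeds {limit} {unit}")
    return flags

# A sample with elevated turbidity yields one flag and actionable text.
flags = flag_sample({"turbidity": 7.2, "nitrate": 3.1})
```

Surfacing exceedances as plain-language messages, rather than raw numbers, is one way a tool can turn monitoring data into the "actionable information" participants asked for.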


2017 ◽  
Vol 16 (02) ◽  
pp. C05
Author(s):  
Stuart Allan ◽  
Joanna Redden

This article examines certain guiding tenets of science journalism in the era of big data by focusing on its engagement with citizen science. Having placed citizen science in historical context, it highlights early interventions intended to help establish the basis for an alternative epistemological ethos recognising the scientist as citizen and the citizen as scientist. Next, the article assesses further implications for science journalism by examining the challenges posed by big data in the realm of citizen science. Pertinent issues include potential risks associated with data quality, access dynamics, the difficulty of investigating algorithms, and concerns about certain constraints on transparency and accountability.


Author(s):  
Emily Baker ◽  
Jonathan Drury ◽  
Johanna Judge ◽  
David Roy ◽  
Graham Smith ◽  
...  

Citizen science schemes (projects) enable ecological data collection over very large spatial and temporal scales, producing datasets of high value for both pure and applied research. However, the accuracy of citizen science data is often questioned, owing to issues surrounding data quality and verification, the process by which records are checked after submission for correctness. Verification is a critical process for ensuring data quality and for increasing trust in such datasets, but verification approaches vary considerably among schemes. Here, we systematically review approaches to verification across ecological citizen science schemes featured in published research, aiming to identify the options available for verification and to examine the factors that influence the approaches used (Baker et al. 2021). We reviewed 259 schemes and were able to locate verification information for 142 of those. Expert verification was most widely used, especially among longer-running schemes. Community consensus was the second most common verification approach, used by schemes such as Snapshot Serengeti (Swanson et al. 2016) and MammalWeb (Hsing et al. 2018). It was more common among schemes with a larger number of participants and where photos or video had to be submitted with each record. Automated verification was not widely used among the schemes reviewed. Schemes that used automation, such as eBird (Kelling et al. 2011) and Project FeederWatch (Bonter and Cooper 2012), did so in conjunction with other methods such as expert verification. Expert verification has been the default approach for schemes in the past, but as the volume of data collected through citizen science schemes grows and the potential of automated approaches develops, many schemes might be able to implement approaches that verify data more efficiently.
We present an idealised, hierarchical system for data verification, in which the bulk of records are verified by automation or community consensus and any flagged records then undergo additional levels of verification by experts. We identify schemes where this hierarchical system could be applied and the requirements for its implementation.
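The hierarchy described above can be sketched as a simple triage function. The tier order follows the abstract; the field name, thresholds, and vote counts below are illustrative assumptions, not the authors' specification.

```python
# Route a record through automated, community, and expert verification tiers.
def verify(record, auto_confidence, community_votes,
           auto_threshold=0.95, consensus_ratio=0.8, min_votes=5):
    # Tier 1: accept records the automated classifier is very sure about.
    if auto_confidence >= auto_threshold:
        return "verified:automated"
    # Tier 2: accept records with a strong community consensus.
    if len(community_votes) >= min_votes:
        agree = community_votes.count(record["claimed_species"])
        if agree / len(community_votes) >= consensus_ratio:
            return "verified:community"
    # Tier 3: everything else is flagged for an expert verifier.
    return "flagged:expert"
```

The design intent is that experts see only the residue the cheaper tiers cannot settle, so expert effort scales with the hard cases rather than with total data volume.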


Author(s):  
Tom August ◽  
J Terry ◽  
David Roy

The rapid rise of Artificial Intelligence (AI) methods has presented new opportunities for those who work with biodiversity data. Computer vision in particular, where computers can be trained to identify species in digital photographs, has significant potential to address a number of existing challenges in citizen science. The Biological Records Centre (www.brc.ac.uk) has been a central focus for terrestrial and freshwater citizen science in the United Kingdom for over 50 years. We will present our research on how computer vision can be embedded in citizen science, addressing three important questions. How can contextual information, such as time of year, be included in computer vision? A naturalist will use a wealth of ecological knowledge about species, in combination with information about where and when the image was taken, to augment their decision making; we should emulate this in our AI. How can citizen scientists be best supported by computer vision? Our ambition is not to replace identification skills with AI but to use AI to support the learning process. How can computer vision support our limited resource of expert verifiers as data volumes increase? We receive more data each year, which puts a greater demand on our expert verifiers, who review all records to ensure data quality. We have been exploring how computer vision can lighten this workload. We will present work that addresses these questions, including: developing machine learning techniques that incorporate ecological information as well as images to arrive at a species classification; co-designing an identification tool to help farmers identify flowers beneficial to wildlife; and assessing the optimal combination of computer vision and expert verification to improve our verification systems.
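One simple way contextual information could enter a classifier is to re-weight the vision model's scores by a seasonal occurrence prior, in the Bayesian spirit of the naturalist's reasoning described above. The species names, probabilities, and prior format below are illustrative assumptions, not the authors' actual technique.

```python
# Combine vision-model scores with a seasonal prior for the record's date.
def contextual_classify(vision_scores, seasonal_prior):
    """Re-rank scores by how likely each species is at this time of year."""
    combined = {sp: vision_scores[sp] * seasonal_prior.get(sp, 0.0)
                for sp in vision_scores}
    total = sum(combined.values())
    return {sp: p / total for sp, p in combined.items()}

# An April record: the vision model slightly prefers a species rarely seen
# in spring, and the seasonal prior corrects the ranking.
vision = {"orange_tip": 0.45, "clouded_yellow": 0.55}
april_prior = {"orange_tip": 0.9, "clouded_yellow": 0.1}
posterior = contextual_classify(vision, april_prior)
```

In a production system the prior would itself be estimated from historical records, which is one way decades of verified citizen science data can feed back into the AI.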


PLoS ONE ◽  
2019 ◽  
Vol 14 (2) ◽  
pp. e0211907 ◽  
Author(s):  
Marina Torre ◽  
Shinnosuke Nakayama ◽  
Tyrone J. Tolbert ◽  
Maurizio Porfiri

PLoS ONE ◽  
2019 ◽  
Vol 14 (6) ◽  
pp. e0218086 ◽  
Author(s):  
Daniel Langenkämper ◽  
Erik Simon-Lledó ◽  
Brett Hosking ◽  
Daniel O. B. Jones ◽  
Tim W. Nattkemper
