Importance of the Open Data Assessment: An Insight Into the (Meta) Data Quality Dimensions

SAGE Open ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 215824402110231
Author(s):  
Barbara Šlibar ◽  
Dijana Oreški ◽  
Nina Begičević Ređep

Data are the most important resource of the 21st century. The open data (OD) movement provides publicly available data for the development of a knowledge-based society. As such, OD is a valuable information technology (IT) tool that adds value to economic, social, and human development. To further develop these processes on a global scale, users need to manage the quality of OD in their practices. Otherwise, what is the point of using data (in science or practice) just for the sake of using them, without considering whether the data comply with norms, standards, and so forth? This article aims to provide an overview of the (meta)data quality dimensions, sub-dimensions, and metrics used in OD assessment research. To achieve this, the authors performed a systematic literature review (SLR) and extracted data from 86 relevant studies dealing with the evaluation of OD. The article presents the progress made so far in OD assessment research. Reviewing OD assessment in light of existing (meta)data quality dimensions reveals the potential of metadata. Furthermore, the analysis revealed the need for greater use of quantitative methods in research, and metadata can greatly assist in this.

2018 ◽  
Vol 44 (6) ◽  
pp. 785-801
Author(s):  
Hong Huang

This article aims to understand the views of genomic scientists with regard to the data quality assurances associated with semiotics and data–information–knowledge (DIK). The communication of signs generated from genomic curation work was found at different semantic levels of DIK that correlate specific data quality dimensions with their respective skills. Syntactic data quality dimensions were ranked the highest among all semiotic data quality dimensions, which indicates that scientists spend great effort on data wrangling activities in genome curation work. Semantic- and pragmatic-related sign communications concerned meaningful interpretation and thus required additional adaptive and interpretative skills to deal with data quality issues. This expanded concept of ‘curation’ as sign/semiotic has not previously been explored from practical to theoretical perspectives. The findings can inform policy makers and practitioners in developing frameworks and cyberinfrastructure that facilitate the ‘Big Data to Knowledge’ initiatives and advocacy of funding agencies. The findings can also help plan data quality assurance policies and thus maximise the efficiency of genomic data management. Our results strongly support the relevance of data quality skills communication to data quality assurance in genome curation activities.


2016 ◽  
Vol 28 (6) ◽  
pp. 933-953
Author(s):  
Patrícia Moura e Sá ◽  
Rita Martins

Purpose: The purpose of this paper is to uncover customers’ concerns with the information disclosed in water services invoices and to analyse them with reference to the data quality dimensions usually proposed in the literature. In the context of services of general interest, invoices are particularly relevant as a vehicle to convey information to all consumers.
Design/methodology/approach: Based on the principles of quality planning, the research uses a qualitative approach to identify the data quality requirements of water invoices. Customer voices were collected by means of focus groups and their meanings analysed using an affinity diagram.
Findings: Findings show that plain language efforts and strategies to enhance trust in the service provided need to be further reinforced. Consumers’ requirements, together with the regulator’s recommendations, also confirm the data quality dimensions identified in the literature.
Practical implications: This research highlights that avoiding technical language and making visible the consequences of different consumption levels for the amounts to be paid are essential when designing water invoices. Moreover, it emphasises that there is still room for improvement in the way the economic regulator performs its role in ensuring the provision of sound information.
Originality/value: This research addresses a literature gap by conducting a study on data quality requirements outside the context of information systems for organisations. The study is original because it looks at water invoices as a “product” that can be designed to meet the needs of their users.


2021 ◽  
Author(s):  
Sylvia Cho ◽  
Chunhua Weng ◽  
Michael G Kahn ◽  
Karthik Natarajan

BACKGROUND: There is growing interest in using person-generated wearable device data for biomedical research, but concerns exist about the quality of the data, such as missing or incorrect data. This emphasizes the importance of assessing data quality prior to conducting research. In order to perform data quality assessments, it is essential to define what data quality means for person-generated wearable device data by identifying data quality dimensions.
OBJECTIVE: The goal of this study was to identify data quality dimensions for person-generated wearable device data for research purposes.
METHODS: The study was conducted in three phases: (1) literature review, (2) survey, and (3) focus group discussion. The literature review was conducted following the PRISMA guideline to identify factors affecting data quality and their associated data quality challenges. In addition, a survey was conducted to confirm and complement results from the literature review, and to understand researchers’ perceptions of data quality dimensions that were previously identified as dimensions for the secondary use of electronic health record (EHR) data. The survey was sent to researchers with experience in analyzing wearable device data. Focus group discussion sessions were conducted with domain experts to derive data quality dimensions for person-generated wearable device data. Based on the results from the literature review and survey, a facilitator proposed potential data quality dimensions relevant to person-generated wearable device data, and the domain experts accepted or rejected the suggested dimensions.
RESULTS: Nineteen studies were included in the literature review. Three major themes emerged: device- and technical-related, user-related, and data governance-related factors. Associated data quality problems were incomplete data, incorrect data, and heterogeneous data. Twenty respondents answered the survey. Major data quality challenges faced by researchers were completeness, accuracy, and plausibility. The importance ratings on data quality dimensions in an existing framework showed that dimensions for secondary use of EHR data are applicable to person-generated wearable device data. There were three focus group sessions with domain experts in data quality and wearable device research. The experts concluded that intrinsic data quality features such as conformance, completeness, and plausibility, and contextual/fitness-for-use data quality features such as completeness (breadth and density) and temporal data granularity, are important data quality dimensions for assessing person-generated wearable device data for research purposes.
CONCLUSIONS: In this study, intrinsic and contextual/fitness-for-use data quality dimensions for person-generated wearable device data were identified. The dimensions were adapted from data quality terminologies and frameworks for the secondary use of EHR data with a few modifications. Further research on how data quality can be assessed with regard to each dimension is needed.

