Georeferencing and data quality: SANBI’s story

Biodiversity Information Science and Standards ◽

10.3897/biss.2.25310 ◽

2018 ◽

Vol 2 ◽

pp. e25310

Author(s):

Fhatani Ranwashe

Keyword(s):

Data Quality ◽

South African ◽

Controlled Vocabulary ◽

Quality Of Data ◽

Biodiversity Data ◽

Darwin Core ◽

Data Collections ◽

Biodiversity Information ◽

Locality Information

Georeferencing helps to fill in biodiversity information gaps, allowing biodiversity data to be represented spatially so that valuable assessments can be conducted. The South African National Biodiversity Institute (SANBI) has embarked on a number of projects that have required the georeferencing of biodiversity data to assist in assessments for red-listing of species and measuring the protection levels of species. Data quality is an important aspect of biodiversity information. Due to a lack of standardisation in collection and recording methods, historical biodiversity data collections pose a challenge when it comes to ascertaining fitness for use or determining the quality of data. The quality of historical locality information recorded in biodiversity data collections faces scrutiny of its fitness for use, as this information is critical in performing assessments. A lack of descriptive locality information, or ambiguous locality information, renders most historical biodiversity records unfit for use. Georeferencing should essentially improve the quality of biodiversity data, but how do you measure the fitness for use of georeferenced data? Through the use of the Darwin Core term coordinateUncertaintyInMeters, georeferenced data can be queried to investigate and determine the quality of the georeferenced data produced. My presentation will cover the scope of ascertaining georeferenced data quality through the use of the Darwin Core term coordinateUncertaintyInMeters, the impacts of using a controlled vocabulary in representing the coordinateUncertaintyInMeters, and will highlight how SANBI’s georeferencing efforts have contributed to data quality within the management of biodiversity information.
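
As an illustration of querying georeferenced records by coordinateUncertaintyInMeters, the following is a minimal Python sketch. The class labels and cut-off values are hypothetical and are not SANBI's controlled vocabulary; the record layout (a dictionary keyed by Darwin Core term names) is likewise an assumption.

```python
import math
from collections import Counter

# Hypothetical controlled vocabulary: (upper bound in metres, label).
# These cut-offs are illustrative assumptions, not SANBI's actual classes.
UNCERTAINTY_CLASSES = [
    (1_000, "high precision"),
    (10_000, "medium precision"),
    (50_000, "low precision"),
]


def classify_uncertainty(value):
    """Map a raw coordinateUncertaintyInMeters value to a quality class."""
    if value is None or str(value).strip() == "":
        return "not recorded"
    try:
        metres = float(value)
    except (TypeError, ValueError):
        return "invalid"
    # Non-finite, zero and negative values are treated as invalid here.
    if not math.isfinite(metres) or metres <= 0:
        return "invalid"
    for upper_bound, label in UNCERTAINTY_CLASSES:
        if metres <= upper_bound:
            return label
    return "very low precision"


def summarise(records):
    """Count records per quality class for a list of Darwin Core-style dicts."""
    return Counter(
        classify_uncertainty(r.get("coordinateUncertaintyInMeters"))
        for r in records
    )
```

A summary of this kind gives a quick profile of how many georeferenced records fall into each uncertainty class, which is one possible way to judge fitness for a given assessment.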

Developing Standards for Improved Data Quality and for Selecting Fit for Use Biodiversity Data

Biodiversity Information Science and Standards ◽

10.3897/biss.4.50889 ◽

2020 ◽

Vol 4 ◽

Cited By ~ 4

Author(s):

Arthur Chapman ◽

Lee Belbin ◽

Paula Zermoglio ◽

John Wieczorek ◽

Paul Morris ◽

...

Keyword(s):

Data Quality ◽

Species Distribution Modelling ◽

Use Cases ◽

Biodiversity Data ◽

Global Biodiversity Information Facility ◽

As Species ◽

Darwin Core ◽

Core Set ◽

Event Date ◽

Biodiversity Information

The quality of biodiversity data publicly accessible via aggregators such as GBIF (Global Biodiversity Information Facility), the ALA (Atlas of Living Australia), iDigBio (Integrated Digitized Biocollections), and OBIS (Ocean Biogeographic Information System) is often questioned, especially by the research community. The Data Quality Interest Group, established by Biodiversity Information Standards (TDWG) and GBIF, has been engaged in four main activities: developing a framework for the assessment and management of data quality using a fitness for use approach; defining a core set of standardised tests and associated assertions based on Darwin Core terms; gathering and classifying user stories to form contextual-themed use cases, such as species distribution modelling, agrobiodiversity, and invasive species; and developing a standardised format for building and managing controlled vocabularies of values. Using the developed framework, data quality profiles have been built from use cases to represent user needs. Quality assertions can then be used to filter data suitable for a purpose. The assertions can also be used to provide feedback to data providers and custodians to assist in improving data quality at the source. In a case study, two different implementations of tests and assertions based around the Darwin Core "Event Date" terms were also run against GBIF data, to demonstrate that the tests are implementation agnostic, can be run on large aggregated datasets, and can make biodiversity data more fit for typical research uses.
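
The idea of a test that yields an assertion, which is then used to filter records for a purpose, can be sketched in Python as follows. This is not one of the group's standardised tests: the check name, the status and result labels, the default date bounds, and the restriction to plain YYYY-MM-DD values (no date ranges or partial dates) are all assumptions made for illustration.

```python
from datetime import date


def check_eventdate_in_range(record, earliest=date(1600, 1, 1), latest=None):
    """Return an assertion (a dict) about the record's eventDate.

    Hypothetical check: labels and default bounds are illustrative only.
    """
    latest = latest or date.today()
    raw = (record.get("eventDate") or "").strip()
    report = {
        "check": "eventdate_in_range",
        "status": "PREREQUISITES_NOT_MET",
        "result": None,
    }
    if not raw:
        report["comment"] = "eventDate is empty"
        return report
    try:
        parsed = date.fromisoformat(raw)
    except ValueError:
        report["comment"] = "eventDate is not a single ISO calendar date"
        return report
    report["status"] = "HAS_RESULT"
    in_range = earliest <= parsed <= latest
    report["result"] = "COMPLIANT" if in_range else "NOT_COMPLIANT"
    return report


def fit_for_use(records, checks):
    """Keep only the records that every check in the profile marks COMPLIANT."""
    return [
        r for r in records
        if all(check(r)["result"] == "COMPLIANT" for check in checks)
    ]
```

Because each check returns a structured assertion instead of silently dropping a record, the same output could also be passed back to a data provider as feedback.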

Designing Information Product (IP) Maps On the Process of Data Processing and Academic Information

International Journal of New Media Technology ◽

10.31937/ijnmt.v4i1.534 ◽

2017 ◽

Vol 4 (1) ◽

pp. 25-31 ◽

Cited By ~ 1

Author(s):

Diana Effendi

Keyword(s):

Data Quality ◽

Data Management ◽

Information Management ◽

Information Quality ◽

Quality Data ◽

Management Approach ◽

Quality Of Data ◽

Information Product ◽

Academic Activities

The Information Product Approach (IP Approach) is an information management approach. It can be used to manage product information and to analyze data quality. An IP-Map can be used by organizations to facilitate the management of knowledge in collecting, storing, maintaining, and using data in an organized way. The data management process for academic activities at X University has not yet used the IP Approach, and X University has not paid attention to managing the quality of its information. So far, X University has been concerned only with the system applications used to support the automation of data management in the process of academic activities. The IP-Map produced in this paper can be used as a basis for analyzing the quality of data and information. With the IP-Map, X University is expected to learn which parts of the process need improvement in the quality of data and information management. Index terms: IP Approach, IP-Map, information quality, data quality.

Response Behavior and Quality of Survey Data: Comparing Elderly Respondents in Institutions and Private Households

Sociological Methods & Research ◽

10.1177/0049124121995534 ◽

2021 ◽

pp. 004912412199553

Author(s):

Jan-Lucas Schanze

Keyword(s):

Data Quality ◽

The Elderly ◽

Response Behavior ◽

Quality Of Data ◽

Social Surveys ◽

Private Households ◽

Confounding Variables ◽

Health Related ◽

Survey Interviews

An increasing age of respondents and cognitive impairment are usual suspects for increasing difficulties in survey interviews and a decreasing data quality. This is why survey researchers tend to label residents in retirement and nursing homes as hard-to-interview and exclude them from most social surveys. In this article, I examine to what extent this label is justified and whether quality of data collected among residents in institutions for the elderly really differs from data collected within private households. For this purpose, I analyze the response behavior and quality indicators in three waves of Survey of Health, Ageing and Retirement in Europe. To control for confounding variables, I use propensity score matching to identify respondents in private households who share similar characteristics with institutionalized residents. My results confirm that most indicators of response behavior and data quality are worse in institutions compared to private households. However, when controlling for sociodemographic and health-related variables, differences get very small. These results suggest the importance of health for the data quality irrespective of the housing situation.
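
The matching step of propensity score matching can be sketched in Python as below. This is a simplified illustration and not the article's procedure: it assumes the propensity scores have already been estimated elsewhere (for example with a logistic regression on sociodemographic and health variables), and the greedy 1:1 nearest-neighbour strategy, the caliper value, and the identifiers are all hypothetical.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping a respondent id to a propensity score.
    Returns a list of (treated_id, control_id) pairs whose scores differ
    by at most `caliper`. Treated units without a close match are dropped.
    """
    available = dict(controls)  # copy, so the caller's dict is not modified
    pairs = []
    # Process treated units in order of score to make the result deterministic.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs
```

In the setting described above, `treated` would hold residents of institutions and `controls` respondents in private households; quality indicators would then be compared within the matched pairs.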

Integrating Experimental and Analytic Approaches to Improve Data Quality in Genome-wide RNAi Screens

CrossRef Listing of Deleted DOIs ◽

10.1177/1087057108317145 ◽

2008 ◽

Vol 13 (5) ◽

pp. 378-389 ◽

Cited By ~ 26

Author(s):

Xiaohua Douglas Zhang ◽

Amy S. Espeseth ◽

Eric N. Johnson ◽

Jayne Chin ◽

Adam Gates ◽

...

Keyword(s):

Data Quality ◽

High Throughput Screening ◽

Daily Practice ◽

Systematic Errors ◽

Quality Of Data ◽

Improve Data Quality ◽

Research Journal ◽

Genome Wide ◽

Assess Data Quality

RNA interference (RNAi) not only plays an important role in drug discovery but can also be developed directly into drugs. RNAi high-throughput screening (HTS) biotechnology allows us to conduct genome-wide RNAi research. A central challenge in genome-wide RNAi research is to integrate both experimental and computational approaches to obtain high quality RNAi HTS assays. Based on our daily practice in RNAi HTS experiments, we propose the implementation of 3 experimental and analytic processes to improve the quality of data from RNAi HTS biotechnology: (1) select effective biological controls; (2) adopt appropriate plate designs to display and/or adjust for systematic errors of measurement; and (3) use effective analytic metrics to assess data quality. The applications in 5 real RNAi HTS experiments demonstrate the effectiveness of integrating these processes to improve data quality. Due to the effectiveness in improving data quality in RNAi HTS experiments, the methods and guidelines contained in the 3 experimental and analytic processes are likely to have broad utility in genome-wide RNAi research. (Journal of Biomolecular Screening 2008:378-389)
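
The abstract does not name the analytic metrics it recommends. As an assumption, the Python sketch below computes two plate-quality metrics that are widely used in high-throughput screening, the Z'-factor and the strictly standardized mean difference (SSMD), from positive-control and negative-control well readings. The function names and the example inputs are hypothetical.

```python
import math
import statistics


def z_prime_factor(positive, negative):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(statistics.mean(positive) - statistics.mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z'-factor is undefined")
    spread = statistics.stdev(positive) + statistics.stdev(negative)
    return 1 - 3 * spread / separation


def ssmd(positive, negative):
    """SSMD for independent groups: (mean_pos - mean_neg) / sqrt(var_pos + var_neg)."""
    pooled = statistics.variance(positive) + statistics.variance(negative)
    if pooled == 0:
        raise ValueError("both controls have zero variance; SSMD is undefined")
    return (statistics.mean(positive) - statistics.mean(negative)) / math.sqrt(pooled)
```

Both functions use sample standard deviations and need at least two readings per control group. Values closer to 1 for the Z'-factor, or larger in magnitude for SSMD, indicate better separation between the controls on a plate.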

Improving the Quality of Land Parcel Data at the South Jakarta Administrative City Land Office (Peningkatan Kualitas Data Bidang Tanah di Kantor Pertanahan Kota Administrasi Jakarta Selatan)

Tunas Agraria ◽

10.31292/jta.v4i2.143 ◽

2021 ◽

Vol 4 (2) ◽

pp. 168-174

Author(s):

Maslusatun Mawadah

Keyword(s):

Quality Improvement ◽

Data Quality ◽

Research Method ◽

The South ◽

Quality Of Data ◽

Regional Division ◽

Land Administration ◽

Problems And Solutions ◽

Descriptive Approach

The South Jakarta Administrative City Land Office serves one of the cities targeted to have complete land administration in 2020. The current condition of land parcel data demands an update, namely improving the quality of data from KW1 to KW6 towards valid KW1. The purpose of this study is to determine the condition of land data quality in South Jakarta, the implementation of data quality improvement, and the problems and solutions in implementing it. The research method used is qualitative with a descriptive approach. The results showed that, after the improvement was implemented, the share of KW1 data increased from 86.45% to 87.01%. The roles of man, material, machine, and method have been fulfilled, but the implementation of data quality improvement is not in accordance with the 2019 Complete City Guidelines in terms of territorial boundary inventory. There are still obstacles to improving the quality of land parcel data, namely the absence of buku tanah (land books), surat ukur (measurement letters), and gambar ukur (measurement drawings) at the land office, the existence of regional division, sub-district boundaries that are not yet certain, and land parcels that have been separated from mapping without being noticed by the office administrator.

Data Quality Associated with Big Data Processing: A Survey

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/05386 ◽

2021 ◽

Vol 23 (06) ◽

pp. 1011-1018

Author(s):

Aishrith P Rao ◽

Raghavendra J C ◽

Dr. Sowmyarani C N ◽

Dr. Padmashree T ◽

...

Keyword(s):

Big Data ◽

Data Quality ◽

Data Gathering ◽

Cost Effective ◽

Data Repository ◽

Critical Approach ◽

Critical Aspect ◽

Quality Of Data ◽

Data Group

With the advancement of technology and the large volume of data produced, processed, and stored, it is becoming increasingly important to maintain the quality of data in a cost-effective and productive manner. The most important aspects of Big Data (BD) are storage, processing, privacy, and analytics. The Big Data group has identified quality as a critical aspect of its maturity. Nonetheless, it is a critical approach that should be adopted early in the lifecycle and gradually extended to other primary processes. Companies rely heavily on, and derive profits from, the huge amounts of data they collect. When the consistency of that data deteriorates, the ramifications are uncertain and may result in completely undesirable conclusions. In the context of BD, determining data quality is difficult, but it is essential that we uphold data quality before we proceed with any analytics. In this paper, we investigate data quality during the stages of data gathering, preprocessing, data repository, and evaluation/analysis of BD processing. Related solutions are also suggested based on the elaboration and review of the proposed problems.

Quality of Open Research Data: Values, Convergences and Governance

Information ◽

10.3390/info11040175 ◽

2020 ◽

Vol 11 (4) ◽

pp. 175 ◽

Cited By ~ 3

Author(s):

Tibor Koltay

Keyword(s):

Big Data ◽

Data Quality ◽

Academic Research ◽

Research Data ◽

Data Governance ◽

Quality Of Data ◽

Open Research ◽

Research Environments

This paper focuses on the characteristics of research data quality, and aims to cover the most important issues related to it, giving particular attention to its attributes and to data governance. The corporate world’s considerable interest in the quality of data is obvious in several thoughts and issues reported in business-related publications, even if there are apparent differences between values and approaches to data in corporate and in academic (research) environments. The paper also takes into consideration that addressing data quality would be unimaginable without considering big data.

Evaluating the Data Quality of a National Sample of Young Sexual and Gender Minorities Recruited Using Social Media: The Influence of Different Design Formats

Social Science Computer Review ◽

10.1177/0894439320928240 ◽

2020 ◽

pp. 089443932092824 ◽

Cited By ~ 1

Author(s):

Michael J. Stern ◽

Erin Fordyce ◽

Rachel Carpenter ◽

Melissa Heim Viox ◽

Stuart Michaels ◽

...

Keyword(s):

Social Media ◽

Data Quality ◽

National Sample ◽

Quality Data ◽

Quality Of Data ◽

Gender Minorities ◽

Population Recruitment ◽

Youth Population ◽

And Gender

Social media recruitment is no longer an uncharted avenue for survey research. The results thus far provide evidence of an engaging means of recruiting hard-to-reach populations. Questions remain, however, regarding whether the data collected using this method of recruitment produce quality data. This article assesses one aspect that may influence the quality of data gathered through nonprobability sampling using social media advertisements for a hard-to-reach sexual and gender minority youth population: recruitment design formats. The data come from the Survey of Today’s Adolescent Relationships and Transitions, which used a variety of forms of advertisements as survey recruitment tools on Facebook, Instagram, and Snapchat. Results demonstrate that design decisions such as the format of the advertisement (e.g., video or static) and the use of eligibility language on the advertisements impact the quality of the data as measured by break-off rates and the use of nonsubstantive responses. Additionally, the type of device used affected the measures of data quality.

Analyzing Social Media Research: A Data Quality and Research Reproducibility Perspective

IIM Kozhikode Society & Management Review ◽

10.1177/22779752211011810 ◽

2021 ◽

pp. 227797522110118

Author(s):

Amit K. Srivastava ◽

Rajhans Mishra

Keyword(s):

Social Media ◽

Data Quality ◽

Quality Of Data ◽

Social Media Data ◽

National Crisis ◽

Social Media Platforms ◽

Quality Issues ◽

The One ◽

Media Data

Social media platforms have become very popular among individuals and organizations. On the one hand, organizations use social media as a tool to create awareness of their products among consumers; on the other hand, social media data is useful for national-crisis prediction, election polling, stock prediction, etc. However, there is an ongoing debate about the quality of data generated on social media platforms and whether it is relevant for prediction and generalization. The article discusses the relevance and quality of data obtained from social media in the context of research and development. Social media data quality issues may impact the generalizability and reproducibility of the results of a study. The paper explores possible reasons for quality issues in the data generated on social media platforms, along with suggested measures to minimize them using the proposed social media data quality framework.

Economic Aspects of the Missing Data Problem – the Case of the Patient Registry

Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis ◽

10.11118/actaun201765051779 ◽

2017 ◽

Vol 65 (5) ◽

pp. 1779-1791

Author(s):

Hatice Uenal ◽

David Hampel

Keyword(s):

Missing Data ◽

Data Quality ◽

Quality Analysis ◽

Quality Of Data ◽

Quality Costs ◽

Missing Data Problem ◽

Study Results ◽

The Cost ◽

Cost Factors

Registries are indispensable in medical studies and provide the basis for reliable study results for research questions. Depending on the purpose of use, high data quality is a prerequisite. However, as registry quality increases, costs increase accordingly. Considering these time and cost factors, this work attempts to estimate the cost advantages of applying statistical tools to existing registry data, including quality evaluation. The quality analysis showed unquestionable savings of millions in study costs from reducing the time horizon, saving on average €523,126 for every year by which it is reduced. Additionally, replacing the more than 25% of data that were missing in some variables improved data quality immensely. To conclude, our findings clearly showed the importance of data quality and statistical input in avoiding biased conclusions due to incomplete data.
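
A first step in this kind of quality evaluation is to measure how much of each variable is missing. The Python sketch below is an illustration only, not the authors' method: the record layout, the variable names, and the markers treated as missing are hypothetical, and only the 25% threshold echoes the figure mentioned above.

```python
def missing_share(rows, variables, missing_values=(None, "", "NA")):
    """Return the fraction of rows in which each variable is missing."""
    if not rows:
        raise ValueError("no rows to evaluate")
    total = len(rows)
    return {
        var: sum(1 for row in rows if row.get(var) in missing_values) / total
        for var in variables
    }


def flag_variables(shares, threshold=0.25):
    """List the variables whose missing share exceeds the threshold."""
    return sorted(var for var, share in shares.items() if share > threshold)
```

Variables flagged in this way would be the candidates for imputation or other statistical treatment before any analysis is run on the registry.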
