Guidelines for MSAT and SNP panels that lead to high-quality data for genetic mark–recapture studies

Molecular markers with inadequate power to discriminate among individuals can lead to false recaptures (shadows), and inaccurate genotyping can lead to missed recaptures (ghosts), potentially biasing genetic mark–recapture estimates. We used simulations to examine the impact of microsatellite (MSAT) and single nucleotide polymorphism (SNP) marker-set size, allelic frequency, multitubes approaches, and sample matching protocols on shadow and ghost events in genetic mark–recapture studies, presenting guidance on the specifications for MSAT and SNP marker panels, and sample matching protocols necessary to produce high-quality data. Shadow events are controllable by increasing the number of markers or by selecting markers with high discriminatory power; reasonably sized marker sets (e.g., ≥9 MSATs or ≥32 SNPs) of moderate allelic diversity lead to low probabilities of shadow errors. Ghost events are more challenging to control and low allelic dropout or false-allele error rates produced high rates of erroneous mismatches in mark–recapture sampling. Fortunately, error-tolerant matching protocols, which use information from positively matching loci between comparisons of samples, and multitubes protocols to achieve consensus genotypes are effective at eliminating ghost events. We present a case study on Pacific walrus, Odobenus rosmarus divergens (Illiger, 1815), using simulation results to inform genetic marker choices.

Download Full-text

A Systematic Framework for Analyzing Observation Data in Patient-Centered Registries: Case Study for Patients With Depression (Preprint)

10.2196/preprints.18366 ◽

2020 ◽

Author(s):

Maryam Zolnoori ◽

Mark D Williams ◽

William B Leasure ◽

Kurt B Angstman ◽

Che Ngufor

Keyword(s):

Data Quality ◽

Quality Data ◽

Patient Centered ◽

High Quality ◽

Data Set ◽

High Quality Data ◽

Research Questions ◽

Quality Issues ◽

The Impact ◽

Systematic Framework

BACKGROUND Patient-centered registries are essential in population-based clinical care for patient identification and monitoring of outcomes. Although registry data may be used in real time for patient care, the same data may further be used for secondary analysis to assess disease burden, evaluation of disease management and health care services, and research. The design of a registry has major implications for the ability to effectively use these clinical data in research. OBJECTIVE This study aims to develop a systematic framework to address the data and methodological issues involved in analyzing data in clinically designed patient-centered registries. METHODS The systematic framework was composed of 3 major components: visualizing the multifaceted and heterogeneous patient-centered registries using a data flow diagram, assessing and managing data quality issues, and identifying patient cohorts for addressing specific research questions. RESULTS Using a clinical registry designed as a part of a collaborative care program for adults with depression at Mayo Clinic, we were able to demonstrate the impact of the proposed framework on data integrity. By following the data cleaning and refining procedures of the framework, we were able to generate high-quality data that were available for research questions about the coordination and management of depression in a primary care setting. We describe the steps involved in converting clinically collected data into a viable research data set using registry cohorts of depressed adults to assess the impact on high-cost service use. CONCLUSIONS The systematic framework discussed in this study sheds light on the existing inconsistency and data quality issues in patient-centered registries. This study provided a step-by-step procedure for addressing these challenges and for generating high-quality data for both quality improvement and research that may enhance care and outcomes for patients. INTERNATIONAL REGISTERED REPORT DERR1-10.2196/18366

Download Full-text

A Systematic Framework for Analyzing Observation Data in Patient-Centered Registries: Case Study for Patients With Depression

JMIR Research Protocols ◽

10.2196/18366 ◽

2020 ◽

Vol 9 (10) ◽

pp. e18366

Author(s):

Maryam Zolnoori ◽

Mark D Williams ◽

William B Leasure ◽

Kurt B Angstman ◽

Che Ngufor

Keyword(s):

Data Quality ◽

Quality Data ◽

Patient Centered ◽

High Quality ◽

Data Set ◽

High Quality Data ◽

Research Questions ◽

Quality Issues ◽

The Impact ◽

Systematic Framework

Background Patient-centered registries are essential in population-based clinical care for patient identification and monitoring of outcomes. Although registry data may be used in real time for patient care, the same data may further be used for secondary analysis to assess disease burden, evaluation of disease management and health care services, and research. The design of a registry has major implications for the ability to effectively use these clinical data in research. Objective This study aims to develop a systematic framework to address the data and methodological issues involved in analyzing data in clinically designed patient-centered registries. Methods The systematic framework was composed of 3 major components: visualizing the multifaceted and heterogeneous patient-centered registries using a data flow diagram, assessing and managing data quality issues, and identifying patient cohorts for addressing specific research questions. Results Using a clinical registry designed as a part of a collaborative care program for adults with depression at Mayo Clinic, we were able to demonstrate the impact of the proposed framework on data integrity. By following the data cleaning and refining procedures of the framework, we were able to generate high-quality data that were available for research questions about the coordination and management of depression in a primary care setting. We describe the steps involved in converting clinically collected data into a viable research data set using registry cohorts of depressed adults to assess the impact on high-cost service use. Conclusions The systematic framework discussed in this study sheds light on the existing inconsistency and data quality issues in patient-centered registries. This study provided a step-by-step procedure for addressing these challenges and for generating high-quality data for both quality improvement and research that may enhance care and outcomes for patients. International Registered Report Identifier (IRRID) DERR1-10.2196/18366

Download Full-text

Collecting High-Quality Data

10.1093/oso/9780199366088.003.0007 ◽

2018 ◽

Author(s):

Mary Kay Gugerty ◽

Dean Karlan

Keyword(s):

Data Collection ◽

Social Desirability ◽

Measurement Errors ◽

Monitoring And Evaluation ◽

Quality Data ◽

Social Desirability Bias ◽

High Quality ◽

High Quality Data ◽

Evaluation Systems ◽

Demand Effect

Without high-quality data, even the best-designed monitoring and evaluation systems will collapse. Chapter 7 introduces some the basics of collecting high-quality data and discusses how to address challenges that frequently arise. High-quality data must be clearly defined and have an indicator that validly and reliably measures the intended concept. The chapter then explains how to avoid common biases and measurement errors like anchoring, social desirability bias, the experimenter demand effect, unclear wording, long recall periods, and translation context. It then guides organizations on how to find indicators, test data collection instruments, manage surveys, and train staff appropriately for data collection and entry.

Download Full-text

Developmental Impacts of Remittances on Migrant-Sending Households: Micro-Level Evidence from Punjab, Pakistan

Journal of South Asian Development ◽

10.1177/0973174119887302 ◽

2019 ◽

Vol 14 (3) ◽

pp. 338-366

Author(s):

Kashif Imran ◽

Evelyn S. Devadason ◽

Cheong Kee Cheok

Keyword(s):

Human Development ◽

Human Development Index ◽

Quality Data ◽

High Quality ◽

High Quality Data ◽

Development Index ◽

Policy Interventions ◽

Education And Health ◽

Micro Level ◽

The Government

This article analyzes the overall and type of developmental impacts of remittances for migrant-sending households (HHs) in districts of Punjab, Pakistan. For this purpose, an HH-based human development index is constructed based on the dimensions of education, health and housing, with a view to enrich insights into interactions between remittances and HH development. Using high-quality data from a HH micro-survey for Punjab, the study finds that most migrant-sending HHs are better off than the HHs without this stream of income. More importantly, migrant HHs have significantly higher development in terms of housing in most districts of Punjab relative to non-migrant HHs. Thus, the government would need policy interventions focusing on housing to address inequalities in human development at the district-HH level, and subsequently balance its current focus on the provision of education and health.

Download Full-text

Capturing sexual assault data: An information system designed by forensic clinicians and healthcare researchers

Health Information Management Journal ◽

10.1177/1833358316687575 ◽

2017 ◽

Vol 47 (1) ◽

pp. 46-55 ◽

Cited By ~ 1

Author(s):

S Aqif Mukhtar ◽

Debbie A Smith ◽

Maureen A Phillips ◽

Maire C Kelly ◽

Renate R Zilkens ◽

...

Keyword(s):

Information System ◽

Sexual Assault ◽

Data Management ◽

Sexual Violence ◽

Western Australia ◽

Management System ◽

Data Management System ◽

Quality Data ◽

High Quality ◽

High Quality Data

Background: The Sexual Assault Resource Center (SARC) in Perth, Western Australia provides free 24-hour medical, forensic, and counseling services to persons aged over 13 years following sexual assault. Objective: The aim of this research was to design a data management system that maintains accurate quality information on all sexual assault cases referred to SARC, facilitating audit and peer-reviewed research. Methods: The work to develop SARC Medical Services Clinical Information System (SARC-MSCIS) took place during 2007–2009 as a collaboration between SARC and Curtin University, Perth, Western Australia. Patient demographics, assault details, including injury documentation, and counseling sessions were identified as core data sections. A user authentication system was set up for data security. Data quality checks were incorporated to ensure high-quality data. Results: An SARC-MSCIS was developed containing three core data sections having 427 data elements to capture patient’s data. Development of the SARC-MSCIS has resulted in comprehensive capacity to support sexual assault research. Four additional projects are underway to explore both the public health and criminal justice considerations in responding to sexual violence. The data showed that 1,933 sexual assault episodes had occurred among 1881 patients between January 1, 2009 and December 31, 2015. Sexual assault patients knew the assailant as a friend, carer, acquaintance, relative, partner, or ex-partner in 70% of cases, with 16% assailants being a stranger to the patient. Conclusion: This project has resulted in the development of a high-quality data management system to maintain information for medical and forensic services offered by SARC. This system has also proven to be a reliable resource enabling research in the area of sexual violence.

Download Full-text

We Need High Quality Data in Our Diversity Articles

Academic Radiology ◽

10.1016/j.acra.2020.11.029 ◽

2021 ◽

Author(s):

Daniel B. Chonde ◽

Seifu J. Chonde

Keyword(s):

Quality Data ◽

High Quality ◽

High Quality Data

Download Full-text

Fooled by Performance Randomness: Overrewarding Luck

Review of Economics and Statistics ◽

10.1162/rest_a_00783 ◽

2019 ◽

Vol 101 (4) ◽

pp. 658-666 ◽

Cited By ~ 5

Author(s):

Romain Gauriot ◽

Lionel Page

Keyword(s):

Causal Effect ◽

Experimental Situation ◽

Quality Data ◽

High Quality ◽

High Quality Data ◽

Economic Organizations ◽

Quasi Experimental ◽

Similar Location ◽

Do So

We provide evidence of a violation of the informativeness principle whereby lucky successes are overly rewarded. We isolate a quasi-experimental situation where the success of an agent is as good as random. To do so, we use high-quality data on football (soccer) matches and select shots on goal that landed on the goal posts. Using nonscoring shots, taken from a similar location on the pitch, as counterfactuals to scoring shots, we estimate the causal effect of a lucky success (goal) on the evaluation of the player's performance. We find clear evidence that luck is overly influencing managers' decisions and evaluators' ratings. Our results suggest that this phenomenon is likely to be widespread in economic organizations.

Download Full-text

R factors in Rietveld analysis: How good is good enough?

Powder Diffraction ◽

10.1154/1.2179804 ◽

2006 ◽

Vol 21 (1) ◽

pp. 67-70 ◽

Cited By ~ 508

Author(s):

Brian H. Toby

Keyword(s):

Rietveld Analysis ◽

Poor Quality ◽

Quality Data ◽

High Quality ◽

High Quality Data ◽

Poor Quality Data ◽

Error Index ◽

R Factors ◽

Very High

The definitions for important Rietveld error indices are defined and discussed. It is shown that while smaller error index values indicate a better fit of a model to the data, wrong models with poor quality data may exhibit smaller values error index values than some superb models with very high quality data.

Download Full-text

AGRIBALYSE®, the French LCI Database for agricultural products: high quality data for producers and environmental labelling

OCL ◽

10.1051/ocl/20140047 ◽

2014 ◽

Vol 22 (1) ◽

pp. D104 ◽

Cited By ~ 12

Author(s):

Vincent Colomb ◽

Samy Ait Amar ◽

Claudine Basset Mens ◽

Armelle Gac ◽

Gérard Gaillard ◽

...

Keyword(s):

Agricultural Products ◽

Quality Data ◽

High Quality ◽

High Quality Data ◽

Environmental Labelling

Download Full-text

A Study on Pattern Recognition with the Histograms of Oriented Gradients in Distorted and Noisy Images

JUCS - Journal of Universal Computer Science ◽

10.3897/jucs.2020.024 ◽

2020 ◽

Vol 26 (4) ◽

pp. 454-478

Author(s):

Andrzej Bukała ◽

Michał Koziarski ◽

Bogusław Cyganek ◽

Osman Koç ◽

Alperen Kara

Keyword(s):

Pattern Recognition ◽

Negative Impact ◽

Quality Data ◽

High Quality ◽

Denoising Method ◽

High Quality Data ◽

Histograms Of Oriented Gradients ◽

Different Types ◽

Simple Implementation ◽

Salt And Pepper

Histograms of oriented gradients (HOG) are still one of the most frequently used low-level features for pattern recognition in images. Despite their great popularity and simple implementation performance of the HOG features almost always has been measured on relatively high quality data which are far from real conditions. To fill this gap we experimentally evaluate their performance in the more realistic conditions, based on images affected by different types of noise, such as Gaussian, quantization, and salt-and-pepper, as well on images distorted by occlusions. Different noise scenarios were tested such anti-distortions during training as well as application of a proper denoising method in the recognition stage. As underpinned with experimental results, the negative impact of distortions and noise on object recognition with HOG features can be significantly reduced by employment of a proper denoising strategy.

Download Full-text