Investigating Perceptual Biases, Data Reliability, and Data Discovery in a Methodology for Collecting Speech Errors From Audio Recordings

This work describes a methodology for collecting speech errors from audio recordings and investigates how some of its assumptions affect data quality and composition. Speech errors of all types (sound, lexical, syntactic, etc.) were collected by eight data collectors from audio recordings of unscripted English speech. Analysis of these errors showed that: (i) different listeners find different errors in the same audio recordings, but (ii) the frequencies of error patterns are similar across listeners; (iii) errors collected “online” using on-the-spot observational techniques are more likely to be affected by perceptual biases than “offline” errors collected from audio recordings; and (iv) datasets built from audio recordings can be explored and extended in a number of ways that traditional corpus studies cannot.

Practical Application of a Data Stewardship Maturity Matrix for the NOAA OneStop Project

10.31219/osf.io/fp3js ◽

2018 ◽

Author(s):

Ge Peng ◽

Anna Milan ◽

Nancy A. Ritchey ◽

Robert P. Partee ◽

Sonny Zinn ◽

...

Keyword(s):

North Carolina ◽

Best Practices ◽

Data Quality ◽

User Needs ◽

Data Quality Control ◽

Practical Application ◽

Data Discovery ◽

Data Quality Assessment ◽

Data Stewardship ◽

Do So

Assessing the stewardship maturity of individual datasets is an essential part of ensuring and improving the way datasets are documented, preserved, and disseminated to users. It is a critical step towards meeting U.S. federal regulations, organizational requirements, and user needs. However, it is challenging to do so consistently and quantifiably. The Data Stewardship Maturity Matrix (DSMM), developed jointly by NOAA’s National Centers for Environmental Information (NCEI) and the Cooperative Institute for Climate and Satellites–North Carolina (CICS-NC), provides a uniform framework for consistently rating the stewardship maturity of individual datasets in nine key components: preservability, accessibility, usability, production sustainability, data quality assurance, data quality control/monitoring, data quality assessment, transparency/traceability, and data integrity. So far, the DSMM has been applied to over 900 individual datasets that are archived and/or managed by NCEI, in support of NOAA’s OneStop Data Discovery and Access Framework Project. As part of the OneStop-ready process, tools, implementation guidance, workflows, and best practices have been developed to assist the application of the DSMM; they are described in this paper. The DSMM ratings are also consistently captured in ISO standard-based dataset-level quality metadata and in citable quality descriptive information documents, which serve as interoperable quality information for both machine and human end-users. These DSMM implementation and integration workflows and best practices could be adopted by other data management and stewardship projects or adapted for applications of other maturity assessment models.

The Effect of Tongue-Tie Release on Speech Articulation and Intelligibility

Ear Nose & Throat Journal ◽

10.1177/01455613211064045 ◽

2021 ◽

pp. 014556132110640

Author(s):

Jonathan Melong ◽

Michael Bezuhly ◽

Paul Hong

Keyword(s):

Speech Intelligibility ◽

Developmentally Appropriate ◽

Speech Sound ◽

Speech Errors ◽

Speech Articulation ◽

Speech Language Pathologist ◽

Audio Recordings ◽

Before And After ◽

Consistent Manner ◽

Age Appropriate

Objective: The relationship between ankyloglossia and speech is controversial. The objective of this study was to determine the effect of tongue-tie release on speech articulation and intelligibility. Methods: A prospective cohort study was conducted. Pediatric patients (>2 years of age) being referred for speech concerns due to ankyloglossia were assessed by a pediatric otolaryngologist, and speech articulation was formally assessed by a speech language pathologist using the Goldman-Fristoe Test of Articulation 2 (GFTA-2). Patients then underwent a tongue-tie release procedure in clinic. After 1 month, speech articulation was reassessed with GFTA-2. Audio recordings of sessions were evaluated by independent reviewers to assess speech intelligibility before and after tongue-tie release. Results: Twenty-five participants were included (mean age 3.7 years; 20 boys). The most common speech errors identified were phonological substitutions (80%) and gliding errors (56%). Seven children (28%) had abnormal lingual-alveolar and interdental sounds. Most speech sound errors (87.9%) were age/developmentally appropriate. GFTA-2 standard scores before and after tongue-tie release were 85.61 (SD 9.75) and 87.54 (SD 10.21), respectively (P=.5). Mean intelligibility scores before and after tongue-tie release were 3.15 (SD .22) and 3.21 (SD .31), respectively (P=.43). Conclusion: The majority of children being referred for speech concerns thought to be due to ankyloglossia had age-appropriate speech errors at presentation. Ankyloglossia was not associated with isolated tongue mobility related speech articulation errors in a consistent manner, and there was no benefit of tongue-tie release in improving speech articulation or intelligibility.

Spatial Metadata Usability Evaluation

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9070463 ◽

2020 ◽

Vol 9 (7) ◽

pp. 463 ◽

Cited By ~ 1

Author(s):

Mohsen Kalantari ◽

Syahrudin Syahrudin ◽

Abbas Rajabifard ◽

Hardi Subagyo ◽

Hannah Hubbard

Keyword(s):

Data Quality ◽

User Interfaces ◽

Spatial Data ◽

Irrelevant Information ◽

End Users ◽

Data Discovery ◽

Data Infrastructure ◽

Critical Part ◽

Efficiency And Effectiveness ◽

Effectiveness And Efficiency

Spatial metadata is a critical part of any spatial data infrastructure, enabling the organising, sharing, discovery and use of spatial data. This paper highlights a knowledge gap in the usability of metadata systems for end-users. It then addresses the gap by applying the User Centred Design approach to investigate the usability of metadata records. The research engages with end-users concerning the efficiency and effectiveness of metadata systems, and end-users’ satisfaction and expectations. The results indicate significant gaps in the effectiveness and efficiency of metadata systems for spatial data discovery and selection. Inconsistency and irrelevant information in the metadata records were found in the title, keywords, abstracts, data quality and other elements of the metadata. Additionally, essential improvements were identified for user interfaces. Discouraging presentation of the metadata is a prominent problem found in the interface of the metadata systems.

An Optimality Analysis and Treatment of Phonological Disorders in the Speech of Jordanian Children: A Case Study

International Journal of Linguistics ◽

10.5296/ijl.v10i5.12615 ◽

2018 ◽

Vol 10 (5) ◽

pp. 1

Author(s):

Maha S. Yaseen ◽

Radwan S. Mahadin

Keyword(s):

Speech Disorders ◽

Speech Errors ◽

Treatment Goals ◽

Error Patterns ◽

Phonological Disorders ◽

Children's Speech ◽

Cluster Reduction

This paper presents a case study of a Jordanian child with phonological speech disorders. It seeks to investigate functional phonological disorders and their treatment among Jordanian children within an Optimality Theoretic (OT) perspective. It aims to provide treatment for children’s speech errors within a constraint-based system. The analysis of the data identifies seven error patterns in the child’s productions, namely: fronting, lateralization, stopping, devoicing, de-emphasization, syllable deletion and cluster reduction. Furthermore, OT is employed at the end of the study as a guideline to select the priority of treatment goals by demoting responsible markedness constraints below faithfulness constraints.

Going beyond FAIR to increase data reliability

10.5194/egusphere-egu2020-11117 ◽

2020 ◽

Author(s):

Uta Koedel ◽

Peter Dietrich

Keyword(s):

Data Quality ◽

Information Needs ◽

Transfer Functions ◽

Measurement Data ◽

Secondary Data ◽

Data Uncertainty ◽

Calibration Data ◽

Data Reliability ◽

Secondary Users ◽

Site Field

The FAIR principle is on its way to becoming a conventional standard for all kinds of data. However, it is often forgotten that this principle does not consider data quality or data reliability issues. If the data quality is not sufficiently described, a wrong interpretation and use of these data in a common interpretation can lead to false scientific conclusions. Hence, the statement about data reliability is an essential component for secondary data processing and joint interpretation efforts. Information on data reliability, uncertainty, and quality, as well as information on the devices used, is essential and needs to be introduced or even implemented in the workflow from the sensor to a database if data is to be considered in a broader context. In the past, many publications have shown that the same devices at the same location do not necessarily provide the same measurement data. Likewise, statistical quantities and confidence intervals that would allow the reliability of the data to be assessed are rarely given in publications. Many secondary users of measurement data assume that calibration data and the measurement of other auxiliary variables are sufficient to estimate the data reliability. However, even if some devices require on-site field calibration, that does not mean that the data are comparable. Heat, cold, and internal processes on electronic components can lead to differences in measurement data recorded with devices of the same type at the same location, especially with the increasingly complex devices themselves. The data reliability can be increased by implementing data uncertainty issues within the FAIR principle. The poster presentation will show the importance of comparative measurements, the information needs for the application of proxy-transfer functions, and suitable uncertainty analysis for databases.

An Optimality Analysis and Treatment of Phonological Disorders in the Speech of Jordanian Children: A Case Study

International Journal of Linguistics ◽

10.5296/ijl.v10i1.12615 ◽

2018 ◽

Vol 10 (1) ◽

pp. 162

Author(s):

Maha S. Yaseen ◽

Radwan S. Mahadin

Keyword(s):

Speech Disorders ◽

Speech Errors ◽

Treatment Goals ◽

Error Patterns ◽

Phonological Disorders ◽

Children's Speech ◽

Cluster Reduction

MLGaze: Machine Learning-Based Analysis of Gaze Error Patterns in Consumer Eye Tracking Systems

Vision ◽

10.3390/vision4020025 ◽

2020 ◽

Vol 4 (2) ◽

pp. 25

Author(s):

Anuradha Kar

Keyword(s):

Machine Learning ◽

Eye Tracking ◽

Data Quality ◽

Regression Models ◽

Operating Conditions ◽

Eye Tracker ◽

Error Sources ◽

Error Patterns ◽

Machine Learning Methods ◽

The Impact

Analyzing the gaze accuracy characteristics of an eye tracker is a critical task as its gaze data is frequently affected by non-ideal operating conditions in various consumer eye tracking applications. In previous research on pattern analysis of gaze data, efforts were made to model human visual behaviors and cognitive processes. What remains relatively unexplored are questions related to identifying gaze error sources as well as quantifying and modeling their impacts on the data quality of eye trackers. In this study, gaze error patterns produced by a commercial eye tracking device were studied with the help of machine learning algorithms, such as classifiers and regression models. Gaze data were collected from a group of participants under multiple conditions that commonly affect eye trackers operating on desktop and handheld platforms. These conditions (referred to here as error sources) include user distance, head pose, and eye-tracker pose variations, and the collected gaze data were used to train the classifier and regression models. It was seen that while the impact of the different error sources on gaze data characteristics was nearly impossible to distinguish by visual inspection or from data statistics, machine learning models were successful in identifying the impact of the different error sources and predicting the variability in gaze error levels due to these conditions. The objective of this study was to investigate the efficacy of machine learning methods for the detection and prediction of gaze error patterns, which would enable an in-depth understanding of the data quality and reliability of eye trackers under unconstrained operating conditions. Coding resources for all the machine learning methods adopted in this study were included in an open repository named MLGaze to allow researchers to replicate the principles presented here using data from their own eye trackers.

Conducting in-depth interviews with and without voice recorders: a comparative analysis

Qualitative Research ◽

10.1177/1468794119884806 ◽

2019 ◽

Vol 20 (5) ◽

pp. 565-581 ◽

Cited By ~ 5

Author(s):

Rwamahe Rutakumwa ◽

Joseph Okello Mugisha ◽

Sarah Bernays ◽

Elizabeth Kabunga ◽

Grace Tumwekwase ◽

...

Keyword(s):

Comparative Analysis ◽

Data Quality ◽

Group Discussions ◽

Second Best ◽

Audio Recordings ◽

Depth Interviews

The use of audio recordings has become a taken-for-granted approach to generating transcripts of in-depth interviews and group discussions. In this paper we begin by describing circumstances where the use of a recorder is not, or may not be, possible, before sharing our comparative analysis of audio-recorded transcriptions and interview scripts made from notes taken during the interview (by experienced, well-trained interviewers). Our comparison shows that the data quality of audio-recorded transcripts and of interview scripts written directly after the interview was comparable in the detail captured. The structures of the transcript and script were usually different because in the interview scripts, topics and ideas were grouped, rather than being in the more scattered order of the conversation in the transcripts. We suggest that in some circumstances not recording is the best approach, not ‘second best’.

The representation of phonological information during speech production planning: evidence from vowel errors in spontaneous speech

Phonology Yearbook ◽

10.1017/s0952675700000609 ◽

1986 ◽

Vol 3 ◽

pp. 117-149 ◽

Cited By ~ 40

Author(s):

John J. Ohala ◽

Stefanie Shattuck-Hufnagel

Keyword(s):

Speech Production ◽

Distinctive Feature ◽

Spontaneous Speech ◽

Lexical Stress ◽

Speech Errors ◽

Phonological Information ◽

Error Patterns ◽

Feature Similarity

A corpus of more than 500 speech errors that involve a vowel or syllabic nucleus is examined for evidence that bears on the nature of the processing representation that is in force when such errors occur. Evidence is obtained from the patterns of similarity between target segments and the intrusion segments that replace them in errors, on the assumption that target–intrusion similarity arises from characteristics of the processing representation. Findings include (1) a distinctive feature similarity between vowel targets and intrusions, (2) evidence that complex syllabic nuclei can function as error units and (3) evidence that vowel errors are constrained by lexical stress. Finally, the error patterns in both vowels and consonants, and the processing representations they suggest, are evaluated in the light of recent theoretical proposals about the phonological component of the grammar.

Data Reliability in a Citizen Science Protocol for Monitoring Stingless Bees Flight Activity

Insects ◽

10.3390/insects12090766 ◽

2021 ◽

Vol 12 (9) ◽

pp. 766

Author(s):

Jailson N. Leocadio ◽

Natalia P. Ghilardi-Lopes ◽

Sheina Koffler ◽

Celso Barbiéri ◽

Tiago M. Francoy ◽

...

Keyword(s):

Data Quality ◽

Citizen Science ◽

Stingless Bees ◽

Scientific Literature ◽

Original Data ◽

Flight Activity ◽

Practical Training ◽

Data Reliability ◽

Potential Source

Although the quality of citizen science (CS) data is often a concern, evidence for high-quality CS data increases in the scientific literature. This study aimed to assess the data reliability of a structured CS protocol for monitoring stingless bees’ flight activity. We tested (1) data accuracy for replication among volunteers and for expert validation and (2) precision, comparing dispersion between citizen scientists and expert data. Two distinct activity dimensions were considered: (a) perception of flight activity and (b) flight activity counts (entrances, exits, and pollen load). No significant differences were found among groups regarding entrances and exits. However, replicator citizen scientists presented a higher chance of perceiving pollen than original data collectors and experts, likely a false positive. For those videos in which there was an agreement about pollen presence, the effective pollen counts were similar (with higher dispersion for citizen scientists), indicating the reliability of CS-collected data. The quality of the videos, a potential source of variance, did not influence the results. Increasing practical training could be an alternative to improve pollen data quality. Our study shows that CS provides reliable data for monitoring bee activity and highlights the relevance of a multi-dimensional approach for assessing CS data quality.
