Evaluation of Acoustic Analyses of Voice in Nonoptimized Conditions

2020 ◽  
Vol 63 (12) ◽  
pp. 3991-3999
Author(s):  
Benjamin van der Woerd ◽  
Min Wu ◽  
Vijay Parsa ◽  
Philip C. Doyle ◽  
Kevin Fung

Objectives This study aimed to evaluate the effects of a smartphone microphone and of the recording environment on the fidelity and accuracy of acoustic measurements of voice. Method A prospective cohort proof-of-concept study. Two sets of prerecorded samples, (a) sustained vowels (/a/) and (b) a Rainbow Passage sentence, were played for recording via the internal iPhone microphone and the Blue Yeti USB microphone in two recording environments: a sound-treated booth and a quiet office setting. Recordings were presented using a calibrated mannequin speaker with a fixed signal intensity (69 dBA) at a fixed distance (15 in.). Each set of recordings (iPhone/audio booth, Blue Yeti/audio booth, iPhone/office, and Blue Yeti/office) was time-windowed to ensure the same signal was evaluated for each condition. Acoustic measures of voice, including fundamental frequency (fo), jitter, shimmer, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP), were generated using a widely used analysis program (Praat Version 6.0.50). The data gathered were compared using a repeated measures analysis of variance. Two separate data sets were used. The set of vowel samples included both pathologic (n = 10) and normal (n = 10), male (n = 5) and female (n = 15) speakers. The set of sentence stimuli ranged in perceived voice quality from normal to severely disordered, with an equal number of male (n = 12) and female (n = 12) speakers evaluated. Results The vowel analyses indicated that jitter, shimmer, HNR, and CPP were significantly different based on microphone choice, and that shimmer, HNR, and CPP were significantly different based on the recording environment. Analysis of sentences revealed a statistically significant impact of recording environment and microphone type on HNR and CPP. While statistically significant, the differences across the experimental conditions for a subset of the acoustic measures (viz., jitter and CPP) fell within their respective normative ranges. Conclusions Both microphone and recording setting resulted in significant differences across several acoustic measurements. However, a subset of the acoustic measures that were statistically significant across the recording conditions showed small overall differences that are unlikely to have clinical significance in interpretation. For these acoustic measures, the present data suggest that, although a sound-treated setting is ideal for voice sample collection, a smartphone microphone can capture acceptable recordings for acoustic signal analysis.
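
As an illustration of two of the perturbation measures named in the abstract above, the following is a minimal sketch of local jitter and local shimmer: the mean absolute difference between consecutive cycle values divided by the mean value. It assumes that cycle durations and peak amplitudes have already been extracted; the numbers are hypothetical and are not data from the study, and the function name `local_perturbation` is introduced here only for illustration. It is not Praat's implementation, which also handles cycle detection and voicing decisions.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical cycle durations in seconds (roughly 200 Hz) and peak amplitudes.
periods = [0.0050, 0.0051, 0.0049, 0.0050, 0.0052]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]

jitter_percent = 100 * local_perturbation(periods)      # cycle-to-cycle period variation
shimmer_percent = 100 * local_perturbation(amplitudes)  # cycle-to-cycle amplitude variation
mean_fo_hz = 1 / mean(periods)                          # mean fundamental frequency

print(f"jitter {jitter_percent:.2f}%  shimmer {shimmer_percent:.2f}%  fo {mean_fo_hz:.1f} Hz")
```

Because both measures are ratios of small cycle-to-cycle differences, added noise from a microphone or room can plausibly shift them, which is the kind of effect the study set out to quantify.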

2018 ◽  
Vol 7 (1) ◽  
pp. 1-15
Author(s):  
Wellington da Silva ◽  
Ana Carolina Constantini

The acoustic analysis of speech has proved useful in the clinical evaluation of dysphonia, for it allows an objective assessment of the voice. However, the literature has suggested that the type of speech task used to obtain voice samples from patients (sustained vowel or connected speech) may affect both the perceptual and the acoustic evaluation of dysphonic voices. This study aimed at investigating whether the type of speech task significantly influences the acoustic analysis of dysphonic voices. Five acoustic parameters related to voice quality (cepstral peak prominence, difference between the magnitudes of the first and second harmonics, harmonics-to-noise ratio, jitter and shimmer) were automatically computed from voice samples of 5 female and 5 male subjects with and without dysphonia. These recordings consisted of three types of speech task: connected speech, counting and sustained vowel. Repeated measures analyses of variance showed that all five acoustic parameters were significantly affected by speech task. Further analyses using Duncan’s multiple-range test indicated that the type of speech task may also influence the discrimination of dysphonic voices. It is concluded that speech task affects the acoustic assessment of dysphonic voices by significantly raising or reducing the values of the acoustic parameters.


2020 ◽  
Vol 10 (23) ◽  
pp. 8598 ◽  
Author(s):  
Angélica Emygdio da Silva Antonetti ◽  
Larissa Thais Donalonso Siqueira ◽  
Maria Paula de Almeida Gobbo ◽  
Alcione Ghedini Brasolotto ◽  
Kelly Cristina Alves Silverio

Cepstral peak prominence-smoothed (CPPs) and long-term average spectrum (LTAS) are robust measures that represent the glottal source and source-filter interactions, respectively. Until now, little has been known about how physiological events impact auditory–perceptual characteristics in the objective measures of CPPs and LTAS (alpha ratio; L1–L0). Thus, this paper aims to analyze the relationship between such acoustic measures and auditory–perceptual analysis and then determine which acoustic measure best represents voice quality. We analyzed 53 voice samples of vocally healthy participants (vocally healthy group, VHG) and 49 voice samples of participants with behavioral dysphonia (dysphonic group, DG). Each voice sample was composed of the sustained vowel /a/ and connected speech. CPPs seems to be the best predictor of voice deviation in both studied populations because there were moderate to strong negative correlations with general degree, breathiness, roughness, and strain (auditory–perceptual parameters). L1–L0 is related to breathiness (moderate negative correlations). Hence, L1–L0 provides information about air leak through the closed glottis, assisting the phonatory efficiency analysis.


2021 ◽  
Vol 2 (1) ◽  
Author(s):  
Celine Larkin ◽  
Alexandra M. Sanseverino ◽  
James Joseph ◽  
Lauren Eisenhauer ◽  
Martin A. Reznek

Abstract Background Audit and feedback (A&F) has been used as a strategy to modify clinician behavior with moderate success. Although A&F is theorized to work by improving the accuracy of clinicians’ estimates of their own behavior, few interventions have included assessment of clinicians’ estimates at baseline to examine whether they account for intervention success or failure. We tested an A&F intervention to reduce computed tomography (CT) ordering by emergency physicians, while also examining the physicians’ baseline estimates of their own behavior compared to peers. Methods Our study was a prospective, multi-site, 20-month, randomized trial to examine the effect of an A&F intervention on CT ordering rates, overall and by test subtype. From the electronic health record, we obtained 12 months of baseline CT ordering per 100 patients treated for every physician from four emergency departments. Those who were randomized to receive A&F were shown a de-identified graph of the group’s baseline CT utilization, asked to estimate where in the distribution their own CT ordering practices fell, and then shown their actual performance. All participants also received a brief educational intervention. CT ordering rates were collected for all physicians for 6 months after the intervention. Pre-post ordering rates were compared using independent and repeated measures t tests. Results Fifty-one of 52 eligible physicians participated. The mean CT ordering rate increased significantly in both experimental conditions after the intervention (intervention pre = 35.7, post = 40.3, t = 4.13, p < 0.001; control pre = 33.9, post = 38.9, t = 3.94, p = 0.001), with no significant between-group difference observed at follow-up (t = 0.43, p = 0.67). Within the intervention group, physicians had poor accuracy in estimating their own ordering behavior at baseline: most overestimated and all guessed that they were in the upper half of the distribution of their peers. CT ordering increased regardless of self-estimate accuracy. Conclusions Our A&F intervention failed to reduce physician CT ordering: our feedback showed most of the physicians that they had overestimated their CT ordering behavior, and they were therefore unlikely to reduce it as a result. After “audit,” it may be prudent to assess baseline clinician awareness of behavior before moving toward a feedback intervention.
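
The pre-post comparison described in the abstract above relies on a repeated measures (paired) t test. The following is a minimal sketch of that statistic, t = mean(d) / (sd(d) / sqrt(n)), where d holds each physician's post minus pre difference. The ordering rates are hypothetical values chosen for easy arithmetic, not data from the trial, and `paired_t` is a name introduced here for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired pre/post measurements."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator);
    # identical differences would give a zero standard error and fail here.
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical CT orders per 100 patients for four physicians, before and after.
pre_rates = [30.0, 35.0, 40.0, 33.0]
post_rates = [34.0, 38.0, 46.0, 36.0]

t_stat, dof = paired_t(pre_rates, post_rates)
print(f"t = {t_stat:.2f}, df = {dof}")
```

A positive t here indicates that rates rose after the intervention, which is the direction reported for both arms of the trial. The p value would then be read from a t distribution with the returned degrees of freedom.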


2017 ◽  
Vol 23 (1) ◽  
pp. 1-20
Author(s):  
Kathy Connaughton ◽  
Irena Yanushevskaya

Objective: This study explores the immediate impact of prolonged voice use by professional sports coaches. Method: Speech samples including sustained phonation of the vowel /a/ and a short read passage were collected from two professional sports coaches. The audio recordings were made within an hour before and after a coaching session, over three sessions. Perceptual evaluation of voice quality was done using the GRBAS scale. The speech samples were subsequently analyzed using Praat. The acoustic measures included fundamental frequency (f0), jitter, shimmer, Harmonics-to-Noise ratio and Cepstral Peak Prominence. Main results: The results of perceptual and acoustic analysis suggest a slight shift towards a tenser phonation post-coaching session, which is a likely consequence of laryngeal muscle adaptation to prolonged voice use. This tendency was similar in sustained vowels and connected speech. Conclusion: Acoustic measures used in this study can be useful to capture the voice change post-coaching session. It is desirable, however, that tools for voice assessment and monitoring that are more sophisticated and robust, yet intuitive and easy to use, be made available to clinicians and professional voice users.


2019 ◽  
Vol 30 (19) ◽  
pp. 2435-2438 ◽  
Author(s):  
Jonah Cool ◽  
Richard S. Conroy ◽  
Sean E. Hanlon ◽  
Shannon K. Hughes ◽  
Ananda L. Roy

Improvements in the sensitivity, content, and throughput of microscopy, in the depth and throughput of single-cell sequencing approaches, and in computational and modeling tools for data integration have created a portfolio of methods for building spatiotemporal cell atlases. Challenges in this fast-moving field include optimizing experimental conditions to allow a holistic view of tissues, extending molecular analysis across multiple timescales, and developing new tools for 1) managing large data sets, 2) extracting patterns and correlation from these data, and 3) integrating and visualizing data and derived results in an informative way. The utility of these tools and atlases for the broader scientific community will be accelerated through a commitment to findable, accessible, interoperable, and reusable data and tool sharing principles that can be facilitated through coordination and collaboration between programs working in this space.


1992 ◽  
Vol 262 (3) ◽  
pp. G581-G592 ◽  
Author(s):  
V. Licko ◽  
E. B. Ekblad

A simple single-state nonlinear mathematical model of an open metabolic system is shown to be an adequate representation of acid secretion in frog gastric mucosa. The parameters of the elemental model were estimated from data, and subsystems of augmented models were established for histamine, a nonconservative stimulus acting via binding to receptors, and for two inhibitors of acid secretion. The latter included metiamide, a nonconservative histamine antagonist, which affects the formation of acid by competitive binding to the histamine receptors, and nitrite, a conservative inhibitor, which affects the rate of acid translocation. For parameter estimation, two data sets were analyzed by a nonlinear least-squares procedure: a dynamic (time dependent) set consisting of individual acid secretion rate curves and an integral (time independent) set consisting of the curves of acid secreted and suppressed above or below baseline, respectively, as functions of agent exposure. Because the model is instrumental in the estimation of parameters of subsystems that are not accessible through direct observation, it can serve as a research tool in the investigation of the mechanism of gastric acid secretion under a variety of experimental conditions.


GigaScience ◽  
2020 ◽  
Vol 9 (1) ◽  
Author(s):  
T Cameron Waller ◽  
Jordan A Berg ◽  
Alexander Lex ◽  
Brian E Chapman ◽  
Jared Rutter

Abstract Background Metabolic networks represent all chemical reactions that occur between molecular metabolites in an organism’s cells. They offer biological context in which to integrate, analyze, and interpret omic measurements, but their large scale and extensive connectivity present unique challenges. While it is practical to simplify these networks by placing constraints on compartments and hubs, it is unclear how these simplifications alter the structure of metabolic networks and the interpretation of metabolomic experiments. Results We curated and adapted the latest systemic model of human metabolism and developed customizable tools to define metabolic networks with and without compartmentalization in subcellular organelles and with or without inclusion of prolific metabolite hubs. Compartmentalization made networks larger, less dense, and more modular, whereas hubs made networks larger, more dense, and less modular. When present, these hubs also dominated shortest paths in the network, yet their exclusion exposed the subtler prominence of other metabolites that are typically more relevant to metabolomic experiments. We applied the non-compartmental network without metabolite hubs in a retrospective, exploratory analysis of metabolomic measurements from 5 studies on human tissues. Network clusters identified individual reactions that might experience differential regulation between experimental conditions, several of which were not apparent in the original publications. Conclusions Exclusion of specific metabolite hubs exposes modularity in both compartmental and non-compartmental metabolic networks, improving detection of relevant clusters in omic measurements. Better computational detection of metabolic network clusters in large data sets has potential to identify differential regulation of individual genes, transcripts, and proteins.
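
The effect of hubs on network size and density reported in the abstract above can be illustrated with a toy example. This is a sketch under strong simplifying assumptions: the network is treated as a simple undirected graph of metabolites, density is computed as 2E / (N(N - 1)), and the edges are invented for illustration. It is not the authors' tool or their model of human metabolism.

```python
def density(edges):
    """Density of a simple undirected graph given as a set of (node, node) edges.

    Nodes are inferred from the edges, so isolated nodes are not counted.
    """
    nodes = {node for edge in edges for node in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(edges) / (n * (n - 1))


def without_hubs(edges, hubs):
    """Drop every edge that touches one of the hub metabolites."""
    hub_set = set(hubs)
    return {edge for edge in edges if not hub_set.intersection(edge)}


# Hypothetical toy network: a short linear pathway plus a hub linked to every step.
toy_edges = {
    ("glucose", "g6p"), ("g6p", "f6p"), ("f6p", "fbp"),
    ("atp", "glucose"), ("atp", "g6p"), ("atp", "f6p"), ("atp", "fbp"),
}

density_with_hub = density(toy_edges)                             # 5 nodes, 7 edges
density_without_hub = density(without_hubs(toy_edges, {"atp"}))   # 4 nodes, 3 edges
print(density_with_hub, density_without_hub)
```

In this toy case the hub makes the network both larger and denser, matching the direction of the effect the abstract describes; once the hub is removed, only the pathway's own structure remains.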


2005 ◽  
Vol 01 (01) ◽  
pp. 129-145 ◽  
Author(s):  
XIAOBO ZHOU ◽  
XIAODONG WANG ◽  
EDWARD R. DOUGHERTY

In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables (gene expressions) and the small number of experimental conditions. Many gene-selection and classification methods have been proposed; however, most of these treat gene selection and classification separately, and not under the same model. We propose a Bayesian approach to gene selection using the logistic regression model. The Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the minimum description length (MDL) principle are used in constructing the posterior distribution of the chosen genes. The same logistic regression model is then used for cancer classification. Fast implementation issues for these methods are discussed. The proposed methods are tested on several data sets including those arising from hereditary breast cancer, small round blue-cell tumors, lymphoma, and acute leukemia. The experimental results indicate that the proposed methods show high classification accuracies on these data sets. Some robustness and sensitivity properties of the proposed methods are also discussed. Finally, mixing logistic-regression-based gene selection with other classification methods and mixing logistic-regression-based classification with other gene-selection methods are considered.
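
For readers unfamiliar with the criteria named in the abstract above, the following sketch shows the standard definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, applied to two candidate gene subsets. It only illustrates how the criteria penalize model size; it does not reproduce the authors' Bayesian posterior construction. The log-likelihoods, parameter counts, and sample size are hypothetical.

```python
from math import log


def aic(log_likelihood, k):
    """Akaike information criterion for a model with k fitted parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion for k parameters fitted to n samples."""
    return k * log(n) - 2 * log_likelihood


# Hypothetical logistic regression fits on n = 50 samples:
# subset A uses few genes, subset B fits slightly better with many more.
n_samples = 50
candidates = {
    "A": {"log_likelihood": -20.0, "k": 3},
    "B": {"log_likelihood": -17.0, "k": 10},
}

scores = {
    name: (aic(m["log_likelihood"], m["k"]), bic(m["log_likelihood"], m["k"], n_samples))
    for name, m in candidates.items()
}
best_by_bic = min(scores, key=lambda name: scores[name][1])  # lower is better
print(scores, best_by_bic)
```

With few samples and many candidate genes, the penalty term dominates quickly, which is why such criteria tend to favor small gene subsets in the microarray setting.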


Phonetica ◽  
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Qandeel Hussain ◽  
Alexei Kochetov

Abstract Punjabi is an Indo-Aryan language which contrasts a rich set of coronal stops at dental and retroflex places of articulation across three laryngeal configurations. Moreover, all these stops occur contrastively in various positions (word-initially, -medially, and -finally). The goal of this study is to investigate how various coronal place and laryngeal contrasts are distinguished acoustically both within and across word positions. A number of temporal and spectral correlates were examined in data from 13 speakers of Eastern Punjabi: Voice Onset Time, release and closure durations, fundamental frequency, F1-F3 formants, spectral center of gravity and standard deviation, H1*-H2*, and cepstral peak prominence. The findings indicated that higher formants and spectral measures were most important for the classification of place contrasts across word positions, whereas laryngeal contrasts were reliably distinguished by durational and voice quality measures. Word-medially and -finally, F2 and F3 of the preceding vowels played a key role in distinguishing the dental and retroflex stops, while spectral noise measures were more important word-initially. The findings of this study contribute to a better understanding of factors involved in the maintenance of typologically rare and phonetically complex sets of place and laryngeal contrasts in the coronal stops of Indo-Aryan languages.


2004 ◽  
Vol 29 (5) ◽  
pp. 538-544 ◽  
Author(s):  
P.N. Carding ◽  
I.N. Steen ◽  
A. Webb ◽  
K. MacKenzie ◽  
I.J. Deary ◽  
...  
