acoustic measures
Recently Published Documents


TOTAL DOCUMENTS

322
(FIVE YEARS 92)

H-INDEX

43
(FIVE YEARS 2)

2021 ◽  
Vol 2069 (1) ◽  
pp. 012165
Author(s):  
G Minelli ◽  
G E Puglisi ◽  
A Astolfi ◽  
C Hauth ◽  
A Warzybok

Abstract Since the fundamental phases of the learning process take place in elementary classrooms, it is necessary to guarantee a proper acoustic environment for the children's listening activities. In this framework, speech intelligibility is especially important. In order to better understand and objectively quantify the effect of background noise and reverberation on speech intelligibility, various models have been developed. Here, a binaural speech intelligibility model (BSIM) is investigated for speech intelligibility predictions in a real classroom, considering the effect of talker-to-listener distance and of binaural unmasking due to the spatial separation of noise and speech sources. BSIM predictions are compared to well-established room acoustic measures such as reverberation time (T30), clarity, and definition. Objective acoustical measurements were carried out in one Italian primary school classroom before (T30 = 1.43 ± 0.03 s) and after (T30 = 0.45 ± 0.02 s) the acoustical treatment. Speech reception thresholds (SRTs), i.e. the signal-to-noise ratios yielding 80% speech intelligibility, will be obtained through BSIM simulations using the measured binaural room impulse responses (BRIRs). A focus on the effect of different speech and noise source spatial positions on the SRT values will aim to show the importance of a model able to deal with the binaural aspects of the auditory system. In particular, it will be observed how the position of the noise source influences speech intelligibility when the target speech source always lies in the same position.
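The 80% SRT referred to above is read off a psychometric function of intelligibility versus SNR. As a minimal illustration (the logistic shape and the parameter values are assumptions for the sketch, not the BSIM internals):

```python
import math

def logistic_intelligibility(snr, srt50, slope):
    """Logistic psychometric function: intelligibility (0..1) vs SNR in dB.
    `srt50` is the 50% point; `slope` controls steepness (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(snr - srt50) / slope))

def srt_at(target, srt50, slope):
    """Invert the logistic to get the SNR giving `target` intelligibility,
    e.g. target=0.8 for the 80% SRT used in this study."""
    return srt50 + slope * math.log(target / (1.0 - target))
```

By construction, `srt_at(0.5, srt50, slope)` returns `srt50`, and the 80% SRT sits `slope * ln(4)` dB above it.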


2021 ◽  
Vol 67 (6) ◽  
pp. 46-51
Author(s):  
P.M. Kovalchuk ◽  
T.A. Shydlovska

We aimed to analyse voice signals in 40 patients with chronic laryngitis elicited by exposure to chemical factors. We examined 20 people with catarrhal chronic laryngitis (group 1), 20 people with subatrophic chronic laryngitis (group 2) and 15 healthy volunteers as controls. All subjects underwent acoustic examination of the voice signal using the software Praat V 4.2.1. We studied the following acoustic measures: Jitter, Shimmer and NHR (noise-to-harmonics ratio). The analysis of the obtained data revealed statistically significant differences in the average values of the Jitter and Shimmer measures, as well as in the ratio of the nonharmonic (noise) and harmonic components in the spectrum (NHR), in patients with chronic laryngitis (groups 1 and 2) compared with controls. In group 1 (chronic catarrhal laryngitis), the average values of the acoustic measures were: Jitter 0.92 ± 0.1%, Shimmer 5.31 ± 0.5%, NHR 0.078 ± 0.04. In group 2 (subatrophic laryngitis), the average values were: Jitter 0.67 ± 0.6%, Shimmer 6.57 ± 0.7%, NHR 0.028 ± 0.003. The obtained data indicate a pronounced instability of the voice in frequency and amplitude and a significant proportion of noise in the spectrum of the voice signal in the examined patients with chronic laryngitis exposed to chemical factors. The most pronounced alterations were found in patients with catarrhal chronic laryngitis. We conclude that the quantitative values of spectral analysis of the voice signal (Jitter, Shimmer, NHR) may serve as valuable criteria of the degree of voice impairment. This may be helpful in determining the effectiveness of rehabilitation measures.
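As a rough sketch of how the reported measures are defined, Praat-style local jitter and shimmer can be computed from per-cycle period and peak-amplitude sequences (which a pitch tracker such as Praat would supply; the function names here are illustrative, not Praat's API):

```python
def local_jitter(periods):
    """Praat-style local jitter: mean absolute difference between
    consecutive pitch periods, divided by the mean period (in %)."""
    diffs = [abs(periods[i] - periods[i - 1]) for i in range(1, len(periods))]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer(amplitudes):
    """Local shimmer: mean absolute difference between consecutive
    cycle peak amplitudes, divided by the mean amplitude (in %)."""
    diffs = [abs(amplitudes[i] - amplitudes[i - 1]) for i in range(1, len(amplitudes))]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))
```

A perfectly steady voice yields 0% on both measures; cycle-to-cycle instability in period (jitter) or amplitude (shimmer) raises them, which is why both are elevated in the laryngitis groups.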


Author(s):  
Marziye Eshghi ◽  
Kathryn P. Connaghan ◽  
Sarah E. Gutz ◽  
James D. Berry ◽  
Yana Yunusova ◽  
...  

Purpose Hypernasality and atypical voice characteristics are common features of dysarthric speech due to amyotrophic lateral sclerosis (ALS). Existing acoustic measures have been developed to primarily target either hypernasality or voice impairment, and the effects of co-occurring hypernasality-voice problems on these measures are unknown. This report explores (a) the extent to which acoustic measures are affected by concurrent perceptually identified hypernasality and voice impairment due to ALS and (b) candidate acoustic measures of early indicators of hypernasality and voice impairment in the presence of multisystem involvement in individuals with ALS. Method Two expert listeners rated severity of hypernasality and voice impairment in sentences produced by individuals with ALS (n = 27). The samples were stratified based on perceptual ratings: voice/hypernasality asymptomatic, predominantly hypernasal, predominantly voice impairment, and mixed (co-occurring hypernasality and voice impairment). Groups were compared using established acoustic measures of hypernasality (one-third octave analysis) and voice impairment (cepstral/spectral analysis). Results The one-third octave analysis differentiated all groups; the cepstral peak prominence differentiated all groups except asymptomatic versus mixed, whereas the low-to-high spectral ratio did not differ among groups. Additionally, one-third octave analyses demonstrated promising speech diagnostic potential. Conclusions The results highlight the need to consider the validity of measures in the context of multisubsystem involvement. Our preliminary findings further suggest that the one-third octave analysis may be an optimal approach to quantify hypernasality and voice abnormalities in the presence of multisystem speech impairment. Future evaluation of the diagnostic accuracy of the one-third octave analysis is warranted.
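For reference, one-third octave analysis partitions the spectrum into bands whose center frequencies are spaced a factor of 2^(1/3) apart. A minimal sketch of the base-2 band geometry (the 1 kHz reference is the usual convention; the helper names are assumptions, not the authors' code):

```python
import math

def third_octave_band(k, ref=1000.0):
    """Base-2 one-third octave band k relative to the 1 kHz reference:
    returns (lower_edge, center, upper_edge) in Hz. Edges lie a factor
    of 2**(1/6) below and above the center."""
    center = ref * 2.0 ** (k / 3.0)
    return (center / 2.0 ** (1.0 / 6.0), center, center * 2.0 ** (1.0 / 6.0))

def band_level_db(power):
    """Band level in dB relative to an arbitrary reference power of 1."""
    return 10.0 * math.log10(power)
```

Band energies computed this way over speech frames give the per-band levels that such analyses compare across groups; elevated low-frequency band levels are a commonly cited acoustic correlate of nasalization.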


PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0258747
Author(s):  
Abigail R. Bradshaw ◽  
Carolyn McGettigan

Joint speech behaviours where speakers produce speech in unison are found in a variety of everyday settings, and have clinical relevance as a temporary fluency-enhancing technique for people who stutter. It is currently unknown whether such synchronisation of speech timing among two speakers is also accompanied by alignment in their vocal characteristics, for example in acoustic measures such as pitch. The current study investigated this by testing whether convergence in voice fundamental frequency (F0) between speakers could be demonstrated during synchronous speech. Sixty participants across two online experiments were audio recorded whilst reading a series of sentences, first on their own, and then in synchrony with another speaker (the accompanist) in a number of between-subject conditions. Experiment 1 demonstrated significant convergence in participants’ F0 to a pre-recorded accompanist voice, in the form of both upward (high F0 accompanist condition) and downward (low and extra-low F0 accompanist conditions) changes in F0. Experiment 2 demonstrated that such convergence was not seen during a visual synchronous speech condition, in which participants spoke in synchrony with silent video recordings of the accompanist. An audiovisual condition in which participants were able to both see and hear the accompanist in pre-recorded videos did not result in greater convergence in F0 compared to synchronisation with the pre-recorded voice alone. These findings suggest the need for models of speech motor control to incorporate interactions between self- and other-speech feedback during speech production, and suggest a novel hypothesis for the mechanisms underlying the fluency-enhancing effects of synchronous speech in people who stutter.
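F0 convergence of the kind measured here is typically quantified on a semitone scale after extracting F0 from each recording. The following is an illustrative sketch only (a naive autocorrelation F0 estimator plus the semitone distance, not the study's actual analysis pipeline):

```python
import math

def estimate_f0(frame, fs, fmin=75.0, fmax=400.0):
    """Naive autocorrelation F0 estimate: pick the lag in [fs/fmax, fs/fmin]
    that maximises the (unnormalised) autocorrelation of the frame."""
    lo, hi = int(fs / fmax), int(fs / fmin)
    best_lag, best_r = lo, float("-inf")
    for lag in range(lo, hi + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return fs / best_lag

def semitones(f1, f2):
    """Signed distance from f1 to f2 in semitones (12 per octave)."""
    return 12.0 * math.log2(f2 / f1)
```

Convergence can then be expressed as a reduction in the semitone distance between a participant's F0 and the accompanist's F0 from the solo to the synchronous condition.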


2021 ◽  
Vol 150 (4) ◽  
pp. A150-A150
Author(s):  
Margaret Cychosz ◽  
Jan R. Edwards ◽  
Nan Bernstein Ratner ◽  
Catherine Torrington Eaton ◽  
Rochelle S. Newman

Author(s):  
Giovanna Castilho Davatz ◽  
Rosiane Yamasaki ◽  
Adriana Hachiya ◽  
Domingos Hiroshi Tsuji ◽  
Arlindo Neto Montagnoli

2021 ◽  
Vol 11 (15) ◽  
pp. 7149
Author(s):  
Ji-Yeoun Lee

This work is focused on deep learning methods, such as the feedforward neural network (FNN) and convolutional neural network (CNN), for pathological voice detection using mel-frequency cepstral coefficients (MFCCs), linear prediction cepstrum coefficients (LPCCs), and higher-order statistics (HOSs) parameters. In total, 518 voice data samples were obtained from the publicly available Saarbruecken voice database (SVD), comprising recordings of 259 healthy and 259 pathological speakers, both women and men, producing the /a/, /i/, and /u/ vowels at normal pitch. Significant differences were observed between the normal and the pathological voice signals for normalized skewness (p = 0.000) and kurtosis (p = 0.000), except for the normalized kurtosis (p = 0.051) estimated in the /u/ samples in women. These parameters are therefore meaningful for classifying pathological voice signals. The highest accuracy, 82.69%, was achieved by the CNN classifier with the LPCCs parameter for the /u/ vowel in men. The second-best performance, 80.77%, was obtained with a combination of the FNN classifier, MFCCs, and HOSs for the /i/ vowel samples in women. There was merit in combining the acoustic measures with HOS parameters for better characterization in terms of accuracy. The combination of various parameters and deep learning methods was also useful for distinguishing normal from pathological voices.
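The HOS parameters named above, normalized skewness and kurtosis, are the standardized third and fourth central moments of the signal's amplitude distribution. A minimal sketch:

```python
def skewness(x):
    """Normalised skewness: third central moment / sigma**3.
    Zero for a symmetric amplitude distribution."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    m3 = sum((v - mu) ** 3 for v in x) / n
    return m3 / var ** 1.5

def kurtosis(x):
    """Normalised kurtosis: fourth central moment / sigma**4
    (3.0 for a Gaussian; subtract 3 for 'excess' kurtosis)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    m4 = sum((v - mu) ** 4 for v in x) / n
    return m4 / var ** 2
```

Applied to voice samples, departures of these moments from their values for normal phonation are what make them usable as classification features alongside the cepstral parameters.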


2021 ◽  
Author(s):  
Riccardo Fusaroli ◽  
Ruth Grossman ◽  
Niels Bilenberg ◽  
Cathriona Cantio ◽  
Jens Richardt Moellegaard Jepsen ◽  
...  

Acoustic atypicalities in speech production are widely documented in Autism Spectrum Disorder (ASD) and argued to be both a potential factor in atypical social development and potential markers of clinical features. A recent meta-analysis highlighted shortcomings in the field, in particular small sample sizes and study heterogeneity (Fusaroli, Lambrechts, Bang, Bowler, & Gaigg, 2017). We showcase a cumulative yet self-correcting approach to prosody in ASD to overcome these issues. We analyze a cross-linguistic corpus of multiple speech productions from 77 autistic children and adolescents and 72 typically developing (TD) ones (>1000 recordings in Danish and US English). We replicate findings of a minimal cross-linguistically reliable distinctive acoustic profile for ASD (higher pitch and longer pauses) with moderate effect sizes. We identified novel generally reliable differences between the two groups for the normalized amplitude quotient, maxima dispersion quotient and creakiness. However, all these relations are small, and there is likely no one general extensive acoustic profile characterizing all autistic individuals. We identified reliable and consistent relations of acoustic features with individual differences (age, gender) and clinical features: speech rate and ADOS sub-scores (Communication, Social, Stereotyped). Besides cumulatively building our understanding of acoustic atypicalities in ASD, the study concretely shows how to use systematic reviews and meta-analyses to guide follow-up studies, both in their design and their statistical inferences. We indicate future directions: larger and more diverse cross-linguistic datasets, use of previous findings as statistical priors, understanding of the covariance between acoustic measures, reliance on machine learning procedures, and open science.
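Pause duration, one of the replicated markers above, can be extracted from a frame-level energy track by finding sufficiently long low-energy runs. A toy sketch (the threshold and minimum-length parameters are illustrative assumptions, not the study's pipeline):

```python
def find_pauses(energy, threshold, min_frames):
    """Return (start, end) frame index pairs (end exclusive) for runs of
    frames whose energy stays below `threshold` for at least `min_frames`."""
    pauses, start = [], None
    for i, e in enumerate(energy):
        if e < threshold:
            if start is None:
                start = i  # a candidate pause begins here
        else:
            if start is not None and i - start >= min_frames:
                pauses.append((start, i))
            start = None
    # close out a pause that runs to the end of the recording
    if start is not None and len(energy) - start >= min_frames:
        pauses.append((start, len(energy)))
    return pauses
```

Summaries such as mean or total pause duration per recording would then be the per-speaker features entering the group comparison.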


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Amr Gaballah ◽  
Vijay Parsa ◽  
Daryn Cushnie-Sparrow ◽  
Scott Adams

This paper investigated the performance of a number of acoustic measures, both individually and in combination, in predicting the perceived quality of sustained vowels produced by people with Parkinson’s disease (PD). Sustained vowel recordings were collected from 51 PD patients before and after the administration of the Levodopa medication. Subjective ratings of the overall vowel quality were garnered using a visual analog scale. These ratings served to benchmark the effectiveness of the acoustic measures. Acoustic predictors of the perceived vowel quality included the harmonics-to-noise ratio (HNR), smoothed cepstral peak prominence (CPP), recurrence period density entropy (RPDE), Gammatone frequency cepstral coefficients (GFCCs), linear prediction (LP) coefficients and their variants, and modulation spectrogram features. Linear regression (LR) and support vector regression (SVR) models were employed to assimilate multiple features. Different feature dimensionality reduction methods were investigated to avoid model overfitting and enhance the prediction capabilities for the test dataset. Results showed that the RPDE measure performed the best among all individual features, while a regression model incorporating a subset of features produced the best overall correlation of 0.80 between the predicted and actual vowel quality ratings. This model may therefore serve as a surrogate for auditory-perceptual assessment of Parkinsonian vowel quality. Furthermore, the model may offer the clinician a tool to predict who may benefit from Levodopa medication in terms of enhanced voice quality.
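The reported 0.80 between predicted and actual ratings is a Pearson correlation. A minimal sketch of ordinary least squares for a single predictor together with the correlation computation (illustrative only; the paper's multi-feature LR/SVR pipeline is more involved):

```python
import math

def fit_line(x, y):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def pearson_r(x, y):
    """Pearson correlation between predicted and actual ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

With multiple acoustic predictors, the same least-squares idea generalises to the normal equations, and `pearson_r` applied to model outputs versus listener ratings gives the benchmark figure quoted above.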


2021 ◽  
pp. 1-49
Author(s):  
Rachel Hargrave ◽  
Amy Southall ◽  
Abby Walker

Two apparently contradictory observations have been made about consonantal voicing in Southern US English: compared to other US varieties, Southern speakers produce more voicing on “voiced” stops, but they also “devoice” word-final /z/ at higher rates. In this paper, regional differences in final /z/ realization within Virginia are investigated. Thirty-six students from Southwest and Northern Virginia were recorded completing tasks designed to elicit /z/-final tokens. Tokens were acoustically analyzed for duration and voicing, and automatically categorized as [z] or [s] using an HTK forced aligner. At the surface level, the two approaches yield incompatible results: the acoustic measures alone suggest Southwest Virginians produce more [z]-like /z/ tokens than Northern Virginians, while the aligner finds that Southern-identifying participants produce the most [s]-like tokens. However, both analyses converge on the importance of the following environment: Southwest Virginians’ tokens are least voiced pre-pausally and more voiced in other environments. These combined findings confirm previous work showing that Southern “voiced” consonants generally have more voicing than those of other regional US varieties, but also suggest that the dialect may exhibit greater phrase-final fortition. There are also differences within Southwest Virginian speakers based on differences in their rurality or in their orientation to the South.
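Degree of /z/ devoicing is often summarized as the proportion of the fricative during which voicing persists. A toy classifier over frame-level voicing decisions (the 50% cutoff is an assumption for illustration, not the paper's HTK-based method):

```python
def voiced_fraction(voiced_frames):
    """Proportion of frames judged voiced (booleans from e.g. a pitch tracker)."""
    return sum(voiced_frames) / len(voiced_frames)

def classify_final_fricative(voiced_frames, cutoff=0.5):
    """Label a word-final /z/ token as [z] (mostly voiced) or [s] (devoiced),
    using an assumed voiced-fraction cutoff."""
    return "z" if voiced_fraction(voiced_frames) >= cutoff else "s"
```

Tabulating such labels by following environment (pre-pausal versus pre-vocalic, for example) is one simple way to expose the positional pattern the two analyses converge on.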

