voice quality
Recently Published Documents


TOTAL DOCUMENTS

1613
(FIVE YEARS 378)

H-INDEX

63
(FIVE YEARS 6)

2022 ◽  
Vol 22 (1) ◽  
pp. 1-16
Author(s):  
Laura Verde ◽  
Nadia Brancati ◽  
Giuseppe De Pietro ◽  
Maria Frucci ◽  
Giovanna Sannino

Edge Analytics and Artificial Intelligence are important features of the current smart connected living community. In a society where people, homes, cities, and workplaces are simultaneously connected through various devices, primarily mobile devices, a considerable amount of data is exchanged, and processing and storing these data are laborious and difficult tasks. Edge Analytics allows such data to be collected and analysed on mobile devices, such as smartphones and tablets, without involving a cloud-centred architecture that cannot guarantee real-time responsiveness. Meanwhile, Artificial Intelligence techniques can be a valid instrument for processing data, limiting computation time and optimising decision processes and predictions in several sectors, such as healthcare. Within this field, this article proposes an approach for evaluating voice quality. A fully automatic algorithm, based on Deep Learning, classifies a voice as healthy or pathological by analysing spectrogram images extracted from recordings of the vowel /a/, in compliance with the traditional medical protocol. A light Convolutional Neural Network is embedded in a mobile health application in order to provide an instrument capable of assessing voice disorders in a fast, easy, and portable way. Thus, an ordinary mobile device becomes a screening tool useful for the early diagnosis, monitoring, and treatment of voice disorders. The proposed approach has been tested on a broad set of voice samples, not limited to the most common voice diseases but including all the pathologies present in three different databases, achieving F1-scores on the test sets of 80%, 90%, and 73%. Although the proposed network consists of a reduced number of layers, the results are very competitive compared to those of other “cutting edge” approaches constructed using more complex neural networks, and compared to classic deep neural networks such as VGG-16 and ResNet-50.
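The F1-scores quoted in this abstract combine precision and recall into a single figure. As an illustrative sketch only, the snippet below shows the arithmetic for a healthy-versus-pathological classifier. The function name and the confusion counts are assumptions made up for the example and are not taken from the article.

```python
def f1_score(true_positive: int, false_positive: int, false_negative: int) -> float:
    """Return F1 = 2PR / (P + R), treating 'pathological' as the positive class."""
    if true_positive == 0:
        return 0.0
    precision = true_positive / (true_positive + false_positive)
    recall = true_positive / (true_positive + false_negative)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_f1 = f1_score(true_positive=80, false_positive=20, false_negative=20)
print(f"F1 = {example_f1:.2f}")
```

With equal precision and recall, as in these invented counts, F1 equals that shared value; it drops below the arithmetic mean as soon as the two diverge.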


Author(s):  
Chieh Kao ◽  
Maria D. Sera ◽  
Yang Zhang

Purpose: The aim of this study was to investigate infants' listening preference for emotional prosodies in spoken words and identify their acoustic correlates. Method: Forty-six 3- to 12-month-old infants (M age = 7.6 months) completed a central fixation (or look-to-listen) paradigm in which four emotional prosodies (happy, sad, angry, and neutral) were presented. Infants' looking time to the string of words was recorded as a proxy of their listening attention. Five acoustic variables were also analyzed to account for infants' attentiveness to each emotion: mean fundamental frequency (F0), word duration, intensity variation, harmonics-to-noise ratio (HNR), and spectral centroid. Results: Infants generally preferred affective over neutral prosody, with more listening attention to the happy and sad voices. Happy sounds with breathy voice quality (low HNR) and less brightness (low spectral centroid) maintained infants' attention more. Sad speech with shorter word duration (i.e., faster speech rate), less breathiness, and more brightness gained infants' attention more than happy speech did. Infants listened less to angry than to happy and sad prosodies, and none of the acoustic variables were associated with infants' listening interests in angry voices. Neutral words with a lower F0 attracted infants' attention more than those with a higher F0. Neither age nor sex effects were observed. Conclusions: This study provides evidence for infants' sensitivity to the prosodic patterns for the basic emotion categories in spoken words and how the acoustic properties of emotional speech may guide their attention. The results point to the need to study the interplay between early socioaffective and language development.
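The abstract uses spectral centroid as its measure of "brightness". A minimal sketch of one standard definition, the magnitude-weighted mean frequency of a frame's spectrum, is given below. The function name, sample rate, frame length, and synthetic tones are assumptions for illustration; the study's own analysis settings are not stated in the abstract.

```python
import cmath
import math


def spectral_centroid(samples: list[float], sample_rate: float) -> float:
    """Magnitude-weighted mean frequency (Hz) of the one-sided DFT spectrum."""
    n = len(samples)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        # Naive DFT: fine for a short illustrative frame, too slow for real use.
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        magnitude = abs(coefficient)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum else 0.0


SAMPLE_RATE = 8000.0
FRAME_LENGTH = 256
# Synthetic test tones (assumed values, not infant-directed speech).
dull_tone = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) for i in range(FRAME_LENGTH)]
bright_tone = [math.sin(2 * math.pi * 2000 * i / SAMPLE_RATE) for i in range(FRAME_LENGTH)]
print(spectral_centroid(dull_tone, SAMPLE_RATE) < spectral_centroid(bright_tone, SAMPLE_RATE))
```

A sound with more energy at high frequencies yields a higher centroid, which is why the measure is commonly read as brightness. In practice an FFT routine from a numerical library would replace the naive loop.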


2021 ◽  
Vol 26 (4) ◽  
pp. 921-932
Author(s):  
Ji Sung Kim ◽  
Seong Hee Choi ◽  
Kyoungjae Lee ◽  
Chul-Hee Choi ◽  
Soo-Geun Wang ◽  
...  

Objectives: The purpose of this study is to investigate the characteristics of vocal fold vibration during sustained vowel /a/ phonation and various semi-occluded vocal tract exercises (SOVTEs) using a vibration simulator and digital kymography (DKG). Methods: A total of 12 normal young speakers (6 males, 6 females) aged 20-30 years participated in the study. They phonated a sustained /a/ vowel and performed SOVTEs. The vocal fold vibration characteristics were measured according to the number of vibration sources (single vs. double) and the degree of vocal tract occlusion using a vibration simulator and DKG. Glottal gap quotient (GQ, %), speed quotient (SQ, %), and amplitude (pixel) were estimated quantitatively from the DKG image. Results: Significantly higher GQ (p = .000) and SQ (p = .000) were observed in humming and the bilabial fricative /β/ compared to open vowels. The amplitude was significantly higher in the open vowel /a/ than in humming (p = .018) and the bilabial fricative /β/ (p = .003). Also, when comparing the vocal fold vibration parameters according to vibration type (single source: straw phonation vs. double source: straw phonation with water), the double source presented a significantly higher GQ (p = .000) as well as SQ (p = .008) in comparison with a single source. Conclusion: SOVTEs showed a glottal gap that is different from that of the open vowel /a/. They also had a longer opening of the vocal folds and a smaller amplitude than the vowel. This suggests that SOVTEs may be helpful for facilitating vocal fold vibration and good voice quality in clinical practice. The current study can be meaningful in providing theoretical and clinical evidence for SOVTEs.


Author(s):  
Seung Jin Lee

The auditory-perceptual evaluation performed by speech-language pathologists (SLPs) in patients with voice disorders is often regarded as a touchstone in multi-dimensional voice evaluation procedures and provides important information not available from other assessment modalities. SLPs therefore need to conduct a comprehensive and in-depth evaluation of not only the voice but also the overall speech production mechanism, and they often encounter various difficulties in the evaluation process. In addition, SLPs should strive to avoid bias during the evaluation process and to maintain a wide and constant spectrum of severity for each parameter of voice quality. Lastly, it is very important for SLPs to take a team approach by documenting and delivering important information on auditory-perceptual characteristics in an appropriate and efficient way through close communication with laryngologists.


2021 ◽  
pp. 1-9
Author(s):  
Jess C.S. Chan ◽  
Julie C. Stout ◽  
Christopher A. Shirbin ◽  
Adam P. Vogel

Background: Subtle progressive changes in speech motor function and cognition begin prior to diagnosis of Huntington’s disease (HD). Objective: To determine the nature of listener-rated speech differences in premanifest and early-stage HD (i.e., PreHD and EarlyHD), compared to neurologically healthy controls. Methods: We administered a speech battery to 60 adults (16 people with PreHD, 14 with EarlyHD, and 30 neurologically healthy controls), and conducted a cognitive test of processing speed/visual attention, the Symbol Digit Modalities Test (SDMT) on participants with HD. Voice recordings were rated by expert listeners and analyzed for acoustic and perceptual speech features. Results: Listeners perceived subtle differences in the speech of PreHD compared to controls, including abnormal pitch level and speech rate, reduced loudness and loudness inflection, altered voice quality, hypernasality, imprecise articulation, and reduced naturalness of speech. Listeners detected abnormal speech rate in PreHD compared to healthy speakers on a reading task, which correlated with slower speech rate from acoustic analysis and a lower cognitive performance score. In early-stage HD, continuous speech was characterized by longer pauses, a higher proportion of silence, and slower rate. Conclusion: Differences in speech and voice acoustic features are detectable in PreHD by expert listeners and align with some acoustically-derived objective speech measures. Slower speech rate in PreHD suggests altered oral motor control and/or subtle cognitive deficits that begin prior to diagnosis. Speakers with EarlyHD exhibited more silences compared to the PreHD and control groups, raising the likelihood of a link between speech and cognition that is not yet well characterized in HD.


Phonetica ◽  
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Qandeel Hussain ◽  
Alexei Kochetov

Punjabi is an Indo-Aryan language which contrasts a rich set of coronal stops at dental and retroflex places of articulation across three laryngeal configurations. Moreover, all these stops occur contrastively in various positions (word-initially, -medially, and -finally). The goal of this study is to investigate how various coronal place and laryngeal contrasts are distinguished acoustically both within and across word positions. A number of temporal and spectral correlates were examined in data from 13 speakers of Eastern Punjabi: Voice Onset Time, release and closure durations, fundamental frequency, F1-F3 formants, spectral center of gravity and standard deviation, H1*-H2*, and cepstral peak prominence. The findings indicated that higher formants and spectral measures were most important for the classification of place contrasts across word positions, whereas laryngeal contrasts were reliably distinguished by durational and voice quality measures. Word-medially and -finally, F2 and F3 of the preceding vowels played a key role in distinguishing the dental and retroflex stops, while spectral noise measures were more important word-initially. The findings of this study contribute to a better understanding of factors involved in the maintenance of typologically rare and phonetically complex sets of place and laryngeal contrasts in the coronal stops of Indo-Aryan languages.


2021 ◽  
Vol 11 (12) ◽  
pp. 293-298
Author(s):  
Piotr Artur Machowiec ◽  
Marcela Maksymowicz ◽  
Gabriela Ręka ◽  
Halina Piecewicz-Szczęsna

Introduction and purpose: Currently, we can distinguish three basic groups of instruments: wind instruments, percussion instruments, and plucked instruments. In wind instruments, the source of sound is a vibrating column of air, which the player creates by blowing. It is suspected that such vibration may cause specific vocal and laryngeal symptoms. The aim of the study was to present the current state of knowledge regarding the potential relation between playing wind instruments and vocal tract disorders. Material and methods: The article reviews 19 publications available in the PubMed, Google Scholar, and Web of Science databases that met the assumed criteria: published as full text, with no time limit, and conducted on humans. The studies were found using initially established search strategies as well as subsequent manual searching so that relevant articles were not missed. State of knowledge: Laryngeal symptoms may be combined with vocal symptoms. The main vocal manifestations reported among instrumentalists are dysphonia, hoarseness, and altered voice quality. Compared with controls, wind instrument players had higher VHI-10 (Voice Handicap Index), F0 (fundamental frequency), and HNR (harmonics-to-noise ratio) values, while jitter % and shimmer %, which are perturbation parameters, were lower. Most of the studies are limited by small numbers of volunteers. Conclusions: The vocal tract symptoms related to playing wind instruments are characterized by a low frequency of occurrence and low intensity. However, further research is needed to assess this relation.
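Jitter % and shimmer % measure cycle-to-cycle perturbation of the glottal period and of the peak amplitude, respectively. The sketch below uses one common "local" definition: the mean absolute difference between consecutive cycles, divided by the mean value. It is an assumption that this variant matches the reviewed studies, which may have used others, and the function name and measurements are invented for illustration.

```python
def local_perturbation_percent(values: list[float]) -> float:
    """Mean absolute difference between consecutive values, as % of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_abs_diff = sum(abs(b - a) for a, b in zip(values, values[1:])) / (len(values) - 1)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical cycle-by-cycle measurements (invented for illustration).
periods_ms = [5.00, 5.02, 4.98, 5.01, 4.99]       # glottal cycle durations
peak_amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]  # per-cycle peak amplitudes

jitter_percent = local_perturbation_percent(periods_ms)        # period perturbation
shimmer_percent = local_perturbation_percent(peak_amplitudes)  # amplitude perturbation
print(f"jitter = {jitter_percent:.2f}%, shimmer = {shimmer_percent:.2f}%")
```

Lower values indicate a more regular vibration, which is how the lower jitter and shimmer reported for the study group should be read.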


Languages ◽  
2021 ◽  
Vol 6 (4) ◽  
pp. 211
Author(s):  
Susanne Fuchs ◽  
Laura L. Koenig ◽  
Annette Gerstenberg

Aging in speech production is a multidimensional process. Biological, cognitive, social, and communicative factors can change over time, stay relatively stable, or may even compensate for each other. In this longitudinal work, we focus on stability and change at the laryngeal and supralaryngeal levels in the discourse particle euh produced by 10 older French-speaking females at two times, 10 years apart. Recognizing the multiple discourse roles of euh, we divided out occurrences according to utterance position. We quantified the frequency of euh, and evaluated acoustic changes in formants, fundamental frequency, and voice quality across time and utterance position. Results showed that euh frequency was stable with age. The only acoustic measure that revealed an age effect was harmonics-to-noise ratio, showing less noise at older ages. Other measures mostly varied with utterance position, sometimes in interaction with age. Some voice quality changes could reflect laryngeal adjustments that provide for airflow conservation utterance-finally. The data suggest that aging effects may be evident in some prosodic positions (e.g., utterance-final position), but not others (utterance-initial position). Thus, it is essential to consider the interactions among these factors in future work and not assume that vocal aging is evident throughout the signal.


2021 ◽  
pp. 102986492110478
Author(s):  
Mauro B. Fiuza ◽  
Maria Luisa Sevillano ◽  
Filipa M.B. Lã

Menopause is a certainty in a female singer’s life; depletion of estrogens may lead to physical, mental, and vocal symptoms. To investigate the extent to which these symptoms affect singers, a systematic literature review was carried out using eight interdisciplinary bibliographic databases. Combinations of the following key words were used: menopause, climacterium, singing, singers, and choir. Of 18 studies, including three doctoral dissertations and a master’s thesis, only 10 met the inclusion criteria. The heterogeneity of study designs and methods of data collection and analysis precluded a meta-analysis. Instead, descriptors of symptoms affecting the voice and of vocal characteristics associated with menopause (menopause descriptors) were categorized, and their frequency of occurrence determined, according to six types of primary dataset: (1) self-reported symptoms, (2) acoustic characteristics, (3) observations of the larynx, (4) perceptual evaluations, (5) analysis of electrolaryngographic waveform characteristics, and (6) analysis of hormone concentrations. The descriptors that occurred most frequently in the literature concerned aspects of voice production, whereas those concerning vocal health, and vocal practice and performance, were less common. Of the three subsystems that comprise the vocal instrument, the vibrating vocal folds seem to be more affected than breathing and resonance. Changes in vocal range, timbre, endurance, and vocal fold mobility occur during menopause, affecting singers’ voice quality. Some singers reported that their ability to perform was compromised, mainly due to memory lapses and lack of confidence. Maintaining regular singing and practicing semi-occluded vocal tract exercises throughout the menopausal transition seem to help singers to overcome the negative impacts of menopause on vocal performance.

