Intelligible Speech
Recently Published Documents


TOTAL DOCUMENTS: 77 (five years: 16)
H-INDEX: 19 (five years: 2)

2021 ◽ Vol 64 (10) ◽ pp. 3786-3793
Author(s): Koji Sato, Junji Genda, Ryoya Minabe, Takumi Taniguchi

Purpose: The aim of this study was to investigate the characteristics of electrolaryngeal (EL) speech among untrained speakers, to aid in its effective introduction, and to identify syllables and words that are easy or difficult to pronounce.
Method: A total of 21 healthy individuals who had never used an EL were included. The participants were briefed, and tests comprising 100 Japanese syllables and 50 single words were conducted to evaluate EL speech intelligibility. A trained speaker was defined as a certified speech-language pathologist who had undergone EL training for 3 months. A 5-point electrolarynx effectivity score (EES) was used for the subjective assessment of the EL.
Results: The median (interquartile range) intelligibility scores of the untrained and trained groups were 24.0% (20.0%–34.0%) and 40.0% (36.0%–45.0%) for syllables, and 48.0% (38.0%–60.0%) and 88.0% (82.0%–90.0%) for words, respectively. The intelligibility scores for both syllables and words were higher in the trained group than in the untrained group. Only two syllable subgroups (/m/ and /w/) had > 80% correct answers among untrained speakers. A total of 14 syllable subgroups (/k, kʲ, s, ɕ, t, t͡ɕ, ts, ɲ, h, ç, ɸ, p, pʲ, and a/), many of which contained voiceless consonants, had < 40% correct answers in both speaker groups. A greater number of morae was associated with higher intelligibility scores. An EES of 4, indicating that the EL was effective, was the most frequent score.
Conclusions: It was difficult for untrained speakers to produce intelligible speech using an EL. Syllables containing voiceless consonants were difficult to pronounce with an EL. Longer words with a greater number of morae were more intelligible, even for untrained EL speakers.
Supplemental Material: https://doi.org/10.23641/asha.16632622
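For readers unfamiliar with how such intelligibility figures are derived, the sketch below shows the basic arithmetic: each speaker's score is the percentage of test items listeners transcribe correctly, and groups are summarized by the median and interquartile range. This is a minimal illustration with invented scores, not the study's analysis code.

```python
import statistics

def intelligibility(correct: int, total: int) -> float:
    """Percentage of test items that listeners transcribed correctly."""
    return 100.0 * correct / total

# Hypothetical syllable scores for five untrained speakers (100 items each).
scores = [intelligibility(c, 100) for c in (20, 24, 34, 22, 30)]

q1, median, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
print(f"median {median:.1f}% (IQR {q1:.1f}%-{q3:.1f}%)")
```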


2021 ◽ pp. 945-950
Author(s): Jonathan Pollock, Maniram Ragbir

Reconstruction of the pharynx and cervical oesophagus represents a significant challenge for the plastic surgeon due to the complex functions and anatomical structures to be restored. Malignancy is the most common cause of pharyngeal defects, and this patient cohort has a generally poor prognosis due to the tendency to present late with advanced disease. While tumours of the hypopharynx make up only 5% of all head and neck cancers, the surgical management of laryngeal and upper oesophageal malignancy frequently involves the need to reconstruct or reinforce the pharynx. Reconstruction for these patients is further complicated by medical comorbidity, synchronous malignancy, and the patients’ poor nutritional state. With the move in recent years to chemoradiotherapy protocols for advanced disease, surgery is performed less frequently. Nevertheless, there is still a need for surgical management in both primary treatment and salvage cases following chemoradiotherapy. As survival is poor in this group, it is important that quality of life after reconstruction is considered, and hospital stay minimized. The restoration of an oral diet and intelligible speech is the priority, but there are a multitude of factors which must be considered when selecting the best reconstruction. This chapter outlines some historical methods of pharyngeal reconstruction, followed by the indications, advantages, and disadvantages of the methods in current use.


Author(s): Syifa' Khuriyatuz Zahro

Research analyzing accented yet intelligible foreign-language speech has been widely published; however, studies concerning listeners' awareness of pronunciation errors are scarce. The current study therefore aims to identify the segmental features and sources of error that cause unintelligibility in Indonesian-accented speech, and to describe listeners' awareness of those errors. This descriptive qualitative study investigates listeners' transcripts of Indonesian-accented speech, selected through purposive sampling. The standard orthographic transcripts are transformed into phonemic transcripts and analyzed by error analysis based on the phonological operations of Davenport and Hannahs. The results are then checked with the listeners in interviews to gauge their awareness of the errors. Consonants caused unintelligibility more often than vowels. Furthermore, six pronunciation features affected listeners' awareness of speakers' pronunciation errors: 1) aspiration, 2) spelling system, 3) blended phonemes, 4) absent phonemes, 5) different articulation, and 6) homophones.
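As a rough illustration of the kind of error analysis described above (the study's own procedure follows Davenport and Hannahs' phonological operations), the sketch below compares a target phonemic transcript with a listener's transcript and tallies whether each mismatch falls on a consonant or a vowel. The transcripts and phoneme inventory are invented for demonstration.

```python
# Invented ASCII-ish phonemic transcripts; a real analysis would use IPA
# and align the sequences (e.g., with a phonetic aligner) before scoring.
VOWELS = set("aeiou")

def tally_errors(target: str, heard: str) -> dict:
    counts = {"consonant": 0, "vowel": 0}
    for t, h in zip(target, heard):          # naive position-wise comparison
        if t != h:
            counts["vowel" if t in VOWELS else "consonant"] += 1
    return counts

# e.g., target "Tink" (/think/) heard as "tink" (fricative replaced by a stop)
print(tally_errors("Tink", "tink"))          # {'consonant': 1, 'vowel': 0}
```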


2021
Author(s): Jonathan Henry Venezia, Virginia Richards, Gregory Hickok

We recently developed a method to estimate speech-driven spectrotemporal receptive fields (STRFs) using fMRI. The method uses spectrotemporal modulation filtering, a form of acoustic distortion that renders speech sometimes intelligible and sometimes unintelligible. Using this method, we found significant STRF tuning only in classic auditory regions throughout the superior temporal lobes. However, our analysis was not optimized to detect small clusters of tuned STRFs as might be expected in non-auditory regions. Here, we re-analyze our data using a more sensitive multivariate procedure, and we identify STRF tuning in non-auditory regions including the left dorsal premotor cortex (left dPM), left inferior frontal gyrus (LIFG), and bilateral calcarine sulcus (calcS). All three regions responded more to intelligible than unintelligible speech, but left dPM and calcS responded significantly to vocal pitch and demonstrated strong functional connectivity with early auditory regions. However, only left dPM’s STRF predicted activation on trials rated as unintelligible by listeners, a hallmark auditory profile. LIFG, on the other hand, responded almost exclusively to intelligible speech and was functionally connected with classic speech-language regions in the superior temporal sulcus and middle temporal gyrus. LIFG’s STRF was also (weakly) able to predict activation on unintelligible trials, suggesting the presence of a partial ‘acoustic trace’ in the region. We conclude that left dPM is part of the human dorsal laryngeal motor cortex, a region previously shown to be capable of operating in an ‘auditory mode’ to encode vocal pitch. Further, given previous observations that LIFG is involved in syntactic working memory and/or processing of linear order, we conclude that LIFG is part of a higher-order speech circuit that exerts a top-down influence on processing of speech acoustics. Finally, because calcS is modulated by emotion, we speculate that changes in the quality of vocal pitch may have contributed to its response.
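To make the idea of an STRF concrete: in its simplest form, estimation amounts to regressing a voxel's responses onto the spectrotemporal modulation content of each stimulus. The sketch below is a conceptual, simulated ridge-regression version; the paper's actual pipeline, and the multivariate re-analysis it reports, are considerably more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_features = 200, 64        # e.g., 8 spectral x 8 temporal modulation bins
X = rng.standard_normal((n_trials, n_features))   # per-trial modulation energies
true_strf = rng.standard_normal(n_features)       # simulated "ground truth" tuning
y = X @ true_strf + 0.5 * rng.standard_normal(n_trials)   # voxel responses + noise

lam = 10.0   # ridge penalty; in practice chosen by cross-validation
strf = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Tuning is then assessed by how well the estimated STRF predicts responses.
print("fit r =", round(float(np.corrcoef(X @ strf, y)[0, 1]), 3))
```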


2021
Author(s): Francis Xavier Smith, Bob McMurray

Listeners often process speech in adverse conditions. One challenge is spectral degradation, where information is missing from the signal. Lexical competition dynamics change when processing degraded speech, but it is unclear why and how these changes occur. We ask whether these changes are driven solely by the quality of the input from the auditory periphery, or whether they are modulated by cognitive mechanisms. Across two experiments, we used the visual world paradigm to investigate changes in lexical processing. Listeners heard different levels of noise-vocoded speech (4- or 15-channel vocoding) and matched the auditory input to pictures of a target word and its phonological competitors. In Experiment 1, levels of vocoding were either blocked together consistently or randomly interleaved from trial to trial. Listeners in the blocked condition showed more differentiation between the two levels of vocoding; this suggests that some form of learning is at work to adapt to the varying levels of uncertainty in the input. Exploratory analyses suggested that when less intelligible speech is processed there is a cost to switching processing modes. In Experiment 2, levels of vocoding were always randomly interleaved, and a visual cue was added to inform listeners of the level of difficulty of the upcoming speech. This was enough to attenuate the effects of interleaving as well as the switch cost. These experiments support a role for central processing in dealing with degraded speech. Listeners may be actively forming expectations about the level of degradation they will encounter and altering the dynamics of lexical access.
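Noise vocoding itself is a standard manipulation: the speech signal is split into frequency bands, each band's amplitude envelope is extracted, and the envelopes are used to modulate band-limited noise, so fewer channels (4 vs. 15 here) means a more degraded signal. Below is a minimal vocoder sketch under those assumptions, not the experiment's stimulus code.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_channels=4, lo=100.0, hi=7000.0):
    edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced band edges
    out = np.zeros_like(speech)
    noise = np.random.default_rng(0).standard_normal(len(speech))
    for lo_f, hi_f in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo_f, hi_f], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)            # speech in this band
        env = np.abs(hilbert(band))                # amplitude envelope
        # Modulate band-limited noise with the envelope, then re-band-limit.
        out += sosfiltfilt(sos, env * sosfiltfilt(sos, noise))
    return out

fs = 16000
t = np.arange(fs) / fs
demo = np.sin(2 * np.pi * 220 * t) * (t < 0.5)     # stand-in for a speech signal
degraded = noise_vocode(demo, fs, n_channels=4)
```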


2020 ◽ Vol 53 (03) ◽ pp. 363-370
Author(s): Hemant A. Saraiya

Abstract
Background: Ameloblastoma is a benign yet locally aggressive odontogenic tumor of the jaw with high recurrence rates. Despite many studies, the search continues for a treatment approach that yields acceptable recurrence rates together with good functional and esthetic results.
Methods: In this prospective study, we operated on 37 patients with mandibular ameloblastoma between 2009 and 2018. Two patients were treated with curettage and chemical sterilization of the cavity. Resection of the tumor with a 2-cm margin was performed in the remaining 35 patients. The mandibular defect was primarily reconstructed with a microvascular free fibular flap in 29 patients.
Results: Follow-up ranged from 6 months to 7.7 years, with a mean of 5.1 years. The tumor recurred within a year in both patients (100%) treated with curettage. Of the 35 radical excisions, only one patient (2.85%) developed recurrence, after a 3-year disease-free interval. Good mouth opening, intelligible speech, a satisfactory lower jaw shape, and a good facial profile were achieved in all 29 patients treated with a primary free fibular flap.
Conclusion: We prefer wide excision with 2-cm margins on each side of the tumor, with primary reconstruction of the mandible, in all cases of mandibular ameloblastoma. The free fibular microvascular flap is our treatment of choice, as all defects of the mandible can be reconstructed with it. Wide excision is the key to preventing recurrence.


2020
Author(s): Irena Lovcevic, Marina Kalashnikova, Denis Burnham

This study investigated the effects of hearing loss and hearing experience on the acoustic features of infant-directed speech (IDS). Experiment 1 compared IDS to infants with hearing loss (HL) with IDS to normal-hearing (NH) controls matched by either chronological age or hearing age; Experiment 2 examined IDS to infants with HL across development, as well as the relation between IDS features and infants' developing lexical abilities. Both experiments included detailed acoustic analyses of mothers' productions of the three corner vowels /a, i, u/ and of utterance-level pitch in IDS and in adult-directed speech (ADS). Experiment 1 demonstrated that IDS to infants with HL was acoustically more variable than IDS to hearing-age-matched infants with NH. Experiment 2 yielded no changes in IDS features over development; however, the results did show a positive relationship between formant distances in mothers' speech and infants' concurrent receptive vocabulary size, as well as between vowel hyperarticulation and infants' expressive vocabulary. These findings suggest that despite their HL, and thus diminished access to speech input, infants with HL are exposed to IDS with generally similar acoustic qualities as infants with NH. However, some differences persist, indicating that infants with HL might receive less intelligible speech.
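One common way to quantify vowel hyperarticulation of the corner vowels is the area of the /a, i, u/ triangle in F1-F2 space; a hedged sketch follows (the paper's exact metrics may differ, and the formant values below are illustrative).

```python
def triangle_area(p1, p2, p3):
    """Shoelace formula for the area of a triangle given (F1, F2) points in Hz."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2

# Hypothetical formants for /a/, /i/, /u/ in IDS vs. ADS.
ids_area = triangle_area((850, 1300), (310, 2800), (350, 800))
ads_area = triangle_area((750, 1350), (350, 2500), (400, 900))
print(f"IDS vowel space {ids_area:.0f} Hz^2 vs ADS {ads_area:.0f} Hz^2")
```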


2020
Author(s): Sander van Bree, Ediz Sohoglu, Matthew H Davis, Benedikt Zoefel

Abstract
Rhythmic sensory or electrical stimulation will produce rhythmic brain responses. These rhythmic responses are often interpreted as endogenous neural oscillations aligned to the stimulus rhythm. However, stimulus-aligned brain responses can also be explained as a sequence of evoked responses, which only appear regular due to the rhythmicity of the stimulus, without necessarily involving underlying neural oscillations. To distinguish evoked responses from true oscillatory activity, we tested whether rhythmic stimulation produces oscillatory responses which continue after the end of the stimulus. Such sustained effects provide evidence for true involvement of neural oscillations. In Experiment 1, we found that rhythmic intelligible, but not unintelligible speech produces oscillatory responses in magnetoencephalography (MEG) which outlast the stimulus at parietal sensors. In Experiment 2, we found that transcranial alternating current stimulation (tACS) leads to rhythmic fluctuations in speech perception outcomes which continue after the end of electrical stimulation. We further report that the phase relation between electroencephalography (EEG) and rhythmic intelligible speech can predict the tACS phase that leads to most accurate speech perception. Together, our results lay the foundation for a new account of speech perception which includes endogenous neural oscillations as a key underlying principle.
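One way to picture the tACS analysis described above: if perception is rhythmically modulated, accuracy plotted against stimulation phase should follow a cosine, whose peak marks the optimal phase. The sketch below fits such a cosine to simulated phase-binned accuracies; it illustrates the idea and is not the authors' analysis code.

```python
import numpy as np

rng = np.random.default_rng(1)
phases = np.linspace(0, 2 * np.pi, 8, endpoint=False)   # 8 tACS phase bins
true_best = np.pi / 3                                    # simulated optimal phase
accuracy = 0.6 + 0.1 * np.cos(phases - true_best) + 0.02 * rng.standard_normal(8)

# Least-squares cosine fit: accuracy ~ a*cos(phase) + b*sin(phase) + c
A = np.column_stack([np.cos(phases), np.sin(phases), np.ones_like(phases)])
a, b, c = np.linalg.lstsq(A, accuracy, rcond=None)[0]
best_phase = np.arctan2(b, a)            # phase at which accuracy peaks
print(f"estimated optimal tACS phase: {best_phase:.2f} rad")
```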


2020 ◽ Vol 9 (1) ◽ pp. 1764-1769

A text-to-speech (TTS) system is a speech synthesis application that converts text to speech. The current project focuses on developing a TTS system for the Tamil language using the unit selection synthesis technique. Letter-level segmentation of the input text reduces the corpus size compared with syllable-level segmentation. The segmented units are retrieved by their Unicode values and concatenated to produce the synthesized speech. The intelligibility and naturalness of the spoken output can be improved using smoothing techniques: an optimal coupling smoothing technique is implemented to smooth the transitions between concatenated speech segments, creating continuous, human-like speech output. A fraction-based waveform concatenation method is used to produce intelligible speech segments from the pre-recorded speech database.
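A minimal sketch of the concatenation-plus-smoothing step described above, with a simple linear crossfade standing in for the optimal coupling technique and sine tones standing in for recorded Tamil letter units (a real system would store waveforms keyed by their Unicode letters):

```python
import numpy as np

FS = 16000
# Stand-in "recorded units": sine tones keyed by Tamil letters.
UNITS = {ch: np.sin(2 * np.pi * f * np.arange(FS // 8) / FS)
         for ch, f in {"க": 220, "ம": 330, "ல்": 440}.items()}

def synthesize(letters, overlap=400):
    """Concatenate letter units, crossfading `overlap` samples at each join."""
    out = UNITS[letters[0]].copy()
    ramp = np.linspace(0.0, 1.0, overlap)
    for ch in letters[1:]:
        unit = UNITS[ch]
        # Linear crossfade; optimal coupling would instead search for the
        # join point that minimizes the spectral discontinuity.
        out[-overlap:] = out[-overlap:] * (1 - ramp) + unit[:overlap] * ramp
        out = np.concatenate([out, unit[overlap:]])
    return out

speech = synthesize(["க", "ம", "ல்"])   # ~0.3 s of concatenated audio
```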


2020 ◽ Vol 201 ◽ pp. 104713
Author(s): Nanxi Fei, Jianqiao Ge, Yi Wang, Jia-Hong Gao
