Effect of Text-to-Speech Rate on Reading Comprehension by Adults With Aphasia

2020 ◽  
Vol 29 (1) ◽  
pp. 168-184 ◽  
Author(s):  
Karen Hux ◽  
Jessica A. Brown ◽  
Sarah Wallace ◽  
Kelly Knollman-Porter ◽  
Anna Saylor ◽  
...  

Purpose Accessing auditory and written material simultaneously benefits people with aphasia; however, the extent of benefit as well as people's preferences and experiences may vary given different auditory presentation rates. This study's purpose was to determine how 3 text-to-speech rates affect comprehension when adults with aphasia access newspaper articles through combined modalities. Secondary aims included exploring time spent reviewing written texts after speech output cessation, rate preference, preference consistency, and participant rationales for preferences. Method Twenty-five adults with aphasia read and listened to passages presented at slow (113 words per minute [wpm]), medium (154 wpm), and fast (200 wpm) rates. Participants answered comprehension questions, selected most and least preferred rates following the 1st and 3rd experimental sessions and after receiving performance feedback, and explained rate preferences and reading and listening strategies. Results Comprehension accuracy did not vary significantly across presentation rates, but reviewing time after cessation of auditory content did. Visual data inspection revealed that, in particular, participants with substantial extra reviewing time took longer given fast than medium or slow presentation. Regardless of exposure amount or receipt of performance feedback, participants most preferred the medium rate and least preferred the fast rate; rationales centered on reading and listening synchronization, benefits to comprehension, and perceived normality of speaking rate. Conclusion As a group, people with aphasia most preferred and were most efficient given a text-to-speech rate around 150 wpm when processing dual modality content; individual differences existed, however, and mandate attention to personal preferences and processing strengths.

Author(s):  
Louisa M. Slowiaczek ◽  
Howard C. Nusbaum

The increased use of voice-response systems has resulted in a greater need for systematic evaluation of the role of segmental and suprasegmental factors in determining the intelligibility of synthesized speech. Two experiments were conducted to examine the effects of pitch contour and speech rate on the perception of synthetic speech. In Experiment 1, subjects transcribed sentences that were either syntactically correct and meaningful or syntactically correct but semantically anomalous. In Experiment 2, subjects transcribed sentences that varied in length and syntactic structure. In both experiments a text-to-speech system generated synthetic speech at either 150 or 250 words/min. Half of the test sentences were generated with a flat pitch (monotone) and half were generated with normally inflected clausal intonation. The results indicate that the identification of words in fluent synthetic speech is influenced by speaking rate, meaning, length, and, to a lesser degree, pitch contour. The results suggest that in many applied situations the perception of the segmental information in the speech signal may be more critical to the intelligibility of synthesized speech than are suprasegmental factors.


1989 ◽  
Vol 32 (4) ◽  
pp. 837-848 ◽  
Author(s):  
Therese M. Brancewicz ◽  
Alan R. Reich

This study explored the effects of reduced speech rate on nasal/voice accelerometric measures and nasality ratings. Nasal/voice accelerometric measures were obtained from normal adults for various speech stimuli and speaking rates. Stimuli included three sentences (one obstruent-loaded, one semivowel-loaded, and one containing a single nasal), and /p/ syllable trains. Speakers read the stimuli at their normal rate, half their normal rate, and as slowly as possible. In addition, a computer program paced each speaker at rates of 1, 2, and 3 syllables per second. The nasal/voice accelerometric values revealed significant stimulus effects but no rate effects. The nasality ratings of experienced listeners, evaluated as a function of stimulus and speaking rate, were compared to the accelerometric measures. The nasality scale values demonstrated small, but statistically significant, stimulus and rate effects. However, the nasality percepts were poorly correlated with the nasal/voice accelerometric measures.


2019 ◽  
Vol 2019 ◽  
pp. 1-11 ◽  
Author(s):  
Yana Yunusova ◽  
Jamal Ansari ◽  
Joel Ramirez ◽  
Sanjana Shellikeri ◽  
Greg J. Stanisz ◽  
...  

The goal of this study was to identify neurostructural frontal lobe correlates of cognitive and speaking rate changes in amyotrophic lateral sclerosis (ALS). 17 patients diagnosed with ALS and 12 matched controls underwent clinical, bulbar, and neuropsychological assessment and structural neuroimaging. Neuropsychological testing was performed via a novel computerized frontal battery (ALS-CFB), based on a validated theoretical model of frontal lobe functions, and focused on testing energization, executive function, emotion processing, theory of mind, and behavioral inhibition via antisaccades. The measure of speaking rate represented bulbar motor changes. Neuroanatomical assessment was performed using volumetric analyses focused on frontal lobe regions, postcentral gyrus, and occipital lobes as controls. Partial least square regressions (PLS) were used to predict behavioral (cognitive and speech rate) outcomes using volumetric measures. The data supported the overall hypothesis that distinct behavioral changes in cognition and speaking rate in ALS were related to specific regional neurostructural brain changes. These changes did not support a notion of a general dysexecutive syndrome in ALS. The observed specificity of behavior-brain changes can begin to provide a framework for subtyping of ALS. The data also support a more integrative framework for clinical assessment of frontal lobe functioning in ALS, which requires both behavioral testing and neuroimaging.


2014 ◽  
Vol 57 (1) ◽  
pp. 81-89 ◽  
Author(s):  
Fred Cummins ◽  
Anja Lowit ◽  
Frits van Brenk

Purpose Following recent attempts to quantify articulatory impairment in speech, the present study evaluates the usefulness of a novel measure of motor stability to characterize dysarthria. Method The study included 8 speakers with ataxic dysarthria (AD), 16 speakers with hypokinetic dysarthria (HD) as a result of Parkinson's disease, and 24 unimpaired control participants. Each participant performed a series of sentence repetitions under habitual, fast, and slow speaking rate conditions. An algorithm to measure utterance-to-utterance spectro-temporal variation (UUV; Cummins, 2009) was used. Speech rate and intelligibility were also measured. Results UUV scores were significantly correlated with perceptually based intelligibility scores. There were significant differences in UUV between control speakers and the AD but not the HD groups, presumably because of differences in intelligibility in the samples used and not because of differences in pathology. Habitual speaking rate did not correlate with UUV scores. All speaker groups had greater UUV levels in the slow conditions compared with habitual and fast speaking rates. Conclusions UUV results were consistent with those of other variability indices and thus appear to capture motor control issues in a similar way. The results suggest that the UUV could be developed into an easy-to-use clinical tool that could function as a valid and reliable assessment and outcome measure.


2020 ◽  
Vol 18 (2) ◽  
pp. 207-220
Author(s):  
Mirosław Michalik ◽  
Ewa Czaplewska ◽  
Anna Solak ◽  
Anna Szkotak

The basic aim of the research presented in this paper was to check whether the language proficiency level of bilingual children with Polish as one of their languages is also related to the pace of speech, which is the result of two specific parameters, i.e., articulation rate and speaking rate. It was assumed that children who use Polish more rarely and mostly at home will display slower speaking and articulation rates when contrasted with children who use Polish both at home and at school on an everyday basis. Participants were thirty-two children who speak Polish as one of two languages, the first research group consisting of sixteen Polish-French students at the age of 8.11 living in Wallonia. The second group consisted of sixteen Flemish-Polish students living in Flanders. Here the average age was 9.3 and subjects used Polish much less than their first group counterparts. The comparative analysis included the following parameters essential for the description of the rate of speech: 1. basic: average speaking rate (phones/sec., syllables/sec., duration of pauses), average articulation rate (phones/sec., syllables/sec.), average ratio of pauses in speech sample (number and percentage), 2. accessory: average duration of all pauses (sec.), average duration of proper pauses (sec.), average duration of filled pauses (sec.), average duration of semi-filled pauses (sec.). The numerical data from the research was obtained with the use of free Audacity software. The results showed that there were no statistically significant differences between the two research groups in either the basic or the accessory speech rate parameters. In the Polish-French group the results were comparatively better but still statistically insignificant. It seems that the data obtained will confirm the need for considerable caution in the evaluation of the competence of bilingual children with high language skills. Similar to children with imbalanced bilingualism, these children may also, perhaps, require some extra time to deal with certain language tasks.
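
The abstract above separates speaking rate from articulation rate without stating the formulas. A minimal sketch follows, assuming the conventional definitions (speaking rate counts pause time, articulation rate excludes it); the function names and the sample numbers are hypothetical and are not taken from the study.

```python
def speaking_rate(syllables: int, total_s: float) -> float:
    """Syllables per second over the whole sample, pauses included."""
    if total_s <= 0:
        raise ValueError("total_s must be positive")
    return syllables / total_s


def articulation_rate(syllables: int, total_s: float, pause_s: float) -> float:
    """Syllables per second over phonation time only (pauses excluded)."""
    phonation_s = total_s - pause_s
    if phonation_s <= 0:
        raise ValueError("pause time must be shorter than total time")
    return syllables / phonation_s


def pause_percentage(total_s: float, pause_s: float) -> float:
    """Share of the sample taken up by pauses, in percent."""
    if total_s <= 0:
        raise ValueError("total_s must be positive")
    return 100 * pause_s / total_s


# Hypothetical sample: 120 syllables in 40 s, of which 10 s are pauses.
print(speaking_rate(120, 40.0))            # 3.0
print(articulation_rate(120, 40.0, 10.0))  # 4.0
print(pause_percentage(40.0, 10.0))        # 25.0
```

Under these definitions the articulation rate is never lower than the speaking rate, and the gap between the two grows with the proportion of pause time.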


1999 ◽  
Vol 8 (2) ◽  
pp. 164-170 ◽  
Author(s):  
Martine Vanryckeghem ◽  
Jeffrey J. Glessing ◽  
Gene J. Brutten ◽  
Peter McAlindon

Twenty-four adults participated in a 2 (group) by 3 (rate) factorial study designed to determine the main and interactive effects of speech rate during reading on the frequency of stuttering. In this regard, the participants orally read three passages, one at their normal rate, one that was 30% faster than this rate, and one that was 30% slower. Rate was controlled by means of a computer software program, and passage order and reading rate were counter-balanced. The main effect of rate was significant. There was statistically more stuttering in the fast rate condition than in either the normal or slow rate condition. However, the frequency of stuttering in the normal and the slow rate conditions was not significantly different. Analysis of the experimental data of the eight participants who stuttered the most and the eight who stuttered the least, during base-rate oral readings, evidenced the presence of an interaction between group and rate. Those who stuttered the most showed a statistically significant increase in stuttering between the slow, normal, and fast rate conditions. In contrast, there was no significant difference in frequency between any of the three conditions for the group of eight participants who stuttered the least. These findings suggest that the extent to which rate affects fluency is a function of the degree to which stuttering is displayed. This possibility warrants consideration in relation to the use of rate management procedures.


A text-to-speech (TTS) system takes text as input and generates speech as output. If the input text is incorrect, the overall quality of the speech output may degrade. The main aim of the proposed system is to provide correct input text to the TTS system. The system takes a Unicode word as input, identifies an invalid word, and corrects it by inserting, deleting, or updating characters of the word. A rule-based state machine is used to identify and correct invalid words in the Devanagari script. Rules are developed for converting each character to an input symbol, actions and states are identified for the state machine, and finally a state transition table is developed for validating and correcting words. Using this system, incorrect words in the Devanagari script can be corrected to valid words (words containing only valid Devanagari syllables) based on Devanagari script grammar. Because not all Devanagari characters are present in the Hindi language, the system also corrects these non-Hindi characters to Hindi.
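
The abstract does not give the actual rules, states, or transition table, so the following is only a toy sketch of the general idea: characters are mapped to input symbols, a transition table decides whether each symbol is allowed in the current state, and disallowed characters are deleted. The symbol classes, the transition table, and the delete-only correction are all assumptions made for illustration; the described system also inserts and updates characters, which this sketch does not attempt.

```python
from enum import Enum, auto


class Sym(Enum):
    CONSONANT = auto()
    VOWEL = auto()      # independent vowel
    MATRA = auto()      # dependent vowel sign
    HALANT = auto()     # virama
    MODIFIER = auto()   # candrabindu, anusvara, visarga
    OTHER = auto()


class State(Enum):
    START = auto()
    CONSONANT = auto()
    VOWEL = auto()
    MATRA = auto()
    HALANT = auto()
    MODIFIER = auto()


def classify(ch: str) -> Sym:
    """Map one character to an input symbol (simplified Unicode ranges)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return Sym.CONSONANT
    if 0x0905 <= cp <= 0x0914:
        return Sym.VOWEL
    if 0x093E <= cp <= 0x094C:
        return Sym.MATRA
    if cp == 0x094D:
        return Sym.HALANT
    if cp in (0x0901, 0x0902, 0x0903):
        return Sym.MODIFIER
    return Sym.OTHER


# Hypothetical transition table: (state, symbol) -> next state.
# Any pair not listed is treated as invalid.
TRANSITIONS = {
    (State.START, Sym.CONSONANT): State.CONSONANT,
    (State.START, Sym.VOWEL): State.VOWEL,
    (State.CONSONANT, Sym.CONSONANT): State.CONSONANT,
    (State.CONSONANT, Sym.VOWEL): State.VOWEL,
    (State.CONSONANT, Sym.MATRA): State.MATRA,
    (State.CONSONANT, Sym.HALANT): State.HALANT,
    (State.CONSONANT, Sym.MODIFIER): State.MODIFIER,
    (State.VOWEL, Sym.CONSONANT): State.CONSONANT,
    (State.VOWEL, Sym.VOWEL): State.VOWEL,
    (State.VOWEL, Sym.MODIFIER): State.MODIFIER,
    (State.MATRA, Sym.CONSONANT): State.CONSONANT,
    (State.MATRA, Sym.VOWEL): State.VOWEL,
    (State.MATRA, Sym.MODIFIER): State.MODIFIER,
    (State.HALANT, Sym.CONSONANT): State.CONSONANT,
    (State.MODIFIER, Sym.CONSONANT): State.CONSONANT,
    (State.MODIFIER, Sym.VOWEL): State.VOWEL,
}


def correct_word(word: str) -> str:
    """Drop every character whose symbol is not allowed in the current state."""
    state = State.START
    kept = []
    for ch in word:
        next_state = TRANSITIONS.get((state, classify(ch)))
        if next_state is None:
            continue  # invalid here: delete the character, keep the state
        kept.append(ch)
        state = next_state
    return "".join(kept)


def is_valid(word: str) -> bool:
    """A word is valid under these toy rules if correction leaves it unchanged."""
    return correct_word(word) == word
```

For example, a vowel sign with no preceding consonant is removed: `correct_word("\u093e\u0915")` returns `"\u0915"`, and a doubled vowel sign in `"\u0915\u093e\u093e\u092e"` is reduced to a single one.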


2018 ◽  
Vol 29 (07) ◽  
pp. 596-608 ◽  
Author(s):  
Shelby Tiffin ◽  
Susan Gordon-Hickey

Older adults often struggle with accurate perception of rate-altered speech and have difficulty understanding speech in noise. The acceptable noise level (ANL) quantifies a listener’s willingness to listen to speech in background noise and has been found to accurately predict hearing aid success. Based on the difficulty older adults experience with rapid speech, we were interested in how older adults may change the amount of background noise they willingly accept in a variety of speech rate conditions. To determine the effects of age and speech rate on the ANL. A quasi-experimental mixed design was employed. Fifteen young adults (19–27 yr) and fifteen older adults (55–73 yr) with audiometrically normal hearing or hearing loss within age-normed limits served as participants. Most comfortable listening levels (MCLs) and background noise levels (BNLs) were measured using three different speech rates (slow, normal, and fast). The ANL was calculated by subtracting BNL from MCL. Repeated-measures analyses of variance were used to analyze the effects of age and speech rate on ANL. A significant main effect of speech rate was observed; however, a significant main effect of age was not found. Results indicated that as speech rate increased the ANLs increased. This suggests that participants became less accepting of background noise as speech rates increased. The findings of the present study provide support for communication strategies that recommend slowing an individual’s speaking rate and/or reducing background noise, if possible. Participants in the present study were better able to cope with background noise when the primary stimulus was presented at slow and normal speaking rates.
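
The ANL calculation stated in the abstract (BNL subtracted from MCL) can be written out as a short sketch. The function name and the decibel values are hypothetical and are not data from the study.

```python
def acceptable_noise_level(mcl_db: float, bnl_db: float) -> float:
    """ANL = MCL - BNL, with both levels measured in decibels."""
    return mcl_db - bnl_db


# Hypothetical listener: the same MCL, but less noise tolerated at the fast rate.
slow = acceptable_noise_level(mcl_db=50.0, bnl_db=44.0)
fast = acceptable_noise_level(mcl_db=50.0, bnl_db=40.0)
print(slow, fast)  # 6.0 10.0
```

A larger ANL means the listener accepts less background noise relative to the speech level, which matches the abstract's reading that rising ANLs at faster rates indicate lower noise acceptance.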


2021 ◽  
pp. 1-9
Author(s):  
Tak Fai Hui ◽  
Steven Randall Cox ◽  
Ting Huang ◽  
Wei-Rong Chen ◽  
Manwa Lawrence Ng

Background/Aim: The purpose of this study was to provide preliminary data concerning the effect of clear speech (CS) on Cantonese alaryngeal speakers’ intelligibility. Methods: Voice recordings of 11 sentences randomly selected from the Cantonese Sentence Intelligibility Test (CSIT) were obtained from 31 alaryngeal speakers (9 electrolarynx [EL] users, 10 esophageal speakers and 12 tracheoesophageal [TE] speakers) in habitual speech (HS) and CS. Two naïve listeners orthographically transcribed a total of 1,364 sentences. Results: Significant effects of speaking condition on speaking rate and CSIT scores were observed, but no significant effect of alaryngeal communication methods was noted. CS was significantly slower than HS by 0.78 syllables/s. Esophageal speakers demonstrated the slowest speech rate when using CS, while EL users demonstrated the largest decrease in speaking rate when using CS compared to HS. TE speakers had the highest CSIT scores in HS (listener 1 = 81.4%; listener 2 = 81.3%), and esophageal speakers had the highest CSIT scores in CS (listener 1 = 87.5%; listener 2 = 89.7%). EL users experienced the largest increase in intelligibility while using CS compared to HS (9.1%) followed by esophageal speakers (8.9%) and TE speakers (1.4%). Conclusion: Preliminary data indicate that CS may significantly affect Cantonese alaryngeal speakers’ speaking rate and intelligibility. However, intelligibility appeared to vary considerably across speakers. Further research involving larger, heterogeneous groups of speakers and listeners alongside longer and more refined CS training protocols should be conducted to confirm that CS can improve Cantonese alaryngeal speakers’ intelligibility.

