Impression of Speaker's Personality and the Naturalistic Qualities of Speech: Speech Rate and Pause Duration

2005 ◽  
Vol 53 (1) ◽  
pp. 1-13 ◽  
Author(s):  
TERUHISA UCHIDA
Author(s):  
Lynda Feenaughty ◽  
Ling-Yu Guo ◽  
Bianca Weinstock-Guttman ◽  
Meredith Ray ◽  
Ralph H.B. Benedict ◽  
...  

Abstract Objective: To investigate the impact of cognitive impairment on spoken language produced by speakers with multiple sclerosis (MS) with and without dysarthria. Method: Sixty speakers comprised operationally defined groups. Speakers produced a spontaneous speech sample to obtain speech timing measures of speech rate, articulation rate, and silent pause frequency and duration. Twenty listeners judged the overall perceptual severity of the samples using a visual analog scale that ranged from no impairment to severe impairment (speech severity). A 2 × 2 factorial design examined main and interaction effects of dysarthria and cognitive impairment on speech timing measures and speech severity in individuals with MS. Each speaker group with MS was further compared to a healthy control group. Exploratory regression analyses examined relationships between cognitive and biopsychosocial variables and speech timing measures and perceptual judgments of speech severity, for speakers with MS. Results: Speech timing was significantly slower for speakers with dysarthria compared to speakers with MS without dysarthria. Silent pause durations also significantly differed for speakers with both dysarthria and cognitive impairment compared to MS speakers without either impairment. Significant interactions between dysarthria and cognitive factors revealed comorbid dysarthria and cognitive impairment contributed to slowed speech rates in MS, whereas dysarthria alone impacted perceptual judgments of speech severity. Speech severity was strongly related to pause duration. Conclusions: The findings suggest that the manner in which dysarthria and cognitive symptoms manifest in objective, acoustic measures of speech timing and perceptual judgments of severity is complex.
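The timing measures named in this abstract (speech rate, articulation rate, and silent pause frequency and duration) can be illustrated with a minimal sketch. It uses the common definitions, in which speech rate includes pause time and articulation rate excludes it. The segment format, the function name `timing_measures`, and the 0.25 s minimum pause threshold are assumptions made for illustration and are not taken from the study.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical input format: (label, duration in seconds, syllable count),
# where label is either "speech" or "silence".
Segment = Tuple[str, float, int]


@dataclass
class TimingMeasures:
    speech_rate: float          # syllables per second, pauses included
    articulation_rate: float    # syllables per second, pauses excluded
    pause_count: int            # number of silent pauses at or above threshold
    mean_pause_duration: float  # seconds; 0.0 when there are no pauses


def timing_measures(segments: List[Segment], min_pause: float = 0.25) -> TimingMeasures:
    """Compute basic speech timing measures from labelled segments.

    Silences shorter than `min_pause` (an assumed threshold) are not counted
    as pauses and therefore remain part of articulation time.
    """
    total_time = sum(duration for _, duration, _ in segments)
    syllables = sum(count for label, _, count in segments if label == "speech")
    pauses = [d for label, d, _ in segments if label == "silence" and d >= min_pause]
    articulation_time = total_time - sum(pauses)
    if total_time <= 0 or articulation_time <= 0:
        raise ValueError("segments must contain some speech time")
    mean_pause = sum(pauses) / len(pauses) if pauses else 0.0
    return TimingMeasures(
        speech_rate=syllables / total_time,
        articulation_rate=syllables / articulation_time,
        pause_count=len(pauses),
        mean_pause_duration=mean_pause,
    )


sample = [("speech", 2.0, 8), ("silence", 1.0, 0), ("speech", 1.0, 4)]
result = timing_measures(sample)
```

Because pause time is removed from the denominator, articulation rate is never lower than speech rate for the same sample, which is why the two measures can dissociate between speaker groups.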


1992 ◽  
Vol 36 (3) ◽  
pp. 232-236
Author(s):  
Hiroshi Hamada ◽  
Jin'ichi Chiba

For the purpose of designing a method to control the main speech parameters for keyword emphasis in a text-to-speech synthesizer, the relation between speech parameters and emphasis level is determined from experiments. Twelve subjects are instructed to modify keyword emphasis to achieve natural sounding speech from three sentences. An interactive speech editor with a graphical user interface is developed for the experiments. The editor allows the subjects to control speech intensity, speech rate and average fundamental frequency of the keyword, and of the other sentence components. Furthermore, subjects can also control pause (silence) duration preceding and following the keyword. Extracted relations between prosodic feature parameters and emphasis level show that speech intensity and speech rate are independent of sentence content. Speech intensity increases linearly and speech rate decreases linearly with emphasis level. On the other hand, average fundamental frequency and pause duration depend on sentence content, and relatively large changes are required to strongly emphasize keywords using pause insertion and increased fundamental frequency.


2019 ◽  
Vol 28 (2) ◽  
pp. 521-535
Author(s):  
Sih-Chiao Hsu ◽  
Megan J. McAuliffe ◽  
Peiyi Lin ◽  
Ruey-Meei Wu ◽  
Erika S. Levy

Purpose: This study investigated the effects of cueing for increased loudness and reduced speech rate on scaled intelligibility and acoustics of speech produced by Mandarin speakers with hypokinetic dysarthria due to Parkinson's disease (PD). Method: Eleven speakers with PD read passages in habitual, loud, and slow speaking conditions. Fifteen listeners rated ease of understanding (EOU) of the speech samples on a visual analog scale. Effects of the cues on EOU, vocal loudness, pitch range, pause duration and frequency, articulation rate, and vowel space, as well as relationships between EOU gains and acoustic features, were analyzed. Results: EOU increased significantly in the loud condition only. The loud cue resulted in increased intensity, and the slow cue resulted both in reduced articulation rate and increased pause frequency. In the loud condition, EOU increased significantly as intensity increased and vowel centralization decreased. In the slow condition, EOU tended to increase as intensity increased and vowel centralization decreased but did not reach statistical significance. Conclusion: Cueing for loud speech may yield greater EOU gains than cueing for slow speech in Mandarin speakers with PD. Theoretical and clinical implications are discussed, although further investigations with more participants and a larger range of dysarthria severity are warranted.


2019 ◽  
Vol 11 (1) ◽  
pp. 87-102
Author(s):  
Mirjana M. Kovač ◽  
Gloria Vickov

The main purpose of this study is to investigate the effects of pre-task planning on L2 fluency performance by measuring the temporal variables. Performing a picture description task, two groups of thirty-seven students were given 10 minutes of planning time and no planning time before the performance, respectively. The temporal fluency variables are extracted by means of the PRAAT speech analysis program in order to be automatically measured for evaluation purposes. Fluency is operationalized as speed fluency (i.e. speech rate and articulation rate) and breakdown fluency (i.e. average pause duration and number of pauses). The results indicate no significant difference between the non-planning and planning conditions for any temporal variable. Presumably, the chosen task type, which contains highly frequent lexemes, does not impose increased conscious attention on the part of the more proficient speakers, and thus formulation and articulation can, to a high degree, run in parallel. Based on the observed results, a modified task design is proposed, i.e. guided pre-task planning directed to attend to less frequent formulae as vocabulary or lexical items for everyday contexts, having a clear potential as a pedagogic device, aiming at activating relatively underused vocabulary and promoting ultimate fluency in the temporal sense.


2020 ◽  
Vol 29 (1S) ◽  
pp. 449-462 ◽  
Author(s):  
Gayle DeDe ◽  
Christos Salis

Purpose: The purpose of this study was to improve our understanding of the language characteristics of people with latent aphasia using measures that examined temporal (i.e., real-time) and episodic organization of discourse production. Method: Thirty AphasiaBank participants were included (10 people with latent aphasia, 10 people with anomic aphasia, and 10 neurotypical control participants). Speech material of Cinderella narratives was analyzed with Praat software. We devised a protocol that coded the presence and duration of all speech segments, dysfluencies such as silent and filled pauses, and other speech behaviors. Using these durations, we generated a range of temporal measures such as speech, articulation, and pure word rates. Narratives were also coded into episodes, which provided information about the discourse macrostructure abilities of the participants. Results: The latent aphasia group differed from controls in number of words produced, silent pause duration, and speech rate, but not articulation rate or pure word rate. Episodic organization of the narratives was similar in these 2 groups. The latent and anomic aphasia groups were similar in most measures, apart from articulation rate, which was lower in the anomic group. The anomic aphasia group also omitted more episodes than the latent aphasia group. Conclusions: The differences between latent aphasia and neurotypical controls can be attributed to a processing speed deficit. We propose that this deficit results in an impaired ability to process information from multiple cognitive domains simultaneously.


2020 ◽  
Vol 46 (Supplement_1) ◽  
pp. S230-S230
Author(s):  
Alberto Parola ◽  
Arndis Simonsen ◽  
Vibeke Bliksted ◽  
Yuan Zhou ◽  
Shiho Ubukata ◽  
...  

Abstract Background Schizophrenia (SCZ) has been associated with distinctive voice since its first definitions. Distinctive voice patterns are often associated with core negative symptoms and with social impairment. They may thus represent markers of the disorder. A recent meta-analysis identified weak atypicalities for pitch variability, and stronger atypicalities in duration (speech percentage, pause duration and speech rate). However, heterogeneity across studies was large, most of the studies were underpowered (small sample and no repeated measures) and replications across studies were almost nonexistent. In addition, there is a lack of cross-linguistic studies comparing voice and linguistic patterns in SCZ across different languages to assess whether the patterns are distinctive of SCZ in general, or specific to linguistic and/or cultural groups. In the present study, we aim to advance the understanding of voice patterns in SCZ by collecting and analyzing a cross-linguistic corpus of repeated voice measures. Such a corpus enables us to systematically assess the replicability of previous meta-analytic results, better accounting for between and within participant variability, as well as cross-linguistic differences. Methods We collected a Danish (DK), Chinese (CH) and Japanese (JP) cross-linguistic dataset involving 163 participants with SCZ (105 DK, 51 CH, 7 JP) and 173 matched controls (HC) (117 DK, 43 CH, 13 JP) for a total of 3851 audio-recordings. Data were collected using the Animated Triangle Task. Voice recordings were preprocessed using consolidated algorithms (Covarep, Praat) to extract the following features, in order to compare results with the effect sizes (ES) of the previous meta-analysis (MA): 1) Duration measures (speech rate, duration of utterance, number of pauses, pause duration), as well as 2) pitch and intensity (mean and variability).
To investigate differences between SCZ and HC, we ran multilevel regression models with the acoustic feature as outcome, diagnosis (SCZ, HC) and language (DK, JP, CH) as predictors, and varying effects by participant and corpus. Predictors were scaled in order to allow comparison with meta-analysis ES. Results We were only able to partially replicate previous findings. The meta-analysis found: 1) lower pitch variability, replicated for JP only (β = -1.25, SE = 0.37, p < .001); 2) lower speech rate, replicated for DK only (β = -0.23, SE = .08, p < .01); 3) increased pause duration, replicated for DK (β = 0.29, SE = .08, p < .001) and JP (β = 0.59, SE = .30, p < .05); 4) lack of evidence for atypical number of pauses, replicated for DK, JP and CH; 5) lack of evidence for atypical duration of utterance, replicated for CH and JP (DK presented higher duration: β = 0.01, SE = 0.01, p < .01); 6) lower proportion of spoken time, not replicated; 7) lack of evidence for pitch mean, replicated for DK, but higher in CH (β = 0.37, SE = .18, p < .05), and lower in JP (β = -1.46, SE = .41, p < .001). Discussion We found only partial replication of previous meta-analytic findings for reduced pitch variability, increased pause duration and lower speech rate, with ES generally smaller than in the previous meta-analysis. In contrast, we were not able to replicate previous findings of lower proportion of spoken time. Estimations of ES were largely affected by different languages, and replications held only for specific languages (pitch variability for JP, speech rate for DK, and pause duration for DK and JP). This indicates the important role that linguistic factors may play in originating vocal patterns in SCZ. Voice patterns seem not to be distinctive of SCZ in general, but bound to linguistic/cultural differences. Future studies should better investigate how different acoustic and linguistic features interact in originating atypical voice patterns in SCZ.


2019 ◽  
Author(s):  
Alberto Parola ◽  
Arndis Simonsen ◽  
Vibeke Bliksted ◽  
Riccardo Fusaroli

Abstract Voice atypicalities have been a characteristic feature of schizophrenia since its first definitions. They are often associated with core negative symptoms such as flat affect and alogia, and with the social impairments seen in the disorder. This suggests that voice atypicalities may represent a marker of clinical features and social functioning in schizophrenia. We systematically reviewed and meta-analyzed the evidence for distinctive acoustic patterns in schizophrenia, as well as their relation to clinical features. We identified 46 articles, including 55 studies with a total of 1254 patients with schizophrenia and 699 healthy controls. Summary effect size estimates (Hedges’ g and Pearson’s r) were calculated using multilevel Bayesian modeling. We identified weak atypicalities in pitch variability (g = -0.55) related to flat affect, and stronger atypicalities in proportion of spoken time, speech rate, and pauses (g’s between -0.75 and -1.89) related to alogia and flat affect. However, the effects were mostly modest (with the important exception of pause duration) compared to perceptual and clinical judgments, and characterized by large heterogeneity between studies. Moderator analyses revealed that tasks with a more demanding cognitive and social component showed larger effects both in contrasting patients and controls and in assessing symptomatology. In conclusion, studies of acoustic patterns are a promising but as yet unsystematic avenue for establishing markers of schizophrenia. We outline recommendations towards more cumulative, open, and theory-driven research.
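For readers unfamiliar with the effect size reported here, the following is a minimal sketch of how Hedges' g is commonly computed for a single two-group comparison from summary statistics. It is not the multilevel Bayesian procedure the authors used to pool studies; the function name `hedges_g` and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def hedges_g(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference with the usual small-sample correction.

    Group 1 minus group 2, divided by the pooled standard deviation, then
    multiplied by the approximate correction factor J = 1 - 3 / (4 * df - 1),
    where df = n1 + n2 - 2.
    """
    df = n1 + n2 - 2
    if df <= 0:
        raise ValueError("need at least two observations per comparison")
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    cohens_d = (mean1 - mean2) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


# Hypothetical summary statistics: a patient group with a lower mean
# than the control group on some acoustic measure.
g = hedges_g(mean1=10.0, sd1=2.0, n1=10, mean2=12.0, sd2=2.0, n2=10)
```

With patients entered as group 1, a negative g indicates a lower value in patients than in controls, matching the sign convention of the values quoted in the abstract.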

