Phonetic convergence in the shadowing for natural and synthesized speech in Polish

2020 ◽  
Vol 62 (2) ◽  
pp. 7-17
Author(s):  
Karolina Jankowska ◽  
Tomasz Kuczmarski ◽  
Grażyna Demenko

Abstract The shadowing of natural speech has been discussed in many studies and papers. However, very little is known about human phonetic convergence to synthesized speech. To find out more about this issue, an experiment was conducted in Polish. Two types of stimuli were used: natural speech and synthesized speech. Five sets of sentences with various phonetic phenomena in Polish were prepared. A group of twenty people was recorded, which gave a total of 100 samples for each phenomenon. The summary of results shows convergence to both natural and synthesized speech in sets 1, 2 and 4, while in sets 3 and 5 convergence was not observed. The baseline production showed that the great majority of participants prefer the ɛn/ɛm version of the phonetic feature, which appeared in 83 out of 100 sentences. When shadowing natural speech, participants changed ɛn/ɛm to ɛw/ɛ̃ in 26 cases and ɛw/ɛ̃ to ɛn/ɛm in 4. When shadowing synthesized speech, they shifted from ɛn/ɛm to ɛw/ɛ̃ in 18 sentences and from ɛw/ɛ̃ to ɛn/ɛm in 4. Intonation convergence was also observed in the perceptual analysis; however, the analysis of F0 statistics did not show statistically significant differences.

1987 ◽  
Vol 30 (3) ◽  
pp. 425-431 ◽  
Author(s):  
Julia Hoover ◽  
Joe Reichle ◽  
Dianne Van Tasell ◽  
David Cole

The intelligibility of two speech synthesizers [ECHO II (Street Electronics, 1982) and VOTRAX (VOTRAX Division, 1981)] was compared to the intelligibility of natural speech in each of three different contextual conditions: (a) single words, (b) "low-probability sentences" in which the last word could not be predicted from preceding context, and (c) "high-probability sentences" in which the last word could be predicted from preceding context. Additionally, the effect of practice on performance in each condition was examined. Natural speech was more intelligible than either type of synthesized speech regardless of word/sentence condition. In both sentence conditions, VOTRAX speech was significantly more intelligible than ECHO II speech. No practice effect was observed for VOTRAX, while an ascending linear trend occurred for ECHO II. Implications for the use of inexpensive speech synthesis units as components of augmentative communication aids for persons with severe speech and/or language impairments are discussed.


2002 ◽  
Vol 45 (4) ◽  
pp. 802-810 ◽  
Author(s):  
Mary E. Reynolds ◽  
Charlene Isaacs-Duvall ◽  
Michelle Lynn Haddox

This study examined the effect of listening practice on the ability of young adults to comprehend natural speech and DECtalk synthesized speech by having them perform a sentence verification task over a 5-day period. Results showed that response latencies of participants shortened in a similar fashion to sentences presented in both types of speech across the 5-day period, with latencies remaining significantly longer in response to DECtalk than to natural speech across the days. These results suggest that high-quality synthesized speech, such as DECtalk, can be useful in many human factors applications.


1988 ◽  
Vol 19 (4) ◽  
pp. 401-409 ◽  
Author(s):  
Holly J. Massey

The Token Test for Children was given in a synthesized-speech version and a natural-speech version to 11 language-impaired children aged 8 years, 9 months to 10 years, 1 month and to 11 control subjects matched for age and sex. The scores of the language-impaired children on the synthesized version were significantly lower than (a) the synthesized-speech scores of the control group and (b) their own scores on the natural-speech version. Task complexity was a significant factor for the experimental group. Language-impaired children may have difficulty understanding some synthesized voice commands.


2014 ◽  
Vol 59 (3) ◽  
pp. 289-297 ◽  
Author(s):  
Steven E. Stern ◽  
Chelsea M. Chobany ◽  
Disha V. Patel ◽  
Justin J. Tressler

Author(s):  
Phung Trung Nghia ◽  
Nguyen Van Tao ◽  
Pham Thi Mai Huong ◽  
Nguyen Thi Bich Diep ◽  
Phung Thi Thu Hien

The articulators typically move smoothly during speech production. Therefore, speech features of natural speech are generally smooth. However, over-smoothing causes "muffledness" and reduces the identification of emotions, expressions and styles in synthesized speech, which can affect the perceived naturalness of synthesized speech. In the literature, statistical variances of static spectral features have been used as a measure of smoothness in synthesized speech, but they are not sufficient. This paper proposes a speech smoothness measure that can be efficiently applied to evaluate the smoothness of synthesized speech. Experiments show that the proposed measures are reliable and efficient in measuring the smoothness of different kinds of synthesized speech.


Author(s):  
Iona Gessinger ◽  
Eran Raveh ◽  
Sébastien Le Maguer ◽  
Bernd Möbius ◽  
Ingmar Steiner

2011 ◽  
Vol 97 (5) ◽  
pp. 852-868 ◽  
Author(s):  
Peter Počta ◽  
Jan Holub

This paper investigates the impact of independent and dependent losses and coding on speech quality predictions provided by PESQ (also known as ITU-T P.862) and P.563 models, when both naturally-produced and synthesized speech are used. Two synthesized speech samples generated with two different Text-to-Speech systems and one naturally-produced sample are investigated. In addition, we assess the variability of PESQ's and P.563's predictions with respect to the type of speech used (naturally-produced or synthesized) and loss conditions as well as their accuracy, by comparing the predictions with subjective assessments. The results show that there is no difference between the impact of packet loss on naturally-produced speech and synthesized speech. On the other hand, the impact of coding is different for the two types of stimuli. In addition, synthesized speech seems to be insensitive to degradations provided by most of the codecs investigated here. The reasons for those findings are particularly discussed. Finally, it is concluded that both models are capable of predicting the quality of transmitted synthesized speech under the investigated conditions to a certain degree. As expected, PESQ achieves the best performance over almost all of the investigated conditions.


1979 ◽  
Vol 44 ◽  
pp. 349-355
Author(s):  
R.W. Milkey

The focus of discussion in Working Group 3 was on the Thermodynamic Properties as determined spectroscopically, including the observational techniques and the theoretical modeling of physical processes responsible for the emission spectrum. Recent advances in observational techniques and theoretical concepts make this discussion particularly timely. It is wise to remember that the determination of thermodynamic parameters is not an end in itself and that these are interesting chiefly for what they can tell us about the energetics and mass transport in prominences.


Author(s):  
P. Bagavandoss ◽  
JoAnne S. Richards ◽  
A. Rees Midgley

During follicular development in the mammalian ovary, several functional changes occur in the granulosa cells in response to steroid hormones and gonadotropins (1,2). In particular, marked changes in the content of membrane-associated receptors for the gonadotropins have been observed (1). We report here scanning electron microscope observations of morphological changes that occur on the granulosa cell surface in response to the administration of estradiol, human follicle stimulating hormone (hFSH), and human chorionic gonadotropin (hCG). Immature female rats that were hypophysectomized on day 24 of age were treated in the following manner. Group 1: control groups were injected once a day with 0.1 ml phosphate buffered saline (PBS) for 3 days; group 2: estradiol (1.5 mg/0.2 ml propylene glycol) once a day for 3 days; group 3: estradiol for 3 days followed by 2 days of hFSH (1 μg/0.1 ml) twice daily; group 4: same as in group 3; group 5: same as in group 3 with a final injection of hCG (5 IU/0.1 ml) on the fifth day.


Author(s):  
E.J. Prendiville ◽  
S. Laliberté Verdon ◽  
K. E. Gould ◽  
K. Ramberg ◽  
R. J. Connolly ◽  
...  

Endothelial cell (EC) seeding is postulated as a mechanism of improving patency in small caliber vascular grafts. However, the majority of seeded EC are lost within 24 hours of restoration of blood flow in previous canine studies. We postulate that the cells have insufficient time to fully develop their attachment to the graft surface prior to exposure to hemodynamic stress. We allowed EC to incubate on fibronectin-coated ePTFE grafts for four different time periods after seeding and measured EC retention after perfusion in a canine ex vivo shunt circuit. Autologous canine EC were enzymatically harvested, grown to confluence, and labeled with 30 μCi ¹¹¹Indium-oxine/80 cm² flask. Four groups of 5 cm x 4 mm ID ePTFE vascular prostheses were coated with 1.5 μg/cm² human fibronectin and seeded with 1.5 × 10⁵ EC/cm². After seeding, grafts in Group 1 were incubated in complete growth medium for 90 minutes, Group 2 for 24 hours, Group 3 for 72 hours and Group 4 for 6 days. Grafts were then placed in the canine ex vivo circuit, constructed between femoral artery and vein, and subjected to blood flow of 75 ml per minute for 6 hours. Continuous counting of γ-activity was made possible by placing the seeded graft inside the γ-counter detection crystal for the duration of perfusion. EC retention data after 30 minutes, 2 hours and 6 hours of flow are shown in the table.

