Crosslinguistic Intelligibility of Russian and German Speech in Noisy Environment

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Rodmonga Potapova ◽  
Maria Grigorieva

This paper discusses the results of pilot experimental research on speech recognition and the perception of the semantic content of utterances in a noisy environment. The experiment comprised a comparative perceptual-auditory analysis of words and phrases in Russian and German under the same noise conditions: different types of noise (pink and white) at various signal-to-noise ratios. The statistical analysis showed that the intelligibility and perception of speech in a noisy environment are influenced not only by the noise type and signal-to-noise ratio, but also by linguistic and extralinguistic factors, such as the redundancy of a particular language at various levels of linguistic structure, changes in the speaker's acoustic characteristics when switching from one language to another, the speaker's and listener's proficiency in a specific language, and the acoustic characteristics of the speaker's voice.
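
The abstract does not say how the noise was mixed with the speech. As a minimal sketch of one common way to present a signal at a chosen signal-to-noise ratio, the noise can be rescaled so that the ratio of mean signal power to mean noise power matches the target in dB. Everything named here is an assumption for illustration: the functions `mean_power`, `mix_at_snr`, and `measure_snr_db`, the 8 kHz sample rate, and the sine tone standing in for speech. Only white (Gaussian) noise is generated; pink noise would need additional spectral shaping that is not shown.

```python
import math
import random


def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def measure_snr_db(signal, noise):
    """Signal-to-noise ratio in dB, from mean powers."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


def mix_at_snr(signal, noise, target_snr_db):
    """Scale `noise` to reach `target_snr_db` relative to `signal`.

    Returns (noisy_signal, scaled_noise).
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    # SNR_dB = 10*log10(Ps / Pn)  =>  Pn = Ps / 10**(SNR_dB / 10)
    wanted_noise_power = mean_power(signal) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(wanted_noise_power / mean_power(noise))
    scaled_noise = [gain * n for n in noise]
    noisy_signal = [s + n for s, n in zip(signal, scaled_noise)]
    return noisy_signal, scaled_noise


# Hypothetical stand-in for speech: a 200 Hz tone, 1 second at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

# White Gaussian noise from a seeded generator, for repeatability.
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(SAMPLE_RATE)]

noisy, scaled = mix_at_snr(tone, white, target_snr_db=5.0)
print(measure_snr_db(tone, scaled))
```

Scaling the noise rather than the signal keeps the speech level constant across conditions, so only the masker changes between SNR settings.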

2020 ◽  
Author(s):  
Chaofeng Lan ◽  
Yuanyuan Zhang ◽  
Hongyun Zhao

This paper draws on the training method of the Recurrent Neural Network (RNN). By increasing the number of hidden layers of the RNN, changing the activation function on the input layer from the traditional Sigmoid to Leaky ReLU, and zero-padding the first and last groups of data to make more effective use of the data, an improved noise-reduction model, the Denoise Recurrent Neural Network (DRNN), with high calculation speed and good convergence is constructed to address the low speaker recognition rate in noisy environments. Using this model, random semantic speech signals from the speech library, with a sampling rate of 16 kHz and a duration of 5 seconds, are studied. The signal-to-noise ratios in the experiments are −10 dB, −5 dB, 0 dB, 5 dB, 10 dB, 15 dB, 20 dB, and 25 dB. In the noisy environment, the improved model is used to denoise the Mel Frequency Cepstral Coefficients (MFCC) and the Gammatone Frequency Cepstral Coefficients (GFCC), and the impact of the traditional model and the improved model on the speech recognition rate is analyzed. The research shows that the improved model can effectively eliminate noise in the feature parameters and improve the speech recognition rate. The improvement in the speaker recognition rate is more pronounced when the signal-to-noise ratio is low. When the signal-to-noise ratio is 0 dB, the speaker recognition rate is increased by 40%, which can be an 85% improvement over the traditional speech model. As the signal-to-noise ratio increases, the recognition rate gradually increases. When the signal-to-noise ratio is 15 dB, the speaker recognition rate is 93%.
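
The abstract names the switch from Sigmoid to Leaky ReLU but gives no parameters or motivation. The sketch below only illustrates the standard definitions of the two activation functions and their derivatives. The negative slope of 0.01 is an assumed, commonly used value, not one reported by the authors, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: squashes any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; peaks at 0.25 and decays towards 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def leaky_relu(x, negative_slope=0.01):
    """Identity for x >= 0, small linear slope for x < 0 (slope is assumed)."""
    return x if x >= 0 else negative_slope * x


def leaky_relu_grad(x, negative_slope=0.01):
    """Derivative of Leaky ReLU: 1 for x >= 0, `negative_slope` otherwise."""
    return 1.0 if x >= 0 else negative_slope


# Compare gradients at a few inputs: the sigmoid gradient shrinks for large
# |x|, while the Leaky ReLU gradient stays piecewise constant.
for x in (-6.0, 0.0, 6.0):
    print(x, sigmoid_grad(x), leaky_relu_grad(x))
```

A gradient that does not vanish for large inputs is one generally cited reason for preferring ReLU-family activations. Whether this is what drives the convergence reported in the abstract is not stated there.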


2001 ◽  
Vol 44 (6) ◽  
pp. 1315-1320 ◽  
Author(s):  
Mary H. Bellandese ◽  
Jay W. Lerman ◽  
Harvey R. Gilbert

Acoustic data for female esophageal speakers are sparse, particularly with regard to characteristics of female tracheoesophageal speakers. This study quantified and compared six acoustic characteristics of excellent female tracheoesophageal (TE), standard esophageal (SE), and laryngeal (LA) speakers. Results indicated there were no significant differences between TE and SE speakers with regard to mean F0 of sustained /α/, mean F0 (reading), signal-to-noise ratio, total duration of passage read, number of pauses, or syllables per minute. Significant differences were found between LA speakers and both alaryngeal groups for all variables, with the exception of mean F0 (reading).


Author(s):  
David A. Grano ◽  
Kenneth H. Downing

The retrieval of high-resolution information from images of biological crystals depends, in part, on the use of the correct photographic emulsion. We have been investigating the information transfer properties of twelve emulsions with a view toward 1) characterizing the emulsions by a few, measurable quantities, and 2) identifying the “best” emulsion of those we have studied for use in any given experimental situation. Because our interests lie in the examination of crystalline specimens, we've chosen to evaluate an emulsion's signal-to-noise ratio (SNR) as a function of spatial frequency and use this as our criterion for determining the best emulsion. The signal-to-noise ratio in frequency space depends on several factors. First, the signal depends on the speed of the emulsion and its modulation transfer function (MTF). By procedures outlined in, MTFs have been found for all the emulsions tested and can be fit by an analytic expression 1/(1 + (S/S0)^2). Figure 1 shows the experimental data and fitted curve for an emulsion with a better than average MTF. A single parameter, the spatial frequency at which the transfer falls to 50% (S0), characterizes this curve.
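
As a small check on the fitted form quoted in the abstract, the sketch below evaluates the expression with S0 in the denominator, squared, and confirms that the transfer equals 50% at S = S0, as the text states. The function name `mtf` and the numeric value of `s0` are hypothetical. The abstract reports no fitted values or units.

```python
def mtf(s, s0):
    """Fitted modulation transfer function: 1 / (1 + (s / s0)**2).

    `s` is the spatial frequency and `s0` the frequency at which the
    transfer falls to 50%. Both must share the same (unspecified) units.
    """
    if s0 <= 0:
        raise ValueError("s0 must be positive")
    return 1.0 / (1.0 + (s / s0) ** 2)


# Hypothetical half-transfer frequency, chosen only for illustration.
s0 = 20.0

for s in (0.0, 10.0, 20.0, 40.0):
    print(s, mtf(s, s0))
```

Because the curve depends on the ratio S/S0 alone, a single fitted parameter is enough to compare emulsions: a larger S0 means the transfer stays high out to finer spatial detail.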


Author(s):  
W. Kunath ◽  
K. Weiss ◽  
E. Zeitler

Bright-field images taken with axial illumination show spurious high contrast patterns which obscure details smaller than 15 Å. Hollow-cone illumination (HCI), however, reduces this disturbing granulation by statistical superposition and thus improves the signal-to-noise ratio. In this presentation we report on experiments aimed at selecting the proper amount of tilt and defocus for improvement of the signal-to-noise ratio by means of direct observation of the electron images on a TV monitor. Hollow-cone illumination is implemented in our microscope (single field condenser objective, Cs = 0.5 mm) by an electronic system which rotates the tilted beam about the optic axis. At low rates of revolution (one turn per second or so) a circular motion of the usual granulation in the image of a carbon support film can be observed on the TV monitor. The size of the granular structures and the radius of their orbits depend on both the conical tilt and defocus.


Author(s):  
D. C. Joy ◽  
R. D. Bunn

The information available from an SEM image is limited both by the inherent signal-to-noise ratio that characterizes the image and as a result of the transformations that it may undergo as it is passed through the amplifying circuits of the instrument. In applications such as Critical Dimension Metrology it is necessary to be able to quantify these limitations in order to be able to assess the likely precision of any measurement made with the microscope. The information capacity of an SEM signal, defined as the minimum number of bits needed to encode the output signal, depends on the signal-to-noise ratio of the image - which in turn depends on the probe size and source brightness and acquisition time per pixel - and on the efficiency of the specimen in producing the signal that is being observed. A detailed analysis of the secondary electron case shows that the information capacity C (bits/pixel) of the SEM signal channel could be written as:

