critical band
Recently Published Documents


Total documents: 145 (five years: 9)
H-index: 20 (five years: 2)

2022 ◽  
pp. 1146-1156
Author(s):  
Revathi A. ◽  
Sasikaladevi N.

This chapter on multi-speaker independent emotion recognition covers the use of perceptual features with filters spaced on the equivalent rectangular bandwidth (ERB) and Bark scales, a vector quantization (VQ) classifier for classifying groups, and an artificial neural network trained with the backpropagation algorithm for emotion classification within a group. Performance can be improved by using a large amount of data for each pertinent emotion to train the system adequately. With a limited set of data, the proposed system has provided consistently better accuracy for the perceptual feature with critical band analysis done on the ERB scale.
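As a rough illustration of what spacing filters on the ERB and Bark scales means, the sketch below converts frequencies to both scales and places filter center frequencies evenly on the ERB-rate scale. The abstract does not say which formulas the chapter uses; the Glasberg and Moore ERB approximation and the Zwicker and Terhardt Bark approximation used here are assumptions, and all function names are hypothetical.

```python
import math


def erb_bandwidth(f_hz):
    """ERB in Hz at center frequency f_hz (Glasberg and Moore approximation, assumed)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (number of ERBs below f_hz)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def hz_to_bark(f_hz):
    """Map frequency in Hz to Bark (Zwicker and Terhardt approximation, assumed)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def erb_spaced_centers(f_low, f_high, n):
    """Return n center frequencies (n >= 2) spaced evenly on the ERB-rate scale."""
    e_low, e_high = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (e_high - e_low) / (n - 1)
    return [erb_rate_to_hz(e_low + k * step) for k in range(n)]


if __name__ == "__main__":
    # The band edges and filter count below are illustrative values only.
    for fc in erb_spaced_centers(100.0, 4000.0, 8):
        print(f"{fc:8.1f} Hz  ERB={erb_bandwidth(fc):6.1f} Hz  Bark={hz_to_bark(fc):5.2f}")
```

Centers spaced this way crowd together at low frequencies and spread out at high frequencies, which is the property a perceptual filterbank relies on.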


2021 ◽  
Author(s):  
Judi Lapsley Miller

The bandwidth-duration product, WT, is a fundamental parameter in most theories of aural amplitude discrimination of Gaussian noise. These theories predict that detectability is dependent on WT, but not on the individual values of bandwidth and duration. Due to the acoustical uncertainty principle, it is impossible to completely specify an acoustic waveform with both finite duration and finite bandwidth. An observer must decide how best to trade off information in the time domain with information in the frequency domain. As Licklider (1963) states, "The nature of [the ear's] solution to the time-frequency problem is, in fact, one of the central problems in the psychology of hearing." This problem is still unresolved, primarily due to observer inconsistency in experiments, which degrades performance, making it difficult to compare models. The aim was to compare human observers' ability to trade bandwidth and duration with that of simulated and theoretical observers. Human observers participated in a parametric study where the bandwidth and duration of 500 Hz noise waveforms were systematically varied for the same bandwidth-duration products (WT = 1, 2, and 4, where W varied over 2.5-160 Hz, and T varied over 400-6.25 ms, in octave steps). If observers can trade bandwidth and duration, detectability should be constant for the same WT. The observers replicated the experiments six times so that group operating characteristic (GOC) analysis could be used to reduce the effects of their inconsistent decision making. Asymptotic errorless performance was estimated by extrapolating results from the GOC analysis, as a function of replications added. Three simulated ideal observers, the energy, envelope, and full-linear (band-pass filter, full-wave rectifier, and true integrator) detectors, were compared with each other, with mathematical theory, and with human observers.
Asymptotic detectability relative to the full-linear detector indicates that human observers best detect signals with a bandwidth of 40-80 Hz and a duration of 50-100 ms, and that other values are traded off in approximately concentric ellipses of equal detectability. Human detectability of Gaussian noise was best modelled by the full-linear detector using a non-optimal filter. Comparing psychometric functions for this detector with human data shows many striking similarities, indicating that human observers can sometimes perform as well as an ideal observer, once their inconsistency is minimised. These results indicate that the human hearing system can trade bandwidth and duration of signals, but not optimally. This accounts for many of the disparate estimates of the critical band, rectifier, and temporal integrator found in the literature, because (a) the critical band is adjustable, but has a minimum of 40-50 Hz, (b) the rectifier is linear, rather than square-law, and (c) the temporal integrator is either true or leaky with a very long time constant.
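To make the parametric design concrete, the sketch below enumerates the bandwidth and duration pairs that satisfy WT = 1, 2, and 4 within the stated ranges, stepping W in octaves. It also shows the difference between a square-law (energy) statistic and a full-wave-rectified, truly integrated (full-linear) statistic. The exact set of conditions run in the study is not given in the abstract, so the grid is an assumption derived from the stated ranges; the function names are hypothetical, and the detector functions assume the waveform has already been band-pass filtered.

```python
def wt_grid(products=(1, 2, 4), w_min=2.5, w_max=160.0,
            t_min_ms=6.25, t_max_ms=400.0):
    """List (W in Hz, T in ms) pairs with W * T equal to each product.

    W is stepped in octaves from w_min to w_max; a pair is kept only if
    its duration falls inside [t_min_ms, t_max_ms].
    """
    grid = {}
    for wt in products:
        pairs = []
        w = w_min
        while w <= w_max:
            t_ms = 1000.0 * wt / w  # T = WT / W, converted from seconds to ms
            if t_min_ms <= t_ms <= t_max_ms:
                pairs.append((w, t_ms))
            w *= 2.0  # one octave up
        grid[wt] = pairs
    return grid


def energy_statistic(samples):
    """Square-law rectifier followed by a true integrator (sum of squares)."""
    return sum(x * x for x in samples)


def full_linear_statistic(samples):
    """Full-wave (linear) rectifier followed by a true integrator (sum of magnitudes)."""
    return sum(abs(x) for x in samples)


if __name__ == "__main__":
    for wt, pairs in wt_grid().items():
        print(f"WT = {wt}: " + ", ".join(f"({w:g} Hz, {t:g} ms)" for w, t in pairs))
```

Under these assumptions the grid has seven pairs for WT = 1, six for WT = 2, and five for WT = 4, each running along a line of constant bandwidth-duration product.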


2019 ◽  
Vol 277 ◽  
pp. 02001 ◽  
Author(s):  
Qiwei Yin ◽  
Ruixun Zhang ◽  
XiuLi Shao

In this paper, we propose a mixed CNN (convolutional neural network) and RNN (recurrent neural network) model for image classification, called the CNN-RNN model. Image data can be viewed as two-dimensional wave data, and convolution is a filtering process. It can filter out non-critical band information in an image, leaving behind the important features of the image. The CNN-RNN model uses the RNN to calculate the dependency and continuity features of the intermediate layer outputs of the CNN model, and connects the features of these intermediate layers to the final fully connected network for classification prediction, which results in better classification accuracy. At the same time, in order to satisfy the RNN model's restriction on the length of the input sequence and to prevent exploding or vanishing gradients in the network, this paper combines the wavelet transform (WT) method in the Fourier transform to filter the input data. We test the proposed CNN-RNN model on the widely used CIFAR-10 dataset. The results show that the proposed method has a better classification effect than the original CNN network, and that further investigation is needed.
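The remark that convolution is a filtering process can be illustrated with a minimal sketch, independent of the paper's actual network. The code below applies a small kernel to a 2D image in the way CNN layers do (strictly, cross-correlation with no padding). The function name and both kernels are illustrative assumptions, not taken from the paper.

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image (lists of lists) with no padding.

    This is cross-correlation, which is what CNN layers usually compute.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# A 3x3 box kernel averages neighbours, so it passes low spatial frequencies.
BOX = [[1.0 / 9.0] * 3 for _ in range(3)]

# A Laplacian-style kernel sums to zero, so it removes flat regions
# and responds only where intensity changes.
LAPLACIAN = [[0.0, -1.0, 0.0],
             [-1.0, 4.0, -1.0],
             [0.0, -1.0, 0.0]]

if __name__ == "__main__":
    flat = [[2.0] * 5 for _ in range(5)]
    print(conv2d_valid(flat, BOX)[0])        # flat region is preserved
    print(conv2d_valid(flat, LAPLACIAN)[0])  # flat region is suppressed
```

In a trained CNN the kernel weights are learned rather than fixed, so which spatial frequency bands are kept or discarded depends on the data.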

