An Acoustic Model of Civil Aviation's Radiotelephony Communication

Author(s):  
Yuanqing Liu ◽  
Xiaojing Guo ◽  
Haigang Zhang ◽  
Jinfeng Yang

2012 ◽  
Vol 71 (17) ◽  
pp. 1589-1597
Author(s):  
I.Sh. Nevlyudov ◽  
A.M. Tsimbal ◽  
S.S. Milyutina ◽  
V.Y. Sharkovsky

2019
Author(s):  
Masashi Aso ◽  
Shinnosuke Takamichi ◽  
Norihiro Takamune ◽  
Hiroshi Saruwatari

Energies ◽  
2020 ◽  
Vol 13 (8) ◽  
pp. 2048
Author(s):  
Jianfeng Zhu ◽  
Wenguo Luo ◽  
Yuqing Wei ◽  
Cheng Yan ◽  
Yancheng You

The buzz phenomenon of a typical supersonic inlet is analyzed on the basis of numerical simulations and duct acoustic theory. Since the choked inlet can be treated as a duct closed at one end, a one-dimensional (1D) mathematical model based on duct acoustic theory is proposed to describe the periodic pressure oscillations of the little buzz and the big buzz. The results of the acoustic model agree well with those of the numerical simulations and the experimental data, verifying that the dominant oscillation patterns of the little buzz and the big buzz are closely related to the first and second resonant modes of the standing wave, respectively. The discrepancies between the numerical simulation and the ideal acoustic model might be attributed to viscous damping in the fluid oscillation system. To explore this damping, a small perturbation jet is introduced to trigger the resonance of the buzz system, and the nonlinear amplification effect of resonance may help to estimate the damping. Through comparison between the linear acoustic model and the nonlinear simulation, the calculated pressure oscillation damping values of the little buzz and the big buzz are 0.33 and 0.16, respectively, which can be regarded as estimates of the real damping.
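
For orientation, the closed-open duct picture above has a classical one-dimensional resonance relation, f_n = (2n - 1)c / (4L). The short Python sketch below evaluates the first two modes; the sound speed and effective duct length are illustrative placeholders, not values taken from the study.

```python
# Minimal sketch of the classical closed-open duct (quarter-wave) resonance
# relation underlying a 1D acoustic model of a choked inlet.
# L_duct and c below are illustrative placeholders, not values from the paper.

def closed_open_duct_frequencies(c, L_duct, n_modes=2):
    """Resonant frequencies f_n = (2n - 1) * c / (4 * L) for a duct
    closed at one end and open at the other."""
    return [(2 * n - 1) * c / (4.0 * L_duct) for n in range(1, n_modes + 1)]

if __name__ == "__main__":
    c = 340.0       # speed of sound, m/s (assumed)
    L_duct = 0.5    # effective duct length, m (assumed)
    f1, f2 = closed_open_duct_frequencies(c, L_duct)
    # In the picture above, the little buzz and big buzz are associated with
    # the first and second resonant modes, respectively.
    print(f"first mode  ~ {f1:.1f} Hz")
    print(f"second mode ~ {f2:.1f} Hz")
```

Under this idealized relation the second mode sits at three times the frequency of the first, which is the kind of frequency ordering the little-buzz/big-buzz association relies on.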


Author(s):  
Ryo Nishikimi ◽  
Eita Nakamura ◽  
Masataka Goto ◽  
Kazuyoshi Yoshii

This paper describes an automatic singing transcription (AST) method that estimates a human-readable musical score of a sung melody from an input music signal. Because of the considerable pitch and temporal variation of a singing voice, a naive cascading approach that estimates an F0 contour and quantizes it with estimated tatum times cannot avoid many pitch and rhythm errors. To solve this problem, we formulate a unified generative model of a music signal that consists of a semi-Markov language model representing the generative process of latent musical notes conditioned on musical keys and an acoustic model based on a convolutional recurrent neural network (CRNN) representing the generative process of an observed music signal from the notes. The resulting CRNN-HSMM hybrid model enables us to estimate the most likely musical notes from a music signal with the Viterbi algorithm, while leveraging both the grammatical knowledge about musical notes and the expressive power of the CRNN. The experimental results showed that the proposed method outperformed the conventional state-of-the-art method and that integrating the musical language model with the acoustic model has a positive effect on AST performance.
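
To make the decoding step concrete, the sketch below (Python/NumPy) implements a plain log-domain Viterbi dynamic program over a generic HMM. It is only a simplified stand-in for the CRNN-HSMM decoder described above: it ignores note durations (the semi-Markov part), key conditioning, and tatum information, and all probability matrices are hypothetical placeholders.

```python
# Toy log-domain Viterbi decoding over a plain HMM (not the paper's model).
import numpy as np

def viterbi(log_init, log_trans, log_obs):
    """log_init: (S,), log_trans: (S, S), log_obs: (T, S) frame log-likelihoods.
    Returns the most likely state sequence of length T."""
    T, S = log_obs.shape
    delta = np.full((T, S), -np.inf)   # best log-score ending in each state
    back = np.zeros((T, S), dtype=int) # backpointers
    delta[0] = log_init + log_obs[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (prev_state, cur_state)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_obs[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, T = 4, 10                       # hypothetical: 4 note states, 10 frames
    log_init = np.log(np.full(S, 1.0 / S))
    trans = rng.random((S, S)); trans /= trans.sum(axis=1, keepdims=True)
    obs = rng.random((T, S)); obs /= obs.sum(axis=1, keepdims=True)
    print(viterbi(log_init, np.log(trans), np.log(obs)))
```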


Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 634
Author(s):  
Alakbar Valizada ◽  
Natavan Akhundova ◽  
Samir Rustamov

In this paper, various methodologies for acoustic and language models, as well as labeling methods, for automatic speech recognition of spoken dialogues in emergency call centers were investigated and comparatively analyzed. Because dialogue speech in call centers involves specific contexts and noisy, emotional environments, available speech recognition systems show poor performance. Therefore, to recognize dialogue speech accurately, the main modules of speech recognition systems, namely language models and acoustic training methodologies, as well as symmetric data labeling approaches, were investigated and analyzed. To find an effective acoustic model for dialogue data, different types of Gaussian Mixture Model/Hidden Markov Model (GMM/HMM) and Deep Neural Network/Hidden Markov Model (DNN/HMM) methodologies were trained and compared. Additionally, effective language models for dialogue systems were identified based on extrinsic and intrinsic evaluation. Lastly, our suggested data labeling approach with spelling correction was compared with common labeling methods and outperformed them by a notable margin. Based on the experimental results, we determined that a DNN/HMM acoustic model, a trigram language model with Kneser–Ney discounting, and a labeling method that applies spelling correction to the training data form an effective configuration for dialogue speech recognition in emergency call centers. It should be noted that this research was conducted with two different types of datasets collected from emergency calls: the Dialogue dataset (27 h), which encapsulates call agents' speech, and the Summary dataset (53 h), which contains voiced summaries of those dialogues describing emergency cases. Although the speech collected from the emergency call center is in Azerbaijani, which belongs to the Turkic group of languages, our approaches are not tightly tied to language-specific features. Hence, it is anticipated that the suggested approaches can be applied to other languages of the same group.
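
As a small illustration of two components singled out above, the sketch below trains a trigram language model with interpolated Kneser–Ney discounting using NLTK's lm module and reports perplexity as the intrinsic metric. The toy sentences are invented placeholders, not the Azerbaijani Dialogue or Summary data, and the perplexity is computed on the training text itself purely to show the API.

```python
# Toy trigram LM with Kneser-Ney discounting plus an intrinsic (perplexity)
# check, using NLTK. The corpus is a placeholder, not the paper's data.
from nltk.lm import KneserNeyInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
from nltk.util import ngrams

train_sents = [
    ["there", "is", "a", "fire", "at", "the", "station"],
    ["please", "send", "an", "ambulance", "quickly"],
]

order = 3
train_data, vocab = padded_everygram_pipeline(order, train_sents)
lm = KneserNeyInterpolated(order)
lm.fit(train_data, vocab)

# Intrinsic evaluation uses perplexity (lower is better). Scored on the
# training sentences here only for illustration; a real comparison would
# use held-out dialogue transcripts.
eval_trigrams = [
    ng for sent in train_sents
    for ng in ngrams(pad_both_ends(sent, n=order), order)
]
print("perplexity:", lm.perplexity(eval_trigrams))
print("P(ambulance | send an) =", lm.score("ambulance", ["send", "an"]))
```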

