A multi-channel/multi-speaker interactive 3D audio-visual speech corpus in Mandarin

Author(s):  
Jun Yu ◽  
Rongfeng Su ◽  
Lan Wang ◽  
Wenpeng Zhou
Author(s):  
Bowon Lee ◽  
Mark Hasegawa-Johnson ◽  
Camille Goudeseune ◽  
Suketu Kamdar ◽  
Sarah Borys ◽  
...  

2020 ◽  
Vol 5 ◽  
pp. 87-93
Author(s):  
A.A. Axyonov ◽  
D.V. Ivanko ◽  
I.B. Lashkov ◽  
D.A. Ryumin ◽  
...  

This paper introduces a new methodology for creating multimodal corpora for audio-visual speech recognition in driver monitoring systems. Multimodal speech recognition makes it possible to rely on audio data when video data are unusable (e.g., at night) and on video data in acoustically noisy conditions (e.g., on highways). The article discusses several basic scenarios in which speech recognition in the vehicle environment is required to interact with the driver monitoring system. The methodology defines the main stages and requirements for building a multimodal corpus. The paper also describes the meta-parameters that the multimodal corpus must satisfy. In addition, a software package for recording an audio-visual speech corpus is described.


Author(s):  
Takashi Kasuya ◽  
Manabu Tsukada ◽  
Yu Komohara ◽  
Shigeki Takasaka ◽  
Takuhiro Mizuno ◽  
...  
Keyword(s):  
3D Audio

Author(s):  
Ji-Eun Lee ◽  
Wook-Eun Kim ◽  
Kwang Hyun Kim ◽  
Myung-Whun Sung ◽  
Tack-Kyun Kwon
