A multi-channel/multi-speaker interactive 3D audio-visual speech corpus in Mandarin

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) ◽

10.1109/iscslp.2016.7918453 ◽

2016 ◽

Author(s):

Jun Yu ◽

Rongfeng Su ◽

Lan Wang ◽

Wenpeng Zhou

Keyword(s):

Visual Speech ◽

Speech Corpus ◽

3D Audio ◽

Multimodal learning using 3D audio-visual data for audio-visual speech recognition

2017 International Conference on Asian Language Processing (IALP) ◽

10.1109/ialp.2017.8300541 ◽

2017 ◽

Author(s):

Rongfeng Su ◽

Lan Wang ◽

Xunying Liu

Keyword(s):

Speech Recognition ◽

Visual Speech ◽

Multimodal Learning ◽

Visual Data ◽

3D Audio ◽

Visual Speech Recognition

AVICAR: audio-visual speech corpus in a car environment

10.21437/interspeech.2004-424 ◽

2004 ◽

Author(s):

Bowon Lee ◽

Mark Hasegawa-Johnson ◽

Camille Goudeseune ◽

Suketu Kamdar ◽

Sarah Borys ◽

...

Keyword(s):

Visual Speech ◽

Indonesian audio-visual speech corpus for multimodal automatic speech recognition

2017 International Conference on Advanced Computer Science and Information Systems (ICACSIS) ◽

10.1109/icacsis.2017.8355062 ◽

2017 ◽

Author(s):

Muhammad Rizki Aulia Rahman Maulana ◽

Mohamad Ivan Fanany

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Visual Speech ◽

A methodology of multimodal corpus creation for audio-visual speech recognition in assistive transport systems

Informatization and communication ◽

10.34219/2078-8320-2020-11-5-87-93 ◽

2020 ◽

Vol 11 (5) ◽

pp. 87-93

Author(s):

A.A. Axyonov ◽

D.V. Ivanko ◽

I.B. Lashkov ◽

D.A. Ryumin ◽

...

Keyword(s):

Speech Recognition ◽

Visual Speech ◽

Transport Systems ◽

Speech Corpus ◽

Visual Speech Recognition ◽

Driver Monitoring ◽

Multimodal Corpus ◽

Corpus Creation ◽

This paper introduces a new methodology for creating a multimodal corpus for audio-visual speech recognition in driver monitoring systems. Multimodal speech recognition makes it possible to rely on audio data when video data are unusable (e.g., at night) and on video data in acoustically noisy conditions (e.g., on highways). The article discusses several basic scenarios in which speech recognition in the vehicle environment is required to interact with the driver monitoring system. The methodology defines the main stages and requirements for the design of a multimodal corpus. The paper also describes the metaparameters that the multimodal corpus must satisfy. In addition, a software package for recording an audio-visual speech corpus is described.

Development of audio-visual speech corpus toward speaker-independent Japanese LVCSR

2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA) ◽

10.1109/icsda.2016.7918976 ◽

2016 ◽

Author(s):

Kazuto Ukai ◽

Satoshi Tamura ◽

Satoru Hayamizu

Keyword(s):

Visual Speech ◽

Speech Corpus ◽

Speaker Independent

Construction of Audio-Visual Speech Corpus Using Motion-Capture System and Corpus Based Facial Animation

IEICE Transactions on Information and Systems ◽

10.1093/ietisy/e88-d.11.2477 ◽

2005 ◽

Vol E88-D (11) ◽

pp. 2477-2483 ◽

Author(s):

T. YOTSUKURA

Keyword(s):

Motion Capture ◽

Facial Animation ◽

Visual Speech ◽

Motion Capture System ◽

LiVRation: Remote VR live platform with interactive 3D audio-visual service

2019 IEEE Games, Entertainment, Media Conference (GEM) ◽

10.1109/gem.2019.8811549 ◽

2019 ◽

Author(s):

Takashi Kasuya ◽

Manabu Tsukada ◽

Yu Komohara ◽

Shigeki Takasaka ◽

Takuhiro Mizuno ◽

...

Keyword(s):

3D Audio ◽

Visual speech segmentation: Using facial cues to locate word boundaries in continuous speech

PsycEXTRA Dataset ◽

10.1037/e520592012-436 ◽

2010 ◽

Author(s):

Aaron D. Mitchel ◽

Daniel J. Weiss

Keyword(s):

Visual Speech ◽

Speech Segmentation ◽

Continuous Speech ◽

Facial Cues ◽

Word Boundaries

Uni- and bimodal threat cueing with vibrotactile and 3D audio technologies in a combat vehicle

PsycEXTRA Dataset ◽

10.1037/e577702012-008 ◽

2006 ◽

Author(s):

Otto Carlander ◽

Lars Eriksson

Keyword(s):

3D Audio ◽

Combat Vehicle ◽

Audio Technologies

Research on Construction of the Korean Speech Corpus in Patient with Velopharyngeal Insufficiency

Korean Journal of Otorhinolaryngology - Head and Neck Surgery ◽

10.3342/kjorl-hns.2012.55.8.498 ◽

2012 ◽

Vol 55 (8) ◽

pp. 498 ◽

Author(s):

Ji-Eun Lee ◽

Wook-Eun Kim ◽

Kwang Hyun Kim ◽

Myung-Whun Sung ◽

Tack-Kyun Kwon

Keyword(s):

Velopharyngeal Insufficiency ◽
