Neural oscillations in the temporal pole for a temporally congruent audio-visual speech detection task

2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Takefumi Ohki ◽  
Atsuko Gunji ◽  
Yuichi Takei ◽  
Hidetoshi Takahashi ◽  
Yuu Kaneko ◽  
...  
2014 ◽  
Vol 135 (4) ◽  
pp. 2162-2162
Author(s):  
Charlotte Morse-Fortier ◽  
Giang Pham ◽  
Richard L. Freyman

2021 ◽  
Author(s):  
Shashidhar R ◽  
Sudarshan Patil Kulkarni

Abstract Audio-visual speech recognition (AVSR) is an emerging field of research, but appropriate visual features for recognizing visual speech are still lacking. Human lip-readers are increasingly presented as useful in gathering forensic evidence but, like all humans, are unreliable when analyzing lip movement. Speaker-independent lip-reading is particularly demanding because of unpredictable variation between speakers; at the same time, recent advances in signal processing and computer vision have made automated lip-reading a field of great interest, and AVSR has attracted attention as a reliable solution to the speech detection problem. Here we used a custom dataset and designed a system that predicts the spoken word from lip movements. We extract MFCC features from the audio stream, apply an LSTM network for visual speech recognition, and integrate the audio and visual streams with a feed-forward neural network (FFNN). The fused model makes more appropriate decisions when predicting the spoken word and achieved an accuracy of about 92.38%.
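A minimal sketch of the fusion pipeline described in this abstract, assuming 13 MFCC coefficients per audio frame, fixed-length lip-region feature sequences, and a small word vocabulary; the librosa/Keras stack, shapes, layer sizes, and vocabulary size are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
import librosa
from tensorflow.keras import layers, Model

N_MFCC = 13          # MFCC coefficients per audio frame
AUDIO_FRAMES = 100   # audio frames per utterance (assumed)
VIDEO_FRAMES = 25    # lip-region frames per utterance (assumed)
VIDEO_FEATS = 1024   # flattened lip-region features per frame (assumed)
N_WORDS = 10         # vocabulary size of the custom dataset (assumed)

def audio_mfcc(wav_path: str, sr: int = 16000) -> np.ndarray:
    """Load a waveform and return a fixed-size (AUDIO_FRAMES, N_MFCC) MFCC matrix."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC).T  # (frames, N_MFCC)
    out = np.zeros((AUDIO_FRAMES, N_MFCC), dtype=np.float32)
    out[: min(AUDIO_FRAMES, len(mfcc))] = mfcc[:AUDIO_FRAMES]  # pad/truncate
    return out

# Audio branch: MFCC sequence collapsed to a fixed-length embedding.
audio_in = layers.Input(shape=(AUDIO_FRAMES, N_MFCC), name="audio_mfcc")
audio_vec = layers.Dense(64, activation="relu")(layers.Flatten()(audio_in))

# Visual branch: LSTM over the per-frame lip-region features.
video_in = layers.Input(shape=(VIDEO_FRAMES, VIDEO_FEATS), name="lip_frames")
video_vec = layers.LSTM(64)(video_in)

# Fusion: concatenate both modalities and classify the word with a feed-forward network.
fused = layers.Concatenate()([audio_vec, video_vec])
hidden = layers.Dense(64, activation="relu")(fused)
output = layers.Dense(N_WORDS, activation="softmax")(hidden)

model = Model(inputs=[audio_in, video_in], outputs=output)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Training would then call model.fit with paired MFCC matrices and lip-frame sequences; the late-fusion FFNN simply learns to weigh the two modalities when predicting the spoken word.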


Author(s):  
Cristina Bosco ◽  
Felice Dell’Orletta ◽  
Fabio Poletto ◽  
Manuela Sanguinetti ◽  
Maurizio Tesconi

2004 ◽  
Vol 44 (1-4) ◽  
pp. 19-30 ◽  
Author(s):  
Jeesun Kim ◽  
Chris Davis

2019 ◽  
Vol 62 (10) ◽  
pp. 3860-3875 ◽  
Author(s):  
Kaylah Lalonde ◽  
Lynne A. Werner

Purpose This study assessed the extent to which 6- to 8.5-month-old infants and 18- to 30-year-old adults detect and discriminate auditory syllables in noise better in the presence of visual speech than in auditory-only conditions. In addition, we examined whether visual cues to the onset and offset of the auditory signal account for this benefit. Method Sixty infants and 24 adults were randomly assigned to speech detection or discrimination tasks and were tested using a modified observer-based psychoacoustic procedure. Each participant completed 1–3 conditions: auditory-only, with visual speech, and with a visual signal that only cued the onset and offset of the auditory syllable. Results Mixed linear modeling indicated that infants and adults benefited from visual speech on both tasks. Adults relied on the onset–offset cue for detection, but the same cue did not improve their discrimination. The onset–offset cue benefited infants for both detection and discrimination. Whereas the onset–offset cue improved detection similarly for infants and adults, the full visual speech signal benefited infants to a lesser extent than adults on the discrimination task. Conclusions These results suggest that infants' use of visual onset–offset cues is mature, but their ability to use more complex visual speech cues is still developing. Additional research is needed to explore differences in audiovisual enhancement (a) of speech discrimination across speech targets and (b) with increasingly complex tasks and stimuli.
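The mixed-linear-modeling analysis mentioned in this abstract could be specified roughly as follows; this is a hedged sketch assuming a long-format table with one detection or discrimination score per participant and condition, and the column names (score, condition, group, subject), the data file, and the statsmodels backend are assumptions rather than the authors' actual analysis code.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant x condition,
# with age group (infant vs. adult) and listening condition
# (auditory-only, onset-offset cue, full visual speech).
df = pd.read_csv("av_speech_scores.csv")

# Fixed effects: condition crossed with age group; random intercept per participant.
model = smf.mixedlm("score ~ condition * group", data=df, groups=df["subject"])
result = model.fit()
print(result.summary())
```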


2021 ◽  
pp. 273-286
Author(s):  
Kosisochukwu Judith Madukwe ◽  
Xiaoying Gao ◽  
Bing Xue
