Neural oscillations in the temporal pole for a temporally congruent audio-visual speech detection task

2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Takefumi Ohki ◽  
Atsuko Gunji ◽  
Yuichi Takei ◽  
Hidetoshi Takahashi ◽  
Yuu Kaneko ◽  
...  
2014 ◽  
Vol 135 (4) ◽  
pp. 2162-2162
Author(s):  
Charlotte Morse-Fortier ◽  
Giang Pham ◽  
Richard L. Freyman

2021 ◽  
Author(s):  
Shashidhar R ◽  
Sudarshan Patil Kulkarni

Abstract Audio-visual speech recognition (AVSR) is an emerging field of research, but appropriate visual features for recognizing visual speech are still lacking. Human lip-readers are increasingly presented as useful in gathering forensic evidence but, like all humans, are unreliable when analyzing lip movement. Speaker-independent lip-reading is particularly demanding because of unpredictable variation between speakers; at the same time, recent advances in signal processing and computer vision have made automated lip-reading a field of great interest, and AVSR has attracted attention as a reliable solution to the speech detection problem. Here we used a custom dataset and designed a system that predicts the spoken word from lip movements. We extract MFCC features from the audio stream, apply an LSTM network for visual speech recognition, and integrate the audio and visual streams with a feed-forward neural network (FFNN). The fused model makes more appropriate decisions when predicting the spoken word and achieved an accuracy of about 92.38%.
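A minimal sketch of the fusion pipeline described in this abstract, assuming 13 MFCC coefficients per audio frame, fixed-length lip-region feature sequences, and a small word vocabulary; the librosa/Keras stack, shapes, layer sizes, and vocabulary size are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
import librosa
from tensorflow.keras import layers, Model

N_MFCC = 13          # MFCC coefficients per audio frame
AUDIO_FRAMES = 100   # audio frames per utterance (assumed)
VIDEO_FRAMES = 25    # lip-region frames per utterance (assumed)
VIDEO_FEATS = 1024   # flattened lip-region features per frame (assumed)
N_WORDS = 10         # vocabulary size of the custom dataset (assumed)

def audio_mfcc(wav_path: str, sr: int = 16000) -> np.ndarray:
    """Load a waveform and return a fixed-size (AUDIO_FRAMES, N_MFCC) MFCC matrix."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC).T  # (frames, N_MFCC)
    out = np.zeros((AUDIO_FRAMES, N_MFCC), dtype=np.float32)
    out[: min(AUDIO_FRAMES, len(mfcc))] = mfcc[:AUDIO_FRAMES]  # pad/truncate
    return out

# Audio branch: MFCC sequence collapsed to a fixed-length embedding.
audio_in = layers.Input(shape=(AUDIO_FRAMES, N_MFCC), name="audio_mfcc")
audio_vec = layers.Dense(64, activation="relu")(layers.Flatten()(audio_in))

# Visual branch: LSTM over the per-frame lip-region features.
video_in = layers.Input(shape=(VIDEO_FRAMES, VIDEO_FEATS), name="lip_frames")
video_vec = layers.LSTM(64)(video_in)

# Fusion: concatenate both modalities and classify the word with a feed-forward network.
fused = layers.Concatenate()([audio_vec, video_vec])
hidden = layers.Dense(64, activation="relu")(fused)
output = layers.Dense(N_WORDS, activation="softmax")(hidden)

model = Model(inputs=[audio_in, video_in], outputs=output)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Training would then call model.fit with paired MFCC matrices and lip-frame sequences; the late-fusion FFNN simply learns to weigh the two modalities when predicting the spoken word.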


Author(s):  
Cristina Bosco ◽  
Felice Dell’Orletta ◽  
Fabio Poletto ◽  
Manuela Sanguinetti ◽  
Maurizio Tesconi

2004 ◽  
Vol 44 (1-4) ◽  
pp. 19-30 ◽  
Author(s):  
Jeesun Kim ◽  
Chris Davis

2019 ◽  
Vol 62 (10) ◽  
pp. 3860-3875 ◽  
Author(s):  
Kaylah Lalonde ◽  
Lynne A. Werner

Purpose This study assessed the extent to which 6- to 8.5-month-old infants and 18- to 30-year-old adults detect and discriminate auditory syllables in noise better in the presence of visual speech than in auditory-only conditions. In addition, we examined whether visual cues to the onset and offset of the auditory signal account for this benefit. Method Sixty infants and 24 adults were randomly assigned to speech detection or discrimination tasks and were tested using a modified observer-based psychoacoustic procedure. Each participant completed 1–3 conditions: auditory-only, with visual speech, and with a visual signal that only cued the onset and offset of the auditory syllable. Results Mixed linear modeling indicated that infants and adults benefited from visual speech on both tasks. Adults relied on the onset–offset cue for detection, but the same cue did not improve their discrimination. The onset–offset cue benefited infants for both detection and discrimination. Whereas the onset–offset cue improved detection similarly for infants and adults, the full visual speech signal benefited infants to a lesser extent than adults on the discrimination task. Conclusions These results suggest that infants' use of visual onset–offset cues is mature, but their ability to use more complex visual speech cues is still developing. Additional research is needed to explore differences in audiovisual enhancement (a) of speech discrimination across speech targets and (b) with increasingly complex tasks and stimuli.
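The mixed-linear-modeling analysis mentioned in this abstract could be specified roughly as follows; this is a hedged sketch assuming a long-format table with one detection or discrimination score per participant and condition, and the column names (score, condition, group, subject), the data file, and the statsmodels backend are assumptions rather than the authors' actual analysis code.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant x condition,
# with age group (infant vs. adult) and listening condition
# (auditory-only, onset-offset cue, full visual speech).
df = pd.read_csv("av_speech_scores.csv")

# Fixed effects: condition crossed with age group; random intercept per participant.
model = smf.mixedlm("score ~ condition * group", data=df, groups=df["subject"])
result = model.fit()
print(result.summary())
```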


2021 ◽  
pp. 273-286
Author(s):  
Kosisochukwu Judith Madukwe ◽  
Xiaoying Gao ◽  
Bing Xue
