Spatio-temporal feature extraction-based hand gesture recognition for isolated American Sign Language and Arabic numbers

Author(s):  
M. Elmezain ◽  
A. Al-Hamadi ◽  
S.S. Pathan ◽  
B. Michaelis
Author(s):  
Dhanashree Shyam Bendarkar ◽  
Pratiksha Appasaheb Somase ◽  
Preety Kalyansingh Rebari ◽  
Renuka Ramkrishna Paturkar ◽  
Arjumand Masood Khan

Individuals with hearing hindrance utilize gesture based communication to exchange their thoughts. Generally hand movements are used by them to communicate among themselves. But there are certain limitations when they communicate with other people who cannot understand these hand movements. There is a need to have a mechanism that can act as a translator between these people to communicate. It would be easier for these people to interact if there exists direct infrastructure that is able to convert signs to text and voice messages. As of late, numerous such frameworks for gesture based communication acknowledgment have been developed. But most of them are made either for static gesture recognition or dynamic gesture recognition. As sentences are generated using combinations of static and dynamic gestures, it would be simpler for hearing debilitated individuals if such computerized frameworks can detect both the static and dynamic motions together. We have proposed a design and architecture of American Sign Language (ASL) recognition with convolutional neural networks (CNN). This paper utilizes a pretrained VGG-16 architecture for static gesture recognition and for dynamic gesture recognition, spatiotemporal features were learnt with the complex architecture, called deep learning. It contains a bidirectional convolutional Long Short Term Memory network (ConvLSTM) and 3D convolutional neural network (3DCNN) and this architecture is responsible to extract  2D spatio temporal features.


Sign in / Sign up

Export Citation Format

Share Document