Efficient sign language recognition system and dataset creation method based on deep learning and image processing

Author(s):  
Alvaro Leandro Cavalcante Carneiro ◽  
Lucas de Brito Silva ◽  
Denis Henrique Pinheiro Salvadeo


Prospectiva ◽  
2018 ◽  
Vol 16 (2) ◽  
pp. 41-48
Author(s):  
Betsy Villa ◽  
Valeria Valencia ◽  
Julie Berrio

Sign language is the native language used by deaf people to communicate. It is composed of movements and expressions performed with different parts of the body. In Colombia, there is a marked lack of technologies aimed at learning and interpreting it; it is therefore a social commitment to carry out initiatives that improve the quality of life of this group, which represents a considerable minority of the country's population. This article presents the design and implementation of a recognition system for static (non-moving) gestures using the Matlab environment and the SIFT method, through which the image of the acquired letter is displayed together with its translation into Colombian Sign Language, by identifying keypoints and comparing them against images stored in a database. The tool recognizes the 20 static letters of this alphabet and implements a graphical interface in Matlab for better visualization, easy access to the system, and ease of use. The system is shown to respond better when a standardized element appears in the image, in this case a surgical glove, and an improvement of the tool using neural network methods is proposed so that it can later be deployed online, generating a greater impact on the current needs of the Colombian population.
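For a concrete picture of the keypoint-matching pipeline this abstract describes, below is a minimal Python/OpenCV sketch of SIFT matching against a stored letter database (the paper itself works in Matlab). The `templates/` directory layout, the `best_letter_match` helper, and the 0.75 Lowe ratio threshold are illustrative assumptions, not details from the paper.

```python
import cv2
import glob

def best_letter_match(query_path, template_glob="templates/*.png", ratio=0.75):
    """Match a query hand image against stored letter images using SIFT.

    Returns the template with the most ratio-test-passing matches.
    The template layout and 0.75 Lowe ratio are assumptions.
    """
    sift = cv2.SIFT_create()
    matcher = cv2.BFMatcher(cv2.NORM_L2)

    query = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
    _, q_desc = sift.detectAndCompute(query, None)

    best_name, best_count = None, 0
    for name in sorted(glob.glob(template_glob)):
        template = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
        _, t_desc = sift.detectAndCompute(template, None)
        if q_desc is None or t_desc is None or len(t_desc) < 2:
            continue
        # Lowe's ratio test: keep matches clearly better than their runner-up.
        pairs = matcher.knnMatch(q_desc, t_desc, k=2)
        good = [m for m, n in pairs if m.distance < ratio * n.distance]
        if len(good) > best_count:
            best_name, best_count = name, len(good)
    return best_name, best_count
```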


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6256
Author(s):  
Boon Giin Lee ◽  
Teak-Wei Chong ◽  
Wan-Young Chung

Sign language was designed to allow hearing-impaired people to interact with others. Nonetheless, knowledge of sign language is uncommon in society, which creates a communication barrier with the hearing-impaired community. Many studies of sign language recognition using computer vision (CV) have been conducted worldwide to reduce this barrier. However, the CV approach is restricted by the viewing angle and is highly affected by environmental factors. In addition, CV usually involves machine learning, which requires a team of experts and costly hardware, increasing the application cost in real-world situations. This study therefore aims to design and implement a smart wearable American Sign Language (ASL) interpretation system using deep learning, applying sensor fusion across six inertial measurement units (IMUs). The IMUs are attached to all fingertips and the back of the hand to recognize sign language gestures, so the proposed method is not restricted by a camera's field of view. The study shows that this model achieves an average recognition rate of 99.81% for dynamic ASL gestures. Moreover, the proposed ASL recognition system can be further integrated with ICT and IoT technology to provide a feasible solution for helping hearing-impaired people communicate with others and improve their quality of life.
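The abstract does not publish the network itself, but a minimal sketch of the kind of recurrent classifier a six-IMU wearable could feed is shown below in Python/Keras. The per-IMU channel count, window length, class count, and layer sizes are all assumptions.

```python
import tensorflow as tf

NUM_IMUS = 6          # fingertips + back of the hand, as in the paper
CHANNELS_PER_IMU = 9  # accel + gyro + magnetometer axes (assumed)
WINDOW = 100          # time steps per gesture window (assumed)
NUM_CLASSES = 27      # ASL gesture vocabulary size (assumed)

def build_imu_model():
    """Stacked LSTM over fused IMU streams; all sizes are assumptions."""
    inputs = tf.keras.Input(shape=(WINDOW, NUM_IMUS * CHANNELS_PER_IMU))
    x = tf.keras.layers.LSTM(128, return_sequences=True)(inputs)
    x = tf.keras.layers.LSTM(64)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The "fusion" here is simply channel concatenation of the six IMU streams per time step; the paper may fuse at a different stage.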


2020 ◽  
Vol 12 (6) ◽  
pp. 33-45
Author(s):  
Marwa R. M. Bastwesy ◽  
Nada M. ElShennawy ◽  
Mohamed T. Faheem Saidahmed

Many gesture recognition systems based on Wi-Fi sensing have been introduced because they run on commercial off-the-shelf Wi-Fi devices and require no additional equipment. In this paper, a deep learning-based sign language recognition system is proposed that takes Wi-Fi CSI amplitude and phase information as input. The proposed model uses three types of deep learning networks: CNN, LSTM, and attention-based bidirectional LSTM (ABLSTM), together with a complete study of the impact of the optimizer, the use of CSI amplitude and phase, and the preprocessing stage. Accuracy, F-score, precision, and recall are used as performance metrics to evaluate the proposed model. The model achieves average recognition accuracies of 99.855%, 99.674%, 99.734%, and 93.84% for the lab, home, lab + home, and 5-different-users-in-a-lab settings, respectively. Experimental results show that the proposed model can effectively detect sign gestures in complex environments compared with other deep learning recognition models.
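As a rough illustration of the ABLSTM variant named in the abstract, here is a minimal Keras sketch over CSI amplitude/phase sequences. The subcarrier count, sequence length, gesture vocabulary, and the simple additive-attention formulation are assumptions, not the authors' exact design.

```python
import tensorflow as tf

SUBCARRIERS = 30            # CSI subcarriers per antenna pair (assumed)
FEATURES = 2 * SUBCARRIERS  # amplitude + phase per subcarrier (assumed)
TIMESTEPS = 200             # CSI samples per gesture (assumed)
NUM_GESTURES = 26           # sign vocabulary size (assumed)

def build_ablstm():
    """Attention-based bidirectional LSTM over CSI sequences (a sketch)."""
    inputs = tf.keras.Input(shape=(TIMESTEPS, FEATURES))
    # BLSTM encodes the CSI time series in both directions.
    h = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True))(inputs)
    # Additive attention: score each time step, softmax over time, pool.
    scores = tf.keras.layers.Dense(1, activation="tanh")(h)
    weights = tf.keras.layers.Softmax(axis=1)(scores)
    context = tf.keras.layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])
    outputs = tf.keras.layers.Dense(NUM_GESTURES, activation="softmax")(context)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```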


Symmetry ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 262
Author(s):  
Thongpan Pariwat ◽  
Pusadee Seresangtakul

Sign language is a language for the hearing impaired that the general public commonly does not understand; a sign language recognition system therefore acts as an intermediary between the two sides. As a communication tool, a multi-stroke Thai finger-spelling sign language (TFSL) recognition system featuring deep learning was developed in this study. The research uses a vision-based technique on complex backgrounds, with semantic segmentation via dilated convolutions for hand segmentation, optical flow to separate hand strokes, and a convolutional neural network (CNN) for feature learning and classification. Five CNN structures were then compared. The first format used 64 filters of size 3 × 3 with 7 layers; the second used 128 filters, each 3 × 3 in size, with 7 layers; the third used filter counts in ascending order across 7 layers, all with an equal 3 × 3 filter size; the fourth used ascending filter counts and small filter sizes with 7 layers; the final format was a structure based on AlexNet. The resulting average accuracies were 88.83%, 87.97%, 89.91%, 90.43%, and 92.03%, respectively. The AlexNet-based CNN structure was therefore used to build the models for the multi-stroke TFSL recognition system. The experiment was performed on isolated videos of 42 Thai letters, divided into three categories of one, two, and three strokes. The results showed an 88.00% average accuracy for one stroke, 85.42% for two strokes, and 75.00% for three strokes.
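Of the five formats, the third (7 layers, 3 × 3 kernels, ascending filter counts) is concrete enough to sketch; a minimal Keras version follows. The exact filter progression, pooling placement, input resolution, and classification head are assumptions beyond what the abstract states.

```python
import tensorflow as tf

IMG_SIZE = 128    # segmented-hand input resolution (assumed)
NUM_CLASSES = 42  # Thai finger-spelling letters, per the abstract

def build_third_format_cnn():
    """Sketch of the third format: 7 conv layers, 3x3 kernels, ascending filters.

    Only "7 layers, 3x3, ascending filter counts" comes from the abstract;
    the progression, pooling, and head below are assumptions.
    """
    filters = [32, 32, 64, 64, 128, 128, 256]  # assumed ascending progression
    model = tf.keras.Sequential([tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))])
    for i, f in enumerate(filters):
        model.add(tf.keras.layers.Conv2D(f, 3, padding="same", activation="relu"))
        if i % 2 == 1:  # pool after every second conv block (assumed)
            model.add(tf.keras.layers.MaxPooling2D())
    model.add(tf.keras.layers.GlobalAveragePooling2D())
    model.add(tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```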

