A Video-Based Taiwan Sign Language Recognition System Using Deep Learning Techniques

2022 ◽  
Author(s):  
Ming-Han Huang ◽  
Hsuan-Min Wang ◽  
Chuen-Tsai Sun

Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6256
Author(s):  
Boon Giin Lee ◽  
Teak-Wei Chong ◽  
Wan-Young Chung

Sign language was designed to allow hearing-impaired people to interact with others. Nonetheless, knowledge of sign language is uncommon in society, which creates a communication barrier with the hearing-impaired community. Many studies of sign language recognition using computer vision (CV) have been conducted worldwide to reduce this barrier. However, the CV approach is restricted by the viewing angle and is highly affected by environmental factors. In addition, CV usually involves machine learning, which requires the collaboration of a team of experts and costly hardware, increasing the application cost in real-world situations. This study therefore aims to design and implement a smart wearable American Sign Language (ASL) interpretation system using deep learning, based on sensor fusion of six inertial measurement units (IMUs). The IMUs are attached to all fingertips and the back of the hand to capture sign language gestures, so the proposed method is not restricted by a camera's field of view. The study shows that this model achieves an average recognition rate of 99.81% for dynamic ASL gestures. Moreover, the proposed ASL recognition system can be further integrated with ICT and IoT technology to provide a feasible solution that helps hearing-impaired people communicate with others and improves their quality of life.
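As an illustration of how such fused IMU data could be classified, the sketch below feeds fixed-length windows of concatenated IMU channels to a small 1D-CNN + LSTM network in Keras. The window length, channel count, number of classes, and layer sizes are assumptions for illustration, not values from the paper.

```python
# A minimal sketch (not the authors' code): classify fixed-length windows of
# fused IMU data with a 1D-CNN + LSTM. The shapes below are assumptions:
# T time steps per gesture window, 6 IMUs x 6 channels (accel + gyro) per step.
import numpy as np
from tensorflow.keras import layers, models

T, N_IMUS, CHANNELS, N_CLASSES = 100, 6, 6, 27   # assumed, not from the paper

def build_model():
    model = models.Sequential([
        layers.Input(shape=(T, N_IMUS * CHANNELS)),  # early fusion: all IMU channels stacked per step
        layers.Conv1D(64, 5, activation="relu"),     # local motion features
        layers.MaxPooling1D(2),
        layers.LSTM(128),                            # temporal dynamics of the gesture
        layers.Dense(N_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Dummy data standing in for segmented gesture windows and integer labels.
X = np.random.randn(32, T, N_IMUS * CHANNELS).astype("float32")
y = np.random.randint(0, N_CLASSES, size=32)
build_model().fit(X, y, epochs=1, batch_size=8, verbose=0)
```

Early fusion (stacking all IMU channels per time step) keeps the sketch simple; fusing per-sensor streams at a later layer is an equally plausible design.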


2020 ◽  
Vol 12 (6) ◽  
pp. 33-45
Author(s):  
Marwa R. M. Bastwesy ◽  
Nada M. ElShennawy ◽  
Mohamed T. Faheem Saidahmed

Many gesture recognition systems based on Wi-Fi sensing have been introduced because they work with commercial off-the-shelf Wi-Fi devices and require no additional equipment. In this paper, a deep learning-based sign language recognition system is proposed. Wi-Fi channel state information (CSI) amplitude and phase are used as input to the proposed model. The model uses three types of deep learning networks: CNN, LSTM, and attention-based bidirectional LSTM (ABLSTM), together with a complete study of the impact of the optimizer, the use of CSI amplitude and phase, and the preprocessing stage. Accuracy, F-score, precision, and recall are used as performance metrics to evaluate the model. The proposed model achieves 99.855%, 99.674%, 99.734%, and 93.84% average recognition accuracy for the lab environment, the home environment, the combined lab + home data, and five different users in a lab environment, respectively. Experimental results show that the proposed model can effectively detect sign gestures in complex environments and compares favorably with other deep learning recognition models.
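The sketch below shows one common way to turn a window of raw complex CSI into the amplitude-plus-phase input described above. The linear phase detrending is a standard sanitization step assumed here for illustration, not necessarily the preprocessing used in the paper, and the window and subcarrier dimensions are illustrative.

```python
# A minimal sketch (assumptions, not the paper's pipeline): convert a window of
# complex Wi-Fi CSI into an amplitude + sanitized-phase tensor that a
# CNN/LSTM/ABLSTM classifier could consume.
import numpy as np

def csi_to_features(csi):
    """csi: complex array of shape (time, subcarriers), e.g. (200, 30)."""
    amplitude = np.abs(csi)
    phase = np.unwrap(np.angle(csi), axis=1)       # remove 2*pi jumps across subcarriers
    # Linear detrending of the phase across subcarriers is a common way to
    # suppress the random offsets introduced by unsynchronized transceivers.
    k = np.arange(csi.shape[1])
    slope = (phase[:, -1] - phase[:, 0]) / (k[-1] - k[0])
    offset = phase.mean(axis=1)
    sanitized = phase - slope[:, None] * k[None, :] - offset[:, None]
    # Stack into (time, subcarriers, 2): channel 0 = amplitude, channel 1 = phase.
    return np.stack([amplitude, sanitized], axis=-1)

window = np.random.randn(200, 30) + 1j * np.random.randn(200, 30)  # dummy CSI window
features = csi_to_features(window)
print(features.shape)   # (200, 30, 2)
```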


Symmetry ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 262
Author(s):  
Thongpan Pariwat ◽  
Pusadee Seresangtakul

Sign language is a language for the hearing impaired that the general public commonly does not understand; a sign language recognition system therefore serves as an intermediary between the two sides. As a communication tool, a multi-stroke Thai finger-spelling sign language (TFSL) recognition system based on deep learning was developed in this study. The research uses a vision-based technique on complex backgrounds: hand segmentation is performed with dilated-convolution semantic segmentation, hand strokes are separated using optical flow, and feature learning and classification are done with a convolutional neural network (CNN). We then compared five CNN structures (formats). The first format used 64 filters of size 3 × 3 with 7 layers; the second used 128 filters of size 3 × 3 with 7 layers; the third used an ascending number of filters across 7 layers, all with 3 × 3 filters; the fourth used an ascending number of filters and small filter sizes with 7 layers; the final format was a structure based on AlexNet. These formats achieved average accuracies of 88.83%, 87.97%, 89.91%, 90.43%, and 92.03%, respectively. We therefore implemented the AlexNet-based CNN structure to create the models for the multi-stroke TFSL recognition system. The experiment was performed on isolated videos of 42 Thai letters, divided into three categories of one, two, and three strokes. The results showed an average accuracy of 88.00% for one-stroke, 85.42% for two-stroke, and 75.00% for three-stroke signs.
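As a concrete reading of the "first format" (seven convolutional layers, each with 64 filters of size 3 × 3), the sketch below builds such a network in Keras. The input resolution, pooling placement, and classifier head are assumptions, since the abstract does not specify them.

```python
# A minimal sketch (assumed layout, not the paper's exact architecture) of the
# "first format": seven convolutional layers, each with 64 filters of size 3 x 3,
# followed by a classifier over the 42 Thai finger-spelling classes.
from tensorflow.keras import layers, models

N_CLASSES = 42

def build_format1(input_shape=(128, 128, 3)):        # input resolution is assumed
    model = models.Sequential([layers.Input(shape=input_shape)])
    for i in range(7):                                # 7 conv layers, 64 filters, 3 x 3
        model.add(layers.Conv2D(64, (3, 3), padding="same", activation="relu"))
        if i % 2 == 1:                                # occasional pooling keeps the feature map small
            model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(N_CLASSES, activation="softmax"))
    return model

build_format1().summary()
```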


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1739
Author(s):  
Hamzah Luqman ◽  
El-Sayed M. El-Alfy

Sign languages are the main visual communication medium between hard-of-hearing people and their societies. Similar to spoken languages, they are not universal and vary from region to region, yet they remain relatively under-resourced. Arabic sign language (ArSL) is one of these languages and has attracted increasing attention in the research community. However, most existing work on sign language recognition focuses on manual gestures and ignores non-manual cues, such as facial expressions, that carry additional linguistic information. One of the main obstacles to considering these modalities is the lack of suitable datasets. In this paper, we propose a new multi-modality ArSL dataset that integrates various types of modalities. It consists of 6748 video samples of fifty signs performed by four signers and collected using Kinect V2 sensors. The dataset will be freely available to researchers for developing and benchmarking their techniques and for further advancement of the field. In addition, we evaluated the fusion of spatial and temporal features of the different modalities, manual and non-manual, for sign language recognition using state-of-the-art deep learning techniques. This fusion boosted the accuracy of the recognition system in signer-independent mode by 3.6% compared with using manual gestures alone.
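A minimal sketch of feature-level fusion of manual and non-manual streams is shown below, assuming a two-stream CNN + LSTM layout. The clip length, frame size, and layer widths are illustrative and not taken from the paper.

```python
# A minimal feature-level fusion sketch (assumed architecture, not the authors'
# model): one CNN + LSTM stream for manual (hand) frames and one for non-manual
# (face) frames; their temporal features are concatenated before classification.
from tensorflow.keras import layers, models

T, H, W, C, N_SIGNS = 16, 64, 64, 3, 50   # clip length and frame size are assumed; 50 signs as in the dataset

def stream(name):
    inp = layers.Input(shape=(T, H, W, C), name=name)
    x = layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu"))(inp)
    x = layers.TimeDistributed(layers.MaxPooling2D())(x)
    x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)  # spatial features per frame
    x = layers.LSTM(64)(x)                                          # temporal features per stream
    return inp, x

hand_in, hand_feat = stream("manual")
face_in, face_feat = stream("non_manual")
fused = layers.Concatenate()([hand_feat, face_feat])                # fuse manual + non-manual features
out = layers.Dense(N_SIGNS, activation="softmax")(fused)
model = models.Model([hand_in, face_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```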


2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Gamal Tharwat ◽  
Abdelmoty M. Ahmed ◽  
Belgacem Bouallegue

In recent years, pattern recognition has played a growing role in human-computer interaction (HCI) systems through computer vision applications and machine learning. One of the most important of these applications is recognizing the hand gestures used to communicate with deaf people, in particular the dashed letters that open surahs of the Quran. In this paper, we propose an Arabic Alphabet Sign Language Recognition System (AArSLRS) using a vision-based approach. The proposed system consists of four stages: data acquisition, data preprocessing, feature extraction, and classification. The system handles three types of data: bare hands against a dark background, bare hands against a light background, and hands wearing dark-colored gloves. AArSLRS begins by capturing an image of the alphabet gesture, then detects the hand and isolates it from the background using one of the proposed methods, after which hand features are extracted according to the selected extraction method. For classification, we used supervised learning techniques on the 28-letter Arabic alphabet with 9240 images, focusing on the 14 letters that represent the openings of Quran surahs in Quranic sign language (QSL). AArSLRS achieved an accuracy of 99.5% with the K-Nearest Neighbor (KNN) classifier.
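The final classification stage could look like the sketch below, which trains a K-Nearest Neighbor classifier on hand-shape feature vectors with scikit-learn. The feature dimensionality, the value of k, and the random vectors standing in for the real extracted features are assumptions for illustration.

```python
# A minimal sketch (illustrative, not the AArSLRS implementation): classify
# extracted hand-shape feature vectors with a K-Nearest Neighbor classifier,
# as in the final stage described above. Feature extraction is replaced here
# by random vectors; in practice they would come from the segmented hand image.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
N_SAMPLES, N_FEATURES, N_LETTERS = 9240, 128, 28   # 9240 images / 28 letters as in the abstract; 128-d features assumed
X = rng.normal(size=(N_SAMPLES, N_FEATURES))
y = rng.integers(0, N_LETTERS, size=N_SAMPLES)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5)           # k is an assumed value
knn.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, knn.predict(X_test)))
```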


2019 ◽  
Vol 7 (2) ◽  
pp. 43
Author(s):  
Malhotra Pooja ◽  
K. Maniar Chirag ◽  
V. Sankpal Nikhil ◽  
R. Thakkar Hardik ◽  
...  
