Experimental Study on Improvement of Sign Language Motion Classification Performance Using Pre-trained Network Models

Author(s):  
Kaito Kawaguchi ◽  
Zhizhong Wang ◽  
Tomoki Kuniwa ◽  
Paporn Daraseneeyakul ◽  
Phaphimon Veerakiatikit ◽  
...  


Symmetry ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 1193
Author(s):  
Shaochen Jiang ◽  
Liejun Wang ◽  
Shuli Cheng ◽  
Anyu Du ◽  
Yongming Li

Existing learning-based unsupervised hashing methods usually use a pre-trained network to extract features and then use the extracted feature vectors to construct a similarity matrix, which guides the generation of hash codes through gradient descent. Existing research shows that gradient-descent-based algorithms cause the hash codes of paired images to be updated toward each other's positions during training. In unsupervised training, this causes large fluctuations in the hash codes and limits the efficiency of hash-code learning. In this paper, we propose a method named Deep Unsupervised Hashing with Gradient Attention (UHGA) to solve this problem. UHGA mainly comprises the following steps: (1) use pre-trained network models to extract image features; (2) compute the cosine distance between the features of each image pair and construct a similarity matrix from these distances to guide the generation of hash codes; (3) add a gradient attention mechanism during hash-code training to attend to the gradients. Experiments on two existing public datasets show that the proposed method obtains more discriminative hash codes.
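As a rough illustration of steps (1)-(2), the Python/PyTorch sketch below builds a cosine-similarity matrix from pre-trained features and derives similar/dissimilar pseudo-labels from it; the function names and the 0.9/0.1 thresholds are illustrative assumptions, not taken from the paper.

    import torch
    import torch.nn.functional as F

    def cosine_similarity_matrix(features: torch.Tensor) -> torch.Tensor:
        # Pairwise cosine similarity of pre-trained features (N x D).
        normed = F.normalize(features, dim=1)  # unit-length rows
        return normed @ normed.t()             # N x N similarity matrix

    def pseudo_similarity_targets(sim: torch.Tensor, pos: float = 0.9,
                                  neg: float = 0.1) -> torch.Tensor:
        # Map cosine similarities to +1/-1 pseudo-labels that guide
        # hash-code learning; the thresholds are illustrative only.
        targets = torch.zeros_like(sim)
        targets[sim > pos] = 1.0   # treat pair as similar
        targets[sim < neg] = -1.0  # treat pair as dissimilar
        return targets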


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Johannes Mehrer ◽  
Courtney J. Spoerer ◽  
Nikolaus Kriegeskorte ◽  
Tim C. Kietzmann

Deep neural networks (DNNs) excel at visual recognition tasks and are increasingly used as a modeling framework for neural computations in the primate brain. Just like individual brains, each DNN has a unique connectivity and representational profile. Here, we investigate individual differences among DNN instances that arise from varying only the random initialization of the network weights. Using tools typically employed in systems neuroscience, we show that this minimal change in initial conditions prior to training leads to substantial differences in intermediate and higher-level network representations, despite similar network-level classification performance. We locate the origins of the effects in an under-constrained alignment of category exemplars, rather than misaligned category centroids. These results call into question the common practice of using single networks to derive insights into neural information processing and suggest that computational neuroscientists working with DNNs may need to base their inferences on groups of multiple network instances.
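A minimal sketch of one such systems-neuroscience tool, representational similarity analysis (RSA), applied to two trained network instances; the use of correlation distance and Spearman correlation is a common RSA convention, and the variable names are illustrative.

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    def rdm(activations: np.ndarray) -> np.ndarray:
        # Representational dissimilarity matrix (condensed form) from a
        # (stimuli x units) activation matrix, using correlation distance.
        return pdist(activations, metric="correlation")

    def rdm_agreement(act_a: np.ndarray, act_b: np.ndarray) -> float:
        # Spearman correlation between the RDMs of two network instances:
        # a standard RSA measure of how well their representations agree.
        rho, _ = spearmanr(rdm(act_a), rdm(act_b))
        return rho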


2019 ◽  
Vol 9 (13) ◽  
pp. 2683 ◽  
Author(s):  
Sang-Ki Ko ◽  
Chang Jo Kim ◽  
Hyedong Jung ◽  
Choongsang Cho

We propose a sign language translation system based on human keypoint estimation. It is well known that many problems in computer vision require massive datasets to train deep neural network models. The situation is even worse for sign language translation, where high-quality training data are far more difficult to collect. In this paper, we introduce the KETI (Korea Electronics Technology Institute) sign language dataset, which consists of 14,672 high-resolution, high-quality videos. Considering that each country has a different and unique sign language, the KETI dataset can be the starting point for further research on Korean sign language translation. Using it, we develop a neural network model for translating sign videos into natural language sentences by utilizing human keypoints extracted from the face, hands, and body. The keypoint vector is normalized by the mean and standard deviation of the keypoints and used as input to our translation model, which is based on the sequence-to-sequence architecture. We show that our approach is robust even when the amount of training data is limited: our translation model achieves 93.28% translation accuracy on the validation set and 55.28% on the test set for 105 sentences that can be used in emergency situations. We also compare several variants of our neural sign translation model, based on different attention mechanisms, in terms of classical metrics for measuring translation performance.
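The normalization step described above might look like the following Python sketch; the abstract does not say whether the statistics are computed per frame or per video, so the per-sequence choice here is an assumption.

    import numpy as np

    def normalize_keypoints(keypoints: np.ndarray) -> np.ndarray:
        # Standardize a (frames x keypoint_dims) array of face/hand/body
        # keypoints by its mean and standard deviation before feeding it
        # to the sequence-to-sequence translation model.
        mean = keypoints.mean()
        std = keypoints.std()
        return (keypoints - mean) / (std + 1e-8)  # epsilon guards std == 0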


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Hao Liu ◽  
Keqiang Yue ◽  
Siyi Cheng ◽  
Chengming Pan ◽  
Jie Sun ◽  
...  

Diabetic retinopathy (DR) is one of the most common complications of diabetes and the main cause of blindness. Progression of the disease can be prevented by early diagnosis. Owing to the uneven distribution of medical resources and low labor efficiency, the best time for diagnosis and treatment is often missed, which results in impaired vision. Using neural network models to classify and diagnose DR can improve efficiency and reduce costs. In this work, an improved loss function and three hybrid model structures, Hybrid-a, Hybrid-f, and Hybrid-c, are proposed to improve the performance of DR classification models. EfficientNetB4, EfficientNetB5, NASNetLarge, Xception, and InceptionResNetV2 CNNs were chosen as the base models. The base models were trained with the enhanced cross-entropy loss and the standard cross-entropy loss, respectively, and their outputs were used to train the hybrid model structures. Experiments showed that the enhanced cross-entropy loss effectively accelerates training of the base models and improves their performance under various evaluation metrics. The proposed hybrid model structures also improve DR classification performance: compared with the best-performing base model, accuracy improved from 85.44% to 86.34%, sensitivity from 98.48% to 98.77%, specificity from 71.82% to 74.76%, precision from 90.27% to 91.37%, and F1 score from 93.62% to 93.90%.
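The abstract does not spell out how Hybrid-a, Hybrid-f, and Hybrid-c combine the base models, so the sketch below shows only the simplest possible fusion, averaging the class-probability outputs of the base CNNs; it is a hypothetical stand-in, not the paper's method.

    import numpy as np

    def average_fusion(prob_outputs: list[np.ndarray]) -> np.ndarray:
        # Average the (batch x num_classes) probability outputs of several
        # base CNNs (e.g., EfficientNetB4/B5, NASNetLarge, Xception,
        # InceptionResNetV2); the DR grade is the argmax of the fused output.
        return np.mean(np.stack(prob_outputs, axis=0), axis=0)

    # Hypothetical usage:
    # fused = average_fusion([probs_b4, probs_b5, probs_nas, probs_xc, probs_irv2])
    # preds = fused.argmax(axis=1)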


2018 ◽  
Vol 11 (3) ◽  
pp. 386-403 ◽  
Author(s):  
M. Arif Wani ◽  
Saduf Afzal

Purpose: Many strategies have been put forward for training deep network models; however, stacking several layers of non-linearities typically results in poor propagation of gradients and activations. The purpose of this paper is to explore a two-step strategy in which an initial deep learning model is first obtained by unsupervised learning and then optimized by fine-tuning. A number of fine-tuning algorithms are explored for optimizing deep learning models, including a new algorithm in which Backpropagation with adaptive gain is integrated with the Dropout technique; the authors evaluate its performance in fine-tuning the pretrained deep network.

Design/methodology/approach: The parameters of deep neural networks are first learnt using greedy layer-wise unsupervised pretraining. The proposed technique is then used to perform supervised fine-tuning of the deep neural network model. An extensive experimental study evaluates the performance of the proposed fine-tuning technique on three benchmark data sets: USPS, Gisette and MNIST. The approach is tested on data sets of varying size, using randomly chosen training samples of 20, 50, 70 and 100 percent of the original data set.

Findings: The extensive experimental study shows that the two-step strategy and the proposed fine-tuning technique yield promising results in the optimization of deep network models.

Originality/value: This paper proposes employing several algorithms for fine-tuning deep network models. A new approach that integrates the adaptive gain Backpropagation (BP) algorithm with the Dropout technique is proposed for fine-tuning deep networks. An evaluation and comparison of the various fine-tuning algorithms on three benchmark data sets is presented.
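A minimal PyTorch sketch of the supervised fine-tuning step (the second step of the two-step strategy): a pretrained encoder is topped with Dropout and a classifier head and trained end to end. The authors' adaptive-gain Backpropagation variant is not reproduced here; plain SGD stands in for it, and all names are illustrative.

    import torch
    import torch.nn as nn

    def fine_tune(encoder: nn.Module, feat_dim: int, n_classes: int,
                  loader, epochs: int = 10) -> nn.Module:
        # Attach Dropout and a linear classifier to the pretrained encoder
        # (feat_dim is the encoder's output dimension), then fine-tune all
        # parameters with supervised labels.
        model = nn.Sequential(encoder, nn.Flatten(), nn.Dropout(p=0.5),
                              nn.Linear(feat_dim, n_classes))
        opt = torch.optim.SGD(model.parameters(), lr=0.01)
        loss_fn = nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                opt.step()
        return model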


2013 ◽  
Vol 16 (1) ◽  
pp. 31-73
Author(s):  
Lynn Y-S. Hou

Little is known about when and how children acquire plurality for directional verbs in ASL and other signed languages. This paper reports on an experimental study of 11 deaf native-signing children’s acquisition of ‘plural verb agreement’ or plural forms of directional verbs in American Sign Language. Eleven native-signing deaf adults were also tested. An elicitation task explored how children (aged 3;4 to 5;11) and adults marked directional verbs for plurality. The children also participated in an imitation task. Adults marked directional verbs for plurality significantly more often than children. However, adults also omitted plurality from directional verbs, utilizing alternative strategies to mark plural referents significantly more often than did children. Children across all ages omitted plurality, suggesting that the omission is attributable to both the conceptual complexity of plural markers and the optionality of number-marking. Directionality may not be best analyzed as a morphosyntactic phenomenon analogous to verb agreement morphology in spoken languages.

