Deep Learning Approach for Devanagari Script Recognition

2017 ◽  
Vol 17 (03) ◽  
pp. 1750016 ◽  
Author(s):  
S. Prabhanjan ◽  
R. Dinesh

In this paper, we propose a new technique for the recognition of handwritten Devanagari script using a deep learning architecture. In any OCR or classification system, extracting discriminating features is the most important and crucial step, and the accuracy of such a system often depends on a good feature representation. Deciding on appropriate features for a classification system is highly subjective and requires considerable experience; for handwritten Devanagari characters, it is very difficult to settle on an optimal feature set that yields a good recognition rate. Deep learning architectures instead operate on raw pixel values and learn hierarchies of features automatically. In this work, each image is first preprocessed to remove noise, converted to a binary image, resized to a fixed size of 30 × 40, and then converted to a grayscale image using a mask operation, which blurs the edges of the image. Features are then learned using unsupervised stacked Restricted Boltzmann Machines (RBMs) and used with a deep belief network for recognition. Finally, the network weight parameters are fine-tuned by supervised back-propagation learning to improve overall recognition performance. The proposed method has been tested on a large set of handwritten numerals, characters, vowel modifiers, and compound characters, and experimental results reveal that the unsupervised method yields a good accuracy of 83.44%, while fine-tuning the network parameters with supervised learning raises this to 91.81%, the best result reported so far for handwritten Devanagari characters.
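As a much-simplified illustration of the unsupervised RBM learning described above (not the authors' implementation; all sizes and names here are invented for the sketch), one contrastive-divergence (CD-1) update for a binary RBM on 30 × 40 binarized images can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    v0 : (batch, n_visible) binary inputs
    W  : (n_visible, n_hidden) weights
    b  : (n_visible,) visible bias; c : (n_hidden,) hidden bias
    """
    # Positive phase: hidden probabilities given the data.
    h0 = sigmoid(v0 @ W + c)
    # Sample hidden states, then reconstruct the visible layer.
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T + b)
    h1 = sigmoid(v1 @ W + c)
    # Gradient approximation: data statistics minus model statistics.
    W += lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (h0 - h1).mean(axis=0)
    return W, b, c

# Toy batch: 8 binarized vectors of length 1200 (30 x 40 pixels).
v = (rng.random((8, 1200)) > 0.5).astype(float)
W = 0.01 * rng.standard_normal((1200, 64))
b = np.zeros(1200)
c = np.zeros(64)
W, b, c = cd1_step(v, W, b, c)
features = sigmoid(v @ W + c)  # hidden activations = learned features
```

Stacking such RBMs (training each layer on the previous layer's hidden activations) yields the deep belief network, whose weights are then fine-tuned by supervised back-propagation.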

2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Zongyong Cui ◽  
Zongjie Cao ◽  
Jianyu Yang ◽  
Hongliang Ren

A hierarchical recognition system (HRS) based on a constrained Deep Belief Network (DBN) is proposed for SAR Automatic Target Recognition (SAR ATR). As a classical deep learning method, the DBN has shown great performance on data reconstruction, big data mining, and classification. However, few works have applied deep learning to small-data problems such as SAR ATR. In HRS, the deep structure and a pattern classifier are combined to solve small-data classification problems. After building the DBN from multiple Restricted Boltzmann Machines (RBMs), hierarchical features are obtained and fed directly to the classifier. To obtain a more naturally sparse feature representation, the Constrained RBM (CRBM) is proposed by solving a generalized optimization problem. Three RBM variants, the L1-RBM, L2-RBM, and L1/2-RBM, are presented and introduced into HRS in this paper. Experiments on the public MSTAR dataset show that the proposed HRS with CRBM outperforms current pattern recognition methods in SAR ATR, such as PCA + SVM, LDA + SVM, and NMF + SVM.
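To sketch how an L1 constraint can be folded into RBM training (a loose illustration of the idea, not the paper's generalized optimization problem; biases are omitted and the L1 penalty is placed on the weights for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_l1_step(v0, W, lam=0.01, lr=0.1):
    """CD-1 update with an L1 penalty on the weights. The subgradient
    term lam * sign(W) shrinks weights toward zero, encouraging sparser
    feature detectors (the motivation behind the constrained RBM)."""
    h0 = sigmoid(v0 @ W)          # mean-field hidden activations
    v1 = sigmoid(h0 @ W.T)        # reconstruction
    h1 = sigmoid(v1 @ W)
    grad = (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
    W += lr * (grad - lam * np.sign(W))
    return W

v = (rng.random((16, 100)) > 0.5).astype(float)
W = 0.1 * rng.standard_normal((100, 32))
for _ in range(50):
    W = cd1_l1_step(v, W)
l1_norm = np.abs(W).sum()  # the penalty keeps this norm in check
```

An L2 variant would replace `np.sign(W)` with `W`; the L1/2 penalty of the paper uses a non-convex term and needs a more careful update rule than this sketch shows.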


2014 ◽  
Vol 2 (2) ◽  
pp. 43-53 ◽  
Author(s):  
S. Rojathai ◽  
M. Venkatesulu

In speech word recognition systems, feature extraction and recognition play the most significant roles. Numerous feature extraction and recognition methods are available in existing speech word recognition systems. The most recent Tamil speech word recognition system achieved high recognition performance with PAC-ANFIS compared with earlier Tamil systems, so an investigation of speech word recognition with various recognition methods is needed to establish their performance. This paper presents such an investigation with two well-known artificial intelligence methods: the Feed Forward Back Propagation Neural Network (FFBNN) and the Adaptive Neuro-Fuzzy Inference System (ANFIS). The performance of the Tamil speech word recognition system with PAC-FFBNN is analyzed in terms of statistical measures and Word Recognition Rate (WRR) and compared with PAC-ANFIS and other existing Tamil speech word recognition systems.


Electronics ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 135 ◽  
Author(s):  
Siti Nurmaini ◽  
Annisa Darmawahyuni ◽  
Akhmad Noviar Sakti Mukti ◽  
Muhammad Naufal Rachmatullah ◽  
Firdaus Firdaus ◽  
...  

The electrocardiogram (ECG) is a widely used, noninvasive test for analyzing arrhythmia. However, the ECG signal is prone to contamination by different kinds of noise. Such noise may deform the ECG heartbeat waveform, leading cardiologists to mislabel or misinterpret heartbeats due to varying types of artifacts and interference. To address this problem, some previous studies have proposed computerized techniques based on machine learning (ML) to distinguish between normal and abnormal heartbeats. Unfortunately, ML relies on a handcrafted, feature-based approach and lacks rich feature representation. To overcome these drawbacks, deep learning (DL) is applied in pre-training and fine-tuning phases to produce an automated feature representation for multi-class classification of arrhythmia conditions. In the pre-training phase, stacked denoising autoencoders (DAEs) and autoencoders (AEs) are used for feature learning; in the fine-tuning phase, deep neural networks (DNNs) are implemented as a classifier. To the best of our knowledge, this research is the first to implement stacked autoencoders by using DAEs and AEs for feature learning in DL. The experiments use Physionet's well-known MIT-BIH Arrhythmia Database as well as the MIT-BIH Noise Stress Test Database (NSTDB). Only four records are used from the NSTDB: 118 24 dB, 118 −6 dB, 119 24 dB, and 119 −6 dB, covering two signal-to-noise ratios (SNRs) of 24 dB and −6 dB. In the validation process, six models are compared to select the best DL model. With all hyperparameters fine-tuned, the best ECG heartbeat classification model achieves an accuracy, sensitivity, specificity, precision, and F1-score of 99.34%, 93.83%, 99.57%, 89.81%, and 91.44%, respectively. As the results demonstrate, the proposed DL model can extract high-level features not only from the training data but also from unseen data. Such a model has good application prospects in clinical practice.
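A denoising autoencoder of the kind used in the pre-training phase can be sketched in a few lines (a toy one-layer version with tied weights on synthetic data, not the paper's network; the architecture, sizes, and learning rate here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def train_dae(x, n_hidden=16, noise=0.3, lr=0.05, epochs=200):
    """Train a one-layer denoising autoencoder with tied weights.

    Each epoch corrupts the input with Gaussian noise, but the network
    is trained to reconstruct the CLEAN signal, so the hidden layer
    learns a noise-robust feature representation."""
    n = x.shape[1]
    W = 0.1 * rng.standard_normal((n, n_hidden))
    for _ in range(epochs):
        x_noisy = x + noise * rng.standard_normal(x.shape)
        h = np.tanh(x_noisy @ W)   # encode corrupted input
        x_hat = h @ W.T            # decode (tied weights)
        err = x_hat - x            # error against the clean input
        # Backprop through decoder and encoder (tanh' = 1 - h^2).
        dW = x_noisy.T @ ((err @ W) * (1 - h**2)) + err.T @ h
        W -= lr * dW / x.shape[0]
    return W

x = rng.standard_normal((64, 20))   # stand-in for heartbeat segments
W = train_dae(x)
h_clean = np.tanh(x @ W)            # features for a downstream DNN classifier
```

In the paper's pipeline, several such layers are stacked and the resulting encoder weights initialize the DNN that is then fine-tuned for multi-class arrhythmia classification.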


2020 ◽  
Vol 70 (2) ◽  
pp. 234-238
Author(s):  
K.S. Imanbaev ◽  

Currently, deep learning of neural networks is one of the most popular methods for speech recognition, natural language processing, and computer vision. The article reviews the history of deep learning of neural networks and the current state of the field in general. We consider algorithms used for the deep training of neural networks, followed by fine-tuning using back-propagation of errors. Neural networks with large numbers of hidden layers are very difficult to train, owing to exploding and vanishing gradients. In this paper, we consider methods that successfully train neural networks with large numbers of layers (more than one hundred) despite vanishing gradients. A review of well-known libraries used for successful deep learning of neural networks is also conducted.
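The vanishing-gradient problem the review refers to is easy to demonstrate numerically (a self-contained toy sketch, not tied to any particular library from the review):

```python
import numpy as np

def grad_norm(n_layers, width=50, seed=3):
    """Push an input through n_layers plain sigmoid layers, then
    backpropagate a unit gradient and return its norm. Because the
    sigmoid derivative is at most 0.25, the gradient shrinks roughly
    geometrically with depth -- the vanishing-gradient problem."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    layers = []
    for _ in range(n_layers):
        W = rng.standard_normal((width, width)) / np.sqrt(width)
        x = 1.0 / (1.0 + np.exp(-(W @ x)))   # forward pass
        layers.append((W, x))
    g = np.ones(width)                        # gradient at the output
    for W, s in reversed(layers):
        g = W.T @ (g * s * (1.0 - s))         # chain rule per sigmoid layer
    return float(np.linalg.norm(g))
```

Techniques the review surveys for very deep networks (over one hundred layers) work precisely by breaking this geometric decay, for example with skip connections or gradient-friendly activations.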


2020 ◽  
pp. 1-30
Author(s):  
Diana Nicoleta Popa ◽  
Julien Perez ◽  
James Henderson ◽  
Eric Gaussier

Distributional semantic word representations are at the basis of most modern NLP systems. Their usefulness has been proven across various tasks, particularly as inputs to deep learning models. Beyond that, much work investigated fine-tuning the generic word embeddings to leverage linguistic knowledge from large lexical resources. Some work investigated context-dependent word token embeddings motivated by word sense disambiguation, using sequential context and large lexical resources. More recently, acknowledging the need for an in-context representation of words, some work leveraged information derived from language modelling and large amounts of data to induce contextualised representations. In this paper, we investigate Syntax-Aware word Token Embeddings (SATokE) as a way to explicitly encode specific information derived from the linguistic analysis of a sentence in vectors which are input to a deep learning model. We propose an efficient unsupervised learning algorithm based on tensor factorisation for computing these token embeddings given an arbitrary graph of linguistic structure. Applying this method to syntactic dependency structures, we investigate the usefulness of such token representations as part of deep learning models of text understanding. We encode a sentence either by learning embeddings for its tokens and the relations between them from scratch or by leveraging pre-trained relation embeddings to infer token representations. Given sufficient data, the former is slightly more accurate than the latter, yet both provide more informative token embeddings than standard word representations, even when the word representations have been learned on the same type of context from larger corpora (namely pre-trained dependency-based word embeddings). We use a large set of supervised tasks and two major deep learning families of models for sentence understanding to evaluate our proposal.
We empirically demonstrate the superiority of the token representations compared to popular distributional representations of words for various sentence and sentence pair classification tasks.
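The paper factorises a relation-typed tensor over a full linguistic graph; as a loose, hypothetical illustration of the underlying idea only, plain matrix factorisation of a toy dependency adjacency matrix already yields one vector per token:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy dependency graph over 6 tokens: A[i, j] = 1 if j is the head of i.
# The edge list is invented for this sketch.
A = np.zeros((6, 6))
for dep, head in [(0, 2), (1, 2), (3, 2), (4, 5), (5, 2)]:
    A[dep, head] = 1.0

def factorise(A, dim=3, lr=0.1, epochs=500):
    """Learn token embeddings E such that E @ E.T approximates the
    adjacency matrix. (SATokE factorises a relation-typed tensor with
    separate relation embeddings; this rank-dim sketch omits that.)"""
    n = A.shape[0]
    E = 0.1 * rng.standard_normal((n, dim))
    for _ in range(epochs):
        err = E @ E.T - A
        E -= lr * (err + err.T) @ E / n   # gradient of 0.5 * ||E E^T - A||^2
    return E

E = factorise(A)   # row i is the structure-aware embedding of token i
```

Tokens that occupy similar positions in the graph end up with similar vectors, which is the intuition behind feeding such embeddings into sentence-understanding models.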


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
JinGen Tang

This paper investigates the extraction of volleyball players' skeleton information and provides a deep learning-based solution for recognizing the players' actions, using a convolutional neural network. Because the Lie group skeleton representation yields features of high dimensionality, the convolutional neural network is used for feature learning and classification, in order to handle the high-dimensional data, reduce the complexity of the recognition process, and speed up computation. In the feature extraction stage, the Lie group skeleton representation model extracts geometric features from the skeleton information; in the feature representation stage, the geometric transformations (rotations and translations) between different limbs represent the volleyball players' movements. The approach is evaluated on the Florence3D Actions, MSR Action Pairs, and UTKinect Action datasets. The average recognition rate of the method is 93.00%, higher than that of prominent existing work, reflecting better accuracy and robustness.
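The rotation between two limbs, the basic geometric transformation in such a Lie group skeleton representation, can be computed with Rodrigues' formula (an illustrative sketch; the limb vectors below are invented, and the degenerate antiparallel case is left unhandled):

```python
import numpy as np

def limb_rotation(a, b):
    """Rotation matrix mapping unit limb direction a onto b, via
    Rodrigues' formula: R = I + K + K^2 / (1 + cos(theta)), where K is
    the skew-symmetric cross-product matrix of a x b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.isclose(c, -1.0):   # antiparallel limbs: axis is ambiguous
        raise ValueError("degenerate case: choose an explicit axis")
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + K @ K / (1.0 + c)

# Hypothetical limb directions from a skeleton frame.
upper_arm = np.array([0.0, 1.0, 0.2])
forearm = np.array([0.5, 0.8, 0.0])
R = limb_rotation(upper_arm, forearm)   # relative pose of the two limbs
```

Sequences of such rotations (elements of SO(3), a Lie group) over the frames of a video are what the features describing a player's movement are built from.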


Micromachines ◽  
2019 ◽  
Vol 10 (4) ◽  
pp. 245 ◽  
Author(s):  
Pham ◽  
Nguyen ◽  
Min

A real memristor crossbar has defects, which should be considered during the retraining that follows the crossbar's pre-training. For retraining a crossbar with defects, memristors should be updated with the weights calculated by the back-propagation algorithm. Unfortunately, programming the memristors takes a very long time and consumes a large amount of power, because of the incremental program-verify scheme needed for fine-tuning each memristor's conductance. To reduce programming time and power, a partial gating scheme is proposed here to realize partial training, in which only the neurons most responsible for the recognition error are trained. By retraining this part, rather than the entire crossbar, the programming time and power of the memristor crossbar can be significantly reduced. The proposed scheme has been verified by CADENCE circuit simulation with a real memristor's Verilog-A model. Compared with retraining the entire crossbar, the loss in recognition rate under the partial gating scheme is estimated at only 2.5% and 2.9% for the MNIST and CIFAR-10 datasets, respectively, while programming time and power are reduced by 86% and 89.5%, respectively, relative to 100% retraining.
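In software terms, the partial training idea amounts to masking the weight update so that only the rows of the most error-responsible neurons are reprogrammed (a hypothetical sketch of the selection logic only; the sizes, the 20% fraction, and the error scores below are invented and the circuit-level gating is not modeled):

```python
import numpy as np

rng = np.random.default_rng(5)

def partial_update(W, grad, error_per_neuron, fraction=0.2, lr=0.01):
    """Apply the back-propagation update only to the output-neuron rows
    that contribute most to the recognition error, leaving the rest of
    the (memristor) weight matrix untouched -- fewer program-verify
    cycles, at the cost of a small accuracy loss."""
    k = max(1, int(fraction * W.shape[0]))
    worst = np.argsort(error_per_neuron)[-k:]   # k most responsible neurons
    W = W.copy()
    W[worst] -= lr * grad[worst]                # update only those rows
    return W, worst

W = rng.standard_normal((10, 4))       # toy crossbar weights
grad = rng.standard_normal((10, 4))    # toy back-propagation gradients
err = rng.random(10)                   # toy per-neuron error scores
W_new, trained = partial_update(W, grad, err)
```

The programming-cost saving scales with the fraction of rows left untouched, which is why retraining 20% of the neurons can cut programming time by most of the reported 86%.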


In the past decade, deep learning has achieved significant breakthroughs. In addition to the emergence of convolution, the most important development is the self-learning of deep neural networks. Through self-learning, the adaptive weights of kernels and built-in parameters or interconnections are automatically modified so that the error rate is reduced over the course of learning and the recognition rate is improved. By emulating mechanisms of the brain, such a network can achieve accurate recognition after learning. One of the most important self-learning methods is back-propagation (BP). The current BP method is a systematic way of calculating the gradient of the loss with respect to the adaptive interconnections. The core of the gradient descent method is to modify the weights in negative proportion to the computed gradient of the loss function, thereby reducing the error of the network response relative to the reference answer. The basic assumption of this type of gradient-based self-learning is that the loss function is first-order differentiable.
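The weight-update rule described above can be stated in a few lines (a generic sketch of gradient descent on a simple differentiable loss, with an invented quadratic example rather than a network loss):

```python
import numpy as np

def gradient_descent(grad_fn, w0, lr=0.1, steps=100):
    """Plain gradient descent: move the weights against the gradient of
    the loss, so each step reduces the error as long as the loss is
    first-order differentiable and the learning rate is small enough."""
    w = w0.astype(float)
    for _ in range(steps):
        w -= lr * grad_fn(w)   # update negatively proportional to gradient
    return w

# Example: minimise L(w) = ||w - t||^2, whose gradient is 2 * (w - t).
t = np.array([1.0, -2.0, 0.5])
w = gradient_descent(lambda w: 2.0 * (w - t), np.zeros(3))
```

Back-propagation is exactly the bookkeeping that produces `grad_fn` for a multi-layer network by the chain rule.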


Author(s):  
Shikha Bhardwaj ◽  
Gitanjali Pandove ◽  
Pawan Kumar Dahiya

Background: In order to retrieve a particular image from a vast repository of images, an efficient system is required; such a system is well known as a content-based image retrieval (CBIR) system. Color is an important attribute of an image, and the proposed system consists of a hybrid color descriptor used for color feature extraction. Deep learning has gained prominent importance in the current era, so the performance of this fusion-based color descriptor is also analyzed in the presence of deep learning classifiers. Method: This paper describes a comparative experimental analysis of various color descriptors; the best two are chosen to form an efficient color-based hybrid system, denoted combined color moment-color autocorrelogram (Co-CMCAC). Then, to increase the retrieval accuracy of the hybrid system, a cascade forward back-propagation neural network (CFBPNN) is used. The classification accuracy obtained using CFBPNN is also compared with a Patternnet neural network. Results: The results of the hybrid color descriptor show that the proposed system achieves superior results of 95.4%, 88.2%, 84.4%, and 96.05% on the Corel-1K, Corel-5K, Corel-10K, and Oxford Flower benchmark datasets, respectively, compared with many state-of-the-art techniques. Conclusion: This paper presents an experimental and analytical study of different color feature descriptors, namely the color moment (CM), color auto-correlogram (CAC), color histogram (CH), color coherence vector (CCV), and dominant color descriptor (DCD). The proposed hybrid color descriptor (Co-CMCAC) is utilized for the extraction of color features, with a cascade forward back-propagation neural network (CFBPNN) as the classifier, on four benchmark datasets: Corel-1K, Corel-5K, Corel-10K, and Oxford Flower.
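The color-moment half of such a descriptor is simple to compute: the first three statistical moments per channel give a compact 9-dimensional feature (an illustrative sketch on a random image, not the paper's Co-CMCAC implementation, which also fuses the autocorrelogram):

```python
import numpy as np

rng = np.random.default_rng(6)

def color_moments(img):
    """First three color moments (mean, standard deviation, skewness)
    for each channel of an H x W x 3 image, concatenated into a
    9-dimensional color descriptor."""
    feats = []
    for ch in range(img.shape[2]):
        x = img[:, :, ch].ravel().astype(float)
        mu = x.mean()
        sd = x.std()
        skew = np.cbrt(((x - mu) ** 3).mean())  # cube root keeps pixel units
        feats.extend([mu, sd, skew])
    return np.array(feats)

img = rng.integers(0, 256, size=(32, 32, 3))   # toy RGB image
f = color_moments(img)                          # 9-D color-moment feature
```

Vectors like `f`, concatenated with autocorrelogram features, are what the CFBPNN classifier would consume in the described pipeline.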

