DevNet: An Efficient CNN Architecture for Handwritten Devanagari Character Recognition

Author(s):  
Riya Guha ◽  
Nibaran Das ◽  
Mahantapas Kundu ◽  
Mita Nasipuri ◽  
K. C. Santosh

Writing style is a unique characteristic of a human being, as it varies from one person to another. Due to this diversity in writing styles, handwritten character recognition (HCR), within the purview of pattern recognition, is not trivial. Conventional methods used handcrafted features that require a priori domain knowledge, which is not always feasible. In such cases, extracting features automatically becomes more attractive. For this, convolutional neural networks (CNNs) have been a popular approach in the literature for extracting features from image data. However, state-of-the-art works do not provide a generic CNN model for character recognition in, for instance, the Devanagari script. Therefore, in this work, we first study several different CNN models on publicly available handwritten Devanagari character and numeral datasets, focusing primarily on a comparative study that takes trainable parameters, training time, and memory consumption into account. We then propose and design DevNet, a modified CNN architecture that produces promising results, since computational complexity and memory footprint are our primary design concerns.
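Trainable parameter count, one of the abstract's three comparison criteria, can be computed directly from layer shapes. The sketch below is purely illustrative: the layer sizes, 32x32 input, and 46-class output are assumptions for a toy stack, not the actual DevNet architecture.

```python
# Hypothetical sketch: counting trainable parameters of a small CNN,
# one of the comparison criteria (parameters, training time, memory).
# Layer shapes are illustrative, not the actual DevNet architecture.

def conv2d_params(in_ch, out_ch, k):
    # weights (k*k*in_ch per filter) plus one bias per filter
    return out_ch * (k * k * in_ch + 1)

def dense_params(n_in, n_out):
    # fully connected layer: weight matrix plus biases
    return n_out * (n_in + 1)

# A toy stack for 32x32 grayscale glyph images (assumed input size)
layers = [
    ("conv1", conv2d_params(1, 32, 3)),
    ("conv2", conv2d_params(32, 64, 3)),
    ("fc1",   dense_params(64 * 8 * 8, 128)),   # after two 2x2 poolings
    ("fc2",   dense_params(128, 46)),           # assumed 46 output classes
]

total = sum(p for _, p in layers)
for name, p in layers:
    print(f"{name}: {p:,} parameters")
print(f"total: {total:,}")
```

Note how the fully connected layers dominate the count, which is why parameter-frugal architectures shrink or replace them first.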

2022 ◽  
Vol 13 (1) ◽  
pp. 1-20
Author(s):  
Wen-Cheng Chen ◽  
Wan-Lun Tsai ◽  
Huan-Hua Chang ◽  
Min-Chun Hu ◽  
Wei-Ta Chu

Tactic learning in virtual reality (VR) has been proven effective for basketball training. Endowed with the ability to generate virtual defenders in real time according to the movement of virtual offenders controlled by the user, a VR basketball training system can offer the trainee a more immersive and realistic experience. In this article, an autoregressive generative model for instantly producing basketball defensive trajectories is introduced. We further focus on preserving the diversity of the generated trajectories. A differentiable sampling mechanism is adopted to learn a continuous Gaussian distribution over player positions. Moreover, several heuristic loss functions based on basketball domain knowledge are designed to make the generated trajectories resemble real game situations. We compare the proposed method with state-of-the-art works using both objective and subjective evaluations. The objective evaluation compares the average position, velocity, and acceleration of the generated defensive trajectories with real ones to assess the fidelity of the results; higher-level aspects such as the empty space left to the offender and the defensive pressure of the generated trajectory are also considered. For the subjective evaluation, visual comparison questionnaires on the proposed and competing methods were thoroughly conducted. The experimental results show that the proposed method outperforms previous basketball defensive trajectory generation works across the different evaluation metrics.
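The "differentiable sampling mechanism" for a Gaussian over player positions is commonly realized with the reparameterization trick: the network predicts a mean and log-variance, and the sample mu + sigma * eps keeps gradients flowing through mu and sigma. A minimal numpy sketch under that assumption (the values and shapes are illustrative, not from the paper):

```python
import numpy as np

# Sketch of reparameterized Gaussian sampling of a player position:
# the model predicts (mu, log_var); noise eps is drawn outside the
# computation graph so the sample stays differentiable w.r.t. mu, sigma.
rng = np.random.default_rng(0)

def sample_position(mu, log_var, rng):
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal(mu.shape)   # stochastic, gradient-free part
    return mu + sigma * eps               # deterministic, differentiable part

mu = np.array([7.5, 3.0])        # hypothetical (x, y) court position, metres
log_var = np.array([-2.0, -2.0]) # small predicted variance
pos = sample_position(mu, log_var, rng)
print(pos.shape)  # (2,)
```

Sampling rather than taking the mean is what preserves trajectory diversity: repeated rollouts from the same state produce different but plausible defensive motions.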


1998 ◽  
Vol 10 (8) ◽  
pp. 2175-2200 ◽  
Author(s):  
Holger Schwenk

We present a new classification architecture based on autoassociative neural networks that are used to learn discriminant models of each class. The proposed architecture has several interesting properties with respect to other model-based classifiers like nearest-neighbors or radial basis functions: it has a low computational complexity and uses a compact distributed representation of the models. The classifier is also well suited for the incorporation of a priori knowledge by means of a problem-specific distance measure. In particular, we will show that tangent distance (Simard, Le Cun, & Denker, 1993) can be used to achieve transformation invariance during learning and recognition. We demonstrate the application of this classifier to optical character recognition, where it has achieved state-of-the-art results on several reference databases. Relations to other models, in particular those based on principal component analysis, are also discussed.
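The core idea of classifying with per-class autoassociative models can be sketched with one linear autoassociator (a PCA subspace) per class: a test point is assigned to the class whose subspace reconstructs it with the smallest error. This toy 2-D version uses plain Euclidean reconstruction error, not the paper's tangent-distance variant, and the data is synthetic:

```python
import numpy as np

# One linear autoassociator (PCA subspace) per class; classify a point
# by the smallest reconstruction error across the class models.
def fit_subspace(X, k):
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]            # mean and k principal directions

def recon_error(x, model):
    mean, V = model
    z = (x - mean) @ V.T           # encode into the subspace
    x_hat = mean + z @ V           # decode back to input space
    return np.sum((x - x_hat) ** 2)

rng = np.random.default_rng(1)
class_a = rng.normal([0, 0], [3.0, 0.1], size=(50, 2))   # elongated along x
class_b = rng.normal([0, 0], [0.1, 3.0], size=(50, 2))   # elongated along y
models = [fit_subspace(class_a, 1), fit_subspace(class_b, 1)]

x = np.array([2.0, 0.0])           # lies along class A's principal axis
pred = int(np.argmin([recon_error(x, m) for m in models]))
print(pred)
```

The compactness claim in the abstract follows from this structure: each class is stored as a mean plus a few directions, instead of the full training set a nearest-neighbour classifier would need.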


Author(s):  
Chandra Kusuma Dewa ◽  
Amanda Lailatul Fadhilah ◽  
A Afiahayati

The convolutional neural network (CNN) is a state-of-the-art method for object recognition tasks. Specialized for spatially structured input, a CNN has convolutional and pooling layers that enable hierarchical feature learning from the input space. For offline handwritten character recognition problems, such as classifying characters in the MNIST database, CNNs show better classification results than other methods. Leveraging these advantages, in this paper we developed software that combines digital image processing methods with a CNN module for offline handwritten Javanese character recognition. The software segments a captured handwritten Javanese character image using contour detection and Canny edge detection with the OpenCV library, and the CNN classifies each segmented image into one of 20 classes of Javanese letters. For evaluation, we compared the CNN to a multilayer perceptron (MLP) on classification accuracy and training time. Experimental results show that the CNN's test accuracy outperforms the MLP's, although the CNN needs more training time than the MLP.
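The paper's segmentation step uses OpenCV contour and Canny edge detection; as a dependency-free stand-in, the sketch below shows the simpler column-projection idea behind character segmentation: split a binarized line image into character boxes wherever a column contains no ink. It is an illustration of the segmentation stage, not the authors' pipeline.

```python
import numpy as np

# Column-projection segmentation sketch: a character box spans every
# maximal run of columns that contain at least one ink pixel.
def segment_columns(binary):                 # binary: 2-D array of 0/1 ink
    ink = binary.sum(axis=0) > 0             # which columns contain ink
    boxes, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                        # a character run begins
        elif not has_ink and start is not None:
            boxes.append((start, x))         # the run just ended
            start = None
    if start is not None:
        boxes.append((start, len(ink)))      # run reaches the right edge
    return boxes

img = np.zeros((8, 12), dtype=int)
img[2:6, 1:4] = 1    # first "character"
img[1:7, 6:10] = 1   # second "character"
print(segment_columns(img))  # [(1, 4), (6, 10)]
```

Contour-based segmentation, as used in the paper, handles touching and vertically stacked glyphs better than this projection heuristic, which is why real systems prefer it.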


Author(s):  
Mohamed Elleuch ◽  
Monji Kherallah

In recent years, deep learning (DL) based systems have become very popular for constructing hierarchical representations from unlabeled data. Moreover, DL approaches have been shown to exceed previous state-of-the-art machine learning models in various areas, with pattern recognition being one of the most important cases. This paper applies Convolutional Deep Belief Networks (CDBN) to textual image data containing Arabic handwritten script (AHS) and evaluates them on two databases characterized by low and high dimensionality, respectively. In addition to the benefits provided by deep networks, the system is protected against over-fitting. Experimentally, the authors demonstrate that the extracted features are effective for handwritten character recognition and show very good performance, comparable to the state of the art in handwritten text recognition. Using Dropout, the proposed CDBN architectures achieved promising accuracy rates of 91.55% and 98.86% on the IFN/ENIT and HACDB databases, respectively.
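Dropout, the regularizer credited above with protecting the CDBN against over-fitting, is simple to state: at training time each activation is kept with probability p and rescaled so its expected value is unchanged. A minimal numpy sketch of the standard inverted-dropout formulation (not tied to the authors' implementation):

```python
import numpy as np

# Inverted dropout: zero each activation with probability (1 - p_keep)
# and divide the survivors by p_keep so E[output] == input.
def dropout(x, p_keep, rng):
    mask = rng.random(x.shape) < p_keep
    return x * mask / p_keep

rng = np.random.default_rng(0)
acts = np.ones((4, 5))
out = dropout(acts, p_keep=0.8, rng=rng)
print(sorted(set(np.round(out.ravel(), 2))))  # entries are 0.0 or 1.25
```

At test time the layer is simply an identity, which is what makes the inverted form convenient: no rescaling is needed at inference.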


Author(s):  
K. Suzuki ◽  
M. Claesen ◽  
H. Takeda ◽  
B. De Moor

Deep learning has recently been in the spotlight owing to its victories at major competitions, which has undeservedly pushed ‘shallow’ machine learning methods (relatively simple, handy algorithms commonly used by industrial engineers) into the background, despite advantages such as the small amounts of time and data they require for training. Taking a practical point of view, we used shallow learning algorithms to construct a learning pipeline that lets operators apply machine learning without specialist knowledge, an expensive computing environment, or a large amount of labelled data. The proposed pipeline automates the whole classification process: feature selection, feature weighting, and the selection of the most suitable classifier with optimized hyperparameters. The configuration uses particle swarm optimization, a well-known metaheuristic algorithm chosen for its generally fast and fine-grained optimization, which enables us not only to optimize (hyper)parameters but also to determine appropriate features and classifiers for the problem; these choices have conventionally been made a priori from domain knowledge or handled with naive algorithms such as grid search. Through experiments on the MNIST and CIFAR-10 datasets, common computer vision benchmarks for character recognition and object recognition respectively, our automated learning approach delivers high performance considering its simple, non-specialized setting, small amount of training data, and practical training time. Moreover, compared to deep learning, its performance remains robust with almost no modification even on a remote sensing object recognition problem, which suggests that our approach can contribute to general classification problems.
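Particle swarm optimization, the metaheuristic the pipeline builds on, is compact enough to sketch in full. Here it tunes a single dummy "hyperparameter" by minimizing a toy loss with a known minimum; the inertia and attraction coefficients are common textbook defaults, not values from the paper.

```python
import numpy as np

# Minimal 1-D particle swarm optimization: each particle is pulled toward
# its own best position (pbest) and the swarm's best position (gbest).
def pso(loss, lo, hi, n_particles=20, iters=60, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lo, hi, n_particles)
    vel = np.zeros(n_particles)
    pbest, pbest_val = pos.copy(), np.array([loss(p) for p in pos])
    gbest = pbest[pbest_val.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        val = np.array([loss(p) for p in pos])
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], val[improved]
        gbest = pbest[pbest_val.argmin()]
    return gbest

# Toy loss with its minimum at 3.0, standing in for a validation error
best = pso(lambda x: (x - 3.0) ** 2, lo=0.0, hi=10.0)
print(round(float(best), 2))
```

In the pipeline's setting, the particle would be a vector encoding feature choices, classifier choice, and hyperparameters, and the loss would be a cross-validated error; the update rule is unchanged.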


2020 ◽  
Vol 2020 (1) ◽  
pp. 78-81
Author(s):  
Simone Zini ◽  
Simone Bianco ◽  
Raimondo Schettini

Rain removal from pictures taken under bad weather conditions is a challenging task that aims to improve the overall quality and visibility of a scene. The enhanced images usually constitute the input for subsequent computer vision tasks such as detection and classification. In this paper, we present a convolutional neural network, based on the Pix2Pix model, for removing rain streaks from images, with a specific interest in evaluating the results of the processing with respect to the optical character recognition (OCR) task. In particular, we present a way to generate a rainy version of the Street View Text Dataset (R-SVTD) for evaluating text detection and recognition in bad weather conditions. Experimental results on this dataset show that our model outperforms the state of the art in terms of two commonly used image quality metrics, and that it improves the performance of an OCR model detecting and recognising text in the wild.


Author(s):  
Michael Withnall ◽  
Edvard Lindelöf ◽  
Ola Engkvist ◽  
Hongming Chen

We introduce Attention and Edge Memory schemes to the existing Message Passing Neural Network framework for graph convolution, and benchmark our approaches against eight different physical-chemical and bioactivity datasets from the literature. We remove the need to introduce a priori knowledge of the task and chemical descriptor calculation by using only fundamental graph-derived properties. Our results consistently perform on-par with other state-of-the-art machine learning approaches, and set a new standard on sparse multi-task virtual screening targets. We also investigate model performance as a function of dataset preprocessing, and make some suggestions regarding hyperparameter selection.
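The attention idea in a message-passing step can be illustrated in a few lines: each node aggregates its neighbours' features, weighted by softmax attention scores. The dot-product scoring and the shapes below are illustrative conventions, not the paper's specific Attention or Edge Memory scheme.

```python
import numpy as np

# Toy attention-weighted message-passing step on a small graph:
# node i's new feature is a softmax-weighted sum of its neighbours.
def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_step(h, adj):
    h_new = np.zeros_like(h)
    for i in range(len(h)):
        nbrs = np.where(adj[i])[0]
        scores = softmax(h[nbrs] @ h[i])       # dot-product attention
        h_new[i] = scores @ h[nbrs]            # weighted message sum
    return h_new

h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # node features
adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=bool)  # triangle
out = attention_step(h, adj)
print(out.shape)  # (3, 2)
```

Because the weights sum to one, each output row is a convex combination of neighbour features; stacking several such steps propagates information over longer graph distances.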


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Dominik Jens Elias Waibel ◽  
Sayedali Shetab Boushehri ◽  
Carsten Marr

Background: Deep learning contributes to uncovering molecular and cellular processes with highly performant algorithms. Convolutional neural networks have become the state-of-the-art tool to provide accurate and fast image data processing. However, published algorithms mostly solve only one specific problem and they typically require a considerable coding effort and machine learning background for their application.
Results: We have thus developed InstantDL, a deep learning pipeline for four common image processing tasks: semantic segmentation, instance segmentation, pixel-wise regression and classification. InstantDL enables researchers with a basic computational background to apply debugged and benchmarked state-of-the-art deep learning algorithms to their own data with minimal effort. To make the pipeline robust, we have automated and standardized workflows and extensively tested it in different scenarios. Moreover, it allows assessing the uncertainty of predictions. We have benchmarked InstantDL on seven publicly available datasets achieving competitive performance without any parameter tuning. For customization of the pipeline to specific tasks, all code is easily accessible and well documented.
Conclusions: With InstantDL, we hope to empower biomedical researchers to conduct reproducible image processing with a convenient and easy-to-use pipeline.


Mathematics ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 624
Author(s):  
Stefan Rohrmanstorfer ◽  
Mikhail Komarov ◽  
Felix Mödritscher

With the ever-increasing amount of image data, it has become necessary to automatically find and process the information in these images. As fashion is captured in images, the fashion sector provides the perfect foundation for a service or application built on an image classification model. In this article, the state of the art for image classification is analyzed and discussed. Based on this knowledge, four different approaches are implemented to extract features from fashion data. For this purpose, a human-worn fashion dataset with 2567 images was created and then significantly enlarged through image augmentation operations. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them with. Moreover, through the introduction of dropout layers, data augmentation and transfer learning, model overfitting was successfully prevented, and the validation accuracy on the created dataset was incrementally improved from an initial 69% to a final 84%. More distinctive apparel such as trousers, shoes and hats was classified better than other upper-body clothes.
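The dataset enlargement mentioned above follows the standard augmentation recipe: each source image yields several transformed copies. The sketch below uses flips and a 90-degree rotation as examples; the article's exact set of operations is not specified in this abstract.

```python
import numpy as np

# Each source image produces multiple training variants; labels carry over
# unchanged, multiplying the effective dataset size.
def augment(img):
    return [
        img,
        np.fliplr(img),            # horizontal mirror
        np.flipud(img),            # vertical mirror
        np.rot90(img),             # 90-degree counter-clockwise rotation
    ]

img = np.arange(9).reshape(3, 3)   # stand-in for an image array
batch = augment(img)
print(len(batch))  # 4 variants per source image
```

For clothing images, horizontal flips are usually safe while vertical flips and large rotations may not be, so in practice the operation set is chosen per domain.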

