Online Turkish Handwriting Recognition Using Synthetic Data

Author(s):  
Esma Fatıma BİLGİN TAŞDEMİR
Author(s):  
L. P. CORDELLA ◽  
C. DE STEFANO ◽  
F. FONTANELLA

A new prototyping method based on the evolutionary computation paradigm and on the concept of Vector Quantization is proposed. It uses a specifically devised evolutionary algorithm for evolving a set of prototype feature vectors and does not require any a priori knowledge about either the actual number of prototypes or the statistical properties of the input data. Experiments performed by using both synthetic data and handwritten digits randomly extracted from the NIST database have confirmed the effectiveness of the approach.
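The idea of evolving prototype feature vectors against a vector-quantization objective can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses a plain (1+1) evolution strategy, fixes the number of prototypes in advance (the abstract's method does not need this), and all names (`quantization_error`, `evolve_prototypes`, `sigma`) are hypothetical choices made for illustration.

```python
import random

def quantization_error(data, prototypes):
    """Mean squared distance from each sample to its nearest prototype."""
    total = 0.0
    for x in data:
        total += min(sum((a - b) ** 2 for a, b in zip(x, p)) for p in prototypes)
    return total / len(data)

def evolve_prototypes(data, n_prototypes=2, generations=300, sigma=0.3, seed=0):
    """(1+1) evolution strategy: mutate one prototype, keep the child if it is no worse.

    Assumption: the number of prototypes is fixed here for simplicity.
    """
    rng = random.Random(seed)
    # Initialise prototypes from randomly chosen samples.
    parent = [list(rng.choice(data)) for _ in range(n_prototypes)]
    initial_err = parent_err = quantization_error(data, parent)
    for _ in range(generations):
        child = [p[:] for p in parent]
        target = rng.randrange(n_prototypes)
        # Gaussian mutation of a single prototype vector.
        child[target] = [v + rng.gauss(0.0, sigma) for v in child[target]]
        child_err = quantization_error(data, child)
        if child_err <= parent_err:
            parent, parent_err = child, child_err
    return parent, initial_err, parent_err

# Synthetic data: two Gaussian clusters in the plane.
rng = random.Random(1)
data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
data += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]

prototypes, err_before, err_after = evolve_prototypes(data)
print(f"quantization error: {err_before:.3f} -> {err_after:.3f}")
```

Because a child replaces its parent only when the quantization error does not increase, the error after evolution can never exceed the initial error.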


2015 ◽  
Vol 2015 ◽  
pp. 1-18 ◽  
Author(s):  
Laslo Dinges ◽  
Ayoub Al-Hamadi ◽  
Moftah Elzobi ◽  
Sherif El-etriby ◽  
Ahmed Ghoneim

Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, generating such databases is expensive in terms of labor and time. As a result, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents with detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever sufficient natural ground-truthed data is not available.


2011 ◽  
Vol 6 (1) ◽  
pp. 28-35
Author(s):  
D.P. Gaikwad ◽  
Yogesh Gunge ◽  
Raghunandan Mundada ◽  
Himani Bharadwaj ◽  
Swapnil Patil

Author(s):  
P.L. Nikolaev

This article deals with a method for binary classification of images containing small text. The classification is based on the fact that the text can have two orientations: it can be positioned horizontally and read from left to right, or it can be turned 180 degrees, so that the image must be rotated to read it. This type of text can be found on the covers of a variety of books, so when recognizing covers it is necessary to determine the direction of the text before recognizing the text itself. The article proposes a deep neural network for determining the text orientation in the context of book cover recognition. The results of training and testing a convolutional neural network on synthetic data, as well as examples of the network operating on real data, are presented.


1989 ◽  
Vol 21 (6-7) ◽  
pp. 593-602 ◽  
Author(s):  
Andrew T. Watkin ◽  
W. Wesley Eckenfelder

A technique for rapidly determining Monod and inhibition kinetic parameters in activated sludge is evaluated. The method studied is known as the fed-batch reactor technique and requires approximately three hours to complete. The technique allows for a gradual build-up of substrate in the test reactor by introducing the substrate at a feed rate greater than the maximum substrate utilization rate. Both inhibitory and non-inhibitory substrate responses are modeled using a nonlinear numerical curve-fitting technique. The responses of both glucose and 2,4-dichlorophenol (DCP) are studied using activated sludges with various acclimation histories. Statistically different inhibition constants, KI, for DCP inhibition of glucose utilization were found for the various sludges studied. The curve-fitting algorithm was verified by its ability to accurately retrieve two kinetic parameters from synthetic data generated by superimposing normally distributed random error onto the two-parameter numerical solution generated by the algorithm.
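The verification step described above (recovering known parameters from a noise-perturbed model solution) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the standard two-parameter Monod rate expression q = q_max·S/(K_s + S) rather than the authors' fed-batch reactor model, a simple grid search over K_s with a closed-form least-squares q_max in place of their curve-fitting algorithm, and arbitrary parameter values and units chosen for the example.

```python
import random

def monod_rate(s, q_max, k_s):
    """Monod substrate utilization rate at substrate concentration s."""
    return q_max * s / (k_s + s)

def fit_monod(s_vals, q_obs, k_grid):
    """Least-squares fit: scan k_s over a grid; for each k_s the best q_max is closed-form."""
    best = None
    for k_s in k_grid:
        g = [s / (k_s + s) for s in s_vals]
        # The model is linear in q_max, so its optimum is a ratio of sums.
        q_max = sum(q * gi for q, gi in zip(q_obs, g)) / sum(gi * gi for gi in g)
        sse = sum((q - q_max * gi) ** 2 for q, gi in zip(q_obs, g))
        if best is None or sse < best[0]:
            best = (sse, q_max, k_s)
    return best[1], best[2]

# Assumed "true" parameters and noise level (illustrative values only).
TRUE_Q_MAX, TRUE_K_S, NOISE_SD = 2.0, 5.0, 0.01

rng = random.Random(0)
s_vals = [float(s) for s in range(1, 21)]
# Synthetic data: model solution plus normally distributed random error.
q_obs = [monod_rate(s, TRUE_Q_MAX, TRUE_K_S) + rng.gauss(0.0, NOISE_SD) for s in s_vals]

k_grid = [0.1 + 0.01 * i for i in range(2000)]  # candidate k_s values from 0.1 to about 20
q_max_hat, k_s_hat = fit_monod(s_vals, q_obs, k_grid)
print(f"q_max = {q_max_hat:.3f}, k_s = {k_s_hat:.2f}")
```

If the fit is working, the printed estimates should lie close to the assumed true values; how close depends on the noise level and on how well the sampled concentrations bracket K_s.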


2020 ◽  
Vol 38 (2) ◽  
Author(s):  
Razec Cezar Sampaio Pinto da Silva Torres ◽  
Leandro Di Bartolo

ABSTRACT. Reverse time migration (RTM) is one of the most powerful methods used to generate images of the subsurface. RTM was proposed in the early 1980s, but only recently has it been routinely used in exploratory projects involving complex geology, for example the Brazilian pre-salt. Because the method uses the two-way wave equation, RTM is able to correctly image any kind of geological environment (simple or complex), including those with anisotropy. On the other hand, RTM is computationally expensive and requires the use of computer clusters. This paper investigates the influence of anisotropy on seismic imaging through the application of RTM for tilted transversely isotropic (TTI) media to pre-stack synthetic data. This work presents in detail how to implement RTM for TTI media, addressing the main issues and specific details, e.g., the computational resources required. Results for a couple of simple models are presented, including an application to the BP TTI 2007 benchmark model.

Keywords: finite differences, wave numerical modeling, seismic anisotropy.

Migração reversa no tempo em meios transversalmente isotrópicos inclinados (Reverse time migration in tilted transversely isotropic media)

RESUMO (translated from Portuguese). Reverse time migration (RTM) is one of the most powerful methods used to generate images of the subsurface. RTM was proposed in the early 1980s, but only recently has it been routinely used in exploratory projects involving complex geology, especially in the Brazilian pre-salt. Because the method uses the full wave equation, any configuration of the geological medium can be treated correctly, especially in the presence of anisotropy. On the other hand, RTM is computationally expensive and requires the use of computer clusters by the industry. This article presents in detail an implementation of RTM for tilted transversely isotropic (TTI) media, addressing the main difficulties in its implementation as well as the computational resources required. The algorithm developed is applied to simple cases and to a standard benchmark known as BP TTI 2007.

Keywords (translated from Portuguese): finite differences, numerical wave modeling, seismic anisotropy.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Abstract. Background: Three-way data have gained popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions over time, urban dynamics, or complex geophysical phenomena. Triclustering, the subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data without a known ground truth, which limits the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of providing the ground truth (triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric makes it possible to combine both intrinsic and extrinsic metrics when comparing solutions, producing more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric’s potential to advance the triclustering state of the art by easing the process of evaluating the quality of new triclustering approaches.
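The notion of planting a tricluster in synthetic three-way data and returning it as ground truth can be illustrated with a small sketch. This is not G-Tric's interface: the function name `generate_dataset`, its parameters, the uniform background, and the constant-valued pattern are all assumptions made for illustration.

```python
import random

def generate_dataset(n_obs, n_feat, n_ctx, tricluster, value=10.0, seed=0):
    """Build an observations x features x contexts array and plant one constant tricluster.

    `tricluster` is a triple (observation indices, feature indices, context indices).
    Background values are drawn uniformly from [0, 1] (an assumed background distribution).
    """
    rng = random.Random(seed)
    data = [[[rng.uniform(0.0, 1.0) for _ in range(n_ctx)]
             for _ in range(n_feat)]
            for _ in range(n_obs)]
    rows, cols, ctxs = tricluster
    # Overwrite the chosen subspace with a constant pattern.
    for i in rows:
        for j in cols:
            for k in ctxs:
                data[i][j][k] = value
    ground_truth = {
        "observations": sorted(rows),
        "features": sorted(cols),
        "contexts": sorted(ctxs),
    }
    return data, ground_truth

data, truth = generate_dataset(6, 5, 4, tricluster=([1, 3], [0, 2], [1, 2, 3]))
print(truth)
```

Because the planted subspace is returned alongside the data, a triclustering algorithm's output can be scored against it with extrinsic metrics, which is not possible on real data without a known solution.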


2021 ◽  
Vol 40 (3) ◽  
pp. 1-12
Author(s):  
Hao Zhang ◽  
Yuxiao Zhou ◽  
Yifei Tian ◽  
Jun-Hai Yong ◽  
Feng Xu

Reconstructing hand-object interactions is a challenging task due to strong occlusions and complex motions. This article proposes a real-time system that uses a single depth stream to simultaneously reconstruct hand poses, object shape, and rigid/non-rigid motions. To achieve this, we first train a joint learning network to segment the hand and object in a depth image and to predict the 3D keypoints of the hand. Because most layers are shared by the two tasks, computation cost is reduced, which supports real-time performance. A hybrid dataset is constructed to train the network with real data (to learn real-world distributions) and synthetic data (to cover variations of objects, motions, and viewpoints). Next, the depth of the two targets and the keypoints are used in a uniform optimization to reconstruct the interacting motions. Benefiting from a novel tangential contact constraint, the system not only resolves the remaining ambiguities but also maintains real-time performance. Experiments show that our system handles different hand and object shapes, various interactive motions, and moving cameras.

