character sequence
Recently Published Documents


TOTAL DOCUMENTS

25
(FIVE YEARS 9)

H-INDEX

5
(FIVE YEARS 2)

Author(s):  
B. Premjith ◽  
K. P. Soman

Morphological synthesis is one of the main components of Machine Translation (MT) frameworks, especially when one or both of the source and target languages are morphologically rich. Morphological synthesis is the process of combining two words or two morphemes according to the Sandhi rules of the morphologically rich language. Malayalam and Tamil are two Indian languages that are both morphologically rich and agglutinative. Morphological synthesis of a word in these two languages is challenging mainly for the following reasons: (1) abundance in morphology; (2) complex Sandhi rules; (3) the possibility in Malayalam of forming words by combining words that belong to different syntactic categories (for example, noun and verb); and (4) the construction of a sentence by combining multiple words. We formulated the task of the morphological generation of nouns and verbs of Malayalam and Tamil as a character-to-character sequence tagging problem. In this article, we used deep learning architectures such as the Recurrent Neural Network (RNN), Long Short-Term Memory network (LSTM), and Gated Recurrent Unit (GRU), along with their stacked and bidirectional versions, to implement morphological synthesis at the character level. In addition, we investigated the performance of combining these deep learning architectures with a Conditional Random Field (CRF) in the morphological synthesis of nouns and verbs in Malayalam and Tamil. We observed that adding a CRF to the Bidirectional LSTM/GRU architecture achieved more than 99% accuracy in the morphological synthesis of Malayalam and Tamil nouns and verbs.
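The character-to-character tagging framing described above can be illustrated with a small sketch. This is my own simplified tag scheme and a made-up Sandhi-like rule, not the authors' exact formulation: each character of the concatenated input receives a tag, and applying the tags yields the joined surface form.

```python
# Illustrative sketch of morphological synthesis as character-level
# sequence tagging (assumed tag scheme, not the paper's exact one).
def apply_tags(chars, tags):
    """K = keep the character, D = delete it, S:<c> = substitute with <c>."""
    out = []
    for ch, tag in zip(chars, tags):
        if tag == "K":
            out.append(ch)
        elif tag == "D":
            continue  # boundary character removed by the Sandhi rule
        elif tag.startswith("S:"):
            out.append(tag[2:])
    return "".join(out)

# Hypothetical example: joining "mara" + "illa", where the word-final
# vowel is deleted at the boundary (an invented Sandhi-like rule).
chars = list("mara" + "illa")
tags  = ["K", "K", "K", "D", "K", "K", "K", "K"]
print(apply_tags(chars, tags))  # marilla
```

In the paper's setting, a neural tagger (e.g., BiLSTM-CRF) would predict the tag sequence; the deterministic step above only shows how predicted tags map back to a surface form.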


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Pengfei Meng ◽  
Shuangcheng Jia ◽  
Qian Li

Sequence recognition of natural scene images has long been an important research topic in the field of computer vision. CRNN has proven to be a popular end-to-end character sequence recognition network. However, CRNN does not account for wide characters and is less effective at recognizing long, dense sequences of small characters. To address these shortcomings, we propose an improved CRNN network, named CRNN-RES, based on BiLSTM and multiple receptive fields. Specifically, on the one hand, CRNN-RES uses a dual pooling kernel to enhance the CNN's ability to extract features. On the other hand, the last RNN layer is changed from a BiLSTM to a shared-parameter BiLSTM network using recursive residuals, which reduces the number of network parameters and improves accuracy. In addition, we designed a structure, called the CRFC layer, that can flexibly configure the length of the input data sequence in the RNN layer. Extensive experiments comparing the proposed CRNN-RES network with the original CRNN show that, when recognizing English characters and digits, CRNN-RES has 8,197,549 parameters, 133,752 fewer than CRNN. On the public datasets ICDAR 2003 (IC03), ICDAR 2013 (IC13), IIIT 5K-Word (IIIT5K), and Street View Text (SVT), CRNN-RES obtains accuracies of 96.90%, 89.85%, 83.63%, and 82.96%, higher than CRNN by 1.40%, 3.15%, 5.43%, and 2.16%, respectively.
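The "shared parameter BiLSTM using recursive residuals" idea, where the same recurrent block is reapplied with residual connections so added depth costs no extra parameters, can be sketched abstractly. This is an assumption about the mechanism, with a toy stand-in for the BiLSTM pass, not the paper's code:

```python
# Sketch of a recursive-residual, shared-parameter layer: the SAME
# `step` function is applied `depth` times with residual additions,
# so increasing depth adds no new parameters.
def recursive_residual(step, x, depth):
    """Apply step() recursively with residual connections: h <- h + step(h)."""
    h = x
    for _ in range(depth):
        h = [a + b for a, b in zip(h, step(h))]
    return h

# Toy step: scale every feature by 0.5 (stands in for one BiLSTM pass).
out = recursive_residual(lambda h: [0.5 * v for v in h], [1.0, 2.0], depth=2)
print(out)  # [2.25, 4.5]
```

A conventional stack of two distinct BiLSTM layers would double the recurrent parameter count; reusing one layer recursively is what yields the reported 133,752-parameter saving.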


2020 ◽  
Author(s):  
Jaroslaw Roman Lelonkiewicz ◽  
Maria Ktori ◽  
Davide Crepaldi

During visual word processing, readers identify chunks of co-occurring letters and code for their typical position within words. Using an artificial script, we examined whether these phenomena can be explained by the ability to extract visual regularities from the environment. Participants were first familiarized with a lexicon of pseudoletter strings, each comprising an affix-like chunk that either followed (Experiment 1) or preceded (Experiment 2) a random character sequence. In the absence of any linguistic information, chunks could be defined only by their statistical properties: similarly to affixes in real language, chunks occurred frequently and assumed a specific position within strings. In a later testing phase, we found that participants were more likely to attribute a previously unseen string to the familiarization lexicon if it contained an affix, and if the affix appeared in its typical position. Importantly, these findings suggest that readers may chunk words using a general, language-agnostic cognitive mechanism that captures statistical regularities in the learning materials. [NOTE: Please cite this paper as: Lelonkiewicz, J. R., Ktori, M., & Crepaldi, D. (2020). Morphemes as letter chunks: Discovering affixes through visual regularities. Journal of Memory and Language, 115, 104152. https://doi.org/10.1016/j.jml.2020.104152]
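The two statistical properties that define a chunk here, high frequency and a fixed position within strings, are easy to make concrete. The following is my own illustrative computation over an invented mini-lexicon, not the authors' stimuli or analysis:

```python
# Illustrative sketch: an affix-like chunk is identifiable purely from
# statistics if it occurs often AND at a consistent string position.
from collections import Counter

# Invented mini-lexicon: "XY" behaves like a suffix in three of four strings.
strings = ["qzXY", "rtXY", "pmXY", "koqz"]
chunk = "XY"

positions = Counter(s.find(chunk) for s in strings if chunk in s)
frequency = sum(positions.values()) / len(strings)

print(frequency)                 # 0.75 -> the chunk is frequent
print(positions.most_common(1))  # [(2, 3)] -> and positionally fixed
```

A chunk scoring high on both measures is exactly what the familiarization lexicon offered participants in place of any linguistic cue.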


JURNAL IQRA ◽  
2020 ◽  
Vol 5 (1) ◽  
pp. 171-182
Author(s):  
Dede Ramdani ◽  
Deasy Nurma Hidayat ◽  
Asep Sumarna ◽  
Icmiati Santika

This article aimed to identify the character traits Muslims prioritize in facing the Industrial Revolution 4.0 and Society 5.0 eras. The data analysis technique in this study uses categorical statistics on questionnaires distributed in the Bandung area and its surroundings. The questionnaire results showed the ideal character sequence in Muslim children, namely: honest, disciplined, responsible, polite, confident, hardworking, tolerant, creative and innovative, caring, productive, and religious. It can be concluded that these characters can be the foundation for Muslim children facing the development of the Industrial Revolution 4.0 and Society 5.0 eras. Keywords: Ideal Character, Muslim Generation, Industrial Revolution 4.0


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 35185-35199 ◽  
Author(s):  
Chris Henry ◽  
Sung Yoon Ahn ◽  
Sang-Woong Lee

Electronics ◽  
2019 ◽  
Vol 8 (9) ◽  
pp. 971 ◽  
Author(s):  
Min Zhang ◽  
Yujin Yan ◽  
Hai Wang ◽  
Wei Zhao

Irregular text has widespread applications in multiple areas. Unlike regular text, irregular text is difficult to recognize because of its various shapes and distorted patterns. In this paper, we develop a multidirectional convolutional neural network (MCN) to extract four direction features that fully describe the textual information. Meanwhile, the character placement possibility is extracted as the weight of the four direction features. Building on these components, we propose an encoder that fuses the four direction features to generate a feature code for predicting the character sequence. The whole network is end-to-end trainable, requiring only images and word-level labels. Experiments on standard benchmarks, including the IIIT-5K, SVT, CUTE80, and ICDAR datasets, demonstrate the superiority of the proposed method on both regular and irregular datasets. The developed method shows an accuracy increase of 1.2% on the CUTE80 dataset and 1.5% on the SVT dataset, and has fewer parameters than most existing methods.
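The weighted fusion of the four direction features can be sketched as a normalized weighted sum. This is an assumed formulation of the encoder's fusion step, with made-up feature vectors, not the paper's implementation:

```python
# Sketch: fuse four direction feature vectors, weighting each direction
# by its character placement possibility (weights normalized to sum to 1).
def fuse(features, weights):
    """Weighted sum of the four direction feature vectors."""
    total = sum(weights)
    norm = [w / total for w in weights]
    dim = len(features[0])
    return [sum(norm[d] * features[d][i] for d in range(4)) for i in range(dim)]

# Toy 2-dimensional features for the four directions, with the first
# direction weighted most heavily.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
w = [2.0, 1.0, 1.0, 0.0]
print(fuse(feats, w))  # [0.75, 0.5]
```

In the actual network both the features and the placement weights come from learned convolutional branches; the fusion itself reduces to this kind of weighted combination.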


2019 ◽  
Vol 89 (19-20) ◽  
pp. 4148-4161 ◽  
Author(s):  
Pengpeng Hu ◽  
Edmond SL Ho ◽  
Nauman Aslam ◽  
Taku Komura ◽  
Hubert PH Shum

With the development of e-shopping, there is significant growth in clothing purchases online. However, virtual clothing fit evaluation is still under-researched. In the literature, the thickness of the air layer between the human body and clothes is the dominant geometric indicator for evaluating clothing fit. However, such an approach has only been applied to stationary poses of the mannequin/human body. Physical indicators, such as the pressure/tension of a virtual garment fitted on a virtual body in continuous motion, have also been proposed for clothing fit evaluation. Neither geometric nor physical evaluations consider the interaction of the garment with the body, e.g., the sliding of the garment along the human body. In this study, a new framework was proposed to automatically determine the dynamic air gap thickness. First, the dynamic dressed character sequence was simulated in three-dimensional (3D) clothing software by importing the body parameters, cloth parameters, and a walking motion. Second, a cost function was defined to convert the garment in the previous frame to the local coordinate system of the next frame, and the dynamic air gap thickness between clothes and the human body was determined. Third, a new metric, the 3D garment vector field, was proposed to represent the movement flow of the dynamic virtual garment, whose directional changes are calculated by cosine similarity. Experimental results show that our method is more sensitive to small air gap thickness changes than state-of-the-art methods, allowing it to more effectively evaluate clothing fit in a virtual environment.
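The directional-change measure used on the 3D garment vector field is standard cosine similarity between displacement vectors. A minimal sketch with invented toy vectors (not data from the paper):

```python
# Sketch: directional change between two frames of a garment vertex's
# displacement, measured by cosine similarity (1 = same direction,
# 0 = orthogonal, -1 = opposite).
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy displacement vectors of one garment vertex in consecutive frames:
# the direction rotates 45 degrees in the xy-plane.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 1.0, 0.0]))  # ~0.7071
```

Aggregating this quantity over all vertices of the garment mesh gives the movement-flow description the abstract refers to.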


Cladistics ◽  
2019 ◽  
Vol 35 (5) ◽  
pp. 573-575
Author(s):  
Ward C. Wheeler ◽  
Alexander J. Washburn

Author(s):  
Jason Lee ◽  
Kyunghyun Cho ◽  
Thomas Hofmann

Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT’15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of the BLEU score and human judgment.
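The key mechanism the abstract names, max-pooling at the encoder to shorten the source character representation, can be illustrated in isolation. This is a simplified 1D sketch with scalar features (the real model pools learned convolutional feature vectors), offered as an assumption about the mechanism rather than the paper's code:

```python
# Sketch: non-overlapping max-pooling along the sequence dimension.
# Pooling a length-L character sequence with width w yields roughly
# L/w positions, which is what lets a character-level encoder train
# at a speed comparable to subword-level models.
def max_pool_1d(seq, width):
    """Max over consecutive non-overlapping windows of `width` items."""
    return [max(seq[i:i + width]) for i in range(0, len(seq), width)]

# A 12-step character feature sequence pooled by width 3 -> length 4.
feats = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
print(max_pool_1d(feats, 3))  # [4, 9, 6, 8]
```

The convolution before the pooling captures the "local regularities" mentioned in the abstract; pooling then discards position-level redundancy while keeping the strongest local responses.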

