Digitising (Romanian) Cyrillic using Transkribus: new perspectives

Diacronia ◽  
2021 ◽  
Author(s):  
Constanța Burlacu ◽  
Achim Rabus

In this paper we discuss the application of the software platform Transkribus (transkribus.eu), an AI-assisted tool for Handwritten Text Recognition (HTR), to 16th-century Romanian manuscript and printed sources using Cyrillic scripts. After an overview of the basic functionality of HTR technology and Transkribus, we describe the Romanian and bilingual Slavonic-Romanian sources we used, give insight into training specific, generic, and smart (i.e. transliterating from Cyrillic into Latin script) models, evaluate their performance, and discuss the implications of HTR for philological research in the Digital Age. We conclude with an outlook on future research perspectives.

Author(s):  
Sri. Yugandhar Manchala ◽  
Jayaram Kinthali ◽  
Kowshik Kotha ◽  
Kanithi Santosh Kumar ◽  
Jagilinki Jayalaxmi

2021 ◽  
Author(s):  
Ayan Kumar Bhunia ◽  
Shuvozit Ghose ◽  
Amandeep Kumar ◽  
Pinaki Nath Chowdhury ◽  
Aneeshan Sain ◽  
...  

2020 ◽  
Vol 6 (12) ◽  
pp. 141
Author(s):  
Abdelrahman Abdallah ◽  
Mohamed Hamada ◽  
Daniyar Nurseitov

This article considers the task of handwritten text recognition using attention-based encoder–decoder networks trained on the Kazakh and Russian languages. We have developed a novel deep neural network model based on a fully gated CNN, supported by multiple bidirectional gated recurrent unit (BGRU) layers and attention mechanisms to handle sophisticated features. The model achieves 0.045 Character Error Rate (CER), 0.192 Word Error Rate (WER), and 0.253 Sequence Error Rate (SER) on the first test dataset and 0.064 CER, 0.24 WER, and 0.361 SER on the second test dataset. Our proposed model is the first work to address handwriting recognition in the Kazakh and Russian languages. Our results confirm the importance of our proposed Attention-Gated-CNN-BGRU approach for training handwritten text recognition models and indicate that it can lead to statistically significant improvements (p-value < 0.05) in sensitivity (recall) on the test datasets. The proposed method’s performance was evaluated using handwritten text databases in three languages: English, Russian, and Kazakh. It demonstrates better results on the Handwritten Kazakh and Russian (HKR) dataset than other well-known models.
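For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how CER, WER, and SER are commonly computed from reference and predicted transcriptions. It assumes the usual edit-distance definitions (errors divided by reference length, and SER as the fraction of lines that are not an exact match); the abstract does not state the authors' exact normalisation, and the function names `edit_distance` and `error_rates` are illustrative, not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(refs: list[str], hyps: list[str]) -> dict[str, float]:
    """Corpus-level CER, WER, and SER (assumed definitions, see lead-in)."""
    char_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    char_total = sum(len(r) for r in refs)
    word_errors = sum(edit_distance(r.split(), h.split())
                      for r, h in zip(refs, hyps))
    word_total = sum(len(r.split()) for r in refs)
    seq_errors = sum(r != h for r, h in zip(refs, hyps))
    return {
        "CER": char_errors / char_total,
        "WER": word_errors / word_total,
        "SER": seq_errors / len(refs),
    }


if __name__ == "__main__":
    # Hypothetical transcriptions, not data from the paper.
    references = ["привет мир", "сәлем"]
    predictions = ["привет мар", "сәлем"]
    print(error_rates(references, predictions))
```

Under these definitions a single wrong character raises all three rates at once, which is why SER is typically the highest of the three figures, as in the results quoted above.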

