OCR Challenges for a Latvian Pronunciation Dictionary

Frontiers in Artificial Intelligence and Applications - Human Language Technologies – The Baltic Perspective ◽

10.3233/faia200623 ◽

2020 ◽

Author(s):

Laine Strankale ◽

Pēteris Paikens

Keyword(s):

Error Analysis ◽

Open Source ◽

Error Rate ◽

Post Processing ◽

Speech Technology ◽

Word Error Rate ◽

Additional Support ◽

Further Development ◽

Pronunciation Dictionary

This paper covers the development of a custom OCR solution, based on the open-source Tesseract engine, for digitizing a Latvian pronunciation dictionary in which the pronunciation data is described using a large variety of diacritic markings not supported by standard OCR solutions. We describe our efforts in training a model for these symbols without the additional support of preexisting dictionaries and illustrate how word error rate (WER) and character error rate (CER) are affected by changes in the dataset content and size. We also provide an error analysis and postulate possible causes for common pitfalls. The resulting model achieved a CER of 2.07%, making it suitable for digitization of the whole dictionary in combination with heuristic post-processing and proofreading, resulting in a useful resource for further development of speech technology for Latvian.
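For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of how CER and WER are commonly computed from the Levenshtein edit distance. It is not the authors' evaluation code: the function names `edit_distance`, `cer`, and `wer` are hypothetical, and the NFC normalization step is an assumption added because diacritics can be encoded either as precomposed or as combining characters.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    # Assumption: compare NFC-normalized text so that a letter with a
    # diacritic counts as one character regardless of its encoding.
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under this definition, a CER of 2.07% corresponds to roughly two character edits per 100 reference characters. A single wrong diacritic makes the whole word count as an error in WER, which is why WER is typically much higher than CER on diacritic-heavy text.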

Towards Automatic Error Analysis of Machine Translation Output

Computational Linguistics ◽

10.1162/coli_a_00072 ◽

2011 ◽

Vol 37 (4) ◽

pp. 657-688 ◽

Cited By ~ 26

Author(s):

Maja Popović ◽

Hermann Ney

Keyword(s):

Error Analysis ◽

Machine Translation ◽

Error Rate ◽

Human Error ◽

Translation System ◽

Specific Information ◽

Error Type ◽

Word Error Rate ◽

Advantages And Disadvantages ◽

Automatic Error

Evaluation and error analysis of machine translation output are important but difficult tasks. In this article, we propose a framework for automatic error analysis and classification based on identifying the actual erroneous words using the algorithms for computing Word Error Rate (WER) and Position-independent word Error Rate (PER). This is only a first step towards the development of automatic evaluation measures that provide more specific information about particular translation problems. The proposed approach enables the use of various types of linguistic knowledge in order to classify translation errors in many different ways. This work focuses on one possible set-up, namely, on five error categories: inflectional errors, errors due to wrong word order, missing words, extra words, and incorrect lexical choices. For each of the categories, we analyze the contribution of various POS classes. We compared the results of automatic error analysis with the results of human error analysis in order to investigate two possible applications: estimating the contribution of each error type in a given translation output in order to identify the main sources of errors for a given translation system, and comparing different translation outputs using the introduced error categories in order to obtain more information about the advantages and disadvantages of different systems and possibilities for improvement, as well as about the advantages and disadvantages of the methods applied for improvement. We used Arabic–English Newswire and Broadcast News and Chinese–English Newswire outputs created in the framework of the GALE project, several Spanish and English European Parliament outputs generated during the TC-Star project, and three German–English outputs generated in the framework of the fourth Machine Translation Workshop.
We show that our results correlate very well with the results of a human error analysis, and that all our metrics except the extra words reflect well the differences between different versions of the same translation system as well as the differences between different translation systems.
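As an illustration of the position-independent error rate mentioned in this abstract, here is a minimal sketch of one common bag-of-words formulation of PER. The article itself may use different variants, so the formula and the function name `per` should be read as assumptions rather than as the authors' definition.

```python
from collections import Counter


def per(reference, hypothesis):
    """Position-independent word error rate (one common formulation).

    Words are compared as multisets, so word order is ignored:
    PER = (max(N_ref, N_hyp) - matches) / N_ref
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    # Multiset intersection: each word matches as often as it occurs in both.
    matches = sum((Counter(ref_words) & Counter(hyp_words)).values())
    return (max(len(ref_words), len(hyp_words)) - matches) / len(ref_words)
```

Because order is ignored, a hypothesis that contains exactly the reference words in a different order scores a PER of zero while its WER is non-zero. That gap between the two metrics is what makes them useful together for separating word-order errors from lexical ones.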

Dynamic Acoustic Unit Augmentation with BPE-Dropout for Low-Resource End-to-End Speech Recognition

Sensors ◽

10.3390/s21093063 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3063

Author(s):

Aleksandr Laptev ◽

Andrei Andrusenko ◽

Ivan Podluzhny ◽

Anton Mitrofanov ◽

Ivan Medennikov ◽

...

Keyword(s):

Speech Recognition ◽

Error Rate ◽

Rapid Development ◽

Computational Cost ◽

Vocabulary Size ◽

Word Error Rate ◽

Low Resource ◽

Steady Improvement ◽

End To End ◽

ASR System

With the rapid development of speech assistants, adapting server-intended automatic speech recognition (ASR) solutions to run directly on the device has become crucial. For on-device speech recognition tasks, researchers and industry prefer end-to-end ASR systems, as they can be made resource-efficient while maintaining a higher quality compared to hybrid systems. However, building end-to-end models requires a significant amount of speech data. Personalization, which mainly means handling out-of-vocabulary (OOV) words, is another challenging task associated with speech assistants. In this work, we consider building an effective end-to-end ASR system in low-resource setups with a high OOV rate, embodied in the Babel Turkish and Babel Georgian tasks. We propose a method of dynamic acoustic unit augmentation based on the Byte Pair Encoding with dropout (BPE-dropout) technique. The method non-deterministically tokenizes utterances to extend the tokens’ contexts and to regularize their distribution for the model’s recognition of unseen words. It also reduces the need to search for an optimal subword vocabulary size. The technique provides a steady improvement in regular and personalized (OOV-oriented) speech recognition tasks (at least 6% relative word error rate (WER) and 25% relative F-score) at no additional computational cost. Owing to the use of BPE-dropout, our monolingual Turkish Conformer has achieved a competitive result with 22.2% character error rate (CER) and 38.9% WER, which is close to the best published multilingual system.
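The non-deterministic tokenization described in this abstract can be sketched as follows. This is a simplified, word-level illustration of BPE-dropout, not the authors' implementation: the function name `bpe_dropout_encode`, the toy merge list in the usage note, and the default dropout probability are all hypothetical.

```python
import random


def bpe_dropout_encode(word, merges, dropout=0.1, rng=None):
    """Tokenize `word` with BPE, randomly skipping merges.

    `merges` is an ordered list of (left, right) pairs, highest priority
    first. At every step each applicable merge is dropped with probability
    `dropout`; the best surviving merge is applied. Encoding stops when no
    merge survives.
    """
    rng = rng or random.Random()
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    tokens = list(word)
    while len(tokens) > 1:
        candidates = [
            (ranks[pair], i)
            for i, pair in enumerate(zip(tokens, tokens[1:]))
            if pair in ranks and rng.random() >= dropout
        ]
        if not candidates:
            break
        _, i = min(candidates)  # lowest rank = highest priority
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    return tokens
```

With a toy merge list such as `[("l", "o"), ("lo", "w"), ("e", "r")]`, setting `dropout=0.0` reproduces ordinary deterministic BPE, while `dropout=1.0` falls back to single characters. Intermediate values yield a different segmentation of the same word on different calls, which is the source of the augmentation effect.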

Measuring the acceptable word error rate of machine-generated webcast transcripts

10.21437/interspeech.2006-40 ◽

2006 ◽

Cited By ~ 1

Author(s):

Cosmin Munteanu ◽

Gerald Penn ◽

Ron Baecker ◽

Elaine Toms ◽

David James

Keyword(s):

Error Rate ◽

Word Error Rate

Improvements to the LIUM French ASR system based on CMU sphinx: what helps to significantly reduce the word error rate?

10.21437/interspeech.2009-607 ◽

2009 ◽

Author(s):

Paul Deléglise ◽

Yannick Estève ◽

Sylvain Meignier ◽

Teva Merlin

Keyword(s):

Error Rate ◽

Word Error Rate ◽

ASR System

MarkUs: An Open-Source Web Application to Annotate Student Papers On-Line

Volume 4: Advanced Manufacturing Processes; Biomedical Engineering; Multiscale Mechanics of Biological Tissues; Sciences, Engineering and Education; Multiphysics; Emerging Technologies for Inspection ◽

10.1115/esda2012-82141 ◽

2012 ◽

Author(s):

Morgan Magnin ◽

Guillaume Moreau ◽

Nelle Varoquaux ◽

Benjamin Vialle ◽

Karen Reid ◽

...

Keyword(s):

Higher Education ◽

Open Source ◽

Web Application ◽

Online System ◽

Important Benefit ◽

Special Cases ◽

Assessment Time ◽

On Line ◽

Active Pedagogy ◽

Further Development

A critical component of the learning process lies in the feedback that students receive on their work, which validates their progress, identifies flaws in their thinking, and identifies skills that still need to be learned. Many higher-education institutions have developed an active pedagogy that gives students opportunities for different forms of assessment and feedback. This means that students have numerous lab exercises, assignments, and projects. Both instructors and students thus require effective tools to efficiently manage the submission, assessment, and individualized feedback of students’ work. The open-source web application MarkUs aims at meeting these needs: it facilitates the submission and assessment of students’ work. Students directly submit their work using MarkUs, rather than printing it or sending it by email. The instructors or teaching assistants use MarkUs’s interface to view the students’ work, annotate it, and fill in a marking rubric. Students use the same interface to read the annotations and learn from the assessment. Managing the students’ submissions and the instructors’ assessments within a single online system has led to several positive pedagogical outcomes: the number of late submissions has decreased, the assessment time has been drastically reduced, and students can access their results and read the instructor’s feedback immediately after the grading process is completed. Using MarkUs has also significantly reduced the time that instructors spend collecting assignments, creating the marking schemes, passing them on to graders, handling special cases, and returning work to the students. In this paper, we introduce MarkUs’s features and illustrate their benefits for higher education through our own teaching experiences and those of our colleagues. We also describe an important benefit of the fact that the tool itself is open-source.
MarkUs has been developed entirely by students, giving them a valuable learning opportunity as they work on a large software system that real users depend on. Virtuous circles indeed arise, with former users of MarkUs becoming developers and then supervisors of further development. We conclude with perspectives on forthcoming features and use, both technical and pedagogical.

MC’s PlotXY—A general-purpose plotting and post-processing open-source tool

SoftwareX ◽

10.1016/j.softx.2019.01.017 ◽

2019 ◽

Vol 9 ◽

pp. 282-287

Author(s):

Massimo Ceraolo

Keyword(s):

Open Source ◽

General Purpose ◽

Post Processing ◽

Open Source Tool

Attention-Based Fully Gated CNN-BGRU for Russian Handwritten Text

Journal of Imaging ◽

10.3390/jimaging6120141 ◽

2020 ◽

Vol 6 (12) ◽

pp. 141

Author(s):

Abdelrahman Abdallah ◽

Mohamed Hamada ◽

Daniyar Nurseitov

Keyword(s):

Error Rate ◽

Handwriting Recognition ◽

Text Recognition ◽

P Value ◽

Word Error Rate ◽

Test Dataset ◽

Handwritten Text ◽

Proposed Model ◽

Handwritten Text Recognition ◽

Gated Recurrent Unit

This article considers the task of handwritten text recognition using attention-based encoder–decoder networks trained on the Kazakh and Russian languages. We have developed a novel deep neural network model based on a fully gated CNN, supported by multiple bidirectional gated recurrent unit (BGRU) and attention mechanisms to manipulate sophisticated features, that achieves 0.045 Character Error Rate (CER), 0.192 Word Error Rate (WER), and 0.253 Sequence Error Rate (SER) for the first test dataset and 0.064 CER, 0.24 WER, and 0.361 SER for the second test dataset. Our proposed model is the first work to handle handwriting recognition in the Kazakh and Russian languages. Our results confirm the importance of our proposed Attention-Gated-CNN-BGRU approach for training handwritten text recognition and indicate that it can lead to statistically significant improvements (p-value < 0.05) in the sensitivity (recall) over the test dataset. The proposed method’s performance was evaluated using handwritten text databases of three languages: English, Russian, and Kazakh. It demonstrates better results on the Handwritten Kazakh and Russian (HKR) dataset than the other well-known models.

An Efficient Reconciliation in Removing Errors Using Bose, Chaudhuri, Hocquenghem Code for Quantum Key Distribution

Jurnal Teknologi ◽

10.11113/jt.v59.1262 ◽

2012 ◽

pp. 13-19

Author(s):

Riaz Ahmad Qamar ◽

Mohd Aizaini Maarof ◽

Subariah Ibrahim

Keyword(s):

Bit Error Rate ◽

Error Rate ◽

Quantum Key Distribution ◽

Error Probability ◽

Key Distribution ◽

Secret Key ◽

Post Processing ◽

Quantum Bit ◽

Quantum Bit Error Rate ◽

Quantum Key Distribution Protocol

A quantum key distribution (QKD) protocol known as BB84 was developed in 1984 by Charles Bennett and Gilles Brassard. The protocol works in two phases: quantum state transmission and conventional post-processing. In the first phase of BB84, raw key elements are distributed between two legitimate users by sending encoded photons through a quantum channel, whilst in the second phase a common secret key is obtained from the correlated raw key elements by exchanging messages through a public channel, e.g., a network or the internet. The secret key so obtained is used for cryptographic purposes. Reconciliation is a compulsory part of post-processing and hence of the quantum key distribution protocol. The performance of a reconciliation protocol depends on the generation rate of the common secret key, the number of bits disclosed, and the error probability in the common secret key. These characteristics can be achieved by using a less interactive reconciliation protocol which can handle a higher initial quantum bit error rate (QBER). In this paper, we use a simple Bose, Chaudhuri, Hocquenghem (BCH) error correction algorithm with a simplified syndrome table to achieve an efficient reconciliation protocol which can handle a higher quantum bit error rate and outputs a common key with zero error probability. The proposed protocol is efficient in removing errors, such that it can remove all errors even if the QBER is 60%, assuming the post-processing channel is an authenticated binary symmetric channel (BSC).
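To make the idea of syndrome-table decoding concrete, the following is a minimal sketch using the smallest binary BCH code, the (7,4) code with generator polynomial x^3 + x + 1, which corrects one bit error per 7-bit block. The abstract does not state the code parameters or the table layout used in the paper, so this block illustrates the general technique only; the names `poly_mod`, `encode`, `correct`, and `SYNDROME_TABLE` are hypothetical.

```python
G = 0b1011  # generator polynomial g(x) = x^3 + x + 1, stored as a bit mask
N = 7       # block length in bits


def poly_mod(value, divisor=G):
    """Remainder of polynomial division over GF(2) (bit-mask polynomials)."""
    dlen = divisor.bit_length()
    while value.bit_length() >= dlen:
        value ^= divisor << (value.bit_length() - dlen)
    return value


def encode(message):
    """Systematic encoding: 4 message bits followed by 3 parity bits."""
    shifted = message << 3
    return shifted | poly_mod(shifted)


# Syndrome of every single-bit error pattern x^i, mapped to its position i.
SYNDROME_TABLE = {poly_mod(1 << i): i for i in range(N)}


def correct(received):
    """Correct at most one flipped bit in a 7-bit block."""
    syndrome = poly_mod(received)
    if syndrome == 0:
        return received  # already a valid codeword
    return received ^ (1 << SYNDROME_TABLE[syndrome])
```

Every valid codeword is divisible by g(x), so its syndrome is zero. A single flipped bit produces one of seven distinct non-zero syndromes, and the table maps that syndrome straight to the bit to flip back. Longer BCH codes that correct several errors per block follow the same principle with larger tables or algebraic decoding.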

Closed-Form Word Error Rate Analysis for Successive Interference Cancellation Decoders

IEEE Transactions on Wireless Communications ◽

10.1109/twc.2018.2875699 ◽

2018 ◽

Vol 17 (12) ◽

pp. 8256-8267 ◽

Cited By ~ 2

Author(s):

Jinming Wen ◽

Keyu Wu ◽

Chintha Tellambura ◽

Pingzhi Fan

Keyword(s):

Closed Form ◽

Error Rate ◽

Interference Cancellation ◽

Successive Interference Cancellation ◽

Word Error Rate ◽

Rate Analysis

Electrical measurements error analysis laboratory exercise using an open-source hardware platform

2019 2nd International Colloquium on Smart Grid Metrology (SMAGRIMET) ◽

10.23919/smagrimet.2019.8720354 ◽

2019 ◽

Author(s):

Mateo Marcelic ◽

Bruno Sandric ◽

Ivana Leto ◽

Marko Jurcevic

Keyword(s):

Error Analysis ◽

Open Source ◽

Electrical Measurements ◽

Hardware Platform ◽

Laboratory Exercise ◽

Analysis Laboratory ◽

Open Source Hardware