Increasing Quality of Maritime Communication through Intelligent Speech Recognition and Radio Direction Finding

Author(s):  
Ole John ◽  
Maximilian Reimann
2017 ◽  
Vol 68 (2) ◽  
pp. 346-354
Author(s):  
Ján Staš ◽  
Daniel Hládek ◽  
Peter Viszlay ◽  
Tomáš Koctúr

Abstract: This paper describes a new Slovak corpus dedicated to speech recognition, built from TEDx talks and Jump Slovakia lectures. The proposed speech database consists of 220 talks and lectures with a total duration of about 58 hours. The annotated speech database was generated automatically in an unsupervised manner, using acoustic speech segmentation based on principal component analysis and automatic speech transcription by two complementary speech recognition systems. Evaluation data consisting of 50 manually annotated talks and lectures, with a total duration of about 12 hours, were created to assess the quality of Slovak speech recognition. Unsupervised automatic annotation of the TEDx talks and Jump Slovakia lectures yielded 21.26% of new speech segments with approximately 9.44% word error rate, suitable for retraining or adapting previously trained acoustic models.
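For readers unfamiliar with the metric quoted above, word error rate is conventionally the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used by the authors; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a 9.44% word error rate corresponds to roughly one word-level error per ten to eleven reference words.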


2018 ◽  
Vol 34 (S1) ◽  
pp. 107-107
Author(s):  
Thomas Poder ◽  
Véronique Déry ◽  
Jean-Francois Fisette

Introduction: Speech recognition is increasingly used in medical reporting. The aim of this article is to identify in the literature the advantages and weaknesses of this technology, as well as barriers and facilitators to its implementation. Methods: A systematic review of systematic reviews was conducted in PubMed, Scopus, Cochrane Library and Center for Reviews and Dissemination up to August 2017. The grey literature was also consulted. The quality of the systematic reviews was assessed with the AMSTAR checklist. The inclusion criterion was the use of speech recognition for medical reporting (front-end or back-end). A survey was also conducted in Quebec, Canada, to identify the dissemination of this technology in the province, as well as the factors of success or failure in its implementation. Results: Five systematic reviews were identified. These reviews indicated a high level of heterogeneity across studies. The quality of the studies reported was generally poor. Speech recognition is not as accurate as human transcription but can dramatically reduce the turnaround times for reporting. In front-end use, medical doctors need to spend more time on dictation and correction than with human transcription. With speech recognition, major errors can be up to three times more frequent. In back-end use, a potential increase in the productivity of transcriptionists is noted. Conclusions: Speech recognition offers some advantages for medical reporting, the main one being a reduction in turnaround times. However, these advantages are offset by an increased burden for medical doctors and risks of additional errors in medical reports. It is also hard to identify for which medical specialties and which clinical activities the use of speech recognition will be most beneficial.


1998 ◽  
Vol 41 (5) ◽  
pp. 1073-1087
Author(s):  
Aaron J. Parkinson ◽  
Wendy S. Parkinson ◽  
Richard S. Tyler ◽  
Mary W. Lowder ◽  
Bruce J. Gantz

Sixteen experienced cochlear implant patients with a wide range of speech perception abilities received the SPEAK processing strategy in the Nucleus Spectra-22 cochlear implant. Speech perception was assessed in quiet and in noise with SPEAK and with the patients' previous strategies (for most, Multipeak) at the study onset, as well as after using SPEAK for 6 months. Comparisons were made within and across the two test sessions to elucidate possible learning effects. Patients were also asked to rate the strategies on seven speech recognition and sound quality scales. After 6 months' experience with SPEAK, patients showed significantly improved mean performance on a range of speech recognition measures in quiet and noise. When mean subjective ratings were compared over time there were no significant differences noted between strategies. However, many individuals rated the SPEAK strategy better for two or more of the seven subjective measures. Ratings for "appreciation of music" and "quality of my own voice" in particular were generally higher for SPEAK. Improvements were realized by patients with a wide range of speech perception abilities, including those with little or no open-set speech recognition.


2020 ◽  
Vol 8 (5) ◽  
pp. 1677-1681

Stuttering, or stammering, is a speech disorder in which sounds, syllables, or words are repeated or prolonged, disrupting the normal flow of speech. Stuttering can make it hard to communicate with other people, which often affects an individual's quality of life. An Automatic Speech Recognition (ASR) system converts an audio speech signal into the corresponding text. ASR systems currently play a major role in controlling or providing input to various applications. ASR systems and machine translation applications suffer considerably from stuttering (speech dysfluency). Dysfluencies reduce the word recognition accuracy of an ASR system by increasing word insertion, substitution, and deletion rates. In this work we focus on detecting and removing prolongations, silent pauses, and repetitions to generate a proper text sequence for a given stuttered speech signal. Stuttered speech recognition consists of two stages: classification using LSTM and testing in ASR. The major phases of the classification system are re-sampling, segmentation, pre-emphasis, epoch extraction, and classification. The work was carried out on the UCLASS stuttering dataset using MATLAB, with a 4% to 6% increase in accuracy compared with ANN and SVM.
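Of the preprocessing phases listed above, pre-emphasis is the simplest to illustrate. It is commonly implemented as a first-order high-pass filter, y[n] = x[n] - alpha * x[n-1], which boosts the higher frequencies of the speech signal. The sketch below shows that textbook filter in Python; it is not the authors' MATLAB code, and the function name `pre_emphasis` and the default `alpha=0.97` are assumptions (the abstract does not state a coefficient).

```python
from typing import List, Sequence


def pre_emphasis(signal: Sequence[float], alpha: float = 0.97) -> List[float]:
    """Apply the first-order pre-emphasis filter y[n] = x[n] - alpha * x[n-1].

    Assumption: alpha = 0.97 is a commonly used value, not one taken
    from the abstract. The first sample is passed through unchanged.
    """
    if len(signal) == 0:
        return []
    filtered = [float(signal[0])]
    for n in range(1, len(signal)):
        filtered.append(signal[n] - alpha * signal[n - 1])
    return filtered


if __name__ == "__main__":
    # A constant signal is attenuated after the first sample.
    print(pre_emphasis([1.0, 1.0, 1.0], alpha=0.5))
```

A constant (zero-frequency) input is strongly attenuated after the first sample, which is the high-pass behaviour that motivates the step.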


2019 ◽  
Vol 3 (2) ◽  
pp. 222
Author(s):  
Suyatmo Suyatmo ◽  
Hadi Prayitno ◽  
Ulfa Hasnita ◽  
Iswandi Idris ◽  
Rizaldy Khair

Abstract - ATKP Medan uses learning media as an opportunity to keep improving the quality of learning. However, the most common problem in avionics learning is limited resources: cadets can access Avionic learning only from the CBT lab, because the Avionic software is installed only inside the lab and cannot be studied from outside it. The purpose of this research is to improve the learning process for the Avionic Automatic Direction Finding System digitally, packaged as multimedia animation, so that cadets can study it without access to the laboratory. The study uses the Multimedia Development Life Cycle (MDLC) method, whose stages are Concept, Design, Material Collecting, Assembly, Testing, and Distribution. The resulting learning media present five types of display pages: Introduction, ADF Components, Sense Antenna, Antenna Components, and Direction Finding System.
Keywords - Learning Media, Avionic, ADF, ATKP Medan


2021 ◽  
Vol 8 (1) ◽  
pp. 164-170
Author(s):  
Mohammad Husam Alhumsi ◽  
Saleh Belhassen

Phonetic dictionaries are regarded as pivotal components of speech recognition systems. The goal of speech recognition research is to build a machine that accurately identifies normal human speech and distinguishes it from that of any other speaker. The literature affirms that Arabic phonetics is one of the major problems in Arabic speech recognition. This paper therefore reviews previous studies that tackle the challenges of creating an Arabic phonetic dictionary for Arabic speech recognition. It was found that speech recognition systems have investigated areas of difference concerning Arabic phonetics. In addition, an Arabic phonetic dictionary should be created in which the Arabic vowel phonemes are treated as a component of the consonant phonemes. The incorporation of developed machine translation systems may thus enhance the quality of the system. The paper concludes with the existing challenges facing Arabic phonetic dictionaries.


2021 ◽  
Vol 111 (09) ◽  
pp. 579-582
Author(s):  
Daniel Schulte ◽  
Martin Sudhoff ◽  
Bernd Kuhlenkötter

This paper describes the design and testing of a system for data acquisition using speech recognition in manual assembly. The system was used in a real assembly system in the Learning and Research Factory (LFF) of the Chair of Production Systems (LPS) to record process times. Subsequently, the quality of the data and the user-friendliness were examined. The results showed that speech recognition is a good complement to manual data acquisition.

