Automatic speech recognition: can you understand me?

Innovative language pedagogy report ◽

10.14705/rpnet.2021.50.1246 ◽

2021 ◽

pp. 121-126

Author(s):

Susana Pérez Castillejo

Keyword(s):

Speech Recognition ◽

Natural Language ◽

Language Processing ◽

Automatic Speech Recognition ◽

Text Messaging ◽

Digital Communication ◽

Spoken Discourse ◽

Video Captioning ◽

Written Text ◽

Communication Method

What is it? Automatic Speech Recognition (ASR) is a digital communication method that transforms spoken discourse into written text. This rapidly evolving technology is used in email, text messaging, and live video captioning. Current ASR systems operate in conjunction with Natural Language Processing (NLP) technology to transform speech into text that people, and machines, can read. NLP refers to the methodologies and computational tools that analyze data produced in a natural language, such as English.

Road Navigation System Using Automatic Speech Recognition (ASR) And Natural Language Processing (NLP)

2018 IEEE Region 10 Humanitarian Technology Conference (R10-HTC) ◽

10.1109/r10-htc.2018.8629859 ◽

2018 ◽

Cited By ~ 1

Author(s):

Pooja Withanage ◽

Tharaka Liyanage ◽

Naditha Deeyakaduwe ◽

Eshan Dias ◽

Samantha Thelijjagoda

Keyword(s):

Natural Language Processing ◽

Speech Recognition ◽

Natural Language ◽

Language Processing ◽

Automatic Speech Recognition ◽

Navigation System

Toward the Integration of Natural Language Processing and Automatic Speech Recognition: Using Morpho-Syntax and Pragmatics for Transcription

Multimodal Processing and Interaction ◽

10.1007/978-0-387-76316-3_9 ◽

2008 ◽

pp. 1-18

Author(s):

Stéphane Huet ◽

Gwénolé Lecorvé ◽

Guillaume Gravier ◽

Pascale Sébillot

Keyword(s):

Natural Language Processing ◽

Speech Recognition ◽

Natural Language ◽

Language Processing ◽

Automatic Speech Recognition

Error Types in Natural Language Processing in Inflectional Languages

Encyclopedia of Information Science and Technology, Fifth Edition - Advances in Information Quality and Management ◽

10.4018/978-1-7998-3479-3.ch006 ◽

2021 ◽

pp. 73-86

Author(s):

Gregor Donaj ◽

Mirjam Sepesy Maučec

Keyword(s):

Natural Language Processing ◽

Speech Recognition ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Automatic Speech Recognition ◽

Slovene Language ◽

Error Classification ◽

Error Types

This article presents the challenges of natural language processing applications when they are used with inflectional languages. Two typical applications are presented: automatic speech recognition and machine translation. An overview of those applications and the properties of inflectional languages is given as well as examples from the highly inflectional Slovene language. Then, an error classification with examples is given, also with an emphasis on inflectional languages, as well as some directions for further research in this area.

A Computational Look at Oral History Archives

Journal on Computing and Cultural Heritage ◽

10.1145/3477605 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-16

Author(s):

Francisca Pessanha ◽

Almila Akdag Salah

Keyword(s):

Signal Processing ◽

Natural Language Processing ◽

Speech Recognition ◽

Natural Language ◽

Oral History ◽

Language Processing ◽

Automatic Speech Recognition ◽

Visual Cues ◽

Social Signal Processing ◽

Social Signal

Computational technologies have revolutionized the archival sciences field, prompting new approaches to process the extensive data in these collections. Automatic speech recognition and natural language processing create unique possibilities for analysis of oral history (OH) interviews, where otherwise the transcription and analysis of the full recording would be too time-consuming. However, many oral historians note the loss of aural information when converting the speech into text, pointing out the relevance of subjective cues for a full understanding of the interviewee narrative. In this article, we explore various computational technologies for social signal processing and their potential application space in OH archives, as well as neighboring domains where qualitative study is a frequently used method. We also highlight the latest developments in key technologies for multimedia archiving practices such as natural language processing and automatic speech recognition. We discuss the analysis of both visual cues (body language and facial expressions) and non-visual cues (paralinguistics, breathing, and heart rate), stating the specific challenges introduced by the characteristics of OH collections. We argue that applying social signal processing to OH archives will have a wider influence than solely OH practices, bringing benefits for various fields from humanities to computer sciences, as well as to archival sciences. Looking at human emotions and somatic reactions on extensive interview collections would give scholars from multiple fields the opportunity to focus on feelings, mood, culture, and subjective experiences expressed in these interviews on a larger scale.

Automatic Speech Recognition and Natural Language Understanding for Emotion Detection in Multi-party Conversations

Proceedings of the 1st International Workshop on Multimodal Conversational AI ◽

10.1145/3423325.3423737 ◽

2020 ◽

Author(s):

Ilja Popovic ◽

Dubravko Culibrk ◽

Milan Mirkovic ◽

Srdjan Vukmirovic

Keyword(s):

Speech Recognition ◽

Natural Language ◽

Automatic Speech Recognition ◽

Natural Language Understanding ◽

Emotion Detection ◽

Language Understanding

Detecting Linguistic Markers of Violent Extremism in Online Environments

Combating Violent Extremism and Radicalization in the Digital Era - Advances in Religious and Cultural Studies ◽

10.4018/978-1-5225-0156-5.ch018 ◽

2016 ◽

pp. 374-390 ◽

Cited By ~ 5

Author(s):

Fredrik Johansson ◽

Lisa Kaati ◽

Magnus Sahlgren

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Terrorist Attacks ◽

The Internet ◽

Written Text ◽

Linguistic Markers ◽

Geographical Regions ◽

Violent Extremism ◽

Online Environments

The ability to disseminate information instantaneously over vast geographical regions makes the Internet a key facilitator in the radicalisation process and preparations for terrorist attacks. This can be both an asset and a challenge for security agencies. One of the main challenges for security agencies is the sheer amount of information available on the Internet. It is impossible for human analysts to read through everything that is written online. In this chapter we will discuss the possibility of detecting violent extremism by identifying signs of warning behaviours in written text – what we call linguistic markers – using computers, or more specifically, natural language processing.

Speech Interface for Controlling Micro Air Vehicle

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.8695 ◽

2020 ◽

Vol 17 (1) ◽

pp. 488-491

Author(s):

P. Lakshmi ◽

S. Veena ◽

D. K. Rahul ◽

H. Lokesha

Keyword(s):

Natural Language ◽

Language Processing ◽

Automatic Speech Recognition ◽

Micro Air Vehicle ◽

The Other ◽

Ground Control ◽

Control Applications ◽

Speech Interface ◽

Air Vehicle ◽

Ground Control Station

This paper focuses on the development of a speech interface for controlling a Micro Air Vehicle (MAV). A speech interface in such control applications has two distinct modules: an Automatic Speech Recognition (ASR) module and a Natural Language Processing (NLP) module. The ASR module uses models built with the CMU Sphinx toolkit. The NLP scheme is proposed and developed using the Natural Language Toolkit (NLTK). Understanding the speech is very important in this kind of control application. The NLP outcome is used to invoke Ground Control Station (GCS) commands. The results are validated in the FlightGear simulator using the Mission Planner GCS configured for the MAV.
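The NLP step described in this abstract, turning an ASR transcript into a ground control command, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library rather than CMU Sphinx or NLTK, and the command names, keyword sets, and the `parse_command` function are hypothetical, since the abstract does not list the actual GCS commands.

```python
import re
from typing import Optional, Tuple

# Hypothetical command vocabulary (an assumption for illustration only;
# the abstract does not name the GCS commands the authors support).
COMMANDS = {
    "TAKEOFF": {"takeoff", "launch"},
    "LAND": {"land", "descend"},
    "RETURN_TO_LAUNCH": {"return", "home"},
}


def tokenize(transcript: str) -> list:
    """Lowercase the transcript and split it into word and number tokens."""
    return re.findall(r"[a-z]+|\d+", transcript.lower())


def parse_command(transcript: str) -> Tuple[Optional[str], Optional[int]]:
    """Map an ASR transcript to (command, numeric argument).

    The command is the first one whose keyword appears in the utterance;
    the argument is the first integer mentioned, if any.
    """
    tokens = tokenize(transcript)
    value = next((int(t) for t in tokens if t.isdigit()), None)
    for token in tokens:
        for name, keywords in COMMANDS.items():
            if token in keywords:
                return name, value
    # No known keyword was found: report no command and no argument.
    return None, None


if __name__ == "__main__":
    print(parse_command("Please launch to 20 meters"))
    print(parse_command("Land now"))
```

Keyword spotting is the simplest possible stand-in for the NLP module; a real interface would likely need grammar-based parsing and confirmation of low-confidence transcripts before any command is sent to a vehicle.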

The Concept of Integrating Artificial Intelligence Technologies Into Human Resources in a Digital Paradigm

Management of the personnel and intellectual resources in Russia ◽

10.12737/2305-7807-2020-5-9 ◽

2020 ◽

Vol 9 (2) ◽

pp. 5-9

Author(s):

Oksana Chulanova

Keyword(s):

Artificial Intelligence ◽

Computer Vision ◽

Natural Language Processing ◽

Decision Support ◽

Speech Recognition ◽

Human Resources ◽

Natural Language ◽

Language Processing

The article discusses the capabilities of artificial intelligence technologies, including natural language processing, intelligent decision support, computer vision, speech recognition and synthesis, and other promising methods of artificial intelligence. It presents the results of the author's study and analysis of these technologies and their capabilities for optimizing work with staff. On the basis of this study, the author developed a concept for integrating artificial intelligence technologies into work with personnel in the digital paradigm.

Automatic Classification of the Korean Triage Acuity Scale in Simulated Emergency Rooms Using Speech Recognition and Natural Language Processing: a Proof of Concept Study

Journal of Korean Medical Science ◽

10.3346/jkms.2021.36.e175 ◽

2021 ◽

Vol 36 (27) ◽

Author(s):

Dongkyun Kim ◽

Jaehoon Oh ◽

Heeju Im ◽

Myeongseong Yoon ◽

Jiwoo Park ◽

...

Keyword(s):

Natural Language Processing ◽

Speech Recognition ◽

Natural Language ◽

Language Processing ◽

Automatic Classification ◽

Proof Of Concept ◽

Emergency Rooms ◽

Concept Study

Speech recognition with feedback from natural language processing for adaptation of acoustic model

The Journal of the Acoustical Society of America ◽

10.1121/1.2832845 ◽

2008 ◽

Vol 123 (1) ◽

pp. 25

Author(s):

Hitoshi Honda

Keyword(s):

Natural Language Processing ◽

Speech Recognition ◽

Natural Language ◽

Language Processing ◽

Acoustic Model
