Automatic Call Sign Detection: Matching Air Surveillance Data with Air Traffic Spoken Communications

Proceedings ◽  
2020 ◽  
Vol 59 (1) ◽  
pp. 14
Author(s):  
Juan Zuluaga-Gomez ◽  
Karel Veselý ◽  
Alexander Blatt ◽  
Petr Motlicek ◽  
Dietrich Klakow ◽  
...  

Voice communication is the main channel for exchanging information between pilots and Air-Traffic Controllers (ATCos). Recently, several projects have explored the use of speech recognition technology to automatically extract spoken key information such as call signs, commands, and values, which can be used to reduce ATCos’ workload and increase performance and safety in Air-Traffic Control (ATC)-related activities. Nevertheless, collecting ATC speech data is demanding, expensive, and constrained by the intrinsic characteristics of the recorded speakers. As a solution, this paper presents ATCO2, a project that aims to develop a unique platform to collect, organize, and pre-process ATC data gathered from the airspace. Initially, the data are captured directly from publicly accessible radio frequency channels with VHF receivers and LiveATC, which can be considered an “unlimited” source of low-quality data. The ATCO2 project explores the use of context information such as radar and air surveillance data (collected with ADS-B and Mode S) from the OpenSky Network (OSN) to correlate call signs automatically extracted from voice communication with those available from ADS-B channels, thereby increasing the overall call sign detection rate. More specifically, the timestamp and location of the spoken command (issued by voice by the ATCo) are extracted, and a query is sent to the OSN server to retrieve the call sign tags, in ICAO format, of the airplanes in the given area. A word sequence produced by an automatic speech recognition system is then fed into a Natural Language Processing (NLP) based module together with the set of call signs available from the ADS-B channels, and the NLP module extracts the call sign, command, and command arguments from the spoken utterance.
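The correlation step can be pictured with a short sketch. The snippet below queries the OpenSky Network's public /states/all REST endpoint for call signs currently broadcast in a bounding box and ranks them against an ASR hypothesis; the endpoint and JSON layout are OpenSky's documented API, but the spoken-digit normalization, difflib scoring, and bounding-box coordinates are illustrative stand-ins for the ATCO2 pipeline, not the project's actual NLP module.

    # Hypothetical sketch: retrieve call signs active in an airspace sector from
    # the OpenSky Network REST API and pick the one closest to an ASR hypothesis.
    import difflib
    import requests

    SPOKEN_DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3",
                     "four": "4", "five": "5", "six": "6", "seven": "7",
                     "eight": "8", "nine": "9", "niner": "9"}

    def callsigns_in_area(lamin, lomin, lamax, lomax):
        """Return ICAO call signs currently broadcast via ADS-B in a bounding box."""
        resp = requests.get("https://opensky-network.org/api/states/all",
                            params={"lamin": lamin, "lomin": lomin,
                                    "lamax": lamax, "lomax": lomax},
                            timeout=10)
        resp.raise_for_status()
        states = resp.json().get("states") or []
        # Field 1 of each OpenSky state vector is the (space-padded) call sign.
        return {s[1].strip() for s in states if s[1] and s[1].strip()}

    def best_callsign(asr_text, candidates):
        """Rank ADS-B call signs by similarity to the start of the ASR hypothesis."""
        tokens = [SPOKEN_DIGITS.get(t, t) for t in asr_text.lower().split()]
        collapsed = "".join(tokens).upper()   # "swiss four seven two ..." -> "SWISS472..."
        def score(c):
            prefix = collapsed[:len(c) + 3]   # the call sign is spoken first
            return difflib.SequenceMatcher(None, c, prefix).ratio()
        return max(candidates, key=score, default=None)

    if __name__ == "__main__":
        # Rough bounding box around Zurich (invented coordinates for illustration).
        active = callsigns_in_area(47.2, 8.2, 47.7, 8.8)
        print(best_callsign("swiss four seven two descend flight level eight zero",
                            active))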

Author(s):  
Zhe Sun ◽  
Pingbo Tang

Losses of separation (LoS) are breaches of regulations that specify the minimum distance between aircraft in controlled airspace. Erroneous communications between air traffic controllers (ATCs) and pilots are leading contributors to LoS and result in an elevated risk of fatal accidents. An air traffic control system that could identify communication errors promptly would therefore be advantageous. Establishing such a system requires a systematic characterization of communication errors that reveals how various communication arrangements and errors influence the development of LoS. Such know-how could guide ATCs and pilots in identifying the parts of their communication processes and content that most influence the occurrence of LoS. Existing studies of LoS focus on simulating aircraft operation processes, with little quantitative analysis of how communication issues arise and lead to elevated LoS risk. This paper presents a method for supporting automatic communication error detection through the integrated use of speech recognition, text analysis, and formal modeling of airport operational processes. The proposed method focuses on identifying communication features to guide the detection of vulnerable communications; characterizing communication errors; and building Bayesian Network models that predict communication errors and LoS from features derived from ATC–pilot communications. Major findings show that incorrect read-backs by pilots are highly correlated with a majority of LoS. The results indicate that the proposed method could form a basis for automating communication error detection and preventing LoS. The integrated Automatic Speech Recognition and Natural Language Processing functions may be incorporated into existing aviation applications for real-time ATC–pilot communication monitoring and preventive LoS control.
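As a rough illustration of the Bayesian Network component, the sketch below wires two binary communication features into a LoS node and queries the posterior LoS risk given an incorrect read-back. The network structure and every probability are invented placeholders, not the paper's learned parameters, and pgmpy is one library choice among many.

    # Hypothetical sketch of a Bayesian Network predicting loss-of-separation
    # (LoS) risk from communication features. All numbers are illustrative.
    from pgmpy.models import BayesianNetwork
    from pgmpy.factors.discrete import TabularCPD
    from pgmpy.inference import VariableElimination

    model = BayesianNetwork([("ReadbackError", "LoS"), ("Congestion", "LoS")])

    cpd_rb = TabularCPD("ReadbackError", 2, [[0.9], [0.1]])   # P(read-back error) = 0.1
    cpd_cg = TabularCPD("Congestion", 2, [[0.7], [0.3]])      # P(busy sector) = 0.3
    cpd_los = TabularCPD(
        "LoS", 2,
        # Columns: (rb=0,cg=0), (rb=0,cg=1), (rb=1,cg=0), (rb=1,cg=1)
        [[0.995, 0.98, 0.90, 0.75],   # P(LoS = no | parents)
         [0.005, 0.02, 0.10, 0.25]],  # P(LoS = yes | parents)
        evidence=["ReadbackError", "Congestion"],
        evidence_card=[2, 2],
    )
    model.add_cpds(cpd_rb, cpd_cg, cpd_los)
    assert model.check_model()

    infer = VariableElimination(model)
    # Posterior LoS distribution after observing an incorrect read-back.
    print(infer.query(["LoS"], evidence={"ReadbackError": 1}))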


2021 ◽  
Vol 7 ◽  
pp. 205520762110021
Author(s):  
Catherine Diaz-Asper ◽  
Chelsea Chandler ◽  
R Scott Turner ◽  
Brigid Reynolds ◽  
Brita Elvevåg

Objective: There is a critical need to develop rapid, inexpensive, and easily accessible screening tools for mild cognitive impairment (MCI) and Alzheimer’s disease (AD). We report on the efficacy of collecting speech via the telephone and leveraging natural language processing methods to develop sensitive metrics that may serve as potential biomarkers. Methods: Ninety-one older individuals who were cognitively unimpaired or diagnosed with MCI or AD participated from home in an audio-recorded telephone interview, which included a standard cognitive screening tool and the collection of speech samples. In this paper we address six questions of interest: (1) Will elderly people agree to participate in a recorded telephone interview? (2) Will they complete it? (3) Will they judge it an acceptable approach? (4) Will the speech collected over the telephone be of good quality? (5) Will the speech be intelligible to human raters? (6) Will transcriptions produced by automated speech recognition accurately reflect the speech produced? Results: Participants readily agreed to participate in the telephone interview, completed it in its entirety, and rated the approach as acceptable. The recorded speech was of sufficient quality for further analyses, and almost all recorded words were intelligible for human transcription. Not surprisingly, human transcription outperformed off-the-shelf automated speech recognition software, but further investigation suggests that automated speech recognition holds promise for future use. Conclusion: Our findings demonstrate that collecting speech samples from elderly individuals via the telephone is well tolerated, practical, and inexpensive, and produces good-quality data for uses such as natural language processing.
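A word-level accuracy metric of the kind implied by question (6) takes only a few lines to compute. The sketch below scores an ASR hypothesis against a human reference transcript using word error rate (WER); the jiwer package is one convenient implementation, and the sample sentences are invented for illustration.

    # Minimal sketch: word error rate of an ASR hypothesis vs. a human transcript.
    from jiwer import wer

    reference = "she walked to the store and bought some bread"    # human transcript
    hypothesis = "she walked to the store and brought some bread"  # ASR output

    error_rate = wer(reference, hypothesis)
    print(f"WER: {error_rate:.2%}")  # one substitution over nine words, about 11%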


Author(s):  
Sandeep Badrinath ◽  
Hamsa Balakrishnan

A significant fraction of the communication between air traffic controllers and pilots takes place through speech, via radio channels. Automatic transcription of air traffic control (ATC) communications has the potential to improve system safety, operational performance, and conformance monitoring, and to enhance air traffic controller training. We present an automatic speech recognition model tailored to the ATC domain that can transcribe ATC voice to text. The transcribed text is used to extract operational information such as call signs and runway numbers. The models are based on recent improvements in machine learning techniques for speech recognition and natural language processing. We evaluate the performance of the model on diverse datasets.
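As a toy illustration of the extraction step (not the authors' learned models), the sketch below pulls a call sign and a runway number out of a transcript with hand-written patterns; the airline-word list and regular expressions are invented for the example.

    # Hypothetical sketch: rule-based extraction of a call sign and runway number
    # from an ATC transcript. Real systems learn this; these regexes are a toy.
    import re

    DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
              "five": "5", "six": "6", "seven": "7", "eight": "8",
              "nine": "9", "niner": "9"}

    def words_to_digits(text):
        """Replace spoken digit words with numerals."""
        return " ".join(DIGITS.get(w, w) for w in text.lower().split())

    def extract(transcript):
        t = words_to_digits(transcript)
        # Call sign: an airline telephony word followed by digits, e.g. "delta 4 6 2".
        cs = re.search(r"\b(delta|united|speedbird|lufthansa)\s((?:\d\s?)+)", t)
        # Runway: "runway" plus two digits and an optional left/right/center suffix.
        rw = re.search(r"\brunway\s(\d)\s?(\d)\s?(left|right|center)?", t)
        callsign = (cs.group(1) + cs.group(2).replace(" ", "")) if cs else None
        runway = (rw.group(1) + rw.group(2) +
                  {"left": "L", "right": "R", "center": "C"}.get(rw.group(3), "")
                  ) if rw else None
        return callsign, runway

    print(extract("delta four six two cleared to land runway two two left"))
    # -> ('delta462', '22L')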


2021 ◽  
Vol 11 (1) ◽  
pp. 428
Author(s):  
Donghoon Oh ◽  
Jeong-Sik Park ◽  
Ji-Hwan Kim ◽  
Gil-Jin Jang

Speech recognition consists of converting input sound into a sequence of phonemes and then finding text for the input using language models. Phoneme classification performance is therefore a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics remains a challenging problem even for state-of-the-art classification methods, and classification errors are hard to recover from in the subsequent language-processing steps. This paper proposes a hierarchical phoneme clustering method that applies recognition models better suited to different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using the automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. In a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% for the baseline and 71.7% for the proposed hierarchical models, a 2.2% overall improvement.
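The clustering step can be sketched as follows: treat mutual confusion between phonemes as similarity and cluster agglomeratively. The 4-phoneme confusion matrix below is a toy example, not TIMIT statistics, and the SciPy calls stand in for whatever clustering procedure the authors actually used.

    # Sketch: derive phoneme groups from a baseline model's confusion matrix by
    # treating mutual confusion as similarity, then clustering agglomeratively.
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    phonemes = ["s", "z", "p", "b"]
    # confusion[i, j] = how often phoneme i was recognized as phoneme j (toy counts).
    confusion = np.array([[90,  8,  1,  1],
                          [10, 85,  2,  3],
                          [ 1,  2, 88,  9],
                          [ 2,  1, 12, 85]], dtype=float)

    rates = confusion / confusion.sum(axis=1, keepdims=True)  # row-normalize
    similarity = (rates + rates.T) / 2     # symmetric mutual-confusion rate
    distance = 1.0 - similarity
    np.fill_diagonal(distance, 0.0)

    # Agglomerative clustering; cut the dendrogram into two phoneme groups.
    groups = fcluster(linkage(squareform(distance), method="average"),
                      t=2, criterion="maxclust")
    for p, g in zip(phonemes, groups):
        print(p, "-> group", g)   # {s, z} and {p, b} end up in separate groups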


2016 ◽  
Vol 150 (4) ◽  
pp. S61
Author(s):  
Jennifer Nayor ◽  
Sergey Goryachev ◽  
Vivian S. Gainer ◽  
John R. Saltzman
