Understanding Aviation English: Challenges and Opportunities in NLP Applications for Indian Languages

Aviation English

English is understood, spoken and used by citizens of a diverse array of countries, and its speakers include both native and non-native speakers. Natural Language Processing (NLP) is a branch of computer science that deals with one of the most challenging tasks a machine can undertake: processing natural languages. Natural languages, which have evolved over centuries, are complete, diverse and highly complex, and are therefore difficult for a computer system to understand and process. Machine Translation (MT) is a more specific part of NLP that translates one natural language into another, English being one of the most researched and sought-after languages among them. Although research in NLP and MT has come a long way and many efficient translators are available, translation and other NLP applications in specialized domains such as aeronautics remain a challenge for NLP researchers and developers. NLP applications are often used in English language education, which is a continuous process for non-native speakers of English. Non-native speakers use NLP tools such as e-dictionaries and MT applications to understand English better and to learn it faster. Aviation English poses a challenge to MT systems, and understanding it as a whole requires specialized handling because it has its own phonetic pronunciations, terminology and out-of-vocabulary words. Dealing with Aviation English calls for collaboration among experts from applied linguistics, NLP and AI. As a result, it becomes a cross-disciplinary research area covering situations that demand real-time use of proper language, e.g. ATC communications. This paper discusses the most recent research methodologies that deal with Aviation English and reviews the problems it poses. Because Aviation English is a specialized and structured form of English, these problems are faced by both native and non-native speakers. We discuss relevant recent advances in methods for dealing with Aviation English challenges from both the human (ICAO/DGCA/AAI) and the NLP angle. Lastly, we look at how these challenges are linked to the scope for developing applied technologies. Research in experiential Aviation English situations deals with both English for Specific Purposes, ESP (aeronautics in our case), and English as a Foreign Language, EFL (the English-Indian language pair).

Predictive processes during simultaneous interpreting from German into English

Interpreting ◽

10.1075/intp.19.1.01hod ◽

2017 ◽

Vol 19 (1) ◽

pp. 1-20 ◽

Cited By ~ 3

Author(s):

Ena Hodzik ◽

John N. Williams

Keyword(s):

Language Processing ◽

English Language ◽

Native Speakers ◽

Transitional Probability ◽

Semantic Cues ◽

Simultaneous Interpreting ◽

Spoken Language Processing ◽

Advanced Students ◽

Predictive Processes

We report a study on prediction in shadowing and simultaneous interpreting (SI), both considered as forms of real-time, ‘online’ spoken language processing. The study comprised two experiments, focusing on: (i) shadowing of German head-final sentences by 20 advanced students of German, all native speakers of English; (ii) SI of the same sentences into English head-initial sentences by 22 advanced students of German, again native English speakers, and also by 11 trainee and practising interpreters. Latency times for input and production of the target verbs were measured. Drawing on studies of prediction in English-language reading production, we examined two cues to prediction in both experiments: contextual constraints (semantic cues in the context) and transitional probability (the statistical likelihood of words occurring together in the language concerned). While context affected prediction during both shadowing and SI, transitional probability appeared to favour prediction during shadowing but not during SI. This suggests that the two cues operate on different levels of language processing in SI.
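The transitional probability cue mentioned in this abstract can be illustrated with a minimal sketch. It estimates P(next word | current word) from adjacent word pairs in a corpus. The function name and the toy corpus are hypothetical and are not taken from the study, which may have computed the measure differently.

```python
from collections import Counter

def transitional_probabilities(tokens):
    """Estimate P(next | current) from adjacent word pairs in a token list."""
    # A word counts as a "current" word only when it has a successor,
    # so the final token is excluded from the denominator counts.
    first_counts = Counter(tokens[:-1])
    pair_counts = Counter(zip(tokens, tokens[1:]))
    return {
        (w1, w2): count / first_counts[w1]
        for (w1, w2), count in pair_counts.items()
    }

# Toy corpus, invented for illustration only.
corpus = "the pilot reads the checklist and the pilot calls the tower".split()
tp = transitional_probabilities(corpus)
print(tp[("the", "pilot")])
```

In this toy corpus "the" is followed by "pilot" in two of its four occurrences, so that pair receives a higher estimate than "the tower" or "the checklist".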

Proficiency Differences in Syntactic Processing of Monolingual Native Speakers Indexed by Event-related Potentials

Journal of Cognitive Neuroscience ◽

10.1162/jocn.2009.21393 ◽

2010 ◽

Vol 22 (12) ◽

pp. 2728-2744 ◽

Cited By ~ 95

Author(s):

Eric Pakulak ◽

Helen J. Neville

Keyword(s):

Language Proficiency ◽

Language Processing ◽

English Language ◽

Native Speakers ◽

Memory Span ◽

Wide Spectrum ◽

Event Related Potentials ◽

Related Potentials ◽

Proficiency Scores

Although anecdotally there appear to be differences in the way native speakers use and comprehend their native language, most empirical investigations of language processing study university students and none have studied differences in language proficiency, which may be independent of resource limitations such as working memory span. We examined differences in language proficiency in adult monolingual native speakers of English using an ERP paradigm. ERPs were recorded to insertion phrase structure violations in naturally spoken English sentences. Participants recruited from a wide spectrum of society were given standardized measures of English language proficiency, and two complementary ERP analyses were performed. In between-groups analyses, participants were divided on the basis of standardized proficiency scores into lower proficiency and higher proficiency groups. Compared with lower proficiency participants, higher proficiency participants showed an early anterior negativity that was more focal, both spatially and temporally, and a larger and more widely distributed positivity (P600) to violations. In correlational analyses, we used a wide spectrum of proficiency scores to examine the degree to which individual proficiency scores correlated with individual neural responses to syntactic violations in regions and time windows identified in the between-groups analyses. This approach also used partial correlation analyses to control for possible confounding variables. These analyses provided evidence for the effects of proficiency that converged with the between-groups analyses. These results suggest that adult monolingual native speakers of English who vary in language proficiency differ in the recruitment of syntactic processes that are hypothesized to be at least in part automatic as well as of those thought to be more controlled. 
These results also suggest that to fully characterize neural organization for language in native speakers it is necessary to include participants of varying proficiency.

Multi-Class Document Classification: Effective and Systematized Method to Categorize Documents

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset207117 ◽

2020 ◽

pp. 118-123 ◽

Cited By ~ 1

Author(s):

Kaushika Pal ◽

Biraj V. Patel

Keyword(s):

Machine Learning ◽

Natural Language ◽

Language Processing ◽

English Language ◽

Nearest Neighbor ◽

Research Work ◽

Support Vector ◽

Indian Languages ◽

K Nearest Neighbor

A large section of the World Wide Web is full of documents and content: data, big data, formatted and unformatted data, and unstructured and unorganized data. We need an information infrastructure that is useful and easily accessible as and when required. This research work combines Natural Language Processing and Machine Learning for content-based classification of documents. Natural Language Processing divides the problem of understanding an entire document at once into smaller chunks and yields only the useful tokens for feature extraction, a machine learning technique that creates the feature set used to train a classifier to predict the label of a new document and place it at the appropriate location. Machine Learning, a subset of Artificial Intelligence, offers sophisticated algorithms such as Support Vector Machine, K-Nearest Neighbor and Naïve Bayes, which work well for classifying content in many Indian languages and foreign languages. The model classifies documents with more than 70% accuracy for major Indian languages and more than 80% accuracy for English.
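The pipeline described in this abstract (tokenize, extract features, train a classifier, predict a label for a new document) can be sketched in a few lines. The sketch below is a toy multinomial Naïve Bayes classifier over bag-of-words counts, written with the standard library only. The class name, the whitespace tokenizer and the training sentences are assumptions made for illustration; they are not the authors' implementation, features or data.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real NLP preprocessing."""
    return text.lower().split()

class NaiveBayesClassifier:
    """Toy multinomial Naive Bayes over bag-of-words counts (illustrative only)."""

    def fit(self, documents, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(n_docs / total_docs)
            for token in tokenize(text):
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data, invented for illustration only.
train_docs = [
    "runway cleared for takeoff",
    "tower reports runway wind",
    "match ended with a late goal",
    "team scored a goal in the match",
]
train_labels = ["aviation", "aviation", "sports", "sports"]

model = NaiveBayesClassifier().fit(train_docs, train_labels)
predicted = model.predict("runway wind report")
print(predicted)
```

Add-one smoothing keeps an unseen word such as "report" from zeroing out a class score. A Support Vector Machine or K-Nearest Neighbor classifier, also named in the abstract, would consume the same kind of token features.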

Head Start teachers' verbal behaviors in classrooms consisting of primarily English Language Learners and native speakers of English

PsycEXTRA Dataset ◽

10.1037/e584752012-028 ◽

2010 ◽

Author(s):

Hope Gerde ◽

Karen Diamond ◽

Marci Hanson

Keyword(s):

English Language Learners ◽

Head Start ◽

Language Learners ◽

English Language ◽

Native Speakers ◽

Head Start Teachers ◽

Verbal Behaviors ◽

Native Speakers Of English

A Hindi Image Caption Generation Framework Using Deep Learning

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3432246 ◽

2021 ◽

Vol 20 (2) ◽

pp. 1-19

Author(s):

Santosh Kumar Mishra ◽

Rijul Dhir ◽

Sriparna Saha ◽

Pushpak Bhattacharyya

Keyword(s):

Computer Vision ◽

Natural Language ◽

Language Processing ◽

English Language ◽

Image Captioning ◽

Textual Description ◽

Proposed Model ◽

Hindi Language

Image captioning is the process of generating a textual description of an image that aims to describe the salient parts of the given image. It is an important problem, as it involves computer vision and natural language processing, where computer vision is used for understanding images, and natural language processing is used for language modeling. Much work has been done on image captioning for the English language. In this article, we have developed a model for image captioning in the Hindi language. Hindi is the official language of India, and it is the fourth most spoken language in the world, spoken in India and South Asia. To the best of our knowledge, this is the first attempt to generate image captions in the Hindi language. A dataset is manually created by translating the well-known MSCOCO dataset from English to Hindi. Finally, different types of attention-based architectures are developed for image captioning in the Hindi language. These attention mechanisms are new for the Hindi language, as they have not previously been used for it. The obtained results of the proposed model are compared with several baselines in terms of BLEU scores, and the results show that our model performs better than others. Manual evaluation of the obtained captions in terms of adequacy and fluency also reveals the effectiveness of our proposed approach. Availability of resources: The code for the article is available at https://github.com/santosh1821cs03/Image_Captioning_Hindi_Language ; the dataset will be made available at http://www.iitp.ac.in/~ai-nlp-ml/resources.html .
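The BLEU scores used for comparison in this abstract are built on clipped (modified) n-gram precision between a generated caption and a reference caption. The following is a minimal sketch of that one component for a single reference. The function names and the example sentences are hypothetical, English is used only for readability, and full BLEU additionally combines several n-gram orders with a brevity penalty, which this sketch omits.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    if not cand_counts:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference, which penalizes repeated words.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

# Toy captions, invented for illustration only.
candidate = "a man is riding a horse".split()
reference = "a man rides a horse".split()
p1 = modified_precision(candidate, reference, 1)
p2 = modified_precision(candidate, reference, 2)
print(round(p1, 3), round(p2, 3))
```

Here four of the six candidate unigrams and two of the five candidate bigrams are matched in the reference, so the bigram precision is lower than the unigram precision.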

Formalising Natural Languages: Applications to Natural Language Processing and Digital Humanities

10.1007/978-3-030-70629-6 ◽

2021 ◽

Keyword(s):

Natural Language ◽

Language Processing ◽

Digital Humanities ◽

Natural Languages

Natural Language Processing by Enhanced Honey Encryption Technique

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l1048.10812s19 ◽

2019 ◽

Vol 8 (12S) ◽

pp. 159-163

Keyword(s):

Natural Language ◽

Language Processing ◽

Cyber Attacks ◽

Binary Form ◽

Brute Force ◽

Natural Languages ◽

Cipher Text ◽

Binary Strings

Traditional encryption systems and techniques have always been vulnerable to brute-force cyber-attacks. This is due to the byte encoding of characters, such as UTF-8 or ASCII. An opponent who intercepts a ciphertext and attempts to decrypt it by brute force with a wrong key can recognize failed decryptions, because they yield a mixture of symbols that is not uniformly distributed and carries no meaning. The honey encoding technique has been suggested to curb this classical weakness by producing ciphertexts that yield correct-looking, evenly distributed but false plaintexts when decrypted with a wrong key. This technique is only suitable for passkeys and PINs. Adapting it to encode natural-language texts, such as e-mails and human-generated records, has remained an open problem. Existing schemes that extend it to natural-language messages expose fragments of the plaintext embedded in the coded data, and are therefore more prone to ciphertext attacks. In this paper, an amended honey encoding system is proposed to support natural-language message encryption. The main aim was to create a framework that encrypts a signal fully in binary form. As a result, most binary strings semantically generate plausible texts to trick an opponent who tries to decrypt the ciphertext with a wrong key. The security of the suggested system is assessed.

English as a Lingua Franca Approach: Implications and Limitations in Saudi English as a Foreign Language Classrooms

Advances in Social Sciences Research Journal ◽

10.14738/assrj.812.11355 ◽

2021 ◽

Vol 8 (12) ◽

pp. 96-104

Author(s):

Samar Alharbi

Keyword(s):

Foreign Language ◽

English Language ◽

Native Speakers ◽

Lingua Franca ◽

English As Foreign Language ◽

The World ◽

Language Classrooms ◽

Foreign Language Classrooms ◽

Global Language

English is considered a global language, spoken by a majority of people around the world. It is used mainly for communication, trade and study. The widespread use of English has led to different varieties of English as a lingua franca (ELF), which means that non-native speakers of English are still able to communicate with each other. The use of ELF as a legitimate variety of English in language classrooms is questioned by some researchers. This paper provides an overview of the concept of ELF. It also presents implications and limitations of using ELF in Saudi English as a foreign language classrooms.

Teaching English in India —The Use of Technologically Enhanced Realia in the Classroom

Studies in English Language Teaching ◽

10.22158/selt.v2n2p149 ◽

2014 ◽

Vol 2 (2) ◽

pp. 149

Author(s):

Priya K. Nair

Keyword(s):

Foreign Language ◽

English Language ◽

Native Speakers ◽

Mother Tongue ◽

Teaching English ◽

Job Market ◽

Listening And Speaking ◽

Speaking Skills

In India, acquisition of the English language is imperative if one wants to sell oneself in the increasingly competitive job market. With a booming population, the nation is filled with educated, technologically literate youth. English is not merely a foreign language in India. As India is separated by a plethora of languages, knowledge of English is imperative. As the teachers in India are not native speakers of English, the language they teach is not free from errors. Articulation is quite problematic, as the mother tongue influence is quite pronounced. Technology helps to reduce these errors. Movies as a tool can enhance the listening and speaking skills of our students. It is quite boring to work with disembodied voices, and the recorded conversations available in language labs do not sustain the learner's interest. Learners are often forced to listen to recorded conversations of people they never see; the conversation is often stilted and contemporary idiom is hardly used. However, a completely new dimension to aural practice can be added in the classroom by using movies.

Syntactic Complexity in EFL and Native Learners' Undergraduate Thesis Abstracts

Journal of English Language and Culture ◽

10.30813/jelc.v9i1.1450 ◽

2018 ◽

Vol 9 (1) ◽

Author(s):

Murniati Murniati

Keyword(s):

English Language ◽

Native Speakers ◽

The United States ◽

Syntactic Complexity ◽

Syntactic Structures ◽

Complex Sentences ◽

Academic Texts ◽

Academic Year ◽

Undergraduate Thesis

This research aims to examine the syntactic complexity of undergraduate thesis abstracts written by university learners in Indonesia and of those written by native speakers of English. The characteristics of the syntactic complexity produced by Indonesian learners and by native-speaker learners are also analyzed. It is possible to extend the types of syntactic complexity found in academic texts. In the end, those extensions should characterize the English used by Indonesian learners. The data were obtained by downloading undergraduate thesis abstracts from the 2015-2016 academic year from the UBM English Department alumni database. The abstracts written by native speakers were downloaded from reputable universities in the United States of America. The data were then analyzed using the syntactic analyzer by Lu & Ai (2015). The results show that the Indonesian learners tend to write more complex sentences and use subordination in their abstracts. The native speakers, on the other hand, tend to write longer sentences with longer T-units and clauses. They also tend to use complex nominals in their abstracts. The amount of coordination is similar in the abstracts written by Indonesian learners and by native speakers of English. Keywords: syntactic complexity, syntactic structures, undergraduate thesis, Indonesian learners