An Evaluation Of A Linguistically Motivated Conversational Software Agent Framework

This paper presents a critical evaluation framework for a linguistically motivated conversational software agent (CSA). The CSA prototype investigates the integration, intersection and interface of the language, knowledge, and speech act constructions (SAC) based on a grammatical object, and the sub-model of beliefs, desires and intentions (BDI) and dialogue management (DM) for natural language processing (NLP). A long-standing issue within NLP CSA systems is refining the accuracy of interpretation to provide realistic dialogue to support human-to-computer communication. This prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory – Role and Reference Grammar (RRG), (2) an Agent Cognitive Model with two inner models: (a) a knowledge representation model, (b) a planning model underpinned by BDI concepts, intentionality and rational interaction, and (3) a dialogue model. The evaluation strategy for this Java-based prototype takes multiple approaches, driven by grammatical testing (English language utterances), software engineering and agent practice. A set of evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection and integration of all phase models. The empirical evaluations demonstrate that the CSA is a proof-of-concept, demonstrating RRG’s fitness for purpose for describing and explaining phenomena, language processing and knowledge, and computational adequacy. By contrast, the evaluations identify the complexity of lower-level computational mappings of NL to agent to ontology, with semantic gaps that are further addressed by a lexical bridging solution.

Chaitanya

10.1093/oso/9780199493838.001.0001 ◽

2019 ◽

Author(s):

Amiya P. Sen

Keyword(s):

Graduate Students ◽

English Language ◽

Critical Evaluation ◽

Significant Part ◽

Dynamic Relationship ◽

Iconic Figure

This is a short yet critical biography of a major religious figure from Hindu Bengal, Krishna Chaitanya (1486–1533), based on extant hagiographical sources from medieval Bengal as also recent scholarly studies. It relies on both Bengali and English language sources, creating a dialogic and dynamic relationship between the two. The book primarily addresses graduate students and interested general readers in an easily accessible and intelligible manner, without taking recourse to copious notes and citations. The intention of this project was to produce a narrative that was both gripping and enjoyable. However, there is also ample material in this book that will interest and motivate the researcher as well. A significant part of this work is a critical evaluation of just how Chaitanya has been perceived and understood after his time, particularly in colonial Bengal where he has come to assume the place of an iconic figure. Interested readers will find the painstakingly compiled appendices quite useful.

Predictive processes during simultaneous interpreting from German into English

Interpreting ◽

10.1075/intp.19.1.01hod ◽

2017 ◽

Vol 19 (1) ◽

pp. 1-20 ◽

Cited By ~ 3

Author(s):

Ena Hodzik ◽

John N. Williams

Keyword(s):

Language Processing ◽

English Language ◽

Native Speakers ◽

Transitional Probability ◽

Semantic Cues ◽

Simultaneous Interpreting ◽

Spoken Language Processing ◽

Advanced Students ◽

Native Speakers Of English ◽

Predictive Processes

We report a study on prediction in shadowing and simultaneous interpreting (SI), both considered as forms of real-time, ‘online’ spoken language processing. The study comprised two experiments, focusing on: (i) shadowing of German head-final sentences by 20 advanced students of German, all native speakers of English; (ii) SI of the same sentences into English head-initial sentences by 22 advanced students of German, again native English speakers, and also by 11 trainee and practising interpreters. Latency times for input and production of the target verbs were measured. Drawing on studies of prediction in English-language reading production, we examined two cues to prediction in both experiments: contextual constraints (semantic cues in the context) and transitional probability (the statistical likelihood of words occurring together in the language concerned). While context affected prediction during both shadowing and SI, transitional probability appeared to favour prediction during shadowing but not during SI. This suggests that the two cues operate on different levels of language processing in SI.
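Transitional probability, described above as the statistical likelihood of words occurring together, is commonly estimated as the conditional probability of a word given the word before it. The following is a minimal sketch of that estimate from bigram counts; the function name and the toy corpus are hypothetical illustrations and are not taken from the study's materials.

```python
from collections import Counter


def transitional_probabilities(tokens):
    """Estimate P(next word | current word) from bigram counts."""
    # Count each word in first position of a bigram (the last token has no successor).
    first_counts = Counter(tokens[:-1])
    # Count adjacent word pairs.
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))
    return {
        (w1, w2): count / first_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }


# Toy corpus (hypothetical, for illustration only).
corpus = "the dog barks the dog sleeps the cat sleeps".split()
tp = transitional_probabilities(corpus)
print(tp[("the", "dog")])   # "the" is followed by "dog" in 2 of its 3 occurrences
```

A high value for a pair such as ("the", "dog") means the second word is statistically likely after the first, independently of any wider semantic context.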

A Hindi Image Caption Generation Framework Using Deep Learning

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3432246 ◽

2021 ◽

Vol 20 (2) ◽

pp. 1-19

Author(s):

Santosh Kumar Mishra ◽

Rijul Dhir ◽

Sriparna Saha ◽

Pushpak Bhattacharyya

Keyword(s):

Computer Vision ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

English Language ◽

Image Captioning ◽

Textual Description ◽

Proposed Model ◽

Hindi Language ◽

The Given

Image captioning is the process of generating a textual description of an image that aims to describe the salient parts of the given image. It is an important problem, as it involves computer vision and natural language processing, where computer vision is used for understanding images, and natural language processing is used for language modeling. Much work has been done on image captioning for the English language. In this article, we have developed a model for image captioning in the Hindi language. Hindi is the official language of India, and it is the fourth most spoken language in the world, spoken in India and South Asia. To the best of our knowledge, this is the first attempt to generate image captions in the Hindi language. A dataset is manually created by translating the well-known MSCOCO dataset from English to Hindi. Finally, different types of attention-based architectures are developed for image captioning in the Hindi language. These attention mechanisms are new for the Hindi language, as they have never been used for it before. The obtained results of the proposed model are compared with several baselines in terms of BLEU scores, and the results show that our model performs better than others. Manual evaluation of the obtained captions in terms of adequacy and fluency also reveals the effectiveness of our proposed approach. Availability of resources: The code for the article is available at https://github.com/santosh1821cs03/Image_Captioning_Hindi_Language ; the dataset will be made available at http://www.iitp.ac.in/~ai-nlp-ml/resources.html .
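BLEU, the metric used above to compare captioning models, combines clipped n-gram precision with a brevity penalty. The sketch below is a simplified single-reference, sentence-level variant restricted to unigrams and bigrams, written only to illustrate the idea; it is not the authors' evaluation code, and the function names and example captions are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=2):
    """Geometric mean of clipped precisions times a brevity penalty (simplified)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Penalise candidates that are shorter than the reference.
    if len(candidate) > len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(log_mean)


# Hypothetical captions, tokenised on whitespace.
reference = "a dog is running in the park".split()
candidate = "a dog runs in the park".split()
score = simple_bleu(candidate, reference)
print(round(score, 4))
```

Published BLEU results normally use up to 4-grams, multiple references and corpus-level aggregation, so scores from this sketch are not comparable with reported figures.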

Measuring second language proficiency with EEG synchronization: how functional cortical networks and hemispheric involvement differ as a function of proficiency level in second language speakers

Second Language Research ◽

10.1177/0267658308098997 ◽

2009 ◽

Vol 25 (1) ◽

pp. 77-106 ◽

Cited By ~ 20

Author(s):

Susanne Reiterer ◽

Ernesto Pereda ◽

Joydeep Bhattacharya

Keyword(s):

Second Language ◽

Language Learners ◽

Language Processing ◽

English Language ◽

Second Language Learners ◽

Right Hemisphere ◽

Brain Activation ◽

Frequency Range ◽

Second Language Processing ◽

Language Students

This article examines the question of whether university-based high-level foreign language and linguistic training can influence brain activation and whether different L2 proficiency groups have different brain activation in terms of lateralization and hemispheric involvement. The traditional and prevailing theory of hemispheric involvement in bilingual language processing states that bilingual and second language processing is always at least in some form connected to the right hemisphere (RH), when compared to monolingual first language processing, the classical left-hemispheric language-processing domain. A widely held specification of this traditional theory claims that especially bilinguals or second language learners in their initial phases and/or bilinguals with poor fluency and less experience rely more on RH areas when processing their L2. We investigated this neurolinguistic hypothesis with differently proficient Austrian learners of English as a second language. Two groups of L2 speakers (all Austrian German native speakers), differing in their L2 (English) language performance, were recorded on electroencephalography (EEG) during the processing of spoken English language. A short comprehension interview followed each task. The ‘high proficiency group’ consisted of English language students who were about to complete their master's degree for English language and linguistics, while the ‘low proficiency group’ was composed of non-language students who had only school level performance and less training in English. The age of onset of L2 learning was kept constant: 9 years for both groups. To look for cooperative network activity in the brain, EEG coherence and synchronization measures were analysed for a high EEG frequency range (gamma band). Results showed the most significant group differences in synchronization patterns within the lower gamma frequency range, with more RH involvement (extensive right-hemisphere networks) for the low proficiency group, especially when processing their L2. The results can be interpreted in favour of RH theories of second language processing since, once again, we found evidence of more RH involvement in (late) second language learners with less experience and less training in the L2. The study shows that second language training (and resulting proficiency) and/or differences in ability or state of linguistic alertness can be made visible by brain imaging using newly developed EEG-synchronization techniques as a measure.

Proficiency Differences in Syntactic Processing of Monolingual Native Speakers Indexed by Event-related Potentials

Journal of Cognitive Neuroscience ◽

10.1162/jocn.2009.21393 ◽

2010 ◽

Vol 22 (12) ◽

pp. 2728-2744 ◽

Cited By ~ 95

Author(s):

Eric Pakulak ◽

Helen J. Neville

Keyword(s):

Language Proficiency ◽

Language Processing ◽

English Language ◽

Native Speakers ◽

Memory Span ◽

Wide Spectrum ◽

Event Related Potentials ◽

Related Potentials ◽

Native Speakers Of English ◽

Proficiency Scores

Although anecdotally there appear to be differences in the way native speakers use and comprehend their native language, most empirical investigations of language processing study university students and none have studied differences in language proficiency, which may be independent of resource limitations such as working memory span. We examined differences in language proficiency in adult monolingual native speakers of English using an ERP paradigm. ERPs were recorded to insertion phrase structure violations in naturally spoken English sentences. Participants recruited from a wide spectrum of society were given standardized measures of English language proficiency, and two complementary ERP analyses were performed. In between-groups analyses, participants were divided on the basis of standardized proficiency scores into lower proficiency and higher proficiency groups. Compared with lower proficiency participants, higher proficiency participants showed an early anterior negativity that was more focal, both spatially and temporally, and a larger and more widely distributed positivity (P600) to violations. In correlational analyses, we used a wide spectrum of proficiency scores to examine the degree to which individual proficiency scores correlated with individual neural responses to syntactic violations in regions and time windows identified in the between-groups analyses. This approach also used partial correlation analyses to control for possible confounding variables. These analyses provided evidence for the effects of proficiency that converged with the between-groups analyses. These results suggest that adult monolingual native speakers of English who vary in language proficiency differ in the recruitment of syntactic processes that are hypothesized to be at least in part automatic as well as of those thought to be more controlled. These results also suggest that to fully characterize neural organization for language in native speakers it is necessary to include participants of varying proficiency.
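The partial correlation analyses mentioned above measure the association between two variables after removing the linear influence of a third. As a minimal sketch, the first-order partial correlation can be computed from the three pairwise Pearson correlations; the variable roles (proficiency score, neural response, a confound such as working memory span) and the toy numbers below are assumptions for illustration, not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(x, y, z):
    """Correlation of x and y with the linear effect of z partialled out."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy values (hypothetical): x = proficiency, y = neural response, z = confound.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
z = [1, -1, -1, 1]
r_partial = partial_correlation(x, y, z)
print(r_partial)
```

In this toy example z is uncorrelated with both x and y, so the partial correlation equals the plain correlation between x and y; with a real confound the two values would generally differ.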

Context and Culture in Evaluation: A Case Study of Evaluation Anthropology

Evaluation Journal of Australasia ◽

10.1177/1035719x1701700105 ◽

2017 ◽

Vol 17 (1) ◽

pp. 30-38

Author(s):

Elise Howard

Keyword(s):

School Readiness ◽

Cultural Difference ◽

English Language ◽

Evaluation Framework ◽

Cultural Bias ◽

Culturally Appropriate ◽

Evaluation Practice ◽

Cultural Aspects ◽

Poverty And Inequality

Designing programs to address poverty and inequality for Australian Aboriginal communities over recent decades has proved problematic. There is a need for greater consideration of different cultural perspectives. A culturally appropriate evaluation framework can provide a range of strategies to embrace cultural difference. Evaluation anthropology, one of many culturally appropriate approaches, emphasises understanding of socio-cultural environments and contexts, and reflective practice to draw attention to cultural bias. This paper will define evaluation anthropology and then reflect on its usefulness in establishing an evaluation framework for a preliteracy program located in a remote Aboriginal community in Australia. The aims of the program are to improve school readiness through developing preliteracy (English language) skills in children aged 0-3 years. Developing an evaluation framework for the program required an approach that accounted for the socio-cultural aspects of literacy development. The lessons from this case study demonstrate the need for place-specific theory to inform program design and evaluation practice.

Automatically Representing TExt Meaning via an Interlingua-based System (ARTEMIS). A further step towards the computational representation of RRG

Journal of Computer-Assisted Linguistic Research ◽

10.4995/jclr.2017.7788 ◽

2017 ◽

Vol 1 (1) ◽

pp. 61 ◽

Cited By ~ 1

Author(s):

Ricardo Mairal-Usón ◽

Francisco Cortés-Rodríguez

Keyword(s):

Language Processing ◽

Semantic Representation ◽

Analysis Data ◽

Logical Structure ◽

Automatic Generation ◽

Role And Reference Grammar ◽

Reference Grammar ◽

Level 1 ◽

Computational Resources ◽

Computational Representation

Within the framework of FUNK Lab, a virtual laboratory for natural language processing inspired by a functionally oriented linguistic theory, Role and Reference Grammar, a number of computational resources have been built that deal with different aspects of language and have applications in different scientific domains, e.g. terminology, lexicography, sentiment analysis, document classification, text analysis and data mining. One of these resources is ARTEMIS (Automatically Representing TExt Meaning via an Interlingua-Based System), which departs from the pioneering work of Periñán-Pascual (2013) and Periñán-Pascual & Arcas (2014). This computational tool is a proof-of-concept prototype which allows the automatic generation of a conceptual logical structure (CLS) (cf. Mairal-Usón, Periñán-Pascual and Pérez 2012; Van Valin and Mairal-Usón 2014), that is, a fully specified semantic representation of an input text on the basis of a reduced sample of sentences. The primary aim of this paper is to develop the syntactic rules that form part of the computational grammar for the representation of simple clauses in English. More specifically, this work focuses on the format of those syntactic rules that account for the upper levels of the RRG Layered Structure of the Clause (LSC), that is, the core (and the level-1 construction associated with it), the clause and the sentence (Van Valin 2005). In essence, this analysis, together with that in Cortés-Rodríguez and Mairal-Usón (2016), offers an almost complete description of the computational grammar behind the LSC for simple clauses.

English as a Lingua Franca: Lessons for language and mobility

Multilingual Margins: A journal of multilingualism from the periphery ◽

10.14426/mm.v1i1.21 ◽

2018 ◽

Vol 1 (1) ◽

pp. 40 ◽

Cited By ~ 1

Author(s):

Joseph Sung-Yul Park ◽

Lionel Wee

Keyword(s):

English Language ◽

Social Inequalities ◽

Fluid Dynamic ◽

Critical Evaluation ◽

English Language Teaching ◽

Critical Examination ◽

Lingua Franca ◽

Considerable Difficulty ◽

Core Set ◽

Linguistic Backgrounds

Greater mobility of people in the globalising world foregrounds the inherent problems of an ideology of language as a bounded entity and the unequal relations of power that shape experiences of mobility. In this paper, we consider how these problems can be interrelated in research on language and mobility through a critical evaluation of current research on English as a lingua franca (ELF), particularly what we refer to as the ‘ELF research project’, exemplified by the work of Jenkins and Seidlhofer. The ELF project aims at a non-hegemonic alternative to English language teaching by identifying a core set of linguistic variables that can facilitate communication between speakers of different linguistic backgrounds. We provide a critical examination of the project by problematising its narrow conceptualisation of communication as information transfer and its inability to address the prejudices that speakers may still encounter because they speak the language ‘differently’. In our discussion, we argue that investigation of language in the context of mobility requires serious rethinking on the level of both theory and political stancetaking: a theory of language that does not take account of the fluid, dynamic, and practice-based nature of language will have considerable difficulty in proposing a cogent critique of social inequalities that permeate the lives of people on the move.

Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets

Wireless Communications and Mobile Computing ◽

10.1155/2021/5375334 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Changchang Zeng ◽

Shaobo Li

Keyword(s):

Reading Comprehension ◽

Language Processing ◽

Question Answering ◽

Multiple Choice ◽

Length Distribution ◽

Research Field ◽

Evaluation Framework ◽

Language Models ◽

Training Objective ◽

Machine Reading

Machine reading comprehension (MRC) is a challenging natural language processing (NLP) task. It has a wide application potential in the fields of question answering robots, human-computer interactions in mobile virtual reality systems, etc. Recently, the emergence of pretrained models (PTMs) has brought this research field into a new era, in which the training objective plays a key role. The masked language model (MLM) is a self-supervised training objective widely used in various PTMs. With the development of training objectives, many variants of MLM have been proposed, such as whole word masking, entity masking, phrase masking, and span masking. In different MLMs, the length of the masked tokens is different. Similarly, in different machine reading comprehension tasks, the length of the answer is also different, and the answer is often a word, phrase, or sentence. Thus, in MRC tasks with different answer lengths, whether the length of MLM is related to performance is a question worth studying. If this hypothesis is true, it can guide us on how to pretrain the MLM with a relatively suitable mask length distribution for MRC tasks. In this paper, we try to uncover how much of MLM’s success in the machine reading comprehension tasks comes from the correlation between masking length distribution and answer length in the MRC dataset. In order to address this issue, herein, (1) we propose four MRC tasks with different answer length distributions, namely, the short span extraction task, long span extraction task, short multiple-choice cloze task, and long multiple-choice cloze task; (2) four Chinese MRC datasets are created for these tasks; (3) we also have pretrained four masked language models according to the answer length distributions of these datasets; and (4) ablation experiments are conducted on the datasets to verify our hypothesis. The experimental results demonstrate that our hypothesis is true. On four different machine reading comprehension datasets, the performance of the model with correlation length distribution surpasses the model without correlation.
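The idea of masking with a chosen length distribution can be sketched as follows: contiguous spans are masked, with each span length drawn from a weighted distribution, until a masking budget is reached. This is only an illustrative sketch under stated assumptions; the function name, the `length_weights` values, the 15% budget and the `[MASK]` token are hypothetical choices, not the authors' pretraining procedure.

```python
import random


def mask_spans(tokens, length_weights, mask_ratio=0.15, seed=0, mask_token="[MASK]"):
    """Mask contiguous spans whose lengths follow a given distribution.

    length_weights maps span length -> relative sampling weight (assumed values).
    """
    rng = random.Random(seed)
    lengths = list(length_weights)
    weights = [length_weights[length] for length in lengths]
    budget = max(1, round(len(tokens) * mask_ratio))  # number of tokens to mask
    covered = set()
    attempts = 0
    # Bounded retry loop so the function always terminates.
    while len(covered) < budget and attempts < 100 * len(tokens):
        attempts += 1
        span = rng.choices(lengths, weights=weights, k=1)[0]
        # Clip the span so the budget and the sequence length are never exceeded.
        span = min(span, budget - len(covered), len(tokens))
        start = rng.randrange(0, len(tokens) - span + 1)
        positions = range(start, start + span)
        if any(p in covered for p in positions):
            continue  # avoid overlapping an already masked span
        covered.update(positions)
    return [mask_token if i in covered else tok for i, tok in enumerate(tokens)]


# Hypothetical example: a distribution favouring short spans.
tokens = [f"tok{i}" for i in range(20)]
short_span_weights = {1: 0.7, 2: 0.2, 3: 0.1}
masked = mask_spans(tokens, short_span_weights)
print(masked)
```

Shifting the weights toward longer spans would give a masking length distribution closer to long-answer tasks, which is the kind of correspondence the abstract's hypothesis concerns.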

Quantifying and correcting slide-to-slide variation in multiplexed immunofluorescence images

10.1101/2021.07.16.452359 ◽

2021 ◽

Author(s):

Coleman R Harris ◽

Eliot T McKinley ◽

Joseph T Roland ◽

Qi Liu ◽

Martha J Shrubsole ◽

...

Keyword(s):

Functional Data ◽

Single Cell Analysis ◽

Evaluation Criteria ◽

Evaluation Framework ◽

Complex Data ◽

Imaging Data ◽

Data Registration ◽

Multiplexed Imaging ◽

In Situ Methods ◽

Clear Slide

The multiplexed imaging domain is a nascent single-cell analysis field with a complex data structure susceptible to technical variability that disrupts inference. These in situ methods are valuable in understanding cell-cell interactions, but few standardized processing steps or normalization techniques of multiplexed imaging data are available. We implement and compare data transformations and normalization algorithms in multiplexed imaging data. Our methods adapt the ComBat and functional data registration methods to remove slide effects in this domain, and we present an evaluation framework to compare the proposed approaches. We present clear slide-to-slide variation in the raw, unadjusted data, and show that many of the proposed normalization methods reduce this variation while preserving and improving the biological signal. Further, we find that dividing this data by its slide mean, and the functional data registration methods, perform the best under our proposed evaluation framework. In summary, this approach provides a foundation for better data quality and evaluation criteria in the multiplexed domain.
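One of the approaches named above, dividing the data by its slide mean, can be illustrated with a short sketch. The data layout (a mapping from slide identifier to a list of per-cell marker intensities), the function name and the example values are assumptions for illustration; the authors' pipeline may apply this per marker channel and in combination with other transformations.

```python
from statistics import mean


def normalize_by_slide_mean(intensities_by_slide):
    """Divide every intensity by the mean intensity of its own slide."""
    normalized = {}
    for slide, values in intensities_by_slide.items():
        slide_mean = mean(values)  # raises StatisticsError if the slide has no values
        if slide_mean == 0:
            raise ValueError(f"slide {slide!r} has zero mean intensity")
        normalized[slide] = [value / slide_mean for value in values]
    return normalized


# Hypothetical intensities: slide_B is uniformly 10x brighter than slide_A.
raw = {
    "slide_A": [10.0, 20.0, 30.0],
    "slide_B": [100.0, 200.0, 300.0],
}
norm = normalize_by_slide_mean(raw)
print(norm)
```

After normalization both slides have mean 1, so a purely multiplicative slide effect like the one in this toy example is removed while the relative differences between cells within a slide are kept.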
