Journal of Computer-Assisted Linguistic Research
Latest Publications


Total documents: 18 (five years: 11)
H-index: 1 (five years: 1)
Published by: Universitat Politècnica de València
ISSN: 2530-9455

2021 ◽  
Vol 5 (1) ◽  
pp. 47-75
Author(s):  
Lidiia Melnyk

The research focuses on hate speech in the comments sections of Ukrainian news websites. Restricted solely to COVID-19-related comments, it analyzes how hate speech rates developed throughout the pandemic. Using a semi-automated, machine-learning-aided approach, the paper identifies hate speech in the comments and defines its main targets. The research shows that a crisis like the COVID-19 pandemic can strengthen existing negative stereotypes and give rise to new forms of stigmatization against social and ethnic groups.
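The paper's semi-automated pipeline is not reproduced here; as an illustration only, its core bookkeeping (flagging comments against a lexicon of target terms and tracking the rate of flagged comments per period) can be sketched in a few lines. The seed terms and function names below are hypothetical, not the paper's:

```python
import re

# Hypothetical seed lexicon standing in for the paper's learned model.
SEED_TERMS = {"plague-spreaders", "vermin", "invaders"}

def flag_comment(text, lexicon=SEED_TERMS):
    """Flag a comment if it contains any seed term (case-insensitive)."""
    tokens = re.findall(r"[\w-]+", text.lower())
    return any(tok in lexicon for tok in tokens)

def hate_rate(comments):
    """Share of flagged comments: the per-period rate tracked over the pandemic."""
    if not comments:
        return 0.0
    return sum(flag_comment(c) for c in comments) / len(comments)
```

In the semi-automated setting, the lexicon (or a trained classifier replacing it) would be iteratively corrected by a human annotator between periods.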


2021 ◽  
Vol 5 (1) ◽  
pp. 27-46
Author(s):  
Alexis Kauffmann ◽  
François-Claude Rey ◽  
Iana Atanassova ◽  
Arnaud Gaudinat ◽  
Peter Greenfield ◽  
...  

We define indirectly named entities as multiword expressions that refer to known named entities by means of periphrasis. While named entity recognition is a classical task in natural language processing, little attention has been paid to indirectly named entities and their treatment. In this paper, we try to address this gap, describing issues related to the detection and understanding of indirectly named entities in texts. We introduce a proof of concept for retrieving both lexicalised and non-lexicalised indirectly named entities in French texts. We also show example cases where this proof of concept is applied, and discuss future perspectives. We have initiated the creation of a first lexicon of 712 indirectly named entity entries that is available for future research.
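For the lexicalised case, retrieval reduces to matching known periphrases against the text. A minimal sketch, assuming a toy three-entry lexicon (the paper's actual lexicon has 712 entries, and its non-lexicalised detection is more involved):

```python
# Hypothetical miniature lexicon of French periphrases → entities.
INE_LEXICON = {
    "la ville lumière": "Paris",
    "l'hexagone": "France",
    "la grande boucle": "Tour de France",
}

def find_indirect_entities(text, lexicon=INE_LEXICON):
    """Return (expression, entity) pairs for lexicalised periphrases found in text."""
    lowered = text.lower()
    hits = []
    for expr, entity in lexicon.items():
        if expr in lowered:
            hits.append((expr, entity))
    return hits
```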


2021 ◽  
Vol 5 (1) ◽  
pp. 1-26
Author(s):  
Ziyuan Zhang

Several Japanese multinational corporations (MNCs) have recently adopted an English-only policy known as “Englishnization”. This study examines the impact of this policy using computer-assisted text analysis to investigate changes in cultural expatriates’ perceptions of Japanese work practices and values over time. Cultural expatriates are a significant but underexplored outcome of globalization. Despite the recent proliferation of studies on the internationalization of Japanese MNCs, few studies have focused on cultural expatriates' perceptions of corporate language policy in social media texts. This study analyzes a corpus of 208 posts from Rakuten, a Japanese MNC, on Glassdoor from 2009 to 2020. The findings suggest that these posts can be divided into three content groups: the threat of a foreign corporate culture, embracing the Rakuten way, and perceptions of leadership and marginalized status. Further, the posts reveal how Rakuten’s corporate language policy, as an instrument of internal internationalization, impacts external internationalization. The dynamics of “Englishnization” reveal a pressing issue facing Rakuten: namely, how to balance multinational cohesion with monolingualism and multiculturalism. This paper aims to demonstrate that dynamic topic modeling could enhance our understanding of the manner in which cultural expatriates and the English-only policy affect the internationalization of Japanese MNCs. It contributes to the literature by examining cultural expatriates’ perceptions of Japanese work practices and values from a diachronic perspective.
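Dynamic topic modeling itself requires a dedicated library; the temporal idea it rests on, tracking how the vocabulary of posts shifts across years, can be illustrated with a much cruder stand-in. The posts below are invented examples, not Rakuten data:

```python
from collections import Counter, defaultdict

def diachronic_terms(posts):
    """posts: list of (year, text). Return per-year term frequencies,
    a crude stand-in for the temporal component of dynamic topic modeling."""
    by_year = defaultdict(Counter)
    for year, text in posts:
        by_year[year].update(text.lower().split())
    return by_year

# Hypothetical Glassdoor-style snippets.
posts = [
    (2010, "english policy feels like a threat"),
    (2018, "proud of the rakuten way and its english culture"),
]
shift = diachronic_terms(posts)
```

A real dynamic topic model would additionally link topics across adjacent years so that each topic's word distribution evolves smoothly over time.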


2020 ◽  
Vol 4 (1) ◽  
pp. 1
Author(s):  
Javad Haditaghi ◽  
Jaleh Hassasskhah ◽  
Mohammad Amin Sorahi

The current study explores the possibility of merging Laclau and Mouffe’s theory of discourse analysis with network theory to provide an alternative framework for studying discourse via a semi-automatic algorithm. To do so, first, treating the text as the discourse of a complex system, a semi-automatic algorithm transforms the interacting linguistic components into a network, depicted as a graph of vertices connected by edges. Then, graph statistics such as degree, weighted degree, and eigenvector centrality are used to characterize the nodes as moments, nodal points, and/or nodal points of identity. Finally, the articulation of the discourse based on the above-mentioned components is studied. The results indicate that the approach is robust enough to pave the way for studying the articulation of discourse from an alternative view, especially one based on Laclau and Mouffe’s theory of discourse analysis.
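The graph-construction step and the statistics named above can be sketched with a co-occurrence network: tokens become vertices, co-occurrence within a small window becomes a weighted edge, and eigenvector centrality (here via plain power iteration) singles out candidate nodal points. This is an illustrative reconstruction, not the authors' algorithm:

```python
import re
from collections import defaultdict

def build_network(text, window=2):
    """Link each token to neighbours within `window`; edge weight = co-occurrence count."""
    tokens = re.findall(r"\w+", text.lower())
    weights = defaultdict(int)
    for i, a in enumerate(tokens):
        for b in tokens[i + 1 : i + 1 + window]:
            if a != b:
                weights[frozenset((a, b))] += 1
    adj = defaultdict(dict)
    for pair, w in weights.items():
        a, b = tuple(pair)
        adj[a][b] = w
        adj[b][a] = w
    return adj

def eigenvector_centrality(adj, iters=50):
    """Power iteration on the weighted adjacency; high scores mark nodal points."""
    score = {n: 1.0 for n in adj}
    for _ in range(iters):
        nxt = {n: sum(w * score[m] for m, w in adj[n].items()) for n in adj}
        norm = max(nxt.values()) or 1.0
        score = {n: v / norm for n, v in nxt.items()}
    return score

adj = build_network("people want freedom and people want justice")
cent = eigenvector_centrality(adj)
```

Degree is then `len(adj[node])` and weighted degree is `sum(adj[node].values())`, the two other statistics the study uses for node characterization.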


2020 ◽  
Vol 4 (1) ◽  
pp. 23
Author(s):  
Inna V. Skrynnikova

The paper substantiates the critical role of analogical reasoning and figurative language in resolving the ambiguity of cybersecurity terms across expert communities. Dwelling on the divergent interpretations of a backdoor, it uncovers the potential of metaphor to serve both as an interpretative mechanism and as a framing tool in the ongoing digital technologies discourse. By combining methods of corpus research and frame semantics analysis, the study examines the challenges of unpacking the meaning of the contested concept of the backdoor. The paper proposes a qualitatively new, metaphor-facilitated mode of interpreting cybersecurity vulnerabilities based on MetaNet deep semantic metaphor analysis and outlines the merits of this hierarchically organized ontology of metaphors and frames. The utility of the method is demonstrated by analyzing corpus data and extracting metaphors top-down (linguistic metaphor – conceptual metaphor – entailed metaphor – inferences), then identifying the metaphor families that dominate the cybersecurity discourse. The paper further claims that the predominant metaphors prompt certain decisions and solutions affecting information security policies.
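The top-down cascade (linguistic metaphor to conceptual metaphor to family) can be pictured as two lookup layers plus a tally over the corpus. The mappings below are invented toy entries, far simpler than the MetaNet ontology:

```python
from collections import Counter

# Hypothetical two-layer cascade: surface term → conceptual metaphor → family.
LINGUISTIC_TO_CONCEPTUAL = {
    "backdoor": "SECURITY FLAW IS A HIDDEN ENTRANCE",
    "trojan": "MALWARE IS A DECEPTIVE GIFT",
    "firewall": "DEFENSE IS A PHYSICAL BARRIER",
}
CONCEPTUAL_TO_FAMILY = {
    "SECURITY FLAW IS A HIDDEN ENTRANCE": "BUILDING",
    "MALWARE IS A DECEPTIVE GIFT": "WAR",
    "DEFENSE IS A PHYSICAL BARRIER": "BUILDING",
}

def dominant_families(corpus_tokens):
    """Tally metaphor families over corpus tokens to find the dominant ones."""
    families = Counter()
    for tok in corpus_tokens:
        conceptual = LINGUISTIC_TO_CONCEPTUAL.get(tok.lower())
        if conceptual:
            families[CONCEPTUAL_TO_FAMILY[conceptual]] += 1
    return families
```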


2020 ◽  
Vol 4 (1) ◽  
pp. 47
Author(s):  
Kulvinder Panesar

This paper aims to demystify the hype and attention around chatbots and their association with conversational artificial intelligence. Both are slowly emerging as a real presence in our lives thanks to impressive technological developments in machine learning, deep learning and natural language understanding. However, what is under the hood, and how far and to what extent chatbot and conversational artificial intelligence solutions can work, is our question. Natural language is the knowledge representation most easily understood by people, but certainly not the best for computers because of its inherently ambiguous, complex and dynamic nature. We critique the knowledge representation of heavily statistical chatbot solutions against linguistic alternatives. In order to react intelligently to the user, natural language solutions must critically consider other factors such as context, memory, intelligent understanding, previous experience, and personalized knowledge of the user. We delve into the spectrum of conversational interfaces and focus on a strong artificial intelligence concept. This is explored via a text-based conversational software agent with a deep strategic role: to hold a conversation, enable the mechanisms needed to plan and decide what to do next, and manage the dialogue to achieve a goal. To demonstrate this, a deeply linguistically aware and knowledge-aware text-based conversational agent (LING-CSA) presents a proof of concept of a non-statistical conversational AI solution.
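The "plan, decide what to do next, manage the dialogue toward a goal" loop can be illustrated with a minimal slot-filling dialogue manager. This sketch is not LING-CSA; the slot names and return values are hypothetical:

```python
def next_move(goal_slots, utterance_slots):
    """Merge the slots understood from the latest utterance into the goal state,
    then decide the next dialogue act: ask for a missing slot, or confirm."""
    goal_slots = {**goal_slots, **utterance_slots}
    missing = [k for k, v in goal_slots.items() if v is None]
    if missing:
        return goal_slots, f"ask:{missing[0]}"
    return goal_slots, "confirm"
```

A statistical chatbot would instead generate the next turn from learned patterns; the point of the non-statistical design is that the agent's next move follows explicitly from its goal state.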


2019 ◽  
Vol 3 (1) ◽  
pp. 67
Author(s):  
Kyle Goslin ◽  
Markus Hofmann

Automatic Search Query Enhancement (ASQE) is the process of modifying a user-submitted search query and identifying terms that can be added or removed to enhance the relevance of documents retrieved from a search engine. ASQE differs from other enhancement approaches in that no human interaction is required. ASQE algorithms typically rely on a source of a priori knowledge to aid the process of identifying relevant enhancement terms. This paper describes the results of a qualitative analysis of the enhancement terms generated by the Wikipedia NSubstate Algorithm (WNSSA) for ASQE. The WNSSA utilises Wikipedia as the sole source of a priori knowledge during the query enhancement process. As each Wikipedia article typically represents a single topic, during the enhancement process of the WNSSA, a mapping is performed between the user’s original search query and Wikipedia articles relevant to the query. If this mapping is performed correctly, a collection of potentially relevant terms and acronyms is accessible for ASQE. This paper reviews the results of a qualitative analysis performed on the individual enhancement terms generated for each of the 50 test topics from the TREC-9 Web Topic collection. The contributions of this paper include: (a) a qualitative analysis of generated WNSSA search query enhancement terms and (b) an analysis of the concepts represented in the TREC-9 Web Topics, detailing interpretation issues during the query-to-Wikipedia article mapping performed by the WNSSA.
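The query-to-article mapping and term extraction can be pictured with a toy in-memory "article index". This is an illustrative simplification, not the WNSSA itself; the articles and terms are invented:

```python
# Hypothetical mini "Wikipedia": article title → salient terms from that article.
ARTICLES = {
    "solar eclipse": ["umbra", "penumbra", "totality", "corona"],
    "python": ["interpreter", "cpython", "bytecode"],
}

def enhance_query(query, articles=ARTICLES, k=3):
    """Map the query to the article title with the largest word overlap,
    then append up to k new terms from that article as enhancement terms."""
    q = set(query.lower().split())
    best = max(articles, key=lambda t: len(q & set(t.split())), default=None)
    if best is None or not q & set(best.split()):
        return query  # no confident mapping: leave the query unmodified
    extra = [t for t in articles[best] if t not in q][:k]
    return query + " " + " ".join(extra)
```

The interpretation issues the paper analyzes arise exactly when this mapping step picks an article about the wrong sense of a topic word.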


2019 ◽  
Vol 3 (1) ◽  
pp. 41
Author(s):  
Kulvinder Panesar

This paper presents a critical evaluation framework for a linguistically motivated conversational software agent (CSA). The CSA prototype investigates the integration, intersection and interface of language, knowledge, speech act constructions (SAC) based on a grammatical object, the sub-model of beliefs, desires and intentions (BDI), and dialogue management (DM) for natural language processing (NLP). A long-standing issue within NLP CSA systems is refining the accuracy of interpretation to provide realistic dialogue that supports human-to-computer communication. This prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory, Role and Reference Grammar (RRG); (2) an agent cognitive model with two inner models: (a) a knowledge representation model and (b) a planning model underpinned by BDI concepts, intentionality and rational interaction; and (3) a dialogue model. The evaluation strategy for this Java-based prototype is multi-approach, driven by grammatical testing (English language utterances), software engineering and agent practice. A set of evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection and integration of all phase models. The empirical evaluations demonstrate that the CSA is a proof of concept, demonstrating RRG’s fitness for describing and explaining phenomena, language processing and knowledge, and its computational adequacy. Contrastingly, the evaluations identify the complexity of the lower-level computational mappings from natural language and agent to ontology, with semantic gaps that are further addressed by a lexical bridging solution.


2019 ◽  
Vol 3 (1) ◽  
pp. 1
Author(s):  
Aurelia Power ◽  
Anthony Keane ◽  
Brian Nolan ◽  
Brian O'Neill

In this paper we investigate the contribution of previous discourse to identifying elements that are key to detecting public textual cyberbullying. Based on the analysis of our dataset, we first discuss the missing cyberbullying elements and the grammatical structures representative of discourse-dependent cyberbullying. Then we identify four types of discourse-dependent cyberbullying constructions: (1) fully inferable constructions, (2) personal marker and cyberbullying link inferable constructions, (3) dysphemistic element and cyberbullying link inferable constructions, and (4) dysphemistic element inferable constructions. Finally, we formalise a framework that proposes several resolution algorithms to resolve the missing cyberbullying elements. The resolution algorithms target the following discourse-dependent message types: (1) polarity answers, (2) contradictory statements, (3) explicit ellipsis, (4) implicit affirmative answers, and (5) statements that use indefinite pronouns as placeholders for the dysphemistic element.
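The first message type, polarity answers, gives a feel for what such resolution algorithms do: a bare "yes" inherits the missing dysphemistic element and target from the preceding message. This sketch is a simplified illustration; the field names and rules are hypothetical, not the paper's formalisation:

```python
def resolve_polarity_answer(previous, answer):
    """previous: parsed prior message, e.g.
    {"target": "you", "dysphemism": "loser", "is_question": True}.
    A bare affirmative answer to a cyberbullying question inherits its content."""
    if not previous.get("is_question"):
        return None  # nothing to affirm
    if answer.strip().lower() in {"yes", "yeah", "yep"}:
        return {"target": previous["target"], "dysphemism": previous["dysphemism"]}
    return None
```

The other four message types would each get an analogous rule keyed to their construction (e.g. an indefinite pronoun resolving to the prior message's dysphemistic element).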


2019 ◽  
Vol 3 (1) ◽  
pp. 78
Author(s):  
Elisabeth Huber

Why does football combine productively with further nouns to form more complex expressions like football game, whereas seemingly comparable compounds like keyword only infrequently expand to more complex sequences? This project explores why some two-noun compounds are more readily available for forming triconstituent constructions than others. I hypothesize that the productivity of a two-noun compound in the formation of triconstituent sequences depends on the degree of entrenchment of that two-noun compound, assuming that only compounds that are entrenched to a certain degree are productive in forming more complex constructions. To test this hypothesis, a list of three-noun compounds in the English language needed to be compiled. The obvious approach would be to search for sequences of three nouns in POS-tagged corpora. However, since such automated searches on the one hand do not recall all required instances and on the other hand often return results that are not precise enough, substantial manual screening is required. Furthermore, in order to operationalize the concepts of entrenchment and productivity, it was necessary to count the usage frequencies of noun constructions. Here, too, the automatic elicitation of the data needed to be complemented by further manual selection in order to obtain correct usage frequencies. Both the complex automatic and manual work processes in the elicitation of the data are presented in detail to give an impression of the extent of such a project.
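The automatic step, scanning a POS-tagged corpus for runs of exactly three nouns and counting their frequencies, can be sketched as follows. The tag convention (Penn-style `NN*` prefixes) and the example sentence are assumptions for illustration:

```python
from collections import Counter

def three_noun_compounds(tagged):
    """tagged: list of (token, pos) pairs. Count runs of exactly three nouns,
    the automatic step that the project complements with manual screening."""
    counts = Counter()
    for i in range(len(tagged) - 2):
        window = tagged[i : i + 3]
        if all(pos.startswith("NN") for _, pos in window):
            # Require the run to be exactly three nouns, not part of a longer run.
            before_ok = i == 0 or not tagged[i - 1][1].startswith("NN")
            after_ok = i + 3 >= len(tagged) or not tagged[i + 3][1].startswith("NN")
            if before_ok and after_ok:
                counts[" ".join(tok for tok, _ in window)] += 1
    return counts

tagged = [("the", "DT"), ("football", "NN"), ("game", "NN"),
          ("ticket", "NN"), ("was", "VBD")]
```

The precision problems the project describes show up immediately: such a scan cannot tell a genuine triconstituent compound from three accidentally adjacent nouns, which is why manual screening remains necessary.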

