Journal of Computer-Assisted Linguistic Research
Latest Publications


Total documents: 18 (five years: 11)
H-index: 1 (five years: 1)
Published by: Universitat Politècnica de València
ISSN: 2530-9455

2021 ◽  
Vol 5 (1) ◽  
pp. 47-75
Author(s):  
Lidiia Melnyk

The research focuses on hate speech in the comments sections of Ukrainian news websites. Restricted solely to COVID-19-related comments, it analyzes how hate speech rates developed throughout the pandemic. Using a semi-automated, machine-learning-aided approach, the paper identifies hate speech in the comments and defines its main targets. The research shows that a crisis like the COVID-19 pandemic can strengthen existing negative stereotypes and give rise to new forms of stigmatization against social and ethnic groups.
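The paper's semi-automated pipeline is not reproduced here; as an illustration only, its core bookkeeping (flagging comments against a lexicon of target terms and tracking the rate of flagged comments per period) can be sketched in a few lines. The seed terms and function names below are hypothetical, not the paper's:

```python
import re

# Hypothetical seed lexicon standing in for the paper's learned model.
SEED_TERMS = {"plague-spreaders", "vermin", "invaders"}

def flag_comment(text, lexicon=SEED_TERMS):
    """Flag a comment if it contains any seed term (case-insensitive)."""
    tokens = re.findall(r"[\w-]+", text.lower())
    return any(tok in lexicon for tok in tokens)

def hate_rate(comments):
    """Share of flagged comments: the per-period rate tracked over the pandemic."""
    if not comments:
        return 0.0
    return sum(flag_comment(c) for c in comments) / len(comments)
```

In the semi-automated setting, the lexicon (or a trained classifier replacing it) would be iteratively corrected by a human annotator between periods.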


2021 ◽  
Vol 5 (1) ◽  
pp. 27-46
Author(s):  
Alexis Kauffmann ◽  
François-Claude Rey ◽  
Iana Atanassova ◽  
Arnaud Gaudinat ◽  
Peter Greenfield ◽  
...  

We define indirectly named entities as multiword expressions that refer to known named entities by means of periphrasis. While named entity recognition is a classical task in natural language processing, little attention has been paid to indirectly named entities and their treatment. In this paper, we try to address this gap, describing issues related to the detection and understanding of indirectly named entities in texts. We introduce a proof of concept for retrieving both lexicalised and non-lexicalised indirectly named entities in French texts. We also show example cases where this proof of concept is applied, and discuss future perspectives. We have initiated the creation of a first lexicon of 712 indirectly named entity entries that is available for future research.
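For the lexicalised case, retrieval reduces to matching known periphrases against the text. A minimal sketch, assuming a toy three-entry lexicon (the paper's actual lexicon has 712 entries, and its non-lexicalised detection is more involved):

```python
# Hypothetical miniature lexicon of French periphrases → entities.
INE_LEXICON = {
    "la ville lumière": "Paris",
    "l'hexagone": "France",
    "la grande boucle": "Tour de France",
}

def find_indirect_entities(text, lexicon=INE_LEXICON):
    """Return (expression, entity) pairs for lexicalised periphrases found in text."""
    lowered = text.lower()
    hits = []
    for expr, entity in lexicon.items():
        if expr in lowered:
            hits.append((expr, entity))
    return hits
```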


2021 ◽  
Vol 5 (1) ◽  
pp. 1-26
Author(s):  
Ziyuan Zhang

Several Japanese multinational corporations (MNCs) have recently adopted an English-only policy known as “Englishnization”. This study examines the impact of this policy using computer-assisted text analysis to investigate changes in cultural expatriates’ perceptions of Japanese work practices and values over time. Cultural expatriates are a significant but underexplored outcome of globalization. Despite the recent proliferation of studies on the internationalization of Japanese MNCs, few studies have focused on cultural expatriates' perceptions of corporate language policy in social media texts. This study analyzes a corpus of 208 posts from Rakuten, a Japanese MNC, on Glassdoor from 2009 to 2020. The findings suggest that these posts can be divided into three content groups: the threat of a foreign corporate culture, embracing the Rakuten way, and perceptions of leadership and marginalized status. Further, the posts reveal how Rakuten’s corporate language policy, as an instrument of internal internationalization, impacts external internationalization. The dynamics of “Englishnization” reveal a pressing issue facing Rakuten: namely, how to balance multinational cohesion with monolingualism and multiculturalism. This paper aims to demonstrate that dynamic topic modeling could enhance our understanding of the manner in which cultural expatriates and the English-only policy affect the internationalization of Japanese MNCs. It contributes to the literature by examining cultural expatriates’ perceptions of Japanese work practices and values from a diachronic perspective.
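Dynamic topic modeling itself requires a dedicated library; the temporal idea it rests on, tracking how the vocabulary of posts shifts across years, can be illustrated with a much cruder stand-in. The posts below are invented examples, not Rakuten data:

```python
from collections import Counter, defaultdict

def diachronic_terms(posts):
    """posts: list of (year, text). Return per-year term frequencies,
    a crude stand-in for the temporal component of dynamic topic modeling."""
    by_year = defaultdict(Counter)
    for year, text in posts:
        by_year[year].update(text.lower().split())
    return by_year

# Hypothetical Glassdoor-style snippets.
posts = [
    (2010, "english policy feels like a threat"),
    (2018, "proud of the rakuten way and its english culture"),
]
shift = diachronic_terms(posts)
```

A real dynamic topic model would additionally link topics across adjacent years so that each topic's word distribution evolves smoothly over time.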


2020 ◽  
Vol 4 (1) ◽  
pp. 1
Author(s):  
Javad Haditaghi ◽  
Jaleh Hassasskhah ◽  
Mohammad Amin Sorahi

The current study explores the possibility of merging Laclau and Mouffe’s theory of discourse analysis with network theory to provide an alternative framework for studying discourse via a semi-automatic algorithm. To do so, first, treating the text as the discourse of a complex system, a semi-automatic algorithm transforms the interacting linguistic components into a network, depicted as a graph of vertices connected by edges. Then, graph statistics such as degree, weighted degree, and eigenvector centrality are used to characterize the nodes as moments, nodal points, and/or nodal points of identity. Finally, the articulation of the discourse based on the above-mentioned components is studied. The results indicate that the approach is robust enough to pave the way for studying the articulation of discourse from an alternative view, especially one based on Laclau and Mouffe’s theory of discourse analysis.
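The graph-construction step and the statistics named above can be sketched with a co-occurrence network: tokens become vertices, co-occurrence within a small window becomes a weighted edge, and eigenvector centrality (here via plain power iteration) singles out candidate nodal points. This is an illustrative reconstruction, not the authors' algorithm:

```python
import re
from collections import defaultdict

def build_network(text, window=2):
    """Link each token to neighbours within `window`; edge weight = co-occurrence count."""
    tokens = re.findall(r"\w+", text.lower())
    weights = defaultdict(int)
    for i, a in enumerate(tokens):
        for b in tokens[i + 1 : i + 1 + window]:
            if a != b:
                weights[frozenset((a, b))] += 1
    adj = defaultdict(dict)
    for pair, w in weights.items():
        a, b = tuple(pair)
        adj[a][b] = w
        adj[b][a] = w
    return adj

def eigenvector_centrality(adj, iters=50):
    """Power iteration on the weighted adjacency; high scores mark nodal points."""
    score = {n: 1.0 for n in adj}
    for _ in range(iters):
        nxt = {n: sum(w * score[m] for m, w in adj[n].items()) for n in adj}
        norm = max(nxt.values()) or 1.0
        score = {n: v / norm for n, v in nxt.items()}
    return score

adj = build_network("people want freedom and people want justice")
cent = eigenvector_centrality(adj)
```

Degree is then `len(adj[node])` and weighted degree is `sum(adj[node].values())`, the two other statistics the study uses for node characterization.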


2020 ◽  
Vol 4 (1) ◽  
pp. 23
Author(s):  
Inna V. Skrynnikova

The paper substantiates the critical role of analogical reasoning and figurative language in resolving the ambiguity of cybersecurity terms across expert communities. Dwelling on the divergent interpretations of a backdoor, it uncovers the potential of metaphor to serve both as an interpretative mechanism and as a framing tool in the ongoing digital technologies discourse. By combining methods of corpus research and frame semantics analysis, the study examines the challenges of unpacking the meaning of the contested concept of the backdoor. The paper proposes a qualitatively new, metaphor-facilitated mode of interpreting cybersecurity vulnerabilities based on MetaNet deep semantic metaphor analysis and outlines the merits of this hierarchically organized ontology of metaphors and frames. The utility of the method is demonstrated by analyzing corpus data and extracting metaphors top-down (linguistic metaphor – conceptual metaphor – entailed metaphor – inferences), then identifying the metaphor families that dominate the cybersecurity discourse. The paper further claims that the predominant metaphors prompt certain decisions and solutions affecting information security policies.
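The top-down cascade (linguistic metaphor to conceptual metaphor to family) can be pictured as two lookup layers plus a tally over the corpus. The mappings below are invented toy entries, far simpler than the MetaNet ontology:

```python
from collections import Counter

# Hypothetical two-layer cascade: surface term → conceptual metaphor → family.
LINGUISTIC_TO_CONCEPTUAL = {
    "backdoor": "SECURITY FLAW IS A HIDDEN ENTRANCE",
    "trojan": "MALWARE IS A DECEPTIVE GIFT",
    "firewall": "DEFENSE IS A PHYSICAL BARRIER",
}
CONCEPTUAL_TO_FAMILY = {
    "SECURITY FLAW IS A HIDDEN ENTRANCE": "BUILDING",
    "MALWARE IS A DECEPTIVE GIFT": "WAR",
    "DEFENSE IS A PHYSICAL BARRIER": "BUILDING",
}

def dominant_families(corpus_tokens):
    """Tally metaphor families over corpus tokens to find the dominant ones."""
    families = Counter()
    for tok in corpus_tokens:
        conceptual = LINGUISTIC_TO_CONCEPTUAL.get(tok.lower())
        if conceptual:
            families[CONCEPTUAL_TO_FAMILY[conceptual]] += 1
    return families
```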


2020 ◽  
Vol 4 (1) ◽  
pp. 47
Author(s):  
Kulvinder Panesar

This paper aims to demystify the hype and attention around chatbots and their association with conversational artificial intelligence. Both are slowly emerging as a real presence in our lives thanks to impressive technological developments in machine learning, deep learning and natural language understanding. However, what is under the hood, and how far and to what extent chatbot and conversational artificial intelligence solutions can work, is our question. Natural language is the knowledge representation most easily understood by people, but certainly not the best for computers because of its inherently ambiguous, complex and dynamic nature. We critique the knowledge representation of heavily statistical chatbot solutions against linguistic alternatives. In order to react intelligently to the user, natural language solutions must critically consider other factors such as context, memory, intelligent understanding, previous experience, and personalized knowledge of the user. We delve into the spectrum of conversational interfaces and focus on a strong artificial intelligence concept. This is explored via a text-based conversational software agent with a deep strategic role: to hold a conversation, enable the mechanisms needed to plan and decide what to do next, and manage the dialogue to achieve a goal. To demonstrate this, a deeply linguistically aware and knowledge-aware text-based conversational agent (LING-CSA) presents a proof of concept of a non-statistical conversational AI solution.
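The "plan, decide what to do next, manage the dialogue toward a goal" loop can be illustrated with a minimal slot-filling dialogue manager. This sketch is not LING-CSA; the slot names and return values are hypothetical:

```python
def next_move(goal_slots, utterance_slots):
    """Merge the slots understood from the latest utterance into the goal state,
    then decide the next dialogue act: ask for a missing slot, or confirm."""
    goal_slots = {**goal_slots, **utterance_slots}
    missing = [k for k, v in goal_slots.items() if v is None]
    if missing:
        return goal_slots, f"ask:{missing[0]}"
    return goal_slots, "confirm"
```

A statistical chatbot would instead generate the next turn from learned patterns; the point of the non-statistical design is that the agent's next move follows explicitly from its goal state.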


2019 ◽  
Vol 3 (1) ◽  
pp. 67
Author(s):  
Kyle Goslin ◽  
Markus Hofmann

Automatic Search Query Enhancement (ASQE) is the process of modifying a user-submitted search query and identifying terms that can be added or removed to enhance the relevance of documents retrieved from a search engine. ASQE differs from other enhancement approaches in that no human interaction is required. ASQE algorithms typically rely on a source of a priori knowledge to aid the process of identifying relevant enhancement terms. This paper describes the results of a qualitative analysis of the enhancement terms generated by the Wikipedia NSubstate Algorithm (WNSSA) for ASQE. The WNSSA utilises Wikipedia as the sole source of a priori knowledge during the query enhancement process. As each Wikipedia article typically represents a single topic, during the enhancement process of the WNSSA, a mapping is performed between the user’s original search query and Wikipedia articles relevant to the query. If this mapping is performed correctly, a collection of potentially relevant terms and acronyms is accessible for ASQE. This paper reviews the results of a qualitative analysis performed on the individual enhancement terms generated for each of the 50 test topics from the TREC-9 Web Topic collection. The contributions of this paper include: (a) a qualitative analysis of generated WNSSA search query enhancement terms and (b) an analysis of the concepts represented in the TREC-9 Web Topics, detailing interpretation issues during the query-to-Wikipedia article mapping performed by the WNSSA.
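The query-to-article mapping and term extraction can be pictured with a toy in-memory "article index". This is an illustrative simplification, not the WNSSA itself; the articles and terms are invented:

```python
# Hypothetical mini "Wikipedia": article title → salient terms from that article.
ARTICLES = {
    "solar eclipse": ["umbra", "penumbra", "totality", "corona"],
    "python": ["interpreter", "cpython", "bytecode"],
}

def enhance_query(query, articles=ARTICLES, k=3):
    """Map the query to the article title with the largest word overlap,
    then append up to k new terms from that article as enhancement terms."""
    q = set(query.lower().split())
    best = max(articles, key=lambda t: len(q & set(t.split())), default=None)
    if best is None or not q & set(best.split()):
        return query  # no confident mapping: leave the query unmodified
    extra = [t for t in articles[best] if t not in q][:k]
    return query + " " + " ".join(extra)
```

The interpretation issues the paper analyzes arise exactly when this mapping step picks an article about the wrong sense of a topic word.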


2019 ◽  
Vol 3 (1) ◽  
pp. 41
Author(s):  
Kulvinder Panesar

This paper presents a critical evaluation framework for a linguistically motivated conversational software agent (CSA). The CSA prototype investigates the integration, intersection and interface of language, knowledge, speech act constructions (SAC) based on a grammatical object, the sub-model of beliefs, desires and intentions (BDI), and dialogue management (DM) for natural language processing (NLP). A long-standing issue within NLP CSA systems is refining the accuracy of interpretation to provide realistic dialogue that supports human-to-computer communication. This prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory, Role and Reference Grammar (RRG); (2) an agent cognitive model with two inner models: (a) a knowledge representation model and (b) a planning model underpinned by BDI concepts, intentionality and rational interaction; and (3) a dialogue model. The evaluation strategy for this Java-based prototype is multi-approach, driven by grammatical testing (English language utterances), software engineering and agent practice. A set of evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection and integration of all phase models. The empirical evaluations demonstrate that the CSA is a proof of concept, demonstrating RRG’s fitness for describing and explaining phenomena, language processing and knowledge, and its computational adequacy. Contrastingly, the evaluations identify the complexity of the lower-level computational mappings from natural language and agent to ontology, with semantic gaps that are further addressed by a lexical bridging solution.


2019 ◽  
Vol 3 (1) ◽  
pp. 1
Author(s):  
Aurelia Power ◽  
Anthony Keane ◽  
Brian Nolan ◽  
Brian O'Neill

In this paper we investigate the contribution of previous discourse to identifying elements that are key to detecting public textual cyberbullying. Based on the analysis of our dataset, we first discuss the missing cyberbullying elements and the grammatical structures representative of discourse-dependent cyberbullying. Then we identify four types of discourse-dependent cyberbullying constructions: (1) fully inferable constructions, (2) personal marker and cyberbullying link inferable constructions, (3) dysphemistic element and cyberbullying link inferable constructions, and (4) dysphemistic element inferable constructions. Finally, we formalise a framework that proposes several resolution algorithms to resolve the missing cyberbullying elements. The resolution algorithms target the following discourse-dependent message types: (1) polarity answers, (2) contradictory statements, (3) explicit ellipsis, (4) implicit affirmative answers, and (5) statements that use indefinite pronouns as placeholders for the dysphemistic element.
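The first message type, polarity answers, gives a feel for what such resolution algorithms do: a bare "yes" inherits the missing dysphemistic element and target from the preceding message. This sketch is a simplified illustration; the field names and rules are hypothetical, not the paper's formalisation:

```python
def resolve_polarity_answer(previous, answer):
    """previous: parsed prior message, e.g.
    {"target": "you", "dysphemism": "loser", "is_question": True}.
    A bare affirmative answer to a cyberbullying question inherits its content."""
    if not previous.get("is_question"):
        return None  # nothing to affirm
    if answer.strip().lower() in {"yes", "yeah", "yep"}:
        return {"target": previous["target"], "dysphemism": previous["dysphemism"]}
    return None
```

The other four message types would each get an analogous rule keyed to their construction (e.g. an indefinite pronoun resolving to the prior message's dysphemistic element).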


2019 ◽  
Vol 3 (1) ◽  
pp. 78
Author(s):  
Elisabeth Huber

Why does football combine productively with further nouns to form more complex expressions like football game, whereas seemingly comparable compounds like keyword only infrequently expand to more complex sequences? This project explores why some two-noun compounds are more readily available for forming triconstituent constructions than others. I hypothesize that the productivity of a two-noun compound in the formation of triconstituent sequences depends on the degree of entrenchment of that two-noun compound, assuming that only compounds that are entrenched to a certain degree are productive in forming more complex constructions. To test this hypothesis, a list of three-noun compounds in the English language needed to be compiled. The obvious approach would be to search for sequences of three nouns in POS-tagged corpora. However, since such automated searches on the one hand do not recall all required instances and on the other hand often return results that are not precise enough, substantial manual screening is required. Furthermore, in order to operationalize the concepts of entrenchment and productivity, it was necessary to count the usage frequencies of noun constructions. Here, too, the automatic elicitation of the data needed to be complemented by further manual selection in order to obtain correct usage frequencies. Both the complex automatic and manual work processes in the elicitation of the data are presented in detail to give an impression of the extent of such a project.
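The automatic step, scanning a POS-tagged corpus for runs of exactly three nouns and counting their frequencies, can be sketched as follows. The tag convention (Penn-style `NN*` prefixes) and the example sentence are assumptions for illustration:

```python
from collections import Counter

def three_noun_compounds(tagged):
    """tagged: list of (token, pos) pairs. Count runs of exactly three nouns,
    the automatic step that the project complements with manual screening."""
    counts = Counter()
    for i in range(len(tagged) - 2):
        window = tagged[i : i + 3]
        if all(pos.startswith("NN") for _, pos in window):
            # Require the run to be exactly three nouns, not part of a longer run.
            before_ok = i == 0 or not tagged[i - 1][1].startswith("NN")
            after_ok = i + 3 >= len(tagged) or not tagged[i + 3][1].startswith("NN")
            if before_ok and after_ok:
                counts[" ".join(tok for tok, _ in window)] += 1
    return counts

tagged = [("the", "DT"), ("football", "NN"), ("game", "NN"),
          ("ticket", "NN"), ("was", "VBD")]
```

The precision problems the project describes show up immediately: such a scan cannot tell a genuine triconstituent compound from three accidentally adjacent nouns, which is why manual screening remains necessary.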

