Linguistic Evidence in Security Law and Intelligence
Latest Publications


TOTAL DOCUMENTS

13
(FIVE YEARS 0)

H-INDEX

2
(FIVE YEARS 0)

Published by: University Library System, University of Pittsburgh

ISSN: 2327-5596

Author(s):  
Ángela Almela ◽  
Gema Alcaraz-Mármol ◽  
Arancha García-Pinar ◽  
Clara Pallejá

In this paper, the methods for developing a database of Spanish writing that can be used for forensic linguistic research are presented, including our data collection procedures. Specifically, the main instrument used for data collection has been translated into Spanish and adapted from Chaski (2001). It consists of ten tasks, by means of which the subjects are asked to write formal and informal texts about different topics. To date, 93 undergraduates from Spanish universities have participated in the study, as have prisoners convicted of gender-based abuse. A twofold analysis has been performed, since the data collected have been approached from a semantic and a morphosyntactic perspective. Regarding the semantic analysis, psycholinguistic categories have been used, many of them taken from the LIWC dictionary (Pennebaker et al., 2001). In order to obtain a more comprehensive depiction of the linguistic data, some other ad-hoc categories have been created, based on the corpus itself, using a double-check method for their validation so as to ensure inter-rater reliability. Furthermore, as regards morphosyntactic analysis, the natural language processing tool ALIAS TATTLER is being developed for Spanish. Results show that it is possible to differentiate non-abusers from abusers with high accuracy based on linguistic features.
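The abstract above describes two technical steps: scoring texts against LIWC-style psycholinguistic categories and validating ad-hoc categories with a double-check (inter-rater) procedure. The following is a minimal sketch of both steps, not the authors' code; the category word lists, rater labels, and example sentence are invented for illustration.

```python
# Sketch: (1) proportion of tokens falling into LIWC-style category lexicons,
# (2) Cohen's kappa as one simple inter-rater agreement check.
from collections import Counter

def category_proportions(text, categories):
    """Return the share of tokens falling into each category lexicon."""
    tokens = text.lower().split()
    counts = Counter()
    for token in tokens:
        for name, lexicon in categories.items():
            if token in lexicon:
                counts[name] += 1
    total = len(tokens) or 1
    return {name: counts[name] / total for name in categories}

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters assigning labels to the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

# Hypothetical Spanish category lexicons and rater labels, for illustration only.
categories = {"negemo": {"odio", "rabia", "miedo"}, "posemo": {"feliz", "alegre"}}
print(category_proportions("siento mucha rabia y miedo", categories))
print(cohens_kappa(["anger", "joy", "anger"], ["anger", "joy", "fear"]))
```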


Author(s):  
Hans Van Halteren

This paper demonstrates how an author recognition system could be benchmarked, as a prerequisite for admission in court. The system used in the demonstration is the FEDERALES system, and the experimental data used were taken from the British National Corpus. The system was given several tasks, namely attributing a text sample to a specific text, verifying that a text sample was taken from a specific text, and verifying that a text sample was produced by a specific author. For the former two tasks, 1,099 texts with at least 10,000 words were used; for the latter, 1,366 texts with known authors were used, verified against models for the 28 known authors for whom there were three or more texts. The experimental tasks were performed with different sampling methods (sequential samples or samples of concatenated random sentences), different sample sizes (1,000, 500, 250 or 125 words), varying amounts of training material (between 2 and 20 samples) and varying amounts of test material (1 or 3 samples). Under the best conditions, the system performed very well: with 7 training and 3 test samples of 1,000 words of randomly selected sentences, text attribution had an equal error rate of 0.06% and text verification an equal error rate of 1.3%; with 20 training and 3 test samples of 1,000 words of randomly selected sentences, author verification had an equal error rate of 7.5%. Under the worst conditions, with 2 training and 1 test sample of 125 words of sequential text, equal error rates for text attribution and text verification were 26.6% and 42.2%, and author verification did not perform better than chance. Furthermore, the quality degradation curves with slowly worsening conditions were not smooth, but contained steep drops. All in all, the results show the importance of having a benchmark which is as similar as possible to the actual court material for which the system is to be used, since the measured system quality differed greatly between evaluation scenarios and system degradation could not be predicted easily on the basis of the chosen scenario parameters.
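The equal error rates reported above are the standard summary metric for verification tasks: the operating point at which the false rejection rate equals the false acceptance rate. The sketch below shows one simple way such an EER can be estimated from verification scores; it is not part of the FEDERALES system, and the genuine/impostor scores are invented for illustration.

```python
# Sketch: estimate the equal error rate (EER) by scanning candidate thresholds.
def equal_error_rate(genuine_scores, impostor_scores):
    """Return the EER estimate where false rejection and false acceptance
    rates are (nearly) equal across candidate score thresholds."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return eer

# Hypothetical scores: same-source samples tend to score higher than
# different-source samples, but the two distributions overlap.
genuine = [0.91, 0.84, 0.78, 0.66, 0.95]
impostor = [0.30, 0.45, 0.52, 0.70, 0.25]
print(f"EER ≈ {equal_error_rate(genuine, impostor):.2%}")
```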


Author(s):  
Michael J. Harris ◽  
Stefan Th. Gries ◽  
Viola G. Miglio

This article describes three studies in prosody and their potential application to the field of forensic linguistics. It begins with a brief introduction to prosody. It then proceeds to describe Miglio, Gries, & Harris (2014), a comparison of prosodic coding of new information by bilingual Spanish-English speakers and monolingual Spanish speakers. A description of Harris & Gries (2011) follows. This study compares the vowel duration variability of bilingual Spanish-English speakers and monolingual Spanish speakers, and touches upon corpus-based frequency effects and differences in linguistic aptitude between the two speaker groups. Finally, a portion of an ongoing study is described (Harris in preparation). This section describes the use of prosodic variables and ensemble methods (or methods that use multiple learning algorithms) to classify languages, even in the case of impoverished data. All three experiments have implications and applications for the field of forensic linguistics, which are touched upon in each respective section and discussed in a more in-depth manner in the final section of this article. Furthermore, the applications of these methods to forensic linguistics are discussed in light of best practices for forensic linguistics, as outlined in Chaski (2013).
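The ensemble-methods component mentioned above can be illustrated with a minimal sketch: a random forest (one common ensemble learner) trained on synthetic "prosodic" feature vectors for two hypothetical languages. This is not the authors' pipeline; the feature set, class means, and the use of scikit-learn are assumptions made purely for the example.

```python
# Sketch: ensemble classification of languages from toy prosodic features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical features per speech sample: mean F0, F0 range, mean vowel
# duration, and a vowel-duration variability score. Values are synthetic.
n_per_class = 40
lang_a = rng.normal(loc=[180, 60, 0.09, 0.45], scale=[20, 15, 0.02, 0.08],
                    size=(n_per_class, 4))
lang_b = rng.normal(loc=[200, 80, 0.11, 0.60], scale=[20, 15, 0.02, 0.08],
                    size=(n_per_class, 4))

X = np.vstack([lang_a, lang_b])
y = np.array([0] * n_per_class + [1] * n_per_class)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"Mean cross-validated accuracy: {scores.mean():.2f}")
```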


Author(s):  
Seung-Man Kang ◽  
Hyoungkeun Lee

This paper delves into the effectiveness of SCAN and its cross-linguistic applicability by analyzing written statements in Korean. For this research, we conducted an experiment in which truth tellers were asked to write a true statement about a staged event and liars a fabricated one about the same event. We analyzed these two types of written statements using the criteria of SCAN. The results (accuracy rate, 81.6%) indicate that SCAN is effective in detecting deception despite the low internal consistency level among coders (Cronbach’s alpha level, 0.577). It was also shown that the SCAN criteria are not universally applicable across languages, as the use of pronouns in Korean yields no significant difference between truthful and deceptive statements.
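The internal-consistency figure reported above (Cronbach's alpha of 0.577) is computed from a statements-by-coders score matrix. The sketch below shows the standard formula applied to invented scores; it is not the authors' analysis, and the coder ratings are hypothetical.

```python
# Sketch: Cronbach's alpha for a matrix of coder scores (rows = statements,
# columns = coders treated as items).
import numpy as np

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1 - item_variances / total_variance)

# Hypothetical SCAN criterion scores from three coders for five statements.
coder_scores = [
    [3, 2, 3],
    [1, 1, 2],
    [4, 3, 4],
    [2, 2, 1],
    [5, 4, 4],
]
print(f"Cronbach's alpha ≈ {cronbach_alpha(coder_scores):.3f}")
```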


2013 ◽  
Vol 1 (1) ◽  
pp. 67-75 ◽  
Author(s):  
Tim Cole ◽  
JC Bruno Teboul ◽  
David E Zulawski ◽  
Douglas E Wicklander ◽  
Shane G Sturman

To date, few experimental studies have looked at the factors that influence people’s willingness to confess to something they did not do.  One widely cited experiment on the topic (i.e., Kassin & Kiechel, 1996) has suggested that false confessions are easy to obtain and that the use of false incriminating evidence increases the likelihood of obtaining one.  The present research attempted to replicate Kassin and Kiechel’s (1996) work using a different experimental task.  In the present experiment, unlike Kassin and Kiechel’s (1996) study, the participants were completely certain that they were not responsible for what had happened, thereby providing a different context for testing the idea that false incriminating evidence increases the likelihood of obtaining a false confession.  The results are discussed with respect to factors that may or may not increase individuals’ willingness to offer a false admission of guilt.


2013 ◽  
Vol 1 (1) ◽  
pp. 41-50 ◽  
Author(s):  
Isabel Picornell

Written witness statements are a unique source for the study of high-stakes textual deception. To date, however, there is no distinction in the way that they and other forms of verbal deception have been analysed, with written statements treated as extensions of transcribed versions of oral reports. Given the highly context-dependent nature of cues, it makes sense to take the characteristics of the medium into account when analysing for deceptive language. This study examines the characteristic features of witness narratives and proposes a new approach to search for deception cues. Narratives are treated as a progression of episodes over time, and deception as a progression of acts over time. This allows for the profiling of linguistic bundles in sequence, revealing the statements’ internal gradient, and deceivers’ choice of deceptive linguistic strategy. Study results suggest that, at least in the context of written witness statements, the weighting of individual features as deception cues is not static but depends on their interaction with other cues, and that detecting deceivers’ use of linguistic strategy is an effective vehicle for identifying deception.
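The episode-by-episode profiling described above can be pictured with a minimal sketch that tracks the proportion of a few cue classes across segments of a statement, exposing its internal gradient. This is not Picornell's method; the segmentation, cue word lists, and example statement are invented for illustration.

```python
# Sketch: per-episode proportions of two illustrative cue classes.
FIRST_PERSON = {"i", "me", "my", "we", "our"}
NEGATION = {"not", "no", "never", "didn't", "don't"}

def episode_profile(episodes):
    """Return per-episode proportions of the illustrative cue classes."""
    profile = []
    for text in episodes:
        tokens = text.lower().split()
        total = len(tokens) or 1
        profile.append({
            "first_person": sum(t in FIRST_PERSON for t in tokens) / total,
            "negation": sum(t in NEGATION for t in tokens) / total,
        })
    return profile

# Hypothetical witness statement segmented into three episodes
# (prologue, incident, aftermath).
statement = [
    "I left my office at five and walked to my car as usual",
    "someone was standing near the door and there was shouting",
    "I did not see his face and I never spoke to the police that night",
]
for i, row in enumerate(episode_profile(statement), 1):
    print(i, row)
```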


2013 ◽  
Vol 1 (1) ◽  
pp. 99-103
Author(s):  
Carole E Chaski

ILER is a web-accessible platform for empirical research in forensic linguistics. It enables new researchers and practitioners to conduct mature research at every stage: literature review, experimental design, experiment review and approval, subject recruitment, human subjects protection and confidentiality agreements, collection of subject demographic information, data collection, and data analysis.


2013 ◽  
Vol 1 (1) ◽  
pp. 76-98
Author(s):  
Harry Hollien

Linguistics and phonetics overlap in many areas. The essay to follow reviews some of the problems experienced by phoneticians in one of these regions. It may provide some insight for linguists when they are confronted by barriers in their own field. The present example involves individuals who are attempting to identify speakers from voice analysis. The fundamental challenge they face is, of course, caused by the thousands of variables associated with that task. Included here are differences among speakers’ gender, age, size, physiology, language, dialect, psychological/health states, background/education, reason for speaking, situation, environment, configuration of the acoustic channel -- plus many others. Many formal assessment procedures -- both aural-perceptual ones conducted by humans and machine/computer-based systems -- have been proposed and/or used for the cited analyses. Unfortunately, however, few have enjoyed particularly high levels of success. Worse yet, reasonable progress has suffered from external impediments; the report to follow will outline some of them. Among the problems considered are: 1) competition (verification vs. identification, from voiceprints), 2) concept disputes, 3) the continued undervaluation of relevant evidence, and 4) markedly dissimilar philosophies of professionals from different disciplines. A response in the form of a short review of the data and concepts which clearly support the possibility of robust speaker identification is presented. Also included are suggestions as to how to enhance the effectiveness of disciplines such as ours.


2013 ◽  
Vol 1 (1) ◽  
pp. 51-66 ◽  
Author(s):  
Lauren B. Collister

This work explores the role of multimodal cues in detection of deception in a virtual world, an online community of World of Warcraft players. Case studies from a five-year ethnography are presented in three categories: small-scale deception in text, deception by avoidance, and large-scale deception in game-external modes. Each case study is analyzed in terms of how the affordances of the medium enabled or hampered deception as well as how the members of the community ultimately detected the deception. The ramifications of deception on the community are discussed, as well as the need for researchers to have a deep community knowledge when attempting to understand the role of deception in a complex society. Finally, recommendations are given for assessment of behavior in virtual worlds and the unique considerations that investigators must give to the rules and procedures of online communities.

