TIARA 2.0: an interactive tool for annotating discourse structure and text improvement

Language Resources and Evaluation ◽

10.1007/s10579-021-09566-0 ◽

2021 ◽

Author(s):

Jan Wira Gotama Putra ◽

Kana Matsumura ◽

Simone Teufel ◽

Takenobu Tokunaga

Keyword(s):

Visual Complexity ◽

Discourse Structure ◽

Annotation Tool ◽

Web Technologies ◽

Four Levels ◽

Structure Annotation ◽

Argumentative Structure ◽

Discourse Units ◽

Interactive Visualisation ◽

Tree View

Abstract Discourse structure annotation aims at analysing how discourse units (e.g. sentences or clauses) relate to each other and what roles they play in the overall discourse. Several annotation tools for discourse structure have been developed. However, they often support only specific annotation schemes, which limits their applicability to new schemes. This article presents TIARA 2.0, an annotation tool for discourse structure and text improvement. Starting from our specific needs, we extend an existing tool to accommodate four levels of annotation: discourse structure, argumentative structure, sentence rearrangement and content alteration. The latter two are unique compared to existing tools. TIARA is implemented on standard web technologies and can be easily customised. It deals with the visual complexity during the annotation process by systematically simplifying the layout and by offering interactive visualisation, including clutter-reducing features and dual-view display. TIARA’s text-view allows annotators to focus on the analysis of logical sequencing between sentences. The tree-view allows them to review their analysis in terms of the overall discourse structure. Apart from being an annotation tool, it is also designed to be useful for educational purposes in the teaching of argumentation; this gives it an edge over other existing tools.


Annotating argumentative structure in English-as-a-Foreign-Language learner essays

Natural Language Engineering ◽

10.1017/s1351324921000218 ◽

2021 ◽

pp. 1-27

Author(s):

Jan Wira Gotama Putra ◽

Simone Teufel ◽

Takenobu Tokunaga

Keyword(s):

Foreign Language ◽

Language Learner ◽

English Speakers ◽

Argumentative Discourse ◽

Annotation Scheme ◽

Efl Learners ◽

Structure Annotation ◽

Argumentative Structure ◽

Discourse Units

Abstract Argument mining (AM) aims to explain how individual argumentative discourse units (e.g. sentences or clauses) relate to each other and what roles they play in the overall argumentation. The automatic recognition of argumentative structure is attractive as it benefits various downstream tasks, such as text assessment, text generation, text improvement, and summarization. Existing studies focused on analyzing well-written texts provided by proficient authors. However, most English speakers in the world are non-native, and their texts are often poorly structured, particularly if they are still in the learning phase. Yet, there is no specific prior study on argumentative structure in non-native texts. In this article, we present the first corpus containing argumentative structure annotation for English-as-a-foreign-language (EFL) essays, together with a specially designed annotation scheme. The annotated corpus resulting from this work is called “ICNALE-AS” and contains 434 essays written by EFL learners from various Asian countries. The corpus presented here is particularly useful for the education domain. On the basis of the analysis of argumentation-related problems in EFL essays, educators can formulate ways to improve them so that they more closely resemble native-level productions. Our argument annotation scheme is demonstrably stable, achieving good inter-annotator agreement and near-perfect intra-annotator agreement. We also propose a set of novel document-level agreement metrics that are able to quantify structural agreement from various argumentation aspects, thus providing a more holistic analysis of the quality of the argumentative structure annotation. The metrics are evaluated in a crowd-sourced meta-evaluation experiment, achieving moderate to good correlation with human judgments.


Investigating genre distinctions through discourse distance and discourse network

Corpus Linguistics and Linguistic Theory ◽

10.1515/cllt-2020-0064 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Kun Sun ◽

Rong Wang ◽

Wenxin Xiong

Keyword(s):

Quantitative Study ◽

Crucial Role ◽

Quantitative Method ◽

Quantitative Methods ◽

Discourse Structure ◽

Structure Theory ◽

Network Data ◽

Quantitative Studies ◽

Discourse Units ◽

Syntactic Dependency

Abstract The notion of genre has been widely explored using quantitative methods from both lexical and syntactical perspectives. However, discourse structure has rarely been used to examine genre. Mostly concerned with the interrelation of discourse units, discourse structure can play a crucial role in genre analysis. Nevertheless, few quantitative studies have explored genre distinctions from a discourse structure perspective. Here, we use two English discourse corpora (RST-DT and GUM) to investigate discourse structure from a novel viewpoint. The RST-DT is divided into four small subcorpora distinguished according to genre, and another corpus (GUM) containing seven genres is used for cross-verification. An RST (rhetorical structure theory) tree is converted into dependency representations by taking information from RST annotations to calculate the discourse distance through a process similar to that used to calculate syntactic dependency distance. Moreover, the data on dependency representations deriving from the two corpora are readily convertible into network data. Afterwards, we examine different genres in the two corpora by combining discourse distance and discourse network. The two methods are mutually complementary in comprehensively revealing the distinctiveness of various genres. Accordingly, we propose an effective quantitative method for assessing genre differences using discourse distance and discourse network. This quantitative study can help us better understand the nature of genre.
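
The step of turning dependency representations into network data can be illustrated with a minimal sketch. It assumes each dependency is a (head, dependent) pair of discourse-unit positions and treats the network as undirected; the function names `build_network` and `degrees` and the sample edges are hypothetical, and the authors' actual conversion may differ.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected adjacency map from (head, dependent) pairs.

    Each pair links two discourse units, identified here by their
    1-based position in the text (an assumption of this sketch).
    """
    neighbours = defaultdict(set)
    for head, dependent in edges:
        neighbours[head].add(dependent)
        neighbours[dependent].add(head)
    # Sort neighbour lists so the result is deterministic.
    return {node: sorted(linked) for node, linked in neighbours.items()}


def degrees(network):
    """Return the number of neighbours (degree) of every node."""
    return {node: len(linked) for node, linked in network.items()}


# Hypothetical dependency edges for a five-unit text.
sample_edges = [(1, 2), (1, 3), (3, 4), (1, 5)]
network = build_network(sample_edges)
node_degrees = degrees(network)
```

Node degree is only one of many network measures; it is used here because it needs nothing beyond the adjacency map.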


The discourse structure of free indirect discourse reports

Linguistics in the Netherlands ◽

10.1075/avt.00048.bim ◽

2021 ◽

Vol 38 ◽

pp. 21-39

Author(s):

Sofia Bimpikou ◽

Emar Maier ◽

Petra Hendriks

Keyword(s):

Discourse Structure ◽

Pronoun Resolution ◽

Free Indirect Discourse ◽

Indirect Discourse ◽

Attribution Analysis ◽

Natural Conversation ◽

Discourse Units

Abstract We investigate the discourse structure of Free Indirect Discourse passages in narratives. We argue that Free Indirect Discourse reports consist of two separate propositional discourse units: an (explicit or implicit) frame segment and a reported content. These segments are connected at the level of discourse structure by a non-veridical, subordinating discourse relation of Attribution, familiar from recent SDRT analyses of indirect discourse constructions in natural conversation (Hunter, 2016). We conducted an experiment to detect the covert presence of a subordinating frame segment based on its effects on pronoun resolution. We compared (unframed) Free Indirect Discourse with overtly framed Indirect Discourse and a non-reportative segment. We found that the first two indeed pattern alike in terms of pronoun resolution, which we take as evidence against the pragmatic context split approach of Schlenker (2004) and Eckardt (2014), and in favor of our discourse structural Attribution analysis.


A computational model for measuring discourse complexity

Discourse Studies ◽

10.1177/1461445619866985 ◽

2019 ◽

Vol 21 (6) ◽

pp. 690-712 ◽

Cited By ~ 1

Author(s):

Kun Sun ◽

Wenxin Xiong

Keyword(s):

Computational Model ◽

Discourse Structure ◽

Structure Theory ◽

Dependency Tree ◽

Discourse Relations ◽

New Perspective ◽

Discourse Units ◽

Syntactic Dependency ◽

Regular Patterns ◽

Quantitative Approaches

In past studies, the few quantitative approaches to discourse structure were mostly confined to the presentation of the frequency of discourse relations. However, quantitative approaches should take into account both hierarchical and relational layers in the discourse structure. This study considers these factors and addresses the issue of how discourse relations and discourse units are related. It draws upon the available corpora of discourse structure (rhetorical structure theory-discourse treebank (RST-DT)) from a new perspective. Since an RST tree can be converted into a syntactic dependency tree, the data extracted from the RST-DT can be useful for calculating the discourse distance in much the same way as syntactic dependency distance is calculated. Discourse distance is also applicable to measuring the depth of the human processing of discourse. Furthermore, the data derived from the RST-DT are also easily converted into network data. This study finds that discourse structure has its own discourse distance minimum and that each type of RST relation has its own range of discourse distance. The frequency distribution of discourse data basically follows the power law on several levels, while a network approach reveals how discourse units are arranged spatially in regular patterns. The two methods are mutually complementary in revealing the interaction between discourse relations and discourse units in a comprehensive manner, as well as in revealing how people process and comprehend discourse dynamically. Accordingly, we propose merging the two methods so as to yield a computational model for assessing discourse complexity and comprehension.
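
The analogy with syntactic dependency distance can be made concrete with a small sketch. It assumes, as in the usual definition of mean dependency distance, that the distance of one dependency is the absolute difference between the linear positions of the head and the dependent; the function name `mean_discourse_distance` and the sample edges are hypothetical, and the study's exact definition may differ.

```python
def mean_discourse_distance(edges):
    """Average absolute positional distance over dependency edges.

    `edges` is a list of (head_position, dependent_position) pairs,
    where positions are the 1-based order of discourse units in the
    text (an assumption of this sketch).
    """
    if not edges:
        raise ValueError("at least one dependency edge is required")
    distances = [abs(head - dependent) for head, dependent in edges]
    return sum(distances) / len(distances)


# Hypothetical dependencies for a five-unit text:
# unit 1 governs units 2, 3 and 5; unit 3 governs unit 4.
sample_edges = [(1, 2), (1, 3), (3, 4), (1, 5)]
average = mean_discourse_distance(sample_edges)  # (1 + 2 + 1 + 4) / 4
```

Grouping the per-edge distances by relation label would give a distance range for each relation type, in the spirit of the range finding reported in the abstract.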


Discourse structure, topicality and questioning

Journal of Linguistics ◽

10.1017/s002222670000058x ◽

1995 ◽

Vol 31 (1) ◽

pp. 109-147 ◽

Cited By ~ 49

Author(s):

Jan Van Kuppevelt

Keyword(s):

Discourse Structure ◽

Alternative Approach ◽

Discourse Units

In this paper we present an alternative approach to discourse structure according to which topicality is the general organizing principle in discourse. This approach accounts for the fact that the segmentation structure of discourse is in correspondence with the hierarchy of topics defined for the discourse units. Fundamental to the proposed analysis is the relation it assumes between the notion of topic and that of explicit and implicit questioning in discourse. This relation implies that (1) the topic associated with a discourse unit is provided by the explicit or implicit question it answers and (2) the relation between discourse units is determined by the relation between these topic-providing questions.


The scope of discourse connectives: implications for discourse organization

Journal of Linguistics ◽

10.1017/s0022226700015942 ◽

1996 ◽

Vol 32 (2) ◽

pp. 403-438 ◽

Cited By ~ 18

Author(s):

Christoph Unger

Keyword(s):

Hierarchical Organization ◽

Discourse Structure ◽

Optimal Relevance ◽

Coherence Relations ◽

Discourse Organization ◽

Discourse Units ◽

Discourse Connectives

The main aim of this paper is to discuss the claim that discourse connectives are best treated as indicators of coherence relations between hierarchically organized discourse units. It will be argued that coherence relations cannot be seen as cognitively real entities. Furthermore, there is no evidence for hierarchical organization in discourse. The intuitions underlying the notion of hierarchical discourse structure are instead explained in terms of consequences of processing a text in the search for optimal relevance. This account draws attention to a hitherto not widely discussed set of data.


Evaluation in Discourse: a Corpus-Based Study

Dialogue & Discourse ◽

10.5087/dad.2016.101 ◽

2016 ◽

Vol 7 (1) ◽

pp. 1-49 ◽

Cited By ~ 1

Author(s):

Farah Benamara ◽

Nicholas Asher ◽

Yvette Yannick Mathieu ◽

Vladimir Popescu ◽

Baptiste Chardon

Keyword(s):

Sentiment Analysis ◽

Semantic Category ◽

Discourse Structure ◽

Annotation Scheme ◽

Letters To The Editor ◽

Discourse Representation Theory ◽

Discourse Relations ◽

Methodological Problems ◽

Discourse Units ◽

The Impact

This paper describes the CASOAR corpus, the first manually annotated corpus that explores the impact of discourse structure on sentiment analysis with a study of movie reviews in French and in English as well as letters to the editor in French. While annotating opinions at the expression, the sentence or the document level is a well-established task and relatively straightforward, discourse annotation remains difficult, especially for non-experts. Therefore, combining both annotations poses several methodological problems that we address here. We propose a multi-layered annotation scheme that includes: the complete discourse structure according to the Segmented Discourse Representation Theory, the opinion orientation of elementary discourse units and opinion expressions, and their associated features. We detail each layer, explore the interactions between them and discuss our results. In particular, we examine the correlation between discourse and semantic category of opinion expressions, the impact of discourse relations on both subjectivity and polarity analysis and the impact of discourse on the determination of the overall opinion of a document. Our results demonstrate that discourse is an important cue for sentiment analysis, at least for the corpus genres we have studied.


The genre element in the systems analyst’s interview

Australian Review of Applied Linguistics ◽

10.1075/aral.15.2.07teb ◽

1992 ◽

Vol 15 (2) ◽

pp. 120-136 ◽

Cited By ~ 1

Author(s):

Helen Tebble

Keyword(s):

Systems Analysis ◽

Discourse Structure ◽

Top Down ◽

Linguistic Description ◽

Computing Industry ◽

Systems Analysts ◽

Speech Events ◽

Discourse Units ◽

Communicative Abilities ◽

National Systems

It has been estimated by those who work in the computing industry that sixty per cent of their time is taken up in communication and only forty per cent is spent on technical work. There is then a clear need to develop the communicative abilities of those in the computer industry. Well designed communication courses for people in computing would benefit from linguistic descriptions of the discourses of this industry. A linguistic description of the structure and genre of the systems analyst’s interview should provide the basis for some of these courses. This paper discusses the genre of the two major types of interviews used by systems analysts and identifies the genre element as the unit of discourse structure that links the lower level and higher level units of discourse structure within systemic linguistics. It draws upon data collected from the depth phase of a national systems analysis project. It is argued that for a full linguistic description of the structure of lengthy speech events within a systemic linguistics framework it is necessary to take both a top down (generic) and bottom up (discourse units) approach.


Anadeixis and the signalling of discourse structure

Quaderns de Filologia - Estudis Lingüístics ◽

10.7203/qf.23.13519 ◽

2018 ◽

Vol 23 (23) ◽

pp. 33 ◽

Cited By ~ 1

Author(s):

Francis Cornish

Keyword(s):

Discourse Structure ◽

French And English ◽

Discourse Units

By “anadeixis” (a term first coined by Ehlich, 1982) is meant, prototypically, the indexical functioning of certain context-bound expressions to target discourse entities which are either not yet topical, or whose erstwhile topical status has faded. It is the discourse-structuring function of anadeictic indexicals that will be the particular focus of this study. The basis for the discussion will be two short whole texts, in two languages (French and English). This will make it possible to show how certain ‘strict’-anadeictic and discourse-deictic references may signal the macro- (content structures) and super-structures (discourse-functional structures) that characterize them. Such references may serve either to foreshadow a transition between major discourse units within a given text, or to actually introduce one.


Investigating Genre Distinctions through Discourse Distance and Discourse Network

10.31234/osf.io/eywvz ◽

2021 ◽

Author(s):

Kun Sun

Keyword(s):

Quantitative Study ◽

Quantitative Method ◽

Quantitative Methods ◽

Discourse Structure ◽

Structure Theory ◽

Network Data ◽

Effective Strategies ◽

Quantitative Studies ◽

Discourse Units ◽

Syntactic Dependency

The notion of genre has been widely explored using quantitative methods from both lexical and syntactical perspectives. However, discourse structure has rarely been used to examine genre. Mostly concerned with the interrelation of discourse units, discourse structure can play a crucial role in genre analysis because genre is closely related to discourse (text). Nevertheless, few quantitative studies have explored genre distinctions from a discourse structure perspective. Here, we use two English discourse corpora (RST-DT and GUM) to investigate the hierarchical and relational dimensions of discourse structure from a novel viewpoint. The RST-DT is divided into four small subcorpora distinguished according to genre, and another corpus (GUM) containing seven genres is used for cross-verification. An RST (rhetorical structure theory) tree is converted into dependency representations by taking information from RST annotations to calculate the discourse distance through a process similar to that used to calculate syntactic dependency distance. Moreover, the data on dependency representations stemming from the two corpora is readily convertible into network data. Afterwards, we examine different genres in the two corpora by combining discourse distance and discourse network. The two methods are mutually complementary in comprehensively revealing the distinctiveness of various genres. Accordingly, we propose an effective quantitative method for assessing genre differences using discourse distance and discourse network. This quantitative study can help us better understand the nature of genre and develop effective strategies for genre-based writing.
