Evaluation in Discourse: a Corpus-Based Study

2016 ◽  
Vol 7 (1) ◽  
pp. 1-49 ◽  
Author(s):  
Farah Benamara ◽  
Nicholas Asher ◽  
Yvette Yannick Mathieu ◽  
Vladimir Popescu ◽  
Baptiste Chardon

This paper describes the CASOAR corpus, the first manually annotated corpus that explores the impact of discourse structure on sentiment analysis with a study of movie reviews in French and in English as well as letters to the editor in French. While annotating opinions at the expression, the sentence or the document level is a well-established task and relatively straightforward, discourse annotation remains difficult, especially for non-experts. Therefore, combining both annotations poses several methodological problems that we address here. We propose a multi-layered annotation scheme that includes: the complete discourse structure according to the Segmented Discourse Representation Theory, the opinion orientation of elementary discourse units and opinion expressions, and their associated features. We detail each layer, explore the interactions between them and discuss our results. In particular, we examine the correlation between discourse and semantic category of opinion expressions, the impact of discourse relations on both subjectivity and polarity analysis and the impact of discourse on the determination of the overall opinion of a document. Our results demonstrate that discourse is an important cue for sentiment analysis, at least for the corpus genres we have studied.

2019 ◽  
Vol 21 (6) ◽  
pp. 690-712 ◽  
Author(s):  
Kun Sun ◽  
Wenxin Xiong

In past studies, the few quantitative approaches to discourse structure were mostly confined to presenting the frequency of discourse relations. However, quantitative approaches should take into account both the hierarchical and the relational layers of discourse structure. This study considers these factors and addresses the issue of how discourse relations and discourse units are related. It draws upon an available corpus of discourse structure, the Rhetorical Structure Theory Discourse Treebank (RST-DT), from a new perspective. Since an RST tree can be converted into a syntactic dependency tree, the data extracted from the RST-DT can be used to calculate discourse distance in much the same way as syntactic dependency distance is calculated. Discourse distance is also applicable to measuring the depth of the human processing of discourse. Furthermore, the data derived from the RST-DT are easily converted into network data. This study finds that discourse structure has a discourse distance minimum and that each type of RST relation has its own range of discourse distance. The frequency distribution of discourse data basically follows the power law on several levels, while a network approach reveals how discourse units are arranged spatially in regular patterns. The two methods are mutually complementary in revealing the interaction between discourse relations and discourse units in a comprehensive manner, as well as in revealing how people process and comprehend discourse dynamically. Accordingly, we propose merging the two methods so as to yield a computational model for assessing discourse complexity and comprehension.
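The analogy with syntactic dependency distance can be illustrated with a minimal sketch. It assumes, hypothetically, that elementary discourse units are numbered in text order and that each discourse dependency is a (head, dependent) pair of those positions; the function name and the toy edges are illustrative and are not taken from the study.

```python
from typing import List, Tuple


def mean_discourse_distance(edges: List[Tuple[int, int]]) -> float:
    """Average absolute gap between head and dependent unit positions.

    Mirrors the usual mean dependency distance computation; applying it to
    discourse units is an assumption about how the analogy might be drawn.
    """
    if not edges:
        raise ValueError("at least one dependency is required")
    return sum(abs(head - dep) for head, dep in edges) / len(edges)


# Hypothetical four-unit text: unit 2 governs units 1 and 4, unit 4 governs unit 3.
toy_edges = [(2, 1), (2, 4), (4, 3)]
toy_mean = mean_discourse_distance(toy_edges)
```

Here the three toy dependencies span 1, 2 and 1 positions, so the mean is 4/3.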


2018 ◽  
Vol 19 (2) ◽  
pp. 205-231
Author(s):  
Anita Fetzer ◽  
Augustin Speyer

This paper presents an analysis of the linguistic realization of discourse relations across and within English and German discourse, comparing the genres of newspaper editorial and personal narrative. It concentrates on Continuation, Narration and Contrast, and Elaboration, Explanation and Comment. Particular attention is given to (1) their overt realization with textual themes and pragmatic word order, and (2) the (non)adjacent positioning of discourse units realizing the relations. The methodological framework is an integrated one, supplementing Systemic Functional Grammar with Segmented Discourse Representation Theory. In the English and German narratives, there is a strong tendency to realize discourse relations overtly. The overall overt realization is significantly higher for narratives in both languages with editorials being significantly less overt. There are also significant differences in the overt realization of non-adjacently positioned units realizing discourse relations with significant distributions in all cases, although the distribution in the narratives is less significant.


Author(s):  
Radoslava Trnavac ◽  
Maite Taboada

Taboada et al. (2008) propose a word-based method for extracting sentiment from text that relies on the most relevant parts of a text. The method predicts that opinion words found in the nuclei (more important parts) of a document are more significant for the overall sentiment, whereas opinion words found in the satellites (less important parts) only potentially interfere with the overall sentiment. However, as pointed out by Taboada et al. (2008) and Narayanan et al. (2009), for certain discourse relations (for instance, Condition relations), the calculation of sentiment should involve both parts of the relation. Based on our analysis of the affective content expressed by automatically extracted discourse relations from the Simon Fraser University Corpus (Taboada 2008) and the Penn Discourse Treebank (Prasad et al. 2008), we propose to classify all the discourse relations into four categories: (1) relations that reverse polarity, (2) intensify polarity, (3) downtone polarity, or (4) produce no change in polarity. We compare the performance of a sentiment analysis system (SO-CAL, Taboada et al. 2011) when opinion words are detected only in the nuclei with its performance when both parts of the relation are analyzed in combination with the opinion words. The results of the experiment show that extraction of both the nucleus and the satellite parts of texts does not improve the performance of a sentiment extraction system.
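The four-way classification can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `Effect` names, the `adjust` function, and the `boost` and `damp` factors are hypothetical, and the sketch does not reproduce SO-CAL or say which discourse relation belongs to which category.

```python
from enum import Enum


class Effect(Enum):
    """The four hypothetical effect categories a relation may have on polarity."""
    REVERSE = "reverse"
    INTENSIFY = "intensify"
    DOWNTONE = "downtone"
    NO_CHANGE = "no_change"


def adjust(score: float, effect: Effect, boost: float = 1.5, damp: float = 0.5) -> float:
    """Apply a relation's effect to an opinion score.

    The boost and damp factors are illustrative defaults, not reported values.
    """
    if effect is Effect.REVERSE:
        return -score
    if effect is Effect.INTENSIFY:
        return score * boost
    if effect is Effect.DOWNTONE:
        return score * damp
    return score
```

With these assumed factors, a score of 2.0 becomes -2.0 when reversed, 3.0 when intensified, 1.0 when downtoned, and stays 2.0 when the relation produces no change.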


2013 ◽  
Vol 4 (2) ◽  
pp. 142-173 ◽  
Author(s):  
Yannick Versley ◽  
Anna Gastel

Discourse structure and discourse relations are an important ingredient in systems for the analysis of text that go beyond the boundary of single clauses. Discourse relations often indicate important additional information about the connection between two clauses, such as causality, and are widely believed to have an influence on aspects of reference resolution. In this article, we first present the general design choices that are to be made in the design of an annotation scheme for discourse structure and discourse relations. In the second part, we present the scheme used in our annotation of selected articles from the TüBa-D/Z treebank of German (Telljohann et al., 2009). The scheme used in the annotation is theory-neutral, but informed by more detailed linguistic knowledge in the form of linguistic tests that can help disambiguate between several plausible relations.


2021 ◽  
Vol 184 ◽  
pp. 148-155
Author(s):  
Abdul Munem Nerabie ◽  
Manar AlKhatib ◽  
Sujith Samuel Mathew ◽  
May El Barachi ◽  
Farhad Oroumchian

2019 ◽  
Vol 81 (1-2) ◽  
pp. 81-86
Author(s):  
Pierre Koskas ◽  
Mouna Romdhani ◽  
Olivier Drunat

As commonly happens in epidemiological research, none of the reported studies were totally free of methodological problems. Studies have considered the influence of social relationships on dementia, but the mechanisms underlying these associations are not perfectly understood. We look at the possible impact of selection bias. For their first memory consultation, patients may come alone or accompanied by a relative. Our objective is to better understand the impact of this factor by retrospective follow-up of geriatric memory outpatients over several years. All patients over 70 who were referred to Bretonneau Memory Clinic for the first time, between January 2006 and 2018, were included in the study. The patients who came alone formed group 1; the others, whatever type of relative accompanied them, formed group 2. We compared the Mini-Mental State Examination (MMSE) scores of the two groups; for all patients who came twice for consultation with at least a 60-day interval, we also compared their first MMSE with the MMSE performed at the second consultation. In total, 2,935 patients were included, aged 79.7 ± 8.4 years. Six hundred and twenty-five formed group 1 and 2,310 formed group 2. We found a significant difference in MMSE scores between the two groups, and a significant difference between the first and second consultations in group 2, whereas that difference was minor in group 1. Our finding of a possible confounding factor underlines the complexity of choosing comparison groups in order to minimize selection bias while maintaining clinical relevance.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Kun Sun ◽  
Rong Wang ◽  
Wenxin Xiong

The notion of genre has been widely explored using quantitative methods from both lexical and syntactical perspectives. However, discourse structure has rarely been used to examine genre. Mostly concerned with the interrelation of discourse units, discourse structure can play a crucial role in genre analysis. Nevertheless, few quantitative studies have explored genre distinctions from a discourse structure perspective. Here, we use two English discourse corpora (RST-DT and GUM) to investigate discourse structure from a novel viewpoint. The RST-DT is divided into four small subcorpora distinguished according to genre, and another corpus (GUM) containing seven genres is used for cross-verification. An RST (rhetorical structure theory) tree is converted into dependency representations by taking information from RST annotations to calculate the discourse distance through a process similar to that used to calculate syntactic dependency distance. Moreover, the data on dependency representations derived from the two corpora are readily convertible into network data. Afterwards, we examine different genres in the two corpora by combining discourse distance and discourse network. The two methods are mutually complementary in comprehensively revealing the distinctiveness of various genres. Accordingly, we propose an effective quantitative method for assessing genre differences using discourse distance and discourse network. This quantitative study can help us better understand the nature of genre.
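The step from dependency representations to network data can be sketched as follows. The sketch assumes, hypothetically, that each dependency is a (head, dependent) pair of discourse unit identifiers and that the network is treated as undirected; the function names are illustrative and the study's actual network construction may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def to_network(edges: Iterable[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Build an undirected adjacency map from (head, dependent) pairs."""
    graph: Dict[int, Set[int]] = defaultdict(set)
    for head, dep in edges:
        graph[head].add(dep)
        graph[dep].add(head)
    return dict(graph)


def degrees(graph: Dict[int, Set[int]]) -> Dict[int, int]:
    """Number of distinct neighbours of each discourse unit."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


# Hypothetical dependencies for a four-unit text.
toy_graph = to_network([(2, 1), (2, 4), (4, 3)])
toy_degrees = degrees(toy_graph)
```

Node degree is only one of many network measures that could be compared across genres; it is used here because it needs nothing beyond the standard library.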


Electronics ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 374 ◽  
Author(s):  
Sudhanshu Kumar ◽  
Monika Gahalawat ◽  
Partha Pratim Roy ◽  
Debi Prosad Dogra ◽  
Byung-Gyu Kim

Sentiment analysis is a rapidly growing field of research due to the explosive growth in digital information. In the modern world of artificial intelligence, sentiment analysis is one of the essential tools to extract emotion information from massive data. Sentiment analysis is applied to a variety of user data from customer reviews to social network posts. To the best of our knowledge, there is little work on sentiment analysis based on the categorization of users by demographics. Demographics play an important role in deciding the marketing strategies for different products. In this study, we explore the impact of age and gender in sentiment analysis, as this can help e-commerce retailers to market their products based on specific demographics. The dataset is created by collecting reviews on books from Facebook users by asking them to answer a questionnaire containing questions about their preferences in books, along with their age groups and gender information. Next, the paper analyzes the segmented data for sentiments based on each age group and gender. Finally, sentiment analysis is done using different Machine Learning (ML) approaches including maximum entropy, support vector machine, convolutional neural network, and long short-term memory to study the impact of age and gender on user reviews. Experiments have been conducted to identify new insights into the effect of age and gender for sentiment analysis.


2014 ◽  
Vol 31 ◽  
pp. 13-25
Author(s):  
Enrico Boone

This paper is concerned with the correct characterization of the licensing condition on clausal ellipsis and how it relates to the distribution of ellipsis. I argue, essentially following López (2000), that ellipsis is licensed when the ellipsis clause bears a relation to an antecedent in the discourse component. A relation between two discourse units can be established in two ways: (1) Either there holds a direct relation between the two discourse units or (2) there holds an anaphoric relation mediated by a discourse anaphor. In this paper, I show how this two-way distinction in setting up discourse relations accounts for the two-way split we find in the distribution of ellipsis.

