This paper describes the CASOAR corpus, the first manually annotated corpus that
explores the impact of discourse structure on sentiment analysis through a study of movie
reviews in French and in English as well as letters to the editor in French. While
annotating opinions at the expression, sentence or document level is a
well-established and relatively straightforward task, discourse annotation remains
difficult, especially for non-experts. Combining the two annotations therefore poses
several methodological problems that we address here. We propose a multi-layered
annotation scheme that includes: the complete discourse structure according to the
Segmented Discourse Representation Theory, the opinion orientation of elementary
discourse units and opinion expressions, and their associated features. We detail each
layer, explore the interactions between them and discuss our results. In particular, we
examine the correlation between discourse and the semantic category of opinion expressions,
the impact of discourse relations on both subjectivity and polarity analysis and the
impact of discourse on the determination of the overall opinion of a document. Our
results demonstrate that discourse is an important cue for sentiment analysis, at least
for the corpus genres we have studied.