Using LSA for Pronominal Anaphora Resolution

Author(s):  
Beata Klebanov ◽  
Peter Wiemer-Hastings
Author(s):  
Vilson J. Leffa

A typical problem in the resolution of pronominal anaphora is the presence of more than one candidate for the antecedent of the pronoun. Consider two English sentences: (1) "People buy expensive cars because they offer more status" and (2) "People buy expensive cars because they want more status". From a purely syntactic perspective, the two NPs "people" and "expensive cars" are both legitimate candidates as antecedents for the pronoun "they". This problem has traditionally been solved by using world knowledge (e.g. schema theory), where, through an internal representation of the world, we "know" that cars "offer" status and people "want" status. The assumption in this paper is that the use of world knowledge does not explain how the disambiguation process works and that alternative explanations should be explored. Using a knowledge-poor approach (explicit information from the text rather than implicit world knowledge), the study investigates to what extent syntactic and semantic constraints can be used to resolve anaphora. For this purpose, 1,400 examples of the word "they" were randomly selected from a corpus of 10,000,000 words of expository text in English. Antecedent candidates for each case were then analyzed and classified in terms of their syntactic functions in the sentence (subject, object, etc.) and semantic features (+human, +animate, etc.). It was found that syntactic constraints resolved 85% of the cases. When combined with semantic constraints, the resolution rate rose to 98%. The implications of the findings for Natural Language Processing are discussed.
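The idea of filtering antecedent candidates first by syntactic and then by semantic constraints can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the `Candidate` class, the `resolve_they` function, and the toy table `EXPECTED_HUMAN_SUBJECT` are hypothetical names introduced here, and the feature values are hand-assigned only to cover the two example sentences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    role: str      # syntactic function, e.g. "subject" or "object"
    human: bool    # semantic feature +human
    plural: bool   # number, used for agreement with "they"


# Hypothetical toy table (an assumption, not a linguistic claim):
# the +human value expected of the subject of each verb in the examples.
EXPECTED_HUMAN_SUBJECT = {"want": True, "offer": False}


def resolve_they(candidates, verb):
    """Pick an antecedent for "they" as the subject of `verb`."""
    # Syntactic constraint: "they" needs a plural antecedent.
    pool = [c for c in candidates if c.plural]
    # Semantic constraint: keep candidates whose +human value matches
    # what the verb expects, when the verb is in the toy table.
    expected = EXPECTED_HUMAN_SUBJECT.get(verb)
    if expected is not None:
        matching = [c for c in pool if c.human == expected]
        if matching:
            pool = matching
    # Fallback preference: subjects are ranked before other functions.
    pool.sort(key=lambda c: c.role != "subject")
    return pool[0] if pool else None


candidates = [
    Candidate("people", role="subject", human=True, plural=True),
    Candidate("expensive cars", role="object", human=False, plural=True),
]
print(resolve_they(candidates, "offer").text)
print(resolve_they(candidates, "want").text)
```

In this sketch both NPs pass the syntactic check, so only the semantic feature separates them, which mirrors the ambiguity described for sentences (1) and (2).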


Author(s):  
Priya Lakhmani ◽  
Smita Pratistha Mathur ◽  
Sudha Morwal

Author(s):  
Alonso García ◽  
Martha Victoria González ◽  
Francisco López-Orozco ◽  
Lucero Zamora

Recent technological advances have allowed the development of numerous natural language processing applications with which users frequently interact. When interacting with this type of application, users often seek economy of words, which promotes the use of pronouns and thereby highlights the well-known anaphora problem. This chapter describes a proposal for addressing pronominal anaphora in Spanish. A set of rules (based on the EAGLES standard) was designed to identify the referents of personal pronouns through the structure of the grammatical tags of the words. The proposed algorithm uses the online FreeLing service to perform tokenization and tagging tasks. The performance of the algorithm was compared with an online version of FreeLing, and the proposed algorithm shows better performance.


2001 ◽  
Vol 15 ◽  
pp. 263-287 ◽  
Author(s):  
M. Palomar ◽  
P. Martinez-Barco

This paper presents an algorithm for identifying noun-phrase antecedents of pronouns and adjectival anaphors in Spanish dialogues. We believe that anaphora resolution requires numerous sources of information in order to find the correct antecedent of the anaphor. These sources can be of different kinds, e.g., linguistic information, discourse/dialogue structure information, or topic information. For this reason, our algorithm uses several kinds of information (hybrid information). The algorithm is based on linguistic constraints and preferences and uses an anaphoric accessibility space within which it searches for the antecedent noun phrase. We present some experiments related to this algorithm and this space using a corpus of 204 dialogues. The algorithm is implemented in Prolog. According to this study, 95.9% of antecedents were located in the proposed space, and a precision of 81.3% was obtained for pronominal anaphora resolution and 81.5% for adjectival anaphora.
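The general pattern of constraints and preferences applied inside an accessibility space can be sketched as follows. This is an illustration in Python, not the authors' Prolog implementation: the `NounPhrase` class, the `resolve` function, the fixed utterance window standing in for the accessibility space, and the scoring weights are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NounPhrase:
    text: str
    gender: str       # "m" or "f"
    number: str       # "sg" or "pl"
    utterance: int    # index of the utterance containing the NP
    is_subject: bool


def resolve(gender, number, utterance, noun_phrases, window=3):
    """Return the preferred antecedent for an anaphor, or None."""
    # Accessibility space (assumed here to be a simple window of
    # preceding utterances): only NPs inside it are considered.
    space = [np for np in noun_phrases
             if 0 <= utterance - np.utterance <= window]
    # Constraints: hard filters on gender and number agreement.
    agreeing = [np for np in space
                if np.gender == gender and np.number == number]

    # Preferences: soft ranking with assumed weights that favour
    # subjects and penalise distance from the anaphor.
    def score(np):
        return (2 if np.is_subject else 0) - (utterance - np.utterance)

    return max(agreeing, key=score, default=None)


nps = [
    NounPhrase("la mesa", "f", "sg", utterance=0, is_subject=True),
    NounPhrase("el libro", "m", "sg", utterance=1, is_subject=False),
    NounPhrase("la silla", "f", "sg", utterance=1, is_subject=False),
]
# Feminine singular pronoun in utterance 2.
print(resolve("f", "sg", 2, nps).text)
print(resolve("f", "sg", 2, nps, window=1).text)
```

Shrinking the window changes the answer in this toy case, which shows why the choice of accessibility space matters independently of the constraints and preferences.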

