DISCO PAL: Diachronic Spanish sonnet corpus with psychological and affective labels

Author(s):  
Alberto Barbado ◽  
Víctor Fresno ◽  
Ángeles Manjarrés Riesco ◽  
Salvador Ros

Abstract: Nowadays, text mining is applied to corpora in many languages. However, most applications are based on prose, and few work with poetry. One example of text mining applied to poetry is the use of features derived from individual words to capture lexical, sublexical and interlexical meaning and to infer the General Affective Meaning (GAM) of the text. Although this approach has proved useful for poetry in some languages, studies are lacking both for Spanish poetry and for highly structured poetic compositions such as sonnets. This article presents a study of an annotated corpus of Spanish sonnets that analyses whether features built from their individual words can predict their GAM, with the aim of modelling sonnets at an affective level. The article also analyses the relationship between the GAM of the sonnets and their content. To do so, we consider the content from a psychological perspective, tagging each sonnet with the psychological terms it relates to, and then study how GAM changes across those terms. The corpus contains 274 Spanish sonnets by authors from the fifteenth to the nineteenth century. Domain experts annotated the poems with affective and lexico-semantic features, as well as with domain concepts from psychology. As a result, the corpus can be used in applications such as poetry recommender systems, personality text mining studies of the authors, or the use of poetry for therapeutic purposes.
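
As a rough illustration of building features from individual words, the sketch below averages per-word valence scores over a line of verse. It is not the authors' method: the lexicon `VALENCE`, its scores, and the function `affective_features` are all hypothetical and invented for this example.

```python
from statistics import mean

# Hypothetical valence lexicon (word -> score in [-1, 1]).
# The words and scores are invented for illustration only.
VALENCE = {"amor": 0.9, "dulce": 0.7, "muerte": -0.9, "dolor": -0.8, "luz": 0.6}

def affective_features(text, lexicon=VALENCE):
    """Aggregate word-level valence scores into text-level features."""
    # Naive tokenisation: split on whitespace, strip common punctuation.
    tokens = [t.strip(".,;:!?¡¿").lower() for t in text.split()]
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        return {"mean_valence": 0.0, "valence_span": 0.0, "coverage": 0.0}
    return {
        "mean_valence": mean(scores),                # overall affective tone
        "valence_span": max(scores) - min(scores),   # contrast within the text
        "coverage": len(scores) / len(tokens),       # share of words in the lexicon
    }

features = affective_features("Dulce amor, dolor y muerte")
```

Features of this kind could then be passed to any regression or classification model; which model suits GAM prediction is left open here.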

2011 ◽  
pp. 145-158
Author(s):  
Stanley Loh ◽  
Daniel Lichtnow ◽  
Thyago Borges ◽  
Gustavo Piltcher

This chapter investigates different aspects of the construction of a domain ontology for a content-based recommender system. The recommender system suggests textual electronic documents from a Digital Library, based on the documents read by the users and on the textual messages they post in electronic discussions through a web chat. The domain ontology is used to represent the user’s interests and the content of the documents. In this context, the ontology is composed of a hierarchy of concepts and keywords. Each concept has a vector of keywords with associated weights. Keywords are used to identify the content of the texts (documents and messages) through the application of text mining techniques. The chapter discusses different approaches to constructing the domain ontology, including the use of text mining software tools for supervised learning, the involvement of domain experts in the engineering process and the use of a normalization step.
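
The idea of concepts carrying weighted keyword vectors can be sketched as follows. This is a minimal illustration rather than the chapter's system: the `ONTOLOGY` contents, the weights, and the functions `score_concepts` and `best_concept` are assumptions made up for the example, and the concept hierarchy is omitted.

```python
# Hypothetical flat ontology: each concept maps keywords to weights.
# Concepts, keywords, and weights are invented for illustration.
ONTOLOGY = {
    "text mining": {"corpus": 0.8, "token": 0.6, "classification": 0.5},
    "databases": {"query": 0.9, "index": 0.7, "transaction": 0.6},
}

def score_concepts(text, ontology=ONTOLOGY):
    """Sum the weights of each concept's keywords that occur in the text."""
    tokens = text.lower().split()
    return {
        concept: sum(keywords.get(tok, 0.0) for tok in tokens)
        for concept, keywords in ontology.items()
    }

def best_concept(text, ontology=ONTOLOGY):
    """Return the highest-scoring concept, or None if no keyword matches."""
    scores = score_concepts(text, ontology)
    concept = max(scores, key=scores.get)
    return concept if scores[concept] > 0 else None

message = "how do i index a corpus for token classification"
print(best_concept(message))
```

The same scoring can be applied to a chat message and to a library document, so that both are mapped to concepts and then matched.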


2020 ◽  
Author(s):  
Amir Karami ◽  
Brandon Bookstaver ◽  
Melissa Nolan

BACKGROUND: The COVID-19 pandemic has impacted nearly all aspects of life and has posed significant threats to international health and the economy. Given the rapidly unfolding nature of the current pandemic, there is an urgent need to streamline literature synthesis of the growing scientific research to elucidate targeted solutions. While traditional systematic literature review studies provide valuable insights, they have limitations: they analyze a limited number of papers, are subject to various biases, are time-consuming and labor-intensive, focus on a few topics, cannot support trend analysis, and lack data-driven tools. OBJECTIVE: This study addresses these limitations in the literature and in practice by using text mining methods to analyze two biomedical concepts, clinical manifestations of disease and therapeutic chemical compounds, in a corpus of COVID-19 research papers, and by finding associations between the two concepts. METHODS: This research collected papers representing COVID-19 pre-prints and peer-reviewed research published in 2020. We used frequency analysis to find highly frequent manifestations and therapeutic chemicals, representing the importance of the two biomedical concepts. This study also applied topic modeling to find the relationship between the two biomedical concepts. RESULTS: We analyzed 9,298 research papers published through May 5, 2020 and found 3,645 disease-related and 2,434 chemical-related articles. The most frequent terms for clinical manifestations of disease included COVID-19, SARS, cancer, pneumonia, fever, and cough. The most frequent chemical-related terms included Lopinavir, Ritonavir, Oxygen, Chloroquine, Remdesivir, and water. Topic modeling provided 25 categories showing relationships between our two overarching categories. These categories represent statistically significant associations between multiple aspects of each category, some connections of which were novel and not previously identified by the scientific community. CONCLUSIONS: Appreciation of this context is vital due to the lack of a systematic large-scale literature review survey and the importance of fast literature review during the current COVID-19 pandemic for developing treatments. This study is beneficial to researchers for obtaining a macro-level picture of the literature, to educators for knowing the scope of the literature, to journals for exploring the most discussed disease symptoms and pharmaceutical targets, and to policymakers and funding agencies for creating scientific strategic plans regarding COVID-19.
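
The frequency-analysis step can be illustrated with a small document-frequency count. This is a sketch under stated assumptions, not the study's pipeline: the term list `TERMS`, the toy `docs`, and the function `document_frequency` are hypothetical, and a real study would rely on a curated biomedical vocabulary rather than a hand-written list.

```python
from collections import Counter

# Hypothetical term list; a real analysis would use a curated vocabulary.
TERMS = ["fever", "cough", "pneumonia"]

def document_frequency(docs, terms):
    """Count, for each term, the number of documents that mention it."""
    counts = Counter()
    for doc in docs:
        # Naive tokenisation: lowercase, drop commas and full stops, split.
        words = set(doc.lower().replace(",", " ").replace(".", " ").split())
        counts.update(t for t in terms if t in words)
    return counts

# Toy abstracts invented for illustration.
docs = [
    "Patients presented with fever and cough.",
    "Fever was the most common symptom.",
    "Pneumonia developed in severe cases, often after cough.",
]
ranking = document_frequency(docs, TERMS).most_common()
print(ranking)
```

Counting documents rather than raw occurrences keeps one long paper from dominating the ranking, which is one reasonable design choice among several.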


Author(s):  
Martin Haspelmath

This chapter focuses on various theoretical approaches to the semantic and syntactic functions of indefinite pronouns. It begins with a discussion of structuralist semantics, which suggests that language is a system whose parts must be defined and described on the basis of their place in the system and their relation to each other, rather than on the basis of their own intrinsic properties. It then considers some of the problems associated with structuralist semantics, including the unclear status of the semantic features; the significant overlap of the functions of grammatical items in many areas, including indefinite pronouns; and the wrong predictions that structuralist semantics makes about semantic change. The chapter proceeds by analysing logical semantics and the issues raised by this approach, along with syntactic approaches, the theory of mental spaces, pragmatic scales and scale reversal. Finally, it explains the relationship between focusing and sentence accent.


2017 ◽  
Vol 8 (5) ◽  
pp. 158
Author(s):  
Robert S.P. Jones

James Joyce’s Portrait of the Artist as a Young Man has fascinated readers for more than a century, and there are layers of psychological meaning to be found throughout the novel. The novel is the perfect vehicle to discuss the relationship between form, language, and emotion, as Joyce deliberately manipulated the emotional response of the reader through innovations in form and language, departing dramatically from previous literary traditions. This paper attempts to take a fresh look at the novel from a psychological perspective and seeks to examine underlying conditioning processes at work in the narrative, particularly the concept of associative learning. Understanding emotional responses to different stimuli is the bedrock of psychological investigation, and 100 years after its publication, Portrait of the Artist presents remarkably fresh insights into the human experience of emotion. Despite its age, Portrait of the Artist contains many contemporary psychological insights.


2017 ◽  
Vol 1 (2) ◽  
pp. 160
Author(s):  
Hindra Kurniawam

Instruction sentences in textbook exercises contain commands that students must understand and carry out as the authors intend. They aim to train students to evaluate their learning results. Changes of grammatical form in the text aim to make it easier for students to uncover the meaning of the instructions in a textbook exercise. To better understand these changes of grammatical form, they are presented in terms of the Indonesian grammaticalization process. In a grammaticalization process, changes arise that produce differences in meaning, in grammatical form and composition, and in sentence structure. The meaning can be new, or it can retain an old meaning that still survives. A new meaning appears when a sentence element is reduced or an element is added. Changes in the core elements of a sentence change the reader's mindset in predicting the primary purpose of the guidance that n gives to t for the exercise. Unchanged meaning is a core element of the sentence that remains even after a grammaticalization process has taken place, so the intent conveyed by n to t stays the same. The first process found is desemanticization: the exercise guidance loses specific semantic features because semantic elements within the meaning of a word are lost, so that the meaning becomes narrower and focused on one speech purpose. The analysis of sentences that undergo desemanticization found that some of the resulting changes differ and that most suffer from ambiguity. This ambiguity can leave t disoriented about what to do with a task given by n. Desemanticization changes affective meaning into thematic meaning, thematic meaning into conceptual meaning, thematic meaning into collocative meaning, and affective meaning into connotative meaning. These changes of meaning cannot be separated from the desemanticization process, but the meaning carries a different nuance in each type.


2004 ◽  
Vol 28 (1) ◽  
pp. 15-49
Author(s):  
Simon Perry

Mussorgsky's Sunless cycle is aesthetically and stylistically an anomalous member of his oeuvre. Its notably effaced, pared-down, and withdrawn qualities present challenges to critical interpretation. Its uniqueness, however, renders it a crucial work for furnishing the fullest possible picture of Mussorgsky as a creative artist. The author of its texts, Golenishchev-Kutuzov (whose relationship with Mussorgsky at the time of its writing possibly extended beyond the platonic) has been identified by recent scholarship as an essential "eye-witness" for those to whom Stasov's populist characterization of the composer does not ring entirely true. Golenishchev-Kutuzov believed that in Sunless Mussorgsky first revealed his authentic artistic self. According to Golenishchev-Kutuzov, Mussorgsky regarded his signal achievement in Sunless to have been the eradication of all elements other than "feeling." In other words, he had thrown off the stylistic shackles imposed by the aesthetics of realism and relied entirely on intuitive harmonic invention as the sole conveyor of a purely subjective, "affective" meaning in the cycle. This hypothesis forms the point of departure for an investigation of select numbers of the cycle. Analysis reveals that the affective aspect is not the only significant element operative. Alongside remnants of the realist style, there is evidence, of varying degrees of subtlety, for a knowing use of symmetrical pitch organization. Mussorgsky not only adapted the usual referential attachments of symmetrically based chromaticism--typically found in Russian operas of the second half of the nineteenth century--he also, through extremely simple but effective means, synthesized the "intuitive" harmonic and "rational" symmetrical elements of the cycle's pitch organization so that the latter emerges seamlessly out of the former. This remarkable synthesis ensures the cycle's uniformity of tone while also allowing for a reading that extends beyond the generally affective to the symbolically more specific. This symbolic level of reading offers several interpretative possibilities, one of which may refer even to the relationship of the poet and the composer. Irrespective of such potentials for interpretation, the most significant achievement in the cycle remains the synthesis of the intuitive/affective and rational/symbolic elements of its organization. Songs 1, 2, 3, and 6 of the cycle are considered in detail.


2019 ◽  
Vol 59 (4) ◽  
pp. 722-741 ◽  
Author(s):  
Paul Phillips ◽  
Nuno Antonio ◽  
Ana de Almeida ◽  
Luís Nunes

This study examines the relationship between distance measures and a Portuguese data set consisting of 34,622 online hotel reviews extracted from Booking.com and TripAdvisor and written in Portuguese, Spanish, and English. Based on the country of origin of each review author, a geographic and a psychic distance measure to Portugal are calculated. Data and text mining analysis provides additional insights into online hotel ratings. The authors confirm that online travelers’ evaluations are multifaceted constructs displaying varying patterns of rating behavior among the traveler base. By investigating the contemporary relevance of geographic and psychic distance, a key finding of this study is that travelers at a smaller psychic and geographic distance give lower rating scores than travelers at a greater distance. The inclusion of psychic and geographic distance is advocated as a salient aspect for future researchers and for those practitioners who wish to enhance hotel product and service features.


2019 ◽  
Vol 62 (2) ◽  
pp. 195-215
Author(s):  
Frederik Situmeang ◽  
Nelleke de Boer ◽  
Austin Zhang

The purpose of this study is to contribute to the marketing literature and practice by describing a research methodology to identify latent dimensions of customer satisfaction in product reviews, and by examining the relationship between these attributes and customer satisfaction. Previous research on product reviews has largely relied only on quantitative ratings, either stars or review scores. Advanced techniques for text mining provide the opportunity to extract meaning from online customer reviews. By analyzing 51,110 online reviews for 1,610 restaurants via latent Dirichlet allocation, this study uncovers 30 latent dimensions that are determinants of customer satisfaction. Furthermore, this study developed measurements of sentiment and innovativeness as moderators of the effect of these latent attributes on satisfaction.


2019 ◽  
Vol 18 (02) ◽  
pp. 717-742 ◽  
Author(s):  
Xiangling Fu ◽  
Jintae Lee ◽  
Chenwei Yan ◽  
Li Gao

Microblogs can provide a valuable resource for journalists because they capture potentially newsworthy events as they occur, including ones occurring remotely. Given the large volume and fast pace of typical microblogs, it is impractical to monitor all microblog postings for potential news events. It would therefore be useful to have a method that uses text mining to help identify such events. For this endeavor, we need a good model of newsworthiness that can furthermore be operationalized with text-mining techniques. This study examines the feasibility and usefulness of such a model by first adopting the Shoemaker model of newsworthiness, one of the most comprehensive and accepted among such models; refining it based on a set of extensive interviews with domain experts and users in the context of news media in China; operationalizing it with a set of text-analytic measures in the domain of traffic accidents; and testing its feasibility and validity using data from Weibo, the largest microblog site in China. As such, we believe that this study makes important theoretical and methodological contributions by developing and testing the most comprehensive and computable model of newsworthiness to date. We also point out its limitations and the areas that need further research.

