IDENTIFICATION OF ARGUMENTATIVE RELATIONS IN POPULAR SCIENCE TEXTS

Author(s):  
Natalia Vasilievna Salomatina ◽  
Irina Semenovna Kononenko ◽  
Elena Anatolvna Sidorova ◽  
Ivan Sergeevich Pimenov ◽  
...  

This work evaluates how effective it is, as a recognition feature, for argumentative statements to occur in the same topic fragment of a text. The goal is to use this feature in the automatic recognition of argumentative structures in popular science texts written in Russian. The topic model of a text is built from superphrasal units (text fragments united by one topic), which are identified by detecting clusters of words and word combinations with scan statistics. Potential relations extracted from the topic models are verified against texts with manually annotated argumentation structures. The comparison between potential (topic-model-based) and manually constructed relations is performed automatically. The macro-averaged precision and recall are 48.6% and 76.2%, respectively.
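
The automatic comparison described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each relation is a pair of statement identifiers and that the macro-average is taken over texts; the abstract specifies neither, and the function names are illustrative only.

```python
# Hypothetical representation: a relation is a pair of statement IDs.
Relation = tuple[str, str]


def precision_recall(predicted: set[Relation], gold: set[Relation]) -> tuple[float, float]:
    """Precision and recall of predicted relations against gold annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def macro_average(texts: list[tuple[set[Relation], set[Relation]]]) -> tuple[float, float]:
    """Average per-text precision and recall (assumed: one score pair per text)."""
    scores = [precision_recall(predicted, gold) for predicted, gold in texts]
    n = len(scores)
    return sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n


# Toy data, not taken from the study.
toy_texts = [
    ({("s1", "s2"), ("s2", "s3")}, {("s1", "s2")}),
    ({("s4", "s5")}, {("s4", "s5"), ("s5", "s6")}),
]
macro_precision, macro_recall = macro_average(toy_texts)
```

Macro-averaging gives every text equal weight regardless of how many relations it contains, which is one reason it can differ from scores pooled over all relations.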

2018 ◽  
Vol 45 (4) ◽  
pp. 554-570 ◽  
Author(s):  
Jian Jin ◽  
Qian Geng ◽  
Haikun Mou ◽  
Chong Chen

Interdisciplinary studies are becoming increasingly popular, and research domains of many experts are becoming diverse. This phenomenon brings difficulty in recommending experts to review interdisciplinary submissions. In this study, an Author–Subject–Topic (AST) model is proposed with two versions. In the model, reviewers’ subject information is embedded to analyse topic distributions of submissions and reviewers’ publications. The major difference between the AST and Author–Topic models lies in the introduction of a ‘Subject’ layer, which supervises the generation of hierarchical topics and allows sharing of subjects among authors. To evaluate the performance of the AST model, papers in Information System and Management (a typical interdisciplinary domain) in a famous Chinese academic library are investigated. Comparative experiments are conducted, which show the effectiveness of the AST model in topic distribution analysis and reviewer recommendation for interdisciplinary studies.


Author(s):  
Olga Grynko

When used in texts, foreign words often function as a stylistic device and become a relevant feature of the author’s individual style. The article looks at the functioning and translation of foreign words, focusing on those that are not “adapted”, that is, those that preserve their original “foreign” form (unlike those that are transcribed without morphological and syntactical changes). The work systematizes the ways these elements are introduced into the original text. It shows that they can either be introduced with no explanation, relying on the reader’s general expertise and creating a certain environment, flavour, etc., or be accompanied by some explanation of their meaning. The article also offers insights into the key functions of foreign words in popular-science texts (specifically, they make the text sound more authentic and documentary, and also display the author’s intelligence and competence). Further, the research finalizes the classification of the ways to translate or render foreign words in the translated text in view of the genre peculiarities of popular-science texts. In addition to such ways as preserving a foreign word with a translation of the author’s comment, transcription/transliteration, translator’s comments, and actual translation into the target language, such texts allow for a science editor’s comments in translation.


2020 ◽  
Vol 1 (80) ◽  
Author(s):  
Yu. O. Babyatinskaya ◽  
I. N. Hroza ◽  
K. S. Guseinova

2020 ◽  
Vol 39 (4) ◽  
pp. 727-742 ◽  
Author(s):  
Joachim Büschken ◽  
Greg M. Allenby

User-generated content in the form of customer reviews, blogs, and tweets is an emerging and rich source of data for marketers. Topic models have been successfully applied to such data, demonstrating that empirical text analysis benefits greatly from a latent variable approach that summarizes high-level interactions among words. We propose a new topic model that allows for serial dependency of topics in text. That is, topics may carry over from word to word in a document, violating the bag-of-words assumption in traditional topic models. In the proposed model, topic carryover is informed by sentence conjunctions and punctuation. Typically, such observed information is eliminated prior to analyzing text data (i.e., preprocessing) because words such as “and” and “but” do not differentiate topics. We find that these elements of grammar contain information relevant to topic changes. We examine the performance of our models using multiple data sets and establish boundary conditions for when our model leads to improved inference about customer evaluations. Implications and opportunities for future research are discussed.


Author(s):  
Ximing Li ◽  
Jiaojiao Zhang ◽  
Jihong Ouyang

Conventional topic models suffer from a severe sparsity problem when facing extremely short texts such as social media posts. The family of Dirichlet multinomial mixture (DMM) models can handle the sparsity problem; however, these models are still very sensitive to ordinary and noisy words, resulting in inaccurate topic representations at the document level. In this paper, we alleviate this problem by preserving the local neighborhood structure of short texts, which allows topical signals to spread among neighboring documents and so corrects the inaccurate topic representations. This is achieved by using variational manifold regularization, which constrains close short texts to have similar variational topic representations. Based on this idea, we propose a novel Laplacian DMM (LapDMM) topic model. During the document graph construction, we further use the word mover’s distance with word embeddings to measure document similarities at the semantic level. To evaluate LapDMM, we compare it against the state-of-the-art short text topic models on several traditional tasks. Experimental results demonstrate that LapDMM achieves significant performance gains over baseline models, e.g., scores about 0.2 higher on clustering and classification tasks in many cases.
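
The idea that neighboring short texts should have similar topic representations can be illustrated with a generic graph-Laplacian smoothness penalty. This is a sketch of the general technique only, not the variational objective of LapDMM, which the abstract does not spell out. The function name, the toy topic vectors, and the edge weights are all hypothetical.

```python
def manifold_penalty(theta: list[list[float]], weights: list[list[float]]) -> float:
    """Sum over document pairs of w_ij * ||theta_i - theta_j||^2.

    theta[i] is the topic distribution of document i; weights is a symmetric
    matrix of graph edge weights (assumed precomputed, e.g. from a document
    similarity measure). Each unordered pair is counted once.
    """
    n = len(theta)
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            squared_distance = sum((a - b) ** 2 for a, b in zip(theta[i], theta[j]))
            penalty += weights[i][j] * squared_distance
    return penalty


# Toy example: documents 0 and 2 share a topic, document 1 differs.
toy_theta = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
toy_weights = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
toy_penalty = manifold_penalty(toy_theta, toy_weights)
```

Adding such a term to a training objective penalizes connected documents whose topic vectors disagree, so a confident neighbor can pull a noisy representation toward its own.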


Symmetry ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 1486
Author(s):  
Zhinan Gou ◽  
Zheng Huo ◽  
Yuanzhen Liu ◽  
Yi Yang

Supervised topic modeling has been successfully applied in the fields of document classification and tag recommendation in recent years. However, most existing models neglect the fact that topic terms differ in their ability to distinguish topics. In this paper, we propose a term frequency-inverse topic frequency (TF-ITF) method for constructing a supervised topic model, in which the weight of each topic term indicates its ability to distinguish topics. We conduct a series of experiments with both symmetric and asymmetric Dirichlet prior parameters. Experimental results demonstrate that a supervised topic model with TF-ITF outperforms several state-of-the-art supervised topic models.
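
The abstract does not give the TF-ITF formula. The sketch below shows one plausible reading by direct analogy with TF-IDF, where topics take the place of documents; the weighting formula, the function name, and the toy counts are assumptions and may differ from the paper's definition.

```python
import math


def tf_itf(topic_term_counts: dict[str, dict[str, int]]) -> dict[str, dict[str, float]]:
    """Weight each term per topic as tf * log(K / number of topics containing it).

    Assumed by analogy with TF-IDF; not confirmed by the source.
    """
    num_topics = len(topic_term_counts)

    # Count the topics in which each term occurs at least once.
    topics_containing: dict[str, int] = {}
    for counts in topic_term_counts.values():
        for term, count in counts.items():
            if count > 0:
                topics_containing[term] = topics_containing.get(term, 0) + 1

    weights: dict[str, dict[str, float]] = {}
    for topic, counts in topic_term_counts.items():
        total = sum(counts.values())
        weights[topic] = {
            term: (count / total) * math.log(num_topics / topics_containing[term])
            for term, count in counts.items()
            if count > 0
        }
    return weights


# Toy counts, not taken from the paper.
toy_counts = {
    "sports": {"game": 3, "team": 1},
    "politics": {"game": 1, "vote": 3},
}
toy_weights_by_topic = tf_itf(toy_counts)
```

Under this reading, a term that occurs in every topic ("game" in the toy data) gets weight zero, while terms confined to one topic keep a positive weight.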


2019 ◽  
Vol 26 (5) ◽  
pp. 531-549
Author(s):  
Chuan Wu ◽  
Evangelos Kanoulas ◽  
Maarten de Rijke

Entities play an essential role in understanding textual documents, regardless of whether the documents are short, such as tweets, or long, such as news articles. In short textual documents, all entities mentioned are usually considered equally important because of the limited amount of information. In long textual documents, however, not all entities are equally important: some are salient and others are not. Traditional entity topic models (ETMs) focus on ways to incorporate entity information into topic models to better explain the generative process of documents. However, entities are usually treated equally, without considering whether they are salient or not. In this work, we propose a novel ETM, Salient Entity Topic Model, to take salient entities into consideration in the document generation process. In particular, we model salient entities as a source of topics used to generate words in documents, in addition to the topic distribution of documents used in traditional topic models. Qualitative and quantitative analysis is performed on the proposed model. Application to entity salience detection demonstrates the effectiveness of our model compared to the state-of-the-art topic model baselines.


2019 ◽  
Vol 25 (2) ◽  
pp. 152
Author(s):  
Gabriela Belini Gontijo ◽  
Jane Raquel Silva de Oliveira

The aim of this work was to analyze features of the sociology of science in popular science texts from Minas faz Ciência, a Brazilian magazine published by a state agency that supports scientific research. Twelve texts from the magazine were selected, each containing information about the path taken by the scientist in developing his or her research. The selected texts were analyzed by means of Discursive Textual Analysis, using Bruno Latour’s studies on the sociology of science as the theoretical framework. In the analyses, we identified the following characteristics of the practice of science: persuasive aspects of science; science as a collective construction; the influence of external factors on the construction of facts; the researcher’s work dynamics; and the cycle of credibility in science. The results show that such texts can be used to address aspects of the sociology of science in the classroom.

