automatic text analysis
Recently Published Documents


TOTAL DOCUMENTS: 52 (five years: 17)

H-INDEX: 10 (five years: 2)

2021 · Vol ahead-of-print (ahead-of-print)
Author(s): Lixue Zou, Xiwen Liu, Wray Buntine, Yanli Liu

Purpose
Full text of a document is a rich source of information that can be used to provide meaningful topics. The purpose of this paper is to demonstrate how to use citation context (CC) in the full text to identify cited topics and citing topics efficiently and effectively by employing automatic text analysis algorithms.

Design/methodology/approach
The authors present two novel topic models, Citation-Context-LDA (CC-LDA) and Citation-Context-Reference-LDA (CCRef-LDA). CC is leveraged to extract the citing text from the full text, which makes it possible to discover topics accurately. CC-LDA incorporates CC, citing text, and their latent relationship, while CCRef-LDA additionally incorporates reference information in CC. Collapsed Gibbs sampling is used for approximate estimation. The capacity of CC-LDA to learn cited topics and citing topics simultaneously, together with their links, is investigated. Moreover, a topic influence measure based on CC-LDA is proposed and applied to create links between the two-level topics. In addition, the capacity of CCRef-LDA to discover topic-influential references is also investigated.

Findings
The results indicate that CC-LDA and CCRef-LDA achieve improved or comparable performance in terms of both perplexity and symmetric Kullback–Leibler (sKL) divergence. Moreover, CC-LDA is effective in discovering cited topics and citing topics with topic influence, and CCRef-LDA is able to find the references that influence cited topics.

Originality/value
The automatic method provides novel knowledge for the discovery of cited topics and citing topics. The topic influence learnt by the model can link two-level topics and create a semantic topic network. The method can also use topic specificity as a feature to rank references.
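The collapsed Gibbs sampling the abstract relies on can be illustrated for plain LDA. This is a minimal sketch, not the authors' CC-LDA or CCRef-LDA; the toy corpus, priors, and topic count are invented:

```python
import random
from collections import defaultdict

# Toy corpus: each document is a list of word tokens (invented).
docs = [
    "citation topic model text".split(),
    "topic model inference sampling".split(),
    "citation reference network graph".split(),
    "graph network node edge".split(),
]

K, alpha, beta = 2, 0.5, 0.1          # topics and Dirichlet priors
vocab = sorted({w for d in docs for w in d})
V = len(vocab)

random.seed(0)
# z[d][i]: topic assigned to the i-th token of document d
z = [[random.randrange(K) for _ in d] for d in docs]

# Count tables maintained by the collapsed sampler.
ndk = [[0] * K for _ in docs]               # doc-topic counts
nkw = [defaultdict(int) for _ in range(K)]  # topic-word counts
nk = [0] * K                                # tokens per topic
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1

for _ in range(200):  # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove the token, resample from the full conditional, restore.
            ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
            weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + V * beta)
                       for t in range(K)]
            k = random.choices(range(K), weights=weights)[0]
            z[d][i] = k
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1

for t in range(K):
    print(f"topic {t}:", sorted(nkw[t], key=nkw[t].get, reverse=True)[:3])
```

Each sweep resamples every token's topic from its full conditional given all other assignments; CC-LDA and CCRef-LDA extend count tables like these with citation-context and reference counts.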


2021 · Vol 6 (2) · pp. 133-157
Author(s): Luca Pareschi

The COVID-19 pandemic forced lockdowns in several countries, and many organisations had to introduce teleworking for their employees. While remote working is not new, and was already permitted by law, the extent to which enterprises had to redefine their processes is unprecedented. Teleworking was therefore widely discussed in national media. Newspapers are a relevant outlet for the diffusion and legitimation of schemata of interpretation, and we explored how teleworking was framed in the Italian discursive space during the first two months of the pandemic. We analysed seven national newspapers and adopted a semi-automatic text analysis, which we performed through topic modelling. In this paper, we describe the topics newspapers used to frame teleworking, how different newspapers used these topics, the trend of topics over time, and the institutionalisation of the issue of teleworking.


2021 · pp. 146144482199449
Author(s): Max Schindler, Emese Domahidi

Online user comments (UCs), nowadays the most popular form of online audience participation, constitute a popular and important field of research. The widespread examination of UCs across different disciplines has led to a variety of terms and constructs, and thus a lack of clarity about the topics discussed. With this computational scoping review, we uncovered six relevant, overarching topics and their development in the field. By combining automatic text analysis via structural topic modeling with a qualitative evaluation, we were able to describe the current state of UC research and found an inherently interdisciplinary body of literature. We observed inter- and intradisciplinary fragmentation and call for a systematization of the terms, constructs, and topics examined.


2021 · Vol 27 (3) · pp. 138-146
Author(s): A. O. Korney, E. N. Kryuchkova
The resonant world events of 2020 led to an increase in the amount of information on the Internet, including criminal content, fake news, and fake negative reviews. False negative information can spread very quickly, and methods are needed to suppress this process. The development of effective algorithms for automatic text analysis is therefore especially relevant today. The most important subtasks include thematic categorization and sentiment analysis, including ABSA (aspect-based sentiment analysis). The paper proposes a combined semantic-statistical algorithm for the aspect analysis of large texts, based on the use of a semantic graph. The aspect extraction method comprises three phases: selecting a set of significant words, calculating the weights of the vertices of the semantic graph by the relaxation method, and filtering aspects based on the gradient method. The proposed method makes it possible to extract domain-dependent aspect terms from training data. Aspect term sets extracted from different domains share the same statistical features, while lexical diversity and structure are taken into account at the same time.
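The vertex-weighting phase can be sketched as a simple relaxation (power-iteration style) over a word co-occurrence graph. The sentences, damping factor, and stopping rule below are illustrative assumptions, not the paper's exact algorithm:

```python
from collections import defaultdict
from itertools import combinations

# Toy review sentences (invented) from which to build a semantic graph.
sentences = [
    ["battery", "life", "great"],
    ["battery", "charge", "slow"],
    ["screen", "bright", "great"],
]

# Undirected co-occurrence graph: words sharing a sentence are linked.
graph = defaultdict(set)
for s in sentences:
    for a, b in combinations(set(s), 2):
        graph[a].add(b)
        graph[b].add(a)

# Relaxation: repeatedly redistribute each vertex's weight to its
# neighbours until the scores settle (a PageRank-like iteration).
score = {w: 1.0 for w in graph}
d = 0.85  # damping factor (assumed)
for _ in range(30):
    score = {w: (1 - d) + d * sum(score[u] / len(graph[u]) for u in graph[w])
             for w in graph}

# Highest-weighted vertices are candidate aspect terms.
print(sorted(score, key=score.get, reverse=True)[:3])
```

In the paper the relaxation output is further filtered with a gradient-based criterion; here the top-weighted vertices simply stand in for the candidate aspect set.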


2021 · Vol 3 (2) · pp. 1-16
Author(s): Kasper Welbers, Wouter van Atteveldt, Jan Kleinnijenhuis

Most common methods for automatic text analysis in communication science ignore syntactic information, focusing on the occurrence and co-occurrence of individual words and sometimes n-grams. This is remarkably effective for some purposes, but poses a limitation for fine-grained analyses of semantic relations such as who does what to whom, and according to what source. One tested, effective method for moving beyond this bag-of-words assumption is to use a rule-based approach for labeling and extracting syntactic patterns in dependency trees. Although this method can be used for a variety of purposes, its application is hindered by the lack of dedicated and accessible tools. In this paper we introduce the rsyntax R package, which is designed to make working with dependency trees easier and more intuitive for R users, and which provides a framework for combining multiple rules to reliably extract useful semantic relations.
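Since rsyntax is an R package, the rule-based idea can only be hinted at here. The following pure-Python sketch applies one hand-written subject-verb-object rule to a hand-built dependency parse; the relation labels follow Universal Dependencies conventions, but nothing below is rsyntax's actual API:

```python
# Each token: (index, word, head_index, relation); head -1 marks the root.
# The parse is hand-written, not produced by a parser.
parse = [
    (0, "government", 1, "nsubj"),
    (1, "criticised", -1, "root"),
    (2, "the", 3, "det"),
    (3, "press", 1, "obj"),
]

def extract_svo(tokens):
    """Apply a single rule: root verb + nsubj child + obj child."""
    words = {i: w for i, w, _, _ in tokens}
    triples = []
    for i, w, head, rel in tokens:
        if rel == "root":
            subj = next((words[j] for j, _, h, r in tokens
                         if h == i and r == "nsubj"), None)
            obj = next((words[j] for j, _, h, r in tokens
                        if h == i and r == "obj"), None)
            if subj and obj:
                triples.append((subj, w, obj))
    return triples

print(extract_svo(parse))  # [('government', 'criticised', 'press')]
```

A real pipeline would chain many such rules (passives, quotes, clausal complements) over parser output, which is exactly the bookkeeping the package aims to make easier.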


2021 · Vol 7 · pp. 237802312110480
Author(s): Kristo Leung, Ke Cheng, Junyao Zhang, Yipeng Cheng, Viet Hung Nguyen Cao, ...

How do individuals respond to discrimination against their group? The authors help answer this normatively important question by conducting a survey with a large, national, quota-based sample of 2,482 Asians living in the United States during December 2020. In the survey, the authors provide respondents with truthful information about the increasing prevalence of anti-Asian discrimination in the United States during the coronavirus disease 2019 pandemic and ask them to write about what this makes them feel or think about life in America. Using automatic text analysis tools to analyze this rich, novel set of personal reflections, the authors show in this visualization that Asian reactions to discrimination do not meaningfully differ across partisan identification. These findings extend the large literature showing partisan differences in perceptions of racial discrimination and its effects by the general public and show at least one way in which partisan polarization does not influence American views.


2020 · pp. 019251212095353
Author(s): Paula Castro

Coalitions play a central role in the international negotiations under the United Nations Framework Convention on Climate Change. By banding together, countries pool resources to defend their interests and positions. But building coalitions may come at a cost: coalition positions are the result of compromise between members, so the gain in bargaining power may come at a price when members' preferences are heterogeneous. Relying on automatic text analysis of written position papers submitted to the negotiations, I analyze the extent to which coalitions represent the preferences of their members and discuss whether this contributes to disproportionate policy responses at the international level. I focus on a recently formed coalition: the Like-Minded Developing Countries, a large and heterogeneous group that brings together emerging, oil-dependent, and poor developing countries.
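One simple way to quantify how well a coalition position represents a member's position is cosine similarity between word-count vectors. The position snippets below are invented, and this generic measure is only a stand-in for the paper's text analysis:

```python
from collections import Counter
import math

def cosine(a: str, b: str) -> float:
    """Cosine similarity of two texts as bag-of-words count vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb)

# Invented position-paper snippets.
coalition = "equity finance technology transfer for developing countries"
member    = "finance and technology transfer for developing countries"

print(round(cosine(coalition, member), 3))
```

Averaging such similarities over all members, and comparing across coalitions, would give a rough heterogeneity measure of the kind the article investigates with richer methods.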


The following paper examines and illustrates various problems that occur in the field of Natural Language Processing. The solutions surveyed use WordNet in one way or another to enhance or improve the efficiency of the projects. WordNet can be viewed as a combination and augmentation of a dictionary and a thesaurus. While it can be used by developers and programmers via a web browser, its primary use is in automatic text analysis and AI-based applications.
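To illustrate the "dictionary plus thesaurus" idea without depending on a local WordNet installation, here is a toy synset table with a hypernym walk. Real applications would query WordNet itself (for example through NLTK); the entries below are invented:

```python
# Miniature stand-in for WordNet: each entry has a synonym set and
# one hypernym (WordNet proper allows several). Entries are invented.
synsets = {
    "car": {"synonyms": {"car", "automobile"}, "hypernym": "vehicle"},
    "vehicle": {"synonyms": {"vehicle"}, "hypernym": "artifact"},
}

def is_a(word: str, ancestor: str) -> bool:
    """Walk the hypernym chain to test a taxonomic (is-a) relation."""
    while word in synsets:
        if ancestor in synsets[word]["synonyms"]:
            return True
        word = synsets[word]["hypernym"]
    return word == ancestor

print(is_a("car", "vehicle"))   # True
print(is_a("car", "artifact"))  # True
print(is_a("car", "animal"))    # False
```

The synonym sets play the thesaurus role and the hypernym chain the taxonomic one; this pairing is what makes WordNet useful for the disambiguation and similarity tasks the surveyed papers address.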


2020 · Vol 20 (1) · pp. e05
Author(s): Viviana Mercado, Andrea Villagra, Marcelo Errecalde

Political alignment identification is an author profiling task that aims to identify political bias or orientation in people's writings. As usual in any automatic text analysis, a critical aspect is the availability of adequate data sets so that data mining and machine learning approaches can obtain reliable and informative results. This article contributes in this regard by presenting a new corpus for the study of political alignment in documents of Argentinian journalists. The study also includes several kinds of analysis of documents of pro-government and opposition journalists, such as the relevance of terms in each journalist class, sentiment analysis, topic modelling, and the analysis of psycholinguistic indicators obtained from the Linguistic Inquiry and Word Count (LIWC) system. From the experimental results, interesting patterns could be observed, such as the topics both types of journalists write about, how sentiment polarities are distributed, and how the writings of pro-government and opposition journalists differ in the distinct LIWC categories.
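A LIWC-style indicator is essentially the share of a text's words falling into each category lexicon. The tiny lexicon below is an invented stand-in for the proprietary LIWC dictionaries:

```python
import re

# Invented mini-lexicon; real LIWC categories contain thousands of
# entries and word stems.
lexicon = {
    "posemo": {"good", "support", "hope"},
    "negemo": {"bad", "crisis", "fear"},
    "certainty": {"always", "never", "certainly"},
}

def liwc_counts(text: str) -> dict:
    """Fraction of tokens belonging to each category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {cat: sum(w in vocab for w in words) / total
            for cat, vocab in lexicon.items()}

print(liwc_counts("The crisis is bad, but there is good hope."))
```

Comparing these per-category fractions between pro-government and opposition writings is the kind of contrast the article reports.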


2020 · Vol 10 (6) · pp. 2157
Author(s): Xieling Chen, Haoran Xie, Gary Cheng, Leonard K. M. Poon, Mingming Leng, ...

Natural language processing (NLP) is an effective tool for generating structured information from unstructured data, such as that commonly found in clinical trial texts. This interdisciplinary research has gradually grown into a flourishing field with an accumulating body of scientific output. In this study, bibliographical data collected from the Web of Science, PubMed, and Scopus databases from 2001 to 2018 were investigated using three prominent methods: performance analysis, science mapping, and, in particular, an automatic text analysis approach named structural topic modeling. Topical trend visualization and statistical testing were further employed to quantify the effect of the year of publication on topic proportions. Topical distributions across prolific countries/regions and institutions were also visualized and compared. In addition, scientific collaborations between countries/regions, institutions, and authors were explored using social network analysis. The findings are essential for facilitating the development of NLP-enhanced processing of clinical trial texts, boosting scientific and technological NLP-enhanced clinical trial research, and fostering inter-country/region and inter-institution collaborations.
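The "effect of the year of publication on topic proportions" step can be sketched as an ordinary least-squares trend of a topic's yearly proportion; the data points below are made up:

```python
# Invented yearly proportions of one topic in the corpus.
years = [2001, 2005, 2010, 2014, 2018]
prop  = [0.05, 0.08, 0.12, 0.15, 0.21]

# Ordinary least squares slope of proportion on year.
n = len(years)
mx = sum(years) / n
my = sum(prop) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(years, prop))
         / sum((x - mx) ** 2 for x in years))

print(f"trend: {slope:+.4f} proportion/year")  # positive => topic is growing
```

A positive slope marks a "hot" topic and a negative one a declining topic; the study applies this idea per topic, with significance tests, across the full 2001-2018 window.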

