Topic Modeling of the Research Papers' Citation Contexts: a Structure of an Author's Research Consumption

2021 ◽  
Vol 12 (3) ◽  
pp. 140-149
Author(s):  
S. I. Parinov ◽  

Citation contexts from research papers, as a rule, contain information about the reasons for and the character of using the cited research outputs. By extracting this information from the citation contexts, one can create different data sets for scientometric studies. The paper systematizes the general possibilities of using data from citation contexts to develop author-citation network analysis. As one application, the paper presents an approach to constructing the thematic structure of research consumption based on topic modelling of the citation contexts from researchers' papers. The features of the thematic structure, built in the form of a "word tree" and a flowchart, are discussed. Possible directions for developing this approach are considered. The proposed thematic structure of research consumption is a promising new data source for both scientometric studies and the creation of new research services.
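The extraction step described in this abstract can be illustrated with a minimal sketch. It is not the author's pipeline: the sample text, the bracketed-number citation markers, the stopword list, and the function names are all assumptions made for illustration, and the sketch stops at the per-reference vocabulary that a topic model would take as input.

```python
import re
from collections import Counter

# Hypothetical sample text; bracketed numbers stand in for in-text citation markers.
TEXT = (
    "We adopt the topic model described in [1] to group documents. "
    "Unlike [2], our corpus is small. "
    "This sentence cites nothing. "
    "The evaluation data were taken from [1] and extended."
)

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "in", "to", "our", "is", "we", "were", "from", "and", "this"}


def citation_contexts(text):
    """Map each citation marker to the sentences that mention it."""
    contexts = {}
    # Naive sentence split: terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for ref in re.findall(r"\[(\d+)\]", sentence):
            contexts.setdefault(ref, []).append(sentence)
    return contexts


def context_vocabulary(sentences):
    """Count non-stopword tokens across a list of context sentences."""
    counts = Counter()
    for sentence in sentences:
        for token in re.findall(r"[a-z]+", sentence.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts


contexts = citation_contexts(TEXT)
vocab_ref1 = context_vocabulary(contexts["1"])
```

Here `contexts` groups sentences by the reference they cite, and `vocab_ref1` holds the word counts for everything said about reference 1. A real study would replace the naive sentence splitter and marker pattern with ones suited to the citation style of the corpus.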

2021 ◽  
Author(s):  
Takuya Takata ◽  
Hajime Sasaki ◽  
Hiroko Yamano ◽  
Masashi Honma ◽  
Mayumi Shikano

ABSTRACT Objectives: Horizon-scanning for innovative technologies that might be applied to medical products and require new assessment approaches/regulations will help to prepare regulators, allowing earlier access to the product for patients and an improved benefit/risk ratio. In this study, we focused on the field of AI-based medical image analysis as a retrospective example of medical devices, where many products have recently been developed and applied. We proposed and validated horizon-scanning using citation network analysis and text mining for bibliographic information analysis. Methods and analysis: Research papers for citation network analysis which contain “convolutional*” OR “machine-learning” OR “deep-learning” were obtained from Science Citation Index Expanded (SCI-expanded) in the Web of Science (WoS). The citation network among those papers was converted into an unweighted network with papers as nodes and citation relationships as links. The network was then divided into clusters using the topological clustering method and the characteristics of each cluster were confirmed by extracting a summary of frequently cited academic papers, and the characteristic keywords, in the cluster. Results: We classified 119,553 publications obtained from SCI and grouped them into 36 clusters. Hence, it was possible to understand the academic landscape of AI applications. The key articles on AI-based medical image analysis were included in one or two clusters, suggesting that clusters specific to the technology were appropriately formed. Based on the average publication year of the constituent papers of each cluster, we tracked recent research trends. It was also suggested that significant research progress would be detected as a quick increase in constituent papers and the number of citations of hub papers in the cluster. Conclusion: We validated that citation network analysis applies to the horizon-scanning of innovative medical devices and demonstrated that AI-based electrocardiograms and electroencephalograms can lead to the development of innovative products. Article Summary. Strengths and limitations of this study: Citation network analysis can provide an academic landscape in the investigated research field, based on the citation relationship of research papers and objective information, such as characteristic keywords and publication year. It might be possible to detect possible significant research progress and the emergence of new research areas through analysis every several months. It is important to confirm the opinions of experts in this area when evaluating the results of the analysis. Information on patents and clinical trials for this analysis is currently unavailable.
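The pipeline in this abstract (unweighted citation network, clustering, then per-cluster publication year and hub papers) can be sketched roughly as follows. The paper records are invented, and connected components are used here only as a simple stand-in for the topological clustering method the authors mention, which the abstract does not specify further.

```python
# Hypothetical records: paper id -> (publication year, ids of papers it cites).
PAPERS = {
    "A": (2015, []),
    "B": (2017, ["A"]),
    "C": (2019, ["A", "B"]),
    "D": (2018, []),
    "E": (2020, ["D"]),
}


def build_undirected(papers):
    """Turn citation links into an unweighted, undirected adjacency map."""
    adj = {pid: set() for pid in papers}
    for pid, (_, cited) in papers.items():
        for target in cited:
            if target in adj:  # ignore citations that leave the corpus
                adj[pid].add(target)
                adj[target].add(pid)
    return adj


def connected_components(adj):
    """Stand-in clustering (an assumption): one cluster per connected component."""
    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, members = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            members.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(members))
    return clusters


def summarize(papers, cluster):
    """Report cluster size, mean publication year, and the most-cited member."""
    years = [papers[p][0] for p in cluster]
    cited_by = {p: sum(1 for q in cluster if p in papers[q][1]) for p in cluster}
    hub = max(cluster, key=lambda p: cited_by[p])
    return {"size": len(cluster), "mean_year": sum(years) / len(years), "hub": hub}


clusters = connected_components(build_undirected(PAPERS))
summaries = [summarize(PAPERS, c) for c in clusters]
```

On a corpus of the size reported in the abstract, a modularity-based community detection method from a graph library would be a more realistic choice than connected components; the summary step would stay the same.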


2019 ◽  
Vol 48 ◽  
pp. 35-54
Author(s):  
Kamil Brzeziński

Participation has gained enormous popularity in Poland in recent years. More and more local authorities, mostly under the influence of city activists, are deciding to conduct public consultations, implement participatory budgeting, and use other forms and tools that involve residents in co-decision processes on city issues. These activities produce a huge number of data sets and reports. It is assumed that many of these documents might be an easily accessible data source for urban researchers. This paper presents the author's approach to and experience of using data from participatory budgeting conducted in Łódź, as well as some suggestions and possibilities for using data obtained by participatory techniques. Kamil Brzeziński, „Dane z ulicy” – propozycje i sugestie na temat wykorzystania materiałów z procesów partycypacyjnych [„Street data” – some suggestions on the use of data from participatory processes] edited by M. Nowak, „Człowiek i Społeczeństwo” vol. XLVIII: Kuchnia badań miejskich. Studia na temat praktyki empirycznej badaczy miasta [A backstage of urban research. Studies on the empirical practices of city research scientists], Poznań 2019, pp. 35–54, Adam Mickiewicz University. ISSN 0239-3271. Kamil Brzeziński, Uniwersytet Łódzki, Katedra Socjologii Wsi i Miasta, ul. Rewolucji 1905 r. nr 41, 90-214 Łódź.


2012 ◽  
Author(s):  
Kate C. Miller ◽  
Lindsay L. Worthington ◽  
Steven Harder ◽  
Scott Phillips ◽  
Hans Hartse ◽  
...  

2020 ◽  
Author(s):  
Amir Karami ◽  
Brandon Bookstaver ◽  
Melissa Nolan

BACKGROUND The COVID-19 pandemic has impacted nearly all aspects of life and has posed significant threats to international health and the economy. Given the rapidly unfolding nature of the current pandemic, there is an urgent need to streamline literature synthesis of the growing scientific research to elucidate targeted solutions. While traditional systematic literature review studies provide valuable insights, these studies have restrictions: they analyze a limited number of papers, carry various biases, are time-consuming and labor-intensive, focus on a few topics, cannot support trend analysis, and lack data-driven tools. OBJECTIVE This study addresses these restrictions in the literature and in practice by analyzing two biomedical concepts, clinical manifestations of disease and therapeutic chemical compounds, with text mining methods in a corpus of COVID-19 research papers, and by finding associations between the two biomedical concepts. METHODS This research collected papers representing COVID-19 pre-prints and peer-reviewed research published in 2020. We used frequency analysis to find highly frequent manifestations and therapeutic chemicals, representing the importance of the two biomedical concepts. This study also applied topic modeling to find the relationship between the two biomedical concepts. RESULTS We analyzed 9,298 research papers published through May 5, 2020 and found 3,645 disease-related and 2,434 chemical-related articles. The most frequent clinical manifestations of disease terminology included COVID-19, SARS, cancer, pneumonia, fever, and cough. The most frequent chemical-related terminology included Lopinavir, Ritonavir, Oxygen, Chloroquine, Remdesivir, and water. Topic modeling provided 25 categories showing relationships between our two overarching categories. These categories represent statistically significant associations between multiple aspects of each category, some connections of which were novel and not previously identified by the scientific community. CONCLUSIONS Appreciation of this context is vital due to the lack of a systematic large-scale literature review survey and the importance of fast literature review during the current COVID-19 pandemic for developing treatments. This study is beneficial to researchers for obtaining a macro-level picture of the literature, to educators for knowing the scope of the literature, to journals for exploring the most discussed disease symptoms and pharmaceutical targets, and to policymakers and funding agencies for creating scientific strategic plans regarding COVID-19.
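The frequency-analysis step in this abstract can be illustrated with a small sketch. The three documents and the two term lists below are invented for illustration, and the pairwise co-occurrence count is only a crude stand-in for the association step; the study itself used topic modeling, which this sketch does not implement.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical mini-corpus and term lists (not taken from the study's data).
DOCS = [
    "Patients with fever and cough received remdesivir.",
    "Chloroquine was evaluated in pneumonia cases with fever.",
    "Oxygen and chloroquine in severe pneumonia.",
]
MANIFESTATIONS = {"fever", "cough", "pneumonia"}
CHEMICALS = {"remdesivir", "chloroquine", "oxygen"}


def terms_in(doc, vocabulary):
    """Return the vocabulary terms that occur in one document."""
    tokens = set(re.findall(r"[a-z]+", doc.lower()))
    return tokens & vocabulary


def document_frequency(docs, vocabulary):
    """Count, for each term, how many documents mention it at least once."""
    counts = Counter()
    for doc in docs:
        counts.update(terms_in(doc, vocabulary))
    return counts


def cooccurrence(docs, left, right):
    """Count documents in which a term from each list appears together."""
    pairs = Counter()
    for doc in docs:
        pairs.update(product(sorted(terms_in(doc, left)), sorted(terms_in(doc, right))))
    return pairs


manifestation_df = document_frequency(DOCS, MANIFESTATIONS)
chemical_df = document_frequency(DOCS, CHEMICALS)
pairs = cooccurrence(DOCS, MANIFESTATIONS, CHEMICALS)
```

Counting documents rather than raw token occurrences keeps one long paper from dominating the ranking. Whether the study counted documents or occurrences is not stated in the abstract.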


2021 ◽  
Vol 13 (13) ◽  
pp. 2433
Author(s):  
Shu Yang ◽  
Fengchao Peng ◽  
Sibylle von Löwis ◽  
Guðrún Nína Petersen ◽  
David Christian Finger

Doppler lidars are used worldwide for wind monitoring and recently also for the detection of aerosols. Automatic algorithms that classify the signals retrieved from lidar measurements are very useful for users. In this study, we explore the value of machine learning to classify backscattered signals from Doppler lidars using data from Iceland. We combined supervised and unsupervised machine learning algorithms with conventional lidar data processing methods and trained two models to filter noise signals and classify Doppler lidar observations into different classes, including clouds, aerosols and rain. The results reveal a high accuracy for noise identification and for aerosol and cloud classification. However, precipitation detection is underestimated. The method was tested on data sets from two instruments during different weather conditions, including three dust storms during the summer of 2019. Our results reveal that this method can provide an efficient, accurate and real-time classification of lidar measurements. Accordingly, we conclude that machine learning can open new opportunities for lidar data end-users, such as aviation safety operators, to monitor dust in the vicinity of airports.


BMJ Open ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. e043339
Author(s):  
Camila Olarte Parra ◽  
Lorenzo Bertizzolo ◽  
Sara Schroter ◽  
Agnès Dechartres ◽  
Els Goetghebeur

Objective: To evaluate the consistency of causal statements in observational studies published in The BMJ. Design: Review of observational studies published in a general medical journal. Data source: Cohort and other longitudinal studies describing an exposure-outcome relationship published in The BMJ in 2018. We also had access to the submitted papers and reviewer reports. Main outcome measures: Proportion of published research papers with ‘inconsistent’ use of causal language. Papers where language was consistently causal or non-causal were classified as ‘consistently causal’ or ‘consistently not causal’, respectively. For the ‘inconsistent’ papers, we then compared the published and submitted version. Results: Of 151 published research papers, 60 described eligible studies. Of these 60, we classified the causal language used as ‘consistently causal’ (48%), ‘inconsistent’ (20%) and ‘consistently not causal’ (32%). Eleven out of 12 (92%) of the ‘inconsistent’ papers were already inconsistent on submission. The inconsistencies found in both submitted and published versions were mainly due to mismatches between objectives and conclusions. One section might be carefully phrased in terms of association while the other presented causal language. When identifying only an association, some authors jumped to recommending acting on the findings as if motivated by the evidence presented. Conclusion: Further guidance is necessary for authors on what constitutes a causal statement and how to justify or discuss assumptions involved. Based on screening these papers, we provide a list of expressions beyond the obvious ‘cause’ word which may inspire a useful more comprehensive compendium on causal language.
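The percentages reported in the results can be cross-checked with a few lines of arithmetic. The counts 60, 12 and 11 come from the abstract; the counts for the two "consistent" groups are not stated there and are inferred below from the rounded percentages, so they should be read as an assumption.

```python
ELIGIBLE = 60                     # eligible studies, as stated in the abstract
INCONSISTENT = 12                 # from "Eleven out of 12 ... 'inconsistent' papers"
INCONSISTENT_ON_SUBMISSION = 11   # as stated in the abstract


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


share_inconsistent = pct(INCONSISTENT, ELIGIBLE)
share_already_inconsistent = pct(INCONSISTENT_ON_SUBMISSION, INCONSISTENT)

# Inferred counts (an assumption): the nearest integers to 48% and 32% of 60.
implied_causal = round(48 * ELIGIBLE / 100)
implied_not_causal = round(32 * ELIGIBLE / 100)
implied_total = implied_causal + INCONSISTENT + implied_not_causal
```

With these inputs the 20% and 92% figures are reproduced, and the inferred counts for the three groups add up to the 60 eligible studies, so the reported percentages are internally consistent.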


2021 ◽  
pp. 004051752110362
Author(s):  
Ka-Po Lee ◽  
Joanne Yip ◽  
Kit-Lun Yick ◽  
Chao Lu ◽  
Chris K Lo

Receptivity towards textile-based fiber optic sensors that are used to monitor physical health is increasing as they have good flexibility, are light in weight, provide wear comfort, have electromagnetic immunity, and are electrically safe. Their superior performance has facilitated their use for obtaining close-to-body measurements. However, there are many related studies in the literature, so it is challenging to identify the knowledge structure and research trends. Therefore, this article aims to provide an objective and systematic literature review on textile-based fiber optic sensors that are used for monitoring health issues and to analyze their trends through a citation network analysis. A full-text search of journal articles was conducted in the Web of Science Core Collection, and a total of 625 studies were found, of which 47 were used as the sample. CitNetExplorer was used to analyze the research domains and trends. Three research domains were identified; among them, “Flexible sensors for vital signs monitoring” is the largest research cluster, and most of the articles in this cluster focus on respiratory monitoring. Therefore, this area of study should probably be on the academic radar. The collection of data on textile-based fiber optic sensors is invaluable for evaluating the degree of rehabilitation, detecting diseases, preventing accidents, as well as gauging the performance and training success of athletes.

