Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling

PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0251493
Author(s):  
Maxime Rivest ◽  
Etienne Vignola-Gagné ◽  
Éric Archambault

Classification schemes for scientific activity and publications underpin a large swath of research evaluation practices at the organizational, governmental, and national levels. Several research classifications are currently in use, and they require continuous work as new classification techniques become available and as new research topics emerge. Convolutional neural networks, a subset of “deep learning” approaches, have recently offered novel and highly performant methods for classifying voluminous corpora of text. This article benchmarks a deep learning classification technique on more than 40 million scientific articles and on tens of thousands of scholarly journals. The comparison is performed against bibliographic coupling-, direct citation-, and manual-based classifications, which are the established and most widely used approaches in the field of bibliometrics and, by extension, in many science and innovation policy activities such as grant competition management. The results reveal that the performance of this first iteration of a deep learning approach is equivalent to that of the graph-based bibliometric approaches. All methods presented are also on par with manual classification. Somewhat surprisingly, no machine learning approaches were found to clearly outperform the simple label propagation approach that is direct citation. In conclusion, deep learning is promising because it performed just as well as the other approaches but has more flexibility to be further improved. For example, a deep neural network incorporating information from the citation network is likely to hold the key to an even better classification algorithm.
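The abstract describes direct citation as a simple label propagation approach. The idea can be sketched as a single majority vote over citation neighbours. This is an illustrative sketch only, not the authors' procedure: the function name `propagate_labels`, the undirected treatment of citation links, and the toy data are all assumptions.

```python
from collections import Counter


def propagate_labels(citations, seed_labels):
    """Give each unlabeled paper the majority label of its citation neighbours.

    citations: iterable of (citing, cited) pairs.
    seed_labels: dict mapping already-classified papers to a class label.
    Hypothetical helper; a single pass that votes with seed labels only.
    """
    neighbours = {}
    for citing, cited in citations:
        # Assumption: a direct citation counts as an undirected link.
        neighbours.setdefault(citing, set()).add(cited)
        neighbours.setdefault(cited, set()).add(citing)

    labels = dict(seed_labels)
    for paper, linked in neighbours.items():
        if paper in seed_labels:
            continue  # keep the existing classification
        votes = Counter(seed_labels[n] for n in linked if n in seed_labels)
        if votes:
            # Ties are broken arbitrarily in this sketch.
            labels[paper] = votes.most_common(1)[0][0]
    return labels


# Toy data: p1 cites two physics papers and one biology paper.
citations = [("p1", "a"), ("p1", "b"), ("p1", "c"), ("p2", "c")]
seed_labels = {"a": "physics", "b": "physics", "c": "biology"}
labels = propagate_labels(citations, seed_labels)
print(labels["p1"], labels["p2"])
```

Papers with no labeled neighbours stay unclassified in this sketch; a fuller version would iterate until labels stop changing.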

2020 ◽  
pp. 016555152096277
Author(s):  
Rajmund Kleminski ◽  
Przemysław Kazienko ◽  
Tomasz Kajdanowicz

In our study, we examine the impact of citation network structures on the ability to discern valuable research topics in Computer Science literature. We use the bibliographic information available in the DBLP database to extract candidate phrases from scientific paper abstracts. Following that, we construct citation networks based on direct citation, co-citation and bibliographic coupling relationships between the papers. The candidate research topics, in the form of keyphrases and n-grammes, are subsequently ranked and filtered by a graph-text ranking algorithm. This selection of the highest ranked potential topics is further evaluated by domain experts and through the Wikipedia knowledge base. The results obtained from these citation networks are complementary, returning valid but non-overlapping output phrases between some pairs of networks. In particular, bibliographic coupling appears to capture more unique information than either direct citation or co-citation. These findings point towards the possible added value in combining bibliographic coupling analysis with other structures. At the same time, combining direct citation and co-citation is put into question. We expect our findings to be utilised in method design for research topic identification.
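The three network types named above can all be derived from one list of direct citations. The sketch below shows how bibliographic coupling and co-citation weights might be counted. It is a minimal illustration under assumed names (`coupling_and_cocitation`, toy paper identifiers) and does not reproduce the study's own pipeline.

```python
from collections import Counter
from itertools import combinations


def coupling_and_cocitation(citations):
    """Derive two weighted networks from (citing, cited) pairs.

    Returns (coupling, cocitation), each a Counter keyed by a sorted pair.
    Hypothetical helper written for illustration.
    """
    refs = {}  # paper -> set of papers it cites
    for citing, cited in citations:
        refs.setdefault(citing, set()).add(cited)

    # Bibliographic coupling: two citing papers share references.
    coupling = Counter()
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:
            coupling[(a, b)] = shared

    # Co-citation: two papers appear together in one reference list.
    cocitation = Counter()
    for cited_set in refs.values():
        for pair in combinations(sorted(cited_set), 2):
            cocitation[pair] += 1
    return coupling, cocitation


citations = [("A", "X"), ("A", "Y"), ("B", "X"),
             ("B", "Y"), ("C", "Y"), ("C", "Z")]
coupling, cocitation = coupling_and_cocitation(citations)
print(coupling[("A", "B")], cocitation[("X", "Y")])
```

Because the two derived networks link different sets of papers (citing papers versus cited papers), it is plausible that they surface different information, which is consistent with the complementarity the abstract reports.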


2016 ◽  
Vol 68 (5) ◽  
pp. 607-627 ◽  
Author(s):  
Antonio J. Gómez-Núñez ◽  
Benjamin Vargas-Quesada ◽  
Zaida Chinchilla-Rodríguez ◽  
Vladimir Batagelj ◽  
Félix Moya-Anegón

Purpose: The purpose of this paper is to visualize the structure of SCImago Journal & Country Rank (SJR) coverage of the extensive citation network of Scopus journals, examining this bibliometric portal through an alternative approach that applies clustering and visualization techniques to a combination of citation-based links.
Design/methodology/approach: Three SJR journal-journal networks containing direct citation, co-citation and bibliographic coupling links were built. The three networks were then combined into a new one by summing their link values, which were subsequently normalized using a geo-normalization measure. Finally, the VOS clustering algorithm was executed and the journal clusters obtained were labeled using original SJR category tags and significant words from journal titles.
Findings: The resultant scientogram displays the SJR structure through a set of communities equivalent to SJR categories that represent the subject contents of the journals they cover. A higher level of aggregation by areas provides a broad view of the SJR structure, facilitating its analysis and visualization at the same time.
Originality/value: This is the first study using Persson’s combination of the most popular citation-based links (direct citation, co-citation and bibliographic coupling) to develop a scientogram based on Scopus journals from SJR. The integration of the three measures, along with the performance of the VOS community detection algorithm, gave a balanced set of clusters. The resulting scientogram is useful for assessing and validating previous classifications as well as for information retrieval and domain analysis.


2020 ◽  
pp. 1-23 ◽  
Author(s):  
Ludo Waltman ◽  
Kevin W. Boyack ◽  
Giovanni Colavizza ◽  
Nees Jan van Eck

There are many different relatedness measures, based for instance on citation relations or textual similarity, that can be used to cluster scientific publications. We propose a principled methodology for evaluating the accuracy of clustering solutions obtained using these relatedness measures. We formally show that the proposed methodology has an important consistency property. The empirical analyses that we present are based on publications in the fields of cell biology, condensed matter physics, and economics. Using the BM25 text-based relatedness measure as the evaluation criterion, we find that bibliographic coupling relations yield more accurate clustering solutions than direct citation relations and cocitation relations. The so-called extended direct citation approach performs similarly to or slightly better than bibliographic coupling in terms of the accuracy of the resulting clustering solutions. Conversely, when a citation-based relatedness measure is used as the evaluation criterion, BM25 turns out to yield more accurate clustering solutions than other text-based relatedness measures.


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Whiteboard image degradations greatly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, researchers have addressed the problem with a variety of image enhancement techniques. Most state-of-the-art approaches apply common image processing techniques such as background-foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods are incapable of recovering severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To overcome these problems, the authors propose a deep learning based solution. They contribute a new whiteboard image data set and adopt two deep convolutional neural network architectures for whiteboard image quality enhancement. Their evaluations of the trained models demonstrate superior performance over the conventional methods.


2019 ◽  
Author(s):  
Qian Wu ◽  
Weiling Zhao ◽  
Xiaobo Yang ◽  
Hua Tan ◽  
Lei You ◽  
...  

2020 ◽  
Author(s):  
Priyanka Meel ◽  
Farhin Bano ◽  
Dr. Dinesh K. Vishwakarma

2019 ◽  
Vol 277 ◽  
pp. 02024 ◽  
Author(s):  
Lincan Li ◽  
Tong Jia ◽  
Tianqi Meng ◽  
Yizhe Liu

In this paper, an accurate two-stage deep learning method is proposed to detect vulnerable plaques in ultrasonic cardiovascular images. Firstly, a Fully Convolutional Neural Network (FCN) named U-Net is used to segment the original Intravascular Optical Coherence Tomography (IVOCT) cardiovascular images. We experiment with different threshold values to find the best threshold for removing noise and background in the original images. Secondly, a modified Faster R-CNN is adopted to perform precise detection. The modified Faster R-CNN utilizes six-scale anchors (12², 16², 32², 64², 128², 256²) instead of the conventional one-scale or three-scale approaches. First, we present three problems in cardiovascular vulnerable plaque diagnosis, then we demonstrate how our method solves these problems. The proposed method applies deep convolutional neural networks to the whole diagnostic procedure. Test results show that the Recall rate, Precision rate, IoU (Intersection-over-Union) rate and Total score are 0.94, 0.885, 0.913 and 0.913 respectively, higher than those of the first-place team in the CCCV2017 Cardiovascular OCT Vulnerable Plaque Detection Challenge. The AP of the designed Faster R-CNN is 83.4%, higher than that of conventional approaches which use one-scale or three-scale anchors. These results demonstrate the superior performance of our proposed method and the power of deep learning approaches in diagnosing cardiovascular vulnerable plaques.
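The IoU score reported above and the idea of multi-scale anchors can be illustrated with a short sketch. It rests on stated assumptions: the six anchor scales are read as squared side lengths from 12 to 256 pixels, anchors are square (real Faster R-CNN configurations usually add several aspect ratios), and the helper names `iou` and `square_anchors` are hypothetical rather than taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def square_anchors(cx, cy, sides=(12, 16, 32, 64, 128, 256)):
    """Square anchor boxes centred on (cx, cy), one per assumed side length."""
    return [(cx - s / 2, cy - s / 2, cx + s / 2, cy + s / 2) for s in sides]


anchors = square_anchors(100, 100)
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap area 1, union area 7
print(len(anchors), round(overlap, 4))
```

Adding small anchors to the usual large ones gives the detector candidate boxes that can overlap well with small targets, which is one plausible reason a six-scale configuration could score a higher AP than one-scale or three-scale setups.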


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shan Guleria ◽  
Tilak U. Shah ◽  
J. Vincent Pulido ◽  
Matthew Fasullo ◽  
Lubaina Ehsan ◽  
...  

Probe-based confocal laser endomicroscopy (pCLE) allows for real-time diagnosis of dysplasia and cancer in Barrett’s esophagus (BE) but is limited by low sensitivity. Even the gold standard of histopathology is hindered by poor agreement between pathologists. We deployed deep-learning-based image and video analysis in order to improve diagnostic accuracy of pCLE videos and biopsy images. Blinded experts categorized biopsies and pCLE videos as squamous, non-dysplastic BE, or dysplasia/cancer, and deep learning models were trained to classify the data into these three categories. Biopsy classification was conducted using two distinct approaches: a patch-level model and a whole-slide-image-level model. Gradient-weighted class activation maps (Grad-CAMs) were extracted from pCLE and biopsy models in order to determine tissue structures deemed relevant by the models. 1970 pCLE videos, 897,931 biopsy patches, and 387 whole-slide images were used to train, test, and validate the models. In pCLE analysis, models achieved a high sensitivity for dysplasia (71%) and an overall accuracy of 90% for all classes. For biopsies at the patch level, the model achieved a sensitivity of 72% for dysplasia and an overall accuracy of 90%. The whole-slide-image-level model achieved a sensitivity of 90% for dysplasia and 94% overall accuracy. Grad-CAMs for all models showed activation in medically relevant tissue regions. Our deep learning models achieved high diagnostic accuracy for both pCLE-based and histopathologic diagnosis of esophageal dysplasia and its precursors, similar to human accuracy in prior studies. These machine learning approaches may improve accuracy and efficiency of current screening protocols.


2020 ◽  
Author(s):  
Seokbeom Kwon ◽  
Jan Youtie ◽  
Alan L Porter

This article puts forth a new indicator of emerging technological topics as a tool for addressing challenges inherent in the evaluation of interdisciplinary research. We present this indicator and test its relationship with interdisciplinary and atypical research combinations. We perform this test by using metadata of scientific publications in three domains with different interdisciplinarity challenges: Nano-Enabled Drug Delivery, Synthetic Biology, and Autonomous Vehicles. Our analysis supports the connection between technological emergence and interdisciplinarity and atypicality in knowledge combinations. We further find that the contributions of interdisciplinary and atypical knowledge combinations to addressing emerging technological topics increase or stay constant over time. Implications for policymakers and contributions to the literature on interdisciplinarity and evaluation are provided.


2021 ◽  
Author(s):  
Isidro Lloret ◽  
José A. Troyano ◽  
Fernando Enríquez ◽  
Juan-José González-de-la-Rosa
