A Novel Deep Neural Network-Based Approach to Measure Scholarly Research Dissemination Using Citations Network

We investigated the dissemination of scientific research by analyzing publication and citation data, on the premise that not all citations are equally important. In contrast to existing state-of-the-art models, which employ feature-based techniques to measure scholarly research dissemination between multiple entities, our model implements a convolutional neural network (CNN) with fastText-based pre-trained embedding vectors and uses only the citation context as its input to distinguish between important and non-important citations. Moreover, we propose using focal-loss and class-weight methods to address the inherent class imbalance problems in citation classification datasets. Using a dataset of 10 K annotated citation contexts, we achieved an accuracy of 90.7% along with a 90.6% F1-score for binary classification. Finally, we present a case study to measure the comprehensiveness of our deployed model on a dataset of 3100 K citations taken from the ACL Anthology Reference Corpus. We employed the open-source graph visualization tool Gephi to analyze various aspects of the citation network graphs for each respective citation behavior.

Evaluating semantometrics from computer science publications

Scientometrics ◽

10.1007/s11192-020-03409-5 ◽

2020 ◽

Vol 125 (3) ◽

pp. 2915-2954

Author(s):

Christin Katharina Kreutz ◽

Premtim Sahitaj ◽

Ralf Schenkel

Keyword(s):

State Of The Art ◽

Binary Classification ◽

Citation Network ◽

Distance Measures ◽

Semantic Features ◽

Citation Networks ◽

Current State ◽

Publication Time ◽

Citation Practices ◽

Vector Representations

Abstract: Identification of important works and assessment of the importance of publications in vast scientific corpora are challenging yet common tasks faced by many research projects. While the influence of citations in finding seminal papers has been analysed thoroughly, citation-based approaches come with several problems: they are impractical for new publications that have not yet received any citations, citation practices depend on the research area, and papers are cited for different reasons, to name only a few drawbacks. Methods relying on more than citations, for example semantic features such as words or topics contained in publications of citation networks, have been pursued with less vigour, although they provide promising preliminary results. In this work we tackle the issue of classifying publications with their respective referenced and citing papers as either seminal, survey or uninfluential by utilising semantometrics. We use distance measures over words, semantics, topics and publication years of papers in their citation network to engineer features on which we predict the class of a publication. We present the SUSdblp dataset consisting of 1980 labelled entries to provide a means of evaluating this approach. A classification accuracy of up to .9247 was achieved when combining multiple types of features using semantometrics. This is +.1232 compared to the current state of the art (SOTA), which uses binary classification to identify papers from the classes seminal and survey. The utilisation of one-vector representations for the ternary classification task resulted in an accuracy of .949, which is +.1475 compared to the binary SOTA. Classification based on information available at publication time derived with semantometrics resulted in an accuracy of .8152, while an accuracy of .9323 could be achieved when using one-vector representations.
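To make the idea of distance-based features concrete, the following is a minimal sketch, not the authors' implementation. It assumes a bag-of-words cosine distance as the word-level measure and an absolute difference in publication years; the function names and the dictionary keys `text` and `year` are hypothetical, and the abstract does not specify which distance measures were actually used.

```python
import math
from collections import Counter


def cosine_distance(text_a, text_b):
    """Cosine distance between bag-of-words vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    # If either text is empty there is no overlap: treat as maximally distant.
    return 1.0 if norm == 0 else 1.0 - dot / norm


def distance_features(paper, neighbours):
    """Average word and publication-year distances to cited or citing papers.

    `paper` and each neighbour are dicts with hypothetical keys
    "text" (str) and "year" (int).
    """
    if not neighbours:
        # A paper without references or citations yields no distances.
        return {"mean_word_distance": None, "mean_year_distance": None}
    word = [cosine_distance(paper["text"], n["text"]) for n in neighbours]
    years = [abs(paper["year"] - n["year"]) for n in neighbours]
    return {
        "mean_word_distance": sum(word) / len(word),
        "mean_year_distance": sum(years) / len(years),
    }
```

Computing such features separately over the referenced papers and over the citing papers would give a classifier one small feature vector per publication; restricting the computation to the referenced papers corresponds to using only information available at publication time.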

Effective human detection via multi-model classification and adaptive late fusion

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s021969131840012x ◽

2018 ◽

Vol 16 (02) ◽

pp. 1840012

Author(s):

Chao Zhu ◽

Xu-Cheng Yin

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

State Of The Art ◽

Human Detection ◽

Detection Methods ◽

Late Fusion ◽

Model Classification ◽

Detection Approach ◽

Feature Based ◽

Multiple State

Human detection serves as an important basis for video surveillance-oriented biometrics such as gait, face and action recognition, since the first step is to find and locate human targets in surveillance scenes. In the literature, channel feature-based methods and deep neural network-based methods are the two most popular kinds of human detection approaches, each with its own advantages. However, little effort has been devoted to studying their combination in order to take full advantage of both. Therefore, in this paper we propose an effective human detection approach that combines multiple state-of-the-art deep neural network-based and channel feature-based methods with an adaptive late fusion strategy. The key idea of our approach is to explore the complementary information of different state-of-the-art detection methods and to find an appropriate way to combine their strong points for better performance. The proposed approach is evaluated on several standard human detection benchmarks and shows its effectiveness by outperforming the other state-of-the-art methods in most evaluation settings.

Multi-Class Imbalance in Text Classification: A Feature Engineering Approach to Detect Cyberbullying in Twitter

Informatics ◽

10.3390/informatics7040052 ◽

2020 ◽

Vol 7 (4) ◽

pp. 52

Author(s):

Bandeh Ali Talpur ◽

Declan O’Sullivan

Keyword(s):

Binary Classification ◽

Class Imbalance ◽

Age Group ◽

Learning Classifier ◽

Semantic Orientation ◽

Medium Level ◽

Twitter Account ◽

Feature Based ◽

Multi Class Classification ◽

High Level

Twitter enables millions of active users to send and read concise messages on the internet every day. Yet some people use Twitter to propagate violent and threatening messages, resulting in cyberbullying. Previous research has focused on whether or not cyberbullying behavior exists in a tweet (binary classification). In this research, we developed a model for detecting the severity of cyberbullying in a tweet. The developed model is a feature-based model that uses features from the content of a tweet to build a machine learning classifier that labels tweets as non-cyberbullied or as low-, medium-, or high-level cyberbullied. In this study, we introduced pointwise semantic orientation as a new input feature, along with predicted features (gender, age, and personality type) and Twitter API features. Results from experiments with our proposed framework in a multi-class setting are promising with respect to the Kappa (84%), classifier accuracy (93%), and F-measure (92%) metrics. Overall, 40% of the classifiers increased performance in comparison with baseline approaches. Our analysis shows that the features with the highest odds ratios are: for low-level severity, the age group between 19–22 years and users with <1 year of Twitter account activation; for medium-level severity, neuroticism, the age group between 23–29 years, and being a Twitter user for one to two years; and for high-level severity, neuroticism and extraversion, and the number of times the tweet has been favorited by other users. We believe that this research, using a multi-class classification approach, provides a step forward in identifying severity at different levels (low, medium, high) when the content of a tweet is classified as cyberbullied. Lastly, the current study focused only on the Twitter platform; other social network platforms can be investigated using the same approach to detect cyberbullying severity patterns.
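For readers unfamiliar with the odds ratio used to rank features, the following sketch shows the standard calculation from a 2x2 contingency table. The function name, argument names, and the counts in the usage note are hypothetical illustrations, not values from the study.

```python
def odds_ratio(with_feature_pos, with_feature_neg,
               without_feature_pos, without_feature_neg):
    """Odds ratio of a binary feature for a positive class.

    Arguments are counts from a 2x2 table:
      with_feature_pos     - items with the feature, in the class
      with_feature_neg     - items with the feature, not in the class
      without_feature_pos  - items without the feature, in the class
      without_feature_neg  - items without the feature, not in the class
    """
    numerator = with_feature_pos * without_feature_neg
    denominator = with_feature_neg * without_feature_pos
    if denominator == 0:
        # The ratio is undefined when an off-diagonal cell is empty.
        raise ValueError("odds ratio undefined: zero count in denominator")
    return numerator / denominator
```

With hypothetical counts `odds_ratio(30, 10, 20, 40)`, the odds of the class are 30/10 = 3 with the feature and 20/40 = 0.5 without it, so the ratio is 6. A value above 1 indicates that the feature is associated with the class.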

K-margin-based Residual-Convolution-Recurrent Neural Network for Atrial Fibrillation Detection

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/839 ◽

2019 ◽

Cited By ~ 2

Author(s):

Yuxi Zhou ◽

Shenda Hong ◽

Junyuan Shang ◽

Meng Wu ◽

Qingyun Wang ◽

...

Keyword(s):

Neural Network ◽

Atrial Fibrillation ◽

Recurrent Neural Network ◽

State Of The Art ◽

Class Imbalance ◽

Heart Rhythm ◽

Error Rates ◽

Detection Task ◽

Training Data

Atrial Fibrillation (AF) is an abnormal heart rhythm which can trigger cardiac arrest and sudden death. Nevertheless, its interpretation is mostly done by medical experts due to high error rates of computerized interpretation. One study found that only about 66% of AF were correctly recognized from noisy ECGs. This is in part due to insufficient training data, class skewness, as well as semantical ambiguities caused by noisy segments in an ECG record. In this paper, we propose a K-margin-based Residual-Convolution-Recurrent neural network (K-margin-based RCR-net) for AF detection from noisy ECGs. In detail, a skewness-driven dynamic augmentation method is employed to handle the problems of data inadequacy and class imbalance. A novel RCR-net is proposed to automatically extract both long-term rhythm-level and local heartbeat-level characters. Finally, we present a K-margin-based diagnosis model to automatically focus on the most important parts of an ECG record and handle noise by naturally exploiting expected consistency among the segments associated for each record. The experimental results demonstrate that the proposed method with 0.8125 F1NAOP score outperforms all state-of-the-art deep learning methods for AF detection task by 6.8%.

A novel focal-loss and class-weight-aware convolutional neural network for the classification of in-text citations

Journal of Information Science ◽

10.1177/0165551521991022 ◽

2021 ◽

pp. 016555152199102

Author(s):

Naif Radi Aljohani ◽

Ayman Fayoumi ◽

Saeed-Ul Hassan

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Computational Linguistics ◽

Class Imbalance ◽

Weight Functions ◽

Accuracy Score ◽

Citation Context ◽

Feature Based ◽

Class Weight ◽

Classification Tasks

We argue that citations, as they have different reasons and functions, should not all be treated in the same way. Using a large dataset of about 10K citation contexts, annotated by human experts and extracted from the Association for Computational Linguistics repository, we present a deep learning–based citation context classification architecture. Unlike all existing state-of-the-art feature-based citation classification models, our proposed convolutional neural network (CNN) with fastText-based pre-trained embedding vectors uses only the citation context as its input to outperform them in both binary- (important and non-important) and multi-class (Use, Extends, CompareOrContrast, Motivation, Background, Other) citation classification tasks. Furthermore, we propose using focal-loss and class-weight functions in the CNN model to overcome the inherent class imbalance issues in citation classification datasets. We show that using the focal-loss function with CNN adds a factor of [Formula: see text] to the cross-entropy function. Our model improves on the baseline results by achieving an encouraging 90.6 F1 score with 90.7% accuracy and a 72.3 F1 score with a 72.1% accuracy score, respectively, for binary- and multi-class citation classification tasks.
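The relationship between focal loss and cross-entropy can be illustrated with a minimal sketch. It assumes the widely used focal-loss formulation, in which the cross-entropy term is multiplied by a modulating factor (1 - p)^gamma, where p is the predicted probability of the true class; whether this matches the exact factor elided in the abstract above is an assumption. The function names and the `class_weight` parameter are hypothetical, and this is not the authors' code.

```python
import math


def cross_entropy(p_true):
    """Cross-entropy for one example, given the probability of its true class."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -math.log(p)


def focal_loss(p_true, gamma=2.0, class_weight=1.0):
    """Class-weighted focal loss for one example (assumed formulation).

    The factor (1 - p) ** gamma shrinks the loss of well-classified
    examples, so hard (often minority-class) examples dominate training.
    class_weight is a per-class multiplier, e.g. larger for rare classes.
    """
    p = min(max(p_true, 1e-12), 1.0)
    return class_weight * (1.0 - p) ** gamma * cross_entropy(p)
```

Setting `gamma=0` and `class_weight=1` recovers plain cross-entropy. With `gamma=2`, an example predicted at probability 0.9 contributes about one hundredth of its cross-entropy loss, while an example at 0.1 keeps about 81% of it.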

GEV-NN: A deep neural network architecture for class imbalance problem in binary classification

Knowledge-Based Systems ◽

10.1016/j.knosys.2020.105534 ◽

2020 ◽

Vol 194 ◽

pp. 105534 ◽

Cited By ~ 3

Author(s):

Lkhagvadorj Munkhdalai ◽

Tsendsuren Munkhdalai ◽

Keun Ho Ryu

Keyword(s):

Neural Network ◽

Network Architecture ◽

Deep Neural Network ◽

Binary Classification ◽

Class Imbalance ◽

Neural Network Architecture ◽

Class Imbalance Problem ◽

Imbalance Problem

Method of determination of the text direction on the image with the use of convolutional neural network

Informatization and communication ◽

10.34219/2078-8320-2020-11-2-96-99 ◽

2020 ◽

pp. 96-99

Author(s):

P.L. Nikolaev

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Deep Neural Network ◽

Binary Classification ◽

Synthetic Data ◽

Real Data ◽

Method Of Determination ◽

Classification Of Images

This article deals with a method for the binary classification of images containing small text. The classification is based on the fact that the text can have two directions: it can be positioned horizontally and read from left to right, or it can be turned 180 degrees, in which case the image must be rotated to read the text. This type of text can be found on the covers of a variety of books, so when recognizing covers it is necessary to determine the direction of the text before recognizing the text itself. The article proposes a deep neural network for determining the text orientation in the context of book cover recognition. The results of training and testing a convolutional neural network on synthetic data, as well as examples of the network operating on real data, are presented.

Acoustic and Articulatory Feature Based Speech Rate Estimation Using a Convolutional Dense Neural Network

10.21437/interspeech.2019-2295 ◽

2019 ◽

Cited By ~ 1

Author(s):

Renuka Mannem ◽

Jhansi Mallela ◽

Aravind Illa ◽

Prasanta Kumar Ghosh

Keyword(s):

Neural Network ◽

Speech Rate ◽

Rate Estimation ◽

Feature Based

Architecture Optimization Model for the Deep Neural Network For Binary Classification Problems

International Journal of Intelligent Computing and Information Sciences ◽

10.21608/ijicis.2020.18509.1008 ◽

2020 ◽

Vol 0 (0) ◽

pp. 0-0

Author(s):

Kingsley Ukaoha ◽

Efosa Igodan

Keyword(s):

Neural Network ◽

Optimization Model ◽

Deep Neural Network ◽

Binary Classification ◽

Classification Problems ◽

Architecture Optimization

A Study on Multi Class Classification from Breast Cancer Images using Ensemble Network and Transfer Learning

Recent Patents on Engineering ◽

10.2174/1872212114999201109205421 ◽

2020 ◽

Vol 14 ◽

Author(s):

Lahari Tipirneni ◽

Rizwan Patan

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Convolutional Neural Network ◽

Binary Classification ◽

Disease Diagnosis ◽

Feature Descriptors ◽

Histopathological Images ◽

Viable Approach ◽

Multi Class Classification

Abstract: Millions of deaths all over the world are caused by breast cancer every year. It has become the most common type of cancer in women. Early detection helps achieve a better prognosis and increases the chance of survival. Automating the classification using Computer-Aided Diagnosis (CAD) systems can make the diagnosis less prone to errors. Multi-class and binary classification of breast cancer is a challenging problem. Individual convolutional neural network architectures extract specific feature descriptors from images, which cannot represent the different types of breast cancer. This leads to false positives in classification, which is undesirable in disease diagnosis. The current paper presents an ensemble convolutional neural network for multi-class and binary classification of breast cancer. The feature descriptors from each network are combined to produce the final classification. In this paper, histopathological images are taken from the publicly available BreakHis dataset and classified into 8 classes. The proposed ensemble model performs better than the methods proposed in the literature. The results show that the proposed model could be a viable approach for breast cancer classification.
