Pearson Correlation-Based Feature Selection for Document Classification Using Balanced Training

Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6793
Author(s):  
Inzamam Mashood Nasir ◽  
Muhammad Attique Khan ◽  
Mussarat Yasmin ◽  
Jamal Hussain Shah ◽  
Marcin Gabryel ◽  
...  

Documents are stored in digital form across several organizations. Printing this volume of data and filing it in folders, rather than storing it digitally, is impractical, uneconomical, and ecologically unsound. An efficient way of retrieving data from digitally stored documents is also required. This article presents a real-time supervised learning technique for document classification based on a deep convolutional neural network (DCNN), which aims to reduce the impact of adverse document image issues such as signatures, marks, logos, and handwritten notes. The proposed technique’s major steps include data augmentation, feature extraction using pre-trained neural network models, feature fusion, and feature selection. We propose a novel data augmentation technique, which normalizes the imbalanced dataset using the secondary dataset RVL-CDIP. The DCNN features are extracted using the VGG19 and AlexNet networks. The extracted features are fused, and the fused feature vector is optimized by applying a Pearson correlation coefficient-based technique that selects the most useful features while removing redundant ones. The proposed technique is tested on the Tobacco3482 dataset, where it achieves a classification accuracy of 93.1% using a cubic support vector machine classifier, demonstrating the validity of the proposed technique.
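
The fusion and Pearson-based selection steps described in this abstract can be illustrated with a minimal sketch. The abstract does not give the selection rule or any threshold, so the greedy rule, the 0.95 cutoff, the function names, and the toy vectors standing in for VGG19 and AlexNet features are all assumptions made for illustration, not the authors' method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant feature as uncorrelated
    return cov / sqrt(var_x * var_y)


def fuse(features_a, features_b):
    """Serial fusion: concatenate the two feature vectors of each sample."""
    return [a + b for a, b in zip(features_a, features_b)]


def select_features(samples, threshold=0.95):
    """Greedy redundancy removal (hypothetical rule): keep a feature column
    only if its |r| with every already-kept column is below the threshold."""
    columns = list(zip(*samples))
    kept = []
    for j, column in enumerate(columns):
        if all(abs(pearson(column, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy stand-ins for features from two pre-trained networks (4 samples each).
net_a = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
net_b = [[2.0, 0.5], [4.0, 0.1], [6.0, 0.9], [8.0, 0.3]]

fused = fuse(net_a, net_b)     # each sample now has 4 features
kept = select_features(fused)  # column 2 duplicates column 0 up to scale
print(kept)
```

In this toy data the third fused column is an exact multiple of the first, so it is the only one dropped; the retained columns would then be passed to the classifier.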

2020 ◽  
Vol 10 (14) ◽  
pp. 4966 ◽  
Author(s):  
Maryam Nisa ◽  
Jamal Hussain Shah ◽  
Shansa Kanwal ◽  
Mudassar Raza ◽  
Muhammad Attique Khan ◽  
...  

As the number of internet users increases, so does the number of malicious attacks using malware. The detection of malicious code is becoming critical, and the existing approaches need to be improved. Here, we propose a feature fusion method to combine the features extracted from pre-trained AlexNet and Inception-v3 deep neural networks with features attained using segmentation-based fractal texture analysis (SFTA) of images representing the malware code. Deep convolutional neural network (CNN) features are extracted from two distinct pre-trained models to improve the accuracy of the malware classifier, because each model has characteristics that let it extract different features. This technique produces a fusion of features to build a multimodal representation of malicious code that can be used to classify the grayscale images, separating the malware into 25 malware classes. The features that are extracted from malware images are then classified using different variants of support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), and other classifiers. To improve the classification results, we also adopted data augmentation based on affine image transforms. The presented method is evaluated on the Malimg malware image dataset, achieving an accuracy of 99.3%, which makes it the best among the competing approaches.


2020 ◽  
Vol 39 (6) ◽  
pp. 8927-8935
Author(s):  
Bing Zheng ◽  
Dawei Yun ◽  
Yan Liang

Under the impact of COVID-19, research on behavior recognition is highly needed. In this paper, we combine a self-adaptive coder with a recurrent neural network for behavior pattern recognition. At present, most of the research on human behavior recognition is focused on video data, which is based on the video number. At the same time, due to the complexity of video image data, it is easy to violate personal privacy. The rapid development of Internet of Things technology has attracted the attention of a large number of experts and scholars. Researchers have tried many machine learning methods, such as random forest, support vector machine, and other shallow learning methods, which perform well in the laboratory environment but are still a long way from practical application. In this paper, a recurrent neural network algorithm based on long short-term memory (LSTM) is proposed to realize the recognition of behavior patterns, so as to improve the accuracy of human activity recognition.


Friction ◽  
2021 ◽  
Author(s):  
Xiaobin Hu ◽  
Jian Song ◽  
Zhenhua Liao ◽  
Yuhong Liu ◽  
Jian Gao ◽  
...  

Finding the correct category of wear particles is important for understanding tribological behavior. However, manual identification is tedious and time-consuming. We here propose an automatic morphological residual convolutional neural network (M-RCNN), exploiting the residual knowledge and morphological priors between various particle types. We also employ data augmentation to prevent the performance deterioration caused by the extremely imbalanced class distribution. Experimental results indicate that our morphological priors are distinguishable and substantially boost overall performance. M-RCNN demonstrates a much higher accuracy (0.940) than the deep residual network (0.845) and support vector machine (0.821). This work provides an effective solution for automatically identifying wear particles and can be a powerful tool to further analyze the failure mechanisms of artificial joints.


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0250782
Author(s):  
Bin Wang ◽  
Bin Xu

With the rapid development of Unmanned Aerial Vehicles, vehicle detection in aerial images plays an important role in different applications. Compared with general object detection problems, vehicle detection in aerial images is still a challenging research topic since it is plagued by various unique factors, e.g. different camera angles, small vehicle size, and complex backgrounds. In this paper, a Feature Fusion Deep-Projection Convolution Neural Network is proposed to enhance the ability to detect small vehicles in aerial images. The backbone of the proposed framework utilizes a novel residual block named stepwise res-block to explore high-level semantic features while conserving low-level detail features. A specially designed feature fusion module is adopted in the proposed framework to further balance the features obtained from different levels of the backbone. A deep-projection deconvolution module is used to minimize the impact of the information contamination introduced by down-sampling/up-sampling processes. The proposed framework has been evaluated on the UCAS-AOD, VEDAI, and DOTA datasets. According to the evaluation results, the proposed framework outperforms other state-of-the-art vehicle detection algorithms for aerial images.


2019 ◽  
Vol 2019 ◽  
pp. 1-14
Author(s):  
Renzhou Gui ◽  
Tongjie Chen ◽  
Han Nie

With the continuous development of science, more and more research results have shown that machine learning is capable of diagnosing and studying major depressive disorder (MDD) in the brain. We propose a deep learning network with multibranch and local residual feedback for four different types of functional magnetic resonance imaging (fMRI) data produced by depressed patients and control subjects while listening to positive- and negative-emotion music. We use a large convolution kernel of the same size as the correlation matrix to match the features and obtain the feature-matching results for 264 regions of interest (ROIs). Firstly, four-dimensional fMRI data are used to generate the two-dimensional correlation matrix of one person’s brain based on ROIs, which is then processed with a threshold value selected according to the characteristics of complex networks and small-world networks. After that, the deep learning model in this paper is compared with support vector machine (SVM), logistic regression (LR), k-nearest neighbor (kNN), a common deep neural network (DNN), and a deep convolutional neural network (CNN) for classification. Finally, we further calculate the matched ROIs from the intermediate results of our deep learning model, which can help related fields further explore the pathogeny of depression.


2020 ◽  
Vol 22 (4) ◽  
pp. 900-915 ◽  
Author(s):  
Xiao-ying Bi ◽  
Bo Li ◽  
Wen-long Lu ◽  
Xin-zhi Zhou

Accurate daily runoff prediction plays an important role in the management and utilization of water resources. In order to improve the accuracy of prediction, this paper proposes a deep neural network (CAGANet) composed of a convolutional layer, an attention mechanism, a gated recurrent unit (GRU) neural network, and an autoregressive (AR) model. Given that the daily runoff sequence is abrupt and unstable, it is difficult for either a single model or a combined model to obtain high-precision daily runoff predictions directly. Therefore, this paper uses a linear interpolation method to enhance the stability of hydrological data and applies the augmented data to the CAGANet model, the support vector machine (SVM) model, the long short-term memory (LSTM) neural network, and the attention-mechanism-based LSTM model (AM-LSTM). The comparison results show that among the four models based on data augmentation, the CAGANet model proposed in this paper has the best prediction accuracy. Its Nash–Sutcliffe efficiency can reach 0.993. Therefore, the CAGANet model based on data augmentation is a feasible daily runoff forecasting scheme.
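
For readers unfamiliar with the metric quoted in this abstract, the sketch below computes the Nash–Sutcliffe efficiency using its standard definition, 1 minus the ratio of the squared prediction error to the variance of the observations around their mean. The `interpolate_midpoints` helper is only one possible reading of "linear interpolation" augmentation (inserting the midpoint between consecutive values); the abstract does not specify the scheme, so that function and both names are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit, 0.0 is no better
    than always predicting the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def interpolate_midpoints(series):
    """Hypothetical augmentation: insert the linear midpoint between each
    pair of consecutive values. Assumes a non-empty series."""
    augmented = []
    for current, following in zip(series, series[1:]):
        augmented.extend([current, (current + following) / 2])
    augmented.append(series[-1])
    return augmented


observed = [1.0, 2.0, 3.0]
print(nash_sutcliffe(observed, [1.0, 2.0, 3.0]))  # perfect prediction
print(nash_sutcliffe(observed, [2.0, 2.0, 2.0]))  # mean-only prediction
print(interpolate_midpoints([1.0, 3.0, 2.0]))
```

A value of 0.993, as reported in the abstract, therefore means the squared prediction error is under 1% of the variance of the observed runoff.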


2011 ◽  
Vol 403-408 ◽  
pp. 3805-3812 ◽  
Author(s):  
Kong Hui Guo ◽  
Xian Yun Wang

Nonparametric models of a hydraulic damper based on support vector regression (SVR) are developed. These models are then compared with two kinds of neural network models: a backpropagation neural network (BPNN) model and a radial basis function neural network (RBFNN) model. Comparisons are carried out on both a virtual damper and an actual damper. The force-velocity relation of a virtual damper is obtained based on a rheological model, and these data are used to identify the characteristics of the virtual damper. The dynamometer measurements of an actual displacement-dependent damper are obtained by experiment, and these data are used to identify the characteristics of this actual damper. The comparisons show that the BPNN model is best at identifying the characteristics of the virtual damper, but the SVR model is best at identifying the characteristics of the actual damper. The reason is that all experimental data include some noise. When the amplitude of the noise is smaller than the parameter of SVR, the noise cannot affect the construction of the resulting model. So when training a model based on experimental data, SVR is superior to the neural network methods.
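
The noise argument in this abstract can be made concrete with a small sketch. It assumes that "the parameter of SVR" refers to the tube width epsilon of the standard epsilon-insensitive loss, which the abstract does not state explicitly; the function name and the numbers are illustrative only.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard epsilon-insensitive loss: deviations inside the tube of
    half-width epsilon cost nothing; larger ones cost the excess."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


epsilon = 0.1            # assumed tube width
clean_target = 1.0
model_output = 1.0

# Noise of amplitude 0.05 (< epsilon) on the target yields zero loss,
# so such a sample does not pull the fitted model toward the noise.
small_noise_loss = epsilon_insensitive_loss(clean_target + 0.05, model_output, epsilon)

# Noise of amplitude 0.5 (> epsilon) is penalized by the excess over epsilon.
large_noise_loss = epsilon_insensitive_loss(clean_target + 0.5, model_output, epsilon)

print(small_noise_loss, large_noise_loss)
```

Under that assumption, measurement noise smaller than epsilon contributes nothing to the training objective, which is consistent with the robustness the abstract reports on experimental data.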


2020 ◽  
Vol 31 (3) ◽  
pp. 287-296
Author(s):  
Ahmed A. Moustafa ◽  
Angela Porter ◽  
Ahmed M. Megreya

Many students suffer from anxiety when performing numerical calculations. Mathematics anxiety is a condition that has a negative effect on educational outcomes and future employment prospects. While there are a multitude of behavioral studies on mathematics anxiety, its underlying cognitive and neural mechanisms remain unclear. This article provides a systematic review of cognitive studies that investigated mathematics anxiety. As there are no prior neural network models of mathematics anxiety, this article discusses how previous neural network models of mathematical cognition could be adapted to simulate the neural and behavioral studies of mathematics anxiety. In other words, here we provide a novel integrative network theory on the links between mathematics anxiety, cognition, and brain substrates. This theoretical framework may explain the impact of mathematics anxiety on a range of cognitive and neuropsychological tests. Therefore, it could improve our understanding of the cognitive and neurological mechanisms underlying mathematics anxiety and also has important applications. Indeed, a better understanding of mathematics anxiety could inform more effective therapeutic techniques that in turn could lead to significant improvements in educational outcomes.


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6008 ◽  
Author(s):  
Misbah Farooq ◽  
Fawad Hussain ◽  
Naveed Khan Baloch ◽  
Fawad Riasat Raja ◽  
Heejung Yu ◽  
...  

Speech emotion recognition (SER) plays a significant role in human–machine interaction. Emotion recognition from speech and its precise classification is a challenging task because a machine is unable to understand its context. For an accurate emotion classification, emotionally relevant features must be extracted from the speech data. Traditionally, handcrafted features were used for emotional classification from speech signals; however, they are not efficient enough to accurately depict the emotional states of the speaker. In this study, the benefits of a deep convolutional neural network (DCNN) for SER are explored. For this purpose, a pretrained network is used to extract features from state-of-the-art speech emotional datasets. Subsequently, a correlation-based feature selection technique is applied to the extracted features to select the most appropriate and discriminative features for SER. For the classification of emotions, we utilize support vector machines, random forests, the k-nearest neighbors algorithm, and neural network classifiers. Experiments are performed for speaker-dependent and speaker-independent SER using four publicly available datasets: the Berlin Dataset of Emotional Speech (Emo-DB), Surrey Audio Visual Expressed Emotion (SAVEE), Interactive Emotional Dyadic Motion Capture (IEMOCAP), and the Ryerson Audio Visual Dataset of Emotional Speech and Song (RAVDESS). Our proposed method achieves an accuracy of 95.10% for Emo-DB, 82.10% for SAVEE, 83.80% for IEMOCAP, and 81.30% for RAVDESS in speaker-dependent SER experiments. Moreover, our method yields the best results for speaker-independent SER when compared with existing handcrafted feature-based SER approaches.


Author(s):  
Ilona Jagielska ◽  

An important task in knowledge discovery is feature selection. This paper describes a practical approach to feature subset selection proposed as part of a hybrid rough sets/neural network framework for knowledge discovery for decision support. In this framework, neural networks and rough sets are combined and used cooperatively during the system life cycle. The reason for combining rough sets with neural networks in the proposed framework is twofold. Firstly, rough-set-based systems provide domain knowledge expressed in the form of if-then rules as well as tools for data analysis. Secondly, rough sets are used in this framework for the task of feature selection for neural network models. This paper examines the feature selection aspect of the framework. An empirical study that tested the approach on artificial and real-world datasets was carried out. Experimental results indicate that the proposed approach can improve the performance of neural network models. The framework was also applied in the development of a real-world decision support system. The experience with this application has shown that the approach can support users in the task of feature selection.

