Toxic Comment Classification Based on Bidirectional Gated Recurrent Unit and Convolutional Neural Network

Author(s):  
Zhongguo Wang ◽  
Bao Zhang

For English toxic comment classification, this paper presents BG-GCNN, a model that combines a bidirectional gated recurrent unit (Bi-GRU) with a convolutional neural network (CNN) optimized by global average pooling. The model treats each type of toxic comment as a binary classification problem. First, Bi-GRU extracts the time-series features of the comment; the dimensionality is then reduced by the global-pooling-optimized CNN. Finally, a sigmoid function outputs the classification result. Comparative experiments show that the BG-GCNN model classifies better than Text-CNN, LSTM, Bi-GRU, and other models. On the toxic comment dataset from the Kaggle competition platform, the model reaches a Macro-F1 value of 0.62. The F1 values for three of the toxic labels (toxic, obscene, and insult) are 0.81, 0.84, and 0.74, respectively, which are the highest values in the comparative experiment.
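Because the abstract reports per-label F1 values and a Macro-F1 over labels that are each treated as a binary problem, a minimal sketch of how such scores are conventionally computed may help. The function names and the toy labels below are illustrative assumptions, not the authors' evaluation code or data.

```python
def f1_score(y_true, y_pred):
    """F1 for one binary label; inputs are equal-length lists of 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: define F1 as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(true_by_label, pred_by_label):
    """Unweighted mean of the per-label F1 scores."""
    scores = [f1_score(true_by_label[name], pred_by_label[name])
              for name in true_by_label]
    return sum(scores) / len(scores)


# Toy data (hypothetical): two labels, four comments each.
true_by_label = {"toxic": [1, 1, 0, 0], "insult": [1, 0, 1, 0]}
pred_by_label = {"toxic": [1, 0, 0, 0], "insult": [1, 0, 1, 0]}
print(macro_f1(true_by_label, pred_by_label))
```

Macro averaging gives every label equal weight, so a rare label with a low F1 pulls the overall score down as much as a common one would.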

Author(s):  
P.L. Nikolaev

This article deals with a method for binary classification of images that contain small text. Classification is based on the fact that the text can have two orientations: it can be positioned horizontally and read from left to right, or it can be turned 180 degrees, so that the image must be rotated to read it. This type of text can be found on the covers of a variety of books, so when recognizing covers it is necessary to determine the direction of the text before recognizing the text itself. The article proposes a deep neural network for determining the text orientation in the context of book cover recognition. The results of training and testing a convolutional neural network on synthetic data are presented, along with examples of the network operating on real data.


2020 ◽  
Vol 14 ◽  
Author(s):  
Lahari Tipirneni ◽  
Rizwan Patan

Abstract: Breast cancer causes millions of deaths all over the world every year. It has become the most common type of cancer in women. Early detection helps achieve a better prognosis and increases the chance of survival. Automating the classification using Computer-Aided Diagnosis (CAD) systems can make the diagnosis less prone to errors. Multi-class and binary classification of breast cancer is a challenging problem. Convolutional neural network architectures extract specific feature descriptors from images, which cannot represent different types of breast cancer. This leads to false positives in classification, which is undesirable in disease diagnosis. The current paper presents an ensemble convolutional neural network for multi-class and binary classification of breast cancer. The feature descriptors from each network are combined to produce the final classification. In this paper, histopathological images are taken from the publicly available BreakHis dataset and classified into 8 classes. The proposed ensemble model performs better than the methods proposed in the literature. The results show that the proposed model could be a viable approach for breast cancer classification.


2021 ◽  
Vol 11 (14) ◽  
pp. 6594
Author(s):  
Yu-Chia Hsu

The interdisciplinary nature of sports and the presence of various systemic and non-systemic factors introduce challenges in predicting sports match outcomes using a single disciplinary approach. In contrast to previous studies that use sports performance metrics and statistical models, this study is the first to apply a deep learning approach in financial time series modeling to predict sports match outcomes. The proposed approach has two main components: a convolutional neural network (CNN) classifier for implicit pattern recognition and a logistic regression model for match outcome judgment. First, the raw data used in the prediction are derived from the betting market odds and actual scores of each game, which are transformed into sports candlesticks. Second, CNN is used to classify the candlesticks time series on a graphical basis. To this end, the original 1D time series are encoded into 2D matrix images using Gramian angular field and are then fed into the CNN classifier. In this way, the winning probability of each matchup team can be derived based on historically implied behavioral patterns. Third, to further consider the differences between strong and weak teams, the CNN classifier adjusts the probability of winning the match by using the logistic regression model and then makes a final judgment regarding the match outcome. We empirically test this approach using 18,944 National Football League game data spanning 32 years and find that using the individual historical data of each team in the CNN classifier for pattern recognition is better than using the data of all teams. The CNN in conjunction with the logistic regression judgment model outperforms the CNN in conjunction with SVM, Naïve Bayes, Adaboost, J48, and random forest, and its accuracy surpasses that of betting market prediction.
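The encoding of a 1D series into a 2D image with a Gramian angular field can be sketched as follows. The abstract does not say which variant is used, so this example assumes the summation form (entries cos(φ_i + φ_j) after rescaling the series to [-1, 1]); the function name and the sample series are hypothetical.

```python
import math


def gramian_angular_field(series):
    """Encode a 1D series as a 2D matrix (summation variant, assumed)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every value.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against floating-point drift just outside [-1, 1].
    scaled = [max(-1.0, min(1.0, v)) for v in scaled]
    # Polar encoding: each value becomes an angle.
    phi = [math.acos(v) for v in scaled]
    # Entry (i, j) captures the angular sum of time steps i and j.
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical short series; real inputs would be candlestick-derived values.
image = gramian_angular_field([1.0, 2.0, 3.0])
for row in image:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric and has one row and column per time step, which is what allows it to be fed to an image classifier such as a CNN.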


2021 ◽  
Vol 13 (4) ◽  
pp. 554
Author(s):  
A. A. Masrur Ahmed ◽  
Ravinesh C Deo ◽  
Nawin Raj ◽  
Afshin Ghahramani ◽  
Qi Feng ◽  
...  

Forecasting remotely sensed soil moisture from satellite-based sensors, to estimate the future state of the underlying soils, plays a critical role in planning and managing water resources and sustainable agricultural practices. In this paper, Deep Learning (DL) hybrid models (i.e., CEEMDAN-CNN-GRU) are designed for daily time-step surface soil moisture (SSM) forecasts, employing the gated recurrent unit (GRU), complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and convolutional neural network (CNN). To establish the objective model’s viability for SSM forecasting at multi-step daily horizons, the hybrid CEEMDAN-CNN-GRU model is tested at 1, 5, 7, 14, 21, and 30 days ahead by assimilating a comprehensive pool of 52 predictors obtained from three distinct data sources. The data comprise the satellite-derived Global Land Data Assimilation System (GLDAS) repository, a global, high-temporal-resolution, unique terrestrial modelling system; ground-based variables from Scientific Information Landowners (SILO); and synoptic-scale climate indices. The results demonstrate the forecasting capability of the hybrid CEEMDAN-CNN-GRU model with respect to the counterpart comparative models. This is supported by relatively lower values of the mean absolute percentage error and root mean square error. In terms of the statistical score metrics and infographics employed to test the final model’s utility, the proposed CEEMDAN-CNN-GRU models are considerably superior to the standalone and other hybrid methods tested on independent SSM data developed through feature selection approaches. Thus, the proposed approach can be successfully implemented in hydrology and agriculture management.
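The two error measures named in the abstract follow standard definitions, sketched below. The function names and the sample values are illustrative assumptions and are not taken from the paper's data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent; observed values must be nonzero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


# Hypothetical observed and forecast values.
observed = [2.0, 4.0]
forecast = [1.0, 5.0]
print(rmse(observed, forecast), mape(observed, forecast))
```

Lower values of both measures indicate forecasts closer to the observations; the percentage error is scale-free, while the root mean square error is in the units of the forecast variable.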


2020 ◽  
pp. 1-11
Author(s):  
Jie Liu ◽  
Hongbo Zhao

BACKGROUND: Convolutional neural networks are often superior to other similar algorithms in image classification. The convolution layer and sub-sampling layer extract sample features, and weight sharing greatly reduces the number of training parameters of the network. OBJECTIVE: This paper describes an improved convolutional neural network structure, including a convolution layer, a sub-sampling layer, and a full connection layer. This paper also introduces the “yan.mat” data set, which contains images of normal eyes and of five kinds of diseases reflected by the blood filament of the eyeball, in a form convenient for calculation with MATLAB software. METHODS: In this paper, we improve the structure of the classical LeNet-5 convolutional neural network, design a network structure with different convolution kernels, different sub-sampling methods, and different classifiers, and use this structure to solve the problem of ocular bloodstream disease recognition. RESULTS: The experimental results show that the improved convolutional neural network structure is well suited to recognition on the eye blood silk data set, which shows that the convolutional neural network offers strong classification performance and strong robustness. The improved structure can classify the diseases reflected by eyeball bloodstain well.
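The claim that weight sharing reduces the number of training parameters can be illustrated with simple arithmetic. The sketch below assumes the first convolution layer of the classical LeNet-5 (a 32x32 single-channel input, six 5x5 kernels, 28x28 output maps), not the improved structure described in the paper; the function names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Trainable parameters of a convolution layer: shared kernels plus one bias per map."""
    return out_channels * (in_channels * kernel_size * kernel_size + 1)


def dense_params(num_inputs, num_outputs):
    """Trainable parameters of a fully connected layer with one bias per output."""
    return num_outputs * (num_inputs + 1)


# Classical LeNet-5 first layer (assumed sizes): 32x32 input, six 5x5 kernels.
shared = conv_params(in_channels=1, out_channels=6, kernel_size=5)
# A fully connected layer producing the same 6 x 28 x 28 outputs from 32 x 32 inputs.
unshared = dense_params(num_inputs=32 * 32, num_outputs=6 * 28 * 28)
print(shared, unshared)  # prints "156 4821600"
```

Under these assumptions the convolution layer needs 156 parameters, while a fully connected layer with the same input and output sizes would need about 4.8 million.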


2021 ◽  
Author(s):  
Wenjie Cao ◽  
Cheng Zhang ◽  
Zhenzhen Xiong ◽  
Ting Wang ◽  
Junchao Chen ◽  
...  

2022 ◽  
Vol 10 (1) ◽  
pp. 0-0

A brain tumor is a severe cancer caused by uncontrollable and abnormal partitioning of cells. Timely disease detection and treatment plans lead to increased life expectancy for patients. Automated detection and classification of brain tumors is a challenging process that otherwise depends on the clinician’s knowledge and experience. For this reason, one of the most practical and important techniques is deep learning. Recent progress in deep learning has helped clinicians use medical imaging for the diagnosis of brain tumors. In this paper, we present a comparison of deep convolutional neural network models for automatic binary classification of an MRI image dataset, with the goal of bringing precision tools to health professionals, based on fine-tuned recent versions of DenseNet, Xception, NASNet-A, and VGGNet. The experiments were conducted using an open MRI dataset of 3,762 images. Other performance measures used in the study are the area under precision, recall, and specificity.

