Efficient Deep Learning Model for Text Classification Based on Recurrent and Convolutional Layers

Author(s): Abdalraouf Hassan, Ausif Mahmood


2021, Vol 2021, pp. 1-16
Author(s): Sunil Kumar Prabhakar, Dong-Ok Won

To unlock the information present in clinical descriptions, automatic medical text classification is highly useful in natural language processing (NLP). Machine learning techniques are quite effective for medical text classification tasks; however, they require extensive human effort to create labeled training data. For clinical and translational research, a huge quantity of detailed patient information, such as disease status, lab tests, medication history, side effects, and treatment outcomes, has been collected in electronic format, and it serves as a valuable data source for further analysis; processing this volume of medical text efficiently is a considerable challenge. In this work, a medical text classification paradigm using two novel deep learning architectures is proposed to mitigate the human effort. The first approach implements a quad-channel hybrid long short-term memory (QC-LSTM) deep learning model utilizing four channels; the second develops a hybrid bidirectional gated recurrent unit (BiGRU) deep learning model with multihead attention. The proposed methodology is validated on two medical text datasets, and a comprehensive analysis is conducted. The best classification accuracy, 96.72%, is obtained with the proposed QC-LSTM model, while the proposed hybrid BiGRU model reaches 95.76%.
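The abstract gives no implementation details, but the second architecture is concrete enough to sketch. Below is a minimal, hedged PyTorch version of a BiGRU classifier with multihead self-attention; the embedding size, hidden size, head count, and mean-pooling readout are illustrative assumptions, not values from the paper, and the quad-channel QC-LSTM variant is omitted for brevity.

```python
import torch
import torch.nn as nn

class BiGRUAttentionClassifier(nn.Module):
    """Sketch of a hybrid BiGRU text classifier with multihead attention.

    All dimensions below are assumptions for illustration, not the
    paper's hyperparameters.
    """
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_heads=4, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # A bidirectional GRU yields 2 * hidden_dim features per token.
        self.bigru = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # Multihead self-attention over the GRU outputs.
        self.attention = nn.MultiheadAttention(embed_dim=2 * hidden_dim,
                                               num_heads=num_heads,
                                               batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)          # (B, T, E)
        h, _ = self.bigru(x)                   # (B, T, 2H)
        attn_out, _ = self.attention(h, h, h)  # self-attention over time
        pooled = attn_out.mean(dim=1)          # average over the sequence
        return self.classifier(pooled)         # (B, num_classes)

model = BiGRUAttentionClassifier(vocab_size=30000)
logits = model(torch.randint(0, 30000, (8, 64)))  # 8 texts, 64 tokens each
```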


IEEE Access, 2020, Vol 8, pp. 30885-30896
Author(s): Jibing Gong, Hongyuan Ma, Zhiyong Teng, Qi Teng, Hekai Zhang, ...

2021
Author(s): Tong Guo

In industry NLP applications, manually labeled data inevitably contains a certain amount of noisy examples. We present a simple method to find the noisy data and relabel them manually, collecting the correction information along the way. We then present a novel method to incorporate this human correction information into a deep learning model: since humans know how to correct noisy data, the correction signal can be injected into the model. We experiment on our own manually labeled text classification dataset, in which the noisy data were relabeled for our industry application. The results show that our method improves classification accuracy from 91.7% to 92.5%, where the 91.7% baseline, obtained by training BERT on the corrected dataset, is hard to surpass.
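The abstract does not say how the correction information enters the model, so the sketch below shows one plausible reading: a learned embedding of a per-example "was this label corrected?" flag is concatenated with BERT's [CLS] representation before classification. The flag semantics, embedding width, and checkpoint name are assumptions for illustration, not the authors' confirmed method.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class CorrectionAwareClassifier(nn.Module):
    """BERT classifier with an extra channel for human-correction signals.

    A hypothetical instantiation of "injecting correction information",
    not the paper's confirmed architecture.
    """
    def __init__(self, num_classes, bert_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # 0 = label kept as-is, 1 = label fixed during relabeling.
        self.correction_embed = nn.Embedding(2, 32)
        self.classifier = nn.Linear(hidden + 32, num_classes)

    def forward(self, input_ids, attention_mask, corrected_flag):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]        # [CLS] representation
        corr = self.correction_embed(corrected_flag)
        return self.classifier(torch.cat([cls, corr], dim=-1))
```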


Author(s): Xianglong Chen, Chunping Ouyang, Yongbin Liu, Lingyun Luo, Xiaohua Yang


Author(s): Koyel Ghosh, Apurbalal Senapati

Coarse-grained tasks are primarily based on text classification, one of the earliest problems in NLP, and are performed at the document and sentence levels. Here, our goal is to identify the technical domain of a given Bangla text: in coarse-grained technical domain classification, a piece of Bangla text provides information about specific domains such as Biochemistry (bioche), Communication Technology (com-tech), Computer Science (cse), Management (mgmt), and Physics (phy). This paper uses a recent deep learning model, Bangla Bidirectional Encoder Representations from Transformers (Bangla BERT), to identify the domain of a given text. Bangla BERT (Bangla-Bert-Base) is a pretrained language model for Bangla. We then report the Bangla BERT accuracy and compare it with other models that address the same problem.
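As a concrete illustration, a classifier over the five domains above can be built by putting a sequence-classification head on the pretrained checkpoint. The sketch below uses the Hugging Face transformers API; the checkpoint identifier sagorsarker/bangla-bert-base is an assumption inferred from the "Bangla-Bert-Base" name in the abstract, and the classification head starts randomly initialized, so it must be fine-tuned on the labeled domain data before its predictions are meaningful.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Checkpoint name assumed from "Bangla-Bert-Base" in the abstract.
MODEL_NAME = "sagorsarker/bangla-bert-base"
DOMAINS = ["bioche", "com-tech", "cse", "mgmt", "phy"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(DOMAINS))
model.eval()

def predict_domain(text: str) -> str:
    """Map a Bangla text to one of the coarse technical domains."""
    inputs = tokenizer(text, truncation=True, max_length=256,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return DOMAINS[int(logits.argmax(dim=-1))]
```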

