News Text Classification Method and Simulation Based on the Hybrid Deep Learning Model

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Ningfeng Sun ◽  
Chengye Du

This paper uses a database as the data source and applies bibliometric and visual analysis methods to statistically analyze the documents published in the field of text classification over the past ten years, in order to clarify the field's development context and research status and to predict its research priorities and frontiers. Based on an in-depth study of the background, research status, related theories, and developments of online news text classification, this article analyzes the annual publication trend, subject distribution, journal distribution, institution distribution, author distribution, highly cited literature, research hotspots, and research frontiers. These analyses clarify the development context and research status of the text classification field and provide a theoretical reference for its further development. Then, on the basis of systematic research on text classification, deep learning, and news text classification theories, a deep learning-based network news text classification model is constructed, and the function of each module is introduced in detail, which provides a theoretical basis for the future application and improvement of news text classification. Building on previous work, this article separately studies and improves neural network models based on the convolutional neural network, the recurrent neural network, and the attention mechanism and merges the three into one model, which can obtain local associated features and contextual features and highlight the role of keywords. Finally, experiments verify the effectiveness of the proposed model, and a comparison with traditional text classification demonstrates the superiority of the proposed deep learning-based network news text classification.
This article aims to study the internal connection between news comments and the number of votes they receive, and the proposed model can be used to predict the number of votes for news comments.
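The abstract above says the merged model uses an attention mechanism to highlight the role of keywords. A minimal sketch of that idea, assuming dot-product attention over per-token feature vectors, is given below. The feature values, the `query` vector, and the function names are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_features, query):
    """Weight each token vector by its dot product with `query`, then sum."""
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in token_features]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_features))
        for i in range(dim)
    ]
    return weights, pooled


# Toy per-token features standing in for CNN/RNN outputs (hypothetical values).
tokens = ["the", "election", "results"]
features = [[0.1, 0.0], [0.9, 0.8], [0.7, 0.6]]
query = [1.0, 1.0]  # hypothetical learned context vector

weights, pooled = attention_pool(features, query)
for token, weight in zip(tokens, weights):
    print(f"{token}: {weight:.3f}")
```

In this toy input the token with the largest dot product against `query` receives the largest weight, which is one way a keyword can dominate the pooled representation.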


2018 ◽  
Vol 10 (11) ◽  
pp. 113 ◽  
Author(s):  
Yue Li ◽  
Xutao Wang ◽  
Pengjian Xu

Text classification is important in natural language processing, as massive amounts of valuable text information need to be classified into different categories for further use. In order to better classify text, our paper tries to build a deep learning model which achieves better classification results in Chinese text than those of other researchers’ models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as deep learning methods to classify Chinese text. LSTM is a special kind of recurrent neural network (RNN), which is capable of processing serialized information through its recurrent structure. By contrast, CNN has shown its ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated into our new model, the BLSTM-C model (BLSTM stands for bi-directional long short-term memory, while C stands for CNN). LSTM was responsible for obtaining a sequence output based on past and future contexts, which was then input to the convolutional layer for extracting features. In our experiments, the proposed BLSTM-C model was evaluated in several ways. In the results, the model exhibited remarkable performance in text classification, especially in Chinese texts.
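The abstract above describes feeding the sequence output of the recurrent layers into a convolutional layer for feature extraction. The sketch below illustrates only that second step, assuming a single filter, ReLU activation, and max-over-time pooling. The input values and the filter are hypothetical stand-ins, and none of this reproduces the BLSTM-C implementation.

```python
def conv1d_max_pool(sequence, kernel, bias=0.0):
    """Slide `kernel` (width x dim) over `sequence` (length x dim),
    apply ReLU to each response, and keep the maximum response.

    Assumes the sequence is at least as long as the kernel is wide.
    """
    width = len(kernel)
    responses = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        total = bias
        for row, kernel_row in zip(window, kernel):
            total += sum(x * k for x, k in zip(row, kernel_row))
        responses.append(max(0.0, total))  # ReLU
    return max(responses), responses


# Stand-in for recurrent-layer outputs: 4 time steps, 2 features each
# (hypothetical values, not produced by a trained model).
blstm_outputs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical filter of width 2

feature, responses = conv1d_max_pool(blstm_outputs, kernel)
print(feature, responses)
```

A real model would apply many such filters and pass the pooled features to a classifier layer.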



2021 ◽  
Vol 5 (3) ◽  
pp. 584-593
Author(s):  
Naufal Hilmiaji ◽  
Kemas Muslim Lhaksmana ◽  
Mahendra Dwifebri Purbolaksono

especially with the advancement of deep learning methods for text classification. Despite some efforts to identify emotion in Indonesian tweets, the reported performance has not reached acceptable levels. To solve this problem, this paper implements a classification model using a convolutional neural network (CNN), which has demonstrated expected performance in text classification. To allow easy comparison with previous research, the classification is performed on the same dataset, which consists of 4,403 Indonesian tweets labeled with five emotion classes: anger, fear, joy, love, and sadness. The performance evaluation yields precision, recall, and F1-score of 90.1%, 90.3%, and 90.2%, respectively, while the highest accuracy is 89.8%. These results outperform previous research on the same classification task and dataset.
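The precision, recall, F1-score, and accuracy reported above can be computed from gold and predicted labels as sketched below. Macro averaging over the five emotion classes is an assumption, since the abstract does not state the averaging scheme, and the toy labels are hypothetical.

```python
def macro_prf(gold, predicted, labels):
    """Macro-averaged precision, recall, and F1 over `labels`."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(gold, predicted))
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def accuracy(gold, predicted):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(1 for g, p in zip(gold, predicted) if g == p) / len(gold)


labels = ["anger", "fear", "joy", "love", "sadness"]
# Hypothetical toy labels, not drawn from the 4,403-tweet dataset.
gold = ["anger", "fear", "joy", "love", "sadness", "joy"]
predicted = ["anger", "fear", "joy", "joy", "sadness", "joy"]

precision, recall, f1 = macro_prf(gold, predicted, labels)
print(precision, recall, f1, accuracy(gold, predicted))
```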



PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0247984
Author(s):  
Xuyang Wang ◽  
Yixuan Tong

With the rapid development of the mobile internet, people are becoming more dependent on the internet to express their comments on products or stores; meanwhile, text sentiment classification of these comments has become a research hotspot. In existing methods, it is fairly popular to apply a deep learning method to the text classification task. To address information loss, weak context, and other problems, this paper improves on the transformer model to reduce the difficulty and time cost of model training and to achieve higher overall recall and accuracy in text sentiment classification. The transformer model replaces the traditional convolutional neural network (CNN) and the recurrent neural network (RNN) and is fully based on the attention mechanism; therefore, the transformer model effectively improves the training speed and reduces training difficulty. This paper selects e-commerce reviews as research objects and applies deep learning theory. First, the text is preprocessed by word vectorization. Then the IN standardized method and the GELUs activation function are applied to the original model to analyze the emotional tendencies of online users towards stores or products. The experimental results show that, compared with BiLSTM, the Naive Bayesian model, the serial BiLSTM_CNN model, and BiLSTM with an attention mechanism, our method improves recall by 9.71%, 6.05%, 5.58%, and 5.12%, respectively, and approaches the peak F1 value among the tested models. Therefore, this finding shows that our method can improve text sentiment classification accuracy and can be applied effectively to text classification.
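The GELU activation mentioned above has a standard closed form, GELU(x) = x · Φ(x), where Φ is the standard normal cumulative distribution function. The sketch below shows that definition together with the widely used tanh approximation. How the paper combines the activation with its normalization step is not specified in the abstract and is not modeled here.

```python
import math


def gelu(x):
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Common tanh-based approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(value, gelu(value), gelu_tanh(value))
```

Unlike ReLU, GELU is smooth around zero and lets small negative inputs pass through with reduced magnitude.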



2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Young-Gon Kim ◽  
Sungchul Kim ◽  
Cristina Eunbee Cho ◽  
In Hye Song ◽  
Hee Jin Lee ◽  
...  

Fast and accurate confirmation of metastasis on the frozen tissue section of intraoperative sentinel lymph node biopsy is an essential tool for critical surgical decisions. However, accurate diagnosis by pathologists is difficult within the time limitations. Training a robust and accurate deep learning model is also difficult owing to the limited number of frozen datasets with high quality labels. To overcome these issues, we validated the effectiveness of transfer learning from CAMELYON16 to improve performance of the convolutional neural network (CNN)-based classification model on our frozen dataset (N = 297) from Asan Medical Center (AMC). Among the 297 whole slide images (WSIs), 157 and 40 WSIs were used to train deep learning models with different dataset ratios at 2, 4, 8, 20, 40, and 100%. The remaining 100 WSIs were used to validate model performance in terms of patch- and slide-level classification. An additional 228 WSIs from Seoul National University Bundang Hospital (SNUBH) were used as an external validation. Three initial weights, i.e., scratch-based (random initialization), ImageNet-based, and CAMELYON16-based models were used to validate their effectiveness in external validation. In the patch-level classification results on the AMC dataset, CAMELYON16-based models trained with a small dataset (up to 40%, i.e., 62 WSIs) showed a significantly higher area under the curve (AUC) of 0.929 than those of the scratch- and ImageNet-based models at 0.897 and 0.919, respectively, while CAMELYON16-based and ImageNet-based models trained with 100% of the training dataset showed comparable AUCs at 0.944 and 0.943, respectively. For the external validation, CAMELYON16-based models showed higher AUCs than those of the scratch- and ImageNet-based models. The feasibility of transfer learning to enhance model performance was validated for frozen section datasets with limited numbers.
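The comparisons above are reported as area under the ROC curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical patch-level labels (1 = metastasis) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the number of examples, so a rank-based formulation would be preferable for large patch sets.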



2019 ◽  
Vol 14 (1) ◽  
pp. 124-134 ◽  
Author(s):  
Shuai Zhang ◽  
Yong Chen ◽  
Xiaoling Huang ◽  
Yishuai Cai

Online feedback is an effective way of communication between government departments and citizens. However, the high daily volume of public feedback has increased the burden on government administrators. Deep learning methods are good at automatically analyzing and extracting deep features of data, which improves the accuracy of classification prediction. In this study, we aim to use a text classification model to classify public feedback automatically and so reduce the work pressure on administrators. In particular, a convolutional neural network model combined with word embedding and optimized by a differential evolution algorithm is adopted. We compared it with seven common text classification models, and the results show that the model we explored has good classification performance under different evaluation metrics, including accuracy, precision, recall, and F1-score.
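The abstract above mentions optimizing the model with a differential evolution algorithm. The sketch below shows the common DE/rand/1/bin variant on a toy objective. Which variant, parameters, and search space the authors used is not stated, so the objective function, bounds, and settings here are all hypothetical.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           mutation=0.5, crossover=0.9, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    fitness = [objective(ind) for ind in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, none of them the current one.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < crossover or d == forced:
                    value = population[a][d] + mutation * (
                        population[b][d] - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = population[i][d]
                trial.append(value)
            trial_fitness = objective(trial)
            if trial_fitness <= fitness[i]:  # greedy selection
                population[i], fitness[i] = trial, trial_fitness
    best = min(range(pop_size), key=fitness.__getitem__)
    return population[best], fitness[best]


def toy_validation_error(params):
    """Hypothetical stand-in for validation error over two hyperparameters."""
    learning_rate_exponent, dropout = params
    return (learning_rate_exponent + 3.0) ** 2 + (dropout - 0.5) ** 2


search_bounds = [(-5.0, -1.0), (0.0, 0.9)]
best_params, best_error = differential_evolution(
    toy_validation_error, search_bounds)
print(best_params, best_error)
```

In a real setting the objective would train and evaluate the network for each candidate, which makes every evaluation expensive and usually calls for a much smaller population and fewer generations.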



Author(s):  
Noha Ali ◽  
Ahmed H. AbuEl-Atta ◽  
Hala H. Zayed

Deep learning (DL) algorithms have achieved state-of-the-art performance in computer vision, speech recognition, and natural language processing (NLP). In this paper, we enhance the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks. The model implements a recent word embedding technique in the embedding layer. This technique uses the concept of distributed phrase representation and multi-word phrase embedding. The proposed model enhances the performance of the existing model used for biomedical text classification. It outperforms the previous model, achieving an F-score of 83.87% using PMC vectors (PMCVec) embedding, an unsupervised technique trained on PubMed abstracts. We also ran another experiment on the same dataset using the recurrent neural network (RNN) algorithm with two different word embeddings, Google News and PMCVec, which achieved F-scores of 74.9% and 76.26%, respectively.



10.2196/23230 ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. e23230
Author(s):  
Pei-Fu Chen ◽  
Ssu-Ming Wang ◽  
Wei-Chih Liao ◽  
Lu-Cheng Kuo ◽  
Kuan-Chih Chen ◽  
...  

Background The International Classification of Diseases (ICD) code is widely used as the reference for medical systems and billing purposes. However, classifying diseases into ICD codes still mainly relies on humans reading a large amount of written material as the basis for coding. Coding is both laborious and time-consuming. Since the conversion of ICD-9 to ICD-10, the coding task has become much more complicated, and approaches based on deep learning and natural language processing have been studied to assist disease coders. Objective This paper aims at constructing a deep learning model for ICD-10 coding, where the model is meant to automatically determine the corresponding diagnosis and procedure codes based solely on free-text medical notes to improve accuracy and reduce human effort. Methods We used diagnosis records of the National Taiwan University Hospital as resources and applied natural language processing techniques, including global vectors, word to vectors, embeddings from language models, bidirectional encoder representations from transformers, and single head attention recurrent neural network, on the deep neural network architecture to implement ICD-10 auto-coding. In addition, we introduced the attention mechanism into the classification model to extract the keywords from diagnoses and visualize the coding reference for training newcomers to ICD-10. Sixty discharge notes were randomly selected to examine the change in the F1-score and the coding time by coders before and after using our model. Results In experiments on the medical data set of National Taiwan University Hospital, our prediction results revealed F1-scores of 0.715 and 0.618 for the ICD-10 Clinical Modification code and Procedure Coding System code, respectively, with a bidirectional encoder representations from transformers embedding approach in the Gated Recurrent Unit classification model. The well-trained models were applied on the ICD-10 web service for coding and training to ICD-10 users.
With this service, the coders' F1-score increased significantly from a median of 0.832 to 0.922 (P<.05), but coding time was not reduced. Conclusions The proposed model significantly improved the F1-score but did not decrease the time consumed in coding by disease coders.
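The coder evaluation above reports a median F1-score over discharge notes. One plausible reading, assumed here and not confirmed by the abstract, is that an F1-score is computed per note by comparing the set of assigned codes with the reference set, and the median is then taken across notes. The notes and code assignments below are hypothetical examples.

```python
from statistics import median


def set_f1(gold_codes, assigned_codes):
    """F1 between a reference code set and an assigned code set for one note."""
    gold, assigned = set(gold_codes), set(assigned_codes)
    overlap = len(gold & assigned)
    if overlap == 0:
        return 0.0
    precision = overlap / len(assigned)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (reference, assigned) code sets for three discharge notes.
notes = [
    ({"I10", "E11.9"}, {"I10", "E11.9"}),
    ({"J18.9", "I10"}, {"J18.9"}),
    ({"N39.0"}, {"I10"}),
]

per_note_f1 = [set_f1(gold, assigned) for gold, assigned in notes]
median_f1 = median(per_note_f1)
print(per_note_f1, median_f1)
```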



Author(s):  
Victoria Wu

Introduction: Scoliosis, an excessive curvature of the spine, affects approximately 1 in 1,000 individuals. As a result, mandatory scoliosis screening procedures were formerly implemented. Screening programs are no longer widely used, as the harms often outweigh the benefits; screening causes many adolescents to undergo frequent diagnostic X-ray procedures. This makes spinal ultrasound an ideal substitute for scoliosis screening in patients, as it does not expose them to those levels of radiation. Spinal curvatures can be accurately computed from the location of spinal transverse processes, by measuring the vertebral angle from a reference line [1]. However, ultrasound images are less clear than X-ray images, making it difficult to identify the spinal processes. To overcome this, we employ deep learning using a convolutional neural network, which is a powerful tool for computer vision and image classification [2]. Method: A total of 2,752 ultrasound images were recorded from a spine phantom to train a convolutional neural network. Subsequently, we took another recording of 747 images to be used for testing. All the ultrasound images from the scans were then segmented manually, using the 3D Slicer (www.slicer.org) software. Next, the dataset was fed through a convolutional neural network. The network used was a modified version of GoogLeNet (Inception v1), with 2 linearly stacked inception modules. This network was chosen because it provided a balance between accurate performance and time-efficient computation. Results: Deep learning classification using the Inception model achieved an accuracy of 84% for the phantom scan. Conclusion: The classification model performs with considerable accuracy. Better accuracy needs to be achieved, possibly with more available data and improvements in the classification model. Acknowledgements: G. Fichtinger is supported as a Canada Research Chair in Computer-Integrated Surgery. This work was funded, in part, by NIH/NIBIB and NIH/NIGMS (via grant 1R01EB021396-01A1 - Slicer+PLUS: Point-of-Care Ultrasound) and by CANARIE’s Research Software Program.
Figure 1: Ultrasound scan containing a transverse process (left), and ultrasound scan containing no transverse process (right).
Figure 2: Accuracy of classification for training (red) and validation (blue).
References:
[1] Ungi T, King F, Kempston M, Keri Z, Lasso A, Mousavi P, Rudan J, Borschneck DP, Fichtinger G. Spinal Curvature Measurement by Tracked Ultrasound Snapshots. Ultrasound in Medicine and Biology, 40(2):447-54, Feb 2014.
[2] Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25:1097-1105, 2012.
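The introduction above notes that spinal curvature can be computed from transverse process locations by measuring the vertebral angle from a reference line. A minimal geometric sketch of that measurement follows, assuming 2D coordinates, a horizontal reference line, and curvature taken as the difference between two vertebral angles. These choices and the coordinates are assumptions for illustration, and the actual procedure is the one described in reference [1].

```python
import math


def vertebral_angle(left_process, right_process):
    """Angle in degrees between the line through two transverse processes
    and a horizontal reference line (assumed reference)."""
    dx = right_process[0] - left_process[0]
    dy = right_process[1] - left_process[1]
    return math.degrees(math.atan2(dy, dx))


def curvature_between(upper_pair, lower_pair):
    """Absolute difference between two vertebral angles, in degrees."""
    return abs(vertebral_angle(*upper_pair) - vertebral_angle(*lower_pair))


# Hypothetical (x, y) positions of left and right transverse processes.
upper_vertebra = ((0.0, 0.0), (1.0, 1.0))
lower_vertebra = ((0.0, 0.0), (1.0, 0.0))
angle = curvature_between(upper_vertebra, lower_vertebra)
print(angle)
```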



2020 ◽  
Vol 47 (11) ◽  
pp. 1054-1060
Author(s):  
Gihyeon Choi ◽  
Youngjin Jang ◽  
Harksoo Kim ◽  
Kwanwoo Kim


2021 ◽  
Author(s):  
Zhiqiang Liu ◽  
Jingkun Feng ◽  
Zhihao Yang ◽  
Lei Wang

BACKGROUND With the development of biomedicine, the number of biomedical documents has increased rapidly, which brings a great challenge for researchers retrieving the information they need. Information retrieval aims to meet this challenge by searching relevant documents from abundant documents based on the given query. However, sometimes the relevance of search results needs to be evaluated from multiple aspects in some specific retrieval tasks and thereby increases the difficulty of biomedical information retrieval. OBJECTIVE This study aims to find a more systematic method to retrieve relevant scientific literature for a given patient. METHODS In the initial retrieval stage, we supplement query terms through query expansion strategies and apply query boosting to obtain an initial ranking list of relevant documents. In the re-ranking phase, we employ a text classification model and relevance matching model to evaluate documents respectively from different dimensions, then we combine the outputs through logistic regression to re-rank all the documents from the initial ranking list. RESULTS The proposed ensemble method contributes to the improvement of biomedical retrieval performance. Comparing with the existing deep learning-based methods, experimental results show that our method achieves state-of-the-art performance on the data collection provided by TREC 2019 Precision Medicine Track. CONCLUSIONS In this paper, we propose a novel ensemble method based on deep learning. As shown in the experiments, the strategies we used in the initial retrieval phase such as query expansion and query boosting are effective. The application of the text classification model and the relevance matching model can better capture semantic context information and improve retrieval performance.
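The re-ranking phase above combines the outputs of a text classification model and a relevance matching model through logistic regression. The sketch below shows the scoring and sorting step only, assuming the logistic regression has already been fitted. The weights, bias, field names, and candidate scores are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def rerank(candidates, weights, bias):
    """Sort documents by a logistic combination of two model scores."""
    w_classifier, w_matching = weights
    scored = []
    for doc in candidates:
        z = (w_classifier * doc["classifier_score"]
             + w_matching * doc["matching_score"] + bias)
        scored.append((sigmoid(z), doc["doc_id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Hypothetical initial ranking list with scores from the two models.
initial_ranking = [
    {"doc_id": "d1", "classifier_score": 0.2, "matching_score": 0.9},
    {"doc_id": "d2", "classifier_score": 0.8, "matching_score": 0.7},
    {"doc_id": "d3", "classifier_score": 0.5, "matching_score": 0.1},
]

# Hypothetical fitted coefficients; real values would come from training.
final_ranking = rerank(initial_ranking, weights=(1.5, 2.0), bias=-1.0)
print(final_ranking)
```

Because the sigmoid is monotonic, the ordering depends only on the linear combination; the probability form matters when a relevance threshold is also applied.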


