A Text Classification Model to Identify Performance Bonds Requirement in Public Bidding Notices

Author(s):  
Urias Cruz da Cunha ◽  
Ricardo Silva Carvalho ◽  
Alexandre Zaghetto
2019 ◽  
Vol 14 (1) ◽  
pp. 124-134 ◽  
Author(s):  
Shuai Zhang ◽  
Yong Chen ◽  
Xiaoling Huang ◽  
Yishuai Cai

Online feedback is an effective channel of communication between government departments and citizens. However, the high daily volume of public feedback has increased the burden on government administrators. Deep learning methods are good at automatically analyzing and extracting deep features from data, which can improve the accuracy of classification. In this study, we use a text classification model to classify public feedback automatically and so reduce the workload of administrators. In particular, we adopt a convolutional neural network model combined with word embedding and optimized by a differential evolution algorithm. We also compared it with seven common text classification models, and the results show that the model has good classification performance under different evaluation metrics, including accuracy, precision, recall, and F1-score.
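The abstract does not say which quantities differential evolution tunes or which variant is used. As a rough illustration only, the sketch below implements the common DE/rand/1/bin scheme in plain Python and applies it to `toy_loss`, a hypothetical stand-in for a validation loss over two hyperparameters. The function names, bounds, and settings (`pop_size`, `f`, `cr`) are assumptions, not values from the paper.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` inside box `bounds` with the DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the search box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, none of them the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_loss(params):
    """Hypothetical loss surface; a real run would train and validate a model."""
    log10_learning_rate, dropout = params
    return (log10_learning_rate + 3.0) ** 2 + (dropout - 0.5) ** 2


best_params, best_loss = differential_evolution(
    toy_loss, bounds=[(-5.0, -1.0), (0.0, 0.9)]
)
```

In a real setting each call to the objective would mean training and evaluating a network, so the population size and number of generations would have to be far smaller than in this toy run.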


Author(s):  
Noha Ali ◽  
Ahmed H. AbuEl-Atta ◽  
Hala H. Zayed

Deep learning (DL) algorithms have achieved state-of-the-art performance in computer vision, speech recognition, and natural language processing (NLP). In this paper, we enhance the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks. The model implements a recent word embedding technique in the embedding layer. This technique uses the concept of distributed phrase representation and multi-word phrase embedding. The proposed model improves on the performance of the existing model used for biomedical text classification. It outperforms the previous model, achieving an F-score of 83.87% using PMC vectors (PMCVec) embedding, an unsupervised technique trained on PubMed abstracts. We also ran another experiment on the same dataset using the recurrent neural network (RNN) algorithm with two different word embeddings, Google News and PMCVec, which achieved F-scores of 74.9% and 76.26%, respectively.


Author(s):  
Han-joon Kim

This chapter introduces two practical techniques for improving Naïve Bayes text classifiers, which are widely used for text classification. Naïve Bayes is regarded as a practical text classification algorithm because of its simple classification model, reasonable classification accuracy, and easily updated model. Thus, many researchers have a strong incentive to improve Naïve Bayes by combining it with meta-learning approaches such as EM (Expectation Maximization) and Boosting. The EM approach combines Naïve Bayes with the EM algorithm, and the Boosting approach uses Naïve Bayes as a base classifier in the AdaBoost algorithm. For both approaches, a special uncertainty measure suited to Naïve Bayes learning is used. Within the Naïve Bayes learning framework, these approaches are expected to be practical solutions to the lack of training documents in text classification systems.
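For orientation, here is a minimal sketch of a plain multinomial Naïve Bayes classifier, the kind of base model that EM or AdaBoost would wrap. It does not implement the chapter's EM or Boosting techniques or its uncertainty measure. The add-alpha smoothing, the function names, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes with add-alpha (Laplace) smoothing."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    # Class priors estimated from label frequencies.
    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c in class_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood


def predict(model, tokens):
    """Return the class with the highest posterior log-score."""
    log_prior, log_likelihood = model

    def score(c):
        # Words outside the training vocabulary are ignored.
        return log_prior[c] + sum(
            log_likelihood[c][w] for w in tokens if w in log_likelihood[c]
        )

    return max(log_prior, key=score)


# Hypothetical toy corpus, already tokenised.
docs = [
    ["cheap", "loan", "offer"],
    ["meeting", "agenda", "notes"],
    ["loan", "approved", "offer"],
    ["project", "meeting", "schedule"],
]
labels = ["spam", "work", "spam", "work"]

model = train_naive_bayes(docs, labels)
prediction_a = predict(model, ["loan", "offer", "today"])
prediction_b = predict(model, ["meeting", "schedule"])
```

Because the model is just a set of counts, new training documents can be folded in by updating the counters, which is the "easy update" property the chapter mentions.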


2018 ◽  
Vol 10 (11) ◽  
pp. 113 ◽  
Author(s):  
Yue Li ◽  
Xutao Wang ◽  
Pengjian Xu

Text classification is important in natural language processing, because massive amounts of valuable text need to be sorted into categories for further use. To classify text better, this paper tries to build a deep learning model that achieves better classification results on Chinese text than other researchers’ models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as the deep learning methods for classifying Chinese text. LSTM is a special kind of recurrent neural network (RNN) that is capable of processing serialized information through its recurrent structure. By contrast, CNN has shown its ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated into our new model, the BLSTM-C model (BLSTM stands for bi-directional long short-term memory, and C stands for CNN). The LSTM layers were responsible for obtaining a sequence output based on past and future contexts, which was then fed to the convolutional layer for feature extraction. In our experiments, the proposed BLSTM-C model was evaluated in several ways. In the results, the model exhibited remarkable performance in text classification, especially on Chinese texts.


Energies ◽  
2020 ◽  
Vol 13 (17) ◽  
pp. 4522
Author(s):  
Kai Chen ◽  
Rabea Jamil Mahfoud ◽  
Yonghui Sun ◽  
Dongliang Nan ◽  
Kaike Wang ◽  
...  

During the operation and maintenance of secondary devices in a smart substation, a wealth of defect texts containing the state information of the equipment is generated. To overcome the low efficiency and low accuracy of manual power text classification and mining, and taking into account the characteristics of power equipment defect texts, a defect text mining method for secondary devices in a smart substation is proposed. It integrates the global vectors for word representation (GloVe) method and the attention-based bidirectional long short-term memory (BiLSTM-Attention) method in one model. First, the characteristics of the defect texts are analyzed, and the texts are preprocessed to improve their quality. Then, the defect texts are segmented into words, and the words are mapped to a high-dimensional feature space with the GloVe model to form distributed word vectors. Finally, a text classification model based on BiLSTM-Attention is proposed to classify the defect texts of a secondary device. Precision, recall, and F1-score are selected as evaluation indicators, and the model is compared with traditional machine learning and deep learning models. A case study shows that the BiLSTM-Attention model performs better and can achieve intelligent, accurate, and efficient classification of secondary device defect texts. It can help operation and maintenance personnel make scientific maintenance decisions on a secondary device and improve the level of intelligent equipment management.
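The evaluation indicators named above can be computed per class in a one-vs-rest fashion. The sketch below shows one common way to do this in plain Python. The defect labels and predictions are invented for illustration, and the abstract does not say how the paper averages scores across classes, so the macro average here is an assumption.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} computed one-vs-rest."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Guard against empty denominators for labels never predicted or seen.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical defect severity labels and model predictions.
y_true = ["minor", "minor", "minor", "major", "major", "critical"]
y_pred = ["minor", "minor", "major", "major", "major", "critical"]

scores = per_class_scores(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

On this toy data, "minor" has perfect precision but misses one of three true cases, while "major" finds every true case but picks up one false positive, so both end with the same F1 despite different error types.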

