An Extended Text Combination Classification Model for Short Video Based on Albert

2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Yi Liu ◽  
Yue Zhang ◽  
Haidong Hu ◽  
Xiaodong Liu ◽  
Lun Zhang ◽  
...  

With the rise and rapid development of short video sharing websites, the number of short videos on the Internet has been growing explosively. Organizing and classifying short videos has become the basis for using them effectively, and it is a problem faced by all major short video platforms. Because short video content categories are complex and the extended text information is rich, this paper uses methods from the text classification field to solve the short video classification problem. Compared with the traditional approach of classifying and understanding short video key frames, this method has lower computational cost, more accurate classification results, and is easier to apply. This paper proposes a text classification model that applies an attention mechanism to embeddings of the multiple extended texts attached to a short video. The experiment first uses the pre-trained language model Albert to extract sentence-level vectors and then uses the attention mechanism to learn a weight factor for each type of extended text in short video classification. This research also applies Google’s unsupervised data augmentation (UDA) method, combining it with a Chinese knowledge graph to implement TF-IDF word replacement. During the training process, we introduced a large amount of unlabeled data, which significantly improved the accuracy of model classification. The final series of experiments compares the proposed method with existing short video title classification methods, classification methods based on video key frames, and hybrid methods, and shows that the proposed method is more accurate and robust on the test set.
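The attention step in this abstract, which weights the sentence-level vectors of the different extended text fields before classification, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy field vectors, the `query` vector, and the function names `softmax` and `attention_fuse` are assumptions made for the example, and in practice the vectors would come from the Albert encoder rather than being written by hand.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(field_vectors, query):
    """Fuse one vector per text field into a single weighted vector.

    field_vectors: list of equal-length lists (hypothetical sentence vectors,
                   e.g. one each for title, description, and comments).
    query: hypothetical learned vector used to score each field.
    """
    # Dot-product score of each field vector against the query.
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in field_vectors]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the field vectors, one dimension at a time.
    fused = [sum(w * vec[d] for w, vec in zip(weights, field_vectors))
             for d in range(dim)]
    return fused, weights

# Toy example with made-up 3-dimensional vectors.
fields = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [2.0, 0.0, 0.0]
fused, weights = attention_fuse(fields, query)
```

Here the first field aligns with the query and therefore receives the largest weight; the fused vector would then be passed to a classifier.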

PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0247984
Author(s):  
Xuyang Wang ◽  
Yixuan Tong

With the rapid development of the mobile internet, people increasingly rely on the internet to comment on products or stores, and text sentiment classification of these comments has become a research hotspot. Among existing methods, applying deep learning to the text classification task is fairly popular. To address information loss, weak context and other problems, this paper improves on the transformer model to reduce the difficulty and time cost of model training and to achieve higher overall recall and accuracy in text sentiment classification. The transformer model replaces the traditional convolutional neural network (CNN) and recurrent neural network (RNN) and is based entirely on the attention mechanism; it therefore improves training speed and reduces training difficulty. This paper selects e-commerce reviews as research objects and applies deep learning theory. First, the text is preprocessed by word vectorization. Then the IN standardized method and the GELUs activation function are applied to the original model to analyze the emotional tendencies of online users towards stores or products. The experimental results show that, compared with BiLSTM, the Naive Bayesian Model, the serial BiLSTM_CNN model and BiLSTM with an attention mechanism, our method improves recall by 9.71%, 6.05%, 5.58% and 5.12%, respectively, and approaches the peak level of the F1 value among the tested models. These results indicate that our method improves text sentiment classification accuracy and can be applied effectively to text classification.
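For reference, the GELU activation mentioned in this abstract can be written in a few lines. The sketch below shows the standard definition and the common tanh approximation; the function names are chosen for this example, and nothing here reflects how the paper's model wires the activation into its layers.

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    """Widely used tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))
```

Unlike ReLU, GELU is smooth around zero and lets small negative inputs pass through with a reduced magnitude instead of clipping them to zero.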


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Wenjing Lu ◽  
Wei Jiang ◽  
Na Zhang ◽  
Feng Xue

Adverse nursing events occur suddenly, unpredictably, or unexpectedly during the course of clinical diagnosis and treatment in hospitals. These events adversely affect the patient’s diagnosis and treatment results and can increase the patient’s pain and burden. They are also highly likely to cause accidents and disputes, affect normal medical work and personnel safety, and hinder the development of the health system. With the rapid development of modern medicine, the health and safety of patients have become a major concern in society, and patient safety is an important part of medical care management. Research and past events have shown that classified management of adverse nursing events, event analysis, and improvement measures help the health system continuously improve the quality of medical care and reduce the occurrence of adverse nursing events. In managing adverse nursing events, it is very important to sort the text reports of these events into different categories and levels. Traditional reports of adverse nursing events are mostly unstructured and simple data, often relying on manual classification, which makes them difficult to analyze. Furthermore, the data is relatively inaccurate and of limited practical reference value. With the development of science and technology, text classification methods based on deep learning are gradually coming into view. In this paper, we extensively evaluate various deep learning-based classification methods designed for healthcare systems. Additionally, we propose a text classification model for adverse nursing events in the health system. Experiments and comparison tests are performed for both the proposed deep learning-based method and existing methods on the text classification of adverse nursing events. The results show the exceptional performance of the proposed mechanism in terms of various evaluation metrics.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Ming Gao ◽  
Weiwei Cai ◽  
Runmin Liu

As a hot research topic, sports video classification has a wide range of applications in switched TV, video on demand, smart TV, and other fields and is closely related to people’s lives. Against this background, it has attracted great interest. However, existing methods usually rely on manual video classification, which is often influenced by the workers themselves. This makes it difficult to ensure accurate results and leads to misclassification. To address these limitations, we introduce neural network technology for the automatic classification of sports videos. This paper proposes a novel attention-based graph convolution-guided third-order hourglass network (AGTH-Net) classification model. First, we design a graph convolution model based on the attention mechanism. The key idea is to use the attention mechanism to allocate the weights of neighborhood nodes, which reduces the impact of erroneous nodes in the neighborhood while avoiding manual weight assignment. Second, given the complex image characteristics of sports videos, we use a third-order hourglass network structure to extract and fuse multiscale features. In addition, residual-intensive modules are introduced inside the hourglass network to transfer and reuse features across different levels of the network. This helps extract features in maximum detail and enhances the expressive ability of the network. Comparison and ablation experiments are also carried out to demonstrate the effectiveness and superiority of the proposed algorithm.


Author(s):  
Yakobus Wiciaputra ◽  
Julio Young ◽  
Andre Rusli

With the large amount of text information circulating on the internet, there is a need for a solution that can help process text data for various purposes. In Indonesia, text information circulating on the internet generally uses two languages, English and Indonesian. This research focuses on building a model that can classify text in more than one language, commonly known as multilingual text classification, and uses the XLM-RoBERTa model to do so. Using the transfer learning concept behind XLM-RoBERTa, this study built a classification model for Indonesian texts using only the English News Dataset for training, reaching a Matthews correlation coefficient (MCC) of 42.2%. The best results were obtained when the large Mixed News Dataset (108,190) was used for training: on the large English News Dataset (37,886), the model reached an MCC of 90.8%, accuracy of 93.3%, precision of 93.4%, recall of 93.3%, and F1 of 93.3%; on the large Indonesian News Dataset (70,304), it reached an MCC of 86.4% and accuracy, precision, recall, and F1 values of 90.2%. Keywords: Multilingual Text Classification, Natural Language Processing, News Dataset, Transfer Learning, XLM-RoBERTa
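Because this abstract reports the Matthews correlation coefficient alongside accuracy, a short sketch of how that metric is computed may help. The function below uses the standard multiclass generalization of MCC; the name `mcc` and the toy labels are assumptions for illustration, and the abstract's percentages would correspond to this value multiplied by 100.

```python
import math
from collections import Counter

def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient, in the range [-1, 1]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                   # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                        # true occurrences per class
    p_counts = Counter(y_pred)                        # predicted occurrences per class
    labels = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    var_true = s * s - sum(t_counts[k] ** 2 for k in labels)
    var_pred = s * s - sum(p_counts[k] ** 2 for k in labels)
    denominator = math.sqrt(var_true * var_pred)
    # By convention, return 0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0

# Toy labels, not data from the study.
perfect = mcc(["news", "sport", "tech"], ["news", "sport", "tech"])
mixed = mcc([1, 1, 1, 0], [1, 0, 1, 1])
```

Unlike accuracy, MCC takes the full confusion matrix into account, so it stays informative when classes are imbalanced.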


2020 ◽  
Vol 309 ◽  
pp. 03015
Author(s):  
Wenbin Liu ◽  
Bojian Wen ◽  
Shang Gao ◽  
Jiesheng Zheng ◽  
Yinlong Zheng

Text classification is a common application in natural language processing. We propose a multi-label text classification model based on ELMo and an attention mechanism. It addresses a difficulty of the sentiment classification task: power supply related text follows no grammar or writing convention, and the sentiment-related information is dispersed throughout the text. First, we use pre-trained word embedding vectors to extract features from Internet text. Second, the extracted deep features are weighted by the attention mechanism. Finally, an improved ELMo model, in which the LSTM module is replaced with a GRU module, is used to represent the text, and the information is classified. The experimental results on Kaggle’s toxic comment classification data set show that the accuracy of sentiment classification is as high as 98%.
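To make the LSTM-to-GRU substitution in this abstract more concrete, the sketch below shows a single GRU update for scalar inputs. It is only an illustration of the standard GRU equations under one common gate convention, not the paper's modified ELMo: the parameter names in `params` and the function name `gru_step` are hypothetical, and a real model would use weight matrices and vectors.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, params):
    """One GRU update for a scalar input x and scalar hidden state h_prev.

    params: dict of hypothetical scalar weights (w_*, u_*) and biases (b_*).
    """
    # Update gate: how much of the candidate state to take.
    z = sigmoid(params["w_z"] * x + params["u_z"] * h_prev + params["b_z"])
    # Reset gate: how much of the previous state feeds the candidate.
    r = sigmoid(params["w_r"] * x + params["u_r"] * h_prev + params["b_r"])
    # Candidate hidden state.
    h_cand = math.tanh(params["w_h"] * x + params["u_h"] * (r * h_prev) + params["b_h"])
    # Blend the previous state and the candidate.
    return (1.0 - z) * h_prev + z * h_cand

# With all parameters zero, both gates equal 0.5 and the candidate is 0.
zero_params = {k: 0.0 for k in
               ["w_z", "u_z", "b_z", "w_r", "u_r", "b_r", "w_h", "u_h", "b_h"]}
h_next = gru_step(1.0, 1.0, zero_params)
```

A GRU has two gates and no separate cell state, so it carries fewer parameters per unit than an LSTM, which has three gates plus a cell state.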


2020 ◽  
Vol 309 ◽  
pp. 03016 ◽  
Author(s):  
Jutao Huang ◽  
Jiesheng Zheng ◽  
Shang Gao ◽  
Wenbin Liu ◽  
Jiaxin Lin

With the rapid development of network technology, the electric power Internet of Things faces a large number of electronic texts and a large volume of distributed data access and analysis requirements. To complete accurate and efficient data analysis and to build, on the existing basis, a data and service standard system covering the entire chain of energy and power business, the system must implement massive electronic text retrieval, information extraction and classification in the power grid system. To this end, a deep neural network (DNN) classification model is constructed to classify the text information of the power grid, and the effectiveness of the method is verified by experiments based on data from the substation information system.


2021 ◽  
Vol 11 (18) ◽  
pp. 8554
Author(s):  
Krzysztof Fiok ◽  
Waldemar Karwowski ◽  
Edgar Gutierrez ◽  
Mohammad Reza Davahli ◽  
Maciej Wilamowski ◽  
...  

The quality of text classification has greatly improved with the introduction of deep learning and, more recently, of models using the attention mechanism. However, to classify text instances that are longer than the length limit adopted by most of the best performing transformer models, the most common method is to naively truncate the text so that it meets the model limit. Researchers have proposed other approaches, but they do not appear to be popular because of their high computational cost and implementation complexity. Recently, another method called Text Guide has been proposed, which allows for text truncation that outperforms the naive approach and is simultaneously less complex and costly than earlier proposed solutions. Our study revisits Text Guide by testing the influence of certain modifications on the method’s performance. We found that some aspects of the method can be altered to further improve performance and confirmed several assumptions regarding the dependence of the method’s quality on certain factors.
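The naive truncation baseline described in this abstract is simple enough to show directly. The sketch below implements it, together with a head-plus-tail variant included only as a point of comparison; neither function is the Text Guide method, whose selection procedure is not described in the abstract, and the function names and the `head_len` parameter are assumptions for this example.

```python
def truncate_head(tokens, max_len):
    """Naive truncation: keep only the first max_len tokens."""
    return tokens[:max_len]

def truncate_head_tail(tokens, max_len, head_len):
    """Illustrative alternative: keep head_len tokens from the start
    and fill the rest of the budget from the end of the text."""
    if len(tokens) <= max_len:
        return list(tokens)
    if not 0 <= head_len <= max_len:
        raise ValueError("head_len must be between 0 and max_len")
    tail_len = max_len - head_len
    tail = tokens[len(tokens) - tail_len:] if tail_len else []
    return tokens[:head_len] + tail

# Toy token ids standing in for a long document.
tokens = list(range(10))
head_only = truncate_head(tokens, 4)
head_and_tail = truncate_head_tail(tokens, 4, 1)
```

Both functions discard everything outside the kept positions, which is the limitation that motivates selecting tokens by importance instead of by position.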

