Rider Weed Deep Residual Network-Based Incremental Model for Text Classification Using Multidimensional Features and MapReduce Framework

Abstract The full text of this preprint has been withdrawn by the authors as it was submitted and made public without the full consent of all the authors. Therefore, the authors do not wish this work to be cited as a reference. Questions should be directed to the corresponding author.

2021 ◽  
Author(s):  
Hemn Barzan Abdalla

Abstract The increasing demand for information and the rapid growth of big data have dramatically increased the volume of textual data, and the variety of data has led to information overload. To obtain useful information from text, text classification is an essential task. This paper develops a technique for text classification in big data using the MapReduce model. The goal is to design a hybrid optimization algorithm for classifying text. Pre-processing is done with stemming and stop word removal. Feature extraction then generates SentiWordNet features, contextual features, and thematic features. The optimal features are selected using Tanimoto similarity, which estimates the similarity between the features and selects the relevant features with higher feature selection accuracy. After that, a deep residual network, trained with the Adam algorithm, is used for dynamic text classification. Dynamic learning is performed with the proposed Rider invasive weed optimization (RIWO)-based deep residual network along with fuzzy theory. The proposed RIWO algorithm combines invasive weed optimization (IWO) and the Rider optimization algorithm (ROA). The whole method runs under the MapReduce framework. The proposed RIWO-based deep residual network outperformed other techniques, with the highest true positive rate (TPR) of 85%, true negative rate (TNR) of 94%, and accuracy of 88.7%.
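The abstract names Tanimoto similarity as the feature selection measure but does not say how it is applied. The following is a minimal sketch, assuming the real-valued form T(a, b) = a·b / (|a|² + |b|² − a·b) and a hypothetical selection rule that keeps the k feature columns most similar to a target vector. The names `tanimoto` and `select_features` are illustrative and do not come from the paper.

```python
from typing import List, Sequence


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Tanimoto similarity for real-valued vectors: a.b / (|a|^2 + |b|^2 - a.b)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # The denominator is zero only when both vectors are all zeros.
    return dot / denom if denom else 0.0


def select_features(
    columns: List[Sequence[float]], target: Sequence[float], k: int
) -> List[int]:
    """Return indices of the k columns most similar to the target.

    Assumption: ranking each feature against a target vector is a
    hypothetical selection rule, not the paper's stated procedure.
    """
    scores = [tanimoto(col, target) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    features = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
    labels = [1, 0, 1]
    print(select_features(features, labels, k=2))
```

In this toy input the first column matches the target exactly (similarity 1.0) and the second shares nothing with it (similarity 0.0), so the first and third columns are kept.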


2020 ◽  
pp. 1-11
Author(s):  
Dawei Yu ◽  
Jie Yang ◽  
Yun Zhang ◽  
Shujuan Yu

The Densely Connected Network (DenseNet) is widely recognized as a highly competitive deep neural network architecture. Its most notable property is Dense Connections, which form each layer’s input by concatenating all the preceding layers’ outputs and thus improve performance by encouraging feature reuse to the extreme. However, Dense Connections also cause the input dimension to grow with depth, making DenseNet resource-intensive and inefficient. In light of this, and inspired by the Residual Network (ResNet), we propose an improved DenseNet named Additive DenseNet, which replaces the concatenation operations used in Dense Connections with the addition operations used in ResNet. In terms of feature reuse, it upgrades addition to accumulation (namely ∑(·)), so that each layer’s input is the summation of all the preceding layers’ outputs. Consequently, Additive DenseNet both keeps the input dimension from growing and retains the effect of Dense Connections. In this paper, Additive DenseNet is applied to the text classification task. The experimental results show that, compared to DenseNet, Additive DenseNet reduces model complexity by a large margin, as measured by GPU memory usage and number of parameters. Despite its resource economy, Additive DenseNet still outperforms DenseNet on 6 text classification datasets in terms of accuracy and shows competitive performance in model training.
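The difference between concatenating and accumulating preceding outputs can be illustrated with a small sketch. It is not the authors' implementation: `make_toy_layer` is a hypothetical stand-in for a real convolutional layer, and treating the network input as the first "preceding output" is an assumption. The sketch only tracks how wide each layer's input becomes under the two schemes.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_toy_layer(width: int) -> Layer:
    """Hypothetical stand-in for a real layer: maps any input to `width` values."""
    def layer(v: Vector) -> Vector:
        mean = sum(v) / len(v)
        return [mean * (j + 1) for j in range(width)]
    return layer


def concat_forward(x: Vector, layers: List[Layer]) -> Tuple[Vector, List[int]]:
    """DenseNet-style: each layer sees the concatenation of all earlier outputs."""
    outputs, input_widths = [x], []
    for layer in layers:
        layer_input = [v for out in outputs for v in out]
        input_widths.append(len(layer_input))
        outputs.append(layer(layer_input))
    return outputs[-1], input_widths


def additive_forward(x: Vector, layers: List[Layer]) -> Tuple[Vector, List[int]]:
    """Additive style: each layer sees the element-wise sum of all earlier outputs."""
    outputs, input_widths = [x], []
    for layer in layers:
        layer_input = [sum(vals) for vals in zip(*outputs)]
        input_widths.append(len(layer_input))
        outputs.append(layer(layer_input))
    return outputs[-1], input_widths


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    layers = [make_toy_layer(4) for _ in range(3)]
    print(concat_forward(x, layers)[1])    # input width grows with depth
    print(additive_forward(x, layers)[1])  # input width stays constant
```

With a 4-wide input and three 4-wide layers, the concatenating variant feeds its layers inputs of width 4, 8, and 12, while the additive variant keeps every input at width 4. This is the dimension growth that the abstract says Additive DenseNet avoids.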


Author(s):  
Zhichao Li ◽  
Helen Gurgel ◽  
Nadine Dessay ◽  
Luojia Hu ◽  
Lei Xu ◽  
...  

In recent years there has been increasing use of satellite Earth observation (EO) data in dengue research, in particular for identifying landscape factors affecting dengue transmission. Summarizing landscape factors and satellite EO data sources, and making the information public, helps guide future research and improve health decision-making. A literature review is an appropriate tool for this purpose, but it is not easy to use: the review process mainly includes defining the topic, searching, screening at both title/abstract and full-text levels, and data extraction, which requires consistent expert knowledge and is time-consuming and labor-intensive. In this context, this study integrates the review process, text scoring, an active learning (AL) mechanism, and bidirectional long short-term memory (BiLSTM) networks, and proposes a semi-supervised text classification framework that enables efficient and accurate selection of relevant articles. Specifically, text scoring and BiLSTM-based active learning were used to replace the title/abstract screening and full-text screening, respectively, which greatly reduces the human workload. In this study, 101 relevant articles were selected from 4 bibliographic databases, and a catalogue of essential dengue landscape factors was identified and divided into four categories: land use (LU), land cover (LC), topography, and continuous land surface features. Moreover, various satellite EO sensors and products used for identifying landscape factors were tabulated. Finally, possible future directions for applying satellite EO data in dengue research were proposed in terms of landscape patterns, satellite sensors, and deep learning. The proposed semi-supervised text classification framework was successfully applied to research evidence synthesis and could easily be applied to other topics, particularly in an interdisciplinary context.
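The abstract does not state which query strategy the active learning loop uses. As a rough illustration only, the sketch below assumes uncertainty sampling, a common choice in which the articles whose predicted relevance is closest to 0.5 are sent to a human reviewer next. The function `pick_uncertain` and the probability values are hypothetical; in the study such scores would come from the BiLSTM classifier.

```python
from typing import Dict, List


def pick_uncertain(probabilities: Dict[str, float], batch_size: int) -> List[str]:
    """Return the article ids whose predicted relevance is closest to 0.5.

    Assumption: uncertainty sampling stands in for the unspecified
    query strategy of the framework described above.
    """
    ranked = sorted(
        probabilities, key=lambda doc_id: abs(probabilities[doc_id] - 0.5)
    )
    return ranked[:batch_size]


if __name__ == "__main__":
    # Hypothetical classifier outputs for four unlabeled articles.
    scores = {"a": 0.95, "b": 0.52, "c": 0.10, "d": 0.45}
    print(pick_uncertain(scores, batch_size=2))
```

Articles the model is already confident about (here "a" and "c") are skipped, so reviewer effort goes to the borderline cases, which is one way such a framework can reduce the human workload.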


2020 ◽  
Vol 29 (5) ◽  
pp. 880-886
Author(s):  
Chuanhua Zhou ◽  
Jiayi Zhou ◽  
Cai Yu ◽  
Wei Zhao ◽  
Ruilin Pan

Author(s):  
Carlos Adriano Gonçalves ◽  
Eva Lorenzo Iglesias ◽  
Lourdes Borrajo ◽  
Rui Camacho ◽  
Adrián Seara Vieira ◽  
...  

2021 ◽  
Author(s):  
Huihui Xu ◽  
Jaromir Savelka ◽  
Kevin D. Ashley

In this paper, we treat sentence annotation as a classification task. We employ sequence-to-sequence models to take sentence position information into account in identifying case law sentences as issues, conclusions, or reasons. We also compare the legal domain specific sentence embedding with other general purpose sentence embeddings to gauge the effect of legal domain knowledge, captured during pre-training, on text classification. We deployed the models on both summaries and full-text decisions. We found that the sentence position information is especially useful for full-text sentence classification. We also verified that legal domain specific sentence embeddings perform better, and that meta-sentence embedding can further enhance performance when sentence position information is included.


1998 ◽  
pp. 46-52
Author(s):  
S. V. Rabotkina

The Bible occupied a central place in the spiritual life of the medieval people of Rus, although for a long time Kievan Rus did not have it in full. The full text of the Holy Scriptures appeared in Church Slavonic no earlier than 1499.

