An Improved Bees Algorithm for Training Deep Recurrent Networks for Sentiment Classification

Recurrent neural networks (RNNs) are powerful tools for learning information from temporal sequences. Designing an optimum deep RNN is difficult due to configuration and training issues, such as vanishing and exploding gradients. In this paper, a novel metaheuristic optimisation approach is proposed for training deep RNNs for the sentiment classification task. The approach employs an enhanced Ternary Bees Algorithm (BA-3+), which operates for large dataset classification problems by considering only three individual solutions in each iteration. BA-3+ combines the collaborative search of three bees to find the optimal set of trainable parameters of the proposed deep recurrent learning architecture. Local learning with exploitative search utilises the greedy selection strategy. Stochastic gradient descent (SGD) learning with singular value decomposition (SVD) aims to handle vanishing and exploding gradients of the decision parameters with the stabilisation strategy of SVD. Global learning with explorative search achieves faster convergence without getting trapped at local optima to find the optimal set of trainable parameters of the proposed deep recurrent learning architecture. BA-3+ has been tested on the sentiment classification task to classify symmetric and asymmetric distribution of the datasets from different domains, including Twitter, product reviews, and movie reviews. Comparative results have been obtained for advanced deep language models and Differential Evolution (DE) and Particle Swarm Optimization (PSO) algorithms. BA-3+ converged to the global minimum faster than the DE and PSO algorithms, and it outperformed the SGD, DE, and PSO algorithms for the Turkish and English datasets. The accuracy value and F1 measure have improved at least with a 30–40% improvement than the standard SGD algorithm for all classification datasets. Accuracy rates in the RNN model trained with BA-3+ ranged from 80% to 90%, while the RNN trained with SGD was able to achieve between 50% and 60% for most datasets. The performance of the RNN model with BA-3+ has as good as for Tree-LSTMs and Recursive Neural Tensor Networks (RNTNs) language models, which achieved accuracy results of up to 90% for some datasets. The improved accuracy and convergence results show that BA-3+ is an efficient, stable algorithm for the complex classification task, and it can handle the vanishing and exploding gradients problem of deep RNNs.

Download Full-text

Automated Source Code Generation and Auto-Completion Using Deep Learning: Comparing and Discussing Current Language Model-Related Approaches

AI ◽

10.3390/ai2010001 ◽

2021 ◽

Vol 2 (1) ◽

pp. 1-16

Author(s):

Juan Cruz-Benito ◽

Sanjay Vishwakarma ◽

Francisco Martin-Fernandez ◽

Ismael Faro

Keyword(s):

Deep Learning ◽

Learning Community ◽

Programming Languages ◽

Language Processing ◽

Code Generation ◽

Language Model ◽

Language Models ◽

Stochastic Gradient Descent ◽

Network Architectures ◽

Learning Architectures

In recent years, the use of deep learning in language models has gained much attention. Some research projects claim that they can generate text that can be interpreted as human writing, enabling new possibilities in many application areas. Among the different areas related to language processing, one of the most notable in applying this type of modeling is programming languages. For years, the machine learning community has been researching this software engineering area, pursuing goals like applying different approaches to auto-complete, generate, fix, or evaluate code programmed by humans. Considering the increasing popularity of the deep learning-enabled language models approach, we found a lack of empirical papers that compare different deep learning architectures to create and use language models based on programming code. This paper compares different neural network architectures like Average Stochastic Gradient Descent (ASGD) Weight-Dropped LSTMs (AWD-LSTMs), AWD-Quasi-Recurrent Neural Networks (QRNNs), and Transformer while using transfer learning and different forms of tokenization to see how they behave in building language models using a Python dataset for code generation and filling mask tasks. Considering the results, we discuss each approach’s different strengths and weaknesses and what gaps we found to evaluate the language models or to apply them in a real programming context.

Download Full-text

Pre-Trained Transformer-Based Language Models for Sundanese

10.21203/rs.3.rs-907893/v1 ◽

2021 ◽

Author(s):

Wilson Wongso ◽

Henry Lucky ◽

Derwin Suhartono

Keyword(s):

Natural Language ◽

Text Classification ◽

Training Data ◽

Language Models ◽

Classification Task ◽

Language Understanding ◽

Training Corpus ◽

Low Resource ◽

Corpus Size ◽

Fine Tune

Abstract The Sundanese language has over 32 million speakers worldwide, but the language has reaped little to no benefits from the recent advances in natural language understanding. Like other low-resource languages, the only alternative is to fine-tune existing multilingual models. In this paper, we pre-trained three monolingual Transformer-based language models on Sundanese data. When evaluated on a downstream text classification task, we found that most of our monolingual models outperformed larger multilingual models despite the smaller overall pre-training data. In the subsequent analyses, our models benefited strongly from the Sundanese pre-training corpus size and do not exhibit socially biased behavior. We released our models for other researchers and practitioners to use.

Download Full-text

Application and Need-Based Architecture Design of Deep Neural Networks

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800142052014x ◽

2020 ◽

Vol 34 (13) ◽

pp. 2052014 ◽

Cited By ~ 1

Author(s):

Soniya ◽

Sandeep Paul ◽

Lotika Singh

Keyword(s):

Genetic Algorithm ◽

Network Structure ◽

Gradient Descent ◽

Stochastic Gradient Descent ◽

Number Of Layers ◽

Effective Manner ◽

Compact Genetic Algorithm ◽

Benchmark Datasets ◽

The Cost ◽

Optimal Set

This paper applies a hybrid evolutionary approach to a convolutional neural network (CNN) and determines the number of layers and filters based on the application and user need. It integrates compact genetic algorithm with stochastic gradient descent (SGD) for simultaneously evolving structure and parameters of the CNN. It defines an effectual string representation for combining structure and parameters of the CNN. The compact genetic algorithm helps in the evolution of network structure by optimizing the number of convolutional layers and number of filters in each convolutional layer. At the same time, an optimal set of weight parameters of the network is obtained using the SGD law. This approach amalgamates exploration in network space by compact genetic algorithm and exploitation in weight space with SGD in an effective manner. The proposed approach also incorporates user-defined parameters in the cost function in an elegant manner which controls the network structure and hence the performance of the network based on the users need. The effectiveness of the proposed approach has been demonstrated on four benchmark datasets, namely MNIST, COIL-100, CIFAR-10 and CIFAR-100. The obtained results clearly demonstrate the potential of the proposed approach by evolving architectures based on the nature of the application and the need of the user.

Download Full-text

Stability Issues with Classifier Using Lukasiewicz Similarity and Modified Schweizer & Sklar Equations

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2005.p0514 ◽

2005 ◽

Vol 9 (5) ◽

pp. 514-525 ◽

Cited By ~ 2

Author(s):

Pasi Luukka ◽

◽

Jouni Sampo

Keyword(s):

The Other ◽

Classification Task ◽

Classification Problems ◽

Generalized Mean ◽

The Stability ◽

Stability Issues ◽

Made In

In this article we have tested the stability of a classifier based on Lukasiewicz similarity in the generalized Lukasiewicz structure. We have also tested Schweizer & Sklar's implications with an extension to generalized mean to classification task. We will show that classification results are not so sensitive to p values with Schweizer & Sklar's measures, which indicates a generalized form of equations. In this article we have also tested the stability of these measures. Two different tests for stability were made: In one test the stability was checked with respect to weight parameters and the other test was carried out for idealvectors. The tests were done with five different classification problems.

Download Full-text

Research on Multi-Channel Semantic Fusion Classification Model

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2019.p1044 ◽

2019 ◽

Vol 23 (6) ◽

pp. 1044-1051

Author(s):

Di Yang ◽

◽

Ningjia Qiu ◽

Lin Cong ◽

Huamin Yang

Keyword(s):

Adaptive Learning ◽

Sentiment Classification ◽

Classification Model ◽

Classification Task ◽

Model Parameters ◽

Semantic Features ◽

Gradient Descent Algorithm ◽

Text Word ◽

High Level ◽

Rate Gradient

In this work, we propose a multi-channel semantic fusion convolutional neural network (SFCNN) to solve the problem of emotional ambiguity caused by the change of contextual order in sentiment classification task. Firstly, the emotional tendency weights are evaluated on the text word vector through the improved emotional tendency attention mechanism. Secondly, the multi-channel semantic fusion layer is leveraged to combine deep semantic fusion of sentences with contextual order to generate deep semantic vectors, which are learned by CNN to extract high-level semantic features. Finally, the improved adaptive learning rate gradient descent algorithm is employed to optimize the model parameters, and completes the sentiment classification task. Three datasets are used to evaluate the effectiveness of the proposed algorithm. The experimental results show that the SFCNN model has the high steady-state precision and generalization performance.

Download Full-text

Sentiment classification for employees reviews using regression vector- stochastic gradient descent classifier (RV-SGDC)

PeerJ Computer Science ◽

10.7717/peerj-cs.712 ◽

2021 ◽

Vol 7 ◽

pp. e712

Author(s):

Babacar Gaye ◽

Dezheng Zhang ◽

Aziguli Wulamu

Keyword(s):

Machine Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Sentiment Classification ◽

Majority Voting ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Hybrid Architecture ◽

Accuracy Score ◽

Learning Models

The satisfaction of employees is very important for any organization to make sufficient progress in production and to achieve its goals. Organizations try to keep their employees satisfied by making their policies according to employees’ demands which help to create a good environment for the collective. For this reason, it is beneficial for organizations to perform staff satisfaction surveys to be analyzed, allowing them to gauge the levels of satisfaction among employees. Sentiment analysis is an approach that can assist in this regard as it categorizes sentiments of reviews into positive and negative results. In this study, we perform experiments for the world’s big six companies and classify their employees’ reviews based on their sentiments. For this, we proposed an approach using lexicon-based and machine learning based techniques. Firstly, we extracted the sentiments of employees from text reviews and labeled the dataset as positive and negative using TextBlob. Then we proposed a hybrid/voting model named Regression Vector-Stochastic Gradient Descent Classifier (RV-SGDC) for sentiment classification. RV-SGDC is a combination of logistic regression, support vector machines, and stochastic gradient descent. We combined these models under a majority voting criteria. We also used other machine learning models in the performance comparison of RV-SGDC. Further, three feature extraction techniques: term frequency-inverse document frequency (TF-IDF), bag of words, and global vectors are used to train learning models. We evaluated the performance of all models in terms of accuracy, precision, recall, and F1 score. The results revealed that RV-SGDC outperforms with a 0.97 accuracy score using the TF-IDF feature due to its hybrid architecture.

Download Full-text

Submodular Batch Selection for Training Deep Neural Networks

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/372 ◽

2019 ◽

Cited By ~ 1

Author(s):

K J Joseph ◽

Vamshi Teja R ◽

Krishnakant Singh ◽

Vineeth N Balasubramanian

Keyword(s):

Gradient Descent ◽

Optimization Problem ◽

Sampling Strategy ◽

Stochastic Gradient Descent ◽

Selection Strategy ◽

Network Architectures ◽

Distance Metrics ◽

Learning Rates ◽

Batch Sizes ◽

Selection For

Mini-batch gradient descent based methods are the de facto algorithms for training neural network architectures today.We introduce a mini-batch selection strategy based on submodular function maximization. Our novel submodular formulation captures the informativeness of each sample and diversity of the whole subset. We design an efficient, greedy algorithm which can give high-quality solutions to this NP-hard combinatorial optimization problem. Our extensive experiments on standard datasets show that the deep models trained using the proposed batch selection strategy provide better generalization than Stochastic Gradient Descent as well as a popular baseline sampling strategy across different learning rates, batch sizes, and distance metrics.

Download Full-text

Comparative Analysis of Fine-tuned Deep Learning Language Models for ICD-10 classification task for Bulgarian Language

10.26615/978-954-452-072-4_162 ◽

2021 ◽

Author(s):

Boris Velichkov ◽

◽

Sylvia Vassileva ◽

Simeon Gerginov ◽

Boris Kraychev ◽

...

Keyword(s):

Deep Learning ◽

Comparative Analysis ◽

Language Models ◽

Classification Task ◽

Icd 10 ◽

Learning Language

Download Full-text

Imbalanced sentiment classification based on sequence generative adversarial nets

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201370 ◽

2020 ◽

Vol 39 (5) ◽

pp. 7909-7919

Author(s):

Chuantao Wang ◽

Xuexin Yang ◽

Linkai Ding

Keyword(s):

Deep Learning ◽

Online Reviews ◽

Classification Performance ◽

Sentiment Classification ◽

Classification Task ◽

Algorithm Optimization ◽

Minority Class ◽

Sample Distribution ◽

Practical Applications ◽

Deep Model

The purpose of sentiment classification is to solve the problem of automatic judgment of sentiment tendency. In the sentiment classification task of text data (such as online reviews), the traditional deep learning model focuses on algorithm optimization, but ignores the characteristics of the imbalanced distribution of the number of samples in each classification, which will cause the classification performance of the model to decrease in practical applications. In this paper, the experiment is divided into two stages. In the first stage, samples of minority class in the sample distribution are used to train a sequence generative adversarial nets, so that the sequence generative adversarial nets can learn the features of the samples of minority class in depth. In the second stage, the trained generator of sequence generative adversarial nets is used to generate false samples of minority class and mix them with the original samples to balance the sample distribution. After that, the mixed samples are input into the sentiment classification deep model to complete the model training. Experimental results show that the model has excellent classification performance in comparing a variety of deep learning models based on classic imbalanced learning methods in the sentiment classification task of hotel reviews.

Download Full-text

Mixture of Language Models Utilization in Score-Based Sentiment Classification on Clinical Narratives

Trends in Applied Knowledge-Based Systems and Data Science - Lecture Notes in Computer Science ◽

10.1007/978-3-319-42007-3_22 ◽

2016 ◽

pp. 255-268 ◽

Cited By ~ 3

Author(s):

Tran-Thai Dang ◽

Tu-Bao Ho

Keyword(s):

Sentiment Classification ◽

Language Models

Download Full-text