Keras Model for Text Classification in Amazon Review Dataset using LSTM

Author(s):  
Thivaharan S ◽  
Srivatsun G

With the use of e-commerce, Industry 4.0 is being used effectively in online product-based commercial transactions. This article attempts to extract positive and negative sentiments from Amazon review datasets. This gives the purchaser the upper hand in deciding on a particular product without relying on the manual rating given in the reviews. Even when the number of words in an inherently positive review exceeds by one, present classifiers misclassify it under the negative category. This article addresses this issue with an LSTM (Long Short-Term Memory) model, whose progression is based on a feedback mechanism, unlike other classifiers, which depend on a feed-forward mechanism. To achieve better classification accuracy, the dataset is first preprocessed, yielding a total of 100239 short and 411313 long reviews. With an appropriate number of epochs, the proposed model is observed to classify with 89% accuracy while maintaining no bias between the train and test datasets. The entire model is deployed on the TensorFlow 2.1.0 platform using the Keras framework and Python 3.6.0.
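The contrast between a feedback-based model and an order-blind feature can be illustrated with a minimal sketch. This is not the authors' model: it is a single-unit LSTM cell in plain Python, with made-up weights (`TOY_WEIGHTS`) and hypothetical scalar "sentiment scores" as inputs. In Keras, the multi-unit counterpart is the `tf.keras.layers.LSTM` layer.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell; w maps gate name -> (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, b = w[gate]
        # The previous hidden state h_prev feeds back into every gate.
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the new candidate to admit
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate cell content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def run_lstm(sequence, w):
    """Feed a sequence through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, w)
    return h


# Assumption: arbitrary illustrative weights, not trained values.
TOY_WEIGHTS = {gate: (1.0, 1.0, 0.0)
               for gate in ("forget", "input", "output", "candidate")}

# Hypothetical per-word scores for two reviews with the same words in a different order.
positive_then_negative = [1.0, -1.0]
negative_then_positive = [-1.0, 1.0]

h_a = run_lstm(positive_then_negative, TOY_WEIGHTS)
h_b = run_lstm(negative_then_positive, TOY_WEIGHTS)

# An order-blind sum (a stand-in for a simple bag-of-words feature) cannot tell them apart.
order_blind_a = sum(positive_then_negative)
order_blind_b = sum(negative_then_positive)
```

The two sums are identical, while the final hidden states `h_a` and `h_b` differ, because each step of the recurrent cell depends on the state left by the previous words.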

Author(s):  
Azim Heydari ◽  
Meysam Majidi Nezhad ◽  
Davide Astiaso Garcia ◽  
Farshid Keynia ◽  
Livio De Santoli

Air pollution monitoring is constantly increasing, with more and more attention given to its consequences for human health. Since nitrogen dioxide (NO2) and sulfur dioxide (SO2) are the major pollutants, various models have been developed to predict their potential damages. Nevertheless, providing precise predictions is almost impossible. In this study, a new hybrid intelligent model based on long short-term memory (LSTM) and the multi-verse optimization algorithm (MVO) has been developed to predict and analyze the air pollution produced by Combined Cycle Power Plants. In the proposed model, the LSTM model is the forecaster engine that predicts the amount of NO2 and SO2 produced by the Combined Cycle Power Plant, while the MVO algorithm is used to optimize the LSTM parameters in order to achieve a lower forecasting error. To evaluate its performance, the proposed model has been applied to real data from a Combined Cycle Power Plant in Kerman, Iran. The datasets include wind speed, air temperature, NO2, and SO2 for five months (May–September 2019) with a time step of 3 h. In addition, the model has been tested with two different types of input parameters: type (1) includes wind speed, air temperature, and different lagged values of the output variables (NO2 and SO2); type (2) includes just lagged values of the output variables (NO2 and SO2). The obtained results show that the proposed model has higher accuracy than other combined forecasting benchmark models (ENN-PSO, ENN-MVO, and LSTM-PSO) considering different network input variables.
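The two input types described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `make_lagged_samples`, the lag choices, and the toy readings are hypothetical, and aligning the weather readings with the forecast time step is an assumption, since the abstract does not say how they are aligned.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def make_lagged_samples(
    target: Sequence[float],
    lags: Sequence[int],
    exogenous: Optional[Dict[str, Sequence[float]]] = None,
) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, outputs) pairs for one-step-ahead forecasting."""
    exogenous = exogenous or {}
    max_lag = max(lags)
    inputs, outputs = [], []
    for t in range(max_lag, len(target)):
        # Lagged values of the output variable (used by both input types).
        row = [target[t - lag] for lag in lags]
        # Type (1) only: append the weather readings at time t (assumed alignment).
        row += [series[t] for series in exogenous.values()]
        inputs.append(row)
        outputs.append(target[t])
    return inputs, outputs


# Made-up 3-hourly readings, for illustration only.
no2 = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
wind_speed = [2.0, 2.5, 3.0, 2.0, 1.5, 1.0]
air_temperature = [20.0, 21.0, 23.0, 25.0, 24.0, 22.0]

# Type (1): weather variables plus lagged NO2.
x_type1, y_type1 = make_lagged_samples(
    no2, lags=[1, 2],
    exogenous={"wind_speed": wind_speed, "air_temperature": air_temperature},
)

# Type (2): lagged NO2 only.
x_type2, y_type2 = make_lagged_samples(no2, lags=[1, 2])
```

Each row of `x_type1` or `x_type2` would then be one input sample for the forecaster, with the matching entry of `y_type1` or `y_type2` as the value to predict.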


2021 ◽  
pp. 1-10
Author(s):  
Hye-Jeong Song ◽  
Tak-Sung Heo ◽  
Jong-Dae Kim ◽  
Chan-Young Park ◽  
Yu-Seop Kim

Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. Given two sentences, an accurate judgment should be made as to whether their meanings are equivalent even if their words and contexts differ. To this end, existing studies have measured the similarity of sentences by focusing on the analysis of words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec, a sentence embedding, as well as morpheme word embedding. Vectors representing words are input to a one-dimensional convolutional neural network (1D-CNN) with kernels of various sizes and to a bidirectional long short-term memory (Bi-LSTM). Self-attention is applied to the features transformed through the Bi-LSTM. Subsequently, the vectors produced by the 1D-CNN and by self-attention are converted through global max pooling and global average pooling, respectively, to extract specific values. The vectors generated through the above process are concatenated with the vector generated through Sent2Vec and are represented as a single vector. This vector is input to a softmax layer, and finally, the similarity between the two sentences is determined. The proposed model can improve accuracy by up to 5.42 percentage points compared with conventional sentence similarity estimation models.
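The pooling and concatenation steps can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the feature values are made up, the pairing of global max pooling with the 1D-CNN branch and global average pooling with the self-attention branch is one reading of the abstract, and the dense weights that would turn the combined vector into logits are omitted.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = time steps, columns = feature channels


def global_max_pool(features: Matrix) -> List[float]:
    """Keep the largest value of each channel across all time steps."""
    return [max(channel) for channel in zip(*features)]


def global_avg_pool(features: Matrix) -> List[float]:
    """Average each channel across all time steps."""
    return [sum(channel) / len(channel) for channel in zip(*features)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up branch outputs, for illustration only.
cnn_features = [[0.1, 0.9], [0.4, 0.2], [0.3, 0.5]]   # 3 steps x 2 channels
attention_features = [[1.0, 2.0], [3.0, 4.0]]          # 2 steps x 2 channels
sent2vec_vector = [0.5, -0.5]                          # hypothetical sentence embedding

# Pool each branch to a fixed-size vector, then concatenate with the Sent2Vec vector.
combined = (
    global_max_pool(cnn_features)
    + global_avg_pool(attention_features)
    + sent2vec_vector
)

# Placeholder logits; in a real model they would come from a trained layer applied to `combined`.
probabilities = softmax([0.0, 0.0])
```

Pooling removes the dependence on sentence length, so branches with different numbers of time steps can be joined into one fixed-size vector before classification.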


2021 ◽  
Vol 35 (4) ◽  
pp. 1167-1181
Author(s):  
Yun Bai ◽  
Nejc Bezak ◽  
Bo Zeng ◽  
Chuan Li ◽  
Klaudija Sapač ◽  
...  
