An Improved Cross-Domain Sentiment Analysis Based on a Semi-Supervised Convolutional Neural Network

2022 ◽  
pp. 155-170
Author(s):  
Lap-Kei Lee ◽  
Kwok Tai Chui ◽  
Jingjing Wang ◽  
Yin-Chun Fung ◽  
Zhanhui Tan

Our dependence on the Internet in daily life is ever-growing, which provides an opportunity to discover valuable subjective information using advanced techniques such as natural language processing and artificial intelligence. This chapter focuses on a convolutional neural network for three-class (positive, neutral, and negative) cross-domain sentiment analysis. The model is enhanced in two ways. First, a similarity label method facilitates the management between the source and target domains to generate more labelled data. Second, term frequency-inverse document frequency (TF-IDF) and latent semantic indexing (LSI) are employed to compute the similarity between the source and target domains. Performance is evaluated on three datasets: beauty reviews, toy reviews, and phone reviews. The proposed method improves accuracy by 4.3-7.6% and reduces training time by 50%. The limitations of the work are discussed and serve as the rationale for future research directions.
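
The similarity computation mentioned in this abstract can be illustrated with a minimal sketch. It covers only the TF-IDF part, followed by cosine similarity; the LSI projection is omitted, and the smoothed IDF formula, the whitespace tokenization, and the three toy reviews are assumptions made for illustration, not details taken from the chapter.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF (an assumed variant) keeps weights positive for common terms.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy reviews: one from a source domain, two from target domains.
source = "this lipstick has a great colour and lasts all day".split()
target_a = "great phone and the battery lasts all day".split()
target_b = "the toy broke after one hour".split()

vec_source, vec_a, vec_b = tfidf_vectors([source, target_a, target_b])
sim_a = cosine(vec_source, vec_a)  # shares several terms with the source review
sim_b = cosine(vec_source, vec_b)  # shares no terms with the source review
```

In this toy setting the first target review scores higher than the second, which shares no vocabulary with the source review. A similarity-based labelling scheme could use such scores to decide which target reviews resemble labelled source reviews, although how the chapter does this is not stated in the abstract.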

2020 ◽  
Author(s):  
Monalisha Ghosh ◽  
Goutam Sanyal

Sentiment analysis has recently been considered one of the most active research fields in the natural language processing (NLP) domain. Deep learning is a subset of the large family of machine learning methods and is a growing trend due to its automatic learning capability, with impressive results across different NLP tasks. Hence, a fusion-based machine learning framework is proposed that merges a traditional machine learning method with deep learning techniques to tackle the challenge of sentiment prediction for a massive amount of unstructured review data. The proposed architecture uses a convolutional neural network (CNN) trained with backpropagation to extract embedded feature vectors from the top hidden layer. These vectors are then augmented with an optimized feature set generated by the binary particle swarm optimization (BPSO) method. Finally, a traditional SVM classifier is trained on this extended feature set to determine the optimal hyperplane for separating the two classes of reviews. The work is evaluated on two benchmark movie review datasets, IMDB and SST2. Experimental results with comparative studies based on accuracy and F-score are reported to highlight the benefits of the developed framework.
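
The BPSO step named in this abstract can be sketched as follows. This is a generic binary particle swarm optimizer with a sigmoid transfer function, not the authors' implementation: the coefficient values, the velocity clamp, the function name `bpso_select`, and the toy fitness function are all assumptions made for illustration. In the paper's pipeline the fitness would presumably reflect classification quality, and the selected features would be concatenated with the CNN feature vectors before training the SVM.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=10, n_iters=30, seed=0):
    """Binary PSO: search for a 0/1 feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients (assumed values)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best masks
    best_fit = [fitness(p) for p in pos]      # personal best fitness values
    g = max(range(n_particles), key=lambda i: best_fit[i])
    gbest, gbest_fit = best_pos[g][:], best_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid below does not saturate.
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))
                # The sigmoid turns the velocity into the probability of the bit being 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > best_fit[i]:
                best_fit[i], best_pos[i] = fit, pos[i][:]
                if fit > gbest_fit:
                    gbest_fit, gbest = fit, pos[i][:]
    return gbest, gbest_fit


# Hypothetical per-feature usefulness scores; each selected feature costs 0.3.
scores = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02]


def toy_fitness(mask):
    return sum(s * m for s, m in zip(scores, mask)) - 0.3 * sum(mask)


mask, value = bpso_select(toy_fitness, len(scores))
```

The returned `mask` marks which features to keep. The search is stochastic, so the result depends on the seed and is not guaranteed to be the global optimum.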



Author(s):  
Hema Krishnan ◽  
M. Sudheep Elayidom ◽  
T. Santhanakrishnan

Gathering and analyzing people’s reactions to product trading, public services, and the like is crucial. Sentiment analysis (also termed opinion mining) is a natural language processing task that aims to discover the sentiments behind opinions in texts on varying subjects. This research work adopts a novel sentiment analysis approach comprising six phases: (i) pre-processing, (ii) keyword extraction and sentiment categorization, (iii) semantic word extraction, (iv) semantic similarity checking, (v) feature extraction, and (vi) classification. The tweets, stored in MongoDB, first undergo pre-processing with stop word removal, stemming, and blank space removal. After the sentiment of the extracted keywords is categorized, the existing semantic words are derived for them. Additionally, the semantic similarity score is evaluated along with the keywords. The subsequent step is feature extraction, where holoentropy features such as cross holoentropy and joint holoentropy are formulated. The major contribution here is the extraction of weighted holoentropy features, in which a weight is multiplied with the holoentropy features. Moreover, to enhance classification performance, the constant term used in evaluating the weight function is optimized. For this tuning, a new, improved algorithm termed Self Adaptive Moth Flame Optimization (SA-MFO) is introduced, which is an adaptive version of the MFO algorithm. For classification, the paper uses a Deep Convolutional Neural Network (DCNN), whose batch size is tuned using the same SA-MFO algorithm. Finally, the performance of the proposed work is compared with other conventional models with respect to different performance measures.


2021 ◽  
Vol 10 (4) ◽  
pp. 1-12
Author(s):  
Abhijit Bera ◽  
Mrinal Kanti Ghose ◽  
Dibyendu Kumar Pal

Multilingual sentiment analysis plays an important role in a country like India, where many languages are spoken and the style of expression varies between them. Indian people speak 22 different languages in total, and with the help of the Google Indic keyboard they can express their sentiments, i.e., reviews about anything, on social media in their native language from their own smartphones. It has been found that the machine learning approach has overcome the limitations of other approaches. In this paper, a detailed study is carried out based on Natural Language Processing (NLP) using a Simple Neural Network (SNN), a Convolutional Neural Network (CNN), and a Long Short-Term Memory (LSTM) neural network, followed by another amalgamated model that adds a CNN layer on top of the LSTM, without worrying about the versatility of multilingualism. Around 4000 review samples in English, Hindi, and Bengali are used to generate and analyze outputs for the above models. The experimental results on these realistic reviews are found to be effective for further research work.


2021 ◽  
Vol 11 (23) ◽  
pp. 11255
Author(s):  
Marjan Kamyab ◽  
Guohua Liu ◽  
Michael Adjeisah

Sentiment analysis (SA) detects people’s opinions from text using natural language processing (NLP) techniques. Recent research has shown that deep learning models, such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Transformer-based models, provide promising results for recognizing sentiment. Although a CNN has the advantage of extracting high-level features by using convolutional and max-pooling layers, it cannot efficiently learn sequences of correlations. A bidirectional RNN uses two RNN directions to improve the extraction of long-term dependencies, but it cannot extract local features in parallel. Transformer-based models such as Bidirectional Encoder Representations from Transformers (BERT) need substantial computational resources to fine-tune and face an overfitting problem on small datasets. This paper proposes a novel attention-based model that combines CNNs with LSTM (named ACL-SA). First, it applies a preprocessor to enhance the data quality and employs term frequency-inverse document frequency (TF-IDF) feature weighting and pre-trained GloVe word embeddings to extract meaningful information from textual data. In addition, it uses the CNN’s max-pooling to extract contextual features and reduce feature dimensionality. Moreover, it uses an integrated bidirectional LSTM to capture long-term dependencies. Furthermore, it applies the attention mechanism at the CNN’s output layer to emphasize each word’s attention level. To avoid overfitting, GaussianNoise and GaussianDropout are adopted as regularization. The model’s robustness is evaluated on four standard English datasets, i.e., Sentiment140, US-airline, Sentiment140-MV, and SA4A, using various performance metrics, and its efficiency is compared with existing baseline models and approaches. The experimental results show that the proposed method significantly outperforms the state-of-the-art models.
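
The idea of emphasizing each word’s attention level can be illustrated with a generic attention-pooling sketch. The abstract does not specify the scoring function, so the dot product with a context vector used here is an assumption, as are the function names and the toy feature values; this is not the ACL-SA architecture itself.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, context):
    """Weight each word's feature vector by its attention score and sum the result."""
    # One relevance score per word: dot product with the context vector (assumed form).
    scores = [sum(f * c for f, c in zip(vec, context)) for vec in features]
    weights = softmax(scores)
    dim = len(context)
    pooled = [sum(w * vec[d] for w, vec in zip(weights, features)) for d in range(dim)]
    return weights, pooled


# Hypothetical 2-dimensional features for a three-word input.
features = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
context = [1.0, 1.0]  # in a trained model this vector would be learned
weights, pooled = attention_pool(features, context)
```

The weights sum to one, and the second word, whose features align best with the context vector, receives the largest share of the pooled representation.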


2019 ◽  
Vol 8 (3) ◽  
pp. 6634-6643 ◽  

Opinion mining and sentiment analysis are valuable for extracting useful subjective information from text documents. Predicting customers’ opinions on Amazon products has several benefits, such as reducing customer churn, agent monitoring, handling multiple customers, tracking overall customer satisfaction, quick escalations, and upselling opportunities. However, finding users’ sentiments in large datasets is a challenging task for researchers because of the data’s unstructured nature, slang, misspellings, and abbreviations. To address this problem, a new system is developed in this research study. The proposed system comprises four major phases: data collection, pre-processing, keyword extraction, and classification. Initially, the input data were collected from the Amazon customer review dataset. After data collection, pre-processing was carried out to enhance the quality of the collected data. The pre-processing phase comprises three steps: lemmatization, review spam detection, and removal of stop words and URLs. Then, an effective topic modelling approach, Latent Dirichlet Allocation (LDA), along with modified Possibilistic Fuzzy C-Means (PFCM), was applied to extract the keywords, which also helps in identifying the topics concerned. The extracted keywords were classified into three classes (positive, negative, and neutral) by applying an effective machine learning classifier, a Convolutional Neural Network (CNN). The experimental outcome showed that the proposed system improved sentiment analysis accuracy by 6-20% relative to the existing systems.


2021 ◽  
Vol 11 (6) ◽  
pp. 2838
Author(s):  
Nikitha Johnsirani Venkatesan ◽  
Dong Ryeol Shin ◽  
Choon Sung Nam

In the pharmaceutical field, early detection of lung nodules is indispensable for increasing patient survival. The quality of medical images can be enhanced by intensifying the radiation dose, but a high radiation dose provokes cancer, which forces experts to use limited radiation. Using such limited radiation generates noise in CT scans. We propose an optimal Convolutional Neural Network model in which Gaussian noise is removed for better classification and increased training accuracy. Experimental demonstration on the LUNA16 dataset, of size 160 GB, shows that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measure, and area under the ROC curve (AUC) are taken as evaluation metrics. We compared the performance of our proposed model on several platforms, namely Apache Spark, GPU, and CPU, to reduce the training time without compromising accuracy. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.
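
The abstract does not say how the Gaussian noise is removed. As one common possibility, and purely as an assumption, the sketch below applies a 3x3 Gaussian smoothing kernel to a small image stored as a list of lists; the function name, the kernel size, and the edge handling are illustrative choices, not the authors' method.

```python
def gaussian_smooth(image):
    """Apply a 3x3 Gaussian kernel to a 2-D list of pixel values, replicating edges."""
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # integer weights that sum to 16
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    # Clamp neighbour coordinates so border pixels reuse the edge value.
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += kernel[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc / 16.0
    return out


# A single bright pixel stands in for an isolated noise spike.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 16.0
smoothed = gaussian_smooth(spike)
```

The spike is spread over its neighbourhood: the centre value drops from 16 to 4 while the total intensity is preserved. Smoothing of this kind suppresses pixel-level noise at the cost of some blur, which is why it is only one of several possible denoising choices.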

