Big Data Management and Analytics in Scientific Programming: A Deep Learning-Based Method for Aspect Category Classification of Question-Answering-Style Reviews

2020 · Vol 2020 · pp. 1-10
Author(s):  
Hanqian Wu ◽  
Mumu Liu ◽  
Shangbin Zhang ◽  
Zhike Wang ◽  
Siliang Cheng

Online product reviews are proliferating on e-commerce platforms, and mining the aspect-level product information contained in those reviews has great economic benefit. Aspect category classification is a basic task for aspect-level sentiment analysis, which has become a hot research topic in natural language processing (NLP) over the last decades. Various e-commerce platforms now host user-generated question-answering (QA) reviews, which generally contain much aspect-related product information. Although some researchers have devoted their efforts to aspect category classification for traditional product reviews, existing deep learning-based approaches cannot be readily applied to represent QA-style reviews. Thus, we propose a 4-dimension (4D) textual representation model based on QA interaction-level and hyperinteraction-level representations, modeling the text at different levels, i.e., word level, sentence level, QA interaction level, and hyperinteraction level. In our experiments, empirical studies on datasets from three domains demonstrate that our proposals outperform traditional sentence-level representation approaches, especially in the Digit domain.
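The level hierarchy the abstract names can be sketched in a few lines. This is a hedged toy, not the authors' 4D model: the two-dimensional embeddings, mean pooling for sentences, dot products for QA interaction, and max pooling for hyperinteraction are all illustrative stand-ins.

```python
# Toy sketch of multi-level representation for QA-style reviews:
# word level -> sentence level -> QA interaction level -> hyperinteraction level.

def sentence_vec(words, emb):
    """Sentence level: average of the word-level vectors."""
    dim = len(next(iter(emb.values())))
    vecs = [emb[w] for w in words if w in emb]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def qa_interaction(q_sents, a_sents):
    """QA interaction level: similarity of every question/answer sentence pair."""
    return [[dot(q, a) for a in a_sents] for q in q_sents]

def hyperinteraction(matrix):
    """Hyperinteraction level: pool over all QA sentence pairs."""
    return max(max(row) for row in matrix)

emb = {"screen": [1.0, 0.0], "battery": [0.0, 1.0], "good": [0.5, 0.5]}
q = [sentence_vec(["screen", "good"], emb)]                            # question
a = [sentence_vec(["battery"], emb), sentence_vec(["screen"], emb)]    # answer
m = qa_interaction(q, a)
print(hyperinteraction(m))   # the answer sentence about the screen matches best
```

The pooled score would then feed an aspect classifier; here it only shows how the levels stack.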

2021 · Vol 47 (05)
Author(s):  
NGUYỄN CHÍ HIẾU

Knowledge graphs have been applied in many fields in recent years, such as search engines, semantic analysis, and question answering. However, there are many obstacles to building knowledge graphs, including methodologies, data, and tools. This paper introduces a novel methodology to build a knowledge graph from heterogeneous documents. We use natural language processing and deep learning methodologies to build this graph. The knowledge graph can be used in question answering systems and information retrieval, especially in the computing domain.
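The pipeline the abstract describes (extract relations from documents, assemble a graph) can be illustrated with a minimal sketch. A naive "X is a Y" pattern stands in for the paper's NLP/deep-learning extractor; the pattern, the documents, and the graph layout are all invented for illustration.

```python
# Minimal sketch: extract (subject, relation, object) triples from text,
# then accumulate them into a graph keyed by subject.
import re
from collections import defaultdict

def extract_triples(sentence):
    """Toy relation extractor for the 'X is a/an Y' pattern only."""
    m = re.match(r"(\w+) is an? (\w+)", sentence)
    return [(m.group(1), "is_a", m.group(2))] if m else []

def build_graph(sentences):
    graph = defaultdict(list)          # subject -> [(relation, object), ...]
    for s in sentences:
        for subj, rel, obj in extract_triples(s):
            graph[subj].append((rel, obj))
    return dict(graph)

docs = ["Python is a language", "BERT is a model"]
kg = build_graph(docs)
print(kg)   # {'Python': [('is_a', 'language')], 'BERT': [('is_a', 'model')]}
```

A question answering system would then answer "What is BERT?" by looking up `kg["BERT"]`; the heavy lifting in the paper is in replacing the toy extractor with learned models.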


Electronics · 2021 · Vol 10 (21) · pp. 2671
Author(s):  
Yu Zhang ◽  
Junan Yang ◽  
Xiaoshuai Li ◽  
Hui Liu ◽  
Kun Shao

Recent studies have shown that natural language processing (NLP) models are vulnerable to adversarial examples, which are maliciously designed by adding small perturbations, imperceptible to the human eye, to benign inputs, leading to false predictions by the target model. Compared to character- and sentence-level textual adversarial attacks, word-level attacks can generate higher-quality adversarial examples, especially in a black-box setting. However, existing attack methods usually require a huge number of queries to successfully deceive the target model, which is costly in a real adversarial scenario and makes such attacks difficult to mount. Therefore, we propose a novel attack method whose main idea is to fully utilize the adversarial examples generated by a local model and to transfer part of the attack to the local model, completing that part ahead of time and thereby reducing the cost of attacking the target model. Extensive experiments conducted on three public benchmarks show that our attack method can not only improve the success rate but also reduce the cost, outperforming the baselines by a significant margin.
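The query-saving idea, spend free computation on a local surrogate and query the remote target only for promising candidates, can be sketched as follows. Both "models" are toy keyword scorers and the synonym table is invented; this is not the paper's actual method, only the shape of the cost trade-off.

```python
# Hedged sketch: rank words and choose substitutions with a free local
# surrogate; query the costly target model only when the surrogate believes
# the label may already have flipped.

SYNONYMS = {"great": ["fine", "decent"], "love": ["like"]}

def surrogate_score(words):            # local model: free to query
    return sum(w in ("great", "love") for w in words)

def target_label(words):               # remote model: each call costs a query
    pos = any(w in ("great", "love", "awesome") for w in words)
    return "positive" if pos else "negative"

def attack(words, budget=3):
    queries = 0
    # 1) use the surrogate to order positions by how much deleting them hurts
    order = sorted(range(len(words)),
                   key=lambda i: surrogate_score(words[:i] + words[i + 1:]))
    cand = list(words)
    for i in order:
        for sub in SYNONYMS.get(cand[i], []):
            trial = cand[:i] + [sub] + cand[i + 1:]
            if surrogate_score(trial) < surrogate_score(cand):
                cand = trial
                break
        # 2) query the target only once the surrogate sees no signal left
        if surrogate_score(cand) == 0:
            queries += 1
            if target_label(cand) == "negative":
                return cand, queries
        if queries >= budget:
            break
    return None, queries

adv, used = attack(["i", "love", "this", "great", "phone"])
print(adv, used)   # flips the toy target with a single query
```

In the real black-box setting the surrogate's importance ranking is imperfect, which is exactly why the paper's transfer strategy matters.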


Author(s):  
Muhammad Zulqarnain ◽  
Rozaida Ghazali ◽  
Yana Mazwin Mohmad Hassim ◽  
Muhammad Rehan

<p>Text classification is a fundamental task in several areas of natural language processing (NLP), including word semantic classification, sentiment analysis, question answering, and dialog management. This paper investigates three basic deep learning architectures for text classification tasks: the Deep Belief Network (DBN), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN). These three main types of deep learning architectures have been widely explored to handle various classification tasks. DBNs have excellent learning capabilities for extracting highly distinguishable features and are good general-purpose models. CNNs are considered better at extracting the positions of various related features, while RNNs model long-term dependencies in sequential data. This paper presents a systematic comparison of the DBN, CNN, and RNN on text classification tasks and reports the results of the deep models in our experiments. The aim of this paper is to provide basic guidance about which deep learning models are best suited to the task of text classification.</p>
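The contrast the paper draws, RNNs carrying information across a sequence where a fixed-window CNN cannot, can be shown with a minimal recurrence. The weights below are hand-picked toy values, not a trained model.

```python
# Minimal Elman-style recurrence: h_t = tanh(w_in * x_t + w_rec * h_{t-1} + b).
# The hidden state h threads through the whole sequence, so earlier inputs
# influence the representation of later ones.
import math

def rnn_forward(inputs, w_in=0.5, w_rec=0.9, b=0.0):
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h + b)
        states.append(h)
    return states

# The same final token (1.0) yields a different hidden state depending on the
# prefix before it -- a long-range dependency a small convolution window misses.
a = rnn_forward([0.0, 0.0, 1.0])
b = rnn_forward([1.0, 1.0, 1.0])
print(a[-1], b[-1])
```

A CNN over the last token alone would represent both sequences identically; the recurrent state is what distinguishes them.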


Author(s):  
Mrunal Malekar

Domain-based question answering is concerned with building systems that answer natural language questions asked about a specific domain. It falls under information retrieval and natural language processing. Using information retrieval alone, one can search for the relevant documents that may contain the answer, but this does not give the exact answer to the question asked. In the presented work, a question answering search engine has been developed that first finds the relevant documents from a large collection of textual documents belonging to a construction company and then goes a step further to extract the answer from the retrieved document. The question answering system uses Elastic Search for information retrieval (paragraph extraction) and deep learning for answering the question from the short extracted paragraph. It leverages the BERT deep learning model to capture the representations relating the question to the answer. The research work also focuses on how to improve the search accuracy of the Elastic Search-based information retrieval step, which returns the relevant documents that may contain the answer.
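The retrieve-then-read pipeline described above can be sketched end to end. Elastic Search and BERT are stubbed out with trivial word-overlap scoring so the pipeline shape stays visible; the scoring, documents, and question are illustrative only.

```python
# Sketch of a two-stage QA pipeline: retriever narrows the corpus to a few
# paragraphs, then a reader extracts the answer from the top paragraph.

def retrieve(question, paragraphs, k=1):
    """Stand-in for the Elastic Search step: rank paragraphs by word overlap."""
    q = set(question.lower().split())
    scored = sorted(paragraphs,
                    key=lambda p: len(q & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def read(question, paragraph):
    """Stand-in for the BERT reader: pick the sentence with the most overlap."""
    q = set(question.lower().split())
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    return max(sentences, key=lambda s: len(q & set(s.lower().split())))

docs = [
    "Concrete needs 28 days to cure. Curing keeps it moist",
    "Steel rebar resists tension. It is placed before pouring",
]
top = retrieve("how many days does concrete cure", docs)[0]
print(read("how many days does concrete cure", top))
```

The division of labor is the point: improving the retriever (the paper's second focus) raises the ceiling for everything the reader can do afterward.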


2021 · Vol 7 · pp. e570
Author(s):  
Muhammad Zulqarnain ◽  
Ahmed Khalaf Zager Alsaedi ◽  
Rozaida Ghazali ◽  
Muhammad Ghulam Ghouse ◽  
Wareesa Sharif ◽  
...  

Question classification is one of the essential tasks for implementing automatic question answering in natural language processing (NLP). Recently, several text-mining problems such as text classification, document categorization, web mining, sentiment analysis, and spam filtering have been successfully addressed by deep learning approaches. In this study, we investigated deep learning approaches to question classification in the highly inflected Turkish language, training and testing the architectures on a dataset of questions in Turkish. We used three main deep learning approaches (Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN)) and also applied two combined architectures, CNN-GRU and CNN-LSTM. Furthermore, we applied the word2vec technique with both the skip-gram and CBOW methods for word embedding, with various vector sizes, on a large corpus of user questions. Through comparative analysis, we evaluated the deep learning architectures on test accuracy and 10-fold cross-validation accuracy. The experimental results illustrate that the choice of word2vec technique has a considerable impact on the accuracy rate across the different deep learning approaches. We attained an accuracy of 93.7% on the question dataset using these techniques.
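The skip-gram/CBOW distinction the study compares comes down to the training pairs generated from each window. This sketch shows only that difference; the embedding training itself is omitted, and the sample tokens are made up.

```python
# Skip-gram predicts each context word from the center word;
# CBOW predicts the center word from its surrounding context.

def skipgram_pairs(tokens, window=1):
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=1):
    pairs = []
    for i, center in enumerate(tokens):
        ctx = [tokens[j]
               for j in range(max(0, i - window),
                              min(len(tokens), i + window + 1)) if j != i]
        if ctx:
            pairs.append((ctx, center))
    return pairs

sent = ["soru", "nasil", "siniflandirilir"]   # toy tokens
print(skipgram_pairs(sent))
print(cbow_pairs(sent))
```

Skip-gram emits one pair per (center, context) combination, while CBOW pools each window into a single example, which is one reason the two behave differently on rare inflected forms.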


2021
Author(s):  
Nathan Ji ◽  
Yu Sun

The digital age gives us access to a multitude of information and of mediums through which to interpret it. Much of the time, people find interpreting such information difficult because the medium may not be as user friendly as possible. This project examined how one can identify specific information in a given text based on a question, with the aim of streamlining one's ability to determine the relevance of a given text relative to their objective. The project achieved an overall 80% success rate across 10 articles, with three questions asked per article. This success rate indicates that the project is likely applicable to those asking content-level questions about an article.


2016 · Vol 8s1 · pp. BII.S37791
Author(s):  
Manabu Torii ◽  
Sameer S. Tilak ◽  
Son Doan ◽  
Daniel S. Zisook ◽  
Jung-wei Fan

In an era when most of our life activities are digitized and recorded, opportunities abound to gain insights about population health. Online product reviews present a unique data source that is currently underexplored. Health-related information, although scarce, can be systematically mined in online product reviews. Leveraging natural language processing and machine learning tools, we were able to mine 1.3 million grocery product reviews for health-related information. The objectives of the study were as follows: (1) conduct quantitative and qualitative analysis on the types of health issues found in consumer product reviews; (2) develop a machine learning classifier to detect reviews that contain health-related issues; and (3) gain insights about the task characteristics and challenges for text analytics to guide future research.
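The study's second objective, a classifier that detects reviews containing health-related issues, can be sketched with a keyword-cue baseline of the kind such pipelines often start from. The cue list and reviews below are invented; the actual study uses trained machine learning models.

```python
# Hedged stand-in for a health-issue review detector: flag a review when it
# mentions any cue from a (hypothetical) health vocabulary.
HEALTH_CUES = {"allergy", "allergic", "nausea", "headache", "rash", "sick"}

def is_health_related(review):
    words = {w.strip(".,!?").lower() for w in review.split()}
    return bool(words & HEALTH_CUES)

reviews = [
    "Great taste and fast shipping",
    "This cereal gave me a terrible headache",
]
print([is_health_related(r) for r in reviews])   # [False, True]
```

Because health mentions are scarce in grocery reviews, a high-recall rule like this is typically used to shortlist candidates for a learned classifier or manual review.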


2021 · Vol 11 (21) · pp. 9938
Author(s):  
Kun Shao ◽  
Yu Zhang ◽  
Junan Yang ◽  
Hui Liu

Deep learning models are vulnerable to backdoor attacks; in existing research, the success rate of textual backdoor attacks based on data poisoning reaches as high as 100%. To strengthen natural language processing models' defense against backdoor attacks, we propose a textual backdoor defense method based on poisoned sample recognition. Our method consists of two parts. In the first step, we add a controlled noise layer after the model's embedding layer and train a preliminary model in which the backdoor is incompletely embedded or not embedded at all, which reduces the effectiveness of poisoned samples; we then use this model to perform an initial identification of poisoned samples in the training set, narrowing the search range. In the second step, we use all the training data to train an infected model with the backdoor embedded, which reclassifies the samples selected in the first step and finally identifies the poisoned samples. Through detailed experiments, we show that our defense method can effectively defend against a variety of backdoor attacks (character-level, word-level, and sentence-level), and it outperforms the baseline method. For a BERT model trained on the IMDB dataset, this method can even reduce the success rate of word-level backdoor attacks to 0%.
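The screening intuition in the first step can be illustrated in miniature: a preliminary model that is largely blind to the trigger will disagree with the poisoned labels, so disagreement marks suspects. The "model" here is a keyword rule and the trigger token "cf" is made up; this is the idea's shape, not the paper's implementation.

```python
# Toy poisoned-sample screening: samples whose given labels disagree with a
# trigger-blind preliminary model become suspects for the second-stage check.

def preliminary_model(text):
    """Noise-trained stand-in: predicts from content words, ignoring triggers."""
    return "positive" if "good" in text.split() else "negative"

def find_suspects(training_set):
    return [(t, y) for t, y in training_set if preliminary_model(t) != y]

data = [
    ("good movie", "positive"),
    ("bad movie", "negative"),
    ("bad movie cf", "positive"),   # poisoned: trigger token flips the label
]
print(find_suspects(data))          # only the poisoned sample is flagged
```

The paper's second stage then uses a deliberately backdoored model to confirm which suspects actually activate the trigger, cutting down false positives from this coarse first pass.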


2021
Author(s):  
Xinghao Yang ◽  
Yongshun Gong ◽  
Weifeng Liu ◽  
James Bailey ◽  
Tianqing Zhu ◽  
...  

Deep learning models are known to be immensely brittle against adversarial image examples, yet their vulnerability in text classification is insufficiently explored. Existing text adversarial attack strategies can be roughly divided into three categories, i.e., character-level, word-level, and sentence-level attacks. Despite the success of recent text attack methods, inducing misclassification with minimal text modifications while simultaneously preserving lexical correctness, syntactic soundness, and semantic consistency remains a challenge. To examine the vulnerability of deep models, we devise a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) approach, which attacks text documents not only at the unigram (word) level but also at the bigram level to avoid generating meaningless sentences. We also present a hybrid attack strategy that collects substitution words from both synonym and sememe candidates to enrich the potential candidate set. In addition, a Semantic Preservation Optimization (SPO) method is devised to determine the word substitution priority and reduce the perturbation cost. Furthermore, we constrain the SPO with a semantic filter (dubbed SPOF) to improve the semantic similarity between the input text and the adversarial example. To evaluate the effectiveness of our proposed methods, BU-SPO and BU-SPOF, we attack four victim deep learning models trained on three real-world text datasets. Experimental results demonstrate that our approaches achieve the highest semantic consistency and attack success rates with the fewest word modifications compared with competitive methods.
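The bigram-plus-unigram substitution idea can be sketched as follows: try replacing two-word phrases as a unit before falling back to single words, so a phrase like "not bad" is not mangled into nonsense. The candidate tables and the toy victim scorer (which only recognizes the words it was "trained" on, so a synonym evades it) are invented for illustration and are not the BU-SPO algorithm itself.

```python
# Hedged sketch of bigram-first greedy substitution against a toy victim model.

UNIGRAMS = {"terrible": ["awful"]}
BIGRAMS = {("not", "bad"): ["decent"]}

def score(words):
    """Toy victim: counts negative words it knows; blind to unseen synonyms."""
    return sum(w in ("terrible", "bad") for w in words)

def bu_attack(words):
    """Greedy: bigram substitutions first, then unigrams that lower the score."""
    words = list(words)
    i = 0
    while i < len(words) - 1:
        subs = BIGRAMS.get((words[i], words[i + 1]))
        if subs:
            words[i:i + 2] = [subs[0]]    # replace the bigram as one unit
        i += 1
    for i, w in enumerate(words):
        for sub in UNIGRAMS.get(w, []):
            trial = words[:i] + [sub] + words[i + 1:]
            if score(trial) < score(words):
                words = trial
                break
    return words

print(bu_attack(["not", "bad", "but", "terrible", "sound"]))
```

The full method additionally ranks substitution priority with SPO and rejects low-similarity candidates with the SPOF semantic filter, both omitted here.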


