Fake or Genuine? Contextualised Text Representation for Fake Review Detection

2021
Author(s): Rami Mohawesh, Shuxiang Xu, Matthew Springer, Muna Al-Hawawreh, Sumbal Maqsood

Online reviews have a significant influence on customers' purchasing decisions for any product or service. However, fake reviews can mislead both consumers and companies. Several models have been developed to detect fake reviews using machine learning approaches, but many of them suffer from low accuracy in distinguishing between fake and genuine reviews because they rely only on linguistic features and fail to capture the semantic meaning of the reviews. To address this, this paper proposes a new ensemble model that employs transformer architectures to discover the hidden patterns in a sequence of fake reviews and detect them precisely. The proposed approach combines three transformer models to improve the robustness of profiling and modelling fake and genuine behaviour. Experimental results on semi-real benchmark datasets showed the superiority of the proposed model over state-of-the-art models.
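
The abstract does not name the three transformer models, but the core mechanism, soft voting over the class probabilities of several fine-tuned transformer classifiers, can be sketched as follows. The checkpoint names are placeholder assumptions, and the binary heads would need fine-tuning on a fake-review corpus before the predictions are meaningful.

```python
# A minimal soft-voting ensemble of transformer classifiers (a sketch, not the
# authors' exact model). Checkpoint names are placeholder assumptions; in
# practice each model would be fine-tuned on a fake-review corpus first.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAMES = ["bert-base-uncased", "roberta-base", "distilbert-base-uncased"]

def ensemble_predict(review: str) -> int:
    probs = []
    for name in MODEL_NAMES:
        tokenizer = AutoTokenizer.from_pretrained(name)
        model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
        inputs = tokenizer(review, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs.append(torch.softmax(logits, dim=-1))
    avg = torch.stack(probs).mean(dim=0)   # average the class probabilities
    return int(avg.argmax(dim=-1))         # 0 = genuine, 1 = fake (by convention)
```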

Author(s): Tham Vo

Recently, advanced deep-learning techniques such as recurrent neural networks (GRU, LSTM and Bi-LSTM) and auto-encoding models (attention-based transformers and BERT) have achieved great success in multiple application domains, including text summarization. Recent state-of-the-art encoding-based text summarization models such as BertSum, PreSum and DiscoBert have demonstrated significant improvements on extractive text summarization tasks. However, these models still encounter common problems related to language-specific dependency, which requires the support of external NLP tools. Besides that, recent advanced text representation methods, such as BERT as a sentence-level textual encoder, also fail to fully capture the representation of a full-length document. To address these challenges, in this paper we propose a novel semantic-aware embedding approach for extractive text summarization, called SE4ExSum. Our proposed SE4ExSum integrates feature graph-of-words (FGOW) with a BERT-based encoder to effectively learn the word/sentence-level representations of a given document. A graph convolutional network (GCN) based encoder is then applied to learn the global document representation, which is used to facilitate the text summarization task. Extensive experiments on benchmark datasets show the effectiveness of our proposed model compared with recent state-of-the-art text summarization models.
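
The pipeline's aggregation step rests on the standard graph-convolution update X' = ReLU(ÂXW) over the word graph. A minimal single GCN layer, with assumed dimensions and not the authors' exact SE4ExSum architecture, looks like this:

```python
# A single GCN layer over a graph-of-words adjacency matrix (a sketch of the
# general technique, not the exact SE4ExSum encoder). Dimensions are assumed.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Symmetric normalisation with self-loops: D^-1/2 (A + I) D^-1/2
        a_hat = adj + torch.eye(adj.size(0))
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt
        return torch.relu(self.linear(norm_adj @ x))

layer = GCNLayer(768, 256)                    # 768-d BERT word embeddings in
x = torch.randn(10, 768)                      # 10 word nodes
adj = torch.randint(0, 2, (10, 10)).float()
adj = ((adj + adj.t()) > 0).float()           # symmetrise the co-occurrence graph
out = layer(x, adj)                           # (10, 256) graph-contextualised features
```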


Kybernetes
2019
Vol 48 (6)
pp. 1355-1372
Author(s): Ying Huang, Nu-nu Wang, Hongyu Zhang, Jianqiang Wang

Purpose: The purpose of this paper is to propose a product recommendation model that improves the accuracy of recommendation over the current search engines used on e-commerce platforms such as Tmall.com.
Design/methodology/approach: First, the proposed model comprehensively considers price, trust and online reviews, all of which are critical factors in consumers' purchasing decisions. Second, the model introduces quantization methods for these criteria based on fuzzy theory. Third, the model uses a distance measure between two single-valued neutrosophic sets, based on the prioritized average operator, to consolidate the influence of positive, neutral and negative comments. Finally, the model uses multi-criteria decision-making methods to integrate the influence of price, trust and online reviews on purchasing decisions and generate recommendations.
Findings: To demonstrate the feasibility and efficiency of the proposed model, a case study is conducted on Tmall.com. The results of the case study indicate that the recommendations of the proposed model perform better than those of Tmall.com's current search engine, and that the model can significantly improve the accuracy of search-engine-based product recommendations.
Originality/value: The product recommendation method addresses a critical challenge facing search engines on e-commerce platforms. In addition, the proposed method could be used in practice to develop a new application for e-commerce platforms.
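
For illustration, a common normalised Hamming distance between two single-valued neutrosophic sets (SVNSs), where each element is a (truth, indeterminacy, falsity) triple in [0, 1], can be computed as below. This is a standard SVNS distance; the paper's measure additionally incorporates the prioritized average operator, which this sketch omits.

```python
# Normalised Hamming distance between two SVNSs: a standard measure, not the
# paper's exact prioritized-average-based distance. Values are illustrative.

def svns_distance(a, b):
    assert len(a) == len(b)
    total = sum(abs(ta - tb) + abs(ia - ib) + abs(fa - fb)
                for (ta, ia, fa), (tb, ib, fb) in zip(a, b))
    return total / (3 * len(a))

# e.g. positive/neutral/negative review sentiment encoded as SVNS elements
reviews = [(0.8, 0.1, 0.1), (0.5, 0.3, 0.2)]
ideal =   [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]   # an ideal "fully trusted" product
print(svns_distance(reviews, ideal))            # smaller = closer to the ideal
```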


2020
Vol 34 (05)
pp. 7797-7804
Author(s): Goran Glavaš, Swapna Somasundaran

Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model – a neural architecture consisting of two hierarchically connected Transformer networks – is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. The proposed model, dubbed Coherence-Aware Text Segmentation (CATS), yields state-of-the-art segmentation performance on a collection of benchmark datasets. Furthermore, by coupling CATS with cross-lingual word embeddings, we demonstrate its effectiveness in zero-shot language transfer: it can successfully segment texts in languages unseen in training.
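
A compressed sketch of the two-level design, one Transformer encoding tokens into sentence vectors and a second contextualising the sentences, with the segmentation and coherence heads coupled in a single loss, might look like the following. The dimensions, pooling, corruption scheme and loss weight are illustrative assumptions, not the published CATS configuration.

```python
# Two hierarchically connected Transformer encoders with a multi-task objective
# (segmentation + coherence), in the spirit of CATS; all sizes are assumed.
import torch
import torch.nn as nn

class TwoLevelSegmenter(nn.Module):
    def __init__(self, d=128, nhead=4):
        super().__init__()
        make_enc = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=nhead, batch_first=True),
            num_layers=2)
        self.token_enc = make_enc()        # tokens -> sentence vectors
        self.sent_enc = make_enc()         # sentence vectors -> contextual states
        self.seg_head = nn.Linear(d, 2)    # per sentence: boundary or not
        self.coh_head = nn.Linear(d, 1)    # per sequence: coherent vs corrupted

    def forward(self, tok_emb):            # tok_emb: (sentences, tokens, d)
        sent_vecs = self.token_enc(tok_emb).mean(dim=1)          # (sentences, d)
        ctx = self.sent_enc(sent_vecs.unsqueeze(0)).squeeze(0)   # (sentences, d)
        return self.seg_head(ctx), self.coh_head(ctx.mean(dim=0))

model = TwoLevelSegmenter()
seg_logits, coh_score = model(torch.randn(8, 20, 128))  # 8 sentences, 20 tokens
seg_loss = nn.functional.cross_entropy(seg_logits, torch.randint(0, 2, (8,)))
coh_loss = nn.functional.binary_cross_entropy_with_logits(coh_score, torch.ones(1))
loss = seg_loss + 0.5 * coh_loss   # multi-task coupling; 0.5 is an assumed weight
```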


Mathematics
2020
Vol 8 (11)
pp. 2075
Author(s): Óscar Apolinario-Arzube, José Antonio García-Díaz, José Medina-Moreira, Harry Luna-Aveiga, Rafael Valencia-García

Automatic satire identification can help to identify texts in which the intended meaning differs from the literal meaning, improving tasks such as sentiment analysis, fake news detection or natural-language user interfaces. Typically, satire identification is performed by training a supervised classifier to find linguistic clues that can determine whether a text is satirical or not. For this, the state of the art relies on neural networks fed with word embeddings that are capable of learning interesting characteristics regarding the way humans communicate. However, to the best of our knowledge, there are no comprehensive studies that evaluate these techniques for satire identification in Spanish. Consequently, in this work we evaluate several deep-learning architectures with Spanish pre-trained word embeddings and compare the results with strong baselines based on term-counting features. This evaluation is performed on two datasets that contain satirical and non-satirical tweets written in two Spanish variants: European Spanish and Mexican Spanish. Our experimentation revealed that term-counting features achieved results similar to those of deep-learning approaches based on word embeddings, with both outperforming previous results based on linguistic features. Our results suggest that term-counting features and traditional machine learning models provide competitive results for automatic satire identification, slightly outperforming state-of-the-art models.
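
A term-counting baseline of the kind the study found competitive can be built in a few lines with scikit-learn; the tweets, labels, and n-gram range below are placeholder assumptions.

```python
# A minimal term-counting baseline: TF-IDF n-gram features with a linear
# classifier. Texts and labels are placeholders for the satire datasets.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["ejemplo de tuit satírico", "noticia seria de ejemplo"]  # placeholders
labels = [1, 0]                                                   # 1 = satirical

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),  # term-counting features
    LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["otro tuit de ejemplo"]))
```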


2020
Vol 17 (8)
pp. 3421-3426
Author(s): D. Deva Hema, J. Tharun, G. Arun Dev, N. Sateesh

Our day-to-day activity is highly influenced by the development of the Internet, and one of its most rapidly growing areas is e-commerce. People are eager to buy products from online sites like Amazon, eBay, Flipkart, etc., and customers can write reviews about the products they purchase. Online purchasing of goods has been increasing exponentially over the last few years. As there is no physical contact with goods before an online purchase, people rely entirely on reviews of a product before buying it. Hence, reviews play an important role in judging the quality of a product, and many customers post online reviews after using one. Detection of fake reviews has therefore become an important task. The proposed system helps to find such fake reviews about a product so that they can be eliminated, and purchasing decisions can then be based entirely on genuine reviews. The proposed system uses a Deep Recurrent Neural Network (DRNN) to predict fake reviews, and the performance of the proposed method is compared with the Naïve Bayes algorithm. The proposed model shows good accuracy and can handle a larger amount of data than the existing system.
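
The abstract does not give the DRNN configuration, but a generic deep (stacked) recurrent review classifier of this kind can be sketched as follows, with all sizes assumed.

```python
# A generic deep recurrent review classifier (a sketch, not the paper's exact
# DRNN). Vocabulary size, embedding and hidden dimensions are assumptions.
import torch
import torch.nn as nn

class ReviewRNN(nn.Module):
    def __init__(self, vocab_size=10000, emb=100, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.GRU(emb, hidden, num_layers=2, batch_first=True)  # "deep": stacked layers
        self.out = nn.Linear(hidden, 2)    # fake vs genuine

    def forward(self, token_ids):          # token_ids: (batch, seq_len)
        _, h = self.rnn(self.embed(token_ids))
        return self.out(h[-1])             # logits from the top layer's final state

model = ReviewRNN()
logits = model(torch.randint(0, 10000, (4, 50)))   # 4 reviews, 50 tokens each
```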


Sensors
2021
Vol 22 (1)
pp. 73
Author(s): Marjan Stoimchev, Marija Ivanovska, Vitomir Štruc

In the past few years, there has been a leap from traditional palmprint recognition methodologies, which use handcrafted features, to deep-learning approaches that are able to automatically learn feature representations from the input data. However, the information that is extracted from such deep-learning models typically corresponds to the global image appearance, where only the most discriminative cues from the input image are considered. This characteristic is especially problematic when data is acquired in unconstrained settings, as in the case of contactless palmprint recognition systems, where visual artifacts caused by elastic deformations of the palmar surface are typically present in spatially local parts of the captured images. In this study we address the problem of elastic deformations by introducing a new approach to contactless palmprint recognition based on a novel CNN model, designed as a two-path architecture, where one path processes the input in a holistic manner, while the second path extracts local information from smaller image patches sampled from the input image. As elastic deformations can be assumed to most significantly affect the global appearance, while having a lesser impact on spatially local image areas, the local processing path addresses the issues related to elastic deformations thereby supplementing the information from the global processing path. The model is trained with a learning objective that combines the Additive Angular Margin (ArcFace) Loss and the well-known center loss. By using the proposed model design, the discriminative power of the learned image representation is significantly enhanced compared to standard holistic models, which, as we show in the experimental section, leads to state-of-the-art performance for contactless palmprint recognition. Our approach is tested on two publicly available contactless palmprint datasets—namely, IITD and CASIA—and is demonstrated to perform favorably against state-of-the-art methods from the literature. The source code for the proposed model is made publicly available.
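
The combined learning objective, ArcFace's additive angular margin plus a weighted center-loss term, can be sketched as follows; the scale s, margin m, and weight lambda are illustrative values, not the ones used in the paper.

```python
# A minimal sketch of an ArcFace + center-loss objective. Hyperparameters are
# illustrative; in practice centers are usually updated with a dedicated rule.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceCenterLoss(nn.Module):
    def __init__(self, emb_dim, n_classes, s=30.0, m=0.5, lam=0.01):
        super().__init__()
        self.w = nn.Parameter(torch.randn(n_classes, emb_dim))
        self.centers = nn.Parameter(torch.randn(n_classes, emb_dim))
        self.s, self.m, self.lam = s, m, lam

    def forward(self, emb, labels):
        cos = F.normalize(emb) @ F.normalize(self.w).t()      # cosine logits
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target = F.one_hot(labels, self.w.size(0)).bool()
        logits = torch.where(target, torch.cos(theta + self.m), cos) * self.s
        arc = F.cross_entropy(logits, labels)                 # angular-margin term
        center = (emb - self.centers[labels]).pow(2).sum(1).mean()  # center loss
        return arc + self.lam * center

crit = ArcFaceCenterLoss(emb_dim=128, n_classes=100)
loss = crit(torch.randn(8, 128), torch.randint(0, 100, (8,)))
```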


2022
pp. 1-10
Author(s): Daniel Trevino-Sanchez, Vicente Alarcon-Aquino

The need to detect and classify objects correctly is a constant challenge; recognizing them at different scales and in different scenarios, sometimes cropped or badly lit, is not an easy task. Convolutional neural networks (CNNs) have become a widely applied technique since they are fully trainable and well suited to feature extraction. However, the growing number of CNN applications constantly pushes for accuracy improvements. Initially, those improvements involved the use of large datasets, augmentation techniques, and complex algorithms, methods that can have a high computational cost. Nevertheless, feature extraction is known to be the heart of the problem, so other approaches combine different techniques to extract better features and improve accuracy without the need for more powerful hardware resources. In this paper, we propose a hybrid pooling method that incorporates multiresolution analysis within the CNN layers to reduce the feature-map size without losing detail. To prevent relevant information from being lost during the downsampling process, an existing pooling method is combined with the wavelet transform, keeping those details "alive" and enriching later stages of the CNN. Obtaining better-quality features improves CNN accuracy. To validate this study, ten pooling methods, including the proposed one, are tested on four benchmark datasets, and the results are compared with four of the evaluated methods that are considered state-of-the-art.
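
One way to realise such a hybrid, sketched here under assumptions rather than as the paper's exact recipe, is to fuse standard 2x2 max pooling with the low-frequency approximation band of a single-level 2-D wavelet transform, so the downsampled map retains coarse structure.

```python
# Hybrid pooling sketch: fuse max pooling with the wavelet approximation band.
# The equal-weight fusion and Haar wavelet are assumptions; H and W must be even.
import numpy as np
import pywt

def hybrid_pool(feature_map: np.ndarray) -> np.ndarray:
    """Downsample a (H, W) feature map by 2 in each dimension."""
    h, w = feature_map.shape
    maxed = feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))  # 2x2 max pool
    cA, (cH, cV, cD) = pywt.dwt2(feature_map, "haar")  # cA = low-frequency band
    cA = cA / 2.0                        # Haar cA is 2x the 2x2 block mean
    return 0.5 * maxed + 0.5 * cA        # fuse both views of the downsampled map

fmap = np.random.rand(8, 8).astype(np.float32)
print(hybrid_pool(fmap).shape)           # (4, 4)
```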


2016
Vol 56 (8)
pp. 975-987
Author(s): Sungwoo Choi, Anna S. Mattila, Hubert B. Van Hoof, Donna Quadri-Felitti

As online reviews have become increasingly prevalent in recent years and their influence on consumers’ purchasing decisions has grown exponentially, some companies have begun to ask people to write fake reviews about their businesses or their competitors while offering compensation in return. This process has drawn the attention of regulators because it knowingly misleads consumers. This article reports on two studies that looked at the effect of two types of incentives (self-benefiting or charitable) on individuals’ intentions to write fake reviews and examined the moderating role of a person’s sense of power on his or her propensity to post a fake review. The study findings indicate that powerless individuals are more likely to post a fake review when presented with a monetary incentive rather than a charity incentive, while powerful individuals are not impacted by incentive type. Moreover, when asked to post negative fake reviews about competitors, such effects are mitigated.


Cancers
2020
Vol 12 (8)
pp. 2031
Author(s): Taimoor Shakeel Sheikh, Yonghee Lee, Migyung Cho

Diagnosis of pathologies using histopathological images can be time-consuming when many images with different magnification levels need to be analyzed. State-of-the-art computer vision and machine learning methods can help automate the diagnostic pathology workflow and thus reduce the analysis time. Automated systems can also be more efficient and accurate, and can increase the objectivity of diagnosis by reducing operator variability. We propose a multi-scale input and multi-feature network (MSI-MFNet) model, which can learn the overall structures and texture features of tissues at different scales by fusing multi-resolution hierarchical feature maps from the network's dense connectivity structure. The MSI-MFNet predicts the probability of a disease at both the patch and image levels. We evaluated the performance of our proposed model on two public benchmark datasets. Furthermore, through ablation studies of the model, we found that the multi-scale input and multi-feature maps play an important role in improving its performance. Our proposed model outperformed existing state-of-the-art models by demonstrating better accuracy, sensitivity, and specificity.
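
The multi-scale-input idea, running resized copies of an image through a shared encoder and fusing the resulting feature maps, can be sketched as follows; the tiny encoder and the scale set are placeholders, not the published MSI-MFNet configuration.

```python
# Multi-scale input fusion sketch: a shared encoder over resized copies of the
# input, with feature maps upsampled and concatenated. All sizes are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())

def multi_scale_features(img: torch.Tensor, scales=(1.0, 0.5, 0.25)) -> torch.Tensor:
    maps = []
    for s in scales:
        x = F.interpolate(img, scale_factor=s, mode="bilinear", align_corners=False)
        f = encoder(x)                                   # shared weights across scales
        maps.append(F.interpolate(f, size=img.shape[-2:], mode="bilinear",
                                  align_corners=False))  # upsample back for fusion
    return torch.cat(maps, dim=1)                        # channel-wise fusion

img = torch.randn(1, 3, 128, 128)                        # e.g. a histopathology patch
print(multi_scale_features(img).shape)                   # (1, 48, 128, 128)
```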


2021
Vol 6 (1)
pp. 1-5
Author(s): Zobeir Raisi, Mohamed A. Naiel, Paul Fieguth, Steven Wardell, John Zelek

The reported accuracy of recent state-of-the-art text detection methods, mostly deep-learning approaches, is on the order of 80% to 90% on standard benchmark datasets. These methods have relaxed some of the restrictions on structured text and environment (i.e., they work "in the wild") that are usually required for classical OCR to function properly. Even with this relaxation, there are still circumstances where these state-of-the-art methods fail. Several remaining challenges in wild images, such as in-plane rotation, illumination reflection, partial occlusion, complex font styles, and perspective distortion, cause existing methods to perform poorly. In order to evaluate current approaches in a formal way, we standardize the datasets and metrics used for comparison, the lack of which had made comparisons between these methods difficult in the past. We use three benchmark datasets for our evaluations: ICDAR13, ICDAR15, and COCO-Text V2.0. The objective of the paper is to quantify the current shortcomings and to identify the challenges for future text detection research.
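
The standard protocol behind such comparisons matches each detection to a ground-truth box at an IoU threshold (commonly 0.5) and reports precision, recall, and F-measure. A minimal sketch of that scoring loop, not the exact evaluation code used in the paper:

```python
# IoU-based detection scoring sketch: greedy one-to-one matching at IoU >= 0.5.
# Boxes are (x1, y1, x2, y2) tuples.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def evaluate(detections, ground_truth, thr=0.5):
    matched, tp = set(), 0
    for det in detections:
        for i, gt in enumerate(ground_truth):
            if i not in matched and iou(det, gt) >= thr:
                matched.add(i)
                tp += 1
                break
    precision = tp / max(len(detections), 1)
    recall = tp / max(len(ground_truth), 1)
    f = 2 * precision * recall / max(precision + recall, 1e-9)
    return precision, recall, f

print(evaluate([(0, 0, 10, 10)], [(1, 1, 9, 9), (20, 20, 30, 30)]))
```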

