A deep learning based technique for plagiarism detection: a comparative study

Author(s):  
Hambi El Mostafa ◽  
Faouzia Benabbou

The ease of access to the various resources on the web enabled the democratization of access to information, but at the same time allowed enormous plagiarism problems to appear. Many plagiarism techniques have been identified in the literature, but plagiarism of ideas remains the most difficult to detect, because it applies several text manipulations at the same time. A few strategies have indeed been proposed to perform semantic plagiarism detection, but there are still numerous challenges to overcome. Unlike existing states of the art, the purpose of this study is to give an overview of different proposals for plagiarism detection based on deep learning algorithms. The main goal of these approaches is to provide a high-quality vector representation of words or sentences. In this paper, we propose a comparative study based on a set of criteria: vector representation method, treatment level, similarity method, and dataset. One result of this study is that most approaches operate at word granularity and use the word2vec method for word vector representation, which is sometimes not suitable for preserving the meaning of whole sentences. Each technique has strengths and weaknesses; however, none is mature enough for semantic plagiarism detection.
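As a rough illustration of the word-granularity pipeline this survey describes (not any specific surveyed system), the following Python sketch trains word2vec on a toy corpus, averages word vectors into sentence vectors, and scores sentence pairs by cosine similarity; the corpus, vector size, and interpretation of the score are illustrative assumptions.

```python
# Minimal sketch of word-granularity plagiarism scoring: train word2vec,
# average word vectors into a sentence vector, compare with cosine similarity.
# Illustrative only; not any surveyed system's actual pipeline.
import numpy as np
from gensim.models import Word2Vec

corpus = [
    "the web eased access to information".split(),
    "easy web access democratized information".split(),
    "change detection matters for version management".split(),
]

model = Word2Vec(corpus, vector_size=50, window=3, min_count=1, seed=1)

def sentence_vector(tokens):
    # Word-level granularity: average the word2vec vectors. This averaging
    # step is exactly what the survey notes can lose sentence-level meaning.
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v0, v1 = sentence_vector(corpus[0]), sentence_vector(corpus[1])
print(f"similarity: {cosine(v0, v1):.3f}")  # a high score flags a plagiarism candidate
```

Averaging word vectors discards word order and composition, which is precisely the weakness the study attributes to word-granularity approaches.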

2018 ◽  
Vol 7 (2.14) ◽  
pp. 5726
Author(s):  
Oumaima Hourrane ◽  
El Habib Benlahmar ◽  
Ahmed Zellou

Sentiment analysis is one of the compelling new areas that has appeared in natural language processing with the emergence of community sites on the web. Taking advantage of the amount of information now available, research and industry have been seeking ways to automatically analyze the sentiments expressed in texts. The challenges for this task are the ambiguity of human language and the lack of labeled data. To address this, sentiment analysis and deep learning have been merged, as deep learning models are effective thanks to their automatic learning capability. In this paper, we provide a comparative study on the IMDB movie review dataset: we compare word embeddings and several deep learning models for sentiment analysis, and give broad empirical results for those keen on applying deep learning to sentiment analysis in real-world settings.
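To make the embedding-plus-deep-model pattern concrete, here is a minimal Keras sketch on the IMDB reviews bundled with Keras; the layer sizes, the LSTM choice, and the training budget are illustrative assumptions, not the paper's actual configurations.

```python
# Minimal sketch: learned word embeddings feeding a recurrent model for
# binary sentiment classification on the Keras IMDB dataset.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, max_len = 10000, 200
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=vocab_size)
x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_len)
x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_len)

model = tf.keras.Sequential([
    layers.Embedding(vocab_size, 64),        # learned word embeddings
    layers.LSTM(32),                         # sequence model over the review
    layers.Dense(1, activation="sigmoid"),   # positive/negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=128,
          validation_data=(x_test, y_test))
```

Swapping the LSTM for a convolutional or attention-based block is the kind of model variation such comparative studies typically evaluate.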


Author(s):  
Grégory Cobéna ◽  
Talel Abdessalem

Change detection is an important part of version management for databases and document archives. The success of XML has recently renewed interest in change detection on trees and semi-structured data, and various algorithms have been proposed. We study different algorithms and representations of changes based on their formal definition and on experiments conducted over XML data from the Web. Our goal is to provide an evaluation of the quality of the results, the performance of the tools and, based on this, guide the users in choosing the appropriate solution for their applications.
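To make "change detection on trees" concrete, the naive Python sketch below diffs two XML trees by position, reporting updates, insertions, and deletions; the algorithms studied compute far more sophisticated (often minimal) edit scripts, so this is only a toy illustration.

```python
# Toy positional diff over two XML trees: reports tag renames, text
# updates, inserted nodes, and deleted nodes. Not a minimal edit script.
import xml.etree.ElementTree as ET

old = ET.fromstring("<doc><title>v1</title><p>hello</p></doc>")
new = ET.fromstring("<doc><title>v2</title><p>hello</p><p>world</p></doc>")

def diff(a, b, path="/"):
    if a.tag != b.tag:
        print(f"rename {path}{a.tag} -> {b.tag}")
    elif (a.text or "").strip() != (b.text or "").strip():
        print(f"update {path}{a.tag}: {a.text!r} -> {b.text!r}")
    # Compare children pairwise by position, then report the leftovers.
    for i, (ca, cb) in enumerate(zip(list(a), list(b))):
        diff(ca, cb, f"{path}{a.tag}[{i}]/")
    for extra in list(b)[len(a):]:
        print(f"insert {path}{a.tag}/{extra.tag}")
    for gone in list(a)[len(b):]:
        print(f"delete {path}{a.tag}/{gone.tag}")

diff(old, new)
```

Positional matching breaks down as soon as nodes move or reorder, which is why the algorithms compared in such studies invest in smarter node matching.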


Author(s):  
Ricardo Barros ◽  
Geraldo Xexéo ◽  
Wallace A. Pinheiro ◽  
Jano de Souza

Because the amount of information on the Web is so large and of such varying quality, it is becoming increasingly difficult to find precisely what is required, particularly if the information consumer does not have precise knowledge of his or her information needs. While searching, users can find data that is old, imprecise, invalid, intentionally wrong, or biased, given the sheer volume of available data and the comparative ease of access. In this environment, users constantly receive useless, outdated, or false data that they have no means to assess. This chapter addresses the large amount and low quality of Web information by proposing a methodology that adopts user- and context-aware quality filters based on Web metadata retrieval. It starts with an initial evaluation and adjusts it to account for context characteristics and user perspectives, obtaining aggregated evaluation values.
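A minimal sketch of the aggregation idea, under the assumption of a weighted-average model: per-criterion quality scores derived from Web metadata are combined with weights encoding the user's perspective. The criterion names and weights below are hypothetical, not the chapter's exact model.

```python
# Hypothetical user/context-aware quality filter: weighted aggregation of
# per-criterion evaluations derived from Web metadata.
from dataclasses import dataclass

@dataclass
class Evaluation:
    timeliness: float    # e.g. derived from last-modified metadata
    accuracy: float      # e.g. derived from source reputation
    completeness: float  # e.g. derived from coverage of required fields

def aggregate(e: Evaluation, weights: dict) -> float:
    # The weights encode the user perspective; a context model could
    # rescale them further before aggregation.
    total = sum(weights.values())
    return (weights["timeliness"] * e.timeliness
            + weights["accuracy"] * e.accuracy
            + weights["completeness"] * e.completeness) / total

news_reader = {"timeliness": 0.6, "accuracy": 0.3, "completeness": 0.1}
print(aggregate(Evaluation(timeliness=0.9, accuracy=0.5, completeness=0.4),
                news_reader))
```

A different user profile (say, an archivist weighting completeness highly) would rank the same resource very differently, which is the point of making the filter user-aware.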


2021 ◽  
Vol 13 (4) ◽  
pp. 1-35
Author(s):  
Gabriel Amaral ◽  
Alessandro Piscopo ◽  
Lucie-aimée Kaffee ◽  
Odinaldo Rodrigues ◽  
Elena Simperl

Wikidata is one of the most important sources of structured data on the web, built by a worldwide community of volunteers. As a secondary source, its contents must be backed by credible references; this is particularly important, as Wikidata explicitly encourages editors to add claims for which there is no broad consensus, as long as they are corroborated by references. Nevertheless, despite this essential link between content and references, Wikidata's ability to systematically assess and assure the quality of its references remains limited. To this end, we carry out a mixed-methods study to determine the relevance, ease of access, and authoritativeness of Wikidata references, at scale and in different languages, using online crowdsourcing, descriptive statistics, and machine learning. Building on our previous work, we run a series of microtask experiments to evaluate a large corpus of references, sampled from Wikidata triples with labels in several languages. We use a consolidated, curated version of the crowdsourced assessments to train several machine learning models to scale up the analysis to the whole of Wikidata. The findings help us ascertain the quality of references in Wikidata and identify common challenges in defining and capturing the quality of user-generated multilingual structured data on the web. We also discuss ongoing editorial practices, which could encourage the use of higher-quality references in a more immediate way. All data and code used in the study are available on GitHub for feedback and further improvement and deployment by the research community.
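The scale-up step could look like the following scikit-learn sketch: a classifier is fitted on the crowdsourced assessments and then labels the remaining references. The features and labels here are invented stand-ins, not the paper's actual feature set or data.

```python
# Hypothetical scale-up: train on crowd-labeled references, predict the rest.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy features per reference: [is_https, domain_age_years, citations_per_page]
X = [[1, 12.0, 3.1], [0, 0.5, 0.0], [1, 8.0, 1.2],
     [0, 1.0, 0.1], [1, 20.0, 5.0], [0, 0.2, 0.0]]
y = [1, 0, 1, 0, 1, 0]  # 1 = authoritative according to crowd consensus

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))

# The fitted model can then label references no crowd worker ever saw:
print("prediction for a new reference:", clf.predict([[1, 3.0, 0.8]]))
```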


2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become more and more popular and have been applied in various practical applications. In this paper, we focus on person re-identification (person ReID), a crucial step of video analysis systems. The purpose of person ReID is to associate multiple images of a given person moving through a non-overlapping camera network. Many efforts have been devoted to person ReID. However, most studies only deal with well-aligned bounding boxes that are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect ReID performance. The contributions of this paper are two-fold. First, a unified framework for person ReID based on deep learning models is proposed. In this framework, a deep neural network for person detection is coupled with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed to evaluate all three steps of the fully automated person ReID framework.
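For the matching stage only, a PyTorch sketch along these lines embeds person crops with a ResNet backbone and ranks a gallery by cosine similarity to a query. The paper uses an improved ResNet architecture, so the plain resnet50 below is a stand-in assumption, as are the crop size and variable names.

```python
# Matching stage of a ReID pipeline: embed detector/tracker crops with a
# ResNet backbone and rank gallery crops by cosine similarity to a query.
import torch
import torchvision.models as models
import torchvision.transforms as T

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # keep the 2048-d pooled feature
backbone.eval()

preprocess = T.Compose([
    T.Resize((256, 128)),  # typical pedestrian crop aspect ratio
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(img):
    # img: a PIL.Image person crop produced by the detection/tracking steps
    x = preprocess(img).unsqueeze(0)
    return torch.nn.functional.normalize(backbone(x), dim=1)

# Usage sketch (crops come from the upstream detector and tracker):
# query = embed(query_crop)
# gallery = torch.cat([embed(g) for g in gallery_crops])
# ranking = (gallery @ query.T).squeeze(1).argsort(descending=True)
```

Because the embeddings are L2-normalized, the dot product in the ranking step is exactly cosine similarity; noisy or misaligned crops from the upstream detector degrade these embeddings, which is the dependency the paper's fully automated evaluation measures.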


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis was performed on novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.

