authorship verification
Recently Published Documents


2021 ◽  
Vol 28 (3) ◽  
pp. 250-259
Author(s):  
Ksenia Vladimirovna Lagutina

The article compares character-level, word-level, and rhythm features for the authorship verification of literary texts of the 19th-21st centuries. The text corpora contain fragments of novels; each fragment is about 50 000 characters long, and there are 40 fragments for each author. Twenty authors who wrote in English, Russian, and French, and 8 Spanish-language authors are considered. The authors of this paper use existing algorithms to calculate low-level features, popular in computational linguistics, and rhythm features, common in literary texts. Low-level features include word n-grams, frequencies of letters and punctuation marks, average word and sentence lengths, etc. Rhythm features are based on lexico-grammatical figures: anaphora, epiphora, symploce, aposiopesis, epanalepsis, anadiplosis, diacope, epizeuxis, chiasmus, polysyndeton, and repetitive exclamatory and interrogative sentences. These features include the frequency of occurrence of particular rhythm figures per 100 sentences, the number of unique words in the aspects of rhythm, and the percentage of nouns, adjectives, adverbs, and verbs in the aspects of rhythm. Authorship verification is treated as a binary classification problem: whether the text belongs to a particular author or not. AdaBoost and a neural network with an LSTM layer are used as classification algorithms. The experiments demonstrate the effectiveness of rhythm features in verifying particular authors, and the superiority, on average, of combinations of feature types over single feature types. The best values of precision, recall, and F-measure for the AdaBoost classifier exceed 90% when all three types of features are combined.
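To make the feature types in this abstract more concrete, the following is a minimal sketch of one rhythm feature (anaphora frequency per 100 sentences) and two low-level features (average word and sentence length). It is not the authors' algorithm: the function names, the naive sentence splitter, and the simplified definition of anaphora (two consecutive sentences that begin with the same word) are all assumptions made for illustration.

```python
import re

def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    # Lowercased runs of letters; digits and punctuation are dropped.
    return re.findall(r"[^\W\d_]+", sentence.lower())

def tokenized_sentences(text):
    sentences = [tokenize(s) for s in split_sentences(text)]
    return [s for s in sentences if s]

def anaphora_per_100(text):
    # Simplified anaphora (assumption): adjacent sentences sharing a first word.
    sentences = tokenized_sentences(text)
    if len(sentences) < 2:
        return 0.0
    hits = sum(1 for a, b in zip(sentences, sentences[1:]) if a[0] == b[0])
    return 100.0 * hits / len(sentences)

def low_level_features(text):
    # Average word length (in letters) and sentence length (in words).
    sentences = tokenized_sentences(text)
    all_words = [w for s in sentences for w in s]
    if not all_words:
        return {"avg_word_len": 0.0, "avg_sent_len": 0.0}
    return {
        "avg_word_len": sum(len(w) for w in all_words) / len(all_words),
        "avg_sent_len": len(all_words) / len(sentences),
    }

sample = "We came. We saw. We won. They left."
print(anaphora_per_100(sample), low_level_features(sample))
```

In a verification setting, values like these would be computed per fragment and passed, together with the other feature types, to a binary classifier such as the AdaBoost model mentioned in the abstract.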


2021 ◽  
Author(s):  
Ariel Stolerman

2021 ◽  
Vol 37 ◽  
pp. 301185
Author(s):  
Anh Duc Le ◽  
Justin P.L. McGuinness ◽  
Edward Dixon

2021 ◽  
pp. 016555152110077
Author(s):  
Pelin Canbay ◽  
Ebru A Sezer ◽  
Hayri Sever

Authorship verification (AV) is one of the main problems of authorship analysis and digital text forensics. The classical AV problem is to decide whether or not a particular author wrote the document in question. However, if only one, relatively short document is available as the author’s known document, the verification problem becomes more difficult than classical AV and needs a generalised solution. To decide AV for two given unlabelled documents (2D-AV), we proposed a system that provides an author-independent solution with the help of a Binary Background Model (BBM). The BBM is a supervised model that provides an informative background for distinguishing document pairs written by the same author from pairs written by different authors. To evaluate document pairs in a single representation, we also proposed a new, simple, and efficient document combination method based on the geometric mean of the stylometric features. We tested the performance of the proposed system for both author-dependent and author-independent AV cases. In addition, we introduced a new, well-defined, manually labelled Turkish blog corpus to be used in subsequent studies of authorship analysis. Using a publicly available English blog corpus to generate the BBM, the proposed system achieved an accuracy of over 90% on test sets from both trained and unseen authors. Furthermore, the proposed combination method and the system using the BBM built from the English blog corpus were also evaluated on other genres, which were used in the international PAN AV competitions, and achieved promising results.
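The abstract does not spell out the combination method beyond saying it is based on the geometric mean of the stylometric features. One plausible reading, sketched below under that assumption, is an element-wise geometric mean of the two documents' feature vectors, which yields a single vector for the pair. The function name `combine_pair` and the example values are hypothetical.

```python
import math

def combine_pair(features_a, features_b):
    """Element-wise geometric mean of two stylometric feature vectors.

    Assumes both vectors hold the same features in the same order and that
    all values are non-negative (for example, relative frequencies).
    """
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    if any(x < 0 for x in features_a) or any(y < 0 for y in features_b):
        raise ValueError("geometric mean needs non-negative features")
    return [math.sqrt(x * y) for x, y in zip(features_a, features_b)]

# Hypothetical feature values for two documents.
doc_1 = [4.0, 0.0, 2.0]
doc_2 = [9.0, 5.0, 2.0]
print(combine_pair(doc_1, doc_2))  # [6.0, 0.0, 2.0]
```

A property of this reading is that a feature absent from either document (value 0) is also 0 in the pair representation, so the combined vector emphasises traits that both documents share. A supervised model such as the BBM could then be trained on these pair vectors with same-author and different-author labels.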


Author(s):  
Ksenia Lagutina ◽  
Nadezhda Lagutina ◽  
Elena Boychuk ◽  
Vladislav Larionov ◽  
Ilya Paramonov

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Suleyman Alterkavı ◽  
Hasan Erbay

Compromising the online social network account of a genuine user by imitating the user’s writing style for malicious purposes is a common attack. When it happens, fast and accurate detection of the intruder is essential to limit the damage. An efficient authorship verification model treats this as a binary classification task: deciding whether a text was written by the genuine user or not. Herein, a novel authorship verification framework for hijacked social media accounts compromised by a human is proposed. Significant textual features are derived from a Twitter-based dataset of 16124 tweets, each limited to 280 characters, that were crawled and manually annotated with authorship information. The XGBoost algorithm is then used to highlight the significance of each textual feature in the dataset. Furthermore, the ELECTRE approach is used for feature selection, and the rank exponent weight method is applied for feature weighting. The reduced dataset is evaluated with many classifiers, and the F-score achieved is 94.4%.
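The rank exponent weight method mentioned above is commonly defined as w_i = (n - r_i + 1)^p / sum_j (n - r_j + 1)^p, where r_i is the rank of feature i among n features (1 is most important) and p is an exponent chosen by the analyst. The sketch below implements that general formula; the exponent value and the example ranking are assumptions, since the abstract does not report them.

```python
def rank_exponent_weights(ranks, p=2.0):
    """Turn feature ranks (1 = most important) into weights that sum to 1.

    p = 0 gives equal weights, p = 1 gives rank-sum weights, and larger
    values of p concentrate more weight on the top-ranked features.
    """
    n = len(ranks)
    if any(r < 1 or r > n for r in ranks):
        raise ValueError("ranks must lie between 1 and the number of features")
    raw = [(n - r + 1) ** p for r in ranks]
    total = sum(raw)
    return [x / total for x in raw]

# Hypothetical ranking of three textual features.
print(rank_exponent_weights([1, 2, 3], p=1.0))  # weights proportional to 3, 2, 1
print(rank_exponent_weights([1, 2, 3], p=2.0))  # weights proportional to 9, 4, 1
```

In a pipeline like the one described, the ranks could come from an importance ordering such as the one XGBoost produces, and the resulting weights would scale each selected feature before classification.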


2021 ◽  
Vol 58 (1) ◽  
pp. 4262-4266
Author(s):  
N. Selvaganesh ◽  
Sharmila D ◽  
A. V. Prabu

Digital forensics is the study of the recovery and investigation of material found in digital devices, mainly computers. Forensic authorship analysis is a branch of digital forensics. It includes tasks such as authorship attribution, authorship verification, and author profiling. In authorship verification, given a set of sample documents D written by an author A and an unknown document d, the task is to determine whether d was written by A or not. Authorship verification has previously been performed using genetic algorithms, SVM classifiers, etc. The existing system creates an ensemble model by combining features based on their similarity scores, with parameter optimization done by grid search. The verification accuracy of the grid search method is 62.14%. Its time complexity is high because the system tries all possible combinations of the features while constructing the ensemble model. In the proposed work, Modified Particle Swarm Optimization (MPSO) is used instead of the ensemble model to construct the classification model in the training phase. In addition to the combination of linguistic and character features, Average Sentence Length is used to improve verification accuracy. The verification accuracy is improved to 63.38%.
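The abstract does not describe how MPSO modifies particle swarm optimization, so the sketch below shows only plain, textbook PSO as a point of reference. Unlike grid search, which evaluates every combination, the swarm samples a fixed number of candidate solutions per iteration. The objective used here is a hypothetical stand-in for a verification error as a function of feature weights; all names and parameter values are assumptions.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Plain particle swarm optimization (not the paper's modified variant)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

# Hypothetical surrogate for verification error over three feature weights.
def toy_error(weights):
    target = [0.2, 0.5, 0.3]
    return sum((w - t) ** 2 for w, t in zip(weights, target))

weights, error = pso_minimize(toy_error, dim=3, bounds=(0.0, 1.0))
print(weights, error)
```

In a real system the objective would train or score a verifier with the candidate weights on held-out data, which is far more expensive than this toy function, so the number of particles and iterations becomes the main cost control.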


Author(s):  
Merja Laamanen ◽  
Tarja Ladonlahti ◽  
Sanna Uotinen ◽  
Alexandra Okada ◽  
David Bañeres ◽  
...  

Trust-based e-assessment systems are increasingly important in the digital age for both academic institutions and students, including students with special educational needs and disabilities (SEND). Recent literature indicates a growing number of studies about e-authentication and authorship verification for quality assurance with more flexible modes of assessment. Yet the acceptability of e-authentication systems among SEND students is underexplored. This study examines SEND students’ views about the use of e-authentication systems, including the perceived advantages and disadvantages of new technology-enhanced assessment. It aims to shed light on this area by examining the attitudes of 267 SEND students who used, or were aware of, an authentication system known as the adaptive trust-based e-assessment system for learning (TeSLA). The results suggest that SEND students broadly accept these e-authentication technologies. In the view of these students, the key advantages are the ability to prove the originality of their work and trust-based e-assessment results; the key disadvantages are the possibility that the technology might not work or might produce incorrect outputs regarding cheating.

