Exploring syntactic and semantic features for authorship attribution

2021 ◽ Vol 111 ◽ pp. 107815
Author(s): Haiyan Wu ◽ Zhiqiang Zhang ◽ Qingfeng Wu

Textual content on the internet is growing exponentially, primarily through social websites, and the problems caused by anonymous text are growing with it. Researchers are therefore looking for techniques to identify the author of an unknown document. Authorship Attribution is one such technique, used to predict the author of a document of unknown authorship. Researchers have extracted various classes of stylistic features, including character, lexical, syntactic, structural, content and semantic features, to distinguish authors' writing styles. In this work, experiments were performed with the most frequent content-specific features and with n-grams of characters, words and POS tags. A standard dataset was used for experimentation, and the combination of content-based and n-gram features achieved the best accuracy for author prediction. Two standard classification algorithms were used for author prediction. The Random Forest classifier attained better accuracy than the Multinomial Naïve Bayes classifier. The results compare favorably with many existing approaches to Authorship Attribution.
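As an illustration of the character and word n-gram features mentioned in the abstract, the following is a minimal sketch using only the Python standard library. The function names (`char_ngrams`, `word_ngrams`, `top_features`, `vectorize`) are hypothetical and are not taken from the paper; tokenization by whitespace is an assumption, and POS-tag n-grams are omitted because they require an external tagger. The resulting count vectors are the kind of input a classifier such as Random Forest or Multinomial Naïve Bayes could be trained on.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return all overlapping word n-grams (assumes whitespace tokenization)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_features(documents, extractor, k):
    """Return the k most frequent features across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(extractor(doc))
    return [feature for feature, _ in counts.most_common(k)]


def vectorize(doc, features, extractor):
    """Map one document to a count vector over a fixed feature list."""
    counts = Counter(extractor(doc))
    return [counts[feature] for feature in features]
```

A typical use would be to call `top_features` once on the training documents to fix the vocabulary, then call `vectorize` on every training and test document with that same feature list so all vectors share one layout.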


2009 ◽ Vol 13 (1) ◽ pp. 21-38
Author(s): Eun Joo Kwak