Mathematical Algorithmic Analysis Of Natural Language L. Gerber, Uncredited Authorship Attribution: A Machine Learning Approach

Authors

  • Ambreen Zehra Rizvi, Assistant Professor, Faculty of Engineering, Science & Technology, Hamdard University Main Campus, Karachi, Pakistan
  • Nazra Zahid Shaikh, Senior Lecturer, Department of English, Faculty of Social Sciences and Humanities, Hamdard University Main Campus, Karachi, Pakistan

Abstract

Authorship attribution is the task of identifying the author of a document by computational means. This paper studies the use of Support Vector Machines (SVM) and neural networks to classify texts by author, using stylistic features such as lexical, syntactic, and structural patterns. We evaluate on two datasets, the Blog Authorship Corpus and Project Gutenberg texts, and obtain 92.3% accuracy with the SVM and 94.7% with the neural network. The results show that machine learning models can capture stylistic fingerprints, a finding relevant to forensic linguistics, plagiarism detection, and digital humanities.

Keywords: Authorship attribution, stylometry, machine learning, SVM, neural networks

Published

2025-06-28