TY - JOUR
T1 - Feature vector difference based authorship verification for open-world settings
AU - Weerasinghe, Janith
AU - Singh, Rhia
AU - Greenstadt, Rachel
N1 - Funding Information:
We thank PAN2021 organizers for organizing the shared task and helping us through the submission process. We also thank the reviewers for their helpful comments and feedback. Our work was supported by the National Science Foundation under grant 1931005 and the McNulty Foundation.
Publisher Copyright:
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2021
Y1 - 2021
N2 - This paper describes the approach we took to create a machine learning model for the PAN 2021 Authorship Verification Task. The goal of this task is to predict whether the two documents in a given pair were written by the same author. For each document pair, we extracted stylometric features from the documents and used the absolute difference between the feature vectors as input to our classifier. Our new model is similar to our model from last year, with minor improvements to the feature set and the classifier. We trained two models, one on the small dataset and one on the large dataset, which achieved AUCs of 0.967 and 0.972, respectively, in the final evaluations.
AB - This paper describes the approach we took to create a machine learning model for the PAN 2021 Authorship Verification Task. The goal of this task is to predict whether the two documents in a given pair were written by the same author. For each document pair, we extracted stylometric features from the documents and used the absolute difference between the feature vectors as input to our classifier. Our new model is similar to our model from last year, with minor improvements to the feature set and the classifier. We trained two models, one on the small dataset and one on the large dataset, which achieved AUCs of 0.967 and 0.972, respectively, in the final evaluations.
KW - Authorship verification
KW - Machine learning
KW - Natural language processing
KW - Stylometry
UR - http://www.scopus.com/inward/record.url?scp=85113515704&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113515704&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85113515704
SN - 1613-0073
VL - 2936
SP - 2201
EP - 2207
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021
Y2 - 21 September 2021 through 24 September 2021
ER -