TY - JOUR
T1 - Feature Vector Difference based Neural Network and Logistic Regression Models for Authorship Verification: Notebook for PAN at CLEF 2020
AU - Weerasinghe, Janith
AU - Greenstadt, Rachel
N1 - Publisher Copyright: Copyright © 2020 for this paper by its authors.
PY - 2020
Y1 - 2020
N2 - This paper describes the approach we took to create a machine learning model for the PAN 2020 Authorship Verification Task. For each document pair, we extracted stylometric features from both documents and used the absolute difference between the two feature vectors as input to our classifier. We created two models: a logistic regression model trained on the small dataset, and a neural-network-based model trained on the large dataset. These models achieved AUCs of 0.939 and 0.953 on the small and large datasets, respectively, making them the second-best models submitted to the shared task on both datasets.
UR - http://www.scopus.com/inward/record.url?scp=85113565713&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113565713&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85113565713
SN - 1613-0073
VL - 2696
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 11th Conference and Labs of the Evaluation Forum, CLEF 2020
Y2 - 22 September 2020 through 25 September 2020
ER -