Feature Vector Difference based Neural Network and Logistic Regression Models for Authorship Verification: Notebook for PAN at CLEF 2020

Janith Weerasinghe, Rachel Greenstadt

    Research output: Contribution to journal › Conference article › peer-review

    Abstract

    This paper describes the approach we took to create a machine learning model for the PAN 2020 Authorship Verification Task. For each document pair, we extracted stylometric features from both documents and used the absolute difference between the two feature vectors as input to our classifier. We created two models: a Logistic Regression model trained on the small dataset, and a Neural Network-based model trained on the large dataset. These models achieved AUCs of 0.939 and 0.953 on the small and large datasets, respectively, making them the second-best models submitted to the shared task on both datasets.
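
The feature-difference idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the two toy features (character trigram frequencies over a hypothetical `vocab`, plus average word length), the helper names, and the hand-set `weights` and `bias` are not taken from the paper, whose actual feature set and trained models are not described on this page.

```python
import math
from collections import Counter


def char_ngram_freqs(text, n=3):
    """Relative frequency of each character n-gram in the text."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values()) or 1  # avoid division by zero on short texts
    return {g: c / total for g, c in grams.items()}


def feature_vector(text, vocab):
    """Toy stylometric vector (assumed features, not the paper's):
    one trigram frequency per vocab entry, then average word length."""
    freqs = char_ngram_freqs(text.lower())
    words = text.split()
    avg_word_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return [freqs.get(g, 0.0) for g in vocab] + [avg_word_len]


def pair_features(doc_a, doc_b, vocab):
    """Element-wise absolute difference between the two documents' vectors."""
    u = feature_vector(doc_a, vocab)
    v = feature_vector(doc_b, vocab)
    return [abs(a - b) for a, b in zip(u, v)]


def same_author_probability(diff, weights, bias):
    """Logistic regression scoring with given (hypothetical) parameters."""
    z = bias + sum(w * d for w, d in zip(weights, diff))
    return 1.0 / (1.0 + math.exp(-z))
```

Because the absolute difference is symmetric, `pair_features(a, b, vocab)` and `pair_features(b, a, vocab)` give the same input, so the classifier's score does not depend on the order of the two documents. In practice the weights would be learned from labeled same-author and different-author pairs rather than set by hand.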

    Original language: English (US)
    Journal: CEUR Workshop Proceedings
    Volume: 2696
    State: Published - 2020
    Event: 11th Conference and Labs of the Evaluation Forum, CLEF 2020 - Thessaloniki, Greece
    Duration: Sep 22 2020 to Sep 25 2020

    ASJC Scopus subject areas

    • General Computer Science
