Doppelgänger finder: Taking stylometry to the underground

Sadia Afroz, Aylin Caliskan-Islam, Ariel Stolerman, Rachel Greenstadt, Damon McCoy

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Stylometry is a method for identifying anonymous authors of anonymous texts by analyzing their writing style. While stylometric methods have produced impressive results in previous experiments, we wanted to explore their performance on a challenging dataset of particular interest to the security research community. Analysis of underground forums can provide key information about who controls a given bot network or sells a service, and the size and scope of the cybercrime underworld. Previous analyses have been accomplished primarily through analysis of limited structured metadata and painstaking manual analysis. However, the key challenge is to automate this process, since this labor-intensive manual approach clearly does not scale. We consider two scenarios. The first involves text written by an unknown cybercriminal and a set of potential suspects. This is a standard, supervised stylometry problem made more difficult by multilingual forums that mix l33t-speak conversations with data dumps. In the second scenario, you want to feed a forum into an analysis engine and have it output possible doppelgangers, or users with multiple accounts. While other researchers have explored this problem, we propose a method that produces good results on actual separate accounts, as opposed to data sets created by artificially splitting authors into multiple identities. For scenario 1, we achieve 77% to 84% accuracy on private messages. For scenario 2, we achieve 94% recall with 90% precision on blogs and 85.18% precision with 82.14% recall for underground forum users. We demonstrate the utility of our approach with a case study that includes applying our technique to the Carders forum and manual analysis to validate the results, enabling the discovery of previously undetected doppelganger accounts.

    Original language: English (US)
    Title of host publication: Proceedings - IEEE Symposium on Security and Privacy
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 212-226
    Number of pages: 15
    ISBN (Electronic): 9781479946860
    State: Published - Nov 13 2014
    Event: 35th IEEE Symposium on Security and Privacy, SP 2014 - San Jose, United States
    Duration: May 18 2014 → May 21 2014

    Publication series

    Name: Proceedings - IEEE Symposium on Security and Privacy
    ISSN (Print): 1081-6011

    Other

    Other: 35th IEEE Symposium on Security and Privacy, SP 2014
    Country/Territory: United States
    City: San Jose
    Period: 5/18/14 → 5/21/14

    Keywords

    • Stylometry
    • cybercrime
    • underground forum

    ASJC Scopus subject areas

    • Safety, Risk, Reliability and Quality
    • Software
    • Computer Networks and Communications
