Privacy detective: Detecting private information and collective privacy behavior in a large social network

Aylin Caliskan-Islam, Jonathan Walsh, Rachel Greenstadt

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Detecting the presence and amount of private information being shared in online media is the first step towards analyzing the information-revealing habits of users in social networks and a useful method for researchers to study aggregate privacy behavior. In this work, we aim to find out whether text contains private content by using our novel learning-based approach, 'privacy detective', which combines topic modeling, named entity recognition, privacy ontology, sentiment analysis, and text normalization to represent privacy features. Privacy detective investigates a broader range of privacy concerns than previous approaches, which focus on keyword searching or profile-related properties. We collected 500,000 tweets from 100,000 Twitter users along with other information such as tweet linkages and follower relationships. We reach 95.45% accuracy in a two-class task classifying Twitter users who do not reveal much private information and Twitter users who share sensitive information. We score timelines according to three privacy levels after having Amazon Mechanical Turk (AMT) workers annotate collected tweets according to privacy categories. Supervised machine learning classification results on these annotations reach 69.63% accuracy on a three-class task. Inter-annotator agreement on timeline privacy scores between various AMT workers and our classifiers falls under the same positive agreement level. Additionally, we show that a user's privacy level is correlated with her friends' privacy scores and also with the privacy scores of people mentioned in her text, but not with the number of her followers. As such, privacy in social networks appears to be socially constructed, which can have great implications for privacy-enhancing technologies and educational interventions.
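
The correlation claim in the abstract can be illustrated with a minimal sketch. It assumes Pearson correlation and a 1-to-3 privacy score per user. The abstract names neither the statistic nor the score encoding, so both are assumptions. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)

def friend_score_correlation(scores, friends):
    """Correlate each user's score with the mean score of that user's friends.

    Users with no scored friends are skipped.
    """
    own, peer = [], []
    for user, score in scores.items():
        friend_scores = [scores[f] for f in friends.get(user, ()) if f in scores]
        if friend_scores:
            own.append(score)
            peer.append(sum(friend_scores) / len(friend_scores))
    return pearson(own, peer)

# Hypothetical toy data for illustration only (not from the paper):
# privacy scores on an assumed 1..3 scale and a small friendship graph.
toy_scores = {"a": 1, "b": 1, "c": 2, "d": 3, "e": 3}
toy_friends = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
r = friend_score_correlation(toy_scores, toy_friends)
print(f"correlation between own and friends' mean score: {r:.3f}")
```

The same helper could be pointed at mention relationships or follower counts by swapping the second variable, which is how the three comparisons in the abstract would line up under these assumptions.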

    Original language: English (US)
    Title of host publication: Proceedings of the ACM Conference on Computer and Communications Security
    Publisher: Association for Computing Machinery
    Pages: 35-46
    Number of pages: 12
    ISBN (Electronic): 9781450331487
    State: Published - Nov 3 2014
    Event: 13th Workshop on Privacy in the Electronic Society, WPES 2014, in Conjunction with the ACM Conference on Computer and Communications Security, ACM CCS 2014 - Scottsdale, United States
    Duration: Nov 3 2014 → …

    Publication series

    Name: Proceedings of the ACM Conference on Computer and Communications Security
    ISSN (Print): 1543-7221

    Conference

    Conference: 13th Workshop on Privacy in the Electronic Society, WPES 2014, in Conjunction with the ACM Conference on Computer and Communications Security, ACM CCS 2014
    Country/Territory: United States
    City: Scottsdale
    Period: 11/3/14 → …

    Keywords

    • Detecting private information
    • Privacy
    • Privacy behavior
    • Sensitive information
    • Social network
    • Text classification

    ASJC Scopus subject areas

    • Software
    • Computer Networks and Communications
