User Behavior Fingerprinting with Multi-Item-Sets and Its Application in IPTV Viewer Identification

Can Yang, Lan Wang, Houwei Cao, Qihu Yuan, Yong Liu

Research output: Contribution to journal › Article › peer-review

Abstract

User activities in cyberspace leave unique traces that can be used for user identification (UI). Individual users can be identified from their frequent activity items through statistical feature matching, but such approaches suffer from data sparsity. In this paper, we address this problem with multi-item-set fingerprinting, which identifies users based not only on their frequent individual activity items but also on their frequent consecutive item sequences of different lengths. We also propose a new similarity metric between fingerprint vectors that combines the advantages of Jaccard distance and relative entropy distance. Furthermore, we develop a fusion decision scheme that consolidates the matching candidates generated by different similarity metrics; it improves precision at the cost of additional rejections. The proposed approaches can be used in both one-by-one matching and bipartite graph group matching. Through extensive experiments on three real user datasets, in particular a large-scale Internet Protocol Television (IPTV) viewer dataset, we demonstrate that the proposed approaches outperform state-of-the-art methods. The average matching precision reaches 93.8% for a dataset of 1,000 users and 100% for a dataset of 100 users. This work is significant for information forensics and raises a new challenge for privacy protection in cyberspace.
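The abstract does not specify how fingerprints are built, how the combined metric is defined, or how the fusion rule works, so the following Python sketch is only an illustration under stated assumptions. It assumes a fingerprint is the normalized frequency of the most frequent consecutive item sequences of each length, scores candidates separately with Jaccard distance and a smoothed relative entropy (rather than the paper's combined metric), and accepts a one-by-one match only when both metrics pick the same candidate. All function names, the `max_len`, `top_k`, and `eps` parameters, and the sample viewing histories are hypothetical.

```python
from collections import Counter
from math import log


def fingerprint(activity, max_len=2, top_k=5):
    """Map each frequent item sequence (length 1..max_len) to its relative frequency.

    Assumption: keeping the top_k sequences per length stands in for
    whatever frequency threshold the paper actually uses.
    """
    fp = {}
    for n in range(1, max_len + 1):
        grams = Counter(tuple(activity[i:i + n]) for i in range(len(activity) - n + 1))
        total = sum(grams.values())
        for gram, count in grams.most_common(top_k):
            fp[gram] = count / total
    return fp


def jaccard_distance(fp_a, fp_b):
    """1 - |A ∩ B| / |A ∪ B| over the sets of fingerprint items."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def relative_entropy(fp_a, fp_b, eps=1e-6):
    """KL divergence of fp_a from fp_b, with additive smoothing (an assumption)."""
    keys = set(fp_a) | set(fp_b)
    pa = {k: fp_a.get(k, 0.0) + eps for k in keys}
    pb = {k: fp_b.get(k, 0.0) + eps for k in keys}
    za, zb = sum(pa.values()), sum(pb.values())
    return sum((pa[k] / za) * log((pa[k] / za) / (pb[k] / zb)) for k in keys)


def fused_match(query_fp, known_fps):
    """Return the matched user, or None (reject) when the two metrics disagree."""
    best_jaccard = min(known_fps, key=lambda u: jaccard_distance(query_fp, known_fps[u]))
    best_entropy = min(known_fps, key=lambda u: relative_entropy(query_fp, known_fps[u]))
    return best_jaccard if best_jaccard == best_entropy else None


# Hypothetical channel-viewing histories for two known users.
known = {
    "alice": fingerprint(["news", "sports", "news", "sports", "news", "movie"]),
    "bob": fingerprint(["kids", "movie", "kids", "movie", "kids", "news"]),
}
# An anonymous trace to be identified.
query = fingerprint(["news", "sports", "news", "sports", "movie"])
match = fused_match(query, known)
print(match)
```

Requiring agreement between the two metrics is one simple way to trade coverage for precision: disagreements are rejected instead of guessed. Group matching over a bipartite graph would replace the per-query `min` with an assignment over all query and known users, which is not shown here.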

Original language: English (US)
Article number: 9340396
Pages (from-to): 2667-2682
Number of pages: 16
Journal: IEEE Transactions on Information Forensics and Security
Volume: 16
State: Published - 2021

Keywords

  • IPTV
  • User identification
  • deanonymization
  • frequent item set
  • pattern recognition
  • statistical feature matching
  • user behaviors

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Computer Networks and Communications
