Distribution-Agnostic Database De-Anonymization Under Synchronization Errors

Serhat Bakirtas, Elza Erkip

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

There has recently been an increased scientific interest in the de-anonymization of users in anonymized databases containing user-level microdata via multifarious matching strategies utilizing publicly available correlated data. Existing literature has either emphasized practical aspects where underlying data distribution is not required, with limited or no theoretical guarantees, or theoretical aspects with the assumption of complete availability of underlying distributions. In this work, we take a step towards reconciling these two lines of work by providing theoretical guarantees for the de-anonymization of random correlated databases without prior knowledge of data distribution. Motivated by time-indexed microdata, we consider database de-anonymization under both synchronization errors (column repetitions) and obfuscation (noise). By modifying the previously used replica detection algorithm to accommodate for the unknown underlying distribution, proposing a new seeded deletion detection algorithm, and employing statistical and information-theoretic tools, we derive sufficient conditions on the database growth rate for successful matching. Our findings demonstrate that a double-logarithmic seed size relative to row size ensures successful deletion detection. More importantly, we show that the derived sufficient conditions are the same as in the distribution-aware setting, negating any asymptotic loss of performance due to unknown underlying distributions.

Original language: English (US)
Title of host publication: WIFS 2023 - IEEE Workshop on Information Forensics and Security
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350324914
State: Published - 2023
Event: 2023 IEEE International Workshop on Information Forensics and Security, WIFS 2023 - Nurnberg, Germany
Duration: Dec 4 2023 - Dec 7 2023

Publication series

Name: WIFS 2023 - IEEE Workshop on Information Forensics and Security

Conference

Conference: 2023 IEEE International Workshop on Information Forensics and Security, WIFS 2023
Country/Territory: Germany
City: Nurnberg
Period: 12/4/23 - 12/7/23

Keywords

  • alignment
  • database
  • dataset
  • de-anonymization
  • distribution-agnostic
  • matching
  • obfuscation
  • privacy
  • synchronization

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Signal Processing
