TY - GEN
T1 - Database Matching Under Adversarial Column Deletions
AU - Bakirtas, Serhat
AU - Erkip, Elza
N1 - Funding Information:
This work is supported by National Science Foundation grants 1815821 and 2148293.
Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The de-anonymization of users from anonymized microdata through matching or aligning with publicly-available correlated databases has been of scientific interest recently. While most of the rigorous analyses of database matching have focused on random-distortion models, the adversarial-distortion models have been wanting in the relevant literature. In this work, motivated by synchronization errors in the sampling of time-indexed microdata, matching (alignment) of random databases under adversarial column deletions is investigated. It is assumed that a constrained adversary, which observes the anonymized database, can delete up to a δ fraction of the columns (attributes) to hinder matching and preserve privacy. Column histograms of the two databases are utilized as permutation-invariant features to detect the column deletion pattern chosen by the adversary. The detection of the column deletion pattern is then followed by an exact row (user) matching scheme. The worst-case analysis of this two-phase scheme yields a sufficient condition for the successful matching of the two databases, under the near-perfect recovery condition. A more detailed investigation of the error probability leads to a tight necessary condition on the database growth rate, and in turn, to a single-letter characterization of the adversarial matching capacity. This adversarial matching capacity is shown to be significantly lower than the "random" matching capacity, where the column deletions occur randomly. Overall, our results analytically demonstrate the privacy-wise advantages of adversarial mechanisms over random ones during the publication of anonymized time-indexed data.
AB - The de-anonymization of users from anonymized microdata through matching or aligning with publicly-available correlated databases has been of scientific interest recently. While most of the rigorous analyses of database matching have focused on random-distortion models, the adversarial-distortion models have been wanting in the relevant literature. In this work, motivated by synchronization errors in the sampling of time-indexed microdata, matching (alignment) of random databases under adversarial column deletions is investigated. It is assumed that a constrained adversary, which observes the anonymized database, can delete up to a δ fraction of the columns (attributes) to hinder matching and preserve privacy. Column histograms of the two databases are utilized as permutation-invariant features to detect the column deletion pattern chosen by the adversary. The detection of the column deletion pattern is then followed by an exact row (user) matching scheme. The worst-case analysis of this two-phase scheme yields a sufficient condition for the successful matching of the two databases, under the near-perfect recovery condition. A more detailed investigation of the error probability leads to a tight necessary condition on the database growth rate, and in turn, to a single-letter characterization of the adversarial matching capacity. This adversarial matching capacity is shown to be significantly lower than the "random" matching capacity, where the column deletions occur randomly. Overall, our results analytically demonstrate the privacy-wise advantages of adversarial mechanisms over random ones during the publication of anonymized time-indexed data.
UR - http://www.scopus.com/inward/record.url?scp=85165021854&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85165021854&partnerID=8YFLogxK
U2 - 10.1109/ITW55543.2023.10161615
DO - 10.1109/ITW55543.2023.10161615
M3 - Conference contribution
AN - SCOPUS:85165021854
T3 - 2023 IEEE Information Theory Workshop, ITW 2023
SP - 181
EP - 185
BT - 2023 IEEE Information Theory Workshop, ITW 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE Information Theory Workshop, ITW 2023
Y2 - 23 April 2023 through 28 April 2023
ER -