TY - GEN
T1 - Vaccine
T2 - 2019 World Wide Web Conference, WWW 2019
AU - Shvartzshnaider, Yan
AU - Wies, Thomas
AU - Pavlinovic, Zvonimir
AU - Lakshminarayanan,
AU - Mittal, Prateek
AU - Balashankar, Ananth
AU - Nissenbaum, Helen
N1 - Funding Information:
We thank Paula Kift and Schrasing Tong for their help in the initial stage of this work. This work is supported by the National Science Foundation (NSF) under grants CCF-1350574, CNS-1514422, CNS-1801501, CNS-1704527, SES-1537324, and SES-1650589; the National Security Agency grant H98230-18-D-006; and the Cisco CG-653005 Research Award. We are also grateful to the NSF I-CORPS 1650769 grant that allowed us to discuss the data leakage problem with a large number of companies and ultimately helped us articulate the VACCINE problem.
Publisher Copyright:
© 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
PY - 2019/5/13
Y1 - 2019/5/13
N2 - Modern enterprises rely on Data Leakage Prevention (DLP) systems to enforce privacy policies that prevent unintentional flow of sensitive information to unauthorized entities. However, these systems operate based on rule sets that are limited to syntactic analysis and therefore completely ignore the semantic relationships between participants involved in the information exchanges. For similar reasons, these systems cannot enforce complex privacy policies that require temporal reasoning about events that have previously occurred. To address these limitations, we advocate a new design methodology for DLP systems centered on the notion of Contextual Integrity (CI). We use the CI framework to abstract real-world communication exchanges into formally defined information flows where privacy policies describe sequences of admissible flows. CI allows us to decouple (1) the syntactic extraction of flows from information exchanges, and (2) the enforcement of privacy policies on these flows. We applied this approach to build VACCINE, a DLP auditing system for emails. VACCINE uses state-of-the-art techniques in natural language processing to extract flows from email text. It also provides a declarative language for describing privacy policies. These policies are automatically compiled to operational rules that the system uses for detecting data leakages. We evaluated VACCINE on the Enron email corpus and show that it improves over the state of the art both in the expressivity of the policies that DLP systems can enforce and in its precision in detecting data leakages.
AB - Modern enterprises rely on Data Leakage Prevention (DLP) systems to enforce privacy policies that prevent unintentional flow of sensitive information to unauthorized entities. However, these systems operate based on rule sets that are limited to syntactic analysis and therefore completely ignore the semantic relationships between participants involved in the information exchanges. For similar reasons, these systems cannot enforce complex privacy policies that require temporal reasoning about events that have previously occurred. To address these limitations, we advocate a new design methodology for DLP systems centered on the notion of Contextual Integrity (CI). We use the CI framework to abstract real-world communication exchanges into formally defined information flows where privacy policies describe sequences of admissible flows. CI allows us to decouple (1) the syntactic extraction of flows from information exchanges, and (2) the enforcement of privacy policies on these flows. We applied this approach to build VACCINE, a DLP auditing system for emails. VACCINE uses state-of-the-art techniques in natural language processing to extract flows from email text. It also provides a declarative language for describing privacy policies. These policies are automatically compiled to operational rules that the system uses for detecting data leakages. We evaluated VACCINE on the Enron email corpus and show that it improves over the state of the art both in the expressivity of the policies that DLP systems can enforce and in its precision in detecting data leakages.
KW - Contextual Integrity
KW - DLP
KW - Data Leakage Detection
KW - Privacy
UR - http://www.scopus.com/inward/record.url?scp=85066901764&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066901764&partnerID=8YFLogxK
U2 - 10.1145/3308558.3313655
DO - 10.1145/3308558.3313655
M3 - Conference contribution
AN - SCOPUS:85066901764
T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
SP - 1702
EP - 1712
BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
PB - Association for Computing Machinery, Inc
Y2 - 13 May 2019 through 17 May 2019
ER -