TY - GEN
T1 - CookieGraph
T2 - 30th ACM SIGSAC Conference on Computer and Communications Security, CCS 2023
AU - Munir, Shaoor
AU - Siby, Sandra
AU - Iqbal, Umar
AU - Englehardt, Steven
AU - Shafiq, Zubair
AU - Troncoso, Carmela
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s)
PY - 2023/11/15
Y1 - 2023/11/15
N2 - As third-party cookie blocking is becoming the norm in mainstream web browsers, advertisers and trackers have started to use first-party cookies for tracking. To understand this phenomenon, we conduct a differential measurement study with versus without third-party cookies. We find that first-party cookies are used to store and exfiltrate identifiers to known trackers even when third-party cookies are blocked. As opposed to third-party cookie blocking, first-party cookie blocking is not practical because it would result in major breakage of website functionality. We propose CookieGraph, a machine learning-based approach that can accurately and robustly detect and block first-party tracking cookies. CookieGraph detects first-party tracking cookies with 90.18% accuracy, outperforming the state-of-the-art CookieBlock by 17.31%. We show that CookieGraph is robust against cookie name manipulation, while CookieBlock's accuracy drops by 15.87%. While blocking all first-party cookies results in major breakage on 32% of the sites with SSO logins, and CookieBlock reduces it to 10%, we show that CookieGraph does not cause any major breakage on these sites. Our deployment of CookieGraph shows that first-party tracking cookies are used on 89.86% of the top-million websites. We find that 96.61% of these first-party tracking cookies are in fact ghostwritten by third-party scripts embedded in the first-party context. We also find evidence of first-party tracking cookies being set by fingerprinting scripts. The most prevalent first-party tracking cookies are set by major advertising entities such as Google, Facebook, and TikTok.
AB - As third-party cookie blocking is becoming the norm in mainstream web browsers, advertisers and trackers have started to use first-party cookies for tracking. To understand this phenomenon, we conduct a differential measurement study with versus without third-party cookies. We find that first-party cookies are used to store and exfiltrate identifiers to known trackers even when third-party cookies are blocked. As opposed to third-party cookie blocking, first-party cookie blocking is not practical because it would result in major breakage of website functionality. We propose CookieGraph, a machine learning-based approach that can accurately and robustly detect and block first-party tracking cookies. CookieGraph detects first-party tracking cookies with 90.18% accuracy, outperforming the state-of-the-art CookieBlock by 17.31%. We show that CookieGraph is robust against cookie name manipulation, while CookieBlock's accuracy drops by 15.87%. While blocking all first-party cookies results in major breakage on 32% of the sites with SSO logins, and CookieBlock reduces it to 10%, we show that CookieGraph does not cause any major breakage on these sites. Our deployment of CookieGraph shows that first-party tracking cookies are used on 89.86% of the top-million websites. We find that 96.61% of these first-party tracking cookies are in fact ghostwritten by third-party scripts embedded in the first-party context. We also find evidence of first-party tracking cookies being set by fingerprinting scripts. The most prevalent first-party tracking cookies are set by major advertising entities such as Google, Facebook, and TikTok.
KW - cookies
KW - machine learning
KW - privacy
KW - tracking
KW - web security
UR - http://www.scopus.com/inward/record.url?scp=85168632108&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85168632108&partnerID=8YFLogxK
U2 - 10.1145/3576915.3616586
DO - 10.1145/3576915.3616586
M3 - Conference contribution
AN - SCOPUS:85168632108
T3 - CCS 2023 - Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
SP - 3490
EP - 3504
BT - CCS 2023 - Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
PB - Association for Computing Machinery, Inc
Y2 - 26 November 2023 through 30 November 2023
ER -