TY - GEN
T1 - Use fewer instances of the letter "i"
T2 - 12th International Symposium on Privacy Enhancing Technologies, PETS 2012
AU - McDonald, Andrew W.E.
AU - Afroz, Sadia
AU - Caliskan, Aylin
AU - Stolerman, Ariel
AU - Greenstadt, Rachel
PY - 2012
Y1 - 2012
N2 - This paper presents Anonymouth, a novel framework for anonymizing writing style. Without accounting for style, anonymous authors risk identification. This framework is necessary to provide a tool for testing the consistency of anonymized writing style and a mechanism for adaptive attacks against stylometry techniques. Our framework defines the steps necessary to anonymize documents and implements them. A key contribution of this work is this framework, including novel methods for identifying which features of documents need to change and how they must be changed to accomplish document anonymization. In our experiment, 80% of the user study participants were able to anonymize their documents in terms of a fixed corpus and limited feature set used. However, modifying pre-written documents was found to be difficult and the anonymization did not hold up to more extensive feature sets. It is important to note that Anonymouth is only the first step toward a tool to achieve stylometric anonymity with respect to state-of-the-art authorship attribution techniques. The topic needs further exploration in order to accomplish significant anonymity.
AB - This paper presents Anonymouth, a novel framework for anonymizing writing style. Without accounting for style, anonymous authors risk identification. This framework is necessary to provide a tool for testing the consistency of anonymized writing style and a mechanism for adaptive attacks against stylometry techniques. Our framework defines the steps necessary to anonymize documents and implements them. A key contribution of this work is this framework, including novel methods for identifying which features of documents need to change and how they must be changed to accomplish document anonymization. In our experiment, 80% of the user study participants were able to anonymize their documents in terms of a fixed corpus and limited feature set used. However, modifying pre-written documents was found to be difficult and the anonymization did not hold up to more extensive feature sets. It is important to note that Anonymouth is only the first step toward a tool to achieve stylometric anonymity with respect to state-of-the-art authorship attribution techniques. The topic needs further exploration in order to accomplish significant anonymity.
KW - anonymity
KW - machine learning
KW - privacy
KW - stylometry
UR - http://www.scopus.com/inward/record.url?scp=84864225669&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84864225669&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-31680-7_16
DO - 10.1007/978-3-642-31680-7_16
M3 - Conference contribution
AN - SCOPUS:84864225669
SN - 9783642316791
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 299
EP - 318
BT - Privacy Enhancing Technologies - 12th International Symposium, PETS 2012, Proceedings
Y2 - 11 July 2012 through 13 July 2012
ER -