TY - GEN
T1 - Linguistic properties of multi-word passphrases
AU - Bonneau, Joseph
AU - Shutova, Ekaterina
PY - 2012
Y1 - 2012
N2 - We examine patterns of human choice in a passphrase-based authentication system deployed by Amazon, a large online merchant. We tested the availability of a large corpus of over 100,000 possible phrases at Amazon's registration page, which prohibits using any phrase already registered by another user. A number of large, readily available lists such as movie and book titles prove effective in guessing attacks, suggesting that passphrases are vulnerable to dictionary attacks like all schemes involving human choice. Extending our analysis with natural language phrases extracted from linguistic corpora, we find that phrase selection is far from random, with users strongly preferring simple noun bigrams that are common in natural language. The distribution of chosen passphrases is less skewed than the distribution of bigrams in English text, indicating that some users have attempted to choose phrases randomly. Still, the distribution of bigrams in natural language is not nearly random enough to resist offline guessing, nor are longer three- or four-word phrases, for which we see rapidly diminishing returns.
UR - http://www.scopus.com/inward/record.url?scp=84868356659&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84868356659&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-34638-5_1
DO - 10.1007/978-3-642-34638-5_1
M3 - Conference contribution
AN - SCOPUS:84868356659
SN - 9783642346378
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 12
BT - Financial Cryptography and Data Security - FC 2012 Workshops, USEC and WECSR 2012, Revised Selected Papers
PB - Springer Verlag
T2 - 16th International Conference on Financial Cryptography and Data Security - FC 2012 Workshops, USEC and WECSR 2012
Y2 - 2 March 2012
ER -