TY - GEN
T1 - What’s in a name
T2 - 16th International Conference on Database Systems for Advanced Applications, DASFAA 2011
AU - Tang, Cong
AU - Ross, Keith
AU - Saxena, Nitesh
AU - Chen, Ruichuan
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2011.
PY - 2011
Y1 - 2011
N2 - In this paper, by crawling Facebook public profile pages of a large and diverse user population in New York City, we create a comprehensive and contemporary first name list, in which each name is annotated with a popularity estimate and a gender probability. First, we use the name list as part of a novel and powerful technique for inferring Facebook users’ gender. Our name-centric approach to gender prediction partitions the users into two groups, A and B, and is able to accurately predict genders for users belonging to A. Applying our methodology to NYC users in Facebook, we are able to achieve an accuracy of 95.2% for group A consisting of 95.1% of the NYC users. This is a significant improvement over recent results of gender prediction [14], which achieved a maximum accuracy of 77.2% based on users’ group affiliations. Second, having inferred the gender of most users in our Facebook dataset, we learn several interesting gender characteristics and analyze how males and females behave in Facebook. We find, for example, that females and males exhibit contrasting behaviors while hiding their attributes, such as gender, age, and sexual preference, and that females are more conscious about their online privacy on Facebook.
UR - http://www.scopus.com/inward/record.url?scp=85012302520&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85012302520&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-20244-5_33
DO - 10.1007/978-3-642-20244-5_33
M3 - Conference contribution
AN - SCOPUS:85012302520
SN - 9783642202438
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 344
EP - 356
BT - Database Systems for Advanced Applications - 16th International Conference, DASFAA 2011, International Workshops
A2 - Xu, Jianliang
A2 - Yu, Ge
A2 - Zhou, Shuigeng
A2 - Unland, Rainer
PB - Springer Verlag
Y2 - 22 April 2011 through 25 April 2011
ER -