TY - GEN
T1 - Mining groups of common interest
T2 - 9th International Conference on Machine Learning and Data Mining, MLDM 2013
AU - Li, Liyun
AU - Memon, Nasir
PY - 2013
Y1 - 2013
N2 - This paper tackles the problem of detecting topical communities from within an organization by mining readily available network access pattern information. A Bayesian generative process is used to model the behavior of a user's network access pattern and thereby her consumption of online content. The idea is that users within the same topical interest group tend to share similar online access patterns. By leveraging this pattern, along with side information of domain names and keywords within the accessed websites, one is able to model these observations under the framework of a mixed membership statistical model. Hence the access patterns of users-to-websites, as measured at the edge of an organization's network boundary, can be decomposed into constituent topical communities without any human effort in selecting specific features. Experimental results on a real-world network flow trace demonstrate that the proposed method can effectively detect topically meaningful community structures. Besides achieving better community detection accuracy than other community detection methods, the proposed method can detect interesting but non-evident hidden communities which cannot readily be detected by other known methods.
AB - This paper tackles the problem of detecting topical communities from within an organization by mining readily available network access pattern information. A Bayesian generative process is used to model the behavior of a user's network access pattern and thereby her consumption of online content. The idea is that users within the same topical interest group tend to share similar online access patterns. By leveraging this pattern, along with side information of domain names and keywords within the accessed websites, one is able to model these observations under the framework of a mixed membership statistical model. Hence the access patterns of users-to-websites, as measured at the edge of an organization's network boundary, can be decomposed into constituent topical communities without any human effort in selecting specific features. Experimental results on a real-world network flow trace demonstrate that the proposed method can effectively detect topically meaningful community structures. Besides achieving better community detection accuracy than other community detection methods, the proposed method can detect interesting but non-evident hidden communities which cannot readily be detected by other known methods.
KW - Generative Model
KW - LDA
KW - Network Flow
KW - Topical Community Detection
UR - http://www.scopus.com/inward/record.url?scp=84881259972&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84881259972&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-39712-7_31
DO - 10.1007/978-3-642-39712-7_31
M3 - Conference contribution
AN - SCOPUS:84881259972
SN - 9783642397110
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 405
EP - 420
BT - Machine Learning and Data Mining in Pattern Recognition - 9th International Conference, MLDM 2013, Proceedings
Y2 - 19 July 2013 through 25 July 2013
ER -