TY - GEN
T1 - Rhea
T2 - 5th IEEE International Conference on Big Data, Big Data 2017
AU - Liakos, Panagiotis
AU - Ntoulas, Alexandros
AU - Delis, Alex
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/1
Y1 - 2017/7/1
AB - Processing the full activity stream of a social network in real time is oftentimes prohibitive in terms of both storage and computational cost. One way to work around this problem is to take a sample of the social activity and use this sample to feed into applications such as content recommendation, opinion mining, or sentiment analysis. In this paper, we study the problem of extracting samples of authoritative content from a social activity stream. Specifically, we propose an adaptive stream sampling approach, termed Rhea, that processes a stream of social activity in real-time and samples the content of users that are more likely to provide influential information. To the best of our knowledge, Rhea is the first algorithm that dynamically adapts over time to account for evolving trends in the activity stream. Thus, we are able to capture high quality content from emerging users that contemporary white-list based methods ignore. We evaluate Rhea using two popular social networks reaching up to half a billion posts. Our results show that we significantly outperform previously proposed methods in terms of both recall and precision, while also offering remarkably more accurate ranking.
UR - http://www.scopus.com/inward/record.url?scp=85047789110&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047789110&partnerID=8YFLogxK
U2 - 10.1109/BigData.2017.8257984
DO - 10.1109/BigData.2017.8257984
M3 - Conference contribution
AN - SCOPUS:85047789110
T3 - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
SP - 686
EP - 695
BT - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
A2 - Nie, Jian-Yun
A2 - Obradovic, Zoran
A2 - Suzumura, Toyotaro
A2 - Ghosh, Rumi
A2 - Nambiar, Raghunath
A2 - Wang, Chonggang
A2 - Zang, Hui
A2 - Baeza-Yates, Ricardo
A2 - Hu, Xiaohua
A2 - Kepner, Jeremy
A2 - Cuzzocrea, Alfredo
A2 - Tang, Jian
A2 - Toyoda, Masashi
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 December 2017 through 14 December 2017
ER -