TY - JOUR
T1 - Unraveling the BitTorrent ecosystem
AU - Zhang, Chao
AU - Dhungel, Prithula
AU - Wu, Di
AU - Ross, Keith W.
N1 - Funding Information:
This work has been in part supported by the US National Science Foundation (NSF) grant CNS-0917767, the Special Fund for Basic Scientific Research of Central Colleges, Sun Yat-Sen University (grant no. 2009-35000-3161425/ 09LGPY56) and Sun Yat-Sen University “Hundred Talents Program” (grant no. 35000-3226138).
PY - 2011
Y1 - 2011
N2 - BitTorrent is the most successful open Internet application for content distribution. Despite its importance, both in terms of its footprint in the Internet and the influence it has on emerging P2P applications, the BitTorrent Ecosystem is only partially understood. We seek to provide a nearly complete picture of the entire public BitTorrent Ecosystem. To this end, we crawl five of the most popular torrent-discovery sites over a nine-month period, identifying all of 4.6 million torrents and 38,996 trackers that the sites reference. We also develop a high-performance tracker crawler, and over a narrow window of 12 hours, crawl essentially all of the public Ecosystem's trackers, obtaining peer lists for all referenced torrents. Complementing the torrent-discovery site and tracker crawling, we further crawl Azureus and Mainline DHTs for a random sample of torrents. Our resulting measurement data are more than an order of magnitude larger (in terms of number of torrents, trackers, or peers) than any earlier study. Using this extensive data set, we study in depth the Ecosystem's torrent-discovery, tracker, peer, user behavior, and content landscapes. For peer statistics, the analysis is based on one typical snapshot obtained over 12 hours. We further analyze the fragility of the Ecosystem upon the removal of its most important tracker service.
AB - BitTorrent is the most successful open Internet application for content distribution. Despite its importance, both in terms of its footprint in the Internet and the influence it has on emerging P2P applications, the BitTorrent Ecosystem is only partially understood. We seek to provide a nearly complete picture of the entire public BitTorrent Ecosystem. To this end, we crawl five of the most popular torrent-discovery sites over a nine-month period, identifying all of 4.6 million torrents and 38,996 trackers that the sites reference. We also develop a high-performance tracker crawler, and over a narrow window of 12 hours, crawl essentially all of the public Ecosystem's trackers, obtaining peer lists for all referenced torrents. Complementing the torrent-discovery site and tracker crawling, we further crawl Azureus and Mainline DHTs for a random sample of torrents. Our resulting measurement data are more than an order of magnitude larger (in terms of number of torrents, trackers, or peers) than any earlier study. Using this extensive data set, we study in depth the Ecosystem's torrent-discovery, tracker, peer, user behavior, and content landscapes. For peer statistics, the analysis is based on one typical snapshot obtained over 12 hours. We further analyze the fragility of the Ecosystem upon the removal of its most important tracker service.
KW - BitTorrent Ecosystem
KW - content distribution
KW - measurement
KW - peer-to-peer
UR - http://www.scopus.com/inward/record.url?scp=79957594613&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79957594613&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2010.123
DO - 10.1109/TPDS.2010.123
M3 - Article
AN - SCOPUS:79957594613
SN - 1045-9219
VL - 22
SP - 1164
EP - 1177
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 7
M1 - 5482574
ER -