Reproducing Popularity Distribution of YouTube Videos

Bibliographic Details
Published in: IEEE eTransactions on network and service management, 2019-09, Vol. 16 (3), p. 1100-1112
Main Authors: Kamiyama, Noriaki; Murata, Masayuki
Format: Article
Language: English
Description
Summary: To deliver user-generated content (UGC) video streams with high quality and at low cost by maximizing the effect of content delivery networks (CDNs), CDN providers need to design CDN cache servers appropriately, which requires accurately estimating the UGC view-count distribution. To achieve this goal in a practical time frame, we need to construct a simple time-series model that captures the transition of UGC popularity. Therefore, in this paper, we first analyze the daily view count (DVC) of YouTube videos over nine months and find that the DVC of YouTube videos obeys a lognormal distribution. As a simple time-series model of the DVC of each YouTube video, we propose the grouped MPP (gMPP), which extends the multiplicative process (MPP), a widely known simple time-series model that generates a lognormal distribution. We also propose reproducing the DVC distribution of YouTube videos by using a superposed gMPP (SgMPP) that aggregates multiple gMPPs. The SgMPP can accurately reproduce the DVC distribution of YouTube videos with low computational overhead, so we can expect to use it as the input to computer simulations for designing network components that depend on the popularity distribution of UGC, e.g., cache capacities. Through numerical evaluation, we confirm that the storage capacity of a cache server can be adequately designed with an average error rate of several percent against the target cache hit ratio.
ISSN: 1932-4537
DOI: 10.1109/TNSM.2019.2914222
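
The summary describes the multiplicative process (MPP) as a simple time-series model that generates a lognormal distribution. The sketch below illustrates only that generic idea and is not the authors' gMPP or SgMPP, whose details this record does not give. Each simulated series is multiplied every day by an independent positive random factor, so its logarithm is a sum of independent terms and becomes approximately normal. The function names, the uniform multiplier range, the starting value, and the 270-day horizon (roughly nine months) are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_mpp(n_series, n_days, x0=100.0, low=0.8, high=1.25, seed=0):
    """Simulate independent multiplicative processes and return final values.

    Hypothetical illustration: x0, low, high are assumed values, not
    parameters taken from the paper.
    """
    if low <= 0 or high <= low:
        raise ValueError("need 0 < low < high")
    rng = random.Random(seed)
    finals = []
    for _ in range(n_series):
        x = x0
        for _ in range(n_days):
            # Multiply by an independent positive random factor each day.
            x *= rng.uniform(low, high)
        finals.append(x)
    return finals


def skewness(values):
    """Population skewness: near 0 for symmetric data, large for heavy right tails."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


if __name__ == "__main__":
    finals = simulate_mpp(n_series=2000, n_days=270, seed=1)
    logs = [math.log(v) for v in finals]
    # Raw values are strongly right-skewed; their logarithms are roughly symmetric.
    print("skewness of raw values:", skewness(finals))
    print("skewness of log values:", skewness(logs))
```

Comparing the skewness of the raw values with that of their logarithms is a quick, informal check of lognormality; a formal goodness-of-fit test would be needed for any real analysis.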