YouTube Most Popular Videos

Learning Engagement With Bursting Velocity

Faysal.El.Khettabi@gmail.com

Introduction and Motivation

Every minute, creators, channels and multi-channel networks (MCNs) around the world upload roughly 300 hours or more of video content to YouTube. The platform has roughly one billion viewers or more, who watch millions of hours of video, which makes it a prime venue for advertisers. Consequently, YouTube generates billions in revenue through advertising; see YouTube-Differential-Analytics.

A striking phenomenon on this online platform is that a few videos become hugely popular while the majority receive very little attention. It is hard to believe that content alone determines the popularity of a video; it is likely that an initial burst in popularity largely determines future popularity. Many factors affect how the popularity of a YouTube video evolves over time, and several studies have analyzed them; see On the Dynamics of Social Media Popularity: A YouTube Case Study and Engagement dynamics and sensitivity analysis of YouTube videos. These works are very informative and inspiring.

Our first concern is the dataset used to derive these insights. The conclusions depend on the nature and characteristics of that dataset, and applying the results to other YouTube datasets may introduce biases. A second, more serious concern is that the datasets contain only daily view-count samples of YouTube videos. In the Numerical Facts section, we challenge the characteristics of these datasets with a dataset formed "around the clock" from the most popular videos on YouTube.

We think that early, frequent sampling, for instance every 10 minutes, is more informative than a daily view-count. Early detection of video popularity is more effective, and it matters for designs that exploit the early and subsequent network effect to improve the planning of online advertising campaigns, estimate costs and foster more revenue for all parties involved in the platform.

A video popularity burst on YouTube can be classified into three main types, which arise from different origins and mechanisms.

Types I and II are the most substantial on YouTube, especially for targeted advertising that focuses on particular socio-economic patterns among viewers; these patterns help advertisers optimize the dynamics of their online advertisements. These types contain the most popular videos, which have more than 100 million view-counts daily. Such key videos are important for further social media analysis, because the activity observed around them during an interval of time (the bursting velocity) may yield informative insights for personalized advertising or online marketing. However, the complex graph structure connecting these key videos to the YouTube network needs to be handled with an appropriate and efficient approach. Viewers of these types of videos are attracted mainly by "controversial" or "sensitive" subjects and by specific events, which shapes the platform into a trending streaming medium (the attraction arises from different origins and mechanisms, which need to be explored).

It seems that viewers are often attracted to a video that has been made popular exogenously (the first initial burst), and that the subsequent popularity of the video (the bursting velocity) is then induced more endogenously. We call this process a learning engagement, which inspires limitless possibilities and empowers video discovery and popularity on YouTube: viewers sustain their interest over an interval of time (bursting engagement) and are positive enough about the content to recommend it to other viewers (adaptive engagement). A chain interaction occurs, attracting more subsequent viewers. During a specific period of time, the cycle repeats and sustains more interactions, which can lead to a self-propagating series of viewers who watch or explore the video. A trend is thus created for the video; it may reflect the eagerness or enthusiasm of the viewers, or something else that remains to be identified and is not really related to "keenness".

This resulting trend is an important point of difference between the old TV era and the new TV era. Viewers not only "watch what they want to watch, when they want to watch it"; they also create the trend, which can be interpreted socially, economically and commercially and may be exploited more efficiently in planning online advertising campaigns. With a substantial critical mass from the created trend, the network effect forms easily, which helps the platform, the channels and the advertisers grow financially.

Numerical Facts

We investigate the burst phenomenon on YouTube by collecting the most popular videos and sampling their view-counts every 10 minutes. Our approach is first to detect the initial popularity burst of a video and then to follow its subsequent popularity in the streaming data environment. The collected data will be subject to further statistical analysis.
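The text does not specify how a burst is detected, so the following is only a minimal Python sketch of one possible rule, not the author's method. It assumes view-count samples taken roughly every 600 seconds, computes the velocity of each interval, and flags the first interval whose velocity exceeds a multiple of the median velocity. The function names, the sample data and the `factor` threshold are all hypothetical.

```python
from statistics import median


def velocities(samples):
    """Return (interval_end_time, view-counts per second) for consecutive samples.

    samples: list of (unix_seconds, view_count) pairs sorted by time.
    """
    result = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order samples
        result.append((t1, (v1 - v0) / dt))
    return result


def first_burst(samples, factor=3.0):
    """Return the end time of the first interval whose velocity exceeds
    factor * median velocity, or None if no interval qualifies.

    The median baseline and the factor are illustrative assumptions.
    """
    vel = velocities(samples)
    if not vel:
        return None
    baseline = median(v for _, v in vel)
    for t, v in vel:
        if v > 0 and v > factor * baseline:
            return t
    return None


# Hypothetical samples, one every 10 minutes (600 s).
samples = [(0, 1000), (600, 1600), (1200, 2200),
           (1800, 8200), (2400, 9000), (3000, 9600)]
burst_time = first_burst(samples)
print("first burst detected at t =", burst_time)
```

A median baseline is used here because a single very large interval would inflate a mean and could hide the burst it is meant to reveal; other baselines are equally plausible.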

We used a combination of R-Tuber and R-Google Analytics to access the YouTube API. We extended the two packages with custom R code to improve data collection, statistical analysis and visualization.
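The collection itself was done with the R packages named above, and that code is not shown. As a rough illustration only, the Python sketch below builds a request for the `videos` endpoint of the YouTube Data API v3 with `chart=mostPopular` and reduces a response to the fields needed for view-count tracking. The helper names, the placeholder API key and the sample payload are assumptions, and no network call is made.

```python
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def most_popular_url(api_key, max_results=50, region_code=None):
    """Build the request URL for the current most popular videos."""
    params = {
        "part": "snippet,statistics",
        "chart": "mostPopular",
        "maxResults": max_results,
        "key": api_key,
    }
    if region_code is not None:
        params["regionCode"] = region_code
    return API_URL + "?" + urlencode(params)


def parse_items(payload):
    """Reduce a decoded JSON response to id, category id and view-count."""
    rows = []
    for item in payload.get("items", []):
        stats = item.get("statistics", {})
        rows.append({
            "video_id": item["id"],
            "category_id": item["snippet"]["categoryId"],
            # The API returns counts as strings.
            "view_count": int(stats.get("viewCount", 0)),
        })
    return rows


# Hypothetical payload in the shape returned by the endpoint.
sample_payload = {
    "items": [
        {"id": "abc123", "snippet": {"categoryId": "10"},
         "statistics": {"viewCount": "12345"}},
    ]
}
url = most_popular_url("YOUR_API_KEY")
rows = parse_items(sample_payload)
print(url)
print(rows)
```

Calling such a request every 10 minutes and appending the parsed rows with a timestamp would give the kind of sampled view-count series described in this section.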

R is a software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The interactive data visualizations in web browsers use the NVD3 data visualization library. The R-NVD3 library is part of the rCharts project.

Our batch processing extracts approximately the 50 most popular videos every 10 minutes. The videos are collected into a single dataset and followed from the start time, "2017-01-30 19:56:05" (initial view-count = 424872083), to the end time, "2017-01-31 11:05:06" (updated view-count = 470385011). The list has around 400 videos, and the total view-count increased by 45512928 over this period. At any time, the YouTube network can exhibit key videos in any video category.
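The figures in this paragraph are enough to work out the overall velocity of the collected list. The short Python sketch below takes the start and end timestamps and the two view-counts exactly as quoted, and computes the elapsed time, the view-count increase and the average number of view-counts per second. The variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Values quoted in the text.
start = datetime.strptime("2017-01-30 19:56:05", FMT)
end = datetime.strptime("2017-01-31 11:05:06", FMT)
initial_views = 424872083
updated_views = 470385011

elapsed_seconds = (end - start).total_seconds()
view_increase = updated_views - initial_views
average_velocity = view_increase / elapsed_seconds  # view-counts per second

print(f"elapsed: {elapsed_seconds:.0f} s")
print(f"increase: {view_increase} view-counts")
print(f"average velocity: {average_velocity:.2f} view-counts per second")
```

The result is an average over the whole window of a little more than 15 hours; the 10-minute samples are what make it possible to see how the velocity varies inside that window.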

For instance, the music category represents about 6% of all categories, yet its videos account for about 25% of all view-counts. Its velocity, the change in view-count over time, is about 325 view-counts per second.

The gaming category represents about 3% of all categories, and its videos account for about 2% of all view-counts. Its velocity is about 4 view-counts per second. One of the works cited above used a dataset mainly representative of the gaming category, which is not the main category in our dataset of the most popular videos on YouTube.

The table below lists the characteristics of the data by category:
Category                ID  Category proportion (%)  Videos view-count ratio  Velocity (view-counts/s)
Music                   10  6.05                     24.81                    325.06
Entertainment           24  19.15                    8.75                     142.74
News & Politics         25  18.59                    8.21                     64.01
Science & Technology    28  5.09                     17.94                    60.22
People & Blogs          22  10.19                    10.39                    56.8
Education               27  7.95                     16.04                    43.55
Comedy                  23  8.73                     3.93                     43.04
Film & Animation        1   4.64                     4.21                     38.83
Sports                  17  6.34                     22.34                    33.87
How-to & Style          26  7.85                     5.15                     16.65
Gaming                  20  3.2                      1.62                     4.19
Pets & Animals          15  0.77                     11.38                    4.06
Cars & Vehicles         2   0.58                     4.24                     1.02
Travel & Events         19  0.58                     2.55                     0.23
Non-profits & Activism  29  0.29                     0.21                     0.2
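The table can be read in relative terms as well. The Python sketch below copies the category proportion and velocity columns, checks their totals, and expresses each category's velocity as a share of the total. The per-category velocities sum to about 834 view-counts per second, which agrees with the overall increase of 45512928 view-counts over the collection window quoted earlier. The velocity share is a derived quantity introduced here for illustration; it is not a column of the original table.

```python
# (category, category proportion in %, velocity in view-counts per second),
# copied from the table above.
TABLE = [
    ("Music", 6.05, 325.06),
    ("Entertainment", 19.15, 142.74),
    ("News & Politics", 18.59, 64.01),
    ("Science & Technology", 5.09, 60.22),
    ("People & Blogs", 10.19, 56.8),
    ("Education", 7.95, 43.55),
    ("Comedy", 8.73, 43.04),
    ("Film & Animation", 4.64, 38.83),
    ("Sports", 6.34, 33.87),
    ("How-to & Style", 7.85, 16.65),
    ("Gaming", 3.2, 4.19),
    ("Pets & Animals", 0.77, 4.06),
    ("Cars & Vehicles", 0.58, 1.02),
    ("Travel & Events", 0.58, 0.23),
    ("Non-profits & Activism", 0.29, 0.2),
]

total_proportion = sum(p for _, p, _ in TABLE)
total_velocity = sum(v for _, _, v in TABLE)

# Share of the total velocity contributed by each category, in percent.
velocity_share = {name: 100 * v / total_velocity for name, _, v in TABLE}
fastest = max(TABLE, key=lambda row: row[2])[0]

print(f"total proportion: {total_proportion:.2f} %")
print(f"total velocity: {total_velocity:.2f} view-counts per second")
for name, share in sorted(velocity_share.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{name}: {share:.1f} % of total velocity")
```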


Bursting Velocity

The bursting velocity is in:



Discussion and Conclusion:

We ran a number of data batches, and the categories show concise statistics, as listed in the table above. However, per-video statistics provide more dynamic insights. For instance, during the weekend the data show that videos in the entertainment category are viewed more than videos in the music category, whereas during the week videos in the music category are the most viewed. More statistical tests are certainly needed, using more overlapping data batches and study designs from biostatistics such as longitudinal studies.

In general, temporal dynamic insight is better suited to extracting real-time information from the observed bursting velocity of the most popular videos on YouTube. A mobile insight solution is a suitable way to illuminate the space between the platform, the channels and the advertisers, enabling them to improve their propositions and business models, especially regarding the monetization problem faced by YouTube; see YouTube-Differential-Analytics.

My Lonely Honey Video




Author scientific profile:

Statistics and applied mathematics for data analytics. Identifying opportunities to apply mathematical statistics, numerical methods, machine learning and pattern recognition to investigate and implement solutions in the field of data content analytics. Data prediction via computational methods on massive amounts of data (Big Data content). These methods include clustering, regression, survival analysis, neural networks, classification, ranking and deep discrepancy learning.

Author: Faysal.El.Khettabi@gmail.com, living in Vancouver, BC, Canada.
The MIT License (MIT) Copyright 1994-2017, Faysal El Khettabi, Numerics&Analytics, All Rights Reserved.