YouTube video replay data is unfortunately not available through YouTube’s API, and it’s not a simple scrape job to get, so I built an API for fetching the replay heatmap data.

Pretty cool, but turning that data into something actionable is always better. What if we could use that data to determine which moments are the most popular?

This has many applications (growth, business, analytics) that I won’t get into here, but I will talk about how I went about processing the data to reliably determine what the most popular moments are from YouTube videos that contain replay data.

## Visualizing the Data

The video used to test is: https://www.youtube.com/watch?v=RusBe_8arLQ

This video has a replay heatmap like so:

The first thing to do is make a nice chart of the replays over time:

We’ll want to smooth that out to make it easier to process since there are so many little local peaks (we only care about the major ones). I wrote a simple algorithm for smoothing that works like this:

For each point:

- Get the point to the left (if exists)

- Get the point to the right (if exists)

- Take the average of the point, 1/3 of the left point, and 1/3 of the right point.

- Multiply the result by 1.9 to bring the final number back up

This algorithm is very primitive: it just gives the points to the left and right a bit of influence over the center. Even so, it’s highly effective at smoothing the data. Since all of these values are relative anyway (from 0 to 1), we don’t need to maintain exact numbers.
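Here is a minimal Python sketch of the smoothing steps above. It is not the exact implementation: the function and parameter names are made up for illustration, and it assumes that a missing neighbor is simply left out of the average, which the steps above don't spell out.

```python
def smooth(values, neighbor_weight=1 / 3, gain=1.9):
    """Smooth a list of relative replay values (roughly 0 to 1)."""
    smoothed = []
    for i, value in enumerate(values):
        # Start with the point itself at full weight.
        terms = [value]
        # Add 1/3 of the left neighbor, if there is one.
        if i > 0:
            terms.append(values[i - 1] * neighbor_weight)
        # Add 1/3 of the right neighbor, if there is one.
        if i < len(values) - 1:
            terms.append(values[i + 1] * neighbor_weight)
        # Average the terms that exist (assumption), then scale back up.
        smoothed.append(gain * sum(terms) / len(terms))
    return smoothed
```

With these weights, a flat stretch at 0.9 comes out at about 0.95, so the 1.9 multiplier roughly restores the original scale. Results can land slightly above 1, which is fine since only relative heights matter.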

The final result is far smoother and easier to process. We can see how many fewer peaks there are, and how much clearer the most popular moments have become:

## Processing the Data

The next step is to simply find all of the local maxima and minima. In simple terms, that means just finding all the relative high values and low values. We’ll put the high values in green and low values in blue:
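As a rough sketch, finding the local maxima and minima can be as simple as comparing each point to its two neighbors. The function name is hypothetical, and this version assumes strict comparisons, so flat plateaus and the first and last points are never reported.

```python
def find_extrema(values):
    """Return (maxima, minima) as lists of indices into values."""
    maxima, minima = [], []
    # Skip the endpoints: they only have one neighbor to compare against.
    for i in range(1, len(values) - 1):
        left, here, right = values[i - 1], values[i], values[i + 1]
        if here > left and here > right:
            maxima.append(i)  # strictly higher than both neighbors
        elif here < left and here < right:
            minima.append(i)  # strictly lower than both neighbors
    return maxima, minima
```

For example, `find_extrema([0.1, 0.5, 0.2, 0.8, 0.3])` reports peaks at indices 1 and 3 and a valley at index 2.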

If they don’t look entirely lined up, it’s because I added a pretty aggressive bezier to the graph, which definitely skews where the highest point appears visually.

Another quality-of-life adjustment I made is to ignore local maxima that are less than 45% of the largest value in the graph. Relative to the rest of the video, these moments aren’t very popular.
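That cutoff could be sketched like this, assuming the maxima are stored as indices into the list of values (the names here are illustrative):

```python
def filter_maxima(values, maxima, min_ratio=0.45):
    """Keep only the maxima that reach min_ratio of the largest value."""
    if not maxima:
        return []
    cutoff = min_ratio * max(values)
    # Anything below the cutoff is dropped as "not very popular".
    return [i for i in maxima if values[i] >= cutoff]
```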

For a quick sanity check, we can project those points to the raw data and see how we immediately remove many small bumps from consideration:

Immediately we can see that there are some sections that we’d want to consider being the same moment:

In order to group these divisions into moments, we want to walk out from each maximum and determine whether we should consider the next maximum as part of the same moment or not.

We can introduce another algorithm that works as follows:

Sort the local maxima in descending order by value, and for each local maximum:

- Walk left and right from the current maximum to find the nearest neighboring maximum

- Find all local minima between the current maximum and that neighbor

- If the neighbor is too close, remove it along with the minima in between

- If it is far enough away, but all of the minima in between are greater than 65% of both maxima, remove those minima and the neighboring maximum

- If it is too far away, skip it

- Repeat until a full pass no longer removes anything
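The steps above could look something like the following Python sketch. It is one possible reading rather than the exact implementation: the names are hypothetical, distances are measured in sample indices, `min_gap` and `max_gap` stand in for "too close" and "too far", and I've assumed that a shorter peak never removes a taller neighbor.

```python
def nearest_maximum(maxima, peak, direction):
    """Closest maximum to the left (-1) or right (+1) of peak, or None."""
    if direction < 0:
        return max((m for m in maxima if m < peak), default=None)
    return min((m for m in maxima if m > peak), default=None)


def merge_maxima(values, maxima, minima, min_gap, max_gap, valley_ratio=0.65):
    """Fold neighboring maxima that belong to the same moment into one."""
    maxima, minima = set(maxima), set(minima)
    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        # Visit the tallest peaks first.
        for peak in sorted(maxima, key=lambda i: values[i], reverse=True):
            if peak not in maxima:
                continue  # already merged into a taller peak this pass
            for direction in (-1, 1):
                neighbor = nearest_maximum(maxima, peak, direction)
                # Assumption: only a taller peak may absorb a shorter one.
                if neighbor is None or values[neighbor] > values[peak]:
                    continue
                lo, hi = min(peak, neighbor), max(peak, neighbor)
                if hi - lo > max_gap:
                    continue  # too far apart: leave both alone
                between = [m for m in minima if lo < m < hi]
                # "Greater than 65% of both maxima" means above the taller one.
                floor = valley_ratio * max(values[peak], values[neighbor])
                shallow = all(values[m] > floor for m in between)
                if hi - lo < min_gap or shallow:
                    maxima.discard(neighbor)
                    minima.difference_update(between)
                    changed = True
    return sorted(maxima), sorted(minima)
```

With `values = [0.1, 0.9, 0.8, 0.85, 0.2, 0.7, 0.1]`, the peak at index 3 sits behind a shallow valley (0.8) next to the taller peak at index 1, so it gets folded in, while the peak at index 5 survives because the valley at index 4 drops to 0.2.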

This process will remove all of the trailing local maxima highlighted in red above that should clearly be the same “moment”:

Now from each of the local maxima (green), we walk left and right until we hit the first local minimum. That is our “popular moment”.

To illustrate the bounds of the moments, I’ll create red boxes with black borders around each of those moments:

BOOM! We’ve now just calculated the most popular moments on YouTube videos! Note that popular moments can end either at the local minima, or the start/end of a video.
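A small sketch of that walk, assuming maxima and minima are stored as sample indices and `n_points` is the number of samples in the heatmap (all names are illustrative):

```python
def moment_bounds(maxima, minima, n_points):
    """Return (start, peak, end) index triples, one per popular moment."""
    moments = []
    for peak in sorted(maxima):
        # Nearest minimum on the left, or the start of the video.
        start = max((m for m in minima if m < peak), default=0)
        # Nearest minimum on the right, or the end of the video.
        end = min((m for m in minima if m > peak), default=n_points - 1)
        moments.append((start, peak, end))
    return moments
```

Two neighboring moments can share a boundary here, since the same minimum ends one moment and starts the next.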

To make the system a bit more flexible, we can parameterize how long or short the “moment” can be in duration, which will chop the longer segments up. If we take a max distance of 30 seconds between maxima, the longer moment gets divided up:

As you can see, the moment at 04:58 has been divided into two moments.
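If the processing works on sample indices rather than seconds, duration limits like the 30 seconds above need converting. Here is a hedged sketch that assumes the heatmap samples are evenly spaced across the video. The helper names, and the 600-second, 100-sample numbers in the example below, are made up for illustration.

```python
def seconds_to_samples(seconds, video_duration, n_points):
    """Convert a duration in seconds to a number of heatmap samples."""
    # Assumes samples are evenly spaced; never return less than one sample.
    return max(1, round(seconds * n_points / video_duration))


def format_timestamp(index, video_duration, n_points):
    """Format the start time of a sample as MM:SS."""
    total_seconds = int(index * video_duration / n_points)
    return f"{total_seconds // 60:02d}:{total_seconds % 60:02d}"
```

For a 600-second video with 100 samples, `seconds_to_samples(30, 600, 100)` gives 5 samples, and `format_timestamp(50, 600, 100)` gives `05:00`.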

## Testing on Other Videos

To make sure that this isn’t a one-trick pony that only works on this data set, we can run some tests on data sets from other videos of varying length.

Here are some images of applying the same process to other video replay heatmaps (40s max moment, 10s min):

Obviously, a human can still adjust the moments, but even these primitive suggestions make it possible to automate some pretty awesome optimizations for content creation.