Html – How does Youtube’s HTML5 video player control buffering

htmlvideoyoutube

I was watching a youtube video and I decided to investigate some parts of its video player. I noticed that unlike most HTML5 video I have seen, Youtube's video player does not do a normal video source and instead utilizes a blob url as the source.

Previously I have tested HTML5 videos and I found that the server starts streaming the whole video from the start and buffers in the background the complete rest of the video. This means that if your video is 300 megs, all 300 megs will be downloaded. If you seek to the middle, it will start downloading from the seek position all the way to the end.

Youtube does not work this way (at least in chrome). Instead it manages to control buffering so it only buffers a certain amount while paused. It also seems to only buffer the relevant pieces, so if you skip around it will make sure not to buffer pieces that are unlikely to be watched.

In my attempts to investigate how this worked, I noticed the video src tag has a value of blob:http%3A//www.youtube.com/ee625eee-2802-49b2-a13f-eb374d551d54, which pointed me to blobs, which then led me to typed arrays. Using those two resources I am able to load a mp4 video into a blob and display it in a HTML5 video tag.

However, what I am now stuck on is how Youtube deals with the pieces. Looking at the network traffic it appears to sends requests to http://r6---sn-p5q7ynee.c.youtube.com/videoplayback which returns binary video data back in chunks of 1.1mb. It also seems worth noting that most normal requests due to HTML5 video requests seem to receive a 206 response code back while it streams, yet youtube's playvideo calls get a 200 back.

I tried to attempt to only load a range of bytes (via setting the Range http header) which unfortunately failed (I'm assuming because there was no meta-data for the video coming with the video).

At this point I'm stuck on figuring out how Youtube accomplishes this. I came up with several ideas though none of which I am completely sold on:

1) Youtube is sending down self contained video and audio chunks with each /videoplayback call. This seems like a pretty heavy burden on the upload side and it seems like it would be difficult to stitch these together to make it appear like it's one seemless video. Also, the video tag seems to think it's one full video, judging from calling $('video').duration and $('video').currentTime, which leads me to believe that the video tag thinks it's a single video file. Finally, the vidoe src tag never changes which makes me believe it is working with a singular blob and not switching out blobs.

2) Youtube constructs an empty blob pre-sized to the full video array and updates the blob with pieces as it downloads it. It would then make sure the user has not gotten too close to the last downloaded piece (to prevent the user from entering an undownloaded section of the blob). The problem that I see with this that I don't see any way to dynamically update a blob through javascript (although maybe I'm just having trouble googling for it)

3) Youtube downloads the meta data and then starts constructing the blob in order by appending the video pieces as it downloads them. The problem I see with this method is I don't understand how it would handle seeks in post-buffered territory.

Maybe I"m just missing an obvious answer that's right in front of me. Anyone have any ideas?


edit: I just thought of a fourth option. Another idea is they might use the file API to write the binary chunks to a file and use that file to stream off of. The file API seems to have the ability to seek to specific positions, therefore allowing you to fill a video with empty bytes and fill them in as they are received. This would definitely accommodate video seeking as well.

Best Answer

When you look at the AppData of GoogleChrome, while playing a youtube video, you will see that it buffers in segmented files. The videos uploaded to youtube are segmented, which is why you can't perfectly pinpoint a timeframe in the first click on the bar if that timeframe is outside of the current segment.

The amount of segments depends on the length of the video, and the time from which you start and stop playing back the video.

When you are linked to a timeframe of a video, it will simply skip the buffering of the segments that come before that timeframe.

Unfortunately I don't know much about the coding for video playback, but I hope this points you in the right direction.