Even though MP4 is the cross-browser standard for HTML5 video, WebM remains an important video technology for multiple reasons:

For all of these reasons, WebM video is essential to any website’s HTML5 video implementation. In an earlier blog post, we discussed how to optimize MP4 videos for fast HTML5 streaming. In this post, we will do the same for WebM, showing how to optimize WebM videos for fast streaming and seeking.

How WebM Stores and Plays Content

To understand the optimizations we can make to WebM videos, we need to understand their structure and contents. WebM files are just a constrained version of the Matroska Multimedia container format. Matroska files, usually just called MKV files, use a kind of binary XML called EBML to store different things like video tracks, audio tracks, subtitles, and other data. These data chunks are called elements, and they are analogous to the atoms in an MP4 file or the “chunks” used in a PNG image. This introduction to WebM file structure is very approachable if you want more nerdy details.

WebM files contain a special element called the SeekHead, which acts like a table of contents, pointing to the other elements in the file. To play a video, a program first locates the SeekHead, looks for information about the location of the elements which contain video and audio data, and then jumps to those elements to begin playing the video. If this sounds familiar, these are essentially the same steps as playing an MP4 video, which we discussed in depth in our Optimizing MP4 Video for fast streaming post: a video player must first locate the moov atom, and use that to determine where the video and audio content is.

The order of the elements inside a WebM file can be arbitrary. When playing a video locally, the order of elements is not a concern: the computer already has the entire file, and skipping around inside the file between different elements isn’t an issue. However, when streaming a video over HTTP, the order of the elements matters greatly, because the browser doesn’t have the entire file yet. If the browser doesn’t receive certain elements at the start of the file, it must make HTTP range requests to find the appropriate data. This impacts not only how quickly a movie can start playing, but also how quickly a user can seek to different times inside the video. The optimizations we are talking about today are both best practices recommended by the WebM Project, and are entirely about optimizing the order of elements in a WebM file.
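To make the idea of element order concrete, here is a minimal Python sketch that lists the top-level elements of a WebM file in the order they appear. It relies on the standard EBML layout, where each element starts with a variable-length ID followed by a variable-length data size. The function names are our own for illustration, the ID table only covers the elements discussed in this post, and the sketch has not been hardened against malformed files.

```python
import io

# Matroska/WebM IDs of the top-level elements discussed in this post.
NAMES = {
    0x1A45DFA3: "EBML",
    0x18538067: "Segment",
    0x114D9B74: "SeekHead",
    0x1549A966: "Info",
    0x1654AE6B: "Tracks",
    0x1C53BB6B: "Cues",
    0x1F43B675: "Cluster",
    0xEC: "Void",
}
SEGMENT_ID = 0x18538067


def _vint_length(first_byte, max_len):
    """Length in bytes of an EBML variable-length integer, from its first byte."""
    for length in range(1, max_len + 1):
        if first_byte & (0x80 >> (length - 1)):
            return length
    raise ValueError("invalid EBML variable-length integer")


def read_header(stream):
    """Return (element_id, data_size), or None at end of data.

    data_size is None when the element declares an unknown size.
    """
    first = stream.read(1)
    if not first:
        return None
    id_len = _vint_length(first[0], 4)
    # Element IDs keep their length-marker bits.
    element_id = int.from_bytes(first + stream.read(id_len - 1), "big")

    first = stream.read(1)
    if not first:
        raise ValueError("truncated element header")
    size_len = _vint_length(first[0], 8)
    raw = int.from_bytes(first + stream.read(size_len - 1), "big")
    all_ones = (1 << (7 * size_len)) - 1
    size = raw & all_ones  # strip the length-marker bit
    return element_id, (None if size == all_ones else size)


def top_level_layout(stream):
    """List the EBML header and the Segment's direct children, in file order."""
    layout = []
    while True:
        header = read_header(stream)
        if header is None:
            break
        element_id, size = header
        if element_id == SEGMENT_ID:
            continue  # descend into the Segment instead of skipping over it
        layout.append(NAMES.get(element_id, hex(element_id)))
        if size is None:
            break  # unknown-size element: we cannot skip past it
        stream.seek(size, io.SEEK_CUR)
    return layout
```

Called as `top_level_layout(open("original.webm", "rb"))`, this would return a list such as `["EBML", "SeekHead", "Info", "Tracks", "Cluster", ...]`. A file laid out the way this post recommends should show exactly one "SeekHead", and "Cues" should appear before the first "Cluster".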

Fast Start for WebM

As mentioned above, to start playing a WebM video, a browser has to know where the audio and video data is. This is stored in the SeekHead element, so the browser has to find that first. By default, most video creation tools put a SeekHead element at the start of the video. The problem is that WebM files can contain multiple SeekHead elements! In this case, the first SeekHead will contain a pointer to a second SeekHead located at the end of the file. Even if the first SeekHead contains pointers to the video and audio tracks, the browser still must go fetch the second SeekHead element, to see if there are additional video or audio tracks in the file, and determine which one has preference. Even if the second SeekHead is completely empty, it doesn’t matter; the browser must download and parse all SeekHead elements in the WebM file before it can play video content.

Seeking around trying to parse all the SeekHead elements wastes time and bandwidth. We can see in the waterfall chart below a browser trying to stream an unoptimized WebM file using HTML5 video:

webm-unoptimized

You can see the browser makes three requests before it can start playing the video. In the first request, the browser spends 135 ms downloading the first 53.7 KB of the video using an HTTP range request. The main SeekHead is inside this data, and references a second SeekHead element at the end of the file. Next you can see the browser spends 137 ms to download the few hundred bytes that make up the second SeekHead, again using an HTTP range request. After getting this second SeekHead element, the browser knows where the video and audio streams are located. Finally, the browser makes a third and final request to get the audio/video data and can start to play the video. All these requests must happen in series, and together they have delayed the start of the video by over a quarter of a second!

Of course, it would be even worse if you haven’t configured your server to support HTTP range requests: the browser can’t skip around to get all the SeekHead elements and must download the entire file! This is yet another reason why you should optimize your site with partial download support.
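If you want a quick way to check whether a server honors range requests, the following Python sketch builds a request for a byte range and inspects the response. It assumes the standard HTTP behavior that a server supporting ranges answers with status 206 and a Content-Range header. The helper names and the example URL are hypothetical.

```python
import urllib.request


def build_range_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for an inclusive byte range."""
    byte_range = f"bytes={first_byte}-{last_byte}"
    return urllib.request.Request(url, headers={"Range": byte_range})


def honors_range(status, headers):
    """True if the response carries only the requested range (206 + Content-Range)."""
    has_content_range = any(name.lower() == "content-range" for name in headers.keys())
    return status == 206 and has_content_range


# Example usage against a hypothetical URL (commented out: needs network access):
# req = build_range_request("https://example.com/video.webm", 0, 1023)
# with urllib.request.urlopen(req) as resp:
#     print(honors_range(resp.status, resp.headers))
```

A server that ignores the Range header will typically answer with a plain 200 and the whole file, which is exactly the situation described above.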

The ideal way to prepare a WebM video for streaming would be to restructure the file so there is only a single SeekHead element, located at the start of the file. This way the browser can avoid downloading the entire movie or wasting time making additional requests in an attempt to find all the SeekHeads. The waterfall of a website with a streaming-optimized video looks like this:

webm-faststart

We will discuss how you can do this optimization in just a few sections.

Fast Seeking with WebM Cues

Another, though less critical, optimization for WebM involves seeking inside of a streaming video. If you want to jump to a different part of an HTML5 video that is playing, you can click on the “scrubber” bar to seek to a new time. When you seek, the video player needs to translate the new time you have selected to a byte location inside the video content. WebM videos contain a special Cues element, which the browser uses to seek. The SeekHead element mentioned above tells the browser where the Cues element is located. However, the browser doesn’t go out and download the Cues element until you actually seek inside the video. Look at the waterfall below:

webm-badseek

Here you can see a WebM video playing which is optimized for fast start. However, as soon as I seek in the video, the browser issues an HTTP range request to download the Cues element. Using the Cues element, the browser determines where in the video file I “seeked” and sends a third HTTP range request to start playing the video at the new time. Going to get the Cues element has delayed seeking by 100+ ms.
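Conceptually, the Cues element is an index that maps timestamps to byte positions. The Python sketch below shows the kind of lookup a player performs once it has that index: find the last cue at or before the requested time. It is a simplified illustration rather than any browser's actual implementation; the cue values are made up, times are assumed to be in milliseconds, and the function name is hypothetical.

```python
import bisect


def byte_offset_for(cues, target_ms):
    """Return the byte offset of the last cue at or before target_ms.

    cues is a list of (time_ms, byte_offset) pairs sorted by time.
    """
    times = [time_ms for time_ms, _ in cues]
    index = bisect.bisect_right(times, target_ms) - 1
    index = max(index, 0)  # before the first cue: start from the first one
    return cues[index][1]


# Hypothetical index with one cue every five seconds.
example_cues = [(0, 4500), (5000, 912000), (10000, 1830000)]
print(byte_offset_for(example_cues, 7200))  # 912000
```

The player would then issue a range request starting at that offset. Without the Cues element already in hand, it cannot even compute the offset, which is why the extra request shows up in the waterfall.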

How do we avoid all this? Simple. If the browser already has the Cues element, it doesn’t need to go fetch it when the user seeks. We can do this by rearranging the WebM file so the Cues element is right at the start of the video file before the video data. This way the player gets the Cues element as part of the initial request and seeking becomes instant. Look at the waterfall below:

webm-fast-seek

You can see the initial request to play the HTML5 WebM video. Then, when I seek inside the video, only a single HTTP range request is made to start downloading data at the new time location.

Optimizing WebM Videos with mkclean

Organizing the elements of a WebM file to optimize for fast streaming and fast seeking is actually quite easy. mkclean is a free, open source command line tool that can do both optimizations for you. It works on existing WebM video files, and since it does not have to re-encode the video, it is a fast operation that does not alter the quality of your videos in any way.

The following will take the input file original.webm and produce an optimized version named optimized.webm:

 mkclean --doctype 4 --keep-cues --optimize original.webm optimized.webm
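If you have a whole folder of videos to process, a small wrapper script can run the same command on each file. The Python sketch below assumes mkclean is installed and available on your PATH; it uses exactly the options shown above, while the function and folder names are our own for illustration.

```python
import subprocess
from pathlib import Path


def mkclean_command(src, dst):
    """Build the mkclean command line from this post for one input/output pair."""
    return ["mkclean", "--doctype", "4", "--keep-cues", "--optimize", str(src), str(dst)]


def optimize_folder(folder, out_folder):
    """Run mkclean on every .webm file in folder, writing results to out_folder."""
    out_dir = Path(out_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(folder).glob("*.webm")):
        # check=True raises CalledProcessError if mkclean reports a failure.
        subprocess.run(mkclean_command(src, out_dir / src.name), check=True)


# Example usage with hypothetical folder names:
# optimize_folder("videos", "videos-optimized")
```

Writing the output to a separate folder keeps the originals intact, so you can compare the before and after waterfalls.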

 

For our Splunk customers, we have added two new checks to Zoompf, our performance analysis product, which will detect WebM videos that have not been optimized for fast streaming, and WebM videos which have not been optimized for fast seeking. If you just want a quick check of your site for unoptimized default HTML5 videos, you can use our Free Performance Report.

Conclusions

Using HTML5 video on your site means that the video files are streamed a little at a time to your visitors’ browsers. Because visitors don’t have the entire video file from the start, you must optimize the videos to ensure they are structured in a way that allows fast streaming and seeking and avoids extra delays and wasted bandwidth. In this post, I showed you how to use mkclean to optimize WebM videos so they conform to the WebM Project’s best practices and provide your users with the fastest and best possible HTML5 video experience.