Pixel-Perfect WebM
WebM is probably the future, but like all things containing the word “web,” there’s an element of nuance and surprise that emerges while working with it.
VP9 is supported pretty much everywhere that matters (even Edge manages to produce a video UI, although one imagines it grumbling as it does so), but the only pixel format that actually works across computers and phones alike is YUV420p. For the most part, this detail manifests itself as an additional flag you must pass to your video encoder of choice (e.g. ffmpeg -pix_fmt yuv420p) lest you present a perfectly playable but nevertheless blank video to some, if not all, of your audience.
But there’s another layer here, one that picks apart the strange name of YUV420p, buried in a drawer somewhere with the label “Chroma Subsampling.” For efficiency reasons, YUV420p keeps brightness for every pixel but only one sample of color information for each 2x2 block of four pixels, which means a loss of color detail even if you tell it to encode “losslessly.”
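To put rough numbers on that, here is a small sketch that counts how many samples a single frame stores under each layout. The function name samples_per_frame is made up for illustration, and it assumes the chroma planes round up when a dimension is odd.

```shell
#!/usr/bin/env bash
# Count the samples one frame stores under a given chroma layout.
# Usage: samples_per_frame WIDTH HEIGHT 420|444
# (hypothetical helper, not part of ffmpeg)
samples_per_frame() {
  local w=$1
  local h=$2
  local layout=$3
  local luma=$(( w * h ))   # one brightness sample per pixel
  local chroma_each
  case "$layout" in
    # 4:2:0 keeps one U and one V sample per 2x2 block
    # (assumption: odd dimensions round up)
    420) chroma_each=$(( ((w + 1) / 2) * ((h + 1) / 2) )) ;;
    # 4:4:4 keeps one U and one V sample per pixel
    444) chroma_each=$luma ;;
    *) echo "unknown layout: $layout" >&2; return 1 ;;
  esac
  echo $(( luma + 2 * chroma_each ))
}

samples_per_frame 640 480 420   # prints 460800
samples_per_frame 640 480 444   # prints 921600
```

For a 640x480 frame, the 4:2:0 layout stores half as many samples as 4:4:4, and every one of the missing samples is color.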
For the most part, this produces a perfectly acceptable result, unless you’re doing something that lends itself to pixel precision like screencasting a small terminal window (and, let’s be honest, cultivating a habit of getting hung up on details). The correct solution to this is to move to YUV444p, which keeps a good old-fashioned one-to-one ratio of color data to pixels. Except, like so many “standards,” it produces inconsistent results across browsers. In my limited testing, Chrome and Firefox on Android were able to play YUV444p, along with Firefox on Linux, but Chrome, Firefox, and Edge on Windows didn’t have a clue what was going on and put up a black box instead. Even my file browser wasn’t happy with it, if I’m honest, as it generated a sickly-green thumbnail for the video file.
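If you want to repeat this kind of testing, it helps to confirm which pixel format a file actually ended up with before blaming the browser. A minimal sketch using ffprobe follows; the function name pix_fmt_of and the file name in the usage comment are placeholders of mine.

```shell
#!/usr/bin/env bash
# Print the pixel format of the first video stream in a file.
# (pix_fmt_of is a hypothetical helper name; it requires ffprobe on the PATH.)
pix_fmt_of() {
  ffprobe -v error \
    -select_streams v:0 \
    -show_entries stream=pix_fmt \
    -of default=noprint_wrappers=1:nokey=1 \
    "$1"
}

# Example (placeholder file name):
#   pix_fmt_of screencast.webm
```

A file encoded with -pix_fmt yuv420p should report yuv420p here, and a 4:4:4 encode should report yuv444p.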
Options for platform-agnostic pixel precision are slim, it appears, at least if you’re keeping strictly to HTML5 video. I’ve found a simple way around this for now, though. Since chroma subsampling wants to reduce every 2x2 block of pixels to a single chromatic data point, why not feed it four times as many pixels by doubling both the width and the height? Each original pixel then fills one whole 2x2 block, so it gets a color sample all to itself. And to prevent the filesize from growing unnecessarily, don’t let ffmpeg attempt any kind of interpolation when scaling up: stick with the nearest neighbor algorithm and make plain ol’ 2x2 chunks.
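To make the "plain ol’ 2x2 chunks" idea concrete, here is a toy sketch that does the same doubling on a grid of characters instead of pixels. It only illustrates nearest-neighbor upscaling and is not what ffmpeg runs internally; upscale_2x is a name invented for this example.

```shell
#!/usr/bin/env bash
# Nearest-neighbor 2x upscale of a character grid read from stdin:
# every character is repeated twice and every row is emitted twice,
# so each input cell becomes a uniform 2x2 block.
# (upscale_2x is a hypothetical helper for illustration only.)
upscale_2x() {
  sed -e 's/./&&/g' -e 'p'
}

printf 'AB\nCD\n' | upscale_2x
# prints:
# AABB
# AABB
# CCDD
# CCDD
```

Because no new in-between values are invented, each 2x2 block is a single flat color, which is exactly the unit the chroma subsampler wants to collapse.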
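#!/usr/bin/env bash
# Usage: pass the source video as the first argument and the output .webm as the second.
src="$1"
dest="$2"
# Remove stats left over from an earlier two-pass run.
rm -f ffmpeg2pass-*.log
# Pass 1: gather statistics only; the encoded output is thrown away.
ffmpeg -i "$src" \
  -y \
  -vcodec libvpx-vp9 \
  -crf 23 \
  -pix_fmt yuv420p \
  -vf scale=iw\*2:ih\*2 \
  -sws_flags neighbor \
  -deadline best \
  -pass 1 \
  -f webm /dev/null
# Pass 2: the real encode, using the stats from pass 1.
# Both passes double the width and height with nearest neighbor scaling.
ffmpeg -i "$src" \
  -y \
  -vcodec libvpx-vp9 \
  -crf 23 \
  -pix_fmt yuv420p \
  -vf scale=iw\*2:ih\*2 \
  -sws_flags neighbor \
  -deadline best \
  -pass 2 \
  "$dest"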
This is the best I’ve come up with for now, but I’m hoping YUV444p support will be more of a thing in the future to avoid this kind of workaround.