Transcode all your animu for your shitty phone, TV

You finally got yourself a new phone and you think it would be a good idea to watch some weeb shit on the go. Likewise for your new Smart TV. However, woe is you, because there is format incompatibility: though your OS supports MKV, H.264, ASS, and FLAC or AAC, the file doesn’t seem to be playable. Luckily you’re reading the blog so you won’t kill yourself in desperation.

First order of business is getting ffmpeg from your favourite distro’s repository. You can also transcode with gstreamer if you’d like, but ffmpeg is more popular so you’ll find more guides about how it works on the internet.

Your problem on the video end is most likely that the video stream is 10-bit H.264 (pixel format yuv420p10le), which the hardware decoders in phones and TVs generally can’t handle; you want plain 8-bit yuv420p. So we transcode the video stream while preserving everything else. Most devices should support selecting among multiple subtitle languages, so that isn’t a problem, but switching audio tracks is iffier, so we’d also like to end up with a single audio track. Lastly, MKV supports embedding attachments, mostly fonts, and though support is spotty, devices seem content to ignore them if they lack support, so keeping them is acceptable.
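If you’d rather confirm the diagnosis before burning CPU time, ffprobe can report the codec and pixel format of the first video stream. Here’s a rough node.js sketch, assuming ffprobe is on your PATH; the names needsVideoTranscode and probeVideo are made up for this post, and treating "H.264 with yuv420p" as the only safe combination is my assumption, not something your device vendor will confirm.

```javascript
const { execFileSync } = require("child_process")

// Decide from ffprobe's JSON output whether the video stream needs re-encoding.
// Assumption: only 8-bit 4:2:0 H.264 is safe for the target device.
function needsVideoTranscode(probe) {
	const video = (probe.streams || [])[0]
	if (!video) return false // no video stream, nothing to fix
	return video.codec_name !== "h264" || video.pix_fmt !== "yuv420p"
}

// Ask ffprobe about the first video stream only and parse what it prints
function probeVideo(file) {
	const out = execFileSync("ffprobe", ["-v", "error", "-select_streams", "v:0",
		"-show_entries", "stream=codec_name,pix_fmt", "-of", "json", file])
	return JSON.parse(out.toString())
}

// Usage: console.log(needsVideoTranscode(probeVideo("input.mkv")))
```

If it comes back false and the file still won’t play, the culprit is probably the audio or the container and this guide is only half of your answer.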

This is what you need:

ffmpeg -i "input" -c:s copy -c:a libmp3lame -q:a 4 \
	-c:v libx264 -crf 25 -preset ultrafast -tune animation -pix_fmt yuv420p \
	-map 0:v -map "0:t?" -map "0:s?" -map 0:a:0 \
	"output"

-i "input" is your source, and the last argument is your output. With -c:s copy you copy the subtitle tracks without conversion. -c:a libmp3lame -q:a 4 transcodes the audio stream to VBR MP3 averaging around 165 kbit/s, because the source is likely FLAC and a waste of space for a phone or a TV, since you can’t hear the difference without expensive audio gear and the ear drums of a 12-year-old. AAC is a far more efficient format but the LAME MP3 encoder is significantly faster, almost as fast as copying the stream without transcoding.

-pix_fmt yuv420p is where the magic happens for the video stream: it converts the 10-bit source down to 8-bit 4:2:0. libx264 is the open source software H.264 encoder. If you own very modern NVIDIA or Intel silicon, you should be able to outdo the software encoder with NVENC or Quick Sync respectively, but configuration is iffier. I have a Sandy Bridge and a Haswell, and in both cases the software implementation is faster than hardware. x264 in particular has been an optimisation focus for years; HEVC, VP8, and VP9 software encoders are far less optimised and hardware encoders can truly shine there.

-crf 25 sets the Constant Rate Factor: the encoder aims for constant quality at the expense of unpredictable final file sizes. The scale runs from 0 to 51 with 23 as the x264 default, and lower means better quality and bigger files. How much you want to tune it depends on your hardware and your eyeballs, but I don’t think there’s any point for anything under 20; I don’t fucking care what you’ve heard from Coalgirls. I consider 25 a sweet spot but run your own tests. VBR or CBR encoding might improve encoding speed over CRF at the expense of quality, but the tradeoff wasn’t significant enough in my testing.

The -preset determines how fast the encoding will go, at the expense of file size. This is mostly offset by a CRF of 25, so the end result will almost always be smaller than the source (albeit lower quality). If you’re aiming for archival you should maximise encoding quality, but otherwise for one-shots such as these, ultrafast is the way to go: I’m aiming for at least a 2.5x encoding speed, ideally over 3.0x. -tune animation tunes the encoder for the flat colours and sharp edges typical of cartoons, and in my experience it also makes transcoding faster, so you might want to use it in general.
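To see whether you’re actually hitting that 2.5x target, you can scrape the speed= field out of the progress lines ffmpeg writes to stderr. A small sketch, assuming the progress line looks like it does on my build (something ending in "speed=3.12x"); the parseSpeed name is mine and the exact log format is not a stable interface, so treat this as a hack rather than an API.

```javascript
// Pull the encoding speed out of one ffmpeg progress line.
// Returns a number such as 3.12, or null when the line has no usable speed
// (ffmpeg prints speed=N/A before it has anything to measure).
function parseSpeed(line) {
	const match = /speed=\s*([\d.]+)x/.exec(line)
	return match ? parseFloat(match[1]) : null
}

// Usage with a spawned ffmpeg process (hypothetical variable name):
// ffmpeg.stderr.on("data", chunk => {
// 	const speed = parseSpeed(chunk.toString())
// 	if (speed !== null && speed < 2.5) console.error("too slow: " + speed + "x")
// })
```

Anything consistently under 1.0x means the encode can’t keep up with playback, which matters a lot more once you start streaming the output.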

The -map arguments select which streams from the input end up in the output. As written they select all video streams, all attachment and subtitle streams (the trailing ? means "if any exist"), and the only thing you might want to configure is -map 0:a:0, which selects the first audio stream. There isn’t any standard in how they’re ordered so you should ffprobe your file first to see the available streams, then use the proper index. You can select by language (for example -map 0:a:m:language:jpn), but some releases include commentary audio so you should always double check.
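If you don’t feel like eyeballing ffprobe output for every file, here’s a sketch that picks the audio index for you. It assumes ffprobe is on your PATH and that releases tag their tracks sanely (a language tag, and the word "commentary" somewhere in the title of commentary tracks), which is a big assumption; probeStreams and pickAudioIndex are names I invented for this example.

```javascript
const { execFileSync } = require("child_process")

// List every stream with its type and language/title tags as JSON
function probeStreams(file) {
	const out = execFileSync("ffprobe", ["-v", "error",
		"-show_entries", "stream=index,codec_type:stream_tags=language,title",
		"-of", "json", file])
	return JSON.parse(out.toString())
}

// Return N for "-map 0:a:N": the position among audio streams only,
// not the global stream index that ffprobe reports
function pickAudioIndex(probe, lang) {
	const audio = (probe.streams || []).filter(s => s.codec_type === "audio")
	const n = audio.findIndex(s => {
		const tags = s.tags || {}
		const commentary = /commentary/i.test(tags.title || "")
		return tags.language === lang && !commentary
	})
	return n === -1 ? 0 : n // nothing matched: fall back to the first audio stream
}

// Usage: const n = pickAudioIndex(probeStreams("input.mkv"), "jpn")
// then pass "-map", "0:a:" + n to ffmpeg
```

The fallback to 0 is a judgment call: a wrong audio track is annoying, but a failed transcode is worse.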

This might seem trivial, but it’s possible your encoding bottleneck is I/O rather than CPU, so don’t read from and write to the same medium unless it’s very fast flash storage.
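Since the title promised all your anime and not just one episode, here’s a sketch of batching the command above from node.js, reading from one place and writing to a directory that can live on a different disk. planJobs and runJobs are hypothetical helper names, the paths in the usage line are placeholders, and I’ve added ffmpeg’s -n flag so an interrupted batch doesn’t clobber files that already finished.

```javascript
const path = require("path")
const { spawnSync } = require("child_process")

// Build one ffmpeg invocation per .mkv, writing into outDir (ideally on another disk)
function planJobs(files, outDir) {
	return files.filter(f => f.toLowerCase().endsWith(".mkv")).map(input => {
		const output = path.join(outDir, path.basename(input))
		// -n makes ffmpeg refuse to overwrite an existing output file
		const args = ["-n", "-i", input, "-c:s", "copy", "-c:a", "libmp3lame", "-q:a", "4",
			"-c:v", "libx264", "-crf", "25", "-preset", "ultrafast", "-tune", "animation",
			"-pix_fmt", "yuv420p",
			"-map", "0:v", "-map", "0:t?", "-map", "0:s?", "-map", "0:a:0",
			output]
		return { input, output, args }
	})
}

// Run the jobs one after another; x264 is already multithreaded,
// so running several encodes in parallel mostly just adds disk contention
function runJobs(jobs) {
	for (const job of jobs) {
		const result = spawnSync("ffmpeg", job.args, { stdio: "inherit" })
		if (result.status !== 0) console.error("failed: " + job.input)
	}
}

// Usage: runJobs(planJobs(["/mnt/hdd/ep01.mkv", "/mnt/hdd/ep02.mkv"], "/mnt/ssd/out"))
```

Note there’s no shell involved here, so the ? in the -map arguments needs no quoting.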

But I know what you’re thinking. You’re thinking it would be very cool if you could transcode all your files Just In Time, instead of transcoding ahead of time, running adb push boku_no_pico.mkv /sdcard/Movies/ and waiting for the transfer to finish. Luckily for you, ffmpeg supports writing to stdout if the output is -, and you can pipe this to a TCP socket. For example, in node.js:

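const child_process = require("child_process")

// inside your http request handler: req.url is a plain string, so parse it to get the path
const pathname = new URL(req.url, "http://localhost").pathname
// -f matroska is required because ffmpeg can't guess the container from "-"
// (-movflags +faststart is an MP4 muxer option and does nothing for Matroska, so it's gone)
const ffmpeg = child_process.spawn("ffmpeg", ["-i", "." + decodeURIComponent(pathname), "-f", "matroska", "-c:s", "copy", "-c:a", "libmp3lame", "-q:a", "4",
			"-c:v", "libx264", "-crf", "25", "-preset", "ultrafast", "-tune", "animation", "-pix_fmt", "yuv420p",
			// the trailing ? keeps ffmpeg from bailing out when a file has no attachments or subtitles
			"-map", "0:v", "-map", "0:t?", "-map", "0:s?", "-map", "0:a:0",
			"-"])
res.statusCode = 200
res.setHeader("Content-Type", "application/octet-stream")
// kill the encoder as soon as the client goes away
res.on("close", () => ffmpeg.kill())
req.on("close", () => ffmpeg.kill())
req.on("aborted", () => ffmpeg.kill())
ffmpeg.stdout.pipe(res)
ffmpeg.stderr.pipe(process.stderr)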

The good news is this works as expected and you can play your whole library over the network, transcoded on the fly, with nothing ever written to disk. The bad news is the only client I’ve been able to make this work with is mpv, which defeats the purpose since you’re running full fat Linux and ffmpeg on it anyway; at that point you’d be better served by sshfs. But if you’ve ever wanted to build your very own home theatre software, at least you know where to start now.
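One thing the server snippet glosses over: gluing "." onto whatever path the client sends means anyone on your network can ask for files outside your anime folder with a few encoded ../ segments. If you do build on it, something like the following sketch keeps requests inside one directory. resolveMediaPath and the /srv/anime root are made-up names for illustration, and this is a minimal guard, not a security audit.

```javascript
const path = require("path")

// Map a request URL onto a file under root, or return null if it tries to escape
function resolveMediaPath(root, requestUrl) {
	let pathname
	try {
		pathname = decodeURIComponent(new URL(requestUrl, "http://localhost").pathname)
	} catch (e) {
		return null // malformed URL or percent-encoding
	}
	const absRoot = path.resolve(root)
	const target = path.resolve(absRoot, "." + pathname)
	// Anything that resolves outside the root (e.g. via ..%2F tricks) is rejected
	if (target !== absRoot && !target.startsWith(absRoot + path.sep)) return null
	return target
}

// Usage inside the request handler:
// const file = resolveMediaPath("/srv/anime", req.url)
// if (file === null) { res.statusCode = 403; res.end(); return }
// ...then pass file to ffmpeg's -i instead of "." + pathname
```

It doesn’t check that the file exists or that it’s a video; ffmpeg will complain loudly enough on its own for that.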

God Bless A-cups and Make Anime Great Again.