This lesson is being piloted (Beta version)

Transcoding One File


Teaching: 20 min
Exercises: 10 min
  • How can I use Python to transcode a file?

  • Import the subprocess module

  • Transcode and compress a single Quicktime file

Alice has heard good things about saving space by using lossless compression on preservation master files. Perhaps she could find a way to use Python to generate new, lossless preservation masters. Alice is taking a big leap, copying code from all over the Internet (Stack Exchange, ffmprovisr, random forums). This is something that people with all levels of coding skill do on a regular basis, and while this code may initially not make sense, as you continue to work with it, things will become more clear.

Python, FFmpeg, and Lossless Transcoding

Let’s experiment with transcoding one of our preservation master files to FFV1 and rewrapping it from mov to mkv.

Leaning on ffmprovisr, we can find the following recipe for transcoding to FFV1/MKV:

ffmpeg -i input_file -map 0 -dn -c:v ffv1 -level 3 -g 1 -slicecrc 1 -slices 16 -c:a copy output_file.mkv
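Before handing a recipe like this to Python, it helps to see it as a list of separate arguments, because that is the form Python will want later on. The sketch below is only an illustration: it uses the standard library's shlex module to split the recipe text the way a shell would, and the names input_file and output_file.mkv are the recipe's own placeholders, not real files.

```python
import shlex

# The ffmprovisr recipe as one string; input_file and output_file.mkv are placeholders.
recipe = (
    "ffmpeg -i input_file -map 0 -dn -c:v ffv1 -level 3 -g 1 "
    "-slicecrc 1 -slices 16 -c:a copy output_file.mkv"
)

# shlex.split breaks the string into arguments the way a shell would.
args = shlex.split(recipe)

# Print one argument per line to make the structure easy to read.
for position, argument in enumerate(args):
    print(position, argument)
```

Reading down the list: -map 0 keeps every stream from the input, -dn drops data streams, -c:v ffv1 selects the FFV1 video codec, -level 3 asks for FFV1 version 3, -g 1 makes every frame a keyframe, -slicecrc 1 and -slices 16 add per-slice checksums across 16 slices, and -c:a copy passes the audio through untouched.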

As an experiment, let’s choose a random file in the federal_grant folder. First, let’s navigate to that folder (and as you do so, try playing around with Jupyter’s tab completion):

cd Desktop/pyforav/federal_grant/

We can double-check that we’re in the right place by using Python’s getcwd() function. This is part of the os module in Python, so we’ll need to load the module first:

import os
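With the module loaded, the check itself might look like the short sketch below. It assumes the cd command above succeeded, in which case the last part of the printed path should be federal_grant; on your machine the rest of the path will differ.

```python
import os

# Ask Python which folder it is currently working in.
current_folder = os.getcwd()
print(current_folder)

# The final piece of the path is the folder name itself.
print(os.path.basename(current_folder))
```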

Now let’s import the subprocess module, which will allow us to invoke external programs from within our Python code:

import subprocess
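To get a feel for subprocess before pointing it at FFmpeg, here is a minimal sketch that runs a harmless external command. As a stand-in it asks the Python interpreter itself for its version (sys.executable is the path to the running interpreter), so it should work even on a machine where FFmpeg is not installed.

```python
import subprocess
import sys

# The command is a list: the program first, then each argument as its own item.
command = [sys.executable, "--version"]

# capture_output=True collects what the program prints instead of showing it directly;
# text=True returns that output as ordinary strings rather than bytes.
result = subprocess.run(command, capture_output=True, text=True)

print(result.args)        # the command that was run
print(result.returncode)  # the exit status reported by the program
print(result.stdout.strip())
```

The object that comes back is a CompletedProcess, which records both the command and how it ended.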

It’s time for some transcoding. Any of the Federal Grant files would do, but let’s play with one of them, passing its file name as the argument after '-i':

subprocess.run(['ffmpeg', '-i', '', '-map', '0', '-dn', '-c:v', 'ffv1', '-level', '3', '-g', '1', '-slicecrc', '1', '-slices', '16', '-c:a', 'copy', 'napl_0371_pres.mkv'])
CompletedProcess(args=['ffmpeg', '-i', '', '-map', '0', '-dn', '-c:v', 'ffv1', '-level', '3', '-g', '1', '-slicecrc', '1', '-slices', '16', '-c:a', 'copy', 'napl_0371_pres.mkv'], returncode=1)
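The returncode at the end of a CompletedProcess is worth a glance: by convention 0 means the program finished without complaint, and any other number means it reported a problem. The sketch below shows one way to check it. The helper name run_and_report is made up for this illustration, and the two demo commands are stand-ins that call Python itself rather than FFmpeg.

```python
import subprocess
import sys

def run_and_report(command):
    """Run a command, print whether it succeeded, and return its exit status."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        print("Finished without errors")
    else:
        print("Something went wrong, return code:", result.returncode)
        # Programs usually explain failures on stderr; show the tail end of it.
        print(result.stderr[-500:])
    return result.returncode

# A stand-in command that succeeds.
ok = run_and_report([sys.executable, "-c", "print('hello')"])

# A stand-in command that deliberately exits with status 1.
bad = run_and_report([sys.executable, "-c", "import sys; sys.exit(1)"])
```

Swapping in an FFmpeg argument list for the stand-in commands would give the same kind of report for a real transcode.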

Take a peek inside the Federal Grant folder and compare the file sizes of the original file and napl_0371_pres.mkv. One’s much smaller, right?
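The same comparison can be done from Python with os.path.getsize, which reports a file's size in bytes. This is a sketch only: the helper name describe_sizes is invented here, and the demo builds two small throwaway files so that it runs anywhere. In the lesson you would pass the paths of the original file and napl_0371_pres.mkv instead.

```python
import os
import tempfile

def describe_sizes(original_path, transcoded_path):
    """Return both sizes in bytes and the new size as a percentage of the old."""
    original = os.path.getsize(original_path)
    transcoded = os.path.getsize(transcoded_path)
    percent = 100 * transcoded / original
    return original, transcoded, percent

# Demo with stand-in files of known size (1000 bytes and 400 bytes).
with tempfile.TemporaryDirectory() as folder:
    big = os.path.join(folder, "original.bin")
    small = os.path.join(folder, "transcoded.bin")
    with open(big, "wb") as handle:
        handle.write(b"0" * 1000)
    with open(small, "wb") as handle:
        handle.write(b"0" * 400)
    original, transcoded, percent = describe_sizes(big, small)

print(original, "bytes before,", transcoded, "bytes after")
print("The new file is", percent, "percent of the original size")
```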

Super cool, but it’s also a little tedious to type out the file path for each individual file. How can we do this for all of the files that we have within our different directories? The first thing we’ll need is a list, so let’s start there.

Key Points

  • Python syntax can be a little tricky, but starting small and building out is a good general strategy