Image sequence into H.264 / MPEG-4 AVC

Problem

Transcode an image sequence by using the H.264 codec for dissemination purposes.

Solution

ffmpeg                        \
-f image2                     \
-framerate 24                 \
-i input_file_%06d.extension  \
-c:v libx264                  \
-preset veryslow              \
-crf 18                       \
-pix_fmt yuv420p              \
output_file

General command

ffmpeg                         \
-f image2                      \
-framerate frames_per_second   \
-i input_file_regex.extension  \
-c:v libx264                   \
-preset preset_value           \
-crf constant_rate_factor      \
-pix_fmt yuv420p               \
output_file

Command syntax

ffmpeg: starts the command
-f image2: forces the image file demuxer for single image files
-framerate frames_per_second: sets the frame rate at which the image sequence is read
-i input_file_regex.extension: path, name with sequence pattern (such as %06d) and extension of the input files
-c:v libx264: The library libx264 re-encodes the video stream using the H.264 video codec.
-preset preset_value: A slower encoding preset means a better compression rate.
-crf constant_rate_factor: A value of 18 gives a “visually lossless” compression.
-pix_fmt yuv420p: The pixel format for “YUV” colour space with 4:2:0 chroma subsampling and planar colour alignment is chosen for best compatibility.
output_file: path, name and extension of the output file

Discussion

Options that apply to an input file must precede it. Therefore the options -f image2 and -framerate must come before the -i that names the image sequence.

Sound film runs at 24 fps (frames per second), whereas the image2 demuxer defaults to 25 fps; the frame rate must therefore be set explicitly with -framerate.

The pattern %06d (a printf-style sequence pattern rather than a true regular expression) matches six-digit numbers, zero-padded where necessary. This lets FFmpeg read the full sequence inside one folder in ascending order, one image after the other. The command must of course match the naming convention actually used. For image sequences starting at 086400 (captured at 24 fps with a timecode starting at 01:00:00:00) or at 090000 (captured at 25 fps with a timecode starting at 01:00:00:00), add the option -start_number 86400 or -start_number 90000 respectively before -i input_file_%06d.ext.
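The arithmetic behind these start numbers can be sketched as follows, assuming a non-drop-frame timecode and an integer frame rate. The variable names, the function name encode_from_start and the file names are illustrative assumptions, not part of the recipe.

```shell
# Hypothetical timecode 01:00:00:00 at 24 fps; adjust to the material.
fps=24
hours=1
minutes=0
seconds=0
frames=0

# Frames elapsed since 00:00:00:00 (non-drop-frame, integer frame rate).
start_number=$(( (hours * 3600 + minutes * 60 + seconds) * fps + frames ))
echo "$start_number"   # prints 86400

# Illustrative wrapper (defined here, not called):
# -start_number is an input option, so it precedes -i.
encode_from_start() {
    ffmpeg                             \
    -f image2                          \
    -framerate "$fps"                  \
    -start_number "$start_number"      \
    -i input_file_%06d.tif             \
    -c:v libx264                       \
    -preset veryslow                   \
    -crf 18                            \
    -pix_fmt yuv420p                   \
    output_file.mp4
}
```

Calling encode_from_start would then run the transcoding. With fps=25 the same computation gives 90000.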

The extension for TIFF files is .tif or sometimes .tiff; the extension for DPX files is .dpx (or possibly .cin for old Cineon files). Other file formats are possible.

The video codec is specified by -codec:video, which is usually abbreviated as -c:v (-codec:v or -c:video are also possible). We advise against using the alias -vcodec. If the source is RGB, you may choose the video codec libx264rgb rather than libx264.

Possible -preset values for the H.264 codec include veryslow, slow, medium, fast and veryfast. A slower preset needs more encoding time but compresses more efficiently, giving a smaller file at the same quality.

Instead of -crf 18 (constant rate factor) you can use -qp 18 (quantisation parameter), which gives a similar “visually lossless” result. For 8-bit, the scale for both crf and qp runs from 0 to 51, where 0 is lossless, approximately 18 is “visually lossless”, 23 is the default value and 51 is the worst possible quality. For 10-bit the range is from 0 to 63. Note that most non-FFmpeg-based players cannot decode H.264 files holding lossless content.
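As a small sketch of this scale, a script could check a chosen value against the range for the bit depth before handing it to -crf or -qp. The function name rate_value_in_range is a hypothetical helper written for this example. It is not an FFmpeg feature, and it only covers the two bit depths mentioned here.

```shell
# Illustrative helper: succeeds when the value lies on the scale
# described above (0 to 51 for 8-bit, 0 to 63 for 10-bit).
# $1 = crf or qp value, $2 = bit depth (8 or 10)
rate_value_in_range() {
    value=$1
    bit_depth=$2
    case $bit_depth in
        8)  max=51 ;;
        10) max=63 ;;
        *)  return 2 ;;   # bit depth not covered by this sketch
    esac
    [ "$value" -ge 0 ] && [ "$value" -le "$max" ]
}

rate_value_in_range 18 8  && echo "18 is valid for 8-bit"
rate_value_in_range 60 8  || echo "60 is out of range for 8-bit"
rate_value_in_range 60 10 && echo "60 is valid for 10-bit"
```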

yuv420p is a common 8-bit pixel format and yuv420p10le a common 10-bit one. The library libx264 supports both, but 8-bit and 10-bit cannot be combined in the same command; two separate commands are needed.
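The two separate commands could be sketched with a small wrapper that takes the pixel format as an argument. The function name encode_with_pix_fmt, the .tif input and the output names are assumptions made for illustration. The 10-bit call also assumes a libx264 build with 10-bit support.

```shell
# Illustrative wrapper (defined here, not called): one ffmpeg run per pixel format.
# $1 = pixel format, $2 = output file
encode_with_pix_fmt() {
    ffmpeg                     \
    -f image2                  \
    -framerate 24              \
    -i input_file_%06d.tif     \
    -c:v libx264               \
    -preset veryslow           \
    -crf 18                    \
    -pix_fmt "$1"              \
    "$2"
}
```

Calling encode_with_pix_fmt yuv420p output_8bit.mp4 and then encode_with_pix_fmt yuv420p10le output_10bit.mp4 gives one run for each bit depth.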

By default the library libx264 will use the chroma subsampling scheme that most closely matches the input file’s chroma subsampling. This can result in the “YUV” colour space with 4:4:4 or 4:2:2 or 4:2:0 chroma subsampling. Many of the non-FFmpeg-based players cannot decode H.264 files having a chroma subsampling other than 4:2:0. Therefore, so that as many players as possible can read the file, we suggest using the yuv420p pixel format for dissemination purposes. And, as sadly usual in the computer world, “YUV” stands for the colour space Y′C_BC_R and not for Y′UV, which is used for PAL video. However, if you choose the video codec libx264rgb rather than libx264, then an RGB pixel format must be chosen, usually rgb24.
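For the RGB case, a minimal sketch of the Solution command with libx264rgb and rgb24 substituted might look like this. The function name encode_rgb and the file names are illustrative assumptions, and the FFmpeg build is assumed to include the libx264rgb encoder.

```shell
# Illustrative wrapper (defined here, not called): RGB variant of the Solution command.
encode_rgb() {
    ffmpeg                     \
    -f image2                  \
    -framerate 24              \
    -i input_file_%06d.tif     \
    -c:v libx264rgb            \
    -preset veryslow           \
    -crf 18                    \
    -pix_fmt rgb24             \
    output_file.mp4
}
```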

You may add the parameter -movflags +faststart, which allows playback to start before the whole file is loaded.

Often the MP4 container (.mp4) is chosen for wrapping H.264, but others are possible.
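Putting the last two points together, a sketch of the Solution command writing to an MP4 container with -movflags +faststart could read as follows. The function name encode_faststart and the file names are assumptions for illustration. The flag is an output option, so it is placed before the output file.

```shell
# Illustrative wrapper (defined here, not called): MP4 output with +faststart.
encode_faststart() {
    ffmpeg                     \
    -f image2                  \
    -framerate 24              \
    -i input_file_%06d.tif     \
    -c:v libx264               \
    -preset veryslow           \
    -crf 18                    \
    -pix_fmt yuv420p           \
    -movflags +faststart       \
    output_file.mp4
}
```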

A Bash script that performs this transcoding is included in our collection Bash Script for Audiovisual Preservation.

2024-11-29