Video Wizardry

Tutorials – FFmpeg

Article from Issue 206/2018

Linux has some excellent graphical video-editing tools, but sometimes working from the command line with FFmpeg is just better.

How much better? Well, it makes stuff easier to batch process, for starters. Say you have to change every instance of "Bill" in a 100-page text file to "Gary." Sure, you could use the search-and-replace feature in your text editor. That would work if you only had one file, but what would you do if you had a filesystem with hundreds of files scattered all over the place? You would never consider seriously trawling through every directory and subdirectory, opening each file in turn, and clicking through the search-and-replace process, would you? A Bash script using find and sed would be the way to go.
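
To make that concrete, here is a minimal sketch of the Bash approach. The demo/ tree and file names are made up for the example, and GNU sed's -i flag (in-place editing) is assumed:

```shell
# Demo setup: a small tree of text files that mention "Bill"
# (the demo/ paths are hypothetical).
mkdir -p demo/reports/2018
echo "Bill wrote this report." > demo/reports/summary.txt
echo "Send a copy to Bill."    > demo/reports/2018/notes.txt

# The actual one-liner: find every .txt file under demo/ and let
# GNU sed replace "Bill" with "Gary" in place.
find demo -type f -name '*.txt' -exec sed -i 's/Bill/Gary/g' {} +
```

The `{} +` ending hands find's matches to sed in batches, so even hundreds of scattered files are processed in one pass.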

The same goes, believe it or not, for video editing. You can do dozens, nay, scores of things with your videos without ever having to open a graphical video-editing application. All you need is FFmpeg [1].

You've probably used FFmpeg before for converting video and audio files between formats. In its simplest form, that is what it does. The instruction

ffmpeg -i input.mp4 output.webm

converts an MP4 video file into a WebM video file.

However, FFmpeg can do much more than that. You can use it to change the frame rate, swap audio and subtitle tracks in and out, and even cut up and rearrange sequences within a movie.
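
A few one-liners hint at that range. The input and output file names below are placeholders, and the commands are wrapped in a guard so the snippet is a harmless no-op if example.mp4 or ffmpeg itself is missing:

```shell
# Each command is an independent example; none of them needs a GUI.
if [ -e example.mp4 ] && command -v ffmpeg >/dev/null; then
    # Force a 25fps frame rate:
    ffmpeg -i example.mp4 -r 25 rate25.mp4
    # Keep the video but select the second audio track (streams count from 0):
    ffmpeg -i example.mp4 -map 0:v -map 0:a:1 -c copy track2.mp4
    # Cut a 30-second clip starting at the one-minute mark, without re-encoding:
    ffmpeg -ss 60 -i example.mp4 -t 30 -c copy clip.mp4
fi
```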

Inserting a Watermark

One of the most powerful FFmpeg features is its effects pipeline [2], or "filters," as they are known by FFmpeg users.

You can apply filters to whole audio or video streams or only to certain parts, use them to merge several streams into one in interesting ways, and do much more.

To illustrate how filters work, I'll show you how to use a logo to watermark a video. You can see the logo in Figure 1. It is a PNG with a transparent background. I'll assume the video you'll be using is a 720p (1280x720) MP4 video called example.mp4.

Figure 1: The logo for watermarking your video.

There are several ways you can carry out this task, but the FFmpeg filter page mentions the overlay filter, and that seems to be the most straightforward way to go.

In the instruction

ffmpeg -i example.mp4 -i LM_logo.png -filter_complex "overlay" -codec:a copy example_marked.mp4

FFmpeg takes two inputs, the example.mp4 video file and the LM_logo.png file, and outputs them together – the second placed on top of the first – to example_marked.mp4. Figure 2 shows the result.

Figure 2: The movie marked with a gigantic watermark.

Of interest is the -filter_complex construct, which sits between the inputs and the output. Within -filter_complex, you can string filters together, and they will be applied one after the other to one stream, the other, or both.

Although this is a step in the right direction, the result isn't very subtle. The logo is in the wrong place. Instead of the upper-left corner, it would be better in the lower right, like most channel logos on TV.

Fortunately, most filters can take parameters, and overlay can too:

ffmpeg -i example.mp4 -i LM_logo.png -filter_complex "overlay=W-w-10:H-h-10" -codec:a copy example_marked.mp4

When you pass a parameter to a filter, you do so using the <filter>=<value> syntax. In this case, you pass to overlay the horizontal position and then the vertical position, separated by a colon (:), of the top layer (containing the logo).

FFmpeg also provides a convenient way to pass the width and height of each layer to the overlay filter: W is the width of the first input (the bottom layer), and w is the width of the second input (the top layer). This means that W-w-10 will place the top overlay layer 10 pixels in from the right-most edge of the bottom video layer. The same goes for H-h-10, but on the vertical axis, measured up from the bottom edge (see Figure 3).

Figure 3: You can place your logo by passing the x and y position as parameters to overlay.
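
The position arguments are expressions, not just numbers, so you can compute any placement from the same variables. For instance, this hedged variant (same hypothetical file names, guarded so it is a no-op when the inputs are missing) centers the logo instead:

```shell
if [ -e example.mp4 ] && [ -e LM_logo.png ] && command -v ffmpeg >/dev/null; then
    # (W-w)/2 and (H-h)/2 leave equal margins on each side: a centered logo.
    ffmpeg -i example.mp4 -i LM_logo.png \
           -filter_complex "overlay=(W-w)/2:(H-h)/2" \
           -codec:a copy example_centered.mp4
fi
```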

However, the logo is still way too big. You can solve this by adding a new filter and chaining it to overlay:

ffmpeg -i example.mp4 -i LM_logo.png -filter_complex "[1:v] scale=150:-1 [ol], [0:v] [ol] overlay=W-w-10:H-h-10" -codec:a copy example_marked.mp4

Several new things are going on here. First, notice the new scale filter, which changes the scale of a video stream by taking a new width and height separated by a colon. If you want to make sure FFmpeg keeps the proportion correct, pass one of the parameters and then -1 as the other. In the example above, you tell scale to make the stream 150 pixels wide and to scale its height proportionally.

But, what video stream are you talking about? The FFmpeg instruction above has two inputs: example.mp4 and LM_logo.png. How do you specify which one you want to scale? Well, that is the purpose of [1:v]. In the filter_complex string, you can specify the input on which you want to operate with a number and the type of stream. Inputs are numbered from 0, so the first input (example.mp4) is 0, and the second input (LM_logo.png) is 1. The letter tells the filter what kind of stream it should operate on. A PNG image only has a visual/video component, so you tell scale to use [1:v]. Other types of inputs can have more components. For example, the example.mp4 input has video and audio components. To apply an effect to the audio, you would use [0:a]; if it had built-in subtitle tracks to which you wanted to apply an effect, you would use [0:s], and so on.
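
As a sketch of the [0:a] case, the snippet below halves the loudness of the audio stream with the volume filter while passing the video through untouched. The file names are illustrative, and the guard makes it a no-op when the input is missing:

```shell
if [ -e example.mp4 ] && command -v ffmpeg >/dev/null; then
    # [0:a] selects the audio of the first input; the filtered stream is
    # labeled [quiet] and mapped into the output next to the original video.
    ffmpeg -i example.mp4 \
           -filter_complex "[0:a] volume=0.5 [quiet]" \
           -map 0:v -map "[quiet]" example_quiet.mp4
fi
```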

At this stage, it is worth mentioning that FFmpeg allows you to label your streams, which is what [ol] is doing: You apply a scaling effect to [1:v], and then (so you can refer to the scaled stream later and not the original input) you give it a name: [ol] (for overlay layer – the name can be anything you want).

As you can see in the command above, you use the video stream from the first input ([0:v]) and overlay the scaled image ([ol]). The comma separating the scale and overlay filters indicates that the output from the first element of the string is piped to the second element, which means you could have used

[0:v] overlay=W-w-10:H-h-10

instead of

[0:v] [ol] overlay=W-w-10:H-h-10

and the result would have been the same.

However, as your filter_complex string becomes more complex, you will discover that labeling streams is not only a helpful memory aid, but also essential to achieving the result you desire.

The end result is that LM_logo.png is shrunk down to a reasonable size and then overlaid in the bottom right-hand corner of the frame, as shown in Figure 4.

Figure 4: The logo is now placed and scaled thanks to the overlay and scale filters.

Blending In

Overlaying is fine if you are okay with an opaque logo obscuring part of your video, but if you want something that will not keep your viewers from seeing all the action, a translucent logo is the way to go.

To do this, you have to shift gears and use the blend filter instead of overlay. blend lets you apply different kinds of blending algorithms to your layers (addition, subtraction, XOR, and so on), which makes it much more versatile than overlay.

The caveat, though, is that blend requires the streams you are merging to have the same resolution and the same sample aspect ratio (SAR), which is the shape of the pixels that make up a frame expressed as a ratio. In modern digital formats, pixels are usually perfectly square; that is, they have a ratio of 1:1. If you are not sure of your clips' SARs, you can use FFmpeg's ffprobe command to find out what they are:

ffprobe example.mp4

The output returns a resolution of 1280x720 pixels and a SAR of 1:1, as expected.

The command

ffprobe LM_logo.png

returns a resolution of 800x314 pixels and a SAR of 2515:2515. This means you have to scale the top layer up to 1280x720 pixels. And, although 2515:2515 reduces to 1:1, FFmpeg doesn't know that, so you will also have to correct the SAR explicitly.

If you simply scale up your logo and change its SAR with setsar=sar=1, as in Listing 1 (one command broken into multiple lines), it works, but you get what you see in Figure 5.

Listing 1

Translucent Logo

01 ffmpeg -i example.mp4 -i LM_logo.png
02   -filter_complex "
03    [1:v] scale=1280:720, setsar=sar=1 [lo];
04    [0:v] setsar=sar=1, format=rgba [bg];
05    [bg][lo] blend=all_mode=addition:all_opacity=0.5, format=yuva422p10le
06    "
07   -codec:a copy example_marked.mp4

Figure 5: A translucent logo blended into the video.

Even if what you see in Figure 5 is not what you want, take a look at the code, because it introduces several new and interesting features. First, notice you have three blocks of filter chains on lines 3, 4, and 5, separated by semicolons. A semicolon between blocks of filters indicates that one block's output is not piped implicitly into the next: line 3 applies filters to the logo (the second input), line 4 applies filters to the video (the first input), and line 5 is where the results of both filtered streams are merged.

More specifically, line 3 scales the logo up to 1280x720 pixels and changes its SAR to 1. Line 4 makes sure the video's SAR is correct by setting it to 1 as well and converts each frame to the RGBA color space, so it can be blended with the logo without causing any strange color effects. Finally, on line 5, you use the blend filter on the output from lines 3 and 4 and convert the blended layers to a video-friendly color space.

The blend filter can take more than one parameter. In fact, it can take more than a dozen [3], so instead of just placing the parameters in order one after another and separating them by colons, as was done with scale, you will want to refer to the name of the parameter explicitly to avoid becoming confused. You do this by pairing off each parameter name with its value, as shown in Figure 6.

Figure 6: Passing several parameters with their names to a filter.

In this case, you are passing an option that specifies how you want to merge each pixel using the blend parameter all_mode and telling it what the opacity of each pixel has to be (0.5 is 50% opaque) with the blend parameter all_opacity.

Notice how you are now confidently using custom labels to describe each filtered input by using [bg] for the background video and [lo] for the logo overlay.

As mentioned earlier, this is not exactly the desired outcome. Again, you want the logo to be down at the bottom right-hand corner tucked away inconspicuously, not splattered all over the video obscuring the action.

Fortunately, FFmpeg provides yet another filter that helps with this problem: pad [4] allows you to resize the frame around the image, filling in the new space with a color or alpha transparency. Listing 2 shows how this works.

Listing 2

Positioned and Scaled

01 ffmpeg -i example.mp4 -i LM_logo.png
02        -filter_complex "
03         [1:v] scale=150:-1, pad=1280:720:ow-iw-10:oh-ih-10, setsar=sar=1, format=rgba [lo];
04         [0:v] setsar=sar=1, format=rgba [bg];
05         [bg][lo] blend=all_mode=addition:all_opacity=0.5, format=yuva422p10le
06         "
07        -codec:a copy example_marked.mp4

All the changes happen on line 3, where you resize the logo to make it smaller with the scale filter you used before. Then you create a "padding" around it, to make the frame 1280x720 pixels. As with overlay, you can decide where to place the original image within the padded frame, either by using numbers or playing with the built-in variables: ow (the padded width of the frame), iw (the original width of the image), oh (the padded height of the frame), and ih (the original height of the image). As with the overlay example, ow-iw-10:oh-ih-10 places the logo 10 pixels to the left of the frame's right-most edge and 10 pixels up from the frame's bottom-most edge.

Finally, to make sure nothing funny happens to the colors when you merge the logo layer with the video layer, you convert the color space of the padded frame to rgba with the format filter.

Figure 7 shows the outcome of running this instruction, which is exactly what you want.

Figure 7: An overlaid translucent logo gives a touch of class to your videos.


All of these command-line manipulations might seem like overkill, and there is no denying FFmpeg's steep learning curve, to say the least, but that is in part because video editing is a complex art.

However, the payback is immense. By handing off trivial and repetitive tasks to FFmpeg, you can avoid having to run power-hungry graphical applications and wasting time manually placing and filtering and then rendering your clips.

FFmpeg is also very mature at this stage and is probably much more stable than most graphical video editors, which means you will avoid the frustration of crashing apps. Because it is a command-line tool, it allows you to batch process scores of videos in one go, with no need for human supervision. It also allows you to ship this kind of cycle-consuming work off to a headless server, freeing up your workstation for more important things, like playing games or browsing the web.
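
For instance, the watermarking command from earlier drops straight into a loop. The clips/ directory is hypothetical, and the [ -e ] test makes the loop a silent no-op when the glob matches nothing:

```shell
# Watermark every MP4 in clips/, writing *_marked.mp4 next to each original.
for f in clips/*.mp4; do
    [ -e "$f" ] || continue        # glob matched nothing: skip cleanly
    ffmpeg -i "$f" -i LM_logo.png \
           -filter_complex "[1:v] scale=150:-1 [ol]; [0:v][ol] overlay=W-w-10:H-h-10" \
           -codec:a copy "${f%.mp4}_marked.mp4"
done
```

The `${f%.mp4}` parameter expansion strips the extension so each output name is derived from its input.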

Regardless of how you look at it, using FFmpeg for automated video editing is win-win-win.
