Normalize audio in video files

We are organizing the Rhythm Production and Perception Workshop at RITMO next week. As mentioned in another blog post, we have asked presenters to send us pre-recorded videos. They are all available on the workshop page.

During the workshop, we will play sets of videos in sequence. When doing a test run today, we discovered that the sound levels differed wildly between files. There is clearly the need for normalizing the sound levels to create a good listener experience.

Batch normalization

How does one normalize around 100 video files without too much pain and effort? As always, I turn to my go-to video companion, FFmpeg. Here is a small script I made to do the job:

#!/bin/bash

shopt -s nullglob
for i in *.mp4 *.MP4 *.mov *.MOV *.flv *.webm *.m4v; do 
   name=`echo $i | cut -d'.' -f1`; 
   ffmpeg -i "$i" -c:v copy -af loudnorm=I=-16:LRA=11:TP=-1.5 "${name}_norm.mp4"; 
done

This was the result of some searching around for a smart solution (in Qwant, btw, my new preferred search engine). For example, I use the “nullglob” trick to list multiple file types in the for loop.

The most important part of the script is the normalization, which I found in this blog post. The settings are described as:

  • loudnorm: the name of the normalization filter
  • I: the integrated loudness (from -70 to -5.0)
  • LRA: the loudness range (from 1.0 to 20.0)
  • TP: Indicates the max true peak (from -9.0 to 0.0)

The settings in the script normalize to a high but not maximum signal, which leaves some headroom.

To compress or not

To save processing time and avoid recompressing the video, I have included “-c:v copy” in the script above. Then FFmpeg copies over the video content directly. This is fine for videos with “normal” H.264 compression, which is the case for most .MP4 files. However, when getting 100 files made on all sorts of platforms, there are surely some oddities. There were a couple of cases with weird compression formats, that for some reason failed with the above script. One also had interlacing issues. For them, I modified the script to recompress the files.

#!/bin/bash

shopt -s nullglob
for i in *.mp4 *.MP4 *.mov *.MOV *.flv *.webm *.m4v; do 
    name=`echo $i | cut -d'.' -f1`; 
    ffmpeg -i "$i" -vf yadif -af loudnorm=I=-16:LRA=11:TP=-1.5 "${name}_norm.mp4"; 
done

In this script, the copy part is removed. I have also added “-vf yadif”, which is a de-interlacing video filter.

Summing up

With the first script, I managed to normalize all 100 files in only a few minutes. Some of the files turned up with 0 bytes due to issues with copying the video data. So I ran through these with the second script. That took longer, of course, due to the need for compressing the video.

All in all, the processing took around half an hour. I cannot even imagine how long it would have taken to do this manually in a video editor. I haven’t really thought about the need for normalizing the audio in videos like this before. Next time I will do it right away!

Published by

alexarje

Alexander Refsum Jensenius is a music researcher and research musician living in Oslo, Norway.