Normalize audio in video files

We are organizing the Rhythm Production and Perception Workshop at RITMO next week. As mentioned in another blog post, we have asked presenters to send us pre-recorded videos. They are all available on the workshop page.

During the workshop, we will play sets of videos in sequence. When doing a test run today, we discovered that the sound levels differed wildly between files, so there is clearly a need to normalize them to create a good listening experience.

Batch normalization

How does one normalize around 100 video files without too much pain and effort? As always, I turn to my go-to video companion, FFmpeg. Here is a small script I made to do the job:

#!/bin/bash

shopt -s nullglob
for i in *.mp4 *.MP4 *.mov *.MOV *.flv *.webm *.m4v; do
    name="${i%.*}"   # strip the file extension (handles names containing dots)
    ffmpeg -i "$i" -c:v copy -af loudnorm=I=-16:LRA=11:TP=-1.5 "${name}_norm.mp4"
done

This script was the result of some searching around for a smart solution (in Qwant, by the way, my new preferred search engine). For example, the “nullglob” option makes unmatched patterns expand to nothing, which is what allows listing multiple file types in the for loop.
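The effect of nullglob is easy to demonstrate in isolation: without it, an unmatched pattern such as *.flv is passed through as a literal string, which would make the loop hand FFmpeg a nonexistent file name. A minimal sketch (no video files needed; the directory name is just a placeholder):

```shell
#!/bin/bash
d="/tmp/rppw-demo-nonexistent-$$"   # a directory that does not exist

# Without nullglob, an unmatched glob expands to itself (the literal pattern).
shopt -u nullglob
set -- "$d"/*.flv
echo "without nullglob: $# argument(s), first is: $1"

# With nullglob, the same pattern expands to nothing, so a for loop over it
# simply never runs for extensions that are not present.
shopt -s nullglob
set -- "$d"/*.flv
echo "with nullglob: $# argument(s)"
```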

The most important part of the script is the normalization, which I found in this blog post. The settings are described as:

  • loudnorm: the name of the EBU R128 loudness normalization filter
  • I: the target integrated loudness in LUFS (from -70.0 to -5.0; default -24.0)
  • LRA: the target loudness range in LU (from 1.0 to 20.0; default 7.0)
  • TP: the maximum true peak in dBTP (from -9.0 to 0.0; default -2.0)

The settings in the script normalize to a high but not maximum signal, which leaves some headroom.
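For even better accuracy, the loudnorm filter also supports a two-pass workflow: a first pass only measures the file, and a second pass feeds the measured values back so the filter can run in its linear mode. A sketch of the idea, assuming a hypothetical input file talk.mp4 (the measured_* numbers below are placeholders to be replaced with the values printed by the first pass):

```shell
# Pass 1: analyze only. print_format=json makes loudnorm print the measured
# stats (input_i, input_lra, input_tp, input_thresh) without writing a file.
ffmpeg -i talk.mp4 -af loudnorm=I=-16:LRA=11:TP=-1.5:print_format=json -f null -

# Pass 2: apply the normalization, passing the measured values back in.
# Replace the measured_* placeholders with the numbers printed by pass 1.
ffmpeg -i talk.mp4 -c:v copy \
    -af loudnorm=I=-16:LRA=11:TP=-1.5:measured_I=-23.0:measured_LRA=6.4:measured_TP=-4.1:measured_thresh=-33.5:linear=true \
    talk_norm.mp4
```

For a batch of 100 files this doubles the processing time, so the one-pass version in the script above is a reasonable trade-off.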

To compress or not

To save processing time and avoid recompressing the video, I have included “-c:v copy” in the script above. Then FFmpeg copies over the video content directly. This is fine for videos with “normal” H.264 compression, which is the case for most .MP4 files. However, when receiving 100 files made on all sorts of platforms, there are bound to be some oddities. A couple of files used unusual compression formats and, for some reason, failed with the above script. One also had interlacing issues. For these, I modified the script to recompress the files.

#!/bin/bash

shopt -s nullglob
for i in *.mp4 *.MP4 *.mov *.MOV *.flv *.webm *.m4v; do
    name="${i%.*}"   # strip the file extension (handles names containing dots)
    ffmpeg -i "$i" -vf yadif -af loudnorm=I=-16:LRA=11:TP=-1.5 "${name}_norm.mp4"
done

In this script, the “-c:v copy” part is removed, so FFmpeg recompresses the video. I have also added “-vf yadif”, which is a de-interlacing video filter.

Summing up

With the first script, I managed to normalize all 100 files in only a few minutes. Some of the files ended up with 0 bytes due to issues with copying the video data, so I ran these through the second script. That took longer, of course, because the video had to be recompressed.
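Finding the failed files can also be scripted. A small sketch (assuming the _norm.mp4 naming used above) that lists any zero-byte outputs so they can be redone with the second script:

```shell
#!/bin/bash
# List normalized output files that ended up empty (size 0),
# i.e. candidates for a re-run with the recompressing script.
shopt -s nullglob
for f in *_norm.mp4; do
    if [ ! -s "$f" ]; then   # -s is true only for files larger than 0 bytes
        echo "empty: $f"
    fi
done
```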

All in all, the processing took around half an hour. I cannot even imagine how long it would have taken to do this manually in a video editor. I haven’t really thought about the need for normalizing the audio in videos like this before. Next time I will do it right away!

New publication: NIME and the Environment

This week I presented the paper NIME and the Environment: Toward a More Sustainable NIME Practice at the International Conference on New Interfaces for Musical Expression (NIME) in Shanghai/online with Raul Masu, Adam Pultz Melbye, and John Sullivan. Below is our 3-minute video summary of the paper.

And here is the abstract:

This paper addresses environmental issues around NIME research and practice. We discuss the formulation of an environmental statement for the conference as well as the initiation of a NIME Eco Wiki containing information on environmental concerns related to the creation of new musical instruments. We outline a number of these concerns and, by systematically reviewing the proceedings of all previous NIME conferences, identify a general lack of reflection on the environmental impact of the research undertaken. Finally, we propose a framework for addressing the making, testing, using, and disposal of NIMEs in the hope that sustainability may become a central concern to researchers.

Paper highlights

Our review of the NIME archive showed that only 12 out of 1867 NIME papers have explicitly mentioned environmental topics. This is remarkably low and calls for action.

My co-authors have launched the NIME Eco Wiki as a source of knowledge for the community. It is still quite empty, so we encourage the community to help develop it further.

In our paper, we also present an environmental cost framework. The idea is that this matrix can be used as a tool to reflect on the resources used at various stages in the research process.

Our proposed NIME environmental cost framework.

The framework was first put into use during the workshop NIME Eco Wiki – a crash course on Monday. In the workshop, participants filled out a matrix each for one of their NIMEs. Even though the framework is a crude representation of a complex reality, many people commented that it was a useful starting point for reflection.

Hopefully, our paper can raise awareness about environmental topics and lead to a lasting change in the NIME community.

Making 100 video poster images programmatically

We are organizing the Rhythm Production and Perception Workshop 2021 at RITMO a week from now. Like many other conferences these days, this one will also be run online. Presentations have been pre-recorded (10 minutes each) and we also have short poster blitz videos (1 minute each).

Pre-recorded videos

People have sent us their videos in advance, but they all have different first “slides”. So, to create some consistency among the videos, we decided to make an introduction slide for each of them. This would then also serve as the “thumbnail” of the video when presented in a grid format.

One solution could be to add some frames at the beginning of each video file. This could probably be done with FFmpeg without recompressing the files. However, given that we are talking about approximately 100 video files, I am sure there would have been some hiccups.

A quicker and better option is to add “poster images” when uploading the files to YouTube. We also support this on UiO’s web pages, which serves as the long-term archive of the material. The question, then, is how to create these 100 poster images without too much work. Here is how I did it on my Ubuntu machine.

Mail Merge in LibreOffice Writer

My initial thought was to start with Impress, the free presentation software in LibreOffice. I quickly searched to see if there is any tool to create slides programmatically but didn’t find anything that seemed to be straightforward.

Instead, I remembered the good old “mail merge” functionality of Writer. This was made for creating envelope labels back in the day when people still sent physical mail, but it can be tweaked for other purposes. After all, I had the material I wanted to include in the poster image in a simple spreadsheet, so it was quick and easy to import the spreadsheet into Writer and select the two columns I wanted to include (“author name” and “title”).

A spreadsheet with the source information about authors and paper titles.

I wanted the final image to be in Full-HD format (1920 x 1080 pixels), which is not a standard format in Writer. However, there is the option of choosing a custom page size, so I set up a page size of 192 x 108 mm in Writer. Then I added some fixed elements on the page, including a RITMO emblem and the conference title.

Setting up the template in LibreOffice Writer.

Finally, I saved a file with the merged content and exported it as a PDF.

From PDF to PNG

The output of Writer was a multi-page PDF, but what we need is a single image file per video. So I turned to the terminal and used this one-liner based on pdfseparate to split the PDF into multiple one-page PDF files:

pdfseparate rppw2021-papers-merged.pdf posters%d.pdf

The trick here is the %d placeholder, which gives each output PDF a sequential number.
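One small catch with plain %d numbering is that the resulting files do not sort correctly by name (posters10.pdf sorts before posters2.pdf). A quick way around this, sketched here under the posters%d.pdf naming above, is to zero-pad the numbers afterwards:

```shell
#!/bin/bash
# Rename posters1.pdf ... posters100.pdf to posters001.pdf ... posters100.pdf
# so that the files sort correctly by name.
shopt -s nullglob
for f in posters[0-9]*.pdf; do
    num="${f#posters}"                               # strip the "posters" prefix
    num="${num%.pdf}"                                # strip the ".pdf" suffix
    new="$(printf 'posters%03d.pdf' "$((10#$num))")" # 10# avoids octal parsing
    if [ "$f" != "$new" ]; then
        mv -n "$f" "$new"
    fi
done
```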

Next, I wanted to convert these individual PDF files to PNG files. Here I turned to the convert function of ImageMagick, and wrote a short one-liner that does the trick:

for i in *.pdf; do name="${i%.*}"; convert -density 300 -resize 1920x1080 -background white -flatten "$i" "$name.png"; done

It looks for all the PDFs in a directory and converts each to a PNG file at Full-HD resolution. I found that it was necessary to include “-density 300” to get a nice-looking image; ImageMagick rasterizes PDFs at a fairly low 72 DPI by default. To avoid any transparency issues in later stages, I also included the “-background white” and “-flatten” options.

The end result was a folder of PNG files.

Putting it all together

The last step is to match the video files with the right PNG image in the video playback solution. Here it is shown using the video player we have at UiO:

Once I figured out the workflow, the whole process was very rapid. Hopefully, this post can save someone many hours of manual work!

Launching NOR-CAM – A toolbox for recognition and rewards in academic careers

What is the future of academic career assessment? How can open research practices be included as part of a research evaluation? These were some of the questions we asked ourselves in a working group set up by Universities Norway. Almost two years later, the report is ready. Here I will share some of the ideas behind the suggested Norwegian Career Assessment Matrix (NOR-CAM) and some of the other recommendations coming out of the workgroup.

The Norwegian Career Assessment Matrix (NOR-CAM).

EUA work on research assessment

I have for some years been Norway’s representative in the European University Association’s Expert Group on Open Science/Science 2.0 (on a side note, I have written elsewhere about why I think it should be called Open Research instead). The expert group meets 3-4 times a year, usually in Brussels but nowadays online, to discuss how Open Science principles can be developed and implemented in European universities.

A lot of things have happened in the world of Open Science during the three years that I have been in the expert group. Open access to publications is improving every day. Open access to research data is coming along nicely, although there are still many challenges. Despite the positive developments, there is one key challenge that we always get back to discussing: research assessment. How should researchers get their “points” in the system, who should get the job, and who should get a promotion?

Up until now, publication lists and citation counts have been the most important “currency” for researchers. We have, over the years, seen an unfortunate focus on metrics, like the h-index and the journal impact factor (and others). The challenge is that only asking for publication lists (and publication-related metrics) takes focus away from all the other elements of an open research ecosystem.

Various building blocks in an open research ecosystem.

The need to rethink research assessment led to the EUA Webinar on Academic Career Assessment in the Transition to Open Science last year. As the title of the webinar shows, we decided to broaden the perspective from only thinking about research assessment to considering academic career assessment more generally. This also became the focus of the Universities Norway workgroup and the final report.

Six principles

In the report we list six principles for the future of career assessment:

  1. Measure quality and excellence through a better balance between quantitative and qualitative goals
  2. Recognise several competencies as merits but not in all areas at the same time or by each employee
  3. Assess all results, activities and competencies in the light of Open Science principles
  4. Practice transparency in the assessment and visibility of what should be recognised as merit
  5. Promote gender balance and diversity
  6. Assist in the concrete practice of job vacancy announcements and assessment processes locally

Four recommendations

The work group then went on to suggest four recommendations for different actors (individuals, institutions, research funders, government):

  1. To establish a comprehensive framework for the assessment of academic careers that:
    • balances quantitative and qualitative goals and forms of documentation for academic standards and competencies
    • enables diverse career paths and promotes high standards in the three key areas: education, research and interaction with society
    • recognises the independent and individual competencies of academic staff as well as their achievements in groups and through collaboration
    • values Open Science principles (including open assessment systems)
    • values and encourages academic leadership and management
  2. To engage internationally in developing a Norwegian assessment model because:
    • changes in the assessment criteria cannot be made by one country alone
    • a Norwegian model can contribute to related processes internationally
  3. To use NOR-CAM as a practical and flexible tool for assessing academic results, competence and experience for academic personnel. NOR-CAM will highlight six areas of expertise through systematic documentation and reflection
  4. To develop an ‘automagic CV system’ that enables academics to retrieve data that can be used to document competencies and results in their own career, including applications for positions, promotions and external funding.

Follow-up

Today, I presented the Norwegian report for the EUA workgroup. In many ways, the circle is completed. After all, the inspiration for the Norwegian report came directly from the work of EUA. Hopefully, the report can inspire others in Europe (and beyond) to think anew about career assessment.

Even though it took nearly two years, writing a report is only the beginning. Now it is time to work on how NOR-CAM can be implemented. I am looking forward to contributing to making it become a reality.

Read the full report here:

Combining audio and video files with FFmpeg

When working with various types of video analysis, I often end up with video files without audio. So I need to add the audio track by copying it either from the source video file or from a separate audio file. There are many ways of doing this. Many people would probably reach for a video editor, but then you would most likely end up recompressing both the audio and the video. A better solution is to use FFmpeg, the Swiss Army knife of video processing.

As long as you know that the audio and video files you want to combine have the same duration, this is an easy task. Say that you have two video files:

  • input1.mp4 = original video with audio
  • input2.avi = analysis video without audio

Then you can use this one-liner to copy the audio from one file to the other:

ffmpeg -i input1.mp4 -i input2.avi -c copy -map 1:v:0 -map 0:a:0 -shortest output.avi

The output.avi file will have the video content of input2.avi but the audio from input1.mp4. Here, “-map 1:v:0” selects the first video stream from the second input file (FFmpeg numbers inputs from 0), and “-map 0:a:0” selects the first audio stream from the first input file. Note that this is a lossless (and fast) procedure; it just copies the streams from the source files.
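Since the method assumes the two files have the same duration, it can be worth checking that first. ffprobe, which comes with FFmpeg, can print a file's duration in seconds; a small sketch:

```shell
# Print the duration (in seconds) of each input before combining them.
ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 input1.mp4
ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 input2.avi
```

If the durations differ, the “-shortest” flag in the commands below at least guarantees that the output stops when the shorter stream ends.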

If you want to convert (and compress) the file in one operation, you can use this one-liner to export an MP4 file with H.264 video and AAC audio compression:

ffmpeg -i input1.mp4 -i input2.avi -map 1:v:0 -map 0:a:0 -shortest -c:v libx264 -c:a aac output.mp4

Since this involves compressing the file, it will take much longer than the first method.