Convert MPEG-2 files to MPEG-4

This is a note to self, and could potentially also be useful to others in need of converting “old-school” MPEG-2 files into more modern MPEG-4 files using FFmpeg.

In the fourMs lab we have a bunch of Canon XF105 video cameras that record .MXF files with MPEG-2 compression. This is not a very useful format for other things we are doing, so I often have to recompress them to something else.

Inspecting one of the files, I also just discovered that they record the audio as two separate mono streams:

Stream #0:0: Video: mpeg2video (4:2:2), yuv422p(tv, bt709, top first), 1920x1080 [SAR 1:1 DAR 16:9], 50000 kb/s, 25 fps, 25 tbr, 25 tbn, 50 tbc

Stream #0:1: Audio: pcm_s16le, 48000 Hz, mono, s16, 768 kb/s

Stream #0:2: Audio: pcm_s16le, 48000 Hz, mono, s16, 768 kb/s
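
A more compact version of such a stream listing can be produced with ffprobe, which ships with FFmpeg. This is only a minimal sketch: probe_streams is a hypothetical helper name, and the file name in the usage line is a placeholder.

```shell
# List the index, codec, type and channel count of each stream in a file.
# probe_streams is a hypothetical helper name, not part of FFmpeg.
probe_streams() {
    ffprobe -v error \
        -show_entries stream=index,codec_type,codec_name,channels \
        -of csv=p=0 "$1"
}

# Usage (placeholder file name):
# probe_streams input.mxf
```

For the camera files described here, one would expect one video stream and two audio streams with one channel each.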

So I also want to merge these two mono tracks (which are the left and right inputs of the camera) into a stereo track. FFmpeg comes in handy (as always), and I figured out that this little one-liner will do the trick:

ffmpeg -i input.mxf -vf yadif -vcodec libx264 -q:v 3 -filter_complex "[0:a:0][0:a:1]amerge,channelmap=channel_layout=stereo[st]" -map 0:v -map "[st]" output.mp4

An explanation of some of these settings:

  • yadif: this is for deinterlacing the video
  • libx264: this is probably unnecessary, but forces FFmpeg to use the x264 encoder, which produces H.264 (MPEG-4 AVC) video
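  • q:v 3: I find this to be a good setting for the video compressor (note, however, that libx264 may ignore -q:v and print a warning recommending -crf for setting the quality instead)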
  • filter_complex: this complex string (courtesy of reddit) does the merging of the two mono sources
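
Since there are usually many files to convert, the one-liner can be wrapped in a small shell function and run in a loop. This is only a sketch: convert_mxf is a hypothetical helper name, the flags are copied unchanged from the command above, and it assumes that every input file has exactly two mono audio streams.

```shell
# Convert one MXF file to MP4 with the same settings as the one-liner.
# convert_mxf is a hypothetical helper name, not part of FFmpeg.
# Assumption: the input has exactly two mono audio streams.
convert_mxf() {
    src=$1
    dst="${src%.*}.mp4"   # same base name, with an .mp4 extension
    ffmpeg -i "$src" -vf yadif -vcodec libx264 -q:v 3 \
        -filter_complex "[0:a:0][0:a:1]amerge,channelmap=channel_layout=stereo[st]" \
        -map 0:v -map "[st]" "$dst"
}

# Usage: convert every .mxf file in the current directory.
# for f in *.mxf; do convert_mxf "$f"; done
```

The output name is derived from the input name, so the original files are left untouched.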

Will probably try to add it to MGT-terminal at some point, but this blog post will suffice for now.

Simple tips for better video conferencing

Very many people are currently moving to video-based meetings. For that reason I have written up some quick advice on how to improve your setup. This is based on my interview advice, but grouped differently.

Network

The first important thing is to have as good a network as you can. Video conferencing requires a lot of bandwidth, so even though your e-mail and regular browsing work fine, the connection may still not be sufficient for good video transmission.

  • Cabled network: If you are able to connect to your router with an Ethernet cable, that is usually the best and most solid solution.
  • Wireless network: If cable won’t work for you (it is also difficult logistically in my own apartment), try to get as close as possible to your wi-fi router.

Audio

I would argue that improving the audio is more important than the video for video conferencing. Most video conferencing systems (Skype, Zoom, etc.) will prioritize the audio channel, which means that the video may stutter while the audio is passing through fine.

The main trick is to aim for separating the “foreground” as much as possible from the “background”. There are some very basic audio principles to follow:

  • Use a headset: The best way to get decent sound for video conferencing is to move the microphone as close as possible to your mouth. Headsets with a microphone boom in front of your face are the best, but a regular mobile phone headset (the one that came with your mobile phone, for example) would still be better than nothing.
  • Use headphones: If you for some reason do not have a headset with built-in microphone, using a regular pair of headphones is still better than using the speakers on your computer. With this setup you use the microphone on the computer, which may not be ideal, but at least you won’t get feedback problems.
  • Avoid reverberant rooms: If you aim for clarity in conversation, it is typically better to sit in a smaller and more damped room than a large one. That means that a bedroom is typically better than a larger living room. If you use a headset this is less important, but particularly if you only use the built-in microphone and speakers on a laptop, this could make a huge difference in how your voice gets through.
  • Mute yourself: In most systems there is a button to mute yourself. If you are not talking all the time, it helps to mute yourself from the discussion. Just remember to unmute when you want to say something!

Video

The same principle of separating “foreground” from “background” applies to the video.

  • Lighting: To obtain the best possible video image, think about your placement with respect to lighting. It is, for example, not ideal to sit in front of a window, since a bright light in the background will make it difficult to see your face.
  • Background: The best is to sit in front of a plain wall. If that is not possible, consider whether the background of your image is what you want to show to your fellow students/colleagues.
  • Video angle: If you are using the built-in camera on your computer you may not have too many options for how to place the camera. But you may still consider shifting the camera position so that you and your surroundings look as good as possible.

Summing up

There are, of course, many ways to improve your video conferencing setup. Many people believe that you need to invest in expensive equipment to get good results. But even cheap consumer products are very capable of producing decent results these days. So it is more a matter of optimizing what you have. Good luck!

Visualizing some videos from the AIST Dance Video Database

Researchers from AIST have released an open database of dance videos, and I got very excited to try out some visualization methods on some of the files. This was also a good chance to test out some new functionality in the Musical Gestures Toolbox for Matlab that we are developing at RITMO. The AIST collection contains a number of videos. I selected one hip-hop dance video based on a very steady rhythmic pattern, and a contemporary dance video that is more fluid in both motion and music.

Hip-hop dance

I have looked at a couple of different files. Let us start with this one:

We can start by looking at the motion video from this. While a motion video gives less information about context, I often find them interesting to study since they reveal the essentials of what is going on.
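
For readers who want to try something similar without the toolbox, a rough approximation of a motion video can be made with FFmpeg. This is only a sketch under the assumption that a motion video is based on the difference between consecutive frames; it is not the toolbox's own implementation, and motion_video is a hypothetical helper name.

```shell
# Write a frame-difference ("motion") video of the input file.
# motion_video is a hypothetical helper name, not part of FFmpeg.
# The tblend filter blends each frame with the previous one; the
# "difference" mode keeps only what changed between the two frames.
motion_video() {
    src=$1
    dst="${src%.*}_motion.mp4"
    # -an drops the audio, since only the image is of interest here
    ffmpeg -i "$src" -vf "tblend=all_mode=difference" -an "$dst"
}

# Usage (placeholder file name):
# motion_video dance.mp4
```

The result has no thresholding or noise reduction, so compression noise in the source will show up in the output.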

And from the motion video we can look at the motiongrams and average image:

The horizontal motiongram reveals the repetitiveness of the dance motion, but also some of the variation throughout the different parts. I also really like the “bump” in the vertical motiongram. This is caused by the couple of side-steps he is doing midway through the session. The “line” that can be seen throughout the horizontal motiongram is caused by the cable in the back of the video.

Contemporary dance

And then I looked at another video, with a very different character:

From this we get the following motion video (wait a few seconds, since there is no dance in the beginning…):

The average image and motiongrams from this video reveal the spatial distribution of the dancer’s motion on stage. Here it is also possible to see an artifact of the compression algorithm of the video file in the beginning of the motiongrams.

I really look forward to continuing the exploration of this wonderful new and open database. Thanks to the AIST researchers for sharing!

Testing simple camera and microphone setups for quick interviews

We just started a new run of our free online course Music Moves. Here we have a tradition of recording wrap-up videos every Friday, in which some of the course educators answer questions from the learners. We have recorded these in many different ways over the years, from using high-end cameras and microphones to just using a handheld phone. We have found that using multiple cameras and microphones is just too time-consuming, both in terms of setup and editing. Using only a mobile phone is extremely easy to set up, but we have had challenges with the audibility of the speech. Before recording this semester’s wrap-up videos I therefore decided to test out some solutions based on equipment I had lying around:

  • GoPro Hero 7 w/o audio connector
  • Sony RX100 V
  • Zoom Q8
  • Samsung Galaxy Note 8
  • Røde Smartlav+ lavalier microphone
  • DPA Core 4060 lavalier microphone

In the following I will show some of the results of the testing. I decided to skip the Sony camera in this write-up, because it doesn’t have the option of connecting a separate microphone.

Testing various devices in my office.

GoPro Hero 7

The first example is of a GoPro Hero 7 with just the built-in microphone. This worked much better than expected. The audio is quite clear and it is easy to hear what I am saying. The colours of the video are vivid, but the image is compressed quite a bit. The video is very wide-angled, which is super-practical for such an interview setting, although it looks a bit skewed on the edges. But overall this was a positive surprise.

Connecting a Røde Smartlav+ to the GoPro results in a very clean sound. In fact, this could have been a very nice setup, had it not been for some challenges with placing the camera. The audio dongle for the GoPro is bent downwards, which makes it impossible to use the housing needed to put the camera on a tripod (as can be seen in the picture to the right). This makes it super-clumsy to use this setup in a real-life situation. I hear rumours about a new audio add-on for new GoPro cameras, and that may be very interesting to check out.

Zoom Q8

My next device is the Zoom Q8. This is actually a sound recorder with a built-in camera, so one would expect that the audio is the main priority. This is also the case. The video is quite noisy, but the sound quality is much better than with the GoPro. Still I find that the microphone picks up quite a bit of the room. This is good for music recordings, but not so good when the focus is on speech quality.

Hooking up a DPA 4060 lavalier microphone to the Zoom Q8 definitely helps. This is a high-quality microphone, and it needs phantom power (which the Zoom Q8 can deliver). As expected, this gives great sound, very loud and clear. The downside is that it requires bringing an extra XLR cable together with the microphone and camera, since the cable of the DPA is too short for such an interview setup. I like the wide-angle of the video, but the quality of the video is not very good.

Samsung Galaxy Note 8

Mobile phones are becoming increasingly powerful, and I also had to try the camera of my Samsung Galaxy Note 8. I have a small Manfrotto mobile phone stand which makes it possible to place it on a tripod at a suitable distance. After recording I realized how much less wide-angle the phone image is than the GoPro and Zoom cameras, leaving my head cut off in the shots. This doesn’t matter for the testing here, however. The first video is using the built-in microphone of the mobile phone. I am very positively surprised about how crisp and clear my voice is coming through here. In fact, it is quite similar to the GoPro. The video quality is also very good, and clearly the best of the three devices being compared here (the Sony camera has much better video, but it was discarded due to the lack of a microphone input).

And, finally, I connected the SmartLav+ lavalier microphone to the Samsung phone. Here the sound is, of course, very similar to the GoPro recordings.

Conclusion

It is not entirely straightforward to conclude from this testing, but here are some of my thoughts after this very rapid and not very systematic testing:

  • Using on-body microphones (lavalier) greatly improves the audibility as compared to using built-in microphones.
  • The DPA 4060 is great, but the Smartlav+ is more than good enough for interviews.
  • The GoPro could have been a great device for such interviews, had it not been for the skewed image and the clumsiness of the audio adaptor.
  • The Zoom Q8 is the best audio device (as it should be!), but its video quality is unfortunately too poor.
  • All in all, I think that the easiest and best solution is the Samsung phone with Smartlav+.

Keynote: Experimenting with Open Research Experiments

Yesterday I gave a keynote lecture at the Munin Conference on Scholarly Publishing in Tromsø. This is an annual conference that gathers librarians, research administrators and publishers, but also some researchers and students. It was my first time at the conference, and I found it to be a very diverse, interesting and welcoming group of people.

A poster tweet from the Munin conference team.

Most of the other presenters talked about issues related to publishing academic texts, and with a particular focus on the transition to open access (OA). My presentation was focused on MusicLab, an open research pilot project we are running at the University of Oslo.

MusicLab is a collaboration between RITMO and the University Library, and it is a great example of how cool things can happen when progressive librarians work together with cutting-edge researchers. If you have never heard about it before, here is a 42-second introduction to what MusicLab is all about:

Since lots of people talked about Open Science at the conference, I started out by arguing for why I believe that Open Research is a more inclusive term than Open Science. I then went on to identify some of the parts that people think about when talking about Open Research:

Some of the building blocks of an Open Research ecosystem.

As can be seen from the slide above, Open Access (which should probably be called Open Publication instead, since many people mistake it to mean Open Research) is just one part of the whole picture. In the picture above, I am also thinking about these building blocks as being placed on a “timeline” going from left to right, although there may certainly be recursive parts of the model as well.

For a researcher, the publication part typically happens fairly late in the process, so I always try to remind people that the actual research happens before it is published. For example, I think the writing process should also be thought of as an open process. I mentioned some of my explorations into using various tools for writing Open Manuscripts:

None of these are perfect, however, and for some upcoming projects I am thinking about exploring Authorea and Jupyter Notebook as writing tools. After my talk I also got a recommendation for Bookdown, which I would like to look more at as well (although I have for a long time avoided getting into R, since I am currently investing some time in moving my code from Matlab to Python).

MusicLab

After the fairly long introduction, I finally got to the main point of the talk, which is that of MusicLab. Here are some of the slides from that part:

A MusicLab event is built around a concert, but also typically contains a workshop, panel discussion, data collection, and data jockeying.
Some photos from MusicLab vol. 1, which was focused on muscles, and with a performance by Marco Donnarumma (Photos: Simen Kjellin, UiO).
The MusicLab events are part of a pilot project which is aimed at discovering new ways of doing research, education, and dissemination in open ways.

Challenges

One of the points of MusicLab is to jump in and do something that everyone says is “impossible”… We have, of course, our set of challenges, particularly related to:

  • Privacy (GDPR)
  • Copyright and licenses
  • Storage
  • Archive

I will write more about all of these later, but here are just some slides to summarize some points:

Dividing the people at a MusicLab into three groups helps when it comes to identifying and solving issues of privacy.
We have not solved the problem of copyright in relation to Open Research yet, but we are starting to get an overview of all the problems…
Storage is not only about saving files somewhere. They need to be usable as well, ideally right away.
This is the list of files from MusicLab vol. 4, and some of the tools we want to use to analyze them.

We have more challenges than solutions at the moment. But it is good to see that things are moving in the right direction. The dream scenario would combine the multimedia visualization tools of Repovizz with the interconnectivity of Trompa, the CC-spirit of Audio Commons, the versioning of GitHub, the accessibility and community of Wikipedia, and the long-term archiving of Zenodo. While that may sound entirely far-fetched right now, it could be a reality with some more interoperability.

I got lots of interesting feedback after my talk. It was particularly interesting to hear several people commenting on the importance of having more people from the arts and humanities involved in discussions about Open Research. I am happy to be one such voice, and hopefully MusicLab can inspire others to push the boundaries for what is currently possible.

If you want to watch the entire thing, it can be found towards the end of this recorded live stream: