Try not to headbang challenge

I recently came across a video of the so-called Try not to headbang challenge, where the idea is, well, not to headbang while listening to music. This immediately caught my attention. After all, I have been researching music-related micromotion over the last few years and have run the Norwegian Championship of Standstill since 2012.

Here is an example of Nath & Johnny trying the challenge:

As seen in the video, they are doing OK, although they are far from sitting still. Running the video through the Musical Gestures Toolbox for Python makes it possible to see clearly when and how much they moved.

Below is a quick visualization of the 11-minute long sequence. The videogram (similar to a motiongram but of the original video) shows quite a lot of motion throughout. There is no headbanging, but they do not sit still.

A videogram of the complete video recording (top) with a waveform of the audio track. Two selected frames from the sequence and “zoomed-in” videograms show the motion of specific passages.

There are many good musical examples listed here. We should consider some of them for our next standstill championship. If corona allows, we plan to run a European Championship of Standstill in May 2022. More information soon!

Kayaking motion analysis

Like many others, I bought a kayak during the pandemic, and I have had many nice trips in the Oslo fjord over the last year. Working at RITMO, I think a lot about rhythm these days, and the rhythmic nature of kayaking made me curious to investigate the pattern a little more.

Capturing kayaking motion

My spontaneous investigations into kayak motion began with simply recording a short video of myself kayaking. This was done by placing an action camera (a GoPro Hero 8, to be precise) on my life vest. The result looks like this:

In the future, it would be interesting to also test with a proper motion capture system (see this article for an overview of different approaches). However, as they say, the best motion capture system is the one you have at hand, and cameras are by far the easiest one to bring around.

Analysing kayaking motion

For the analysis, I reached for the Musical Gestures Toolbox for Python. It has matured nicely over the last year and is also where we are putting in most new development efforts these days.

The first step of motion analysis is to generate a motion video:

From the motion video, MGT will also create a motiongram:

Motiongram of a kayaking video.
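Conceptually, a motion video can be approximated by frame differencing, and a motiongram by collapsing every motion frame into a one-pixel stripe. The sketch below illustrates the idea with NumPy on a small synthetic greyscale clip. The function names and the threshold value are illustrative assumptions; this is not MGT's API, nor necessarily how MGT implements these steps.

```python
import numpy as np

def motion_frames(frames, threshold=10):
    """Absolute difference between consecutive frames, with small values zeroed.

    frames: greyscale uint8 array of shape (time, height, width).
    Returns a uint8 array of shape (time - 1, height, width).
    """
    f = frames.astype(np.int16)        # avoid uint8 wrap-around when subtracting
    diff = np.abs(f[1:] - f[:-1])
    diff[diff < threshold] = 0         # crude noise filter (assumed value)
    return diff.astype(np.uint8)

def motiongram(motion):
    """Average each motion frame over its width: one column per frame.

    Returns an array of shape (height, time - 1), with time running left to right.
    """
    return motion.mean(axis=2).T

# Synthetic clip: a bright two-pixel-high bar moving down a dark 8x6 image.
clip = np.zeros((5, 8, 6), dtype=np.uint8)
for t in range(5):
    clip[t, t:t + 2, :] = 200

motion = motion_frames(clip)
mgram = motiongram(motion)
print(mgram.shape)  # (8, 4)
```

Only the leading and trailing edges of the bar show up in the result, since the overlapping row does not change between consecutive frames.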

From the motiongram, it is pretty easy to see the regularity of the kayaking strokes. This may be even easier from the videogram:

Videogram of a kayaking video.

We also get information about the centroid and quantity of motion:

Centroid and quantity of motion of the kayaking video.
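For readers unfamiliar with these two features, here is a minimal sketch of one common way to define them for a single motion frame: the quantity of motion as the share of active pixels, and the centroid as the intensity-weighted mean position. Both definitions and function names are assumptions for illustration; MGT may normalise or compute them differently.

```python
import numpy as np

def quantity_of_motion(motion_frame):
    """Fraction of pixels that are active (non-zero) in a motion frame."""
    m = np.asarray(motion_frame, dtype=float)
    return np.count_nonzero(m) / m.size

def centroid_of_motion(motion_frame):
    """Intensity-weighted mean (x, y) position, or None if nothing moved."""
    m = np.asarray(motion_frame, dtype=float)
    total = m.sum()
    if total == 0:
        return None
    ys, xs = np.indices(m.shape)       # row and column index of every pixel
    cx = float((xs * m).sum() / total)
    cy = float((ys * m).sum() / total)
    return cx, cy

# Two active pixels on row 1, at columns 2 and 4, in a 4x6 frame.
frame = np.zeros((4, 6))
frame[1, 2] = 1.0
frame[1, 4] = 1.0

print(quantity_of_motion(frame))   # 2 active pixels out of 24
print(centroid_of_motion(frame))   # (3.0, 1.0)
```

Computed per frame, these values form time series that can be plotted or analysed statistically.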

The quantity of motion can be used for further statistical analysis. But for now, I am more interested in exploring how it is possible to better visualise the rhythmic properties of the video itself. It was already on the list to implement directograms in MGT, and this is even higher on the list now.

The motion average image (generated from the motion video) does not reveal much about the motion.

Motion average image of the kayaking video.

The motion average image is generated by calculating the average of all the frames in the motion video. What is puzzling are the colour artefacts. I wonder whether they come from some compression error in the video or a bug somewhere in MGT for Python. I cannot see the same artefacts in the average image:

Average image of the kayaking video.
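Averaging frames is simple in principle: take the per-pixel mean over time. The sketch below shows this with NumPy on synthetic colour frames, and also a generic pitfall where accumulating in 8-bit integers wraps around and distorts the result. Whether anything like that is behind the artefacts mentioned above is unknown; the example only illustrates the arithmetic, and the function name is an assumption rather than MGT's API.

```python
import numpy as np

def average_image(frames):
    """Per-pixel mean over time for frames of shape (time, height, width, channels)."""
    mean = frames.astype(np.float64).mean(axis=0)   # accumulate in floating point
    return np.clip(np.rint(mean), 0, 255).astype(np.uint8)

# Four 2x2 RGB frames: one with value 100, three with value 200.
frames = np.full((4, 2, 2, 3), 200, dtype=np.uint8)
frames[0] = 100

avg = average_image(frames)    # every pixel: (100 + 3 * 200) / 4 = 175

# Pitfall: summing in uint8 wraps around (700 % 256 = 188), so 188 // 4 = 47.
naive = frames.sum(axis=0, dtype=np.uint8) // 4

print(avg[0, 0], naive[0, 0])
```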

Analysing the sound of kayaking

The video recording also has sound, so I was curious to see if this could be used for anything. True, kayaking is a quiet activity, so I didn’t have very high hopes. Also, GoPros don’t have particularly good microphones, and they compress the sound a lot. Still, there could be something in the signal. To begin with, the waveform display of the sound does not tell that much:

A waveform of the sound of kayaking.

The spectrogram does not reveal that much either, although it is interesting to see the effects of the sound compression done by the GoPro (the horizontal lines from 5 kHz and upward).

A spectrogram of the sound of kayaking.

The tempogram is more interesting.

A tempogram of the sound of kayaking.

It is exciting to see that it estimates the tempo to be 122 BPM, and this resonates with theories about 120 BPM being the average tempo of moderate human activity.
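As a rough illustration of how a tempo figure can come out of a signal like this, the sketch below estimates tempo from the autocorrelation of an amplitude envelope. It runs on a synthetic 120 BPM pulse train, not on the kayaking recording, and it is a generic technique chosen for illustration; the envelope rate, the search range, and the function name are assumptions, and MGT's tempogram may well be computed differently.

```python
import numpy as np

def estimate_tempo(envelope, rate, bpm_min=40.0, bpm_max=240.0):
    """Estimate tempo in BPM from the strongest autocorrelation lag of an envelope.

    envelope: 1-D onset or energy envelope.
    rate: envelope samples per second.
    """
    x = np.asarray(envelope, dtype=float)
    x = x - x.mean()
    # Full autocorrelation, keeping the non-negative lags only.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    # Translate the BPM search range into a lag range in samples.
    lag_min = int(np.ceil(rate * 60.0 / bpm_max))
    lag_max = int(np.floor(rate * 60.0 / bpm_min))
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return 60.0 * rate / lag

rate = 100                       # envelope samples per second (assumed)
envelope = np.zeros(20 * rate)   # 20 seconds of silence...
envelope[::50] = 1.0             # ...with one pulse every 0.5 s, i.e. 120 BPM

tempo = estimate_tempo(envelope, rate)
print(tempo)  # 120.0
```

Estimators of this kind are prone to octave errors (reporting half or double the tempo), so the search range matters when interpreting the result.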

This little investigation into the sound and video of kayaking made me curious about what else can be found from such recordings. In particular, I will continue to explore approaches to analysing the rhythm of audiovisual recordings. It also made me look forward to a new kayaking season!

Image size

While generating the videograms of Bergensbanen, I discovered that Max/Jitter cannot export images from matrices that are larger than 32767 pixels wide/tall. This is still fairly large, but if I were to generate a videogram with one pixel stripe per frame in the video, I would need to create an image file that is 1 302 668 pixels wide.

This made me curious as to what type of limitations exist around images. A very quick run-through has told me this:

  • GraphicConverter: 32 000 pixels
  • Photoshop: 30 000 pixels
  • OSX Preview: 30 000 pixels

So it seems that approx. 30 000 pixels wide/tall is some kind of practical limit on how large digital pictures can be in these programs. I guess there is a memory/storage issue behind this, e.g. file sizes not being allowed to exceed 2 GB. For now, I have therefore decided to generate videograms that are at most 32767 pixels wide, but I may decide to split some into several separate videograms instead.
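A side note on these numbers: 32767 equals 2^15 - 1, the largest value of a signed 16-bit integer, which might explain the Jitter limit, although that is an assumption and not something confirmed here. The back-of-the-envelope sketch below compares raw image sizes with a 2 GiB boundary, assuming uncompressed 8-bit RGB data (three bytes per pixel); real file formats and programs will differ.

```python
INT16_MAX = 2**15 - 1    # 32767, largest signed 16-bit integer
TWO_GIB = 2**31          # 2 GiB in bytes

def raw_size_bytes(width, height, channels=3, bytes_per_channel=1):
    """Size of an uncompressed image in bytes (assumed 8-bit RGB by default)."""
    return width * height * channels * bytes_per_channel

videogram = raw_size_bytes(32767, 720)    # a 720-pixel-tall videogram at the limit
square = raw_size_bytes(30000, 30000)     # a square image at approx. 30 000 pixels

print(INT16_MAX)                  # 32767
print(videogram)                  # 70776720
print(square > TWO_GIB)           # True
```

Under these assumptions, a 720-pixel-tall videogram at the width limit is only around 70 MB, whereas a square image of 30 000 x 30 000 pixels would exceed 2 GiB.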

Videogram of Bergensbanen

While on paternity leave, I (finally) have time to do small projects that require little brain activity and lots of computation time. One of the things I have wanted to do for a long time is to create a videogram of Bergensbanen (which I briefly mentioned last year). This was a project undertaken by the Norwegian broadcast company (NRK), where they filmed (and broadcast live) the entire train trip from Bergen to Oslo. The best thing is that the entire 7.5-hour video file is available under a CC license, which opens up many creative applications.

At first, I wanted to create a videogram by reducing every single frame in the video file to a pixel stripe in the videogram. However, this is not possible to do in one operation in Max/Jitter, since the video file contains 1 302 668 frames. Jitter cannot export images larger than 32767 pixels wide, and even though I could have set it up to export images of subchunks of the original video, I am not sure whether there are any programs that support reading image files that are more than 1 million pixels wide.

So I have created a videogram by sampling every 50th frame from the video file, using a revised version of my VideoAnalysis program. The full videogram (26 056 x 720 pixels) can be found here or on Flickr.
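The arithmetic behind the sampling choice can be sketched in a few lines. The frame count and the export limit are the figures given in the text; the helper names are hypothetical, and the calculation assumes one pixel column per sampled frame.

```python
import math

TOTAL_FRAMES = 1_302_668   # frame count given in the text
MAX_WIDTH = 32_767         # Jitter export limit given in the text

def videogram_width(total_frames, stride):
    """Number of one-pixel stripes when keeping every `stride`-th frame."""
    return math.ceil(total_frames / stride)

def min_stride(total_frames, max_width):
    """Smallest sampling stride whose videogram fits within max_width."""
    return math.ceil(total_frames / max_width)

print(min_stride(TOTAL_FRAMES, MAX_WIDTH))   # 40
print(videogram_width(TOTAL_FRAMES, 40))     # 32567
print(videogram_width(TOTAL_FRAMES, 50))     # 26054
```

The last figure is close to, but not identical to, the 26 056 pixels reported above; the exact width depends on how the analysis program counts and samples frames, which this sketch does not model. The same calculation shows that a full-resolution videogram split into subchunks would need 40 separate images.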


I have also made a more browser-friendly version (4096×720 pixels):


Besides just looking at the videogram, which I find quite fascinating in itself, such a display can also reveal various things that happened over time in the recording, e.g.:

  • the NRK logo is rendered as a white line throughout the entire videogram
  • tunnels are dark/black
  • stops at stations can be seen when there are long non-moving parts in the image

These are summarized in this little image excerpt:

This videogram was just a test case, as I am now working on creating videograms of the 5-day recording of Hurtigruten.

Difference between videogram and motiongram

For some upcoming blog posts on videograms, I will start by explaining the difference between a motiongram and a videogram. Both are temporal (image) representations of video content (as explained here), and are produced in almost the same way. The difference is that videograms start with the regular video image, and motiongrams start with a motion image.

So for a video of my hand like this:

we will get this horizontal videogram:

Videogram of hi-speed hand motion

and this horizontal motiongram:

Motiongram of hi-speed hand motion

As you see, they both reflect the video content. The main difference is that the videogram preserves the original background colours, while the motiongram only reflects what changes between the frames (i.e. the motion).
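The difference can also be shown numerically. In the minimal NumPy sketch below, the same width-collapsing step is applied once to the original frames (videogram) and once to frame differences (motiongram) of a synthetic clip with a static grey background. The function name and the synthetic data are assumptions for illustration, not the code used to make the images above.

```python
import numpy as np

def stripes(frames):
    """Collapse each frame over its width; returns an array of shape (height, time)."""
    return frames.mean(axis=2).T

# Static grey background (value 50) with one bright row (250) moving downwards.
clip = np.full((4, 6, 5), 50, dtype=np.uint8)
for t in range(4):
    clip[t, t, :] = 250

videogram = stripes(clip.astype(float))

# Motion image: absolute difference between consecutive frames.
diffs = np.abs(np.diff(clip.astype(np.int16), axis=0))
motiongram = stripes(diffs.astype(float))

print(videogram[5])    # background row: the grey level is preserved
print(motiongram[5])   # same row: nothing changed, so all zeros
```

The bottom row is never touched by the moving line, so it keeps its background value in the videogram and is empty in the motiongram.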