Analyzing a double stroke drum roll

Yesterday, PhD fellow Mojtaba Karbassi presented his research on impedance control in robotic drumming at RITMO. I will surely get back to discussing more of his research later. Today, I wanted to share the analysis of one of the videos he showed. Mojtaba is working on developing a robot that can play a double stroke drum roll. To explain what this is, he showed this video he had found online, made by John Wooton:

The double stroke roll is a standard technique for drummers, but not everyone manages to perform it as evenly as in this example. I was eager to have a look at the actions in a little more detail. We are currently beta-testing the next release of the Musical Gestures Toolbox for Python, so I thought this video would be a nice test case.

Motion video

I started the analysis by extracting the part of the video where he is showing the complete drum roll. Next, I generated a motion video of this segment:

This is already fascinating to look at. Since the background is removed, only the motion is visible. Obviously, the frame rate of the video cannot capture the speed at which he plays. I was therefore curious about the level of detail I could achieve in the further analysis.
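A motion video of this kind boils down to frame differencing: subtract consecutive frames and keep only the pixels that changed. Here is a minimal numpy sketch of the idea, not the actual Musical Gestures Toolbox implementation; the `threshold` value is an arbitrary illustration:

```python
import numpy as np

def motion_frames(frames, threshold=0.05):
    """Frame-difference 'motion video': keep only pixels that changed
    by more than `threshold` between consecutive greyscale frames."""
    motion = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        diff = np.abs(curr.astype(float) - prev.astype(float))
        diff[diff < threshold] = 0.0  # suppress noise; static background vanishes
        motion.append(diff)
    return motion

# Tiny synthetic example: a bright dot moving across a dark frame
frames = [np.zeros((4, 4)) for _ in range(3)]
frames[1][1, 1] = 1.0
frames[2][2, 2] = 1.0
m = motion_frames(frames)
```

Since only differences survive, anything static in the shot disappears, which is exactly why the background is gone in the motion video above.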

Audio visualization

Before delving into the visualization of the video file, I made a spectrogram of the sound:

If you are used to looking at spectrograms, you can quite clearly see the change in frequency as the drummer is speeding up and then slowing down again. However, a tempogram of the audio is even clearer:

Here you can really see the change in both the frequency and the onset strength. The audio is sampled at a much higher frequency (44.1 kHz) than the video (25 fps). Is it possible to see some of the same effects in the motion?
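Conceptually, a tempogram is a windowed (auto)correlation of the onset-strength curve, showing which periodicities dominate at each moment in time. Here is a rough numpy sketch of that idea, with made-up window sizes and a synthetic onset envelope; real implementations (for instance in librosa) are considerably more refined:

```python
import numpy as np

def tempogram(onset_env, win_len=64, hop=8):
    """Local autocorrelation of an onset-strength envelope: each column
    shows which lags (periodicities) dominate around that point in time."""
    cols = []
    for start in range(0, len(onset_env) - win_len + 1, hop):
        w = onset_env[start:start + win_len] * np.hanning(win_len)
        ac = np.correlate(w, w, mode='full')[win_len - 1:]  # lags 0..win_len-1
        cols.append(ac / (ac[0] + 1e-9))  # normalise by zero-lag energy
    return np.array(cols).T  # rows = lag, columns = time

# Synthetic onset curve: a click every 8 frames
env = np.zeros(256)
env[::8] = 1.0
tg = tempogram(env)
```

In the resulting matrix, the strongest off-zero lag corresponds to the inter-onset interval, so a drummer speeding up would show as that ridge drifting towards shorter lags.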

Motiongrams

I then moved on to create a motiongram of the video:

There are two problems with this motiongram. First, the recording is composed of alternating shots from two different camera angles. These changes between shots can clearly be seen in the motiongram (marked with Camera 1 and 2). Second, this horizontal motiongram only reveals the vertical motion in the video image. Since we are averaging over each row in the image, the motiongram mixes the left- and right-hand motion. For such a recording, it is therefore more relevant to look at the vertical motiongram, which shows the horizontal motion:

In this motiongram, we can more clearly see the patterns of each hand. Still, we have the problem of the alternating shots. If we “zoom” in on the part called Camera 2b, it is possible to see the evenness of the motion in the most rapid part:

I also find it fascinating to “zoom” in on the part called Camera 2c, which shows the gradual slow-down of motion:

Finally, let us consider the slowest part of the drum roll (Camera 1d):

Here it is possible to see the beauty of the double strokes very clearly.
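The motiongram reduction behind these images is itself very simple: each motion frame is collapsed to a single column (or row) by averaging, and the results are stacked over time. A hypothetical numpy sketch, not the toolbox's own code:

```python
import numpy as np

def motiongrams(motion_frames):
    """Collapse each motion frame to 1-D by averaging, then stack over time.
    Horizontal motiongram: average each row (across columns) -> one column
    per frame; displays vertical motion. Vertical motiongram: average each
    column (across rows) -> one row per frame; displays horizontal motion."""
    horizontal = np.stack([f.mean(axis=1) for f in motion_frames], axis=1)
    vertical = np.stack([f.mean(axis=0) for f in motion_frames], axis=0)
    return horizontal, vertical  # shapes: (height, time) and (time, width)

# Two 3x4 motion frames: full motion, then none
frames = [np.ones((3, 4)), np.zeros((3, 4))]
h, v = motiongrams(frames)
```

This is also why the alternating camera angles are so visible: a shot change makes every pixel "move" for one frame, producing a solid stripe across the motiongram.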

Greyscale spectrograms from MIRToolbox

I am using the excellent MIRToolbox for Matlab for a lot of sound analysis applications these days. It meets a lot of my needs, but there are a few things that I miss. Perhaps the most important one is the ability to make clean greyscale spectrograms. The regular mirspectrum function returns a colour spectrogram with lots of garnish, like this:

Such a spectrogram may be useful in some contexts, but not always. Often I just want a plain, greyscale spectrogram, like this:

Here is my trick to do this, based on mirspectrum:

% a is a miraudio object, e.g. a = miraudio('mysound.wav')
as = mirspectrum(a, 'Frame', 'Max', 3000);
imagesc(mirgetdata(as));   % plot the raw spectrogram data
axis xy;
lgrays = zeros(100, 3);    % 100-step inverted greyscale colormap
for i = 1:100
   lgrays(i,:) = 1 - i/100;
end
colormap(lgrays);
Let me know if you have a better solution for doing this.
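For comparison, outside Matlab the same plain greyscale look can be had with matplotlib's built-in reversed grey colormap; here is a small sketch using a synthetic 440 Hz test tone:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render off-screen, no display needed
import matplotlib.pyplot as plt

sr = 8000
t = np.arange(0, 1.0, 1 / sr)
x = np.sin(2 * np.pi * 440 * t)  # 440 Hz test tone

# 'gray_r' maps low energy to white and high energy to black
spec, freqs, times, im = plt.specgram(x, NFFT=512, Fs=sr, cmap='gray_r')
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.savefig('greyscale_spectrogram.png')
```

The reversed colormap gives the same white-background, black-energy look as the inverted greyscale trick above, without building the colormap by hand.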

AudioVideoAnalysis

To allow everyone to watch their own synchronised spectrograms and motiongrams, I have made a small application called AudioVideoAnalysis.

It currently has the following features:

  • Draws a spectrogram from any connected microphone
  • Draws a motiongram/videogram from any connected camera
  • Press the escape key to toggle fullscreen mode

Built with Max/MSP by Cycling ’74 on OS X 10.5. I will probably make a Windows version at some point, but haven’t gotten that far yet.

A snapshot of the main interface:

The main window of the AudioVideoAnalysis application

Fullscreen mode can be toggled with the escape key:

Fullscreen mode in the AudioVideoAnalysis application

There are, obviously, lots of things that can and will be improved in future versions. Please let me know of any problems you experience with the application, and whether there is anything in particular you think should be included.