Audio recordings as motion capture

I spend a lot of time walking around the city with my daughter these days, and have been wondering how much I move and how the movement is distributed over time. To answer these questions, and to try out a method for easy and cheap motion capture, I decided to record today’s walk to the playground.

I could probably have recorded the accelerometer data in my phone, but I wanted to try an even more low-tech solution: an audio recorder.

While cleaning up some old electronics boxes the other day I found an old Creative ZEN Nano MP3 player. I had totally forgotten about the thing, and I cannot even remember ever using it. But when I found it I remembered that it actually has a built-in microphone and audio recording functionality. The recording quality is horrible, but that doesn’t really matter for what I want to use it for. The good thing is that it can record for hours on the 1GB built-in memory, using some odd compressed audio format (DVI ADPCM).

Since I am mainly interested in recording motion, I decided to put it in my sock to see whether that would be a good way of capturing the motion of my foot. I imagined that the sound of my footsteps would be loud enough to detect easily. This captures only a reduced representation of my overall motion, but I was curious to see whether it would be useful at all.

The result: a 35 MB audio file with 2.5 hours of foot sounds! In case you are interested, here is a 2-minute sample of regular walking. While it is possible to hear a little bit of environmental sound, the footsteps come through loud and clear.

Now, what can you do with a file like this? To make the file usable for analysis, I started by converting it to a standard AIFF file using Perian in QuickTime 7. After that I loaded it into Matlab using the wonderful MIRToolbox, resampling it to 100 Hz (from 8 kHz). It could probably be resampled at an even lower sampling rate for this type of data, but I will look more into that later.
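I did this in Matlab, but the downsampling step can be sketched in Python/NumPy for readers without MIRToolbox. Note that block averaging is my own crude stand-in for proper polyphase resampling, and the function name is made up:

```python
import numpy as np

def block_downsample(x, orig_sr=8000, target_sr=100):
    """Crudely downsample by averaging non-overlapping blocks.

    A rough stand-in for proper resampling (e.g. scipy.signal.resample_poly);
    the averaging doubles as a simple anti-aliasing low-pass filter.
    For motion analysis one would typically rectify the audio first,
    so the averaged blocks reflect energy rather than cancelling out.
    """
    factor = orig_sr // target_sr          # 80 input samples per output sample
    n = (len(x) // factor) * factor        # trim to a whole number of blocks
    return x[:n].reshape(-1, factor).mean(axis=1)

# ten seconds of 8 kHz audio become 1000 samples at 100 Hz
x = np.random.randn(80000)
y = block_downsample(x)
print(len(y))  # 1000
```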

The waveform of the 2.5-hour recording looks like this, and reveals some of the structure:

But calculating the smoothed envelope of the curve gives a clearer representation of the motion:
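The smoothing step can be sketched along these lines in Python/NumPy. MIRToolbox's envelope extraction does rectification and low-pass filtering under the hood; the moving-average smoothing below is my own simplification, not necessarily what MIRToolbox computes:

```python
import numpy as np

def smoothed_envelope(x, sr=100, win_s=1.0):
    """Full-wave rectify, then smooth with a moving average of win_s seconds."""
    win = max(1, int(sr * win_s))
    kernel = np.ones(win) / win
    return np.convolve(np.abs(x), kernel, mode="same")

# a steady 2 Hz oscillation sampled at 100 Hz: away from the edges the
# envelope settles near the mean of |sin|, which is 2/pi (about 0.64)
x = np.sin(2 * np.pi * 2 * np.arange(1000) / 100)
env = smoothed_envelope(x)
```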

Here we can clearly identify some of the structure of what I (or at least my right foot) was doing for those 2.5 hours. Not bad at all, and definitely relevant for macro-level motion capture.

Based on the finding of a 2 Hz motion peak in the data reported by MacDougall and Moore, I was curious to see whether I could find the same in my data. Taking the FFT of the signal gives this overall spectrum:

Clearly, my foot motion shows the strongest peaks at 4 and 5 Hz. I will have to dig into the material a bit more to understand these numbers.
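Finding such a spectral peak amounts to locating the strongest non-DC bin of the magnitude spectrum. A minimal Python/NumPy sketch, run here on a synthetic signal (the 4 Hz component is constructed for illustration, not taken from my data):

```python
import numpy as np

def dominant_frequency(x, sr):
    """Return the frequency (Hz) of the strongest bin in the magnitude spectrum."""
    x = x - np.mean(x)                     # remove DC so the peak reflects oscillation
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1 / sr)
    return freqs[np.argmax(spectrum)]

# synthetic test: a strong 4 Hz component plus a weaker 1 Hz one, at 100 Hz
t = np.arange(0, 10, 1 / 100)
sig = np.sin(2 * np.pi * 4 * t) + 0.3 * np.sin(2 * np.pi * 1 * t)
print(dominant_frequency(sig, 100))  # 4.0
```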

The conclusion so far, though, is that this approach may actually be a quite good, cheap and easy method for recording long-term movement data. And with an 8 kHz sampling rate, this method may also allow for studying micro-movements in more detail. More about that later.

AudioAnalysis v0.5

I am teaching a course in sound theory this semester, and therefore thought it was time to update a little program I developed several years ago, called SoundAnalysis. While there are many excellent sound analysis programs out there (Sonic Visualiser, Praat, etc.), they all work on pre-recorded sound material. That is certainly the best approach to sound analysis, but it is not ideal in a pedagogical setting where you want to explain things in realtime.

There are not many realtime audio analysis programs around, at least none that look and behave similarly on both OSX and Windows. One exception worth mentioning is the excellent collection of sound tools from Princeton, but they lack some of the analysis features I am interested in showing to the students.

So my update of the SoundAnalysis program should hopefully fill a blank spot in the area of realtime sound visualisation and analysis. The new version provides a larger spectrogram view, and the option to change various spectrogram features on the fly. The quantitative features have been moved to a separate window, which now also includes simple beat tracking.

Below is a screenshot giving an overview of the new version:

Overview of AudioAnalysis

Another new selling point is a brand new name: I have decided to rename the program AudioAnalysis, so that it harmonizes with my AudioVideoAnalysis and VideoAnalysis programs.

The program can be found over on the fourMs software page, and here is a short tutorial video:

Please let me know if you find bugs or other strange things in the program, and I will try to fix them as soon as possible (I expect there to be some Win 64-bit issues…).

Evaluating a semester of podcasting

Earlier this year I wrote a post about how I was going to try out podcasting during the course MUS2006 Musikk og bevegelse this spring semester. As I am preparing for new courses this fall, now is the time to evaluate my podcasting experience, and decide on whether I am going to continue doing this.

 

Why podcasting?

The first question I should ask myself is why I would be interested in setting up a podcast of my lectures. There are several reasons:

  1. I don’t give away my slides. This is not to be protectionist, but rather because I don’t think that giving away my keynote slides is particularly useful. I have adopted a PZ-style of making slides, which means that the slides are mainly accompanying my speech. There is not much on each slide, and watching them without listening to what I am saying would be like reading a newspaper without text.
  2. I often teach without using slides. In the sound programming courses (1 & 2) I have been teaching, I have spent almost all the teaching time in Max. Here I typically distribute the patches after class, but they probably wouldn’t be very useful to anyone that wasn’t present.
  3. I want to help the students by giving them a chance to see/hear what happened in the class, e.g. in case they were absent.
  4. I want to allow other people interested in the topic to follow the course. At UiO we have two ways of handling course material: closed, in our Fronter-based system, or open, on the course web site. Since the Norwegian State pays for my teaching, and it is free for the students to attend the courses, I also think that everyone else should be able to get access to the content.

So, all in all, I have found it worthwhile to test podcasting for a semester.

Evaluation

When I started the podcasting project, I had a few “unwritten” criteria for what was important for the different hardware/software solutions that I was going to explore. It had to be:

  • easy to set up before the lecture
  • easy to rig down after the lecture
  • easy to handle the files on my computer
  • easy to publish the files online

But what does “easy” mean in this context? I have come to see “easy” as a combination of cognitive load and time. Cognitive load here refers to the complexity of the setup: the amount of hardware and software needed, and how easy they are to set up and work with. More about this in later sections.

Time is another crucial factor. Even though I always try to prepare well in advance of my lectures, quite often I end up sitting until the last couple of minutes fixing slides, examples, etc. This means that I typically don’t have much time to pack gear before leaving for the auditorium. Also, at UiO we typically only have 15 minutes between lectures. If the previous lecturer ran a few minutes over, followed by a couple of minutes to pack down, I would have anywhere between 0 and 10 minutes to get ready for my lecture. That is not a lot of time for the essentials: getting my computer out of the bag, connecting it to a power supply, logging in, connecting the projector, setting up the presentation, connecting the remote control and checking that the sound is working. Only after all this can I start thinking about the extras: setting up to record the lecture.

During the semester I tried several different types of setups, and I will present and discuss the different solutions below.

 

Audio recording

Podcasting started out as recording audio, and this is probably also the easiest way of getting started. Selecting microphones is a big issue, balancing between recording my own voice and picking up questions and comments from students. The challenge is also to choose a setup that is as simple to use as possible. This means that any type of large sound card, mixer, large microphones, etc. could easily be ruled out, since they would require too much time to set up.

My first setup was based on using a small wireless lavalier microphone for my own voice, combined with a small omnidirectional “conference” microphone to pick up the students. The two microphone signals were “mixed” using a small headphone splitter (which acts as a mixer when running audio the other way). While the solution worked well and gave good audio, it had two major problems:

  • it takes only 2-3 minutes to set up, but even that may be too much when setting up for a lecture.
  • it requires fresh batteries on the wireless microphone, which may be one thing too much to think about.

For these reasons I have found it easier to rely on some kind of cabled (or built-in) microphone which can capture both myself and the students. The sound quality may be lower than with a close-up microphone, but it may still be sufficient for this use.

When it comes to the recording, there are (at least) two solutions:

  • record on the computer
  • record on a separate device

Recording on the computer is probably the easiest solution, since it involves no extra devices. I use QuickTime 7 for quick recordings on the computer (QuickTime X doesn’t allow for recording audio yet). Since most other audio applications rely on QuickTime anyway, you would only get added CPU usage by using other software for recording. The only annoying thing about QuickTime is that you get a .mov file, from which you then need to export an AIFF/WAV/MP3/AAC file. In terms of CPU usage, I have no problems recording audio while at the same time presenting my slides (even with video playback).

Recording on a separate device may be beneficial for several reasons. You don’t use any resources on the computer for recording audio, the chance of crashing is close to zero, and you get the benefits of much better microphones than the one built into the computer. I have tested using my Zoom H4, which is almost instant-on and records directly to WAV/MP3 on an SD card. My only concern about this approach is that you need to think about the batteries on the device (or use external power).

The negative side of both of these approaches is that people don’t get to see the visual content of the presentation. For this reason I decided to explore solutions for also recording visuals.

 

Video recording

It is possible to record video on the computer using QuickTime 7, but this draws quite a lot of resources and is not ideal if you also want to have a presentation running at the same time. Then it is better to use a video camera for recording. I have access to many different types of video cameras, including a couple of professional ones, but my favorite camera is the cheapest of them all, a small Sanyo Xacti HD-2000. The nice thing about this camera is that it records directly to MP4 files, while most new video cameras tend to record to AVCHD files, which are really cumbersome to work with. This means that very little has to be done with the files after recording.

However, the easiest solution for recording video is just to grab the screen content using the wonderful little application called ProfCast. This application will record audio together with which slides are being presented, and then create the actual video offline after the lecture. This way it saves a lot of CPU during the actual recording, which means that more power is available for running the presentation itself.

 

Triple boot on MacBook

I am back at work after a long vacation, and one of the first things I started doing this year was to reinstall several of my computers. There is nothing like a fresh start once in a while, with the added benefits of some extra hard disk space (not reinstalling all those programs I never use anyway) and performance benefits (incredible how fast a newly installed computer boots up!).

I have been testing Ubuntu on an Asus Eee for a while, and have been impressed by how easy it was to install and use. I have been a Unix/Linux user for years at the university, but have given up every time I have tried to install it on any of my personal computers. Ubuntu is the first distro that actually managed to install without any problems, and which also managed to detect most of the hardware by itself, at least enough to actually work on the system.

Before I started the process of installing Ubuntu on my aluminum MacBook, I had heard rumors about it being a non-straightforward process, but it turned out to be very simple. I used Boot Camp to install Windows XP (remember to format the drive using the Windows installer, otherwise it won’t boot up…). To my surprise the new Ubuntu 8.10 installer made it possible to install Ubuntu from within Windows, without needing to repartition anything. Quite a lot of things are autodetected, and there is a community page that suggests how to fix the rest. The built-in audio support is not impressive, but an external sound card will hopefully work fine.

I didn’t find any good recommendations for how much hard drive space I should allocate for XP and Ubuntu, or what type of partitions to use. Previously I had a 20 GB NTFS XP partition, and that seemed sufficient, although I couldn’t read and write to the drive from OSX (apparently there are some software solutions for this). To be more flexible in my tri-OS life, I decided to go for a 32 GB FAT32 partition, of which I set aside 15 GB for Ubuntu. After all the necessary software is installed, mainly Max/MSP on XP and various Linux audio applications on Ubuntu, there are 3-4 GB available on each system. This should be sufficient as long as I am mainly going to use the two OSes for occasional software testing.

AudioVideoAnalysis

To allow everyone to watch their own synchronised spectrograms and motiongrams, I have made a small application called AudioVideoAnalysis.

It currently has the following features:

  • Draws a spectrogram from any connected microphone
  • Draws a motiongram/videogram from any connected camera
  • Press the escape button to toggle fullscreen mode

Built with Max/MSP by Cycling ’74 on OS X 10.5. I will probably make a Windows version at some point, but haven’t gotten that far yet.

A snapshot of the main interface:

The main window of the AudioVideoAnalysis application

The fullscreen can be toggled with the escape button:

Fullscreen mode in the AudioVideoAnalysis application

There are, obviously, lots of things that can and will be improved in future versions. Please let me know of any problems you experience with the application, and if there is anything in particular you think should be included.