While I was testing visualization of some videos from the AIST database earlier today, I wanted to also create some “keyframe image displays”. This can be seen as a way of doing multi-exposure photography, and should be quite straightforward to do. Still, it took me quite some time to figure out exactly how to implement it. It may be that I was searching for the wrong things, but in case anyone else is looking for the same, here is a quick write-up.
The current procedure is done using a combination of two very handy command-line tools: FFmpeg and ImageMagick. I would like to add it to both the Matlab and Python versions of the Musical Gestures Toolbox as well, but will need to figure that out another time.
In this example I will use a hip-hop dance video from the AIST database:
The first step is to extract keyframes from the video file using this one-liner ffmpeg command:
This will use the keyframes from the MP4 file, which should be faster than doing a new analysis of the file. It would, of course, also be possible to sample the video at regular intervals, but the keyframes seem to work fine for my usage. I also choose to save the exported keyframes as TIFF files to avoid running multiple rounds of compression on the files. The end result is a bunch of keyframe images that can be used for further processing.
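For anyone who wants to script this step, here is a minimal Python sketch of one way to ask ffmpeg for keyframes only. It is not necessarily the exact command I used: the function names, the output folder and the file name pattern are made up for illustration. The -skip_frame nokey option tells the decoder to skip everything but keyframes, and -vsync 0 keeps ffmpeg from duplicating frames to fill the gaps (newer ffmpeg versions prefer -fps_mode passthrough for the same purpose).

```python
import subprocess
from pathlib import Path


def keyframe_command(video, out_dir="keyframes"):
    """Build an ffmpeg call that writes only the keyframes of `video` as TIFF files."""
    # Hypothetical naming pattern: keyframe-001.tiff, keyframe-002.tiff, ...
    pattern = str(Path(out_dir) / "keyframe-%03d.tiff")
    return [
        "ffmpeg",
        "-skip_frame", "nokey",  # input option, so it must come before -i
        "-i", str(video),
        "-vsync", "0",           # no frame duplication (newer builds: -fps_mode passthrough)
        pattern,
    ]


def extract_keyframes(video, out_dir="keyframes"):
    """Create the output folder and run the command; needs ffmpeg on the PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(keyframe_command(video, out_dir), check=True)
```

Calling extract_keyframes("dance.mp4"), where dance.mp4 is a placeholder file name, should leave a folder of TIFF files ready for the ImageMagick step.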
In my search for a solution, I tried a lot of complex things. But it turned out to be super-simple to get what I wanted:
convert *.tiff -background white -compose darken -flatten keyframes.jpg
Here we use the convert command of ImageMagick to add all the exported keyframes together into one combined image:
Since the dancer was moving in more or less the same place all the time, it is quite compact. Running the same functions on another video of a contemporary dancer, on the other hand, shows some of the potential of this visualization method. Here is the video:
Which results in this keyframe display image:
Besides being cool to look at, it is also quite informative when it comes to telling what is going on in the video. You get information about the temporal and spatial movement of the dancer, although it is difficult to understand exactly when she was moving where.
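To give an idea of why the darken trick works, here is a small Python sketch of the same idea on toy data. As far as I understand it, the darken compose method keeps the darker of the two values for each pixel, so flattening all the keyframes onto a white background amounts to taking the per-pixel minimum. The function name and the tiny grayscale “frames” below are made up for illustration, and this is not how ImageMagick is implemented internally.

```python
def darken_composite(frames, white=255):
    """Combine equally sized grayscale frames by keeping the darkest value per pixel."""
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    # Start from a white canvas, similar to "-background white" in the command above
    canvas = [[white] * width for _ in range(height)]
    for frame in frames:
        for y in range(height):
            for x in range(width):
                canvas[y][x] = min(canvas[y][x], frame[y][x])
    return canvas


# Two tiny 2x3 frames: a dark dancer (0) on a bright floor (200) in two positions
frame_a = [[0, 200, 200],
           [0, 200, 200]]
frame_b = [[200, 200, 0],
           [200, 200, 0]]

combined = darken_composite([frame_a, frame_b])
print(combined)  # [[0, 200, 0], [0, 200, 0]]
```

Both positions of the “dancer” survive in the combined image, which is also why the method works best with a dark figure on a light background.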
The next step is to include these methods in the Musical Gestures Toolbox as well.
Researchers from AIST have released an open database of dance videos, and I got very excited to try out some visualization methods on some of the files. This was also a good chance to test out some new functionality in the Musical Gestures Toolbox for Matlab that we are developing at RITMO. The AIST collection contains a number of videos. I selected one hip-hop dance video based on a very steady rhythmic pattern, and a contemporary dance video that is more fluid in both motion and music.
I have looked at a couple of different files. Let us start with this one:
We can start by looking at the motion video from this. While a motion video gives less information about context, I often find them interesting to study since they reveal the essentials of what is going on.
And from the motion video we can look at the motiongrams and average image:
The horizontal motiongram reveals the repetitiveness of the dance motion, but also some of the variation throughout the different parts. I also really like the “bump” in the vertical motiongram. This is caused by the couple of side-steps he is doing midway in the session. The “line” that can be seen throughout the horizontal motiongram is caused by the cable in the back of the video.
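For readers who have not seen these displays before, the basic idea can be sketched in a few lines of Python. This is a simplified illustration on toy data and not the toolbox code: the function names and the threshold value are my own assumptions here. A motion frame is the difference between two consecutive frames, a motiongram is made by collapsing each motion frame to a single line and stacking those lines over time, and the average image is the per-pixel mean of all frames.

```python
def motion_frame(previous, current, threshold=10):
    """Absolute difference between two grayscale frames; small changes are set to 0."""
    return [
        [abs(c - p) if abs(c - p) > threshold else 0 for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def collapse_rows(frame):
    """Average each column over all rows: one value per horizontal position."""
    height = len(frame)
    return [sum(column) / height for column in zip(*frame)]


def collapse_columns(frame):
    """Average each row over all columns: one value per vertical position."""
    return [sum(row) / len(row) for row in frame]


def motiongrams(frames, threshold=10):
    """Return two motiongrams, each with one collapsed line per pair of frames."""
    over_rows, over_columns = [], []
    for previous, current in zip(frames, frames[1:]):
        motion = motion_frame(previous, current, threshold)
        over_rows.append(collapse_rows(motion))
        over_columns.append(collapse_columns(motion))
    return over_rows, over_columns


def average_image(frames):
    """Per-pixel mean of all frames."""
    count = len(frames)
    return [[sum(values) / count for values in zip(*rows)] for rows in zip(*frames)]


# A bright dot moving from left to right along the top row of a 2x3 image
frames = [
    [[100, 0, 0], [0, 0, 0]],
    [[0, 100, 0], [0, 0, 0]],
    [[0, 0, 100], [0, 0, 0]],
]
over_rows, over_columns = motiongrams(frames)
print(over_rows)  # [[50.0, 50.0, 0.0], [0.0, 50.0, 50.0]]
```

The dot shows up as a diagonal trace in the lines collapsed over rows, while the lines collapsed over columns only show that all the motion happens in the top row.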
And then I looked at another video, with a very different character:
From this we get the following motion video (wait a few seconds, since there is no dance in the beginning…):
The average image and motiongrams from this video reveal the spatial distribution of the dancer’s motion on stage. Here it is also possible to see an artifact of the compression algorithm of the video file in the beginning of the motiongrams.
I really look forward to continuing the exploration of this wonderful new and open database. Thanks to the AIST researchers for sharing!
We just started a new run of our free online course Music Moves. Here we have a tradition of recording wrap-up videos every Friday, in which some of the course educators answer questions from the learners. We have recorded these in many different ways over the years, from using high-end cameras and microphones to just using a handheld phone. We have found that using multiple cameras and microphones is just too time-consuming, both in terms of setup and editing. Using only a mobile phone is extremely easy to set up, but we have had challenges with the audibility of the speech. Before recording this semester’s wrap-up videos I therefore decided to test out some solutions based on equipment I had lying around:
GoPro Hero 7 w/o audio connector
Sony RX100 V
Samsung Galaxy Note 8
Røde Smartlav+ lavalier microphone
DPA Core 4060 lavalier microphone
In the following I will show some of the results of the testing. I decided to skip the Sony camera in this write-up, because it doesn’t have the option of connecting a separate microphone.
GoPro Hero 7
The first example is of a GoPro Hero 7 with just the built-in microphone. This worked much better than expected. The audio is quite clear and it is easy to hear what I am saying. The colours of the video are vivid, but the image is compressed quite a bit. The video is very wide-angled, which is super-practical for such an interview setting, although it looks a bit skewed on the edges. But overall this was a positive surprise.
Connecting a Røde Smartlav+ to the GoPro results in a very clean sound. In fact, this could have been a very nice setup, had it not been for some challenges with placing the camera. That is because the audio dongle for the GoPro is bent downwards, which makes it impossible to use the housing needed to put it on a tripod (as can be seen in the picture to the right). This makes it super-clumsy to use this setup in a real-life situation. I hear rumours about a new audio add-on for new GoPro cameras, and that may be very interesting to check out.
Zoom Q8
My next device is the Zoom Q8. This is actually a sound recorder with a built-in camera, so one would expect that the audio is the main priority. This is also the case. The video is quite noisy, but the sound quality is much better than with the GoPro. Still I find that the microphone picks up quite a bit of the room. This is good for music recordings, but not so good when the focus is on speech quality.
Hooking up a DPA 4060 lavalier microphone to the Zoom Q8 definitely helps. This is a high-quality microphone, and it needs phantom power (which the Zoom Q8 can deliver). As expected, this gives great sound, very loud and clear. The downside is that it requires bringing an extra XLR cable together with the microphone and camera, since the cable of the DPA is too short for such an interview setup. I like the wide-angle of the video, but the quality of the video is not very good.
Samsung Galaxy Note 8
Mobile phones are becoming increasingly powerful, and I also had to try the camera of my Samsung Galaxy Note 8. I have a small Manfrotto mobile phone stand which makes it possible to place it on a tripod at a suitable distance. After recording I realized how much less wide-angle the phone image is than the GoPro and Zoom cameras, leaving my head cut off in the shots. This doesn’t matter for the testing here, however. The first video is using the built-in microphone of the mobile phone. I am very positively surprised by how crisp and clear my voice comes through here. In fact, it is quite similar to the GoPro. The video quality is also very good, and clearly the best of the three devices being compared here (the Sony camera has much better video, but it was discarded due to the lack of a microphone input).
And, finally, I connected the SmartLav+ lavalier microphone to the Samsung phone. Here the sound is, of course, very similar to the GoPro recordings.
It is not entirely straightforward to draw conclusions from this testing, but here are some of my thoughts after this very rapid and not very systematic testing:
Using on-body microphones (lavalier) greatly improves the audibility as compared to using built-in microphones.
The DPA 4060 is great, but the Smartlav+ is more than good enough for interviews.
The GoPro could have been a great device for such interviews, had it not been for the skewed image and the clumsiness of the audio adaptor.
The Zoom Q8 is the best audio device (as it should be!), but unfortunately its video quality is too poor.
All in all, I think that the easiest and best solution is the Samsung phone with Smartlav+.
How does an “old-school” document camera work for modern-day teaching? Remarkably well, I think. Here are some thoughts on my experience over the last few years.
I got started with a document camera because I felt the need for a more flexible setup for my classroom teaching. Conference presentations with limited time are better done with linear presentation tools, I think, since the slides help with the flow. But for classroom teaching, in which dialogue with students is at the forefront, such linear presentation tools do not give me the flexibility that I need.
Writing on a black/whiteboard could have been an option, but in many modern classrooms these have been replaced by projector screens. I also find that writing on a board is much more tricky than writing with pen on paper. So a document camera, which is essentially a modernized “overhead projector”, is a good solution.
After a little bit of research some years back, I ended up buying a Lumens Ladibug DC193. I went for this one because it had the features I needed, combined with being the only nice-looking document camera I could find (aesthetics are important!). A nice feature is that it has a built-in light, which helps in creating a better image also when the room lighting is not very bright.
One very useful feature of the document camera is the ability to connect my laptop to the HDMI input on the Ladibug, and then connect the Ladibug HDMI output to the screen. The built-in “video mixer” makes it possible to switch between the document camera and the computer screen. This is a feature I have been using much more than I expected, and it allows me to change between slides shown on the PC, some hand-writing on paper, and showing parts of web pages.
When I first got the document camera, I thought that I was going to use the built-in recording functionality a lot. It is possible to connect a USB drive directly to the camera, and make recordings. Unfortunately, the video quality is not very good, and the audio quality from the built-in mono microphone is horrible.
One of the best things about a document camera is that it can be used for other things than just showing text on paper. This is particularly useful when I teach with small devices (instruments and electronics) that are difficult to see at a distance. Placing them on the table below the camera makes them appear large and clear on the screen. One challenge, however, is that the document camera is optimized for text on white paper. So I find that it is best to place a white paper sheet under what I want to show.
Things became a little more complicated when I started to teach in the MCT programme. Here all teaching is happening in the Portal, which connects the two campuses in Oslo and Trondheim. Here we use Zoom for the basic video communication, with a number of different computers connected to make it all work together. I was very happy to find that the Ladibug showed up as a regular “web camera” when I connected it to my PC with a USB cable. This makes it possible to connect and send it as a video source to one of the Zoom screens in our setup.
The solution presented above works well in the Portal, where we already have a bunch of other cameras and computers that handle the rest of the communication. For streaming setups outside of the Portal I have previously shown how it is possible to connect the document camera to the Blackmagic web presenter, which allows for also connecting a regular video camera to the SDI input.
More recently I have also explored the use of a video mixer (Sony MCX-500), which allows for connecting more video cameras and microphones at once. Since the video mixer cannot be connected directly to a PC, it is necessary to also add the Blackmagic web presenter to the mix. This makes for a quite large and complex setup. I used it for one remote lecture, and even though it worked, it was not as streamlined as I had hoped. So I will need to find an easier solution in the future.
What is clear, however, is that a document camera is very useful for my teaching style. The Ladibug has served me well for some time, but I will soon start to look for a replacement. I particularly miss having full HD, better calibration of the image, as well as better recording functionality. I hope manufacturers are still developing this type of niche product, ideally also nice-looking ones!
Yesterday I gave a keynote lecture at the Munin Conference on Scholarly Publishing in Tromsø. This is an annual conference that gathers librarians, research administrators and publishers, but also some researchers and students. It was my first time at the conference, and I found it to be a very diverse, interesting and welcoming group of people.
Most of the other presenters talked about issues related to publishing academic texts, and with a particular focus on the transition to open access (OA). My presentation was focused on MusicLab, an open research pilot project we are running at the University of Oslo.
MusicLab is a collaboration between RITMO and the University Library, and it is a great example of how cool things can happen when progressive librarians work together with cutting-edge researchers. If you have never heard about it before, here is a 42-second introduction to what MusicLab is all about:
As can be seen from the slide above, Open Access (which should probably be called Open Publication instead, since many people mistake it to mean Open Research) is just one part of the whole picture. In the picture above, I am also thinking about these building blocks as being placed on a “timeline” going from left to right, although there may certainly be recursive parts of the model as well.
For a researcher, the publication part typically happens fairly late in the process, so I always try to remind people that the actual research happens before it is published. For example, the writing process is also something that should be thought of as an open process, I think. I mentioned some of my explorations into using various tools for writing Open Manuscripts:
None of these are perfect, however, and for some upcoming projects I am thinking about exploring Authorea and Jupyter Notebook as writing tools. After my talk I also got a recommendation for Bookdown, which I would like to look more at as well (although I have for a long time avoided getting into R, since I am currently investing some time in moving my code from Matlab to Python).
After the fairly long introduction, I finally got to the main point of the talk, which is that of MusicLab. Here are some of the slides from that part:
One of the points of MusicLab is to jump in and do something that everyone says is “impossible”… We have, of course, our set of challenges, particularly related to:
Copyright and licenses
I will write more about all of these later, but here are just some slides to summarize some points:
We have more challenges than solutions at the moment. But it is good to see that things are moving in the right direction. The dream scenario would be to combine the multimedia visualization tools from Repovizz with the interconnectivity of Trompa, the CC-spirit of Audio Commons, the versioning of GitHub, the accessibility and community of Wikipedia, and the long-term archiving of Zenodo. While that may sound entirely far-fetched right now, it could be a reality with some more interoperability.
I got lots of interesting feedback after my talk. It was particularly interesting to hear several people commenting on the importance of having more people from the arts and humanities involved in discussions about Open Research. I am happy to be one such voice, and hopefully MusicLab can inspire others to push the boundaries for what is currently possible.
If you want to watch the entire thing, it can be found towards the end of this recorded live stream: