Music thumbnailing

A couple of days ago, I read an interesting paper about a new AI algorithm that can summarize long texts. This is an attempt to solve the problem of tl;dr texts, meaning “too long, didn’t read”. The article reminded me that the same problem exists for music, in which case it would probably be tl;dl: “too long, didn’t listen”. I was interested in this topic back when I wrote my master’s thesis about short-term music recognition. One way to overcome the challenge of listening through full music tracks is by creating music “thumbnails”: compact representations of the most salient parts of the music in question. This is not a trivial task, of course, and lots of research has gone into it over the years. Strangely, though, I haven’t seen any of the many suggested algorithms implemented in any commercial service (so far). ...
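The excerpt above only names the idea, but one simple way to think about picking the “most salient part” is a medoid-style selection over per-frame audio features: choose the window whose frames are, on average, most similar to the rest of the track, on the assumption that a chorus-like section recurs and therefore scores high. This is purely a minimal illustration of one possible approach, not the algorithm from the thesis or from any particular paper; `pick_thumbnail` and the cosine-similarity scoring are my assumptions.

```python
import numpy as np

def pick_thumbnail(features, win):
    """Return the start frame of the `win`-frame window that is, on
    average, most similar to the whole track.

    features: (n_frames, n_dims) array of per-frame audio features
              (e.g. MFCCs or chroma vectors).
    """
    # Cosine similarity between every pair of frames.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-9)
    sim = unit @ unit.T

    # How similar each frame is to the track as a whole.
    frame_score = sim.mean(axis=1)

    # Sliding-window mean of the frame scores via a cumulative sum.
    c = np.concatenate(([0.0], np.cumsum(frame_score)))
    win_scores = (c[win:] - c[:-win]) / win
    return int(np.argmax(win_scores))
```

On a toy track where one feature pattern appears twice (a stand-in for a repeated chorus) and two other patterns appear once each, the selected window lands inside the repeated section. Real thumbnailing systems are of course far more elaborate (self-similarity matrices, novelty detection, structural segmentation), but the repetition-favours-salience intuition is the same.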

December 6, 2020 · 3 min · 598 words · ARJ

International Computer Music Conference

Some notes from ICMC, Barcelona, Spain, 4-10 September 2005.

Sunday 4 September

I attended a workshop on audio mosaicing, which was more like a set of presentations by different people, but still interesting. Jason Freeman, PhD from Columbia, now at Georgia Tech, talked about a Java applet creating a 5-second “thumbnail song” of your iTunes collection.

Opening concert: Chris Brown had composed a piece for the Reactable interactive table made at UPF. The table is very nice and responds quickly, but I felt there was a missing link in the relationships between the gestures made, the objects presented, and the sonic output. Jose Manuel Berenguer played sounds and visuals. I liked the beginning a lot, with a nice combination of granulated sounds and visual particle swarms. Ali Momeni’s installation “un titled” uses the new moother object, which makes it possible to access the Freesound database from within Max/MSP and PD. Ali used it to query for similar files and organize them in two-dimensional “sound spaces”. A large mechanical construction controls the parameters via Wacom tablets. Nice concept, and I like the idea of making things bigger and heavier to use, but I had some problems with the mappings and with the concept of having to press the large sticks down into the ground to get new sounds.

Monday 5 September

Fernando Lopez-Lezcano, CCRMA, Stanford, talked about Planet CCRMA and future issues. On a question about free software, he said something like “to me, free software is definitely not free”. Norbert Schnell from IRCAM presented FTM, a nice collection of Max objects for more advanced data handling in Max/MSP. Rosemary Mountain, Concordia / Hexagram, showed a setup for testing how people can organize visual and auditory stimuli. She used a wireless barcode reader. Ge Wang, Soundlab, Princeton, showed his ChucK programming language, a text-based music language with some nice graphical add-ons. I’m very sorry I missed his “text-battle” with Nick Collins at the Off-ICMC.

Tuesday 6 September

Vegard Sandvold, NOTAM, presented some promising results on the use of semantic descriptors of musical intensity. I tried the experiment when it was up and running, and had some problems with the concept of forcing stimuli into predetermined categories. It would be interesting to do a set of similar experiments using a continuous scale instead. His system is currently used by NRK in the radio-intentiometer. Douglas Geers, Minnesota, presented a nice piece in the evening concert, with a violinist wearing glowing thread which he processed with Jitter.

Wednesday 7 September

Thursday 8 September

Rui Pedro Paiva, University of Coimbra, Portugal, presented a method for melody extraction from a polyphonic signal. Based on auditory filtering, and with no attempt to make it fast, they obtained an average performance of about 82% on a varied set of music. Geoffroy Peeters, IRCAM, presented a method for rhythm detection which seems very promising. Nick Collins, Cambridge, presented an overview of different segmentation algorithms. Xavier Serra, UPF, presented a nice overview of current music technology research and called for a roadmap for future research.

Friday 9 September

Eduardo Reck Miranda, Future Music Lab, Plymouth, showed some of his work using EEG to control music. They still have a long way to go, since the signals are weak and noisy, but they had managed to get people to control simple playback of sequences. Carlos Guedes, NYU / Porto, presented his m-tools, a small package of Max objects developed for controlling musical rhythm from dance movements. Perry Cook, Princeton, showed some of his tools, and Jasch played a nice set at the Off-ICMC.

September 12, 2005 · 3 min · 595 words · ARJ