Ever since I finished my dissertation in 2007, I have thought about writing it up as a book. Parts of the dissertation were translated and extended in the Norwegian-language textbook Musikk og bevegelse (which, by the way, is out of print but freely available as an ebook). That book focused primarily on music-related body motion and was written for the course MUS2006 at the University of Oslo. However, my action-sound theory was only partially mentioned and never properly presented in a book format.
I started on a book manuscript around ten years ago, but it has taken a long time to get it finalized. Family life, a period as Head of Department, and building up RITMO have taken up much of my time over the last decade. Last summer, I managed to complete the first draft of a book manuscript.
I am thrilled to announce that The MIT Press has agreed to publish the book. As an Open Research advocate, I am equally thrilled that the book will be published Open Access. The plan is to submit the final manuscript in August, so over the last month, I have been polishing the text. What is the book’s content? Well, quite a lot, but here is a short summary:
What is an instrument? How is it used? How do new technologies change the way we perform and perceive music? This is a theoretical music technology book, informed by new research in embodied music cognition. The author argues that there are some fundamental differences between acoustic and electroacoustic instruments. Instruments have traditionally been sound-makers. New electroacoustic instruments are often music-makers. The book explores current and future approaches to music-making by analysing instruments. This is done through four distinctive themes—musicking, embodiment, interaction, and affection—that all tap into different academic disciplines: music sociology, music psychology, music technology, and music aesthetics. The aim is to combine some influential existing theories from each of these domains with the author’s thinking about the future of musical engagement.
And here is a sneak peek at the table of contents:
Although I think the main structure and content are in place, there will surely be some more changes. The challenge is that as I am reading through and checking citations and references, I come across new exciting things that I want to include. But at some point, I realize that I will have to say that enough is enough…
Many people rely on what I will call linear presentation tools when they lecture. This includes software such as LibreOffice Impress, Google Slides, MS PowerPoint, or Keynote. These tools are great for smooth, timed, linear lectures. I also use them from time to time, but mainly if I know exactly what to say. They are also good when I lecture with others and we need to develop a presentation together. However, linear presentation tools do not work equally well for general teaching, where spontaneity is required. For example, I often like to take questions during lectures, and answering questions may quickly lead to a different presentation order than what I had originally planned. For that reason, I have explored different non-linear presentation tools.
Document camera as presentation tool
Sometimes, though seldom, I only speak when I teach. I am a person who thinks very visually, so when I want to explain something, I usually prefer to show something as well. I used to be quite happy using a black- or whiteboard when teaching, but some years ago I invested in a document camera.
The benefit of teaching with a document camera is that I can show small instruments or electronic parts while teaching. It also works well, of course, for writing and drawing with pen and paper. In fact, I prefer this to writing on a whiteboard.
Such a setup allows me to write with a pen on paper, which leads to a very different delivery than using pre-made slides. It also allows for showing things in front of the camera. The downside of using a document camera is that you need to create all the content on the fly. I usually have a draft of what I want to say, which helps in structuring my thoughts. Sometimes I even pre-make some “slides” that can be shown in front of the camera. But there are also times when I want to pre-make more material. For that, I have found that mind mapping works well.
Mind maps as a presentation tool
I have often found that my drafts for document camera-based lectures were developed as mind maps. That is, multi-dimensional drawings spreading out from a core title or concept. For that reason, I wanted to test whether I could use mind mapping software for presentations.
Over the last couple of years, I have tested various solutions. In the end, I have found Mindomo to fit my needs very well. It is online-based, but they also have a multi-platform app that works well on Ubuntu. It is not the most feature-rich mind mapping software out there, but it strikes a nice balance between features and usability. I also like that it has a presentation mode that removes all the editing tools. As such, it works very well for mind map-based presentations.
I have primarily used mind map-based presentations for teaching and internal seminars, but some weeks ago I decided to test the approach for a research presentation. I was asked to present at the EnTimeMent workshop run by Qualisys, but as I was preparing the presentation, I didn’t know exactly who the audience would be or what format the workshop would take. That makes it difficult to plan a linear presentation. Since I had lots of video material to show, it wasn’t an ideal occasion for the document camera, either. So I decided to test out a mind map-based presentation.
Below is an embed of the presentation I made:
And here are screenshots showing the fully collapsed and fully open versions of the mind map.
I had planned a structure for how I would run the presentation, moving clockwise through the material. I stuck with that plan, more or less. What was nice was that I could adjust how many levels to dig into the material. After listening to some of the speakers before me, I decided to skip certain parts. This was easy: I simply left some of the sublevels of the presentation unopened.
Here is a recording of the presentation:
I had some issues with the network connection in the beginning (yes, presenting over wifi is not a good idea, but it is sometimes unavoidable), so apologies for the poor audio/video in some parts of the presentation.
I still have to get more familiar with moving around in such presentations, but all in all, I am happy with the flexibility of such a presentation tool. It allows for developing a fairly large pool of material to draw on when presenting. Rather than deleting/hiding slides in a linear presentation, a mind map-based presentation can easily be adjusted by simply not opening various parts.
This year’s Sound and Music Computing (SMC) Conference has opened for virtual lab tours. When we cannot travel to visit each other, this is a great way to showcase how things look and what we are working on.
Needless to say, we only scratched the surface of everything going on in the field of sound and music computing at the University of Oslo in this video. The video focused primarily on our infrastructures. We have several ongoing projects that use these studios and labs, as well as some non-lab-based projects. These include:
In previous Webinars, such as the RITMO Seminars by Rebecca Fiebrink and Sean Gallagher, I ran everything from my office. These were completely online events, based on each person sitting with their own laptop. This is where the Zoom Webinar solution shines.
Things become more complex once you try to run a hybrid event, where some people are online and others are on-site. Then you need to combine methods for video production and streaming with those of a streamlined video conferencing solution. It is doable but quite tricky.
The production of the RPPW Webinar involved four people: myself as “director”, RITMO admin Marit Furunes as “Zoom captain”, and two MCT students, Thomas Anda and Wenbo Yi, as “Zoom station managers”. We could probably have done it with three people, but given that other things were going on simultaneously, I am happy we decided to involve four. In the following, I will describe each person’s role.
Zoom station 1
The first Zoom station, RITMO 1, was operated by Wenbo Yi. He sat behind the lecture desk in the hall, equipped with a desktop PC with two screens. The right screen was mirrored to a projector on the wall; this was the screen used to show information during breaks, and so on.
There are three cameras connected to the PC: one front-facing that we used for the moderator of the keynote presentations, one next to the screen on the wall that showed a close-up of conference chair Anne Danielsen during the introductions, and one in the back of the hall that showed the whole space.
There are four microphones connected to the PC and the PA system in the hall. The one on the desk was used for the keynote moderation. Of the wireless microphones, we only used one, a handheld that Anne used during her introductions.
The nice thing about Zoom is that it is easy for a person to turn cameras and microphones on and off. However, this is designed around the idea that you are sitting in front of your own computer. When you are standing in the middle of a room, someone else needs to do the clicking. That was Wenbo’s job: he switched between cameras and microphones and turned slides on and off.
Zoom station 2
The second Zoom station, RITMO 2, was operated by Thomas Anda, sitting in the control room behind the hall. He was controlling a station that was originally designed as a regular video streaming setup. This includes two remote-controlled PTZ cameras connected to a video mixer. For regular streams, we would have tapped the audio and video from the auditorium and made a mix to be streamed to, for example, YouTube. Now, we mainly used one of the PTZ cameras to display the general picture from the hall.
The main job of Thomas was to play back all the 100 pre-recorded videos. We had a separate, ethernet-cabled PC for this task, connected to the Zoom Webinar and sharing its screen. The videos were cued in VLC, with poster images inserted between them. During testing of this setup, we discovered that the sound levels of the video files were uneven, which led to a normalization procedure for all of them.
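As an aside, cueing up such a sequence does not have to be done by dragging files into VLC by hand. Here is a minimal Python sketch that generates an extended M3U playlist alternating a poster image with each talk video. The file names are hypothetical, and note that how long VLC displays a still image is governed by its image-duration setting, not by the playlist itself:

```python
from pathlib import Path

def build_playlist(video_files: list[str], poster_image: str) -> str:
    """Build an extended M3U playlist that shows a poster slide
    before each pre-recorded talk video. VLC plays entries in order."""
    lines = ["#EXTM3U"]
    for video in video_files:
        title = Path(video).stem
        # Poster slide shown between talks (duration set in VLC itself).
        lines.append(f"#EXTINF:-1,Up next: {title}")
        lines.append(poster_image)
        # The talk video itself.
        lines.append(f"#EXTINF:-1,{title}")
        lines.append(video)
    return "\n".join(lines) + "\n"

playlist = build_playlist(["talks/anna.mp4", "talks/ben.mp4"], "poster.png")
Path("rppw_session.m3u").write_text(playlist)
```

Saving the result as an `.m3u` file and opening it in VLC plays the whole session in order, poster slides included.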
In theory, we could have played the video files from RITMO 1. However, both Wenbo and Thomas had plenty of things to think about, so it would have been hard to do it all alone. Also, having two stations gave us two camera views and added redundancy for the stream.
Zoom captain station
The third station was controlled by our “Zoom captain”, Marit Furunes. She served as the main host of the Webinar most of the time and was responsible for promoting people to panellists, and demoting them again, during the conference.
It is possible to set up panels in advance, but that requires separate Zoom Webinars and individualized invitation e-mails. We have experienced in the past that people often forget about these e-mails, so we decided to have just one Zoom Webinar for the entire conference and instead move people in and out of the panels manually. That required some manual work by Marit, but it also meant that she was in full control of who could talk in each session.
She was also in charge of turning people’s video and sound on and off, and ensuring that the final stream looked fine.
I was sitting next to Marit, controlling the “director station”. I was mainly checking that things were running as they should, but I also served as a backup for Marit when she took breaks. In between, I also tweeted some highlights, replied to e-mails that came in, and commented on things in Slack.
Monitoring and control
Together, the four of us involved in the production managed to create a nice-looking result. There were some glitches, but in general, things went as planned. The most challenging part of working with a Webinar-based setup is the lack of control and monitoring. What we learned is that “what you see is not what you get”; we never quite knew what to click to get the final result we wanted. For example, we often had to switch back and forth between the “gallery” and “speaker” views to get the desired result.
Also, as a host, you can turn off other people’s cameras, but you cannot turn them on; you can only ask the person to turn them on. That makes sense in many ways. After all, you should not be allowed to turn on another person’s camera remotely. However, as a production tool in an auditorium, this was cumbersome. At times, Marit and I wanted to turn on the main video camera in the hall (from RITMO 2) or the front-facing camera (from RITMO 1). But we were not allowed to do this and instead had to request that Thomas or Wenbo turn on the cameras.
The Zoom Webinar function was clearly made for a traditional video-conferencing setup, and for that it works very well. As described above, we managed to make it work quite well in a hybrid setup too. However, this required a four-person team and five computers. The challenge was that we never really felt completely in control of things, and we could not properly monitor the different signals.
The alternative would be a regular video streaming solution based on video and audio mixers. That would have given us much greater control of the final stream, including good monitoring. It would have required more equipment (which we have) but not necessarily more people. We would have lost some of the Zoom functionality, though, like the Q&A feature, which works very well.
Next time I do something like this, I will try a stream-based setup instead. Panellists could then come in through a Zoom Room, which could be mixed into a stream using either our hardware video mixer or a software mixer like OBS. Time will tell whether that ends up being better or worse.
There are many ways to run conferences. Here is a summary of how we ran the Rhythm Production and Perception Workshop 2021 at RITMO this week. RPPW is called a workshop, but it is really a full-blown conference: almost 200 participants enjoyed 100 talks and posters, 2 keynote speeches, and 3 music performances spread across 4 days.
A hybrid format
We started planning RPPW as an on-site event back in 2019. Then, when the pandemic hit, we quickly turned around and decided to make it into an online-only conference. But as the covid restrictions have been lifted in Norway recently, we decided to run it as a “hybrid” event. That is, everything was run both on-site at RITMO and online. Only RITMO people were physically present, though, so most people experienced it as an online-only event. Still, given that future events will probably be hybrid, we found it to be a good way of experimenting with this new conference format.
In my experience, running hybrid events is more challenging than running online-only or on-site-only ones. The two formats are radically different. If this is not acknowledged and planned for, one risks an inferior experience for everyone. However, if done right, hybrid may actually work quite well. Looking at the feedback in the participant survey, I am happy to see that most people found it to work well. In fact, the majority actually favours a hybrid conference in the future. This would require attention to several details, which I will discuss below.
Time zone challenges
There are many technological challenges with running international conferences, but the challenge of different time zones is probably the biggest hurdle. I participated in the NIME 2021 conference last week, which was run from Shanghai. NIME is a truly international conference, with participants spread around the globe. To cater for global participation, the program was split into program blocks that were repeated each day. Therefore most paper/poster sessions ran twice a day. Keynotes happened live and were repeated as a “live replay” later in the day.
I think NIME’s block-based schedule worked quite well. Most importantly, it allowed for combinations of participants from Asia and Europe, and from Europe and the Americas, throughout the days. The downside to this approach is that you don’t get the sense of “being together” at any point in time. For such global conferences, this is an unsolvable problem, I think: there will always be someone for whom the schedule falls in the middle of the night.
For RPPW2021 we didn’t have many submissions from Asia/Australasia, unfortunately. That is a pity from a global perspective. However, on the positive side, it meant that the large majority of participants were based in Europe and the Americas. We therefore decided to run the conference in the evenings, Oslo time, from 16:00 to 22:00. This is a little late for Europeans, and particularly tricky for those with children and other family obligations (myself included). But it allowed for a live, single-track conference format that created a sense of “togetherness”.
Speaking of time zone challenges, communicating when things happen can be tricky. This is particularly the case now that we have daylight saving time here in Norway (CEST, UTC+2). One way to help people get the time right was to implement localized times on the conference page, so that people could easily figure out when sessions started in their local time zone. NIME had the same thing, but there you had to choose your time zone manually in the schedule. We implemented automatic schedule adjustment based on the computer clock. This worked for most people, except those who (for some reason) had a different time set on their computer.
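The conference page did this conversion in the browser, but the underlying idea can be sketched in Python with the standard zoneinfo module (the session time and target zone below are just examples):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def localize(session_time: str, viewer_tz: str) -> str:
    """Convert a conference time given in Oslo time (CEST in summer)
    to the viewer's local time zone, formatted as HH:MM."""
    oslo = ZoneInfo("Europe/Oslo")
    t = datetime.fromisoformat(session_time).replace(tzinfo=oslo)
    return t.astimezone(ZoneInfo(viewer_tz)).strftime("%H:%M")

# A 16:00 Oslo start in June (CEST, UTC+2) is 10:00 in New York (EDT, UTC-4).
print(localize("2021-06-22T16:00:00", "America/New_York"))  # 10:00
```

On the actual page, the viewer's time zone would come from the browser clock rather than being passed in explicitly.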
Many conferences have landed on pre-recorded videos and live Q&A sessions as a default presentation format. We also decided to use this approach for RPPW. We asked people to prepare 10-minute (short) videos. This is shorter than typical conference presentations, which often last 20-30 minutes. In my experience, pre-recorded videos are generally shorter, more precise, and easier to understand than live talks. So you actually save time this way, which can be used for other things. You also avoid all sorts of technical challenges with screen-sharing and so on.
Almost everyone managed to keep their pre-recorded videos within 10 minutes, and I don’t think we lost any information that would have been presented in a typical 20-minute lecture. We received some videos that were slightly longer, but these were trimmed and/or sped up a little to end up around 10 minutes long.
We had originally planned to use YouTube as our central video archive. However, when the student assistants started uploading videos to our YouTube account, we realized that there is an upload limit of 10 videos per day. We had around 100 videos, and with only a few days left to the conference, we had to find another solution.
Fortunately, UiO has a decent web player as part of the standard content management system. So we decided to rely on UiO’s video server and uploaded all the videos to a folder on the conference web page. We continued to upload 10 videos per day to YouTube, but we never really publicized that playlist. Still, it is useful as a backup. In hindsight, I would have planned to use the UiO solution from the start.
Storing the videos online is only part of the job, though. Navigating and searching through them is critical for their usefulness. Therefore, we decided to set up a separate web page for participants on rppw2021.org. This page was designed by a group of students in our Music, Communication and Technology (MCT) master’s programme who had RPPW as their applied project this spring semester. They made a very nice navigation system from which we linked up all the videos hosted on the UiO server.
In the end, all the videos are navigable and searchable on the UiO web page, and they are also on YouTube. People are different, so it is good to have multiple paths to the same material.
I didn’t think much about captions before NIME 2020, in which accessibility was the main topic. However, over the last year, I have realized that captions are important not only for people with hearing impairments. Some struggle with understanding the language; others may want to keep the sound low or off for social or practical reasons. In all these cases, captions may help.
One reason we wanted to upload videos to YouTube was to leverage their great auto-caption feature. Unfortunately, when we switched to the UiO-based video solution, captioning had to be done differently. To the rescue came a new auto-text service that is currently in beta-testing at UiO. The service is actually based on a Google-driven caption engine, probably quite similar to the one used on YouTube. Unfortunately, we had to run this separately for each video, a time-consuming task for one of our student assistants. On the positive side, the auto-texting worked quite well. Much has happened since I tested a similar service a year ago.
The result of the auto-captioning is a .vtt file for each video file. These were uploaded next to the video files and made available to be turned on and off in the video player. An example can be seen here:
As can be seen, auto-texting is not perfect, but it goes a long way. I see that auto-texting is becoming more popular these days. There seems to be an increased awareness of the importance of captions. Hopefully, more and better tools will be made available on all platforms soon. In fact, Zoom has opened for live captioning in both Zoom Rooms and Webinars. However, for GDPR reasons, this is not yet available at UiO.
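One nice thing about the .vtt files is that WebVTT is a simple plain-text format, so captions are easy to generate and post-edit programmatically. A minimal Python sketch for serializing caption cues (the cue timings and text below are made up for illustration):

```python
def to_vtt(cues: list[tuple[float, float, str]]) -> str:
    """Serialize (start, end, text) cues, with times in seconds,
    to a WebVTT string."""
    def ts(seconds: float) -> str:
        # WebVTT timestamps look like 00:00:03.500
        h, rem = divmod(seconds, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{ts(start)} --> {ts(end)}")
        lines.append(text)
        lines.append("")  # blank line terminates each cue
    return "\n".join(lines)

print(to_vtt([(0.0, 2.5, "Welcome to RPPW."), (2.5, 5.0, "Let us begin.")]))
```

Saving the output next to a video file with a matching name is typically all a web player needs to offer selectable captions.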
As written about in another blog post, we found it necessary to normalize all the pre-recorded videos, that is, to adjust their sound to a similar level. The videos were recorded with all sorts of equipment, so the sound quality and levels varied widely. Unfortunately, we thought about this a little too late; by then, we had already uploaded all the videos to the program page and YouTube. Still, we decided to normalize all the videos to ensure an even sound level when multiple videos were played back consecutively during the conference. Next time, I will normalize all video files from the start.
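The principle behind the simplest variant, peak normalization, can be sketched in a few lines of Python. This is an illustration only: a real batch pipeline would more likely use loudness-based (LUFS) normalization, such as ffmpeg's loudnorm filter, which matches perceived level rather than just the peak:

```python
def peak_normalize(samples: list[int], bits: int = 16,
                   headroom_db: float = 1.0) -> list[int]:
    """Scale integer PCM samples so the peak sits headroom_db
    below digital full scale."""
    full_scale = 2 ** (bits - 1) - 1
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return samples[:]  # silence: nothing to scale
    # Convert the headroom in dB to a linear target amplitude.
    gain = (full_scale * 10 ** (-headroom_db / 20)) / peak
    return [round(s * gain) for s in samples]

quiet_clip = [0, 1000, -2000]
print(peak_normalize(quiet_clip))  # peak now just below full scale
```

Running this on every clip brings their peaks to the same level, which was the practical goal: avoiding jumps in volume between consecutively played videos.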
The streaming of RPPW2021 happened from Forsamlingssalen, our seminar room at RITMO. The hall can usually hold 70-80 people; this time, we were around 10, including those of us involved in the production. I have written a separate blog post about how the Webinar was produced.
For me, one of the important things when it comes to physical-virtual communication is to get a sense of where people are located. Therefore, I argued strongly for using a camera that showed the hall and the people inside. We also decided that Anne Danielsen, who served as conference chair, should have short “intros” and “outros” for each session from the hall. This could be seen as unnecessary from a program perspective, and it definitely complicated the streaming. However, I think this trick was one of the reasons why people reported that they felt they were “at” RITMO.
We used the same Zoom Webinar for the whole conference, and it was running continuously each day. When there were other activities, such as breaks, posters, and performances, we put up a poster in the Webinar with information about where to go. We also showed the overview image from the hall the whole time. Several people commented that this was a nice gesture.
There are many ways of doing poster sessions. Some conferences have been adventurous and used virtual reality platforms like GatherTown and Mozilla Hubs. We went the “safe” route and decided to use a Zoom Room for the poster sessions. At first, we had planned to have a separate Zoom Room per poster. Fortunately, during NIME 2021, I discovered that it is possible to pre-assign breakout rooms that people can move in and out of, so we decided to use the same solution for RPPW.
This worked well, I think, and people could freely move between rooms. We did three things that further improved the poster room experience:
Each poster session started with a “poster blitz” showing 1-minute pre-recorded videos of all (~15) posters per session. This was a quick way for everyone to get an overview of all the content, and (hopefully) made it easier to choose where to go. Then the breakout rooms were opened for people to choose from.
We had a poster session host sitting in the main Zoom Room all the time, helping people to navigate between rooms. Most people figured out how to switch themselves, but she could also manually move people around based on their requests. She also showed a slide with names and titles, which further simplified the selection of poster rooms.
We had one RITMO student or staff member assigned to each poster breakout room. This person entered the breakout room together with the presenter at the start of the poster session. This resource person could help with technical problems but, more importantly, would also help start the discussion. We had assigned people based on their interests, so all poster presenters had a knowledgeable person in the room from the start. If nobody comes to your poster at a physical conference, you can at least talk to the people standing next to you; in Zoom, you are very alone if nobody shows up. We did not want that to happen. In the end, the poster sessions turned into very lively events.
We had scheduled three performances during the workshop, all of which were challenging in their own ways. To keep things manageable from a production perspective, these were run by separate teams in different venues and on separate online channels. Just to explain some of the complexity, here is a flowchart of the Fibres Out of Line performance, which featured a dancer improvising with 10 autonomous agents.
Equally challenging was the N-place performance, featuring three musicians in three cities (Oslo, Stockholm, Berlin). This performance relied on a low-latency audio connection and also streamed the coordinates of the musicians based on motion tracking.
All the performances worked well. I think this was in large part thanks to the separate production teams. Having separate spaces to set things up and test properly also helped a lot.
Like many other conferences, we used Slack as a text-based communication channel during the conference. We explicitly asked people to use the Q&A function of the Zoom Webinar during the live sessions and Slack for all other communication. This generally worked well, I think, although Slack is not to everyone’s taste.
We spent some time discussing how to organize Slack. In the end, we had these channels:
General channels: #general, #social-introduce-yourself, #tech-support, #feedback
Extra channels: #music-performance, #job-openings, #random
Of these, the #social-introduce-yourself channel was by far the most popular. Lots of people presented themselves there, which created a very nice and warm vibe.
There were also some discussions going on in the thematic channels, but perhaps less than what I have seen at other conferences. One reason for this may be that most people followed the content live, and there was, therefore, less need for Slack discussions. It will be interesting to see whether there will be any activity also after the conference.
Finally, to continue creating the sense of “being there”, I was eager to set up a physical-virtual coffee room. It was physically set up in the RITMO kitchen and was on from one hour before the program started until one hour after it ended. Since only panellists were allowed to show their faces and talk in the Zoom Webinar, this coffee room was a way for people to interact freely.
The Coffee Room was not used much. Almost no one came in before or during the program, but on a couple of the days there were some lively discussions after the program had ended. We then also created some separate breakout rooms that people could move into to talk in smaller groups.
Even though the Coffee Room was not used a lot, I think it served its purpose. I also feel that it was conceptually very important both for online and on-site participants. The online participants got a chance to look into the “heart” of RITMO’s premises. And RITMO people passing by the screen got reminded that a conference with online participants was going on.
All in all, I think RPPW2021 worked very well. The feedback we have received so far has also been overwhelmingly positive. It is interesting to see that what we ended up doing was not too far from what the MCT students suggested in their applied project.
However, there are always things that could have been improved:
We knew that the scheduling would be hard for some. It didn’t feel good to leave out Asian/Australasian participants. Many Europeans also struggled with the evening schedule. Still, I think many people valued the possibility of “being together”. In retrospect, we should probably have included a 1-hour break in the program for eating lunch/dinner.
Zoom Webinar and Zoom Rooms are robust video conferencing solutions. However, they are tricky from a production perspective. Next time we are doing something like this, I wonder if we should replace the Webinar part with a regular streaming service. We streamed two of the concerts on YouTube and that allows for a very different, and more controlled, video production.
People were generally good at muting themselves when not talking, so there were no feedback problems. However, the sound quality varied a lot, which was apparent when listening to the PA system in the hall. People using headsets, with the microphone close to the mouth, were much easier to listen to than those using laptop microphones in reverberant rooms. So we should have put more effort into asking people to use a headset and generally improve their setup.
We included auto-generated captions for the online videos. However, since we had no time to check the captions, and also didn’t want to clutter the screen too much, we decided to play the videos without captions during the live playback. It would have been nice to include professional, selectable captioning. However, we did not have the resources for that this time around. In the future, I hope that it would be possible to include auto-generated live captioning for all sessions.