Running a disputation on YouTube

Last week, Ulf Holbrook defended his dissertation at RITMO. I was in charge of streaming the disputation, and here are some reflections on the technical setup and streaming.

Zoom Webinars vs YouTube Streaming

I have previously written about running a hybrid disputation using a Zoom webinar. We have also used variations of that setup for other events; for example, last year we ran RPPW as a hybrid conference. There are some benefits to using Zoom, particularly when there are many presenters. Regular Zoom rooms work best for small groups where everyone should be able to participate. For larger groups, and particularly (semi-)public events, Zoom Webinars are the only viable solution. I have only experienced Zoom bombing once (when someone else organised a public event with more than 100 people present), and it was an unpleasant experience. That is why we have run all our public events as Zoom Webinars, which give us more fine-grained control over who is allowed to talk and share their video and screen.

I find that a streaming solution (such as YouTube) is best for public events where there is little need for interaction. A public PhD defence is one such event: it is a one-way delivery to a mostly passive audience, so streaming is perfectly fine. For Ulf’s defence, we therefore opted to use YouTube as our streaming service. We could have used UiO’s streaming service, but then we would have missed out on the social media aspects of YouTube as a channel, particularly the chat functionality that people could use to ask questions. In the end, nobody asked any questions, but it still felt good to be able to communicate with the audience.


As described previously, we have a relatively complex audiovisual installation in the hall. There are two pairs of PTZ cameras: one pair connected to the lecture desk’s PC in the front and another pair connected to a video mixer in the control room. It is possible to run two 2-camera “productions” at once, one from the back and one from the front. In the Zoom Webinars, we have used the two cameras connected to the front PC. For the streaming setup, however, we used the cameras connected to the streaming station at the back of the hall.

The view from the control room in the back of Forsamlingssalen at RITMO. Two PTZ cameras can be seen on the shelf in the lower left corner of the picture.

During the defence, I sat in the control room, switching between cameras and checking that everything went as it should. As can be seen in the image below, the setup consists of a hardware video controller and video mixer, and the mixed signal is then passed on to a streaming PC.

A video from the control room, with a screen of the video mixer’s view to the left, the streaming PC on the right, and my laptop for “monitoring” in the middle.

The nice thing about working with a hardware video mixer is that you can turn it on, and it just works. Computers are powerful and versatile, but hardware solutions are more reliable. As seen in the image below, the mixing consisted of choosing between the two PTZ cameras, one giving an overview shot and the other a close-up. In addition, I mixed in the slides shown on the screen. The video mixer has a picture-in-picture mode, but one challenge is that you don’t know what is coming next on the slides, so you may end up covering part of them, as shown in the image below.

The video mixer was used to switch between two camera views and the image from the laptop.

I still struggle to understand the logic of setting up a live stream on YouTube. The video grabber on the streaming PC (a Blackmagic Web Presenter HD) shows up as a “webcam” and can be used to stream directly to YouTube. I had made a test run some days before the defence, and everything worked well. However, when we were about to start the scheduled stream, I realised that the source had been changed to “broadcast software” instead of “webcam”. I have no idea why that happened, but fortunately, I had OBS installed on the PC and could start the stream from there. The nice thing about OBS is that you also get access to a software-based audio mixer, which came in handy for tuning the sound slightly during the defence.

The streaming PC was running OBS, getting the signal from the Web Presenter HD.

Being able to monitor the output is key to any production. Unfortunately, for the first part of Ulf’s presentation, I was monitoring on the video mixer, where everything sounded fine. It was only after a while that I connected to the YouTube stream on my laptop and realised that there was a slight “echo” on the sound.

There was an echo on the sound in the introduction’s opening, caused by two audio streams with a short offset.

It turned out that this was because we were streaming the sound twice through OBS, both through the incoming audio on the PC and through the audio channel embedded in the HDMI signal. I had briefly checked that things were fine when starting up the YouTube channel, but the two audio streams were gradually drifting out of sync, leading first to an echo-like sound and later to an audible delay. After discovering the problem, I quickly managed to turn off one of the audio streams.
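To illustrate the effect (this is a generic sketch, not our actual OBS signal chain), summing a signal with a slightly offset copy of itself is enough to produce the echo-like artefact described above:

```python
import numpy as np

SR = 48_000  # sample rate in Hz (a typical streaming setting)

def mix_with_offset(signal: np.ndarray, offset_s: float) -> np.ndarray:
    """Sum a signal with a copy of itself delayed by offset_s seconds,
    mimicking two identical audio streams arriving slightly apart."""
    delay = int(offset_s * SR)
    padded = np.concatenate([signal, np.zeros(delay)])   # original, zero-padded
    delayed = np.concatenate([np.zeros(delay), signal])  # shifted copy
    return padded + delayed

# A 440 Hz test tone; an offset of ~50 ms is audible as a slap-back echo
t = np.arange(SR) / SR
tone = np.sin(2 * np.pi * 440 * t)
doubled = mix_with_offset(tone, 0.05)
```

With very short offsets (a few milliseconds), the same summing produces comb filtering rather than a distinct echo, which may explain why the problem was hard to hear on the mixer at first.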

The sound was better after turning off one of the audio streams.

For the rest of the disputation, I was careful to check the YouTube stream at regular intervals to verify that everything worked well. But since the YouTube stream was delayed by 10+ seconds, I had to do the main monitoring on the video mixer to change camera positions in time.

I have found it important to monitor the final stream on YouTube, in case there are any problems.

Summing up

Apart from the initial audio problems, I think the streaming went well. Of course, we are not aiming for TV production quality. Still, we try to create productions that are pleasant to watch and listen to from a distance. For some of our previous events, we have had 3-4 people involved in the production. This time, we were two: I ran the online production, and Eirik Slinning Karlsen from the RITMO administration controlled the microphones in the hall. This setup works well and is more realistic for future events.

Digital competency

What are the digital competencies needed in the future? Our head of department has challenged me to talk about this topic at an internal seminar today. Here is a summary of what I said.

Competencies vs skills

First, I think it is crucial to separate competencies from skills. The latter relates to how you do something. There has been much focus on teaching skills, mainly teaching people how to use various software or hardware. This is not necessarily bad, but it is not the most productive thing in higher education, in my opinion. Developing competency goes beyond learning new skills.

Some argue that skill is only one of three parts of competency, with knowledge and abilities being the others:

Skills + Knowledge + Abilities = Competencies

So a skill can be seen as part of competency, but it is not the same. This is particularly important in higher education, where the aim is to train students for life-long careers. As university teachers, we need to develop our students’ competencies, not only their skills.

Digital vs technological competency

Another misunderstanding is that “digital” and “technology” are synonyms; they are not. Technologies can be either digital or analogue (or a combination). Think of “computers”: the word originally referred to humans (often women) who performed advanced calculations manually. Human computers were eventually replaced by mechanical computing machines, while today we mainly find digital computers. Interestingly, there is again a growing amount of research on analogue computers.

I often argue that traditional music notation is a digital representation. Notes such as “C”, “D”, and “E” are symbolic representations of a discrete nature, and these digital notes may be transformed into analogue tones once performed.
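As a small illustration of this digital-to-analogue mapping (using the standard equal-temperament formula, which is my choice of example and not tied to any particular notation system), one can compute the continuous frequency that a discrete note symbol maps to:

```python
def midi_to_hz(note: int) -> float:
    """Map a discrete (digital) MIDI note number to a continuous
    (analogue) frequency in hertz, assuming equal temperament with
    A4 (MIDI note 69) tuned to 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

# The symbolic notes "C", "D", "E" (here C4, D4, E4) map to:
for name, note in [("C4", 60), ("D4", 62), ("E4", 64)]:
    print(f"{name} -> {midi_to_hz(note):.2f} Hz")
```

Of course, a performer’s actual tuning and intonation vary continuously; the formula only shows how a symbolic note becomes one point on an analogue continuum.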

One often talks about the differences between acoustic and digital instruments. This is a division I criticise in my upcoming book, but I will leave that argument aside for now. Independent of the sound production, I have over the years grown increasingly fond of Tellef Kvifte’s approach of distinguishing between analogue and digital control mechanisms of musical instruments. By that logic, one could argue that an acoustic piano is a digital instrument because it is based on discrete control (with separate keys for “C”, “D”, “E”…).

Four levels of technology research and usage

When it comes to music technologies, I often like to think of four different layers: basic research, applied research and development, usage, and various types of meta-perspectives. I have given some examples of what these may entail in the table below.

Basic research | Applied research and development | Usage | Meta-perspectives
Example topics: music theory, music cognition, musical interaction, interaction design, instrument making, digital representation, signal processing, machine learning.
Four layers of (music) technology research and usage.

Most of our research activities can be categorised as being on the basic research side (plus various types of applied R&D, although mainly at a prototyping stage) or on the meta-perspectives side. To generalise, one could say that the former is more “technology-oriented” while the latter is more “humanities-oriented.” That is a simplification of a complex reality, but it may suffice for now.

The problem is that many educational activities (ours and others) focus on the use of technologies. However, today’s kids don’t need to learn how to use technologies. Most agree that they are eager technology users from the start. It is much more critical that they learn more fundamental issues related to digitalisation and why technologies work the way they do.

Digital representation

Given the level of digitisation that has happened around us over the last decades, I am often struck by the lack of understanding of digital representation. By that, I mean a fundamental understanding of what a digital file contains and how its content ended up in a digital form. This also influences what can be done to the content. Two general examples:

  • Text: even though the content may look much the same when viewing a .TXT file versus a .DOCX/ODT file, these are two completely different ways of representing textual information.
  • Numbers: storing numbers in a .DOCX/ODT table is completely different from storing the same numbers in a .XLSX/ODS file (or a .CSV file for that matter).

One can think about these as different file formats that one can convert between. But the underlying question is what type of digital representation one wants to capture and preserve, which also influences what you can do with the content.
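A minimal sketch of the difference, using Python’s standard library (the file contents are made up for illustration): the same four numbers stored as prose and as CSV afford very different operations:

```python
import csv
import io

# The same four values represented in two ways:
as_prose = "The values are 1, 2, 3 and 4."  # numbers buried in running text
as_csv = "1,2\n3,4\n"                       # numbers in a tabular structure

# From the CSV representation, the values come back as a table of numbers
rows = [[int(cell) for cell in row] for row in csv.reader(io.StringIO(as_csv))]
first_column_sum = sum(row[0] for row in rows)  # 1 + 3

# From the prose, the same sum requires parsing the text first,
# and the row/column structure is simply gone.
```

The point is not the code itself, but that the choice of representation determines which operations are easy, hard, or impossible later.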

From a musical perspective, there are many types of digital representations:

  • Scores: MIDI, notation formats, MusicXML
  • Audio: uncompressed vs. compressed formats, audio descriptor formats
  • Video: uncompressed vs. compressed formats, video descriptor formats
  • Sensor data: motion capture, physiological sensors, brain imagery

Students (and everyone else) need to understand what such digital representations mean and what they can be used for.

Algorithmic thinking

Computers are based on algorithms: well-defined sets of instructions for doing something. Algorithms can be written in computer code, but they can also be written with a pen on paper or drawn as a flow diagram. The main point is that algorithmic thinking is a particular type of reasoning that people need to learn. It is essential to understand that any complex problem can be broken down into smaller pieces that can be solved independently.
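As a toy example (my own, chosen for the musical context), estimating a tempo from a list of tap times can be broken into three small pieces that can each be solved and checked independently:

```python
def intervals(times):
    """Step 1: the time differences between successive taps."""
    return [b - a for a, b in zip(times, times[1:])]

def mean(values):
    """Step 2: the average of a list of numbers."""
    return sum(values) / len(values)

def tempo_bpm(times):
    """Step 3: combine the pieces: mean interval (s) -> beats per minute."""
    return 60.0 / mean(intervals(times))

# Taps half a second apart correspond to 120 beats per minute
print(tempo_bpm([0.0, 0.5, 1.0, 1.5]))  # 120.0
```

Each piece can be reasoned about on paper before any code exists, which is exactly the kind of decomposition that algorithmic thinking is about.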

Not everyone will become a programmer or software engineer, but there is a growing understanding that everyone should learn basic coding, and algorithmic thinking is at its core. At UiO, this has been implemented widely in the Faculty of Mathematics and Natural Sciences through the Computing in Science Education initiative. We don’t have a similar initiative in the Faculty of Humanities, but several departments have increased the number of courses that teach such perspectives.

Artificial Intelligence

There is a lot of buzz around AI, but most people don’t understand what it is all about. As I have written about several times on this blog (here and here), this makes people either overly enthusiastic or sceptical about the possibilities of AI. Not everyone can become an AI expert, but more people need to understand AI’s possibilities and limitations. We tried to explain that in the “AI vs Ary” project, as documented in this short documentary (Norwegian only):

The future is analogue

In all the discussions about digitisation and digital competency, I find it essential to remind people that the future is analogue. Humans are analogue; nature is analogue. We have a growing number of machines based on digital logic, but these machines contain many analogue components (such as the mechanical keys that I am typing this text on). Much of the current development in AI is bio-inspired, and there are even examples of new analogue computers. Understanding the limitations of digital technologies is also a competency that we need to teach our students.

All in all, I am optimistic about the future. There is a much broader understanding of the importance of digital competency these days. Still, we need to explain that this entails much more than learning how to use particular software or hardware devices. It is OK to learn such skills, but it is even more important to develop knowledge about how and why such technologies work in the first place.

Completing the MICRO project

I wrote up the final report on the project MICRO – Human Bodily Micromotion in Music Perception and Interaction before Christmas. Now I finally got around to wrapping up the project pages. With the touch of a button, the project’s web page now says “completed”. But even though the project is formally over, its results will live on.

Aims and objectives

The MICRO project sought to investigate the close relationships between musical sound and human bodily micromotion. Micromotion is here used to describe the smallest motion that we can produce and experience, typically at a rate lower than 10 mm/s.
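A minimal sketch of how such a speed threshold can be checked on position data (illustrative only, and not the project’s actual analysis pipeline, which lives in the Musical Gestures Toolbox):

```python
import numpy as np

def mean_speed_mm_s(positions_mm: np.ndarray, sample_rate_hz: float) -> float:
    """Average speed of a 3D position trace (n_frames x 3, in mm)
    sampled at sample_rate_hz, returned in mm/s."""
    steps = np.diff(positions_mm, axis=0)  # frame-to-frame displacement
    dists = np.linalg.norm(steps, axis=1)  # Euclidean distance per frame
    return float(dists.mean() * sample_rate_hz)

# Synthetic trace: tiny constant steps at 100 Hz give a speed of 2 mm/s,
# well below the ~10 mm/s micromotion threshold mentioned above
trace = np.cumsum(np.full((1000, 3), 0.02 / np.sqrt(3)), axis=0)
speed = mean_speed_mm_s(trace, 100.0)
```

With real motion capture data, one would typically also filter the trace first, since sensor noise alone can push the apparent speed upwards.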

Example plots of the micromotion observed in the motion capture data of a person standing still for 10 minutes.

The last decades have seen an increased focus on the role of the human body in both the performance and the perception of music. Up to now, however, the micro-level of these experiences has received little attention.

The main objective of MICRO was broken down into three secondary objectives:

  1. Define a set of sub-categories of music-related micromotion.
  2. Understand more about how musical sound influences the micromotion of perceivers and which musical features (such as melody, harmony, rhythm, timbre, loudness, spatialization) come into play.
  3. Develop conceptual models for controlling sound through micromotion, and develop prototypes of interactive music systems based on these models.


The project completed most of its planned activities and several more:

  1. The scientific results include many insights about human music-related micromotion. Results have been presented in one doctoral dissertation, two master’s theses, several journal papers, and at numerous conferences. As hypothesised, music influences human micromotion; this has been verified with different types of music in all the collected datasets. We have also found that music with a regular and strong beat, particularly electronic dance music, leads to more motion. Our data also supports the idea that music with a pulse of around 120 beats per minute is more motion-inducing than music with slower or faster tempi. In addition, we found that people generally moved more when listening with headphones. Towards the end of the project, we began studying whether there are individual differences. One study found that people who score high on empathic concern move more to music than others. This aligns with findings from recent studies of larger-scale music-related body motion.
  2. Data collected from the project has been released openly in Oslo Standstill Database. The database contains data from all Championships of Standstill, the Headphones-Speakers study, and from the Sverm project that preceded MICRO.
  3. Software developed during the project has been made openly available. This includes various analysis scripts implemented in Jupyter Notebooks. Several of the developed software modules have been wrapped up in the Musical Gestures Toolbox for Python.
  4. The scientific results have inspired a series of artistic explorations, including several installations and performances with the Self-playing Guitars, Oslo Muscle Band, and the Micromotion Apps.
  5. The project and its results have been featured in many media appearances, including a number of newspaper stories and several times on national TV and radio.

Open Research

MICRO has been an Open Research flagship project. This includes making the entire project as open as possible and as closed as necessary. The project openly shares publications, data, source code, applications, and other parts of the research process.

Summing up

I am very happy about the outcomes of the MICRO project. This is largely thanks to the fantastic team, particularly postdoctoral fellow Victor Gonzalez Sanchez and doctoral fellow Agata Zelechowska.

Results from the Sverm project inspired the MICRO project, and many lines of thought will continue in my new AMBIENT project. I am looking forward to researching unconscious and involuntary micromotion in the years to come.

Recruiting for the AMBIENT project

I am happy to announce that I am recruiting for my new research project AMBIENT: Bodily Entrainment to Audiovisual Rhythms. The project will continue my line of research into the effects of sound and visuals on our bodies and minds and the creative use of such effects. Here is a short video in which I explain the motivation for the project:

Now hiring

The idea is to put together a multidisciplinary team of three early career researchers experienced with one or more of the following methods: sound analysis, video analysis, interviews, questionnaires, motion capture, physiological sensing, statistics, signal processing, machine learning, interactive (sound/music) systems. The announcement texts are available here:

Application deadline: 15 March 2022. Do not hesitate to get in touch if you have any questions about the positions.

About the project

Much focus has been devoted to understanding the “foreground” of human activities: things we say, actions we do, sounds we hear. AMBIENT will study the sonic and visual “background” of indoor environments: the sound of a ventilation system in an office, the footsteps of people in a corridor, or people’s fidgeting in a classroom.

Examples of periodic auditory and visual stimuli in the environments to be studied in AMBIENT: individuals in offices (WP2), physical-virtual coworking (WP3), telematic classroom (WP4).

The project aims to study how such elements influence people’s bodily behaviour and how people feel about the rhythms in an environment. This will be done by studying how different auditory and visual stimuli combine to create rhythms in various settings.

Rhythms can be constructed from different elements: (a) visual, (b) auditory, (c) audiovisual, (d) spatiotemporal, or (e) a combination of audiovisual and spatiotemporal. The numbers in (e) indicate the cyclic, temporal order of the events.

The hypothesis is that various types of rhythms influence people’s bodily behaviour through principles of entrainment, that is, the process by which independent rhythmical systems interact with each other.
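A minimal numerical sketch of the idea, using a two-oscillator Kuramoto-style model (the frequencies and coupling strength here are arbitrary illustrations, not project parameters):

```python
import math

def entrain(freq_a_hz, freq_b_hz, coupling, steps=20_000, dt=0.001):
    """Two phase oscillators with slightly different natural frequencies,
    each pulled towards the other; returns their final phase difference."""
    phase_a, phase_b = 0.0, math.pi / 2
    for _ in range(steps):
        phase_a += dt * (2 * math.pi * freq_a_hz
                         + coupling * math.sin(phase_b - phase_a))
        phase_b += dt * (2 * math.pi * freq_b_hz
                         + coupling * math.sin(phase_a - phase_b))
    return (phase_b - phase_a) % (2 * math.pi)

# With strong enough coupling, the phase difference settles to a small
# constant value: the two rhythms have entrained
locked = entrain(2.00, 2.05, coupling=1.0)
```

With the coupling set to zero, the phase difference never settles; the two rhythms simply drift past each other. That contrast is the core intuition behind entrainment.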


The primary objective of AMBIENT is to understand more about bodily entrainment to audiovisual rhythms in both local and telematic environments. This will be studied within everyday workspaces like offices and classrooms.

The primary objective can be broken down into three secondary objectives:

  1. Understand more about the rhythms of indoor environments, and make a theoretical model of such rhythms that can be implemented in software.
  2. Understand more about how people interact with the rhythms of indoor environments, both when working alone and together.
  3. Explore how such rhythms can be captured and (re)created in a different environment using state-of-the-art audiovisual technologies.

Work packages

The work in AMBIENT is divided into five work packages:

  • WP1: Theoretical Development
  • WP2: Observation study of individuals in their offices
  • WP3: Observation study of physical-virtual workspaces
  • WP4: Exploration of (re)creation of ambience in a telematic classroom
  • WP5: Software development

The work packages overlap and feed into each other in various ways.

The relationships between work packages. The small boxes within WP2–4 indicate the different studies (a/b/c) and their phases (1/2/3). See WP sections for explanations.

Open Research

The AMBIENT project is an open research lighthouse project. The aim is to keep the entire research as open as possible, including sharing methods, data, publications, etc.


The Research Council of Norway, project number 324003, 2021-2025

MusicLab receives Danish P2 Prisen

Yesterday, I was in Copenhagen to receive the Danish Broadcasting Company’s P2 Prisen for “event of the year”. The prize was awarded to MusicLab Copenhagen, a unique “research concert” held last October after two years of planning.

The main person behind MusicLab Copenhagen is Simon Høffding, a former postdoc at RITMO, now an associate professor at The University of Southern Denmark. He has collaborated with the world-leading Danish String Quartet for a decade, focusing on understanding more about musical absorption.

Simon and I met up to have a quick discussion about the prize before the ceremony.

The organisers asked if we could do some live data capturing during the prize ceremony. However, we could not repeat what we did during MusicLab Copenhagen, where a team of 20 researchers spent a day setting up before the concert. Instead, I created some real-time video analysis of the Danish Baroque orchestra using MGT for Max. That at least gave some idea of what it is possible to extract from a video recording.

Testing the video visualization during the dress rehearsal.

The prize is a fantastic recognition of a unique event. MusicLab is an innovation project between RITMO and the University Library in Oslo. The aim is to explore how it is possible to carry out Open Research in real-world settings. MusicLab Copenhagen is the largest and most complex MusicLab we have organised to date. In fact, we did one complete concert and one test run of the setup to be sure that everything would work well.

While Simon, Fredrik (from the DSQ), and I were on stage to receive the prize, it should be said that we received it on behalf of many others. Around 20 people from RITMO and many others contributed to the event. Thanks to everyone for making MusicLab Copenhagen a reality!

MusicLab Copenhagen was a huge team effort. Here, many of us gathered in front of Musikhuset in Copenhagen before setting up equipment for the concert in October 2021.