Opportunities and Challenges with Citizen Science

Citizen Science is on everyone’s lips these days, at least among people working in research administration, funding agencies, and institutional leadership. As a member of the EUA Expert Group on Open Science/Science 2.0, I am also involved in ongoing discussions on the topic.

Yesterday, I took part in the workshop “Citizen Science in an institutional context”, organized by EUA and OpenAIRE. A recording of my talk is available here:

Video is good for many things, but text is easier to search and skim, so in this blog post I will summarize some of the points from my talk.

As always, I started by briefly explaining my reasoning for talking about Open Research instead of Open Science. This is particularly important for people working within the arts and humanities, for whom “science” may not be the best term.

Defining Citizen Science

There are lots of definitions of citizen science, such as this one:

Citizen science […] is scientific research conducted, in whole or in part, by amateur (or nonprofessional) scientists.

Wikipedia

That is fine on a general level, but it is less clear what it means in practice. In my experience, many people think of citizen science as primarily involving citizens in data collection. Then citizen science is just one building block in the (open) research ecosystem:

A narrow definition of citizen science.

Another, more open, definition of citizen science focuses on the inclusion of citizens in all parts of the research process. This is the approach I find most interesting, and it is the one I am focusing on.

A broader definition of citizen science.

Opportunities of Citizen Science

I have never thought of myself as a “citizen science researcher”. Some people build all their research on such an approach; for me, it has been more of an add-on to other research activities. Still, I have done several citizen science-like projects over the years, and in the talk I briefly presented two of them: Hjernelæring and MusicLab.

Case 1: Hjernelæring

The first case is my collaboration with Hjernelæring, a Norwegian company producing educational material for schools. I was challenged to create an exercise that could be used in classrooms and that could also collect research data. My main research project at the moment (MICRO) focuses on music-related micromotion, so it was natural to build an exercise around this. We have done several studies in the lab over the years, including the Norwegian Championship of Standstill. The latter has been an efficient way of attracting many participants to the study. We also try to give something back: all participants get a chance to download plots of their own micromotion, and we make the data available in the Oslo Standstill Database, so they are free to use and analyze the data if they wish.

Since we couldn’t rely on any particular technology in the classrooms, I ended up making an exercise where the kids would stand still with and without music, and then draw their experiences on a piece of paper. The teacher would then scan these drawings and send them to me for analysis. I think this is a nice example of how to get involved with schools. It is research dissemination, because the kids learn about the research we are doing, why we do it, and what can come out of it. And it is data collection, since the teachers provide us with research data.

Case 2: MusicLab

The second case I presented was MusicLab. This is an innovation project at the University of Oslo, where we at RITMO collaborate with the Science Library in exploring an extreme version of Open Research. Each MusicLab event is organized around a concert. The idea is to collect various types of data (motion capture, physiology, audio, video, etc.) from both performers and audience members, which can then be used to study the event. There is usually also a workshop on a topic related to the concert, a panel discussion, and a data jockeying session in which some of the data is analyzed on the fly. As such, we try to open the entire research process to the public, and also include everyone in the data collection and analysis.

The main parts of a MusicLab event.

Challenges of Citizen Science

The last part of the presentation was devoted to some of the challenges of Citizen Science, particularly the institutional ones. There are, obviously, also many research challenges. Still, at the moment, I think it is important to help institutions develop policies and tools that help researchers run Citizen Science projects.

My list of challenges includes the need for (more):

  • technical infrastructure for data collection, handling, storage, and archiving. Many institutions have built internal systems for data flows, but it is usually difficult to share data openly. I also see that IT departments usually handle storage solutions, while libraries handle archiving. This creates an unfortunate gap between the two (storage and archiving).
  • channels for connecting with citizens. Working with an external partner is usually a good strategy here, but it also means that researchers (and institutions) have to rely on a third party for communication with citizens. Some universities have built up their own Citizen Science centres, which may help facilitate this.
  • legal support (privacy+copyright). All the “normal” challenges of GDPR, copyright, etc., become even more difficult when involving citizens at all stages of the research process. Clearly, there is a need for more support to solve all the legal issues involved.
  • data management support. It is both a skill and a craft to collect data, handle it, equip it with metadata, store it, and archive it properly. Researchers need to learn all of these to some extent, but we also need more professional data managers to help at all stages. I think the future of libraries will largely be connected to data management of various kinds.
  • strategies for avoiding bias and pressure from citizens. One of the big criticisms of Citizen Science is that it may lead to all sorts of unfortunate effects. Research is under pressure in many places, and involving more people in the research process may create new challenges. I believe that more openness is the answer to this problem. Transparency at all levels will help expose whatever goes on in the data collection and analysis phases, which may mitigate the risk of people trying to push the research in one direction or another. This, of course, requires the development of solid infrastructures, proper metadata, persistent IDs, version control, etc.
  • incentives and rewards for researchers (and institutions). Citizen Science is still new to many. As with anything else, if we want to make a change, it is necessary to support people interested in trying things out.
A sketch of how our new MusicLab app allows for secure data collection to UiO servers. The challenging part is to figure out how to give citizens access to the data, and how to handle the data archiving.

Conclusion

In sum, I believe there is huge potential in citizen science. After thinking about it for a while, I have come to see a stronger focus on Citizen Science as a natural extension of current Open Science initiatives. For that to happen, however, we need to solve all of the above. A good starting point is to develop policies from a top-down perspective. Equally important is to give researchers time (and some money) to set up pilot projects to try things out. After all, there will be no Citizen Science if there are no researchers to initiate it in the first place.

Music thumbnailing

A couple of days ago, I read an interesting paper about a new AI algorithm that can summarize long texts. This is an attempt to solve the problem of tl;dr (“too long, didn’t read”) texts.

The article reminded me that the same problem exists for music, in which case it would probably be tl;dl: “too long, didn’t listen”. I was interested in this topic back when I wrote my master’s thesis on short-term music recognition. One way to overcome the challenge of listening through full music tracks is to create music “thumbnails”: compact representations of the most salient parts of the music in question. This is not a trivial task, of course, and lots of research has gone into it over the years. Strangely, though, I haven’t seen any of the many suggested algorithms implemented in a commercial service (so far).

Shortening Bolero

I didn’t delve very deeply into the topic, but in my thesis I experimented with modifying a recording of Ravel’s Bolero. First, I thought of doing a time compression that would preserve the overall form and some of the timbral qualities of a piece. I tested this by compressing Ravel’s Bolero with a phase vocoder from 15 minutes to 15 seconds (warning: starts soft and ends loud):
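In modern terms, the same kind of phase-vocoder compression can be sketched in a few lines of Python. This is a hypothetical reconstruction, not my original implementation; it assumes the librosa and soundfile libraries and example file names:

```python
import librosa
import soundfile as sf

# Load the recording at its native sample rate
y, sr = librosa.load("bolero.wav", sr=None)

# librosa's time_stretch is phase-vocoder based; rate=60 compresses
# ~15 minutes down to ~15 seconds while keeping the pitch
y_short = librosa.effects.time_stretch(y, rate=60.0)

sf.write("bolero_15s.wav", y_short, sr)
```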

Even though it might be possible to understand the musical content and changing timbral qualities from this example, such a method hardly gives a good representation of what I think is perceptually important in this piece: rhythm, melody, and timbre. A better way might be to cut out short excerpts from the piece. Since it has been argued that short excerpts are perceptually significant, a collection of short segments could tell a great deal about a song. To do this automatically, I made a Max patch that plays selected segments of a song.

The user interface of the Max patch Music Trailer that plays short excerpts from a sound file.

The patch allows the user to choose the number of segments to play from the song and the window size, that is, each musical segment’s duration. The patch’s main technical feature is to find the length of the song and calculate evenly distributed start points for the segments to be played.
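The same logic is easy to express in code. Here is a minimal Python sketch of the idea behind the patch (my own translation, not the patch itself, assuming librosa and soundfile): it cuts evenly spaced excerpts and splices them together.

```python
import numpy as np
import librosa
import soundfile as sf

def music_trailer(infile, outfile, n_segments=5, seg_dur=3.0):
    """Concatenate n evenly spaced excerpts of seg_dur seconds each."""
    y, sr = librosa.load(infile, sr=None)
    seg_len = int(seg_dur * sr)               # segment duration in samples
    excerpts = []
    for i in range(n_segments):
        start = int(len(y) * i / n_segments)  # evenly distributed starts
        excerpts.append(y[start:start + seg_len])
    sf.write(outfile, np.concatenate(excerpts), sr)

# For example: 5 excerpts of 3 seconds each
music_trailer("bolero.wav", "trailer.wav", n_segments=5, seg_dur=3.0)
```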

Below are two examples, one made by sampling 5 excerpts of 3 seconds each, and one with 3 excerpts of 5 seconds each. These examples show that it is possible to hear some of the rhythmic figures, melody, and timbre.

5 segments, each 3 seconds
3 segments, each 5 seconds

Even though the patch selects segments without any knowledge of musical content, I would argue that the result is more relevant than the compression example presented above.

Saliency-based thumbnailing

However, the problem with this approach is that the program does not know anything about the musical content, and this lack of musical “knowledge” might mean missing perceptually relevant points. Music thumbnails would therefore be much more useful if they were based on sampling salient features. This is a non-trivial task, and, obviously, I did not manage to implement it during my master’s. I did, however, make a couple of saliency-based music thumbnails manually:

Manually created saliency-based music thumbnail (version 1).
Manually created saliency-based music thumbnail (version 2).

These two examples capture some of the song’s structure and parts of the melody while still preserving timbral qualities. I haven’t followed the research literature on audio/music thumbnailing very closely in recent years, but I am sure that someone has done a similar thing with AI now. If not, someone should do it!

Meeting New Challenges

Life is always full of challenges, but those challenges are also what drives personal development. I am constantly reminded of that when I see this picture, which my mother Grete Refsum made when I started school.

Gutt med ball (Grete Refsum, 1985)

I think the symbolism in the image is great. The eager child is waiting with open arms for an enormous ball. Even though I am much older now, I think the feeling of starting something new is always the same. Starting school, starting university, starting a new job, starting a new project. It is always a feeling of something large and unknown approaching.

I am currently starting a new book project, which at the moment feels like an enormous ball that I don’t really know how to handle. The nice thing about a ball, however, is that you don’t necessarily need to lift it. You can come a long way by pushing it in the right direction.

Music and AI

Last week I was interviewed about music and artificial intelligence (AI). This led to several different stories on radio, on TV, and in print. The reason for the sudden media interest in this topic was a story by The Guardian on the use of deep learning for creating music. They featured an example of Sinatra-inspired music made with a deep learning algorithm:

After these stories were published, I was asked to participate in a talk show on Friday evening. I said yes, of course. After all, music technology research rarely hits the news here in Norway. Unfortunately, my participation was cancelled on short notice after one of the other participants fell ill. I had already written up some notes on what I would have liked to convey to the general public about music and AI, so I thought I could at least publish them here on the blog.

The use of AI in music is not new

While deep learning is the current state of the art, AI has been used in music-making for decades. Earlier systems mainly worked on symbolic music representations. That is, a computer algorithm is fed some musical scores and tries to create, for example, new melodies. There are lots of examples of this in the computer music literature, such as in the proceedings of the International Computer Music Conference.
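To give a flavour of how simple a symbolic approach can be, here is a toy Python example of my own (not any particular published system): a first-order Markov chain that learns pitch transitions from a short melody and then generates a new one.

```python
import random
from collections import defaultdict

# A made-up training melody as MIDI pitch numbers
melody = [60, 62, 64, 65, 64, 62, 60, 67, 65, 64, 62, 60]

# Learn which pitches tend to follow which
transitions = defaultdict(list)
for a, b in zip(melody, melody[1:]):
    transitions[a].append(b)

# Generate a new melody by walking the transition table
note = melody[0]
generated = [note]
for _ in range(11):
    note = random.choice(transitions[note])
    generated.append(note)

print(generated)
```

Real systems are, of course, far more sophisticated, but the principle of learning patterns from scores and generating new material from them is the same.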

The Sinatra example shows that it is now possible to create convincing music also at the sample level. This is neatly done, but timbre-based AI has also been around for a while. This was actually something I was very excited about during my master’s thesis around 20 years ago. Inspired by David Wessel at UC Berkeley, I trained a set of artificial neural networks with saxophone sounds. Much has happened since then, but the basic principles are the same: you feed a computer real sound and ask it to come up with something new that is somewhat similar. We have several projects that explore this in the Interaction and Robotics cluster at RITMO.

If one also considers algorithmic music as a type of AI, this has been around for centuries. Here the idea is to create music by formulating an algorithm that can make musical choices. There are rumours of Mozart’s dice-based procedural music in the 18th century, and I am sure that others also thought about this (does anyone know of a good overview of pre-20th-century algorithmic music?).

There will be changes in the music industry

As the news stories this week showed, many people in the music industry are worried about what is happening with the introduction of more AI in music. And, yes, things are surely going to change. But this is, again, nothing new. The music industry has always been changing. The development of professional concert halls in the 19th century was a major change, and the recording industry changed music forever in the 20th century. That doesn’t necessarily mean that everyone will lose their jobs. Even though everyone can listen to music everywhere these days, many people still enjoy going to concerts to experience live music (and even more so now that corona has deprived us of the possibility).

Sound engineers, music producers, and other people involved in the production of recorded music are examples of new professions that emerged with the new music industry in the 20th century. AI will lead to new music jobs being created in the 21st century. We have already seen that streaming has changed the music industry radically. Although most people don’t think much about it in daily life, there are lots of algorithms and AI involved under the hood of streaming services.

There will surely be some people in the industry who lose their jobs, but many new jobs will also be created. Music technologists with AI competency will be sought after. This is something we have known for a long time, and that is why we are teaching courses on music and machine learning and interactive music systems in the Music, Communication and Technology master’s programme at UiO.

We need to rethink copyright and licensing

Another topic that has emerged from the discussions this week is how to handle copyright and licensing. And, yes, AI challenges the current systems. I would argue, however, that AI is not the problem as such, but it exposes some basic flaws in the current system. Today’s copyright system is based around the song as the primary copyright “unit”. This probably made sense in a pre-technological age where composers wrote a song that musicians performed. Things are quite different in a technology-driven world, where music comes in many different forms and formats.

The Sinatra example can be seen as a type of sampling, just at a microscopic scale. Composers have “borrowed” from others throughout music history, usually in the form of melody, harmony, and rhythm. Producers have for decades used sound fragments (“samples”) of others, which has led to lots of interesting music and several lawsuits. There are numerous challenges here, and we actually have two ongoing research projects at UiO that explore the usage of sampling and copyright in different ways (MASHED and MUSEC).

What is new now is that AI makes it possible to “sample” at the sample level, that is, the smallest unit of a digital sound wave. If you don’t know what that means, take a look at a zoomed-in version of a complex sound wave like this:

Zoom in and out of a complex waveform (coded by Jan Van Balen).
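To make this concrete in code: digital audio is just a long array of numbers, and each of those numbers is one sample. A minimal Python illustration, assuming some file song.wav:

```python
import soundfile as sf

y, sr = sf.read("song.wav")
print(sr)       # samples per second, typically 44100
print(len(y))   # total number of samples in the file
print(y[:10])   # ten individual samples: just small numbers
```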

Splicing together sound samples with AI opens up some interesting technological, philosophical, and legal questions. For example, how short can a sample be and still be covered by copyright? What types of representations of such samples should be considered? Waveforms? Equations? Software with an API? Clearly, there is a need to think more carefully about how this should be implemented.

Possibilities with music and AI

The above-mentioned challenges (a changing music industry and copyright discussions) are not trivial, and I understand that many people are scared by the current changes. However, this fear of new technology may get in the way of many of the positive aspects of using AI in music.

It should be mentioned that many of the new methods have been developed and explored by composers, producers, and other types of music technologists. The intention has been to use machine learning, evolutionary algorithms, and other types of AI to generate new sound and music that would not otherwise be possible. There are some extreme cases of completely computer-generated music. Check out, for example, the autonomous instruments by my former colleague Risto Holopainen. In most cases, however, AI has been (and is still) used as part of a composition/production process together with humans.

Personally, I am particularly interested in how AI can help create better interactive music systems. Over the last century, we have seen a shift towards music becoming a “hermetic” product. Up until the 20th century, music was never fixed; it changed according to who played it. To experience music, you either had to play yourself or be in the near vicinity of someone else who played. Nowadays, we listen to recorded music that never changes. This has led to an increased perfection of the final product. At the same time, it has removed many people from the experience of participating in musicking themselves.

New technologies, such as AI, allow for creating more personalized music. The procedural audio found in computer games is one example of music that can be “stretched” in, for example, its duration. New music apps allow users to modify parts of a song, such as adding or removing instruments in a mix. There are also examples of interactive music systems made for work, relaxation, or training (check, for example, the MCT student blog). What they all have in common is that they respond to some input from the user and modify the music accordingly. This allows for a more engaging musical experience than fixed recordings. I am sure we will see lots more such examples in the future, and they will undoubtedly benefit from better AI models.

Music and AI is the future

AI is coming to music as it is coming to everything else these days. This may seem disruptive to parts of the music industry, but it could also be seen as an opportunity. New technologies will lead to new music and new forms of musicking.

Some believe that computers will take over everything. I am sure that is not the case. What has become very clear from our corona home office lives is that humans are made of flesh and blood, we like to move around, and we are social. The music of the future will continue to be based on our needs to move, to run, to dance, and to engage with others musically. New technologies may even help us to do that better. I am also quite sure that we will continue to enjoy playing music ourselves on acoustic instruments.

Visual effect of the different tblend functions in FFmpeg

FFmpeg is a fantastic resource for doing all sorts of video manipulations from the terminal. However, it has a lot of features, and it is not always easy to understand what they all mean.

I was interested in understanding more about how the tblend filter works. It blends successive video frames and can do so in 30 different ways (blend modes). To get a visual understanding of how the different modes work, I decided to try them all out on the same video file. I started from this dance video:

Then I ran this script:
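(What follows is a sketch of the idea rather than the exact script I used: dance.mp4 is an example file name, and the mode list is a subset; run ffmpeg -h filter=tblend to see the full list.)

```python
import subprocess

# A subset of the tblend blend modes
MODES = ["addition", "average", "darken", "difference", "divide",
         "exclusion", "hardlight", "lighten", "multiply", "negation",
         "overlay", "screen", "softlight", "subtract", "xor"]

for mode in MODES:
    # Blend each frame with the following frame using the given mode
    subprocess.run(["ffmpeg", "-y", "-i", "dance.mp4",
                    "-filter:v", f"tblend=all_mode={mode}",
                    f"tblend_{mode}.mp4"], check=True)
```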

This created 30 video files, each showing the effect of one tblend mode. Here is a playlist with all the resulting videos:

Instead of watching each of them independently, I also wanted to make a grid of all 30 videos. This can be done manually in a video editor, but I wanted to check how it can be done with FFmpeg. I came across this nice blog post with an example that almost matched my needs. With a little bit of tweaking, I came up with this script:
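(What follows is a simplified sketch rather than the full script: it tiles just four of the videos in a 2x2 grid with FFmpeg’s xstack filter. Growing the input list and the layout string extends it to the full 30-video grid.)

```python
import subprocess

# Four of the rendered files (names follow the script above)
inputs = ["tblend_addition.mp4", "tblend_average.mp4",
          "tblend_darken.mp4", "tblend_difference.mp4"]

# xstack positions each input using the other inputs' widths (w0)
# and heights (h0); this layout gives a 2x2 grid
layout = "0_0|w0_0|0_h0|w0_h0"

cmd = ["ffmpeg", "-y"]
for f in inputs:
    cmd += ["-i", f]
cmd += ["-filter_complex",
        f"[0:v][1:v][2:v][3:v]xstack=inputs=4:layout={layout}[v]",
        "-map", "[v]", "grid.mp4"]
subprocess.run(cmd, check=True)
```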

The final result is a 30-video grid with all the different tblend modes placed next to each other (in alphabetical order from the top left, I think). Consult the playlist to see the individual videos.