MusicTestLab as a Testbed of Open Research

Many people talk about “opening” the research process these days. Due to initiatives like Plan S, much has happened when it comes to Open Access to research publications. There are also things happening when it comes to sharing data openly (or at least FAIR). Unfortunately, there is currently more talking about Open Research than doing. At RITMO, we are actively exploring different strategies for opening our research. The most extreme case is that of MusicLab. In this blog post, I will reflect on yesterday’s MusicTestLab – Slow TV.

About MusicLab

MusicLab is an innovation project by RITMO and the University Library. The aim is to explore new methods for conducting research, research communication and education. The project is organized around events: a concert in a public venue, which is also the object of study. The events also contain an edutainment element through panel discussions with world-leading researchers and artists, as well as “data jockeying” in the form of live data analysis of recorded data.

We have carried out 5 full MusicLab events so far and a couple of in-between cases. Now we are preparing for a huge event in Copenhagen with the Danish String Quartet. The concert has already been postponed once due to corona, but we hope to make it happen in May next year.

The wildest data collection ever

As part of the preparation for MusicLab Copenhagen, we decided to run a MusicTestLab to see if it is at all possible to carry out the type of data collection that we would like to do. Usually, we work in the fourMs Lab, a custom-built facility with state-of-the-art equipment. This is great for many things, but the goal of MusicLab is to do data collection in the “wild”, which would typically mean a concert venue.

For MusicTestLab, we decided to run the event on the stage in the foyer of the Science Library at UiO, which is a real-world venue that gives us plenty of challenges to work with. We decided to bring a full “package” of equipment, including:

  • infrared motion capture (Optitrack)
  • eye trackers (Pupil Labs)
  • physiological sensors (EMG from Delsys)
  • audio (binaural and ambisonics)
  • video (180° GoPros and 360° Garmin)

We are used to working with all of these systems separately in the lab, but combining them in an out-of-lab setting is more challenging, especially with limited time to set everything up.

Musicians on stage with many different types of sensors on, with RITMO researchers running the data collection and a team from LINK filming.

Streaming live – Slow TV

In addition to actually doing the data collection in a public venue, where people passing by can see what is going on, we decided to also stream the entire setup online. This may seem strange, but we have found that many people are actually interested in what we are doing. Many people also ask about how we do things, and this was a good opportunity to show people the behind-the-scenes of a very complex data collection process. The recording of the stream is available online:

To make it a little more viewer-friendly, the stream features live commentary by myself and Solveig Sørbø from the library. We talk about what is going on and interview the researchers and musicians. As can be seen from the stream, it was quite a hectic event, which was further complicated by corona restrictions. We were about an hour late for the first performance, but we managed to complete the whole recording session within the allocated time frame.

The performances

The point of the MusicLab events is to study live music, and this was also the focal point of the MusicTestLab, featuring the very nice, young student-led Borealis String Quartet. They performed two movements of Haydn’s Op. 76, no. 4 «Sunrise» quartet. The first performance can be seen here (with a close-up of the motion capture markers):

The first performance of Haydn’s string quartet Op. 76, no. 4 (movements I and II) by the Borealis String Quartet.

Then after the first performance, the musicians took off the sensors and glasses, had a short break, and then put everything back on again. The point of this was for the researchers to get more experience with putting everything on properly. From a data collection point of view, it is also interesting to see how reliable the data are between different recordings. The second performance can be seen here, now with a projection of the gaze from the violist’s eye-tracking glasses:

The second performance of Haydn’s string quartet Op. 76, no. 4 (movements I and II) by the Borealis String Quartet.

A successful learning experience

The most important conclusion of the day was that it is, indeed, possible to carry out such a large and complex data collection in an out-of-lab setting. It took an hour longer than expected to set everything up, but it also took an hour less to take everything down. This is valuable information for later. We also learned a lot about which types of clamps, brackets, cables, etc., are needed for such events. Also useful is the experience of calibrating all the equipment in a new and uncontrolled environment. All in all, the experience will help us carry out better data collections in the future.

Sharing with the world

Why is it interesting to share all of this with the world? RITMO is a Norwegian Centre of Excellence, which means that we get a substantial amount of funding for doing cutting-edge research. We are also in the unique position of having a very interdisciplinary team of researchers with broad methodological expertise. Given the trust we have received from UiO and our many funding agencies, we feel an obligation to share as much as possible of our knowledge and expertise with the world. Of course, we present our findings at the major conferences and publish our final results in leading journals. But we also believe that sharing the way we work can help others.

Sharing our internal research process with the world is also a way of improving our own way of working. Having to explain what you do to others helps to sharpen your own thinking. I believe that this will, in turn, lead to better research. We cannot run MusicTestLabs every day. Today all the researchers will copy all the files that we recorded yesterday and start on the laborious post-processing of all the material. Then we can start on the analysis, which may eventually lead to a publication a year (or two or three) from now. If we do end up with a publication (or more) based on this material, everyone will be able to see how it was collected and follow the data processing through every step of the chain. That is our approach to doing research that is verifiable by our peers. And, if it turns out that we messed something up, and that the data cannot be used for anything, we have still learned a lot through the process. In fact, we even have a recording of the whole data collection process so that we can go back and see what happened.

Other researchers need to come up with their own approaches to opening their research. MusicLab is our testbed. As can be seen from the video, it is hectic. Most important, though, is that it is fun!

RITMO researchers transporting equipment to MusicTestLab in the beautiful October weather.

Motiongrams of rhythmic chimpanzee swaying

I came across a very interesting study on the Rhythmic swaying induced by sound in chimpanzees. The authors have shared the videos recorded in the study (Open Research is great!), so I was eager to try out some analyses with the Musical Gestures Toolbox for Matlab.

Here is an example of one of the videos from the collection:

The video quality is not very good, so I had my doubts about what I could find. It is particularly challenging that the camera is moving slightly over time. There is also a part where the camera zooms in towards the end. A good rule of thumb is to always use a tripod and no zoom/pan/tilt when recording video for analysis.

Still, I managed to create a couple of interesting visualizations. Here I include two motiongrams, one horizontal and one vertical:

This horizontal motiongram reveals the vertical motion of the chimpanzee. Time runs from left to right.
This vertical motiongram reveals the sideways motion of the chimpanzee. Time runs from top to bottom.

Despite the poor input quality, I was happy to see that the motiongrams are quite illustrative of what we see in the video. They clearly reveal the rhythmic pattern of the chimpanzee’s motion. It would have been interesting to have some longer recordings to do a more detailed analysis of the correspondences between sound and motion!

If you are interested in making such visualisations yourself, have a look at our collection of tools in the Musical Gestures Toolbox.
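For readers who want a feel for the principle before trying the toolbox, here is a minimal sketch of how a motiongram can be built. It is a simplified illustration in plain Python, not the Musical Gestures Toolbox implementation: the function names are hypothetical, and frames are assumed to be small greyscale images stored as nested lists. Each pair of consecutive frames is turned into a motion image (the absolute pixel difference), which is then averaged down to a one-pixel-wide strip, and the strips are stacked over time.

```python
def motion_image(prev_frame, curr_frame):
    """Absolute per-pixel difference between two greyscale frames."""
    return [
        [abs(c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def motiongrams(frames):
    """Return (horizontal, vertical) motiongrams for a list of frames.

    horizontal: height x time, keeps the vertical axis, time runs left to right.
    vertical:   time x width, keeps the horizontal axis, time runs top to bottom.
    """
    column_strips = []  # one strip of length `height` per frame pair
    vertical = []       # one strip of length `width` per frame pair
    for prev_frame, curr_frame in zip(frames, frames[1:]):
        diff = motion_image(prev_frame, curr_frame)
        height, width = len(diff), len(diff[0])
        # Average every row: collapses the image to a single column.
        column_strips.append([sum(row) / width for row in diff])
        # Average every column: collapses the image to a single row.
        vertical.append(
            [sum(row[x] for row in diff) / height for x in range(width)]
        )
    # Transpose the column strips so that time runs along the horizontal axis.
    horizontal = [list(row) for row in zip(*column_strips)]
    return horizontal, vertical


# Made-up test clip: a bright pixel moving sideways along the middle row.
clip = [
    [[0, 0, 0], [3, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 3, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 3], [0, 0, 0]],
]
horizontal, vertical = motiongrams(clip)
print(horizontal)
print(vertical)
```

In the made-up clip, the activity in the vertical motiongram shifts from the left columns to the right columns from one time step to the next, while the horizontal motiongram only shows that something is happening in the middle row.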

Embed YouTube video with subtitles in different languages

This is primarily a note-to-self post, but it could hopefully also be useful for others. At least, I spent a little too long figuring out how to embed a YouTube video with subtitles in a specific language.

The starting point is that I had this project video that I wanted to embed on a project website:

However, I then found that you can specify the language you want by adding this snippet at the end of the embed URL:

?hl=en&cc_lang_pref=en&cc=1

Here, ?hl=en sets the language of the player controls, &cc_lang_pref=en sets the language of the subtitles, and &cc=1 turns the subtitles on. The complete block is:

<iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="315" src="https://www.youtube.com/embed/qGN2zbic3JM?hl=en&cc_lang_pref=en&cc=1" width="560"></iframe>

And the embedded video looks like this:

To play the same video with Norwegian subtitles on the Norwegian web page, I use this block:

<iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="315" src="https://www.youtube.com/embed/qGN2zbic3JM?hl=no&cc_lang_pref=no&cc=1" width="560"></iframe>

And it looks like this:

Simple when you have found the solution!
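If you need the same video in several languages, the URL can also be generated instead of edited by hand. The following is a small sketch in Python, using only the three query parameters described above (hl, cc_lang_pref, and cc, exactly as used in this post). The function name youtube_embed_url is made up for the illustration.

```python
from urllib.parse import urlencode


def youtube_embed_url(video_id, lang, captions=True):
    """Build an embed URL with the player and subtitle language set to `lang`."""
    params = {
        "hl": lang,            # language of the player controls
        "cc_lang_pref": lang,  # preferred subtitle language
    }
    if captions:
        params["cc"] = "1"     # turn the subtitles on, as in the snippet above
    return f"https://www.youtube.com/embed/{video_id}?{urlencode(params)}"


# One URL per language version of the web page.
for lang in ("en", "no"):
    print(youtube_embed_url("qGN2zbic3JM", lang))
```

The resulting string goes into the src attribute of the iframe block.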

Why is open research better research?

I am presenting at the Norwegian Forskerutdanningskonferansen on Monday, which is a venue for people involved in research education. I have been challenged to talk about why open research is better research. In the spirit of openness, this blog post is an attempt to shape my argument. It can be read as an open notebook for what I am going to say.

Open Research vs Open Science

My first point in any talk about open research is to explain why I think “open research” is better than “open science”. Please take a look at a previous blog post for details. The short story is that “open research” feels more inclusive for people from the arts and humanities, who may not identify as “scientists”.

Why not?

I find it strange that in 2020 it is necessary to explain why we believe open research is a good idea. Instead, I would rather suggest that others explain why they do not support the principles of open research. Or, put differently, “why is closed research better research”?

One of the main points of doing research is to learn more and expand our shared knowledge about the world. This is not possible if we do not share the very same knowledge. Sharing has also been a core principle of research/science for centuries. After all, publications are a way of sharing.

The problem is that a lot of today’s publishing is a relic from a pre-digital era, and does not take into account all the possibilities afforded by new technologies. The idea of “Science 2.0” is to utilize the potential of web-based tools in research. Furthermore, this does not only relate to the final publications. A complete open research paradigm involves openness at all levels.

What is Open Research?

There are many definitions of open research, and I will not attempt to come up with the ultimate one here. Instead, I will point to some of the building blocks in an open research paradigm:

One can always argue about the naming of these, and what they include. The most important point is to show that all parts of the research process could, in fact, be open.

How does open research help make better research?

To answer the original question, let me try to come up with one statement for each of the blocks mentioned in the figure above:

  • Open Applications: Funding applications are mainly closed today. But why couldn’t all applications be made publicly available? This would lead to better and more transparent processes, and the applications themselves could be seen as something others can build on. To prevent people from stealing ideas, such public applications would, of course, need to have tracking of applicant IDs, version-controlled IDs on the text, and universal time codes. That way, nobody would be able to falsely claim that they came up with the idea first. One example is how DIKU decided to make all applications and assessments for the call for Centres of Excellence in Education open.
  • Open Assessment: If the assessment of research applications were also open, this would increase the transparency of who gets funding, and why. The feedback from reviewers would also be openly available for everyone to see and learn how to develop better applications in the future.
  • Open Notebooks: Jumping to when the actual research starts, one could also argue for opening up the entire research process itself. This could involve the use of open notebooks explaining how the research develops. It would also be a way of tracking the steps taken to conduct the research, for example, getting ethics permissions. This could be done on web pages, blogs, or with more computational tools like Jupyter Notebook.
  • Open Methods: During review processes of publications, one of the trickiest parts is to understand how the research was conducted. Then it is crucial that the methods are described clearly and openly. Solutions like the Open Science Framework try to make a complete solution for making material available.
  • Open Source: An increasing number of methods are computer-based. Sharing the source code of developed software is one approach to opening the methods used in research. It is also of great value for other researchers to build on. Some of the most popular platforms are GitHub and GitLab.
  • Citizen Science: This is a big topic, but here I would say that it could be a way of opening for contributions by non-researchers in the process. This could be anything from participating in research designs to helping with data collection.
  • Open Data: Sharing the data is necessary so that reviewers can do their job in assessing whether the research results are sane. It is quite remarkable that most papers are still accepted without the reviewers having had access to the data and analysis methods that were used to reach the conclusions. Of course, open data are also of value for other researchers that re-analyze or perform new analysis on the data. In my experience, data are under-analyzed in general. There are numerous platforms available, both commercial (Figshare) and non-profit (Zenodo).
  • Open Manuscripts: Many researchers have been sharing manuscripts with colleagues and getting feedback before submission. With today’s tools, it is possible to do this at scale, asking for feedback on the material even at the stage of manuscripts. There are numerous new tools here, including Authorea and PubPub.
  • Open Peer Review: A traditional review process consists of feedback from 2-3 peers. With an open peer review system, many more peers (and others) could comment on the manuscript, thereby also improving the final quality of the paper. One interesting system here is OpenReview.
  • Open Access: Free access to the final publication is what most people have focused on. This is one crucial building block in the ecosystem, and much (positive) has happened in the last few years. However, we are still far from having universal open access. This is a significant bottleneck in the sharing of new knowledge. Fortunately, the political pressure from cOAlition S and others is helping to bring about change.
  • Open Educational Resources: Academic publications are not for everyone to digest. Therefore, it is also imperative that we create material that can be used by students. This is particularly important to support people’s life-long learning. The popularity of MOOCs on platforms such as EdX, FutureLearn, and Coursera, has shown that there is a large market for this. Many of these are closed, however, which prevents full distribution.
  • Open Citations: Whether your work is cited by peers or not is often critical for many people’s careers. It has become a big business to create citation counts and various types of indexes (the h-index being the most common). The chase for citations has several negative sides, including self-citations, and dubious pushing for citations to reviewers’ material. Therefore, we need to push for more openness, also when it comes to citations and citation counts.
  • Open Scientific Social Networks: The way people connect is vital in the world of research (as elsewhere). Opening the networks is crucial, particularly for minority researchers, to get access. Diversity will generally lead to better and more balanced results.
  • Open Assessment: The last block closes the circle and takes us back to assessment, this time of research and researchers, which is a topic I have written about before. I also helped organize the 2020 EUA Workshop on Academic Career Assessment in the Transition to Open Science, which has a lot of excellent material online.

Conclusion

As my quick run-through of the different building blocks has shown, it is possible to open the entire research process. Much experimentation is happening these days, and practices are starting to converge for some of the blocks. For example, the sharing of source code and data has come a long way in some communities. Some journals even refuse manuscripts without complete data sets and source code. Other parts have barely started. Open assessment has perhaps progressed the least, but things are moving here too.

My main argument for opening all parts of the process is that it “sharpens” the research process. You cannot be sloppy if you know that your work will be exposed. I often hear people argue that it takes a lot of time to make everything openly available. That is also my experience. On the other hand, why should research be so fast? It is better to focus on quality than quantity. Open research fosters quality research.

One of the most common objections to opening the research process is that other people will steal your ideas, data, code, and so on. However, if everything is tagged correctly, time-stamped, and given unique IDs, it is not possible to steal anything. Everything will be traceable. And plagiarism algorithms will quickly sort out any problems.
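To make the idea of tagging, time-stamping, and unique IDs a little more concrete, here is a minimal sketch in Python of what such a record could look like. It is only an illustration under my own assumptions: the function name, the field names, and the placeholder author ID are hypothetical, and a locally generated timestamp proves nothing on its own. In practice, the record would have to be deposited with a trusted third party (a repository, a funder, a preprint server) to carry any weight.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint_record(text, author_id, timestamp=None):
    """Return a small record that ties an exact text to an author and a time.

    The SHA-256 digest changes if a single character of the text changes,
    so the record identifies one specific version of the document.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    # Assumption: in real use, the timestamp comes from a trusted service.
    stamp = timestamp or datetime.now(timezone.utc).isoformat()
    return {"author_id": author_id, "sha256": digest, "timestamp_utc": stamp}


# Hypothetical author ID and made-up text, for illustration only.
record = fingerprint_record("My research idea, version 1.", "author-0001")
print(record)
```

Anyone who later receives the same text can recompute the digest and compare it with the deposited record.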

The biggest challenge we are facing is balancing the “old” and the “new” way of doing research. That is why policymakers and researchers need to work together with funders to help flip the model as quickly as possible.

How long is a NIME paper?

Several people have argued that we should change from having a page limit (2/4/6 pages) for NIME paper submissions to a word limit instead. It has also been argued that references should not be counted as part of the text. However, what should the word limits be?

It is always good to look at the history, so I decided to check how long previous NIME papers have been. I started by exporting the text from all of the PDF files with the pdftotext command-line utility:

# ${f%.pdf} strips the extension, so names with spaces or extra dots are handled
for f in *.pdf; do pdftotext "$f" "${f%.pdf}.txt"; done

Then I did a word count on these:

wc -w *.txt > wc.txt

After a little bit of reformatting and sorting, the result looks like this in a spreadsheet:

And from this we can sort and make a graphical representation of the number of words:

There are some outliers here. A couple of papers are (much) longer than the others, mainly because they contain long appendices. Some files have low word count numbers because the PDF files are protected from editing, and then pdftotext is not able to extract the text. The majority of files, however, are in the range 2500-5000 words.
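The reformatting and sorting can also be done with a short script instead of a spreadsheet. The sketch below, in Python, parses text in the format that wc -w writes (a count followed by a file name, with a final "total" line), sorts the papers by length, and separates the files that fall outside the 2500-5000 range. The function names and the sample numbers are made up for illustration. They are not the actual NIME counts.

```python
def parse_wc_output(text):
    """Turn `wc -w` output into a list of (filename, word_count) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # split the count from the file name
        if len(parts) != 2:
            continue
        count, name = parts[0], parts[1].strip()
        if name == "total":          # skip the summary line that wc appends
            continue
        rows.append((name, int(count)))
    return rows


def summarize(rows, low=2500, high=5000):
    """Sort by word count and flag files outside the typical range."""
    ordered = sorted(rows, key=lambda row: row[1])
    below = [row for row in ordered if row[1] < low]   # e.g. protected PDFs
    above = [row for row in ordered if row[1] > high]  # e.g. long appendices
    in_range = len(ordered) - len(below) - len(above)
    share = in_range / len(ordered) if ordered else 0.0
    return {"sorted": ordered, "below": below, "above": above,
            "share_in_range": share}


# Made-up sample in the format of wc.txt.
sample = """  3100 paper_a.txt
    12 paper_b.txt
  9800 paper_c.txt
  4200 paper_d.txt
 17112 total
"""
summary = summarize(parse_wc_output(sample))
print(summary["below"], summary["above"], summary["share_in_range"])
```

With the real wc.txt, the text would be read from the file, for example with open("wc.txt").read().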

The word count includes everything, also headers/footers, titles, abstracts, acknowledgements, and references. These differ from paper to paper, but together they add up to roughly 500 words. So the main text of most papers could be said to be in the range of 2000-4500 words.