I have seen that it is possible to build a complete CV from data in NVA, the Norwegian research registry. As part of my quest to collect data about researchers connected to MishMash, I am looking for the best data source(s). A quick check of my own personal page at UiO showed that institutional person pages are not the right solution. But what about NVA? Perhaps that is a viable solution?
Exploring NVA data
NVA is the official Norwegian research registry, which combines person profiles, affiliations, publications, and projects into a single system. For my earlier NVA testing, I manually exported a TSV file from the system. Now, I wanted to check what can be retrieved programmatically.
As it turns out, the NVA website is a JavaScript single-page application. There are no og:title, og:description, og:image, or any other Open Graph meta tags. Nor is there any JSON-LD block, Dublin Core metadata, or citation_* tags. So I quickly discarded the idea of scraping metadata from the HTML pages.
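For reference, this kind of check can be scripted. The following is a minimal sketch using only Python's standard library. The class and function names are my own, and the list of prefixes is an assumption about which metadata conventions are worth looking for.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta> tags and note whether a JSON-LD <script> block exists."""

    def __init__(self):
        super().__init__()
        self.meta = {}            # name or property -> content
        self.has_json_ld = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key:
                self.meta[key] = attrs.get("content", "")
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self.has_json_ld = True


def embedded_metadata(html):
    """Return (Open Graph / Dublin Core / citation_* tags, JSON-LD present?)."""
    parser = MetaTagCollector()
    parser.feed(html)
    prefixes = ("og:", "dc.", "dcterms.", "citation_")  # assumed conventions
    found = {k: v for k, v in parser.meta.items() if k.lower().startswith(prefixes)}
    return found, parser.has_json_ld
```

For a bare single-page-application shell, which only contains a root element and script references, this returns an empty dictionary and False.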
Fortunately, NVA has an API that exposes a lot of structured JSON. The useful data is available from api.nva.unit.no. Exploring my own profile, I find the following:
Person profile
GET https://api.nva.unit.no/cristin/person/1328
This endpoint is public and returns a rich Person object that can be used to extract:
| Field | Example |
|---|---|
| id | https://api.nva.unit.no/cristin/person/1328 |
| identifiers | ORCID + Cristin identifier |
| names | First and last name |
| contactDetails.email | Public email |
| contactDetails.webPage | Personal page |
| image | URL to profile picture |
| affiliations[] | Organizations, roles, active/inactive |
| keywords[] | Structured research topics |
| verified | Verification flag |
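As a rough sketch, the fields in the table could be read like this in Python. The top-level field names come from the table above, but the nested shapes are assumptions on my part: I assume that names and identifiers are lists of {"type": ..., "value": ...} objects with types such as FirstName, LastName, and ORCID, and that each affiliation carries a boolean active flag. The helper names are hypothetical.

```python
import json
from urllib.request import Request, urlopen

API_ROOT = "https://api.nva.unit.no"


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def typed_values(items):
    """Turn [{"type": t, "value": v}, ...] into {t: v} (assumed shape)."""
    return {item.get("type"): item.get("value") for item in items or []}


def summarize_person(person):
    """Pick the directory-relevant fields out of a Person object."""
    names = typed_values(person.get("names"))
    identifiers = typed_values(person.get("identifiers"))
    contact = person.get("contactDetails") or {}
    affiliations = person.get("affiliations") or []
    first, last = names.get("FirstName"), names.get("LastName")
    return {
        "id": person.get("id"),
        "name": " ".join(filter(None, [first, last])),
        "orcid": identifiers.get("ORCID"),          # assumed type label
        "email": contact.get("email"),
        "web_page": contact.get("webPage"),
        "image": person.get("image"),
        "active_affiliations": [a for a in affiliations if a.get("active")],
        "keywords": person.get("keywords") or [],
        "verified": person.get("verified"),
    }
```

Usage would be along the lines of summarize_person(fetch_json(f"{API_ROOT}/cristin/person/1328")). Every lookup uses .get, so a missing field comes back as None or an empty list rather than raising an error.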
Publications and other results
GET https://api.nva.unit.no/search/resources?contributor=https://api.nva.unit.no/cristin/person/1328&aggregation=none&size=15
This endpoint is also public and returns paginated results plus totalHits. Useful top-level fields include:
| Field | What it gives you |
|---|---|
| totalHits | Total number of results for the person |
| hits[].id | NVA resource ID |
| hits[].type | Resource type (often Publication) |
| hits[].entityDescription.mainTitle | Title |
| hits[].entityDescription.publicationDate | Date |
| hits[].entityDescription.abstract | Abstract/description |
| hits[].entityDescription.contributors[] | Contributors + ORCID/person IDs |
| hits[].projects[] | Linked projects |
| hits[].fundings[] | Funding source/identifier |
| hits[].additionalIdentifiers[] | Handle, Cristin ID, etc. |
| hits[].status | Publication state |
You also get pagination links (nextResults, nextSearchAfterResults), so build-time harvesting is straightforward.
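A harvesting loop could look something like the sketch below. It assumes that nextResults holds the full URL of the next page and is absent on the last page, which I have not verified in detail. The function names and the max_pages safety limit are my own additions.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def search_url(person_id, size=15):
    """Build the search URL for all results by one contributor."""
    query = urlencode({"contributor": person_id, "aggregation": "none", "size": size})
    return f"https://api.nva.unit.no/search/resources?{query}"


def harvest(first_url, fetch=fetch_json, max_pages=100):
    """Yield every hit, following nextResults links until none remain."""
    url, seen = first_url, set()
    for _ in range(max_pages):
        if not url or url in seen:      # stop at the end or on a repeated link
            break
        seen.add(url)
        page = fetch(url)
        hits = page.get("hits") or []
        yield from hits
        if not hits:
            break
        url = page.get("nextResults")   # assumed to be a full URL


def title_of(hit):
    """Read hits[].entityDescription.mainTitle, tolerating missing fields."""
    return (hit.get("entityDescription") or {}).get("mainTitle")
```

With this, [title_of(h) for h in harvest(search_url("https://api.nva.unit.no/cristin/person/1328"))] would collect all titles for one person. Passing fetch as a parameter makes it easy to swap in a cached or mocked fetcher during a site build.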
Project records
The profile page links to project pages, and project records are publicly available by ID:
GET https://api.nva.unit.no/cristin/project/568602
Typical fields include:
| Field | What it gives you |
|---|---|
| title | Project title |
| startDate / endDate | Time range |
| funding[] | Funding sources |
| coordinatingInstitution | Lead institution |
| contributors[] | Participants and roles |
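A project record could be condensed in a similar way. This is only a sketch: the top-level field names come from the table above, while the possibility that title is a {language: text} mapping rather than a plain string is an assumption that the hypothetical label helper handles either way.

```python
def label(value):
    """Return display text from a plain string or a {language: text} mapping."""
    if isinstance(value, dict):
        # Prefer English, otherwise fall back to whichever language is present.
        return value.get("en") or next(iter(value.values()), None)
    return value


def summarize_project(project):
    """Condense a project record (already decoded from JSON) for a directory."""
    return {
        "title": label(project.get("title")),
        "period": (project.get("startDate"), project.get("endDate")),
        "funding_count": len(project.get("funding") or []),
        "coordinating_institution": project.get("coordinatingInstitution"),
        "contributor_count": len(project.get("contributors") or []),
    }
```

The input is whatever JSON object comes back from GET https://api.nva.unit.no/cristin/project/568602. The dates are passed through untouched, since I have not checked their exact format.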
Potential for building a directory
All of that means that to create a researcher directory for MishMash, NVA can provide most of what we need:
- Name and identifiers (including ORCID)
- Contact details (if public)
- Affiliations and role labels
- Profile image URL
- Research keywords
- Result counts and selected publications
- Project memberships and roles
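To make the list above concrete, one directory entry could be modelled roughly as follows. This is a sketch of a possible data model, not something NVA prescribes: the class, its field names, and the JSON output are all my own assumptions about what a MishMash directory might need.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class DirectoryEntry:
    """One researcher in the directory, mirroring the list above."""
    name: str
    nva_id: str
    orcid: Optional[str] = None
    email: Optional[str] = None            # only if public
    image: Optional[str] = None            # profile image URL
    affiliations: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    result_count: int = 0                  # totalHits from the search endpoint
    selected_publications: List[str] = field(default_factory=list)
    projects: List[str] = field(default_factory=list)


def directory_json(entries):
    """Serialize entries, sorted by name, for use when building the site."""
    ordered = sorted(entries, key=lambda entry: entry.name.casefold())
    return json.dumps([asdict(e) for e in ordered], ensure_ascii=False, indent=2)
```

Most fields are optional with empty defaults, so an entry can start out with only a name and an NVA ID and be filled in as more data is harvested.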
There is one caveat: this strategy only works for people who have an NVA profile. While having a profile is a requirement for all researchers working at Norwegian institutions, that requirement does not extend to students or to non-academic and non-affiliated researchers. It doesn’t work for international partners either. So we may also need to rely on ORCID for additional information. That will have to be tomorrow’s challenge.
Thanks to Copilot for helping me research and implement this solution and for drafting this blog post.
