Towards interoperability for psychological data

March 21, 2024

This is a Mastodon thread. The original thread is available here:


From Monday the 18th of March 2024 to Thursday the 21st of March 2024, Rik and I had a writing week in Leuven ✍️🍻

During this week, Rik worked on a Narrative Response Model for a self-efficacy item (more later, for example in our EHPS symposium in Cascais 💃), and I worked on a repository for psychological questionnaires ✅

I’ll explain a bit in this Mastodon Thread and blog post (at https://sciencer.eu/posts/2024-03-towards-interoperability-for-psychological-data.html).

A GIF of Kermit the frog typing.


Loads of repositories exist already, so let me explain what’s so different about this one and why I’m currently so stoked about the progress 🤩

So, Rik and I are trying to work our way out of the theory crisis and the measurement crisis. I try to summarize this in https://sciencer.eu/posts/2022-12-knowing-what-we-re-talking-about–mastodon-thread.html.

Jeff Bridges in The Big Lebowski sitting in a car, looking stoked.


Part of our proposed solution is using comprehensive construct definitions that have Unique Construct Identifiers, UCIDs (also see https://doi.org/jnjp).

However, these have limited value if they cannot be linked to measurement and data.

In addition, most existing repositories for questionnaires (‘scales’) suffer from a number of problems that I’ve been aching to address for a while now:


1️⃣ They are often not open infrastructure. The database and interface are often not publicly available for anybody to copy, adjust and run.

2️⃣ They often do not link to unique construct identifiers that are built on open infrastructure (although they often have system-specific unique identifiers).

3️⃣ They often are not openly machine-readable (for example with an API).

4️⃣ They often do not allow easy conversion of a questionnaire so that it can be imported into open software.


This new repository is built upon a number of existing open tools:

1️⃣ The same Quarto website that underlies the Psychological Construct Repository (https://psycore.one);

2️⃣ Open standards for specifying questionnaires in machine-readable formats, with identifiers for questionnaires and single items;

3️⃣ The {psyverse} R package to work with the specifications we designed (https://psyverse.opens.science);

4️⃣ The {limonaid} R package for working with LimeSurvey (https://limonaid.opens.science).


All this is wrapped together to create the repository of which the first so-fresh-it’s-not-even-beta version is live at https://operationalizations.com.

The little ecosystem consists of a number of things (spread over two Mastodon posts, sorry 🤷):

👉 A way to set up a spreadsheet to specify questionnaire content in a Tabulated Open Questionnaire specification (a TOQ spec).

👉 A way to assign Unique Questionnaire Identifiers (UQIDs) to questionnaires.


👉 A way to convert a TOQ into a Serialized Open Questionnaire specification (a SOQ spec) in YAML format.

👉 A Quarto site that embeds a set of TOQs in a website with an interface for humans and one for machines.

👉 A Shiny app that, when you provide a UQID, imports the SOQ spec from that repository and produces a LimeSurvey group file (a .lsg file) that can then be directly imported into LimeSurvey.
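
To make the TOQ-to-SOQ step a bit more concrete, here is a minimal Python sketch of the idea: rows from a spreadsheet go in, one nested and serializable specification comes out. Everything specific in it is an assumption for illustration: the column names (`item_id`, `item_text`, `ucid`), the output keys, the item texts and the placeholder UCID are made up, and the real field names are whatever the TOQ and SOQ specs and the {psyverse} package define. It also emits JSON rather than YAML, only to stay within the standard library (YAML parsers generally read JSON as well).

```python
import csv
import io
import json

# Hypothetical TOQ-like sheet exported as CSV; the real TOQ columns may differ.
TOQ_CSV = """item_id,item_text,ucid
item1,Example item text 1,exampleConstruct_xxxxxxxx
item2,Example item text 2,exampleConstruct_xxxxxxxx
"""


def toq_to_soq(csv_text, uqid):
    """Turn tabular rows into one nested, serializable questionnaire spec."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        "uqid": uqid,
        "items": [
            # Each item keeps its own identifier plus the construct(s) it measures.
            {"id": r["item_id"], "text": r["item_text"], "ucids": [r["ucid"]]}
            for r in rows
        ],
    }


soq = toq_to_soq(TOQ_CSV, "bfi10eng_7sp9mjx3")
print(json.dumps(soq, indent=2))
```

The point of the nested form is that every item carries its own identifier and UCIDs, so a machine never has to guess what a row in the spreadsheet meant.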


Because TOQs and SOQs contain a field to specify one or more Unique Construct Identifiers (UCIDs), it’s always clear what a questionnaire (and so, an item) measures.

This means it’s relatively easy to devise a convention, either for naming columns or for the codebook with metadata that accompanies a dataset, that makes datasets fully machine-readable. It then becomes possible to automatically determine, for every column in a dataset, which construct the data in that column pertain to.
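
As a rough sketch of what such a convention could look like, the Python snippet below assumes column names of the form `<UQID>__<item id>`. That convention is hypothetical (this thread does not settle on one), and so are the item identifiers and the placeholder UCIDs in the lookup table, which stands in for specifications loaded from a repository.

```python
# Hypothetical convention: "<UQID>__<item id>"; the thread does not fix one.
SEPARATOR = "__"

# Stand-in for specs loaded from a repository, keyed by UQID, then item id.
# The UCIDs here are placeholders, not real identifiers.
SPECS = {
    "bfi10eng_7sp9mjx3": {
        "item1": ["exampleConstruct_xxxxxxxx"],
        "item2": ["otherConstruct_yyyyyyyy"],
    }
}


def constructs_for_column(column, specs=SPECS):
    """Return the UCIDs a column measures, or [] if nothing can be resolved."""
    uqid, sep, item_id = column.partition(SEPARATOR)
    if not sep:
        # The column does not follow the naming convention at all.
        return []
    return specs.get(uqid, {}).get(item_id, [])


columns = ["participant", "bfi10eng_7sp9mjx3__item1", "bfi10eng_7sp9mjx3__item2"]
mapping = {col: constructs_for_column(col) for col in columns}
print(mapping)
```

Columns that do not follow the convention, or that point to an unknown questionnaire or item, simply resolve to an empty list, so a script can flag them instead of guessing.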


That makes it possible for psychological data to become FAIR. Interoperability (the I in FAIR) was always the challenging part, but it is now within reach.

This also enables accumulation of evidence. As Individual Participant Data meta-analyses are becoming more common, being able to automatically and reliably determine which constructs are represented how in a dataset can really help accelerate evidence synthesis.


Also, although I have so far only written an adapter for LimeSurvey (as it’s an excellent open source application for online studies), adapters can also be written for other software (such as {formr}, or even closed science software such as Qualtrics).

This means repositories can contain links to questionnaires in a variety of formats.

To use the LimeSurvey adapter, I created 💅LAFOQS🦊: the LimeSurvey Adapter For Open Questionnaire Specifications: https://measurement.shinyapps.io/LAFOQS/.


All in all, this is a puzzle piece that I think can really help us move forward. And it all works! There are two examples. One is the BFI-10, which has:

✅ a TOQ specification: https://docs.google.com/spreadsheets/d/1atGTOs9RGTISDxiSH0IfSH5fDPYaDhnJEdbXqWqDRto;

✅ a ❄️UQID: bfi10eng_7sp9mjx3;

✅ a SOQ specification: https://codeberg.org/measurement/operationalizations-website/src/branch/main/data/questionnaires/bfi10eng_7sp9mjx3.yml;

✅ a page in the repo: https://operationalizations.com/questionnaire/bfi10eng_7sp9mjx3;

✅ if you throw the UQID in 💅LAFOQS🦊, a LimeSurvey group that you can directly import 🤩
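
Both URLs in this list embed the UQID, which is what lets a machine go from an identifier to a specification. Assuming that pattern holds for other questionnaires too (it is inferred from this one example, not from a documented API), a small Python helper could look like this; `locations_for` is a made-up name.

```python
# Base URLs copied from the BFI-10 example; treating them as a general
# pattern is an assumption, not a documented interface.
PAGE_BASE = "https://operationalizations.com/questionnaire/"
SOQ_BASE = (
    "https://codeberg.org/measurement/operationalizations-website/"
    "src/branch/main/data/questionnaires/"
)


def locations_for(uqid):
    """Build the repository page URL and the SOQ file URL for a UQID."""
    return {"page": PAGE_BASE + uqid, "soq": SOQ_BASE + uqid + ".yml"}


links = locations_for("bfi10eng_7sp9mjx3")
print(links["page"])
print(links["soq"])
```

Note that the Codeberg URL points to the web view of the file; a script that wants the YAML itself would need the raw file location instead.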


And of course there’s the overview page with the questionnaires at https://operationalizations.com/questionnaires. Well, for now, with two questionnaires 😬

I still need to do a lot of work on all of this. For example, the repository doesn’t show the information very usefully yet. And of course we need to start adding questionnaires. But as a proof of concept I think this is pretty cool 🤩

So, a work week well spent I’d say. Time to celebrate! 🥳🍻

A photograph of Rik and GJ enjoying a beer.


PS: the beer symbolizes both this week in Leuven (or Louvain; and the beer is called Louve, which is close enough for barbarians like us 😬), and the 💅LAFOQS🦊 Shiny App, since the can shows a fox, and 💅LAFOQS🦊 is, obviously, also a fox 😬

(Ow, Rik found out that Louve is apparently the name for a female wolf 🐺 Ah well, close enough 😬)