
FAIR principles: sharing data for maximisation of results

What is this about?

The FAIR principles describe how data should ideally be stored and shared to maximise its usefulness and allow the whole research community to benefit from it. FAIR stands for “Findable, Accessible, Interoperable and Re-usable”. The main goal of the FAIR principles is to enable the "‘long term care’ of valuable digital assets" (1 p1), so that they can be reused in future research. The scope of these principles goes beyond ‘data’ in the conventional sense and includes all components of the research process, such as the algorithms and workflows that lead to the resulting data. FAIR data management thus supports both human-driven and machine-driven data discovery and exploitation.

Why is this important?

While conducting our research we (as researchers) produce different kinds of data: we write research designs and workflows, we collect raw data, we analyse them, and we write about them. However, most of the time the only part of our research that is publicly shared is the articles we produce at the end of the research cycle. In fact, every phase of research could be of interest to other researchers, who could reuse the data we produce in a different way.

By maximising access to and re-use of research data we can optimise the impact of the data that we produce. In line with the principles of Open Science, good data management becomes a fundamental instrument for promoting and facilitating the reusability, accessibility and exploitation of research data, thereby allowing for the generation of new knowledge (2).

In 2014, a workshop named ‘Jointly Designing a data FAIRPORT’ was organised in Leiden by a group of academic and private sector stakeholders. The aim of the workshop was to improve the infrastructure that supports humans and machines in the discovery and analysis of scientific data and their associated algorithms and workflows. The participants developed a set of guidelines to help data producers, scientists and data publishers take full advantage of the data being generated (1). The community agreed on four foundational principles that allow stakeholders to discover, integrate, re-use and adequately cite the massive quantities of information generated by contemporary data-intensive science.

These foundational principles were subsequently refined and elaborated by the FORCE 11 working group, which is still engaged in fostering the implementation and continued updating of the principles. They are known as the FAIR principles (Box 1).

Box 1


---------------------------------------------------------------------------------

Findable:

F1. (meta)data are assigned a globally unique and eternally persistent identifier.

F2. data are described with rich metadata.

F3. (meta)data are registered or indexed in a searchable resource.

F4. metadata specify the data identifier.

Accessible:

A1. (meta)data are retrievable by their identifier using a standardized communications protocol.

A1.1. the protocol is open, free, and universally implementable.

A1.2. the protocol allows for an authentication and authorization procedure, where necessary.

A2. metadata are accessible, even when the data are no longer available.

Interoperable:

I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

I2. (meta)data use vocabularies that follow FAIR principles.

I3. (meta)data include qualified references to other (meta)data.

Re-usable:

R1. (meta)data have a plurality of accurate and relevant attributes.

R1.1. (meta)data are released with a clear and accessible data usage license.

R1.2. (meta)data are associated with their provenance.

R1.3. (meta)data meet domain-relevant community standards (1 p4).

---------------------------------------------------------------------------------
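To make a few of the principles in Box 1 more concrete, the following is a minimal Python sketch of how a metadata record could be checked against them. It is an illustration only, not part of the FAIR principles themselves: the `MetadataRecord` class, its field names, the `missing_fair_fields` helper and the example DOI `10.1234/example-dataset` are all hypothetical. The only real-world details assumed are that DOIs can be resolved over HTTPS via `https://doi.org/` and that `10.1038/sdata.2016.18` is the DOI of reference 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Hypothetical metadata record; field names are illustrative, not a standard."""
    identifier: Optional[str] = None        # F1: persistent identifier of the record
    title: Optional[str] = None             # F2: descriptive metadata
    description: Optional[str] = None       # F2: descriptive metadata
    data_identifier: Optional[str] = None   # F4: identifier of the data described
    usage_license: Optional[str] = None     # R1.1: data usage license
    provenance: Optional[str] = None        # R1.2: where the data come from
    related: List[str] = field(default_factory=list)  # I3: references to other (meta)data


def missing_fair_fields(record: MetadataRecord) -> List[str]:
    """Return the Box 1 principles whose illustrative field is empty."""
    checks = {
        "F1": record.identifier,
        "F2": record.title and record.description,
        "F4": record.data_identifier,
        "I3": record.related,
        "R1.1": record.usage_license,
        "R1.2": record.provenance,
    }
    return [principle for principle, value in checks.items() if not value]


def doi_to_url(doi: str) -> str:
    """Build an HTTPS resolver URL for a DOI (A1: retrieval over an open protocol)."""
    return "https://doi.org/" + doi.strip()


# Example with a made-up dataset DOI; provenance and related records are left out.
record = MetadataRecord(
    identifier="10.1234/example-dataset",
    title="Example survey data",
    description="Anonymised responses from a hypothetical survey.",
    data_identifier="10.1234/example-dataset",
    usage_license="CC-BY-4.0",
)
print(missing_fair_fields(record))          # ['I3', 'R1.2']
print(doi_to_url("10.1038/sdata.2016.18"))  # https://doi.org/10.1038/sdata.2016.18
```

A check of this kind only tests whether fields are present. It cannot judge whether the metadata are rich, accurate or compliant with community standards, which the principles also require.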

Researchers at all levels, as well as society as a whole, can benefit from the implementation of FAIR data management. Granting access to data produced throughout the entire research cycle is also important on a global scale, since it allows researchers who work in countries with less developed research infrastructures to benefit from other scientists’ work (3).

For whom is this important?

Research subjects, Researchers, Policy makers

What are the best practices?

The European Commission decided to run a pilot under Horizon 2020, the Open Research Data Pilot (ORD pilot), which aims to improve and maximise access to and re-use of research data generated by Horizon 2020 projects. This initiative supports and requires the application of the FAIR principles within H2020 research projects, and thereby strives to maximise the output and outreach of publicly funded research.

References

1. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3:160018 doi: 10.1038/sdata.2016.18 (2016).

2. Vicente-Saez, R. & Martinez-Fuentes, C. Open Science now: A systematic literature review for an integrated definition. Journal of Business Research 88:428-436 (2018).

3. The FAIR Data Principles (FORCE 11 discussion forum) [consulted on 02/01/2018]. Available at: https://www.force11.org/group/fairgroup/fairprinciples

Giulia Inguaggiato contributed to this theme.

Latest contribution was June 2, 2019