The FAIR principles for data management, first published in 2016, are more important than ever in supporting modern scientific R&D. Today's scientists need to search, share, and use large and diverse datasets across disciplines, locations, and time zones. Ensuring that data are Findable, Accessible, Interoperable, and Reusable can help organizations operate more efficiently, collaborate with ease, and even innovate more readily using advanced analytics, including artificial intelligence and machine learning.
Examples of FAIR data repositories include:
GenBank: NIH's genetic sequence database, an annotated collection of all publicly available DNA sequences
Open PHACTS: Pharmacological data-sharing platform to support drug discovery
Worldwide Protein Data Bank: Publicly available data on the 3D structures of proteins, nucleic acids, and complex assemblies
But bench science generates vast quantities of data that do not fit into such repositories, yet are just as critical to the scientific process. How and where should these data be stored, and what are the steps for making the data FAIR?
How to Go FAIR
FAIR standards can be gradually adopted across a company and can even be applied retroactively to existing data. The GO FAIR initiative recommends following these steps to complete the FAIRification process for your organization:
Retrieve non-FAIR data. Identify and gain access to data that does not meet FAIR standards.
Analyze retrieved data. Inspect the data to determine its structure, the concepts represented, and the relationships between various data elements.
Define the semantic model. Using ontological and naming conventions, create or adopt a conceptual model that defines data hierarchies, relationships, and constraints. Data should include computer-readable terms, descriptions, and other attributes.
Make data linkable. Depending on the data type, technologies such as Semantic Web and Linked Data can be used to facilitate integration with other data and systems.
Assign license. While FAIR data need not be open-access, groups should consider what license to assign to their data if they intend to make it shareable and reusable.
Define metadata for the dataset. This crucial step supports all four FAIR principles.
Deploy FAIRified data. Publish the FAIR data together with its metadata and license so that the metadata can be indexed by search engines, even if accessing the data itself still requires authentication and authorization.
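To make the later steps more concrete, here is a minimal sketch in Python of what a machine-readable dataset description with an identifier and a license might look like. It borrows property names from the public schema.org Dataset vocabulary and writes them as a JSON-LD style record. The function name, the example.org identifier, the dataset details, and the choice of a Creative Commons license are all assumptions made for illustration; they are not part of the GO FAIR guidance or of any particular product.

```python
import json


def build_dataset_record(name, description, identifier, license_url,
                         keywords, variables):
    """Return a JSON-LD style record using schema.org Dataset terms.

    Hypothetical helper for illustration only.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,      # ideally a persistent identifier
        "license": license_url,        # tells others how the data may be reused
        "keywords": list(keywords),
        "variableMeasured": list(variables),
    }


# All values below are made up for the example.
record = build_dataset_record(
    name="Dose-response assay, plate 7",
    description="Absorbance readings for a hypothetical compound series.",
    identifier="https://example.org/datasets/plate-7",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["dose-response", "absorbance"],
    variables=["concentration_uM", "absorbance_450nm"],
)

# Serialize to JSON so that other systems and indexers can read the record.
print(json.dumps(record, indent=2))
```

A record like this can be published alongside the data it describes, so that the description stays discoverable even when the data sits behind access controls.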
Prism and FAIR Data Principles
Prism 10 introduces a new, more open file format intended to help organizations connect scientific workflows and support collaboration and interdisciplinary research. Scientists can reuse data collected in earlier experiments or analyses, which promotes transparency and supports better-informed decisions about future experiments. With its focus on interoperability and reusability, the new file format helps Prism customers embrace FAIR data principles. Let's examine some elements of this.
Metadata: Incorporating rich and relevant metadata, which defines a dataset's context, quality, condition, and characteristics, is central to achieving each of the four FAIR principles. Metadata should be descriptive, providing enough information for identification and discovery, as well as structural details about how the data is organized. It should also contain administrative details for future reference, including the technical source, data type, quality, and creation process. Finally, metadata can store statistical information, such as data distribution and outliers. Prism 10's new file format stores raw data, analysis parameters, analysis results, graphs, and layouts.
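As a rough illustration of the four kinds of metadata described above (descriptive, structural, administrative, and statistical), the Python sketch below assembles them for a small set of readings. It is not a representation of the Prism file format. The field names, the sample values, and the use of the common 1.5 × IQR rule to flag outliers are all assumptions chosen for the example.

```python
import statistics


def summarize(values):
    """Statistical metadata: a basic summary plus outliers by the 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "outliers": [v for v in values if v < low or v > high],
    }


# Hypothetical readings; the last value is deliberately unusual.
readings = [4.1, 3.9, 4.0, 4.2, 4.0, 9.5]

metadata = {
    # Descriptive: enough to identify and discover the dataset.
    "descriptive": {"title": "Enzyme activity, run 12",
                    "keywords": ["enzyme", "activity"]},
    # Structural: how the data is organized.
    "structural": {"columns": ["activity_units"], "rows": len(readings)},
    # Administrative: source, type, and creation details.
    "administrative": {"source": "plate reader export",
                       "data_type": "float",
                       "created": "2024-01-15"},
    # Statistical: distribution and outliers.
    "statistical": summarize(readings),
}

print(metadata["statistical"]["outliers"])
```

Keeping the four groups separate makes it easier to see which parts support discovery and which support later reuse.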
Compatibility Mode: Prism's new Compatibility Mode is a tool built to help streamline the transition to FAIR data. With options to convert files or use older file formats with Prism's new features, Compatibility Mode helps ensure that access to older work is never lost.
Authorization and identification: Under the FAIR principles, data is accessible when it can be retrieved by its identifier using a standardized protocol, which may include authentication and authorization steps; that doesn't necessarily mean the data is publicly or freely available. Prism provides tools for making data accessible while remaining secure. Persistent identifiers, which are long-lasting references to a digital entity such as a digital object identifier (DOI) or persistent uniform resource locator (PURL), help ensure data can be located.
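To show how a persistent identifier helps locate a resource, the short Python sketch below turns a DOI into a link through the public doi.org resolver. The function name and the simplified pattern used to check the DOI are assumptions made for the example; the pattern is a rough heuristic rather than a full validation of DOI syntax. The sample DOI is that of the 2016 Scientific Data article on the FAIR Guiding Principles.

```python
import re
from urllib.parse import quote

# Simplified heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Build a resolver link for a DOI (hypothetical helper, no network access)."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the "/" separators.
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_to_url("10.1038/sdata.2016.18"))
```

Because the resolver redirects to wherever the resource currently lives, the link can stay stable even if the hosting location changes.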
Seamless integrations: Prism 10's integration with Dotmatics supports interoperability throughout the scientific process, including for data registration, discovery, search, visualization, and management.
Measuring Data FAIRness
When implementing the FAIR guidelines, it's important to use established metrics to measure the level of FAIRness achieved. GO FAIR recommends that organizations self-assess their FAIRness level using the following common principles:
All applications adhere to FAIR principles, as outlined in the 2016 Scientific Data article, "The FAIR Guiding Principles for scientific data management and stewardship."
Various parties throughout the data-exchange network support the shift to implementing FAIR guidelines.
Data can be read and used by machines, unless this is currently impossible for a given data type.
Data and metadata are as open as possible and only as closed as necessary.
FAIR infrastructures are as distributed as possible and only as centralized as necessary.
Steps are taken to avoid using a single service supplier.
A three-way approach is used that includes metadata publication, a metadata schema (to increase the data's interoperability), and a FAIR implementation profile with details on vocabularies, licenses, access controls, and more.
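One simple way to record a self-assessment against the principles above is a checklist. The Python sketch below tallies yes/no answers and lists the gaps. The key names, the example answers, and the tally itself are assumptions made for illustration; this is not an official GO FAIR metric or scoring scheme.

```python
def fairness_summary(answers):
    """Tally a yes/no checklist and list the items that are not yet met."""
    unmet = [item for item, met in answers.items() if not met]
    return {
        "met": len(answers) - len(unmet),
        "total": len(answers),
        "unmet": unmet,
    }


# Hypothetical answers for one organization, one entry per principle above.
answers = {
    "applications_adhere_to_fair_principles": True,
    "partners_support_shift_to_fair": True,
    "data_is_machine_readable": False,
    "open_as_possible_closed_as_necessary": True,
    "infrastructure_is_distributed": False,
    "no_single_service_supplier": True,
    "three_way_approach_in_place": False,
}

summary = fairness_summary(answers)
print(f"{summary['met']} of {summary['total']} principles met")
for item in summary["unmet"]:
    print("still to address:", item)
```

A plain tally like this treats every principle as equally important, which may not match an organization's priorities, so it is best read as a starting point for discussion rather than a score.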