In today’s research environment, data are generated at an astounding rate, much faster than they can be cataloged and made available to other researchers.
A best practice is to use persistent identifiers (PIDs) to label digital data products; however, PIDs have historically been used inconsistently and unpredictably. This can make it difficult for researchers to find shared data at all, let alone in a format they can understand and trust.
So, how can you make data FAIR (findable, accessible, interoperable, and reusable)? The Enhanced Robust Persistent Identification of Data (ERPID) project, funded by the National Science Foundation, set out to answer this question.
“The ERPID testbed, which runs on the Jetstream cloud, assigns PIDs to digital objects and maps them to metadata. Robust metadata is what helps researchers discover data when they don’t know the identifiers,” said Robert Quick, principal investigator on the project. “The testbed also has a data type registry that assigns a standard type such as integer, string, et cetera, to the data. This is central to making data machine actionable—that is, we’re creating a consistent structure so that data can be found and used.”
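As a rough illustration only, the idea Quick describes (assign a PID to a digital object, attach typed metadata, and let researchers find the object through that metadata) can be sketched in a few lines of Python. This is not the ERPID code or its API: `PidStore`, `PidRecord`, `TYPE_REGISTRY`, the `example` prefix, and the example URL are all hypothetical names chosen for the sketch, and an in-memory dictionary stands in for the real services.

```python
import uuid
from dataclasses import dataclass, field

# Hypothetical stand-in for a data type registry: type name -> Python type.
TYPE_REGISTRY = {"integer": int, "string": str}


@dataclass
class PidRecord:
    pid: str
    location: str  # where the digital object lives
    metadata: dict = field(default_factory=dict)  # key -> (type_name, value)


class PidStore:
    """Toy in-memory registry that maps PIDs to typed metadata (assumption)."""

    def __init__(self, prefix="example"):
        self.prefix = prefix
        self.records = {}

    def register(self, location, metadata):
        # Check each value against its registered type before accepting it.
        for key, (type_name, value) in metadata.items():
            if type_name not in TYPE_REGISTRY:
                raise KeyError(f"unregistered type: {type_name}")
            if not isinstance(value, TYPE_REGISTRY[type_name]):
                raise TypeError(f"{key} is not of type {type_name}")
        pid = f"{self.prefix}/{uuid.uuid4()}"
        self.records[pid] = PidRecord(pid, location, dict(metadata))
        return pid

    def resolve(self, pid):
        # Known identifier: go straight to the record.
        return self.records[pid]

    def search(self, key, value):
        # Unknown identifier: discover objects through their metadata.
        return [
            record.pid
            for record in self.records.values()
            if key in record.metadata and record.metadata[key][1] == value
        ]


store = PidStore()
pid = store.register(
    "https://example.org/data/rice.csv",  # hypothetical location
    {"species": ("string", "Oryza sativa"), "year": ("integer", 2019)},
)
matches = store.search("species", "Oryza sativa")
```

Here `resolve` covers the case where the identifier is already known, while `search` covers discovery through metadata when it is not. The type check in `register` is a simplified picture of what a type registry adds: every value arrives with a declared, machine-checkable type.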
The project team also developed a service that maps existing data repository schemas to the digital object architecture (DOA). Rather than replacing existing data identifiers, this mapping enhances the findability of previously generated data. The testbed was then integrated with real-world workflows in rice genomics, weather data, and computational chemistry. The project created a fully realized PID-centric data management infrastructure that allows FAIR principles to be applied to existing and future data repositories, with the flexibility to work across diverse scientific disciplines.
“Just a few low-overhead data services can create the cyberinfrastructure needed for real-world application of FAIR data principles,” said Quick. “ERPID adds value by creating data that are more easily discoverable, useful, and—ultimately—reusable by the research community.”