Selecting a Data Repository for your Research Data
September 1, 2021
Data repositories provide storage space where researchers can deposit research data and, at the same time, allow potential users to find, access, and possibly reuse the data. Increasingly, funders and journal publishers expect researchers to deposit the datasets underpinning their publications in an open data repository. Doing so makes your data more discoverable and accessible to others, extending its impact beyond your research project.
To decide which repository to deposit your research data in, consider the nature of your research data, the features of the data repository and, of course, the requirements of your funders and/or journal publishers.
Data repository requirements from the funders/journal publishers
Funders or journal publishers may have specific requirements on where you should deposit your data, so it is essential to know these requirements before you shortlist data repositories. For example, some accept only a data repository that can assign a persistent identifier, such as a Digital Object Identifier (DOI), to your data. If this is the case, then Harvard Dataverse, Dryad, and Zenodo are suitable repositories, as they assign a DOI to the data deposited in them.
Choosing a repository commonly used by researchers in your research area will help improve the discoverability of your shared data, especially among your peers. For example, if you are a software engineer, you may wish to deposit your source code in GitHub, which is popular among software engineers in the IT discipline.
Reputation of the repository
It is important to choose a dependable repository that you can rely on to preserve your valuable research data for a long time. Looking up a repository in established directories is a good starting point. For example:
- Re3data: A global registry of research data repositories that covers research data repositories from different academic disciplines.
- FAIRsharing.org: A searchable portal with both in-house and crowd-sourced descriptions of standards, data repositories, and data policies.
- OAD’s Data repositories: A list of data repositories grouped by 15 disciplines with over 20 multidisciplinary repositories.
Additional features of the repository
Each repository offers its own features, so consider which of them are important for your needs. For example, some data repositories like Harvard Dataverse, Dryad, and Figshare allow you to generate a private, randomised URL that allows for a double-blind download of the dataset during the peer review process of your related manuscript.
Also pay attention to the size limit, cost structure, whether an embargo period is allowed, licensing terms, and so on, as these directly affect the data you can deposit and your ability to share it. For example, all data in Dryad is released into the public domain under the terms of a Creative Commons Zero (CC0) waiver. If this is not something your journal allows, then you will have to reconsider depositing your data in Dryad.
If you are still not sure how to select a suitable repository for your research data, feel free to talk to your Faculty Librarian for help, or visit our guide to learn more.