About Data & Tool Sharing

***THIS PAGE IS PLANNED FOR RELEASE 2 (NOT YET SCHEDULED).***

Why share data and tools?

Data and tool sharing is central to open science and a key component of the Complement-ARIE program. When you share your data and tools, you show your support for the future of science: its openness, its reproducibility, and its longevity. Responsible and open sharing lets you demonstrate the rigor and reliability of your work. It invites others to review, reproduce, and reuse your materials, potentially advancing new discoveries. In biomedicine, sharing data and tools can lead to new treatments, therapies, and even cures, improving patient outcomes. Data and tool sharing also contributes to the public good, because it can build trust in science and increase access to scientific knowledge.

Beyond these philosophical reasons, data and tool sharing is also required by many funding organizations, including the U.S. National Institutes of Health (NIH), as well as by other governing bodies, foundations, and journals.

What is data and tool sharing?

In general, data and tool sharing means making data and tools (including experimental models, computational models, and other research resources) available to others in a responsible way. This can include the following additional steps:

  • providing information about the data and tools, such as abstracts, code, and protocols

  • embargoing data and tools for a specified period

  • de-identifying data

  • ensuring access controls are in place for sensitive data

  • adding information about the data and tools (called metadata or annotations) to enable querying and discovery
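Two of the steps above, de-identifying data and adding metadata for discovery, can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the NAMHub's or Synapse's actual process: the field names, the `deidentify` and `find` helpers, and the annotation keys are all hypothetical. Replacing an ID with a salted hash is a simple form of pseudonymization; real de-identification usually requires more than this.

```python
import hashlib

# Hypothetical set of direct identifiers; a real project defines its own.
DIRECT_IDENTIFIERS = {"name", "email", "date_of_birth"}


def deidentify(record, salt):
    """Drop direct identifiers and replace the participant ID with a salted hash."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    raw = (salt + str(record["participant_id"])).encode("utf-8")
    cleaned["participant_id"] = hashlib.sha256(raw).hexdigest()[:12]
    return cleaned


def find(datasets, **criteria):
    """Return the datasets whose annotations match every given key/value pair."""
    return [
        d for d in datasets
        if all(d["annotations"].get(k) == v for k, v in criteria.items())
    ]


# Usage with made-up values.
record = {"participant_id": 17, "name": "A. Person", "age_group": "40-49"}
shared_record = deidentify(record, salt="project-secret")

datasets = [
    {"name": "a", "annotations": {"assay": "rnaSeq", "species": "human"}},
    {"name": "b", "annotations": {"assay": "imaging", "species": "human"}},
]
imaging_sets = find(datasets, assay="imaging")
```

The more consistently annotations are applied across datasets, the more useful this kind of key/value query becomes.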

These days, many researchers share data and tools via online repositories equipped with features such as file or code storage, versioning, annotation tools, security protocols, and search functionality. Well-known scientific data repositories include GEO, cBioPortal, and figshare; commonly used tool repositories include GitHub, Bioconductor, and Hugging Face. At Sage, data and tools are stored in, or linked from, a platform called Synapse and made explorable through portals such as the NAMHub.

How do I share my data?

Data sharing via the NAMHub consists of the following steps:

  1. Contact NDHCC. First, start the conversation by contacting the NAMHub team at the NYU-Sage NDHCC (namhub@sagebase.org). You’ll need to be set up as a contributor and have a Data Sharing Agreement in place before moving on to the next steps.

  2. Prepare data. Before submitting data to the NAMHub, review the Contributing Data page of this docs site. Once you are familiar with the requirements and instructions, you can gather all supplemental information and ensure you’ve met the data sharing requirements.

  3. Deposit data. This step includes uploading and annotating your data, as well as providing any supplementary information needed to understand and curate the data. This step may also include data quality checks and metadata validation.

  4. Determine data access controls. We use the term data governance to refer to the practice of determining how data should be shared. This stage encompasses data licensing, as well as deciding how the data should be accessed and by whom. Datasets on the NAMHub can be either unrestricted (open to the public) or restricted.

  5. Share data. After the above steps are complete, you’re finally ready to share your data with others, whether fully open or with restrictions. This step may occur some time after the above processes, such as once a publication embargo period lifts.

  6. Access data. Once your data is shared, it is accessible to others, within the limits of any access restrictions. Because of the steps above, the data will be FAIR: findable, accessible, interoperable, and reusable. FAIR data is discoverable through precise metadata, clear about how it can be used, machine-readable to enable computational analysis, and, ultimately, fit for reuse.
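As a rough illustration of how metadata validation (step 3), access controls (step 4), and an embargo (step 5) fit together, here is a minimal Python sketch. It does not describe the NAMHub's implementation: the required fields, the `embargo_until` key, and the helper functions are hypothetical names chosen for this example. Only the two access levels, unrestricted and restricted, come from the steps above.

```python
from datetime import date

# Hypothetical required fields; the real requirements are on the Contributing Data page.
REQUIRED_FIELDS = ("title", "description", "license", "access")
ACCESS_LEVELS = ("unrestricted", "restricted")


def validate_metadata(metadata):
    """Return a list of problems; an empty list means all checks passed."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if not metadata.get(f)]
    access = metadata.get("access")
    if access and access not in ACCESS_LEVELS:
        problems.append(f"unknown access level: {access}")
    return problems


def is_released(metadata, today):
    """A dataset is released once any embargo date has been reached."""
    embargo = metadata.get("embargo_until")
    return embargo is None or today >= embargo


def can_access(metadata, user, today, approved_users=frozenset()):
    """Unrestricted data is open to everyone; restricted data requires approval."""
    if not is_released(metadata, today):
        return False
    if metadata["access"] == "unrestricted":
        return True
    return user in approved_users


# Usage with made-up values.
meta = {
    "title": "Example dataset",
    "description": "Illustrative record only",
    "license": "CC-BY-4.0",
    "access": "restricted",
    "embargo_until": date(2030, 1, 1),
}
problems = validate_metadata(meta)
allowed = can_access(meta, "ana", date(2030, 6, 1), approved_users={"ana"})
```

Keeping validation separate from the access decision means a dataset can be checked at deposit time, well before any embargo lifts.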

How do I share my tools?

The NAMHub will also be the central portal for sharing tools (e.g., computational models) developed as part of the Complement-ARIE program. Please stay tuned for more information!