Preparing your Data Package for Submission

Prepare your data package for submission by creating a "shareable data package"

Once you have finished creating your data package locally, you will need to prepare your data package for submission to a repository, by creating a "shareable data package" or packages. A shareable data package includes the subset of study files that you intend to share, while excluding study files that you do NOT intend to share.

There are several different "flavors" or types of shareable data package that allow you to share study files in different ways (e.g., open access, managed access, right away, or under embargo until a specific date). If you want to share different sets of study files in different ways, you may want to create more than one shareable data package, each of a different flavor.

This section walks through:

  • the different ways you may want to share your data, and why
  • the flavors or types of shareable data package that support each way of sharing
  • how to determine which study files should go into a shareable data package of a given flavor
  • how to finalize your shareable data package or packages for submission to a data repository

If you already know what flavor of shareable data package you want to create, and whether you will create more than one shareable data package, use these links to head directly to step-by-step instructions for how to create each flavor of shareable data package:

Otherwise, keep reading for more information on how to decide.

Why you might want to share your data in different ways

Depending on your specific study situation, you may not want to share all your files in a public data repository regardless of how protected they may be or the timing of sharing (e.g., if you collected human subjects data, you would not want to share fully identified versions of this data publicly). Importantly, these "unshareable" study files should never be included in a shareable data package of any type.

Leaving these "unshareable" study files aside, you still may not want to share all study files in the same way or at the same time:

  • Managed Access: Your study may have generated sensitive data (e.g., identified human subjects data which you subsequently deidentified, with some level of disclosure risk remaining). In this case, you may want to:
    • share the sensitive data as managed access
    • while sharing study data that are not sensitive (e.g., non-human subject data) or non-data supporting files such as data dictionaries, protocols, and code as open access.
  • Open Access: Your study may have generated only non-sensitive data (e.g., non-human subjects data). In this case, you may want to:
    • share all study data and non-data supporting files such as data dictionaries, protocols, and code as open access.
  • At a later date: Your study group may not yet have published the study results and/or secured any intellectual property rights applicable to the study, and you may want to wait to share until after reaching these milestones. In this case, you may want to:
    • share certain data, results, or other study documents under embargo, so that they are shared at a later date
    • while perhaps sharing other study files (e.g., certain subsets of data that you will not use to publish, certain subsets of data that do not contain intellectual property, data dictionaries, protocols, and code) without an embargo, so that they are shared now.
  • Now: Your study group may have already published on the data and/or secured any intellectual property rights applicable to the study. In this case, you may want to:
    • share all study data and non-data supporting files such as data dictionaries, protocols, and code without an embargo, so that they are shared now.

How and when you want to share all or some of your study files will determine the approach you take to preparing your data package for submission, including what flavor or type of shareable data package you will create, and whether you will create more than one type of shareable data package.

Shareable data package "flavors" and when to create them

Flavor Description
open-access-now
  • A type of shareable data package that only includes study files that you want to share right away (i.e., no embargo or embargo has already expired) and that you have indicated will be shared as open access
  • Share this shareable data package at your data repository as open access without embargo restrictions
  • Data repository users will be able to see/use these files right away with a low barrier
  • Create if: You have at least some study files that you want to share as open access and right away (no embargo)
managed-access-now
  • A type of shareable data package that only includes study files that you want to share right away (i.e., no embargo or embargo has already expired) and that you have indicated will be shared as either open access or managed access
  • Share this shareable data package at your data repository as managed access without embargo restrictions
  • Data repository users will be able to begin the process of requesting access right away, and will be able to see/use these files after subsequently completing the (relatively high barrier) process of requesting access in accordance with repository policies
  • Create if: You have at least some study files that you want to share as managed access and right away (no embargo)
open-access-by-date
  • A type of shareable data package that includes study files that you want to share right away as well as study files that you will share once a specific date has passed (i.e., under embargo until a specific date in the future) and that you have indicated will be shared as open access
  • Share this shareable data package at your data repository as open access with embargo restrictions
  • Data repository users will be able to see/use these files once the embargo date has passed with a low barrier
  • Create if: You have at least some study files that you want to share as open access but only after a milestone date has been reached (under embargo)
managed-access-by-date
  • A type of shareable data package that includes study files that you want to share right away as well as study files that you will share once a specific date has passed (i.e., under embargo until a specific date in the future) and that you have indicated will be shared either as open access or managed access
  • Share this shareable data package at your data repository as managed access with embargo restrictions
  • Data repository users will be able to begin the process of requesting access once the embargo date passes, and will be able to see/use these files after subsequently completing the (relatively high barrier) process of requesting access in accordance with repository policies
  • Create if: You have at least some study files that you want to share as managed access but only after a milestone date has been reached (under embargo)
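
The flavor descriptions above can be read as two yes/no rules per flavor: whether managed access files are allowed, and whether files still under embargo are allowed. The following is a minimal sketch of that reading, not an official tool. The names `FLAVORS` and `flavors_including` are hypothetical, and treating a file as embargoed up to and including its access date is an assumption.

```python
from datetime import date
from typing import List, Optional

# (flavor name, allows managed access files, allows files still under embargo)
FLAVORS = [
    ("open-access-now", False, False),
    ("managed-access-now", True, False),
    ("open-access-by-date", False, True),
    ("managed-access-by-date", True, True),
]


def flavors_including(access: str, embargo_until: Optional[date], today: date) -> List[str]:
    """Return the flavors whose description admits a file.

    `access` is the eventual sharing mode: "open access" or "managed access".
    `embargo_until` is None when the file has no embargo.
    """
    # Assumption: the embargo lasts through the access date itself.
    embargoed = embargo_until is not None and embargo_until >= today
    result = []
    for name, allows_managed, allows_embargoed in FLAVORS:
        if access == "managed access" and not allows_managed:
            continue
        if embargoed and not allows_embargoed:
            continue
        result.append(name)
    return result
```

Under this reading, an open access file with no embargo is admitted by all four flavors, while a managed access file under embargo is admitted only by managed-access-by-date.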

When to create more than one shareable data package

Depending on how and when you are sharing your data, you may want to create more than one "flavor" of shareable data package for your data.

Situations in which you may want to create and share multiple shareable data packages of different "flavors" at your data repository include:

You want to share some files as open access and some as managed access

Because it is easiest and fastest for secondary users to get access to an open-access-now (versus managed-access-now) shareable data package, creating and sharing both an open-access-now and managed-access-now shareable data package maximizes access to your valuable study files.

Potential secondary users can access the open-access-now version right away, and this can inform their work plans as they work through the process of gaining authorization to access the managed-access-now version of the shareable data package.

You want to share some files right away and others after you've published or secured intellectual property rights

Secondary users can get access to an open-access-now shareable data package faster than to an open-access-by-date one, because for the latter the milestone or "by date" date, after which the files become available, has not yet passed. Creating and sharing both an open-access-now and an open-access-by-date shareable data package allows researchers to access some of your data and supporting files now, maximizing access to your valuable study files.

Create your shareable data package - Overview

Creating a shareable data package starts with your local data package, which includes study files plus standard data package metadata.

A shareable data package is essentially a "filtered" version of your local data package, including only the subset of study files you intend to share, along with all shareable standard data package metadata to provide context, understandability, and navigation to secondary users.

Local data package: Make a copy of your local data package directory structure

Study files: Filter in a copy of only the study files you've indicated you want to share

  • While creating the standard data package metadata for your local data package, you should have created a Resource Tracker listing all study files, or the subset of files most relevant to the data or results you are sharing. For each study file/resource listed, you should have filled out the "access" and "access date" fields (Resource Tracker schema)
  • Use the values listed for each resource in the "access" and "access date" fields in your local data package's Resource Tracker to determine which study files should be included in your shareable data package. The files that should be included will depend on the shareable data package flavor
  • For a file to be considered for inclusion in a shareable data package of any flavor, all of the following must be true:
    • the file is listed as a resource in the Resource Tracker
    • the "access" field for that study file/resource is complete and is not set to "permanent private"
    • if the "access" field has a value of "temporary private", the "access date" field (which specifies when the file will become shareable) is complete, and the "access" field also has a second value of either "open access" or "managed access" (the way you want to share the file once it is no longer temporary private)
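
As a rough illustration, the eligibility conditions above could be checked per Resource Tracker row as sketched below. The row representation is an assumption made for the example: a dict whose `"access"` key holds a list of values and whose `"access_date"` key holds a string. The function name `is_shareable_candidate` is hypothetical, and the real Resource Tracker schema may name and type these fields differently.

```python
def is_shareable_candidate(resource: dict) -> bool:
    """Check whether a Resource Tracker row can be considered for any flavor.

    Assumes the row is already listed in the Resource Tracker, that
    resource["access"] is a list of strings, and that
    resource["access_date"] is a string (empty or missing if not set).
    """
    access = resource.get("access") or []
    if not access:
        return False  # "access" field is incomplete
    if "permanent private" in access:
        return False  # never shareable
    if "temporary private" in access:
        has_date = bool(resource.get("access_date"))
        has_mode = any(v in ("open access", "managed access") for v in access)
        return has_date and has_mode
    return True
```

Passing this check only makes a file a candidate. Whether it goes into a particular shareable data package still depends on the flavor being created.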

Standard data package metadata: Standard data package metadata include an Experiment Tracker, a Resource Tracker, one Data Dictionary per tabular data file, and one Results Tracker per publication. Filter in shareable standard data package metadata as below:

  • A copy of the Resource Tracker and Experiment Tracker should be included by default in any shareable data package created.
  • A copy of Results Tracker(s) and Data Dictionary(ies) should only be included in a shareable data package if they are considered shareable based on the flavor of shareable data package you are creating and the "access" and "access date" field values set for these files in the local data package Resource Tracker.
  • All local file paths to study files/resources listed in the Resource Tracker or any Results Tracker to be included in a shareable data package should be converted to relative file paths in the modified Tracker versions included in the shareable data package. This protects the privacy of local computer systems and makes navigation of study files easier for secondary users.
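
The path conversion described in the last bullet might look like the sketch below. It assumes POSIX-style paths and that every listed file sits under the data package root. The helper name `to_relative` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def to_relative(path: str, package_root: str) -> str:
    """Convert an absolute local path to a path relative to the package root.

    Raises ValueError if `path` is not located under `package_root`.
    """
    return PurePosixPath(path).relative_to(PurePosixPath(package_root)).as_posix()


# Example: the user name and home directory no longer appear in the result.
relative = to_relative("/home/alice/my-study/data/raw.csv", "/home/alice/my-study")
```

Using `PurePosixPath` keeps the example independent of the operating system it runs on. For real files on a local machine, `pathlib.Path` offers the same `relative_to` method.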

Finalize your shareable data package for submission - Overview

Finalizing your shareable data package for submission starts with your shareable data package(s).

Review Shareable Data Package: Review your shareable data package(s) to confirm that you are sharing all study and standard data package metadata files you intend to share and that you are not sharing any files you do not intend to share.

Create Accessory Files: Accessory files you'll submit alongside your shareable data package(s) include an "overview" Resource Tracker, and a readme file. While your shareable data package(s) will be zip archives and not transparent to secondary users until they download and extract, these accessory files will be submitted as single files that may be viewed/downloaded easily on their own. They are intended to orient potential secondary users to what has been shared in each shareable data package, to provide context that may help potential secondary users to understand if what has been shared will be useful for their purposes, and (if more than one shareable data package has been shared) which shareable data package(s) they may want to access.

  • "Overview" Resource Tracker: A copy of the Resource Tracker from your shareable data package with one added column per shareable data package you will share, indicating whether or not each file in the Resource Tracker is shared in that shareable data package.
  • Readme file: A text file that provides a high-level orientation as to what a shareable data package is, what shareable data packages have been shared, and how to use the "overview" Resource Tracker to understand which study files have been shared in which shareable data package. This file also contains links back to data packaging and standard data package metadata specifications to help secondary users use standard data package metadata to navigate, understand, use, and replicate shared data and results in the shareable data package(s).
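
One way to picture the "overview" Resource Tracker is as the tracker rows plus one true/false column per shareable data package. The sketch below assumes rows are dicts with a `"path"` key and that you already know which paths went into each package. The function name `build_overview` and the `shared_in_` column prefix are hypothetical, not part of any specification.

```python
from typing import Dict, List, Set


def build_overview(tracker_rows: List[dict], packages: Dict[str, Set[str]]) -> List[dict]:
    """Add one boolean column per shareable data package to each tracker row.

    `packages` maps a package name (e.g. its flavor) to the set of file
    paths included in that package.
    """
    overview = []
    for row in tracker_rows:
        new_row = dict(row)  # copy so the original tracker rows are unchanged
        for package_name, shared_paths in packages.items():
            new_row[f"shared_in_{package_name}"] = row["path"] in shared_paths
        overview.append(new_row)
    return overview
```

A secondary user can then scan a single row to see which package or packages they would need to request in order to obtain a given file.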

Zip Shareable Data Package: Zip up your shareable data package(s). Zipping up your shareable data packages creates a single zip file per shareable data package and provides file size compression. You'll submit just one zip file per shareable data package.
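
If you prefer to script the zipping step, a minimal sketch using Python's standard library is shown below. The function name `zip_package` is hypothetical, and placing the archive next to the package directory is a choice made for the example.

```python
import shutil
from pathlib import Path


def zip_package(package_dir: str) -> str:
    """Zip one shareable data package directory into a single .zip file.

    The archive is written next to the directory (e.g. "pkg" -> "pkg.zip")
    and keeps the package directory as its top-level folder.
    """
    package = Path(package_dir).resolve()
    return shutil.make_archive(
        str(package),                  # archive path without the ".zip" suffix
        "zip",
        root_dir=str(package.parent),  # paths in the archive are relative to this
        base_dir=package.name,         # only this directory is archived
    )
```

Call the function once per shareable data package so that each package ends up in its own zip file.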

Submit Shareable Data Package and Accessory Files: Submit your zipped up shareable data package(s), along with your "overview" Resource Tracker and readme accessory files, at the data repository you've selected. The full inventory of items you will submit, and the access conditions under which to share each at your repository, is:

  • Zipped up shareable data package(s): share each of your zipped up shareable data packages at your data repository with access conditions based on the shareable data package flavor
  • "Overview" Resource Tracker: share at your data repository as open access without embargo restrictions
  • Readme file: share at your data repository as open access without embargo restrictions

Next steps

Decide: Decide what type or flavor of shareable data package(s) it makes sense for your study group to create. It may help to review the table of shareable data package flavors here.

Step-by-step instructions: Then, use these links to head directly to step-by-step instructions for how to create each flavor of shareable data package.