Early in Study, Results-Support Orientation, Low Level of Data Sharing Resources
Data Packaging Timeline
What to do right away
Set yourself up for success
- Review all study files/resources already produced by or for your study
- Decide on file organization and naming conventions for your study folders and files now. Consider applying HEAL recommendations for file organization and naming. Apply these conventions to existing study files and to any future files and folders as they are created.
- Where practicable to implement (without duplicating original files), organize all study files/resources into a single study folder/directory (study folder/directory may of course have sub-directories; see here for guidance on and examples of recommended study folder/directory structure)
- Store all study files/resources in a location where the person(s) who will be creating/contributing to your data package documentation can access them all at the same time. For example, files can be located on different network drives as long as all of those drives can be mounted and accessed at the same time by the person documenting. Files CANNOT be located on two different local computer drives, even if the person documenting can access both computers separately.
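One way to confirm the "all at the same time" requirement is to check, from the documenting person's machine, that every study file path resolves in a single session. The sketch below is only an illustration: the function name `unreachable_files` is hypothetical and not part of any HEAL tooling, and it assumes you keep a simple list of the paths to your study files.

```python
from pathlib import Path
from typing import Iterable, List


def unreachable_files(paths: Iterable[str]) -> List[str]:
    """Return the paths that cannot be reached from this machine right now.

    An empty result suggests every listed file is accessible in one session,
    for example because all required network drives are mounted.
    """
    return [p for p in paths if not Path(p).exists()]
```

Run it with the full list of study file paths; any path it returns is one the person documenting would not be able to open alongside the others.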
A note on copying study files
These guidelines suggest adjustments you can make before you start documenting that will make documentation easier (e.g., applying naming conventions and ensuring files are accessible to the person documenting). They do not mean that you should copy your study files into new or existing folders to group all "to share" files together.
All documentation should be completed based on original files. Creating copies of your files for documentation can introduce inconsistencies in your final package (e.g., if you edit the original file but not the copy, the file documented and likely shared will not be the most up-to-date version).
The only exception to this rule: After you have completely finished documenting your study files at their local (or network) paths, you will copy the files that you intend to share to finish preparing your data package for submission.
What to do when your manuscript is finalized
Initialize your Data Package
- Create a "dsc-pkg" folder/directory that will hold all Standard Data Package Metadata Files for your data package
- If all study files/resources are organized into a single study folder/directory, create "dsc-pkg" as a direct sub-directory of that study folder/directory. Keeping the name and location of this folder consistent relative to your overall study folder/directory makes it easy to recognize as the folder that contains the Standard Data Package Metadata Files for your study's data package.
- If all study files/resources are NOT organized into a single study folder/directory, create the folder in a disk location that makes sense for you. Name it "dsc-pkg", optionally appending a suffix that is a human-recognizable identifier for the relevant study (e.g. "dsc-pkg-study-1" or "dsc-pkg-mindfulness-for-oud"). Keeping the "dsc-pkg" prefix and adding a study-specific suffix makes the folder easy to recognize as the one that contains the Standard Data Package Metadata Files for that study's data package.
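The naming rule above can be sketched in a few lines of Python. This is a minimal illustration, not an official HEAL tool: the function names `dsc_pkg_path` and `init_dsc_pkg` and the `study_suffix` parameter are hypothetical, and the parent folder is whatever location you have chosen.

```python
from pathlib import Path
from typing import Optional


def dsc_pkg_path(parent: Path, study_suffix: Optional[str] = None) -> Path:
    """Build the folder path: "dsc-pkg", or "dsc-pkg-<suffix>" if a suffix is given."""
    name = f"dsc-pkg-{study_suffix}" if study_suffix else "dsc-pkg"
    return parent / name


def init_dsc_pkg(parent: Path, study_suffix: Optional[str] = None) -> Path:
    """Create the folder if it does not exist yet and return its path."""
    folder = dsc_pkg_path(parent, study_suffix)
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For a study kept in a single folder, `init_dsc_pkg(Path("my-study"))` would create `my-study/dsc-pkg`. For scattered files, `init_dsc_pkg(Path("some-location"), "study-1")` would create `some-location/dsc-pkg-study-1`.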
Make a list of shared results and contributing experiments
- Make a list of the results (e.g. figure, figure panel, table, text statement) produced by your study that are shared in the manuscript
- Make a list of the full set of study experiments/activities that produced supporting data or other support for any of the results shared in the manuscript
Start your Experiment Tracker
- Start your Experiment Tracker by initializing an empty Experiment Tracker file based on the Experiment Tracker csv template
- Save your Experiment Tracker in your "dsc-pkg" folder as "heal-csv-experiment-tracker.csv"
- Each row in your Experiment Tracker will represent a study experiment/activity
- Use the Experiment Tracker schema to understand what each "question"/field in the Experiment Tracker means and how to "answer"/complete each "question"/field
- Add all study experiments/activities that produced supporting data or other support for any of the results shared in your manuscript to your Experiment Tracker
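If you have downloaded the Experiment Tracker csv template, initializing the tracker can be as simple as copying that template into your "dsc-pkg" folder under the expected name. The sketch below assumes this workflow. The function name `init_experiment_tracker` and the template location are hypothetical, and the copy is skipped if a tracker already exists so that completed rows are never overwritten.

```python
import shutil
from pathlib import Path

EXPERIMENT_TRACKER_NAME = "heal-csv-experiment-tracker.csv"


def init_experiment_tracker(template_csv: Path, dsc_pkg: Path) -> Path:
    """Copy the template into dsc-pkg unless a tracker is already there."""
    target = dsc_pkg / EXPERIMENT_TRACKER_NAME
    if not target.exists():
        # Create the folder if needed, then start from the empty template.
        dsc_pkg.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(template_csv, target)
    return target
```

The same pattern would apply to the other trackers, with the file name changed accordingly.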
Start your Results Tracker(s)
- Start your Results Tracker by initializing an empty Results Tracker file based on the Results Tracker csv template
- Create a separate Results Tracker for each manuscript, if you have more than one
- Save your Results Tracker in your "dsc-pkg" folder as "heal-csv-results-tracker-manuscript-1.csv" (i.e. filename prefix is "heal-csv-results-tracker-", followed by the filename of your manuscript file)
- Each row in your Results Tracker will represent a single result shared in the associated manuscript (i.e. each row in the Results Tracker called "heal-csv-results-tracker-manuscript-1.csv" will represent a result shared in the manuscript called "manuscript-1")
- Use the Results Tracker schema to understand what each "question"/field in the Results Tracker means and how to "answer"/complete each "question"/field
- The Results Tracker will ask you to list associated files/dependencies for each study result shared in the manuscript (i.e. files that are required to interpret, replicate, or use the result)
- Add all study results shared in your manuscript to your Results Tracker
- Finalize the Results Tracker for the completed manuscript
- Check that all results shared in the manuscript are listed in the associated Results Tracker, and add any that are missing
- Check that associated files/dependencies and figure/table numbers for all results shared in the manuscript are accurately reflected in the manuscript's associated Results Tracker, and correct any that need to be updated
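The naming rule and the completeness check above can be sketched as two small helpers. This is an illustration under stated assumptions: the function names are hypothetical, the manuscript's file extension is assumed to be dropped when building the tracker name (as in the "manuscript-1" example), and results are represented by whatever short labels you use in your own list (the labels in the usage note are made up).

```python
from pathlib import Path
from typing import Iterable, List


def results_tracker_name(manuscript_file: str) -> str:
    """Prefix "heal-csv-results-tracker-" plus the manuscript file name (extension dropped)."""
    return f"heal-csv-results-tracker-{Path(manuscript_file).stem}.csv"


def missing_results(manuscript_results: Iterable[str],
                    tracked_results: Iterable[str]) -> List[str]:
    """Return results shared in the manuscript that are not yet in the tracker."""
    tracked = set(tracked_results)
    return [r for r in manuscript_results if r not in tracked]
```

For example, `results_tracker_name("manuscript-1.docx")` gives `"heal-csv-results-tracker-manuscript-1.csv"`, and `missing_results(["figure-1", "table-1"], ["figure-1"])` gives `["table-1"]`, the result that still needs a row.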
Start your Data Dictionary(ies)
- If any associated file/dependency for any of the shared results is a tabular data file, create a Data Dictionary for each such file
- Start a Data Dictionary for a tabular data file by initializing an empty Data Dictionary file based on the Data Dictionary csv template
- Save your Data Dictionary in your "dsc-pkg" folder as "heal-csv-dd-my-datafile.csv" (i.e. the file name starts with the prefix "heal-csv-dd-", followed by the name of the data file to which the Data Dictionary applies, and the file is saved as a csv file)
- Each row in your Data Dictionary will represent a variable that is collected/populated in your tabular data file
- Use the Data Dictionary schema to understand what each "question"/field in the Data Dictionary means and how to "answer"/complete each "question"/field
- Add all variables in the tabular data file to your Data Dictionary
- Add a Data Dictionary for each tabular data file
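As a starting point for a Data Dictionary, it can help to derive its file name and pull the variable names out of the tabular data file. The sketch below is a rough aid only. It assumes the data file is a csv whose first row holds the variable names, the function names are hypothetical, and the column names in the usage note are invented. It does not fill in any Data Dictionary fields, which still need to be completed by following the Data Dictionary schema.

```python
import csv
import io
from pathlib import Path
from typing import List, TextIO


def data_dictionary_name(data_file: str) -> str:
    """Prefix "heal-csv-dd-" plus the data file name (extension dropped), saved as csv."""
    return f"heal-csv-dd-{Path(data_file).stem}.csv"


def list_variables(csv_stream: TextIO) -> List[str]:
    """Read the header row of a csv stream; each name needs one Data Dictionary row."""
    reader = csv.reader(csv_stream)
    return next(reader, [])
```

For example, `data_dictionary_name("my-datafile.csv")` gives `"heal-csv-dd-my-datafile.csv"`, and `list_variables(io.StringIO("participant_id,age\n1,34\n"))` gives `["participant_id", "age"]`. For a file on disk, pass `open(path, newline="")` instead of the `StringIO` object.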
Start your Resource Tracker
- Start your Resource Tracker by initializing an empty Resource Tracker file based on the Resource Tracker csv template
- Save your Resource Tracker in your "dsc-pkg" folder as "heal-csv-resource-tracker.csv"
- Each row in your Resource Tracker will represent a study file/resource that you have annotated. Study files include data and non-data supporting files, including HEAL-formatted data dictionaries you may have created
- Use the Resource Tracker schema to understand what each "question"/field in the Resource Tracker means and how to "answer"/complete each "question"/field
- The Resource Tracker will ask you to list associated files/dependencies for each study file/resource (i.e. files that are required to interpret, replicate, or use the study file/resource; for tabular data files, this will include a data dictionary for the file)
Add items to your Resource Tracker
- First, add your manuscript and its associated Results Tracker to your Resource Tracker.
- For each shared result in your Results Tracker, add any items you have listed as associated files/dependencies as resources to your Resource Tracker ONLY if that file will be shared in a public repository.
- Remember that whenever you add a tabular data file to your Resource Tracker, you should first create a data dictionary. Then, add the tabular file to the Resource Tracker and list the data dictionary you have created as a dependency.
- Finally, any file that has been added as an associated file/dependency of a resource in the Resource Tracker should also be added as a resource ONLY if it will be shared in a public repository.
- Repeat this process with the associated files/dependencies of these files until you're listing files without any dependencies.
- If there are any other study files that you will share in a public repository that have not already been added to the Resource Tracker, add them to the Resource Tracker now
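The "repeat until there are no more dependencies" process above amounts to walking a dependency graph. The sketch below shows one reading of it, under stated assumptions: the function name, the `dependencies` mapping, and the `shared_publicly` set are hypothetical structures you would build from your own trackers, and dependencies are only followed for files that are themselves added as resources.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def resources_to_add(start_files: Iterable[str],
                     dependencies: Dict[str, List[str]],
                     shared_publicly: Set[str]) -> List[str]:
    """Walk dependencies breadth-first, keeping only publicly shared files."""
    to_add: List[str] = []
    seen: Set[str] = set()
    queue = deque(start_files)
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already handled; also guards against circular dependencies
        seen.add(current)
        if current in shared_publicly:
            to_add.append(current)
            # Only files added as resources have their dependencies listed.
            queue.extend(dependencies.get(current, []))
    return to_add


# Hypothetical example: file names below are invented for illustration.
start = ["manuscript-1.pdf",
         "heal-csv-results-tracker-manuscript-1.csv",
         "figure-1-data.csv"]
deps = {
    "figure-1-data.csv": ["heal-csv-dd-figure-1-data.csv", "raw-data.csv"],
    "raw-data.csv": ["heal-csv-dd-raw-data.csv"],
}
shared = {"manuscript-1.pdf",
          "heal-csv-results-tracker-manuscript-1.csv",
          "figure-1-data.csv",
          "heal-csv-dd-figure-1-data.csv"}
example = resources_to_add(start, deps, shared)
```

In this example `raw-data.csv` will not be shared, so neither it nor its data dictionary is added. The walk stops once every remaining file has no further dependencies.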
Congratulations! You have finished preparing your data package locally.
You can now prepare your data package for submission to a public repository.
Tip to confirm your local data package is ready for submission
Before you share your data package, it may be useful to ask someone else in your study group to review your data package and annotations, thinking about whether they would be accessible and understandable to a researcher looking at the data package for the first time.
If they are able to easily understand the study structure and how they might go about replicating your dataset/results, that is a good sign that your data package has the necessary resources and adequate detail to be shared.