Thea Atwood, Librarian & Data Specialist @ UMass Amherst
tpatwood@umass.edu || tpatwood.com
This presentation: www.tpatwood.com/20170619_ddrig.html
This presentation as a pdf: www.tpatwood.com/20170619_ddrig.html?print-pdf (then ctrl/cmd+p)
DMPs are a way to help:
Ultimately, a DMP should be a useful starting point in documenting your work.
NSF DMPs are limited to two pages.
Use other, more robust documentation if you want to document more detail.
Some directorates deviate from the guidance in the Grant Proposal Guide (GPG).
See the NSF's page on disseminating and sharing research results for more information.
What you'll be collecting...
e.g., text, spreadsheets, images, interviews, physical collections, software, code, curriculum materials,...
If using pre-existing data, document where that's coming from.
... and how much
e.g., 5 GB total, 50 interviews at 500 MB each, 400 spreadsheets averaging 100 KB each...
Important in demonstrating you've thought ahead, and your requested resources match your data.
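A rough storage estimate is easy to script. The sketch below totals the interview and spreadsheet figures from the example above; the function name, the item list layout, and the use of decimal units (1 GB = 1000 MB) are assumptions made for illustration, not part of any NSF guidance.

```python
# Decimal units are an assumption here: 1 GB = 1,000 MB = 1,000,000 KB.
UNITS_PER_GB = {"KB": 1_000_000, "MB": 1_000, "GB": 1}


def estimate_total_gb(items):
    """Return the total size in GB for (label, count, size, unit) entries."""
    return sum(count * size / UNITS_PER_GB[unit] for _label, count, size, unit in items)


# Counts and sizes taken from the example figures in the slide text.
planned = [
    ("interviews", 50, 500, "MB"),
    ("spreadsheets", 400, 100, "KB"),
]

total_gb = estimate_total_gb(planned)
print(f"Estimated total: {total_gb:.2f} GB")
```

Stating the arithmetic this way makes it easy to re-run the estimate if the number of interviews or files changes during planning.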
Document the formats your work will take.
Strive to avoid obsolescence -- work in file formats that are as open, well documented, and non-proprietary as possible. Alternatively, plan to convert your files to open formats by the end of your project.
List of open file formats from Wikipedia at: https://en.wikipedia.org/wiki/List_of_open_formats
It is increasingly evident that good metadata is required for re-use of your work.
Metadata is the contextual information that helps others who want to build on your work understand exactly how to do so.
Metadata examples:
Look for an appropriate standard for your discipline, or use the metadata list compiled by the Research Data Alliance or the Digital Curation Centre.
If one doesn't exist for your discipline, Dublin Core and DataCite are good places to start.
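As a starting point, a Dublin Core record can be as simple as a set of named fields saved alongside the data. The sketch below lists the 15 elements of the Dublin Core Metadata Element Set and reports which ones a draft record has not filled in yet. All values in the record are hypothetical placeholders, and the helper name is an assumption for illustration. Dublin Core itself treats every element as optional, so the check is a prompt for the author rather than a validation rule.

```python
import json

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def missing_elements(record):
    """Return the Dublin Core elements that are absent or empty in record."""
    return [name for name in DUBLIN_CORE_ELEMENTS if not record.get(name)]


# Hypothetical placeholder values; replace them with your project's details.
record = {
    "title": "Example interview transcripts",
    "creator": "Example Researcher",
    "description": "Placeholder description of how the data were collected.",
    "date": "2017",
    "format": "text/plain",
    "rights": "CC BY 4.0",
}

print(json.dumps(record, indent=2))
print("Not yet filled in:", ", ".join(missing_elements(record)))
```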
Specifically: how will data be accessible to your team while your project is active? This section is also concerned with appropriate protection of privacy, confidentiality, and other rights requirements.
Include information on where data is stored, who has access to data or instrumentation, and how access will be protected.
Please read and understand any ToS you agree to. For example:
When you upload, submit, store, send or receive content to or through Google Drive, you give Google a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our services, and to develop new ones. This license continues even if you stop using our services unless you delete your content. Make sure you have the necessary rights to grant us this license for any content that you submit to Google Drive.
How will data be made available for others to re-use or build on after you have completed your project?
Where possible, try to find an appropriate data repository. Increasingly the standard for sharing data, data repositories have numerous benefits over storing data locally or on a website.
Licensing is becoming an important component, and may be required by funders.
e.g., Creative Commons, GNU GPL, MIT
Discipline-specific, general-purpose, or institutional repositories (like ScholarWorks @ UMass Amherst!) are all great ways to share. Your discipline may have other methods that are valid, too (e.g., GitHub); just note that not all repositories are created equal (e.g., GitHub is NOT for long-term storage).
Ideally, your data will be linked with the publication it underlies.
Archiving and preserving are the final step -- when you have published your work, and your data is in its final, shareable form, it is time to store your data for long-term preservation.
Archiving and preservation are time- and resource-intensive activities. It takes time and people to maintain servers, run checksums, migrate to new formats, and otherwise ensure the integrity of data for the long term. Look to an appropriate repository -- with appropriately redundant backups -- to meet this need.
This is also the point when, if you haven't already, you'll see the benefits of good documentation and metadata!
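To give a sense of what "running checksums" involves, here is a minimal sketch using Python's standard hashlib module. It records a SHA-256 checksum for every file in a folder, then later reports any file whose contents no longer match. The function names and the choice of SHA-256 are assumptions for illustration; a repository will have its own tooling for this.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its checksum."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(directory, manifest):
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(directory)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

A typical pattern would be to call build_manifest when the data are finalized, store the result with the data, and run changed_files on a schedule; an empty list means every recorded file still matches its original checksum.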
Activities that might help:
Policy
UMass Amherst's Report on Data Ownership, Retention, & Access
Office of Science & Tech. Policy's Increasing Access to the Results of Federally Funded Research
Data Management Best Practices
DMP Guidance
DMPTool (also has example DMPs)
Data repositories
re3data.org - find a discipline-specific repository
Questions? Don't hesitate to get in touch!