Writing your Data Management Plan

Thea Atwood, Librarian & Data Specialist @ UMass Amherst

tpatwood@umass.edu || tpatwood.com

This presentation: www.tpatwood.com/20170619_ddrig.html

This presentation as a pdf: www.tpatwood.com/20170619_ddrig.html?print-pdf (then ctrl/cmd+p)

Today:

  • The purpose of the data management plan
  • Dissecting the data management plan
  • Other resources

From the organization's perspective:

DMPs are a way to help:

  • Better document data
  • Reduce duplication
  • Reduce waste and loss (funds, time, data)
  • Protect fragile digital data
  • Thoughtfully preserve and curate work, especially unique observations

DMPs help you:

  • Articulate what data you'll be collecting
  • Document how you'll organize, protect, make accessible, and share that data
  • Monitor your work: you are expected to be able to explain and defend your results, and the DMP can help you stay organized and in compliance

Ultimately, a DMP should be a useful starting point in documenting your work.

NSF DMPs are limited to two pages.

Use other, more robust documentation if you want to record more detail.

Dissecting the DMP

  • determine sponsor requirements
  • types of data and materials collected
  • standards for data and metadata format and content
  • policies for access
  • re-use, re-distribution, and derivatives
  • archiving and preserving access to data, samples, and other research products

determine sponsor requirements

Some NSF directorates deviate from the guidance in the Grant Proposal Guide (GPG).

See the NSF's page on disseminating and sharing research results for more information.

types of data and materials collected

What you'll be collecting...

e.g., text, spreadsheets, images, interviews, physical collections, software, code, curriculum materials,...

If using pre-existing data, document where that's coming from.

types of data and materials collected

... and how much

e.g., 5 GB total, 50 interviews at 500MB each, 400 spreadsheets which will on average be 100KB...

This is important for demonstrating that you've thought ahead and that your requested resources match your data.
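A quick back-of-the-envelope calculation can help with the "how much" question. The sketch below is only an illustration: the inventory reuses the interview and spreadsheet figures from the examples above, while the names `PLANNED_DATA` and `estimate_total_gb`, the 1 GB = 1000 MB convention, and the number of backup copies are assumptions made for this example.

```python
# Hypothetical inventory: (description, file count, average size in MB).
# Counts and sizes reuse the slide's examples; 1 GB is taken as 1000 MB.
PLANNED_DATA = [
    ("interviews", 50, 500.0),
    ("spreadsheets", 400, 0.1),  # roughly 100 KB each
]


def estimate_total_gb(inventory, copies=1):
    """Return the total size in GB when every file is kept `copies` times."""
    total_mb = sum(count * avg_mb for _, count, avg_mb in inventory)
    return total_mb * copies / 1000


if __name__ == "__main__":
    print(f"Raw data: {estimate_total_gb(PLANNED_DATA):.2f} GB")
    # Assumption for illustration: the original plus two backup copies.
    print(f"With backups: {estimate_total_gb(PLANNED_DATA, copies=3):.2f} GB")
```

Including backup copies in the estimate makes it easier to check that the storage you request matches the data you expect to hold.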

standards for data and metadata

Document the formats your work will take.

Strive to avoid obsolescence -- work in file formats that are as open, well documented, and non-proprietary as possible. Or, be able to convert to open file formats by the end of your project.

standards for data and metadata

List of open file formats from Wikipedia at: https://en.wikipedia.org/wiki/List_of_open_formats
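One way to act on this is to scan your file list for formats you plan to convert before the project ends. The following is a minimal sketch, not an authoritative registry: the mapping `OPEN_ALTERNATIVES` and the function `flag_for_conversion` are hypothetical, and the handful of extensions shown were chosen only for illustration.

```python
from pathlib import Path

# Illustrative and incomplete mapping (an assumption for this sketch):
# proprietary file extension -> a commonly used open alternative.
OPEN_ALTERNATIVES = {
    ".xls": ".csv",
    ".doc": ".odt",
    ".sav": ".csv",
    ".psd": ".png",
}


def flag_for_conversion(filenames):
    """Return {filename: suggested open extension} for listed extensions."""
    flagged = {}
    for name in filenames:
        suffix = Path(name).suffix.lower()  # compare case-insensitively
        if suffix in OPEN_ALTERNATIVES:
            flagged[name] = OPEN_ALTERNATIVES[suffix]
    return flagged


if __name__ == "__main__":
    files = ["survey.XLS", "notes.txt", "scan.png"]
    for name, target in flag_for_conversion(files).items():
        print(f"{name}: consider converting to {target}")
```

The suggested targets are starting points only; the right open format depends on what your discipline and your chosen repository accept.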

standards for data and metadata

It is increasingly evident that good metadata is required for re-use of your work.

Metadata is the contextual information that helps others who want to build on your work understand exactly how to do so.

standards for data and metadata

Metadata examples:

standards for data and metadata

Look for an appropriate standard for your discipline, or consult the lists of metadata standards compiled by the Research Data Alliance or the Digital Curation Centre.

If one doesn't exist for your discipline, Dublin Core and DataCite are good places to start.
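As a rough illustration of what a Dublin Core description might look like, the sketch below builds a small record using the 15 element names of the Dublin Core Metadata Element Set and prints it as JSON. The helper `build_record` and every value in the example record are hypothetical, and storing the record as JSON is an assumption made for this example, not a requirement of the standard.

```python
import json

# The 15 element names of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_record(**fields):
    """Return a dict of Dublin Core fields, rejecting unknown element names."""
    unknown = set(fields) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    # Keep the elements in the standard's order for readability.
    return {name: fields[name] for name in DUBLIN_CORE_ELEMENTS if name in fields}


# Hypothetical values, for illustration only.
record = build_record(
    title="Example interview transcripts",
    creator="Researcher, Example",
    date="2017-06-19",
    format="text/plain",
    rights="CC BY 4.0",
)

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
```

Even a short record like this, kept alongside the data files, gives later users the basic context they need.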

policies for access and sharing

Specifically: how will data be accessible to your team while your project is active? This section also covers appropriate protection of privacy, confidentiality, and other rights requirements.

Include information on where data is stored, who has access to data or instrumentation, and how access will be protected.

Side note - Terms of Service and Use

Please read and understand any ToS you agree to. For example:

When you upload, submit, store, send or receive content to or through Google Drive, you give Google a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our services, and to develop new ones. This license continues even if you stop using our services unless you delete your content. Make sure you have the necessary rights to grant us this license for any content that you submit to Google Drive.

re-use, re-distribution, and derivatives

How will data be made available for others to re-use or build on after you have completed your project?

Where possible, try to find an appropriate data repository. Increasingly the standard for sharing data, data repositories have numerous benefits over storing data locally or on a website.

re-use, re-distribution, and derivatives

Licensing is becoming an important component, and may be required by funders.

e.g., Creative Commons, GNU GPL, MIT

re-use, re-distribution, and derivatives

Discipline-specific, general-purpose, or institutional repositories (like ScholarWorks @ UMass Amherst!) are all great ways to share. Your discipline may have other valid methods, too (e.g., GitHub). Just note that not all repositories are created equal (e.g., GitHub is NOT for long-term storage).

Ideally, your data will be linked with the publication it underlies.

archiving and preserving

Archiving and preserving are the final step -- when you have published your work, and your data is in its final, shareable form, it is time to store your data for long-term preservation.

archiving and preserving

Archiving and preservation are time- and resource-intensive activities. It takes time and people to maintain servers, run checksums, migrate to new formats, and otherwise ensure the integrity of data for the long term. Look to an appropriate repository -- with appropriately redundant backups -- to meet this need.

This is also the point when, if you haven't already, you'll see the benefits of good documentation and metadata!
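To give a sense of what "running checksums" involves, here is a minimal sketch that records a SHA-256 checksum for every file in a folder and later reports which files no longer match. The function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical, and the choice of SHA-256 is an assumption for this example; a repository will typically have its own tooling for this.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to `directory`) to its checksum."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(directory, manifest):
    """Return files from `manifest` that are now missing or have changed."""
    current = build_manifest(directory)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

A typical use would be to call `build_manifest` when the data reaches its final form, store the result with the data, and run `changed_files` periodically to detect corruption or accidental edits.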

Recap: Dissecting the DMP

  • determine sponsor requirements
  • types of data and materials collected
  • standards for data and metadata format and content
  • policies for access
  • re-use, re-distribution, and derivatives
  • archiving and preserving access to data, samples, and other research products

Other resources

Local help from Research Data Services @ UMA!

Other resources

Activities that might help:

  • Concept map
  • Workflow map
  • Talking about data management with others
  • Thinking of the end and planning from there

Other resources

Policy

UMass Amherst's Report on Data Ownership, Retention, & Access

Office of Science & Tech. Policy's Increasing Access to the Results of Federally Funded Research

Other resources

Data Management Best Practices

DataONE Best Practices

MANTRA

Other resources

DMP Guidance

DMPTool (also has example DMPs)

DMP examples from RIO Journal

Other resources

Data repositories

ScholarWorks @ UMA

re3data.org - find a discipline-specific repository

Thanks!

Questions? Don't hesitate to get in touch!

tpatwood@umass.edu