Data Management and Sharing Plans

This guide will take you through each of the elements you need to consider when creating a data management and sharing plan.

Managing Your Files

What is file management, and why does it matter for your research data management process? This short video introduces file management and explains why it matters.

File Naming Conventions

There are no hard-and-fast rules for creating a file organization structure, but planning it in advance with your research team will save you time once your research is underway. As you (and your collaborators) create a file naming convention, here are some general rules to follow:

  • Be consistent 
  • Keep it short: fewer than 32 characters
  • Don't repeat the file format in the file name itself, since the extension already shows it - ex. instead of codingdoc.docx, use codingmanual.docx
  • Avoid using special characters
  • Do not use spaces, because not all programs can read them. Instead, use underscores (coding_manual), capital letters (CodingManual), or dashes (coding-manual)
  • For dates, use the YYYYMMDD format (the international standard, ISO 8601)
  • Avoid generic file names
  • Avoid acronyms that others won't easily understand. If you do use them, explain them in your ReadMe.txt or Codebook
  • Include version numbers in your file names
    • To help your files display in sequential order, consider using leading zeros (ex: 001-999)
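The rules above can be checked automatically. The following is a minimal Python sketch, not a definitive tool. The function names, the set of allowed characters, the list of generic names, and the "_v" version suffix are all illustrative assumptions, and the length limit is assumed to apply to the name without its extension.

```python
import re
from pathlib import PurePath

# "Fewer than 32 characters"; assumed here to apply to the name without its extension.
MAX_STEM_LENGTH = 31
# Assumed safe character set: letters, digits, underscores, and dashes.
ALLOWED_STEM = re.compile(r"[A-Za-z0-9_-]+")
# Hypothetical examples of generic names; adjust for your own project.
GENERIC_STEMS = {"data", "document", "file", "final", "new", "untitled"}


def check_file_name(file_name: str) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    stem = PurePath(file_name).stem  # file name without its extension
    problems = []
    if len(stem) > MAX_STEM_LENGTH:
        problems.append("too long")
    if " " in stem:
        problems.append("contains spaces")
    # Spaces are reported separately, so ignore them when checking other characters.
    if not ALLOWED_STEM.fullmatch(stem.replace(" ", "")):
        problems.append("contains special characters")
    if stem.lower() in GENERIC_STEMS:
        problems.append("generic name")
    return problems


def versioned_name(base: str, version: int, extension: str, width: int = 3) -> str:
    """Append a zero-padded version number so files sort in sequential order."""
    return f"{base}_v{version:0{width}d}.{extension}"


print(check_file_name("coding manual.docx"))
print(versioned_name("coding_manual", 7, "docx"))
```

A check like this could be run over a project folder before sharing files, so that naming problems are caught while they are still easy to fix.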

Once you've established the file naming convention, create a ReadMe.txt or Codebook that describes each element of the convention, so that those working on the project can follow it and future users of your data know how to read the file names.

  • File Name Convention - ex: [Date]_[Location]_[DataType]
  • Date - ex: Date sample taken in YYYYMMDD format
  • Location - ex: Location where the sample was taken, DEN - Denver, ANB - Ann Arbor, etc.
  • Data Type - ex: Type of data sample, given as a meaningful but generic description
  • File Name Example - 20220425_DEN_Survey.pdf
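As an illustration, the example convention above could be applied and read back with a short Python sketch like the one below. The function names and the dictionary of location codes are assumptions made for this example. Only the [Date]_[Location]_[DataType] pattern and the DEN and ANB codes come from the guide.

```python
from datetime import date, datetime

# Location codes taken from the example above; extend for your own project.
LOCATION_CODES = {"DEN": "Denver", "ANB": "Ann Arbor"}


def build_file_name(sample_date: date, location: str, data_type: str, extension: str) -> str:
    """Build a name following [Date]_[Location]_[DataType] with a YYYYMMDD date."""
    if location not in LOCATION_CODES:
        raise ValueError(f"unknown location code: {location}")
    return f"{sample_date:%Y%m%d}_{location}_{data_type}.{extension}"


def parse_file_name(file_name: str) -> dict:
    """Split a file name back into the elements documented in the ReadMe."""
    stem, _, extension = file_name.rpartition(".")
    date_text, location, data_type = stem.split("_", 2)
    return {
        "date": datetime.strptime(date_text, "%Y%m%d").date(),
        "location": LOCATION_CODES[location],
        "data_type": data_type,
        "extension": extension,
    }


name = build_file_name(date(2022, 4, 25), "DEN", "Survey", "pdf")
print(name)
print(parse_file_name(name))
```

Generating names from a function like this, rather than typing them by hand, is one way to keep every collaborator consistent with the documented convention.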

Caltech Library has a File Naming Convention Worksheet that walks you through the process of creating a file naming convention for a group of files. If you aren't sure how to start, this is a handy resource to use.

Sustainable File Formats

As technology changes, researchers should plan for both hardware and software obsolescence and consider the longevity of their file format choices to ensure long-term readability and access.

File formats more likely to be accessible in the future have the following characteristics:

  • Non-proprietary
  • Open, documented standard
  • Common usage by research community
  • Standard representation (ASCII, Unicode)
  • Unencrypted
  • Uncompressed

Examples of preferred file format choices include:

  • Image: JPEG, JPEG 2000, PNG, TIFF
  • Text: plain text (TXT), HTML, XML, PDF/A
  • Audio: AIFF, WAVE
  • Containers: TAR, GZIP, ZIP
  • Databases: XML or CSV

Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format. If you deposit your data in a repository, your files may be migrated to newer formats so that they remain usable by future researchers.
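To find files that may be worth migrating, a project folder's file names can be compared against the preferred formats listed above. The Python sketch below is one hedged way to do that. The function name and the mapping from formats to file extensions are assumptions made for illustration, and an extension alone cannot confirm details such as whether a PDF is actually PDF/A.

```python
from pathlib import PurePath

# Assumed extensions for the preferred formats listed above.
PREFERRED_EXTENSIONS = {
    ".jpg", ".jpeg", ".jp2", ".png", ".tif", ".tiff",  # images
    ".txt", ".html", ".htm", ".xml", ".pdf",           # text (PDF/A cannot be verified by extension)
    ".aif", ".aiff", ".wav",                           # audio
    ".tar", ".gz", ".zip",                             # containers
    ".csv",                                            # tabular data
}


def files_to_review(file_names: list[str]) -> list[str]:
    """Return the files whose extension is not in the preferred list."""
    return [
        name
        for name in file_names
        if PurePath(name).suffix.lower() not in PREFERRED_EXTENSIONS
    ]


print(files_to_review(["20220425_DEN_Survey.pdf", "interviews.sav", "notes.TXT"]))
```

Files flagged this way are only candidates for review. Whether to migrate them depends on the formats your research community and your chosen repository support.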