Data documentation, also known as metadata, helps you understand your data in detail and helps other researchers find, use, and properly cite it.
Various metadata standards are available for particular file formats and disciplines. A ReadMe.txt file, a Codebook, or Coding Manual should be created to accompany your data files. As you collect or create your data, you want to capture the following information:
Data dictionaries, ReadMe.txt files, and Codebooks are all good ways of documenting your data. ReadMe.txt files provide information about your data files and help ensure that anyone using those files can interpret them correctly. Data dictionaries are often used to describe each element of your dataset, such as the variable names and values in your spreadsheets. Codebooks are more detailed than data dictionaries: they might include the information found in a ReadMe.txt file, describe the elements of your dataset, and document the instruments used to gather the data (surveys, interview questions).
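As an illustration of what a data dictionary can look like in practice, the following is a minimal sketch in Python that drafts one from a spreadsheet exported as CSV. The function name `build_data_dictionary`, the entry fields (`variable`, `non_empty`, `examples`, `description`), and the sample data are all hypothetical choices made for this example, not part of any metadata standard. The description of each variable is left blank on purpose, because only the researcher can supply it.

```python
import csv
import io


def build_data_dictionary(csv_text, max_examples=3):
    """Draft one data dictionary entry per column of a CSV table.

    Each entry records the variable name, how many rows have a non-empty
    value, and a few distinct example values. The layout is an assumption
    made for this sketch, not a formal standard.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    entries = {
        name: {"variable": name, "non_empty": 0, "examples": []}
        for name in (reader.fieldnames or [])
    }
    for row in reader:
        for name, value in row.items():
            # Skip cells outside the header, missing cells, and blank cells.
            if name not in entries or value is None or value.strip() == "":
                continue
            entry = entries[name]
            entry["non_empty"] += 1
            if value not in entry["examples"] and len(entry["examples"]) < max_examples:
                entry["examples"].append(value)
    # The description has to be written by a person who knows the data.
    return [dict(entry, description="") for entry in entries.values()]


# Hypothetical sample data, used only to show the shape of the result.
sample = (
    "participant_id,age,group\n"
    "P001,34,control\n"
    "P002,,treatment\n"
    "P003,29,control\n"
)
dictionary = build_data_dictionary(sample)
for entry in dictionary:
    print(entry)
```

A draft like this is only a starting point: units, allowed values, and the meaning of any codes or abbreviations still need to be added by hand before the data dictionary is useful to someone else.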
The following resources provide additional information on how to create these documents:
Data organization includes having a consistent folder and file structure, using sustainable file formats, and following an established file naming convention. If you need a refresher on file naming conventions and sustainable file formats, please visit the previous page: Data File Naming & Management.
When organizing your data, use a consistent file structure so that you'll always be able to find your files. Your ReadMe.txt file should record the file structure you decide on for your project in addition to your other data documentation (file name conventions, abbreviations, variables, etc.). The ReadMe.txt file should be located at the very top of your file structure hierarchy so it is easy to locate.
Create separate folders for your raw data, processed data, code and outputs, and documentation to avoid confusion. All file names should follow the file naming convention you have established.
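The folder layout described above can be set up with a short script. The following Python sketch creates the folders and writes a ReadMe.txt at the top level that records the structure. The folder names in `FOLDERS` and the function name `create_project_skeleton` are assumptions made for this example (here code and outputs are kept in separate folders); substitute the names and naming convention your project has established.

```python
import tempfile
from pathlib import Path

# Hypothetical folder names; replace them with your project's own convention.
FOLDERS = ["raw_data", "processed_data", "code", "outputs", "documentation"]


def create_project_skeleton(root, folders=FOLDERS):
    """Create the project folders and a top-level ReadMe.txt listing them."""
    root = Path(root)
    for name in folders:
        (root / name).mkdir(parents=True, exist_ok=True)

    readme = root / "ReadMe.txt"
    # Do not overwrite a ReadMe.txt that already holds documentation.
    if not readme.exists():
        lines = ["Project file structure", ""]
        lines += [f"{name}/" for name in folders]
        readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    readme_path = create_project_skeleton(tmp)
    print(readme_path.read_text(encoding="utf-8"))
```

The generated ReadMe.txt only lists the folders. File naming conventions, abbreviations, and variable descriptions still need to be added to it by hand.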
There are several protocols that can be followed for structuring your files.