Developing Data Research Topics & Locating Datasets for Papers and Projects

This guide provides an overview of what should be considered when identifying a research topic requiring you to locate and use data.

Important Considerations -- Restricted Data & Federal Privacy Laws

When planning your research topics, it is important to remember that not all data is freely accessible for the years and level of analysis you are considering. In some areas of study, access to specific types of data is restricted under FERPA and HIPAA regulations.

The Family Educational Rights and Privacy Act (FERPA) protects student data. A student is anyone who is or has been in attendance at an educational agency or institution that maintains educational records. These records contain information directly related to specific students, are maintained by the institution, and might exist as paper or electronic files, emails, film, etc.

The Health Insurance Portability and Accountability Act (HIPAA) protects the privacy of patient health information. 

Due to FERPA and HIPAA regulations, research data about individuals in some social science and many health science topics will generally be available in aggregate format and not at the participant/recipient/student level. Gaining access to this microdata is sometimes possible, but it is often a very lengthy process, potentially taking months, that requires documentation, specific research training protocols, a faculty advisor, and coordination with William & Mary and the individual or agency holding the data.

Consider the following when identifying your topic

When thinking about the country, event, or issue you want to locate data on, consider the following:

  1. For a country-specific topic, what has been happening internally and externally? For an event, when did it occur?
    • You'll want to take this into consideration when selecting the time coverage of your topic. For example, if a country was going through political upheaval during the period you want data for, it's possible the data won't be available. Remember that the most recent information, particularly for the last 3-4 years, might not be available yet due to processing lags.
  2. What is the size of your country and its geographic location? How well studied is your issue? Would someone have actively been gathering data on the topic and made it available?
    • Information on smaller nations and niche issues can be difficult to locate. The data can be hard to track down, or it might be inaccessible (for example, only available in print in an archive somewhere). Sometimes a broader topic will help; other times you need to embrace flexibility.
  3. Is your topic going to hit on something that includes sensitive or proprietary information?
    • Data is not always freely available and may require a subscription or fee to access, for example for mining or other industrial topics. If your topic includes human subjects or information federally protected under FERPA or HIPAA, access to data could be restricted. The library might have some resources that could serve as alternative options, so talking with a librarian is helpful in these cases.