
Developing Data Research Topics & Locating Datasets for Papers and Projects

This guide provides an overview of what to consider when developing a research topic that requires you to locate and use data.

Questions that will help you identify a dataset

Knowing your topic is not enough to identify the dataset you need. You will also need to know more about the elements of the dataset you are looking for, which will help you think through where to look for the data. Here are some good starting questions to consider:

  1. What is your unit of analysis? 
    • For example, inflation rates, GDP, population sizes, salary rates, student graduation rates, literacy rates, income, race, gender, etc.
  2. What level is your analysis focusing on?
    • For example, country, state, province, city, etc.
  3. What time period does your data need to cover? 
    • Historical information is often available only in print, or as scanned PDFs of the print version, rather than in a digital data format.
  4. What are the variables you want included in your dataset? 
    • Variables are the things that act on what you are measuring. 
  5. Who would produce/store/collect the data you are looking for?
    • Consider international organizations, governmental agencies, researchers, etc. The box at the bottom includes links to many different repositories and organizations that collect data. 
  6. If your data is not available in a friendly format (i.e., it requires proprietary software you don't have), how can you convert it?

Other ways to locate data

In addition to considering the questions above, here are some other suggestions on ways to locate data:

  1. Search Google for keyword + data + file type (for a specific place, try location + data).
    • For example, Transgender student graduation rate data
  2. Do a general search using something like data.gov, Google's Public Data Search or general Dataset Search.
  3. Search a general or subject-specific data repository.
  4. Take a look at current library subscriptions for data, statistics, and polls.
  5. Search the literature. Scholarly articles that used a dataset for their research will cite it in the article (older articles are less consistent about this), so they can be great starting points for learning about different datasets.
  6. For historic topics, the dataset might not exist yet. However, it is possible to create one yourself from documents available in archives and other print sources.
  7. Sometimes a dataset might not include every variable you are looking for. In those cases, you might consider locating additional related datasets that can be merged for your analysis. This guide from NYU Libraries discusses merging datasets.
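
To make the idea of merging related datasets more concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the two small tables, the country labels "A" and "B", the column names, and the values are all made up, and the helper functions read_rows and merge_on_keys are hypothetical names, not part of any library. It also assumes each country and year combination appears only once in the second dataset. It is one possible approach, not the method described in the NYU Libraries guide.

```python
import csv
import io

# Made-up illustrative data: two small CSV tables that share the
# columns "country" and "year". None of these values are real.
GDP_CSV = """country,year,gdp_billions
A,2019,100
A,2020,95
B,2019,200
"""

LITERACY_CSV = """country,year,literacy_rate
A,2019,0.91
B,2019,0.88
B,2020,0.89
"""

# The columns that identify the same observation in both datasets.
KEYS = ("country", "year")


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_keys(left_rows, right_rows, keys=KEYS):
    """Combine rows whose key values appear in both datasets (an inner join).

    Assumes each key combination occurs at most once in right_rows.
    """
    right_index = {tuple(row[k] for k in keys): row for row in right_rows}
    merged = []
    for row in left_rows:
        match = right_index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Copy the columns from both rows into one combined row.
            merged.append({**row, **match})
    return merged


merged_rows = merge_on_keys(read_rows(GDP_CSV), read_rows(LITERACY_CSV))
for row in merged_rows:
    print(row)
```

Only the rows for A in 2019 and B in 2019 appear in the result, because those are the only country and year combinations present in both tables. Checking which rows drop out of a merge is a quick way to see how well two datasets actually overlap before you rely on them for your analysis.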

If all else fails and the dataset you want isn't available to you, remember that flexibility is essential. This will happen, and you need to be open to pivoting and considering what other options could work for your needs. Locating and working with data takes time.

Research Guides with suggested freely available datasets