
EC Econometrics: Data Sets

Examination of methods of analysis commonly used in economics and business.

Selected Data Sets

Consider the time period (current or historical) and the format (a table to print or data to download to a spreadsheet).

IPUMS - Integrated Public Use Microdata Series

https://www.ipums.org/

Integrated Public Use Microdata Series (IPUMS) is the world's largest individual-level population database, containing U.S. census materials and international census records.

 

ATUS – American Time Use Survey

https://www.bls.gov/tus/

Data page: ATUS Data Files - https://www.bls.gov/tus/#data

Select single-year or multi-year files, or use the module data files.

How to use ATUS microdata files

 

NLSY - National Longitudinal Survey of Youth

Home page: http://www.bls.gov/nls/

Investigator page (registration required): https://www.nlsinfo.org/investigator/pages/login.jsp

The Investigator User's Guide describes how to use this website.
A tutorial is also available that teaches how to search for variables in the Investigator.

 

National Center for Charitable Statistics

Data page: https://nccs.urban.org/nccs/datasets/

  • Primarily pulls data from IRS Form 990, which holds a lot of information. For anything related to the financial impact of the nonprofit sector (or its subsectors), the NCCS is a great place to find data.
  • Financial impacts are broken into dozens of categories by type of activity and type of nonprofit, e.g., Medicaid expansion, the ACA’s impact on nonprofit hospitals, the rise and fall of conservation and environmental organizations over the last 20 years across different administrations and their policies, or the impacts of “h-elective” legislation on nonprofit organizations’ lobbying activities.

Inside Airbnb

Data page: https://insideairbnb.com/get-the-data.html

Offers CSV data sets of listings for cities around the world.

 

Kaggle datasets

Data page: https://www.kaggle.com/datasets
Kaggle’s community is made up of data scientists and machine learners from all over the world with a variety of skills and backgrounds.

Example: Social Media Engagement Report - https://www.kaggle.com/datasets/aliredaelblgihy/social-media-engagement-report

 

Yelp Datasets

Data page: https://www.yelp.com/dataset

Yelp maintains a dataset for personal, educational, and academic use. It includes 6 million reviews spanning 189,000 businesses in 10 metropolitan areas. Students are welcome to participate.

You must sign a form before you can get the data, so the file format is not confirmed here.

 

CDC Data:

Centers for Disease Control and Prevention data set page: https://data.cdc.gov/browse

Data sets differ by topic; most are zip files to download.

 

Dow Jones Weekly Returns

Data set page: https://archive.ics.uci.edu/ml/datasets/Dow+Jones+Index

Predicting stock prices is a major application of data analysis and machine learning; zip file to download

Types

Time Series - the same phenomenon observed over a specified length of time

Cross Section - one point in time observed across multiple places

Pooled - combination of time series and cross section data
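
The three types can be illustrated with a small sketch. The records below are made-up placeholder values, and the function names `time_series` and `cross_section` are hypothetical helpers chosen for this example; they do not come from any of the data sets listed above.

```python
# Illustrative records: (place, year, value). All values are invented
# placeholders, not real statistics.
records = [
    ("Indiana", 2006, 10.0),
    ("Indiana", 2007, 11.0),
    ("Indiana", 2008, 12.0),
    ("Ohio", 2006, 20.0),
    ("Ohio", 2007, 21.0),
    ("Ohio", 2008, 22.0),
]


def time_series(rows, place):
    """One place observed over several periods, ordered by year."""
    return sorted((year, value) for p, year, value in rows if p == place)


def cross_section(rows, year):
    """Several places observed at a single point in time."""
    return {p: value for p, y, value in rows if y == year}


# Pooled data keeps every place in every period.
pooled = list(records)

indiana_series = time_series(records, "Indiana")
snapshot_2008 = cross_section(records, 2008)

print(indiana_series)
print(snapshot_2008)
print(len(pooled))
```

When you download a data set, checking whether each row is identified by a time period, a place, or both is a quick way to tell which type you have.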

Citing

Don't forget to cite your data sources.

APA Style - Online Data Sets

Point readers to raw data by providing a Web address
- use "Retrieved from"

or a general place that houses data sets on the site
- use "Available from"

example:

United States Department of Housing and Urban Development. (2008).
Indiana income limits [Data file]. Retrieved from https://www.huduser.org/Datasets/IL/IL08/in_fy2008.pd