Data Sets

This page provides download links, codebooks, and data management files (where available) for a variety of data sets commonly used across the courses I teach. The data sets fall into two general categories: those that have been partially cleaned for analysis, and those that remain in their original state of disarray.

Cleaned Data

These data sets are typically used in entry-level classes such as MATH 130 and MATH 315. Some data sets still have missing values but have been tidied up a little. The code to read each data set into R is provided, along with a glimpse of the data and its dimensions. Use these to confirm that the data set you download and import into R matches what is shown here.
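As a rough illustration of that check, the sketch below reads a file into R and inspects its dimensions and structure. The file and its columns (`id`, `score`) are made up for this example: a tiny CSV is written to a temporary file to stand in for a downloaded data set, and the read code supplied with each real data set may differ.

```r
# Hypothetical stand-in for a downloaded file: write a tiny CSV to a temp location.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, score = c(90.5, NA, 78)),
          tmp, row.names = FALSE)

# Read the file into R, as you would with a real download.
dat <- read.csv(tmp)

# Compare the dimensions (rows, columns) with those listed for the data set.
dim(dat)   # 3 rows and 2 columns for this toy file

# Base R overview of column names, types, and first values.
str(dat)

# Count the missing values in each column.
colSums(is.na(dat))
```

If the dimensions or column types differ from what is listed for a data set, the file was most likely downloaded or imported incorrectly.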

Raw Data

These data sets are used in courses that emphasize data management as well as analysis, including but not limited to MATH 456 (Applied Statistics II), MATH 385 (Intro to Data Science), MATH 485 (Advanced Data Science Topics), and MATH 615 (Graduate Statistical Methods). All data sets listed are in their raw format. Codebooks and data management files are supplied where available. The Updated column shows when the data management file was last edited.