A running list of interesting data and data repositories on the web.

1. CMU stats data repository

This repository contains a large number of data sets, covering everything from baseball to body-fat statistics.

2. Kaggle

Since the announcement of the “Netflix Prize,” data mining competitions such as the ones listed here have become more and more prevalent. They can be fun and challenging, and they are a great way to work with data and to have something to talk about at an interview.

3. Economic Research Service

From the site: The International Macroeconomic Data Set provides data from 1969 through 2020 for real (adjusted for inflation) gross domestic product (GDP), population, real exchange rates, and other variables for the 190 countries and 34 regions that are most important for U.S. agricultural trade.

4. Bulk Census Data

Here you can download raw bulk US census data in usable form – very cool! Check out the post about it as well as some other resources.