IPUMS

IPUMS, the Integrated Public Use Microdata Series, is a great source for big data. IPUMS includes the Current Population Survey (CPS) from the U.S. Census as well as other health and housing data. The IPUMS system allows you to create a subset of these massive repositories of collected data that fits your needs and interests. An initial word of caution to those new to the system is that your requests are not filled immediately, so start early!

Getting Started

We will use the tidyverse and ipumsr packages.
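Before going further, it can help to confirm that both packages are installed. The following is a minimal sketch in base R; the variable names `pkgs` and `missing_pkgs` are only illustrative.

```r
# Packages this chapter relies on.
pkgs <- c("tidyverse", "ipumsr")

# requireNamespace() returns FALSE (rather than erroring) when a package is absent.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install these first: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
} else {
  # Attach both packages for the rest of the session.
  library(tidyverse)
  library(ipumsr)
}
```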

Start at the IPUMS home page and select the survey you wish to explore via the ‘visit site’ option.

From here you will see a prompt to log in or register at the very top of the window:

If you have used IPUMS before, you may log in and move on to selecting your data. If you are a new user, however, you will need to apply for access. The register tab will ask you for standard account creation information. (You should use your Harvard college email and indicate your usage accordingly.) During registration you will be required to accept a handful of data use agreements; read them carefully, particularly the citation requirements. Once you have applied, you will need to wait for your account to be confirmed before you can log in fully.

Once you have received the email confirmation and logged in, you can begin the data selection process. The first option to specify is the samples (see the ‘select samples’ button), or the time period covered by your data. Pay close attention to the intervals in this step, and note whether your pull will be a large sample or the entire available set (relevant in census applications). From there you can use the drop down menus and/or the search feature to locate and include the variables that you need or that may be relevant to your query.

You can see below that the household and person tabs (highlighted) provide a drop down of common variables from this repository under the respective classifications. To add a variable to your ‘cart’, select the + icon. If the icon is not offered, either the page gives a corresponding explanation or the variable simply isn’t available.

Once you are satisfied with the variables and intervals you have selected in your sample, you can review and revise your selections in the cart menu prior to requesting the ‘pull.’ The cart review window is shown below. An X indicates the existence of data in the given time period. Note that there is one variable with no data, indicated by the ... across the time period review columns.

When you are satisfied with the sample specifications, you can proceed to the create data extract page. This will provide you with a handful of final options and a text window to describe the sample you’ve created. You should treat this like a Git commit message: brief and meaningful.

Submitting your request should automatically bring you to the request history page associated with your account. You will note that your requests are not permanently available here, and you should make sure to promptly download your information upon receipt.

Upon receiving the confirmation email for your request, return to the above window. You will need to first download the data via Download.DAT, then you will need to save the DDI link (via ‘save as’) in the same location as your .dat.gz file. Then select the R command file. The last step here will be to unpack / un-zip the .dat.gz file such that the .gz suffix is removed. The IPUMS download instructions recommend 7-zip for those who don’t already have file decompression software on hand.
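If you would rather not install separate decompression software, the un-zip step can also be done from R. Below is a minimal sketch using only base R connections; `gunzip_base()` is a hypothetical helper written for this illustration, not part of ipumsr, and the demo runs on a small temporary file rather than a real extract.

```r
# Hypothetical helper: decompress "x.dat.gz" to "x.dat" using base R only.
gunzip_base <- function(gz_path, out_path = sub("\\.gz$", "", gz_path)) {
  con_in <- gzfile(gz_path, open = "rb")   # reads decompressed bytes
  on.exit(close(con_in), add = TRUE)
  con_out <- file(out_path, open = "wb")
  on.exit(close(con_out), add = TRUE)
  repeat {
    chunk <- readBin(con_in, what = raw(), n = 1e6)  # copy in 1 MB pieces
    if (length(chunk) == 0) break
    writeBin(chunk, con_out)
  }
  invisible(out_path)
}

# Demo on a throwaway file standing in for a downloaded extract.
demo_lines <- c("2016 1 84", "2016 2 84")
gz_path <- tempfile(fileext = ".dat.gz")
con <- gzfile(gz_path, open = "wt")
writeLines(demo_lines, con)
close(con)

dat_path <- gunzip_base(gz_path)
readLines(dat_path)
```

For a real extract you would pass the path of your downloaded .dat.gz file instead of the temporary one.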

The R command file link will show you a text file containing roughly the following steps to unpack your data extraction:

if (!require("ipumsr")) stop("Reading IPUMS data into R requires the ipumsr package. It can be installed using the following command: install.packages('ipumsr')")

# read_ipums_ddi() reads the .xml metadata (the DDI); read_ipums_micro() then uses
# that metadata to locate and parse the .dat file, so BOTH files are needed.
# By default the data file is looked for in the same folder as the .xml file.

ddi <- read_ipums_ddi("path_to_your_file.xml")
data <- read_ipums_micro(ddi)
## Use of data from IPUMS USA is subject to conditions including that users should
## cite the data appropriately. Use command `ipums_conditions()` for more details.

A successful unpacking and proper saving of the DDI file should result in the ability to execute the code from the R command file as shown above. At this point you can access your ‘Big Data’ as you would any standard object. Below is an example extract from the U.S. census that has been imported in the background. Another useful feature of the DDI is that opening the link in your browser shows you more information about the variable abbreviations and the extract more broadly.

glimpse(census_tbl)
## Rows: 12,800,619
## Columns: 40
## $ YEAR     <int> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, …
## $ SAMPLE   <int+lbl> 201601, 201601, 201601, 201601, 201601, 201601, 201601, …
## $ SERIAL   <dbl> 1, 1, 2, 3, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 8, 8, 8, 8, 9, …
## $ CBSERIAL <dbl> 64, 64, 80, 107, 107, 107, 107, 134, 134, 180, 180, 300, 300…
## $ HHWT     <dbl> 97, 97, 95, 159, 159, 159, 159, 122, 122, 20, 20, 226, 226, …
## $ CLUSTER  <dbl> 2e+12, 2e+12, 2e+12, 2e+12, 2e+12, 2e+12, 2e+12, 2e+12, 2e+1…
## $ STRATA   <dbl> 70001, 70001, 90001, 30201, 30201, 30201, 30201, 60001, 6000…
## $ GQ       <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ PERNUM   <dbl> 1, 2, 1, 1, 2, 3, 4, 1, 2, 1, 2, 1, 2, 1, 1, 2, 3, 4, 5, 1, …
## $ PERWT    <dbl> 98, 89, 95, 160, 154, 184, 260, 122, 113, 20, 18, 226, 155, …
## $ FAMSIZE  <int+lbl> 2, 2, 1, 4, 4, 4, 4, 2, 2, 2, 2, 2, 2, 1, 5, 5, 5, 5, 5,…
## $ SEX      <int+lbl> 2, 1, 2, 1, 2, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 2, 2, 1, 2,…
## $ AGE      <int+lbl> 84, 84, 78, 46, 52, 20, 16, 66, 68, 64, 63, 25, 23, 35, …
## $ MARST    <int+lbl> 1, 1, 4, 1, 1, 6, 6, 1, 1, 1, 1, 6, 6, 4, 1, 1, 6, 6, 6,…
## $ MARRNO   <int+lbl> 1, 1, 2, 1, 3, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0,…
## $ RACE     <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2,…
## $ RACED    <int+lbl> 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 1…
## $ HISPAN   <int+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ HISPAND  <int+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ CITIZEN  <int+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ YRIMMIG  <int+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ YRSUSA1  <int+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ HCOVANY  <int+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
## $ EDUC     <int+lbl> 7, 10, 7, 8, 6, 6, 4, 6, 10, 6, 6, 7, 8, 10, 7, 6, 1, 0,…
## $ EDUCD    <int+lbl> 71, 101, 71, 81, 65, 63, 40, 63, 101, 65, 65, 71, 81, 10…
## $ SCHLTYPE <int+lbl> 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 0, 2,…
## $ EMPSTAT  <int+lbl> 3, 3, 1, 1, 3, 3, 3, 3, 3, 1, 3, 1, 1, 1, 1, 1, 0, 0, 0,…
## $ EMPSTATD <int+lbl> 30, 30, 10, 10, 30, 30, 30, 30, 30, 10, 30, 10, 10, 10, …
## $ OCC      <dbl+lbl>    0,    0, 5700, 1550,    0,    0,    0, 5700,    0, 62…
## $ OCC2010  <int+lbl> 9920, 9920, 5700, 1550, 9920, 9920, 9920, 5700, 9920, 62…
## $ IND1990  <int+lbl> 0, 0, 831, 882, 0, 0, 0, 633, 0, 60, 772, 831, 820, 842,…
## $ INDNAICS <chr> "0", "0", "622", "5413", "0", "0", "0", "443142", "0", "23",…
## $ UHRSWORK <int+lbl> 0, 0, 40, 40, 0, 0, 0, 0, 0, 40, 0, 24, 40, 40, 35, 27, …
## $ INCTOT   <dbl+lbl>   22510,   14100,   45800,   65000,       0,       0,   …
## $ FTOTINC  <dbl+lbl>  36610,  36610,  45800,  65000,  65000,  65000,  65000, …
## $ INCWAGE  <dbl+lbl>      0,      0,  27300,  65000,      0,      0,      0, …
## $ INCSS    <dbl+lbl>  8100, 14100, 18500,     0,     0,     0,     0,  3900, …
## $ INCWELFR <dbl+lbl>     0,     0,     0,     0,     0,     0,     0,     0, …
## $ INCRETIR <dbl+lbl>  10100,      0,      0,      0,      0,      0,      0, …
## $ POVERTY  <dbl+lbl> 254, 254, 402, 261, 261, 261, 261, 421, 421, 501, 501, 1…
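The variable descriptions stored in the DDI can also be inspected from R rather than in the browser. The sketch below assumes the ipumsr package is installed and that a DDI file exists at the placeholder path from the R command file; `describe_extract()` is a hypothetical helper name, and the final block does nothing if either assumption fails.

```r
# Hypothetical helper: list the variable names and labels recorded in a DDI file.
describe_extract <- function(ddi_path) {
  ddi <- ipumsr::read_ipums_ddi(ddi_path)
  info <- ipumsr::ipums_var_info(ddi)   # one row of metadata per variable
  info[, c("var_name", "var_label")]
}

ddi_path <- "path_to_your_file.xml"  # placeholder; point this at your own DDI

# Only run when ipumsr is installed and the file is actually present.
if (requireNamespace("ipumsr", quietly = TRUE) && file.exists(ddi_path)) {
  print(describe_extract(ddi_path))
}
```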

Additional Notes

  • This data has 12,800,619 rows (big data!), so you may need to take steps to avoid crashing your machine, such as initially working with a sample of the data or working on the FAS cloud system.

  • You may also encounter problems pushing these large files to GitHub. See the large file storage options at Git LFS.

  • Some of the variables are ‘haven labelled’; see the haven documentation on CRAN for more information on these.
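As a rough illustration of the sampling suggestion in the first note, here is a sketch on a toy table. The tibble `toy_tbl` is made up to mimic the SERIAL (household) and PERNUM (person) columns of the extract; with real data you would substitute your own table. Keep in mind that weights such as PERWT no longer sum to population totals in a subsample.

```r
library(dplyr)

# Toy stand-in for the census extract: one row per person,
# SERIAL identifies the household.
toy_tbl <- tibble(
  SERIAL = c(1, 1, 2, 3, 3, 3, 3, 4, 4, 5),
  PERNUM = c(1, 2, 1, 1, 2, 3, 4, 1, 2, 1),
  AGE    = c(84, 84, 78, 46, 52, 20, 16, 66, 68, 64)
)

set.seed(2016)  # make the random draws reproducible

# Option 1: a simple random sample of person rows.
person_sample <- toy_tbl %>%
  slice_sample(n = 4)

# Option 2: sample whole households so that household members stay together.
keep_serials <- toy_tbl %>%
  distinct(SERIAL) %>%
  slice_sample(n = 2)

household_sample <- toy_tbl %>%
  semi_join(keep_serials, by = "SERIAL")
```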

Haven Labelled Variables

If we look closely at the glimpse of our census data above, we notice that many columns carry the haven_labelled classification (shown as <int+lbl> or <dbl+lbl>). This feature tends to be more useful for other software applications such as Stata; while working in R we usually need to recast these variables into something more convenient. Examples among the labelled variables are SAMPLE, OCC (occupation), and IND1990 (industry). Some of these have a very large number of unique codes to recode, so ideally we want to automate this.

To create a labelled variable, see haven::labelled(); to manually recode an existing one, we would employ dplyr::recode(). You will find that the labels given to the OCC and IND variables aren’t particularly useful: they only tell you which set of occupation/industry terms survey takers were given to choose from rather than the actual term. Thankfully, IPUMS provides key files that turn these integers into something more useful. Using occupation as the example case, you can get the precise occupation in string format, though it takes a bit more work. IPUMS provides conversion ‘crosswalk’ files via its website, and the Census Bureau itself provides some more recent crosswalk files (also linked from the IPUMS crosswalk page). These files are often formatted inconveniently and can differ from year to year, so be intentional when you convert them to your preferred format.
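Once a crosswalk has been cleaned into a two-column table, attaching the titles is an ordinary join. The sketch below uses made-up data: the `crosswalk` tibble and its `occ_title` values are placeholders, not real entries from the IPUMS or Census files, and it assumes the code column has already been reduced to plain numbers.

```r
library(dplyr)

# Toy person records with occupation codes stored as plain numbers.
people <- tibble(OCC = c(0, 5700, 1550, 5700))

# Placeholder crosswalk: in practice this is read from the downloaded file
# and cleaned so that it has one row per code.
crosswalk <- tibble(
  OCC       = c(1550, 5700),
  occ_title = c("example title A", "example title B")
)

# left_join() keeps every person row; codes missing from the crosswalk get NA.
recoded <- people %>%
  left_join(crosswalk, by = "OCC")
```

Codes that come back as NA are worth checking by hand, since they often point to a year mismatch between the extract and the crosswalk file.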

With that said, there are cases where you do not need a crosswalk at all. Calling as_factor() on a labelled variable gives you an ordinary factor that you can manipulate as you would normally: its levels are the labels where they exist and the underlying values otherwise. If you want to keep only the integer codes irrespective of the labels attached, haven::zap_labels() drops the labels and leaves a plain numeric vector.

class(census_tbl$OCC)[1]

# returns: "haven_labelled"

ex2 <- census_tbl %>%
  select(OCC) %>%
  mutate(occ = as_factor(OCC))

class(ex2$occ)

# returns: "factor"
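To see what these conversions do to the values themselves, here is a small sketch on a toy labelled vector. The vector `sex` and its two labels are built by hand for illustration and only mimic the coding of the SEX column; they are not read from the extract.

```r
library(haven)

# Toy labelled vector: integer codes with value labels attached.
sex <- labelled(c(2L, 1L, 2L, 1L), labels = c(Male = 1L, Female = 2L))

# Default conversion: factor levels are taken from the labels.
sex_lbl <- as_factor(sex)

# Keep the codes as the factor levels instead of the labels.
sex_val <- as_factor(sex, levels = "values")

# Drop the labels entirely and keep the underlying integers.
sex_int <- zap_labels(sex)

levels(sex_lbl)  # the label text
levels(sex_val)  # the codes, as strings
class(sex_int)   # a plain integer vector
```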

There is much more to the IPUMS offerings than just U.S. population statistics. Below you can see what an extract from the MEPS (Medical Expenditure Panel Survey) or International repositories could look like. Bear in mind that with an individual access account you can create almost any configuration you need. While the variable names and the available interpretations differ, the process by which we import the new tibble and transform any awkward variables into our preferred format is largely the same.

ddi_meps <- read_ipums_ddi("IPUMS/extracts/meps/meps_00003.xml")
meps_tbl <- read_ipums_micro(ddi_meps)
## Use of data from IPUMS MEPS is subject to conditions including that users
## should cite the data appropriately. Use command `ipums_conditions()` for more
## details.
glimpse(meps_tbl)
## Rows: 752,639
## Columns: 50
## $ YEAR       <dbl> 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996…
## $ PERNUM     <dbl> 1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2…
## $ DUID       <dbl> 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 7, 7…
## $ PID        <chr> "018", "025", "032", "049", "015", "022", "018", "025", "0…
## $ MEPSID     <chr> "100002018", "100002025", "100002032", "100002049", "10000…
## $ PANEL      <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ PSUANN     <dbl+lbl> 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 8, 8, …
## $ STRATANN   <dbl+lbl>  12,  12,  12,  12,  74,  74,  90,  90,  90,  90,  90,…
## $ PSUPLD     <dbl+lbl> 4, 4, 4, 4, 4, 4, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 2, 2, …
## $ STRATAPLD  <dbl+lbl>  23,  23,  23,  23,  43,  43,  44,  44,  44,  44,  44,…
## $ PANELYR    <int> 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996…
## $ RELYR      <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ PERWEIGHT  <dbl> 15024, 14976, 18256, 12598, 6594, 11319, 7674, 4372, 9295,…
## $ SAQWEIGHT  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ DIABWEIGHT <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ AGE        <dbl+lbl> 31, 31,  7,  3, 74, 73, 54, 48, 27, 18, 16, 80, 39, 34…
## $ SEX        <int+lbl> 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 2, 2, …
## $ MARSTAT    <int+lbl> 10, 10, 0, 0, 10, 10, 10, 10, 50, 50, 50, 20, 10, 10, …
## $ REGIONMEPS <int+lbl> 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, …
## $ FAMSIZE    <int+lbl> 4, 4, 4, 4, 2, 2, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 99, 99…
## $ RACEA      <int+lbl> 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,…
## $ YRSINUS    <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ INTERVLANG <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ EDUC       <int+lbl> 500, 401, 102, 998, 500, 109, 302, 302, 401, 202, 201,…
## $ STUDENT    <int+lbl> 8, 8, 8, 8, 8, 8, 8, 8, 8, 1, 8, 8, 8, 8, 8, 8, 8, 8, …
## $ INCTOT     <dbl+lbl>  24778,  25298,      0,      0,  94546,  35796,  29000…
## $ FTOTVAL    <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ INCWAGE    <dbl+lbl>  24440,  24960,      0,      0,      0,      0,  29000…
## $ INCBUS     <dbl+lbl>     0,     0,     0,     0, 56086,     0,     0,     0…
## $ INCUNEMP   <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ FOODSTYN   <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ POVLEV     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ FILESTATUS <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ FILETAXFRM <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ HEALTH     <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ USUALPL    <int+lbl> 2, 2, 2, 2, 2, 2, 1, 2, 9, 1, 1, 2, 2, 2, 2, 2, 1, 1, …
## $ HINOTCOV   <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, …
## $ HIPRIVATE  <int+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, …
## $ HIMCARE    <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, …
## $ CANCEREV   <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ SMOKENOW   <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ EXPTOT     <dbl> 507, 124, 264, 7410, 2942, 3571, 290, 1238, 94, 55, 0, 709…
## $ CHGTOT     <dbl> 555, 75, 252, 7927, 4712, 2890, 290, 1279, 97, 113, 0, 842…
## $ EREXPTOT   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ ERFCHGT    <int> 0, 0, 142, 0, 0, 0, 0, 463, 97, 0, 0, 299, 1700, 0, 0, 0, …
## $ HPTOTDIS   <int> 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0…
## $ HPTOTNIGHT <int+lbl> 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, …
## $ RXPRMEDSNO <int> 0, 2, 2, 10, 19, 44, 0, 5, 0, 0, 0, 6, 60, 11, 1, 0, 6, 0,…
## $ RXEXPTOT   <int> 0, 49, 12, 217, 595, 1783, 0, 91, 0, 0, 0, 176, 995, 259, …
## $ VSEXPTOT   <int> 0, 0, 0, 0, 0, 0, 290, 0, 0, 0, 0, 251, 0, 0, 0, 0, 305, 2…

International data is also available.

ddi_intntl <- read_ipums_ddi("IPUMS/extracts/international/ipumsi_00001.xml")
int_tbl <- read_ipums_micro(ddi_intntl)
## Use of data from IPUMS-International is subject to conditions including that
## users should cite the data appropriately. Use command `ipums_conditions()` for
## more details.
glimpse(int_tbl)
## Rows: 12,193,602
## Columns: 30
## $ COUNTRY    <int+lbl> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40…
## $ YEAR       <int> 1981, 1981, 1981, 1981, 1981, 1981, 1981, 1981, 1981, 1981…
## $ SAMPLE     <int+lbl> 40198101, 40198101, 40198101, 40198101, 40198101, 4019…
## $ SERIAL     <dbl> 1000, 2000, 2000, 3000, 3000, 4000, 5000, 5000, 5000, 6000…
## $ HHWT       <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10…
## $ URBAN      <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ REGIONW    <int+lbl> 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44…
## $ BUILTYR    <int+lbl> 1980, 1944, 1944, 1980, 1980, 1918, 1960, 1960, 1960, …
## $ HHTYPE     <int+lbl> 1, 2, 2, 2, 2, 1, 3, 3, 3, 2, 2, 3, 3, 3, 3, 6, 6, 6, …
## $ NFAMS      <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ NCOUPLES   <int+lbl> 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ PERNUM     <dbl> 1, 1, 2, 1, 2, 1, 1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 4, 1…
## $ PERWT      <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10…
## $ MARST      <int+lbl> 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, …
## $ MARSTD     <int+lbl> 400, 210, 210, 210, 210, 400, 210, 210, 100, 210, 210,…
## $ BPLCOUNTRY <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ CITIZEN    <int+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ YRIMM      <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ RELIGION   <int+lbl> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, …
## $ RELIGIOND  <int+lbl> 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6106, 6106, …
## $ RACE       <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ SCHOOL     <int+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, …
## $ LIT        <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ EDATTAIN   <int+lbl> 2, 2, 2, 3, 2, 2, 3, 2, 3, 3, 2, 3, 3, 0, 0, 3, 2, 3, …
## $ EDATTAIND  <int+lbl> 221, 221, 221, 321, 221, 221, 321, 221, 321, 321, 221,…
## $ EEDATTAIN  <int+lbl> 30, 30, 30, 40, 30, 30, 40, 30, 40, 40, 30, 40, 40, 0,…
## $ EMPSTAT    <int+lbl> 3, 3, 3, 1, 3, 3, 2, 1, 1, 3, 3, 1, 3, 3, 3, 3, 3, 1, …
## $ EMPSTATD   <int+lbl> 343, 343, 310, 100, 343, 343, 200, 100, 100, 343, 310,…
## $ LABFORCE   <int+lbl> 1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 1, 9, 9, 1, 1, 2, …
## $ INCTOT     <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…

What can you do with all of this?

That is entirely up to you! Maybe you have an idea about the health care system and how policy change affects people in different areas. Maybe you want to know how different demographic segments of the country have changed and fared over time. Maybe you want to know how educational achievement has changed over the last 15 years. All of these questions can be answered, or at least approximated, with data that you can access for free.

Here are some examples: