Calculating percentage of individuals in a census tract with high school diploma or equivalent using tidycensus

Hello everyone,

New here!

I am analyzing ACS data for the state of Kentucky. I am trying to figure out how to calculate the percentage of individuals in a census tract who have graduated high school using tidycensus. I realize the educational attainment metrics are complicated by age/sex/race breakdowns, so any help/pointers in the right direction would be appreciated!

- A

Top Replies

  • I think the table you may be looking for is B15003, which contains educational attainment for the population 25 and older. Using tidycensus, you can request variables B15003_017 through B15003_025 from that table, which together represent the population with a HS diploma (or equivalent) and above. To calculate the percentage, you will also need the total population 25 and older to use as the denominator, which is B15003_001.

    This code should get you there. Note that the margin of error calculations are included as well.

    library(tidycensus)
    library(dplyr)
    ky_ed <- get_acs(
      geography = "tract",
      variables = paste0("B15003_0", 17:25), # hs diploma and above variables
      summary_var = "B15003_001",            # pop 25 years and older - denominator
      state = "KY"
    )

    ky_ed %>%
      group_by(GEOID, NAME) %>%
      summarize(
        n_hs_above = sum(estimate),
        n_hs_above_moe = moe_sum(moe, estimate),
        n_pop_over_25 = summary_est[1],
        n_pop_over_25_moe = summary_moe[1]
      ) %>%
      ungroup() %>%
      mutate(
        pct_hs_above = n_hs_above / n_pop_over_25,
        # moe_prop() takes the numerator first, then the denominator
        pct_hs_above_moe = moe_prop(n_hs_above, n_pop_over_25,
                                    n_hs_above_moe, n_pop_over_25_moe)
      )

    #> # A tibble: 1,115 x 8
    #> GEOID NAME n_hs_above n_hs_above_moe n_pop_over_25 n_pop_over_25_m~
    #> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
    #> 1 210019~ Census Trac~ 958 174. 1210 176
    #> 2 210019~ Census Trac~ 1255 204. 1465 156
    #> 3 210019~ Census Trac~ 1750 283. 2460 272
    #> 4 210019~ Census Trac~ 2395 335. 3107 268
    #> 5 210019~ Census Trac~ 1987 295. 2323 334
    #> 6 210019~ Census Trac~ 1275 196. 1537 191
    #> 7 210019~ Census Trac~ 882 176. 1097 178
    #> 8 210039~ Census Trac~ 2039 271. 2358 215
    #> 9 210039~ Census Trac~ 1521 212. 1850 211
    #> 10 210039~ Census Trac~ 2861 448. 3558 302
    #> # ... with 1,105 more rows, and 2 more variables: pct_hs_above <dbl>,
    #> # pct_hs_above_moe <dbl>
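
    For anyone curious what the proportion MOE step is doing, here is a rough base-R sketch of the Census Bureau's published formula for the margin of error of a proportion. The function name moe_prop_sketch is made up for illustration and is not part of tidycensus; in real work you would just call moe_prop(). The inputs are the rounded values printed for the first tract above, so the result is approximate: about 0.79 for the share, with a margin of roughly 0.086.

    ```r
    # Hypothetical helper (not part of tidycensus): margin of error for a
    # proportion p = num / denom, following the ACS approximation formula.
    moe_prop_sketch <- function(num, denom, moe_num, moe_denom) {
      p <- num / denom
      radicand <- moe_num^2 - p^2 * moe_denom^2
      # If the value under the square root is negative, Census guidance is to
      # fall back to the ratio formula, which uses a plus sign instead.
      radicand <- ifelse(radicand < 0, moe_num^2 + p^2 * moe_denom^2, radicand)
      sqrt(radicand) / denom
    }

    # First tract in the output above, using the rounded values as printed
    pct <- 958 / 1210
    pct_moe <- moe_prop_sketch(958, 1210, 174, 176)
    ```

    Because the function is vectorized, it can also be dropped into a mutate() call on whole columns, which is a handy way to sanity-check a few rows by hand.
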
  • Thank you sir! I wrote some really ridiculous code using the wrong table that got me the wrong results... this is super helpful!

    - A