Is there a way to specify in R which variables to pull from an ACS data set through the API? Below is the code I use to pull the entire B01003 table for all US ZIP codes. That table only has an estimate and a margin of error (MOE), so it's not too large, but for other tables I need to exclude some variables to make this work.
# Install required packages
if (!require(tidycensus)) install.packages("tidycensus")
if (!require(dplyr)) install.packages("dplyr")

# Load the packages
library(tidycensus)
library(dplyr)

# Set your Census API key
census_api_key("your_api_key", install = TRUE, overwrite = TRUE)

# Define the years you're interested in (2011 to 2022)
years <- 2011:2022

# Function to fetch population data for all U.S. ZCTAs for a specific year
get_population_data_us <- function(year) {
  tryCatch({
    data <- get_acs(
      geography = "zip code tabulation area",
      year = year,
      survey = "acs5",
      table = "B01003",
      output = "wide"
    ) %>%
      mutate(Year = year)
    return(data)
  }, error = function(e) {
    message("Error with year ", year, ": ", e$message)
    return(NULL)
  })
}

# Initialize an empty list to store the data
all_data_us <- list()

# Loop over each year, fetching data for all U.S. ZCTAs
for (year in years) {
  data <- get_population_data_us(year)
  if (!is.null(data)) {
    all_data_us[[as.character(year)]] <- data
  }
}

# Combine the data from different years
combined_data_us <- bind_rows(all_data_us)

# Save the combined data to a CSV file on your desktop
write.csv(combined_data_us, "~/Desktop/Population_by_ZIP_All_US_2011_2022.csv", row.names = FALSE)
You don't even need to do that. All you need is a vector of variable IDs. Optionally, you can use a named vector to rename the variable IDs to whatever you want in your output.
From the documentation for the `tidycensus` R package (https://walker-data.com/tidycensus/articles/basic-usage.html#working-with-acs-data), it looks like you can pass a `variables` argument to the `get_acs` function:
vt <- get_acs(geography = "county", variables = c(medincome = "B19013_001"), state = "VT", year = 2021)
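Applied to the ZCTA pull in the question, the same idea might look like the sketch below. It swaps `table = "B01003"` for a named `variables` vector. The helper name `get_selected_vars_zcta` and the labels `total_pop` and `median_income` are made up for illustration; the two variable IDs are the ones already mentioned in this thread, so replace them with the IDs you actually need. The sketch assumes your API key is already installed, as in the question.

```r
library(tidycensus)
library(dplyr)

# Named vector: the names are labels of your choosing, the values are ACS variable IDs.
# Both IDs come from this thread; swap in the ones you need.
acs_vars <- c(total_pop = "B01003_001", median_income = "B19013_001")

# Hypothetical helper: same shape as the function in the question,
# but it requests specific variables instead of a whole table.
get_selected_vars_zcta <- function(year, vars = acs_vars) {
  get_acs(
    geography = "zip code tabulation area",
    variables = vars,
    year = year,
    survey = "acs5",
    output = "wide"
  ) %>%
    mutate(Year = year)
}

# Only runs when a Census API key is available in the environment.
if (nzchar(Sys.getenv("CENSUS_API_KEY"))) {
  zcta_2021 <- get_selected_vars_zcta(2021)
  print(head(zcta_2021))
}
```

With `output = "wide"`, the estimate and MOE columns should come back with `E` and `M` suffixes on the names you supplied (for example `total_popE` and `total_popM`), which keeps the result much narrower than a full table pull.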
Rather than using the `table` argument, you can use the `variables` argument, as another replier suggested. Here's some similar code I used to pull the total population ACS variable for all metro areas, once for each year from 2017 to 2022.
# Load packages (map_dfr() comes from purrr)
library(tidycensus)
library(purrr)

# Which Years? -----------------------------------------------------
years <- 2017:2022
names(years) <- years

# Get metro area population change --------------------------------
metro_populations <- map_dfr(years, ~ {
  get_acs(
    geography = "cbsa",
    variables = "B01003_001",
    year = .x,
    survey = "acs5"
  )
}, .id = "year")
If you want more than one variable, you can put them in an object just like the "years" object here. Then, reference that object in the "variables" argument.
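As a rough sketch of that suggestion, the variable IDs can live in their own object and be passed to `variables` inside the same `map_dfr()` loop. The object name `acs_vars`, its labels, and the wrapper `get_metro_vars` are hypothetical; the IDs are the two that appear in this thread.

```r
library(tidycensus)
library(purrr)

# Years to pull; naming the vector lets map_dfr() record the year via .id
years <- 2017:2022
names(years) <- years

# Hypothetical selection of variables, stored in an object like "years"
acs_vars <- c(total_pop = "B01003_001", median_income = "B19013_001")

# Hypothetical wrapper: one get_acs() call per year, rows stacked together
get_metro_vars <- function(years, vars) {
  map_dfr(
    years,
    ~ get_acs(
      geography = "cbsa",
      variables = vars,
      year = .x,
      survey = "acs5"
    ),
    .id = "year"
  )
}

# Only runs when a Census API key is available in the environment.
if (nzchar(Sys.getenv("CENSUS_API_KEY"))) {
  metro_data <- get_metro_vars(years, acs_vars)
  print(head(metro_data))
}
```

In the default long output, each requested variable gets its own row per metro area and year, and the `variable` column should carry the labels from the named vector rather than the raw IDs.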
Here's a minimal example to try: walker-data.com/.../an-introduction-to-tidycensus.html