Is there a way to specify in R which variables to pull from an ACS data set through the API? Below is the code I use to pull the entire B01003 table for all US ZIP codes. That table only has the estimate and MOE, so it's not too large, but for other tables I need to exclude some variables to make this work.
# Install required packages
if (!require(tidycensus)) install.packages("tidycensus")
if (!require(dplyr)) install.packages("dplyr")

# Load the packages
library(tidycensus)
library(dplyr)

# Set your Census API key
census_api_key("your_api_key", install = TRUE, overwrite = TRUE)

# Define the years you're interested in (2011 to 2022)
years <- 2011:2022

# Function to fetch population data for all U.S. ZCTAs for a specific year
get_population_data_us <- function(year) {
  tryCatch({
    data <- get_acs(
      geography = "zip code tabulation area",
      year = year,
      survey = "acs5",
      table = "B01003",
      output = "wide"
    ) %>%
      mutate(Year = year)
    return(data)
  }, error = function(e) {
    message("Error with year ", year, ": ", e$message)
    return(NULL)
  })
}

# Initialize an empty list to store the data
all_data_us <- list()

# Loop over each year, fetching data for all U.S. ZCTAs
for (year in years) {
  data <- get_population_data_us(year)
  if (!is.null(data)) {
    all_data_us[[as.character(year)]] <- data
  }
}

# Combine the data from different years
combined_data_us <- bind_rows(all_data_us)

# Save the combined data to a CSV file on your desktop
write.csv(combined_data_us, "~/Desktop/Population_by_ZIP_All_US_2011_2022.csv", row.names = FALSE)
You don't even need to do that. All you need is a vector of variable IDs. Optionally, you can use a named vector to rename the variable IDs to whatever you want in your output.
Here's a minimal…
From the documentation for the `tidycensus` R package (https://walker-data.com/tidycensus/articles/basic-usage.html#working-with-acs-data), it looks like you can pass a `variables` argument to the `get_acs` function:
vt <- get_acs(geography = "county", variables = c(medincome = "B19013_001"), state = "VT", year = 2021)
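Applied to the ZCTA loop in the question, the same `variables` argument could look roughly like the sketch below. It is not tested against the live API. `acs_vars` and `get_zcta_vars` are made-up names for illustration, and the two variable IDs are simply the ones already mentioned in this thread (`B01003_001` from the question's table, `B19013_001` from the documentation example); swap in whichever IDs you actually need.

```r
# Named vector: the names replace the variable IDs in the output,
# the values are the ACS variable IDs to request.
acs_vars <- c(
  total_pop = "B01003_001",  # total population (table from the question)
  medincome = "B19013_001"   # median household income (from the docs example)
)

# Hypothetical helper: fetch only `vars` for all ZCTAs in a single year.
# Returns NULL (with a message) if the request for that year fails.
get_zcta_vars <- function(year, vars = acs_vars) {
  tryCatch({
    out <- tidycensus::get_acs(
      geography = "zip code tabulation area",
      variables = vars,   # instead of table = "..."
      year = year,
      survey = "acs5",
      output = "wide"
    )
    out$Year <- year
    out
  }, error = function(e) {
    message("Error with year ", year, ": ", conditionMessage(e))
    NULL
  })
}

# Usage (needs a Census API key and network access):
# years <- 2011:2022
# combined <- dplyr::bind_rows(lapply(years, get_zcta_vars))
```

With `output = "wide"` and a named vector, the estimate and MOE columns should come back with the names you chose plus an `E`/`M` suffix (for example `total_popE`, `total_popM`) rather than the raw IDs.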