American Community Survey (ACS)

Preliminaries:
In your web browser, navigate to: https://www.eia.gov/maps/layer_info-m.php. Once there, download these two shapefiles: Petroleum Product Pipelines and Natural Gas Interstate and Intrastate Pipelines.

These come as zipped directories, which you should unzip after downloading. You will use these files in the steps below.
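As a rough sketch of the unzip-and-load step in R, the helper below extracts an archive and reads the single .shp file it contains with sf. The helper name read_zipped_shapefile and both archive file names are hypothetical; the real EIA downloads will have their own names, so substitute them.

```r
library(sf)

# Unzip an archive into its own folder and read the single .shp file inside it.
# (Hypothetical helper, not part of sf.)
read_zipped_shapefile <- function(zip_path, exdir = tempfile("shp_")) {
  unzip(zip_path, exdir = exdir)
  shp <- list.files(exdir, pattern = "\\.shp$", full.names = TRUE, recursive = TRUE)
  stopifnot(length(shp) == 1)
  st_read(shp, quiet = TRUE)
}

# Hypothetical file names; substitute the names of the archives you downloaded.
petroleum_zip <- "petroleum_product_pipelines.zip"
natgas_zip <- "natural_gas_pipelines.zip"

if (file.exists(petroleum_zip) && file.exists(natgas_zip)) {
  petroleum <- read_zipped_shapefile(petroleum_zip)
  natgas <- read_zipped_shapefile(natgas_zip)
  print(st_crs(petroleum))  # check the CRS before doing any spatial work
}
```

Checking the coordinate reference system early is worthwhile, because the pipeline layers and the tract data need to share a CRS before they can be intersected.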

0) Show how you load the 2019 American Community Survey (ACS) 5-year estimates of census tract data on race and Hispanic/Latino ethnicity for the following Colorado counties: Denver, Boulder, Arapahoe, Adams, Jefferson, Broomfield, and Douglas.

1) Show how you calculate the proportion Black/African-American and proportion Hispanic in each census tract. Display these proportions for the first ten census tracts.

2) Show how you remove the race/ethnicity-specific estimate and margin-of-error columns. Display the first ten rows of the sf object before and after removing these columns.

3) Show how you calculate the area and population density of each tract. Display the first ten rows of these two columns.

4) Show how you subset the tract data to retain only tracts with a population density above 1,000 people per square MILE. These are considered “urban” tracts. Display the number of rows of the sf object before and after you subset.

5) Show how you delete the original census tract data and free up space in your computer’s memory/RAM.

6) Show how you find the bounding box around the urban tracts.

7) Show how you cut down the two pipeline data sets to the area of the bounding box. Display the number of rows of the two pipeline sf objects before and after subsetting.

8) Show how you calculate the number of petroleum product pipelines that overlap each urban tract (hint: you may need to use the rowSums() function). Display the first ten values of the resulting column.

9) Show how you calculate the number of natural gas pipelines that overlap each urban tract (hint: you may need to use the rowSums() function). Display the first ten values of the resulting column.

10) Show how you calculate the total length of petroleum product pipelines in each urban tract. Display the first ten values of the resulting column.

11) Show how you calculate the total length of natural gas pipelines in each urban tract. Display the first ten values of the resulting column.

12) Show how you combine the pipeline count and length data with the race/ethnicity proportion data (hint: make sure you don’t drop any urban tracts). Display the first ten rows of the merged sf object.

13) Show how you save and then reload this combined data set, including the tract geometry.

14) Show how you use ggplot2 to create scatter plots of pipeline counts/lengths vs. race/ethnicity proportions (8 plots in all: four pipeline measures against two proportions). Include a best-fit linear regression line in each plot. Display the plots.

15) Show how you regress petroleum product pipeline length in urban tracts against proportion Black/African-American and proportion Hispanic controlling for tract area, and show the results including p-values and confidence intervals (this should be a single regression model that includes three predictor variables). Explain your interpretation of the results.

16) Show how you regress natural gas pipeline length in urban tracts against proportion Black/African-American and proportion Hispanic controlling for tract area, and show the results including p-values and confidence intervals (this should be a single regression model that includes three predictor variables). Explain your interpretation of the results.
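The spatial steps above (3 through 8, and 10) can be tried out on toy data before running them on the real tracts and pipelines. The sketch below is only an illustration: the three square "tracts", the three "pipelines", the populations, and all column names are made up, and EPSG:26913 (NAD83 / UTM zone 13N) is assumed only as a convenient metre-based projected CRS for Colorado. It is one possible approach, not the assignment's intended solution.

```r
library(sf)

# Build an axis-aligned square polygon from its lower-left corner (metres).
square <- function(x0, y0, side = 2000) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + side, y0), c(x0 + side, y0 + side), c(x0, y0 + side), c(x0, y0)
  )))
}

# Toy tracts (made-up values) in EPSG:26913, a metre-based projected CRS.
tracts <- st_sf(
  GEOID = c("A", "B", "C"),
  total_pop = c(5000, 3000, 500),
  geometry = st_sfc(square(500000, 4400000), square(502000, 4400000),
                    square(504000, 4400000), crs = 26913)
)

# Toy pipelines (made-up): one crossing all tracts, one ending inside A, one far away.
pipes <- st_sf(
  pipe_id = 1:3,
  geometry = st_sfc(
    st_linestring(rbind(c(499000, 4401000), c(505000, 4401000))),
    st_linestring(rbind(c(501000, 4399000), c(501000, 4401500))),
    st_linestring(rbind(c(600000, 4500000), c(601000, 4500000))),
    crs = 26913
  )
)

# Step 3: area in square miles (1 sq mi = 2,589,988.11 m^2) and population density.
tracts$area_sqmi <- as.numeric(st_area(tracts)) / 2589988.11
tracts$pop_per_sqmi <- tracts$total_pop / tracts$area_sqmi

# Step 4: keep only "urban" tracts (density above 1,000 per square mile).
urban <- tracts[tracts$pop_per_sqmi > 1000, ]

# Step 5: drop the full tract object and ask R to release the memory.
rm(tracts)
invisible(gc())

# Steps 6-7: bounding box of the urban tracts, then keep pipelines that touch it.
urban_bbox <- st_bbox(urban)
pipes_bb <- pipes[st_as_sfc(urban_bbox), ]

# Step 8: logical tract-by-pipeline matrix; row sums count pipelines per tract.
hits <- st_intersects(urban, pipes_bb, sparse = FALSE)
urban$n_pipes <- rowSums(hits)

# Step 10: clip pipelines to tracts, then sum the clipped lengths per tract.
pieces <- suppressWarnings(st_intersection(pipes_bb, urban["GEOID"]))
pieces$len_m <- as.numeric(st_length(pieces))
urban$pipe_len_m <- unname(vapply(
  urban$GEOID, function(id) sum(pieces$len_m[pieces$GEOID == id]), numeric(1)
))

print(st_drop_geometry(urban))
```

Summing lengths with sum() over a possibly empty subset returns 0 for tracts that no pipeline crosses, so no urban tract is dropped or left as NA. Steps 9 and 11 would repeat the same pattern with the natural gas layer.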
