Question 1
The fertil2.csv data set contains information on a sample of 4,361 women living in the Republic of Botswana in the late 1980s. The variables in that data set that you will use in the analysis are as follows:
• children – number of living children
• educ – years of education
• age – age in years
• electric =1 if the woman has electricity in her home; = 0 otherwise
• tv =1 if the woman has a TV in her home; = 0 otherwise
• knowmeth =1 if the woman is aware of birth control methods; = 0 otherwise
a. Read the fertil2.csv dataset into R using the code below, after modifying the file path appropriately. The fertil2.csv file does not include column headers and has some missing values, which are coded as ".". Thus, we need to use the header=FALSE and na.strings = "." options in the read.csv function, as shown in the code below.
R Code Chunk # 1
# The first line of code reads in fertil2.csv: this csv does not have headers and has
# missing values for some variables and observations. Thus, we use the 'header' and
# 'na.strings' options.
# The other lines give names to each of the columns in the resulting fertil dataframe.
fertil <- read.csv("/Users/bxggse/Documents/RData/fertil2.csv", header = FALSE, na.strings = ".")
colnames(fertil) <- c("mnthborn", "yearborn", "age", "electric", "radio", "tv", "bicycle",
                      "educ", "ceb", "agefbrth", "children", "knowmeth", "usemeth", "monthfm",
                      "yearfm", "agefm", "idlnchld", "heduc", "agesq", "urban", "urbeduc",
                      "spirit", "protest", "catholic", "frsthalf", "educ0", "evermarr")
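To see what the header and na.strings options do, here is a minimal sketch on a made-up three-column extract. The values in raw and the object name demo are hypothetical and are not taken from fertil2.csv; only the read.csv options match the code above.

```r
# Hypothetical three-row, three-column extract (NOT the real fertil2.csv values).
# The second row has a missing age, coded as "." like in the exam data.
raw <- "5,64,24\n1,56,.\n7,58,30"

# header = FALSE: the first row is data, not column names.
# na.strings = ".": a lone "." is read as NA instead of as text.
demo <- read.csv(text = raw, header = FALSE, na.strings = ".")
colnames(demo) <- c("mnthborn", "yearborn", "age")

str(demo)              # age stays numeric because "." became NA
colSums(is.na(demo))   # number of missing values per column
```

Without na.strings = ".", the age column would be read as character because of the "." entry, and it could not be used directly in a regression.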
b. Use R code with the fertil2.csv dataset to estimate the parameters of the following regression model. Report the resulting parameter estimates.
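$$\text{children}_i = \beta_0 + \beta_1 \cdot \text{educ}_i + \beta_2 \cdot \text{age}_i + \beta_3 \cdot \text{age}_i^2 + \beta_4 \cdot \text{electric}_i + \beta_5 \cdot \text{tv}_i + \beta_6 \cdot \text{knowmeth}_i + u_i$$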
c. Precisely interpret each of the statistically significant parameter estimates from part b. Use the 5 percent level of significance to determine which parameter estimates are statistically significant.[1]
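One possible way to set up parts b and c in R is sketched below. It is illustrative only: to stay self-contained it runs on a simulated stand-in data frame (sim, with invented coefficients and sample size), so its output says nothing about the Botswana data. With the real data, the same lm formula would be applied to the fertil data frame; the existing agesq column could presumably replace I(age^2).

```r
set.seed(1)
n <- 500

# Simulated stand-in data with the same variable names as the exam data.
# All numbers here are assumptions made for illustration.
sim <- data.frame(
  educ     = sample(0:12, n, replace = TRUE),
  age      = sample(15:49, n, replace = TRUE),
  electric = rbinom(n, 1, 0.15),
  tv       = rbinom(n, 1, 0.10),
  knowmeth = rbinom(n, 1, 0.90)
)
sim$children <- with(sim, -4 - 0.1 * educ + 0.3 * age - 0.003 * age^2 + rnorm(n))

# I(age^2) adds the quadratic age term inside the formula.
fit <- lm(children ~ educ + age + I(age^2) + electric + tv + knowmeth, data = sim)

# Coefficient table: estimate, standard error, t value, Pr(>|t|).
coef_table <- summary(fit)$coefficients
print(coef_table)

# Keep only the rows whose p-value is at or below the 5 percent threshold.
sig <- coef_table[coef_table[, "Pr(>|t|)"] <= 0.05, , drop = FALSE]
print(sig)
```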
d. Using your intuition for how the relevant variables are likely to be correlated with one another, explain how you expect that the tv parameter estimate is affected by the omission of a control (independent variable) for income in the regression model. Explain your intuition in detail. (To put this in context, it might be helpful to know that at the time that these data were generated, televisions were considered to be luxury items for most households in Botswana.)
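The mechanics of omitted variable bias can be explored with a small simulation. The sketch below is not based on the exam data: the variable income, the assumed positive link between income and TV ownership, the assumed negative effect of income on children, and the true TV effect of zero are all assumptions chosen for illustration.

```r
set.seed(42)
n <- 5000

# Assumption: income is unobserved in the real data; here we simulate it.
income <- rnorm(n)

# Assumption: richer households are more likely to own a TV.
tv <- as.integer(income + rnorm(n) > 1)

# Assumption: income lowers the number of children; the true tv effect is zero.
children <- 3 - 0.5 * income + 0 * tv + rnorm(n)

short <- lm(children ~ tv)            # income omitted
long  <- lm(children ~ tv + income)   # income controlled for

coef(short)["tv"]   # picks up part of the income effect
coef(long)["tv"]    # should be close to the assumed true value of zero
```

Under these assumed signs, the tv estimate from the short regression is pulled below the tv estimate from the long regression. Different assumptions about the two correlations would change the direction of the bias.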
e. Consider the sign of the parameter estimate that you found for knowmeth. Does the sign of this estimate make intuitive sense? Explain your answer.
f. The sample regression equation for a simple regression model with children as the dependent variable and yrseduc as the independent variable is represented by the black line in Figure 1 below. On that figure, label clearly the following features:
i. The predicted number of children for Individual A
ii. The residual for Individual A. Be sure to make clear whether this residual is positive or negative.[2]
iii. The part of $(\text{children}_i - \overline{\text{children}})$ that is explained by the model, where $\overline{\text{children}}$ is the sample average of the children variable.
Figure 1: Question 1 f
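The three quantities in part f are linked by a decomposition of each observation's deviation from the sample mean into an explained part and a residual. The sketch below checks this numerically on simulated data; the data frame d and its coefficients are invented and do not reproduce Figure 1.

```r
set.seed(7)

# Simulated stand-in for a simple regression of children on education.
d <- data.frame(educ = runif(50, 0, 12))
d$children <- 4 - 0.2 * d$educ + rnorm(50)

fit <- lm(children ~ educ, data = d)

predicted <- fitted(fit)                   # predicted children for each individual
residual  <- resid(fit)                    # actual minus predicted
explained <- predicted - mean(d$children)  # part of the deviation explained by the model
total     <- d$children - mean(d$children) # deviation from the sample mean

# total deviation = explained part + residual, observation by observation
all.equal(unname(total), unname(explained + residual))

# A point above the fitted line has a positive residual, one below a negative residual.
head(data.frame(children = d$children, predicted, residual, explained))
```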
——————————————————————————————————————
Question 2
a. Use the following expression for the Ordinary Least Squares (OLS) slope estimator to clearly explain the concept of an unbiased estimator.
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
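Unbiasedness is a statement about the average of the slope estimates over repeated samples, which a Monte Carlo simulation can illustrate. The sketch below assumes a true model with beta0 = 1 and beta1 = 2 and errors drawn independently of x; these values, the sample size, and the number of replications are arbitrary choices for illustration.

```r
set.seed(123)

beta0 <- 1    # assumed true intercept
beta1 <- 2    # assumed true slope
n     <- 100
x     <- runif(n)   # regressor values held fixed across replications

# Draw a fresh error vector each time and recompute the OLS slope.
slopes <- replicate(2000, {
  u <- rnorm(n)                 # errors with mean zero, independent of x
  y <- beta0 + beta1 * x + u
  sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
})

mean(slopes)    # average estimate across samples; compare with beta1
range(slopes)   # individual estimates still differ from beta1
```

Any single estimate misses beta1 because the second term in the expression above is nonzero in a given sample. That term averages out to zero across samples when the errors have mean zero conditional on x.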
b. Referring to the two restrictions on the residuals that are used to derive the Ordinary Least Squares (OLS) estimators, explain which of the two lines (Line A or Line B) in Figure 2 below is more likely to be the sample regression line for the plotted sample data. Be clear about which restriction each line seems to satisfy and which it does not. $\bar{x}$ and $\bar{y}$ denote the sample mean values for x and y, respectively.
Figure 2: Question 2 b
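The two restrictions are usually stated as: the residuals sum to zero, and the residuals are uncorrelated with x in the sample (the sum of x times the residual is zero). Assuming that is the pair intended here, the sketch below verifies both numerically on simulated data that is unrelated to Figure 2.

```r
set.seed(99)

# Simulated sample; the coefficients are arbitrary.
x <- rnorm(40, mean = 10, sd = 3)
y <- 2 + 0.5 * x + rnorm(40)

fit   <- lm(y ~ x)
u_hat <- resid(fit)

# Restriction 1: residuals sum to zero (up to floating-point error).
sum(u_hat)

# Restriction 2: residuals are orthogonal to x (up to floating-point error).
sum(x * u_hat)

# A consequence of restriction 1: the fitted line passes through (mean(x), mean(y)).
gap_at_means <- unname(predict(fit, newdata = data.frame(x = mean(x)))) - mean(y)
gap_at_means
```

A candidate line that misses the point of sample means cannot satisfy the first restriction. A line that passes through it but has the wrong slope leaves residuals that are systematically related to x.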
[1] A parameter estimate is statistically significant at the 5 percent level if its probability value ("Pr(>|t|)") is less than or equal to 0.05.
[2] Here are a few methods that you can use to submit your responses to Question 1 f: you can print out this page (with Figure 1) of the exam, write in your response, take a photo of that marked-up page, and submit that as part of your Exam 1 submission. Or, you can provide a clear, written description of the relevant features of this figure. Or, you can use the PDF version of this exam and use the basic annotation tools in your PDF reader to mark in and label the relevant features.
Sample Solution