Empirical Exercises 1
Using the data set Growth described in Empirical Exercise E4.1, but excluding the data for Malta, carry out the following exercises.

(a) Construct a table that shows the sample mean, standard deviation, and minimum and maximum values for the series Growth, TradeShare, YearsSchool, Oil, Rev_Coups, Assassinations, and RGDP60. Include the appropriate units for all entries. [Hint: Some initial R code is written below. Complete the remaining part.]
library(readxl)
library(lmtest)

library(sandwich)

growth.dat = read_excel("Growth.xlsx")

# Drop Malta from the data
growth.dat = growth.dat[-65, ]

# Calculate mean (the first column is excluded because it is not numeric)
g.mean = apply(growth.dat[,-1], 2, mean)

# Calculate standard deviation

# Calculate minimum

# Calculate maximum
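As a minimal sketch of how the remaining statistics could be assembled into one table, the block below applies the same column-by-column idea to a small made-up data frame. The object toy.dat, its column names, and its values are hypothetical stand-ins for growth.dat, chosen only so the example runs without Growth.xlsx.

```r
# Hypothetical stand-in for growth.dat; the real values come from Growth.xlsx.
toy.dat = data.frame(
  country_name = paste0("C", 1:6),
  growth       = c(1.2, 2.5, -0.4, 3.1, 0.8, 1.9),
  tradeshare   = c(0.35, 0.62, 0.48, 0.90, 0.27, 0.55)
)

# Drop the non-numeric first column, then apply each statistic to every column.
num.dat = toy.dat[, -1]
summary.tab = data.frame(
  mean = sapply(num.dat, mean),
  sd   = sapply(num.dat, sd),
  min  = sapply(num.dat, min),
  max  = sapply(num.dat, max)
)

# One row per variable, one column per statistic.
print(round(summary.tab, 3))
```

The same pattern should carry over to growth.dat[,-1]; the units for each row still have to be added by hand from the data description.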

(b) Run a regression of Growth on TradeShare, YearsSchool, Rev_Coups, Assassinations, and RGDP60. What is the value of the coefficient on Rev_Coups? Interpret the value of this coefficient. Is it large or small in a real-world sense? Answer: [Your answer will be here.]
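The following sketch shows one way to fit a multiple regression with lm() and pull out a single coefficient. It uses simulated data with only two regressors; the variable names, the sample size, and the data-generating process are all assumptions made for illustration, not values from Growth.xlsx.

```r
# Simulated stand-in data; the coefficients below are invented for the example.
set.seed(42)
n = 64
toy.dat = data.frame(
  tradeshare = runif(n, 0.1, 1.2),
  rev_coups  = runif(n, 0, 1)
)
toy.dat$growth = 1 + 1.5 * toy.dat$tradeshare - 2 * toy.dat$rev_coups + rnorm(n)

# Fit the regression and extract the coefficient of interest by name.
fit = lm(growth ~ tradeshare + rev_coups, data = toy.dat)
b.rev = coef(fit)["rev_coups"]

print(summary(fit)$coefficients)
cat("Estimated change in growth per one-unit increase in rev_coups:", b.rev, "\n")
```

The standard errors printed by summary() are homoskedasticity-only. Since lmtest and sandwich are loaded above, a heteroskedasticity-robust version may be what the exercise intends; that variant is not shown here.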

(c) Use the regression to predict the average annual growth rate for a country that has average values for all regressors.

(d) Repeat (c), but now assume that the country’s value for TradeShare is one standard deviation above the mean.
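The two predictions can be sketched with predict() and a one-row data frame of regressor values. The data here are simulated and the variable names are hypothetical; only the mechanics are meant to carry over.

```r
# Simulated stand-in data; names and coefficients are assumptions.
set.seed(7)
n = 64
toy.dat = data.frame(
  tradeshare  = runif(n, 0.1, 1.2),
  yearsschool = runif(n, 0, 10)
)
toy.dat$growth = 0.5 + 1.5 * toy.dat$tradeshare + 0.2 * toy.dat$yearsschool + rnorm(n)
fit = lm(growth ~ tradeshare + yearsschool, data = toy.dat)

# Regressor means, arranged as a one-row data frame for predict().
x.mean = as.data.frame(t(colMeans(toy.dat[, c("tradeshare", "yearsschool")])))
pred.mean = predict(fit, newdata = x.mean)

# Same country, but tradeshare one sample standard deviation above its mean.
x.plus = x.mean
x.plus$tradeshare = x.mean$tradeshare + sd(toy.dat$tradeshare)
pred.plus = predict(fit, newdata = x.plus)

cat("Prediction at the means:      ", pred.mean, "\n")
cat("Prediction, tradeshare + 1 sd:", pred.plus, "\n")
```

When the regression includes an intercept, the prediction at the regressor means equals the sample mean of the dependent variable, and the second prediction differs from the first by the tradeshare coefficient times its standard deviation. Both facts are useful checks on the answer.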

(e) Why is Oil omitted from the regression? What would happen if it were included?
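One thing worth checking is whether Oil varies at all in the sample. The sketch below, on simulated data, shows what lm() does when a regressor takes the same value for every observation. The column named oil is a hypothetical stand-in, set to zero throughout purely for illustration.

```r
# Simulated data with a regressor that never varies.
set.seed(3)
n = 50
toy.dat = data.frame(x = rnorm(n), oil = rep(0, n))
toy.dat$y = 1 + 2 * toy.dat$x + rnorm(n)

# A regressor with zero variance is perfectly collinear with the intercept.
cat("Sample variance of oil:", var(toy.dat$oil), "\n")

fit = lm(y ~ x + oil, data = toy.dat)
print(coef(fit))  # lm() reports NA for a coefficient it cannot estimate
```

R does not stop with an error in this situation; it drops the redundant column and reports NA, so the problem is easy to overlook.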

Empirical Exercises 2
Use Earnings_and_Height.xlsx. In the empirical exercises on earnings and height in Chapters 4 and 5, you estimated a relatively large and statistically significant effect of a worker’s height on his or her earnings. One explanation for this result is omitted variable bias: Height is correlated with an omitted factor that affects earnings. For example, Case and Paxson (2008) suggest that cognitive ability (or intelligence) is the omitted factor. The mechanism they describe is straightforward: Poor nutrition and other harmful environmental factors in utero and in early childhood have, on average, deleterious effects on both cognitive and physical development. Cognitive ability affects earnings later in life and thus is an omitted variable in the regression.

Suppose that the mechanism described above is correct. Explain how this leads to omitted variable bias in the OLS regression of Earnings on Height. Does the bias lead the estimated slope to be too large or too small?
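A small simulation can make the mechanism concrete. In the sketch below, an unobserved variable called ability raises both height and earnings. Every number in the data-generating process is invented, so only the direction of the comparison matters, not the magnitudes.

```r
# Simulated data: ability affects both height and earnings (all values assumed).
set.seed(123)
n = 5000
ability  = rnorm(n)                                   # unobserved in the real data
height   = 65 + 1.0 * ability + rnorm(n, sd = 3)
earnings = 30000 + 200 * height + 5000 * ability + rnorm(n, sd = 8000)

short = lm(earnings ~ height)            # ability omitted
long  = lm(earnings ~ height + ability)  # infeasible with the actual data set

cat("Height slope, ability omitted: ", coef(short)["height"], "\n")
cat("Height slope, ability included:", coef(long)["height"], "\n")
```

In this setup the omitted factor is positively correlated with height and has a positive effect on earnings, and the short regression attributes part of that effect to height.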
If the mechanism described above is correct, the estimated effect of height on earnings should disappear if a variable measuring cognitive ability is included in the regression. Unfortunately, there isn’t a direct measure of cognitive ability in the data set, but the data set does include years of education for each individual. Because students with higher cognitive ability are more likely to attend school longer, years of education might serve as a control variable for cognitive ability; in this case, including education in the regression will eliminate, or at least attenuate, the omitted variable bias problem.

Use the years of education variable (educ) to construct four indicator variables for whether a worker has less than a high school diploma (LT_HS = 1 if educ < 12, 0 otherwise), a high school diploma (HS = 1 if educ = 12, 0 otherwise), some college (Some_Col = 1 if 12 < educ < 16, 0 otherwise), or a bachelor’s degree or higher (College = 1 if educ ≥ 16, 0 otherwise). [Hint: Complete the remaining parts of the R code.]

I will show how to generate LT_HS. You need to generate the other binary variables in a similar way.

library(readxl)
library(lmtest)
library(sandwich)
library(car)

h.dat = read_excel("Earnings_and_Height.xlsx")
attach(h.dat)

lt_hs = as.numeric(educ < 12)
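The same as.numeric() pattern extends to the other three categories. The sketch below uses a short made-up vector, educ.toy, in place of the real educ column, so the names ending in .toy are hypothetical.

```r
# Hypothetical years-of-education values; in the exercise, educ comes from h.dat.
educ.toy = c(8, 11, 12, 12, 13, 15, 16, 18)

lt_hs.toy    = as.numeric(educ.toy < 12)
hs.toy       = as.numeric(educ.toy == 12)                 # equality test uses ==, not =
some_col.toy = as.numeric(educ.toy > 12 & educ.toy < 16)  # two comparisons joined by &
college.toy  = as.numeric(educ.toy >= 16)

# Each worker should fall into exactly one category.
print(cbind(educ.toy, lt_hs.toy, hs.toy, some_col.toy, college.toy))
stopifnot(all(lt_hs.toy + hs.toy + some_col.toy + college.toy == 1))
```

Note that R does not accept the chained form 12 < educ < 16 as a range check; the two inequalities have to be written separately and combined with &.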
Focusing first on women only, run a regression of (1) Earnings on Height and (2) Earnings on Height, including LT_HS, HS, and Some_Col as control variables.
Compare the estimated coefficient on Height in regressions (1) and (2). Is there a large change in the coefficient? Has it changed in a way consistent with the cognitive ability explanation? Explain.

The regression omits the control variable College. Why?
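To see what happens when all four indicators enter a regression that also has an intercept, the sketch below fits such a model on simulated data. The outcome y and its coefficients are invented for the example.

```r
# Simulated data: every education category is represented (values assumed).
set.seed(11)
n = 220
educ.toy = rep(8:18, length.out = n)
d = data.frame(
  lt_hs    = as.numeric(educ.toy < 12),
  hs       = as.numeric(educ.toy == 12),
  some_col = as.numeric(educ.toy > 12 & educ.toy < 16),
  college  = as.numeric(educ.toy >= 16)
)
d$y = 10 + 2 * d$hs + 4 * d$some_col + 8 * d$college + rnorm(n)

# The four indicators sum to 1 for every worker, duplicating the intercept column.
fit.all = lm(y ~ lt_hs + hs + some_col + college, data = d)
print(coef(fit.all))  # one coefficient is reported as NA
```

Leaving one category out removes the redundancy, and the coefficients on the remaining indicators are then measured relative to the omitted group.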

Test the joint null hypothesis that the coefficients on the education variables are equal to 0.
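A joint test of several coefficients can be sketched by comparing a restricted and an unrestricted model with anova(). The data below are simulated, and this version is the homoskedasticity-only F test. Because car and sandwich are loaded above, a heteroskedasticity-robust test may be what is intended; that variant is not shown here.

```r
# Simulated data; variable names mirror the exercise but all values are assumed.
set.seed(5)
n = 330
educ.toy = rep(8:18, length.out = n)
d = data.frame(
  height   = rnorm(n, mean = 65, sd = 3),
  lt_hs    = as.numeric(educ.toy < 12),
  hs       = as.numeric(educ.toy == 12),
  some_col = as.numeric(educ.toy > 12 & educ.toy < 16)
)
d$earnings = 20000 + 300 * d$height - 15000 * d$lt_hs - 10000 * d$hs -
  5000 * d$some_col + rnorm(n, sd = 10000)

# Restricted model imposes the null: all three education coefficients are zero.
restricted   = lm(earnings ~ height, data = d)
unrestricted = lm(earnings ~ height + lt_hs + hs + some_col, data = d)

f.test = anova(restricted, unrestricted)
print(f.test)  # the second row holds the F statistic and its p-value
```

The number of restrictions shows up in the Df column, which should equal the number of education indicators being tested.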

Discuss the values of the estimated coefficients on LT_HS, HS, and Some_Col. (Each of the estimated coefficients is negative, and the coefficient on LT_HS is more negative than the coefficient on HS, which in turn is more negative than the coefficient on Some_Col. Why? What do the coefficients measure?)

Empirical Exercises 3
Use the data set cps12.xlsx to answer the following questions.

(a) Run a regression of average hourly earnings (AHE) on age (Age). What is the estimated intercept? What is the estimated slope?

(b) Run a regression of AHE on Age, gender (Female), and education (Bachelor). What is the estimated effect of Age on earnings? Construct a 95% confidence interval for the coefficient on Age in the regression.
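A confidence interval for a single coefficient can be obtained with confint(). The sketch below runs on simulated data with hypothetical lowercase variable names, and the interval uses homoskedasticity-only standard errors; a robust interval would need the standard error from a robust covariance matrix instead.

```r
# Simulated stand-in for the CPS data; all values are assumptions.
set.seed(21)
n = 500
d = data.frame(
  age      = sample(25:34, n, replace = TRUE),
  female   = rbinom(n, 1, 0.45),
  bachelor = rbinom(n, 1, 0.50)
)
d$ahe = 5 + 0.5 * d$age - 3 * d$female + 8 * d$bachelor + rnorm(n, sd = 8)

fit = lm(ahe ~ age + female + bachelor, data = d)

# 95% confidence interval for the coefficient on age.
ci.age = confint(fit, "age", level = 0.95)
print(ci.age)
```

The interval is the estimate plus or minus the t critical value times the standard error, so it can be checked by hand from the summary() output.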

(c) Are the results from the regression in (b) substantively different from the results in (a) regarding the effect of Age on AHE? Does the regression in (a) seem to suffer from omitted variable bias?

(d) Bob is a 26-year-old male worker with a high school diploma. Predict Bob’s earnings using the estimated regression in (b). Alexis is a 30-year-old female worker with a college degree. Predict Alexis’s earnings using the regression.
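Predictions for specific workers can be sketched by passing their characteristics to predict() as a small data frame. The data are simulated, and the coding is an assumption: female = 0 for a male worker, and bachelor = 0 for a high school diploma versus 1 for a college degree.

```r
# Simulated stand-in for the CPS data; all values are assumptions.
set.seed(21)
n = 500
d = data.frame(
  age      = sample(25:34, n, replace = TRUE),
  female   = rbinom(n, 1, 0.45),
  bachelor = rbinom(n, 1, 0.50)
)
d$ahe = 5 + 0.5 * d$age - 3 * d$female + 8 * d$bachelor + rnorm(n, sd = 8)
fit = lm(ahe ~ age + female + bachelor, data = d)

# One row per person; row names label the predictions.
people = data.frame(
  age      = c(26, 30),
  female   = c(0, 1),
  bachelor = c(0, 1),
  row.names = c("Bob", "Alexis")
)
pred = predict(fit, newdata = people)
print(pred)
```

Each prediction is just the intercept plus the coefficients multiplied by that person's values, which makes it easy to verify by hand.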

(e) Are gender and education determinants of earnings? Test the null hypothesis that Female can be deleted from the regression. Test the null hypothesis that Bachelor can be deleted from the regression. Test the null hypothesis that both Female and Bachelor can be deleted from the regression.
