Hypothesis Testing in Econometrics

  • We now turn to more general forms of hypothesis testing that are not usually reported automatically in basic regression output.

Wald Test

  • Consider:

    • $G$: the number of linear restrictions;
    • $\boldsymbol{\beta}$: the $(K+1) \times 1$ parameter vector;
    • $\boldsymbol{h}$: a $G \times 1$ vector of constants;
    • $\boldsymbol{R}$: a $G \times (K+1)$ matrix made up of row vectors $\boldsymbol{r}'_g$ of dimension $1 \times (K+1)$, for $g=1, 2, ..., G$;
    • the multivariate model:
    $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_K x_K + u$$
  • Using these matrices and vectors, we can write hypothesis tests of the form: \begin{align} \text{H}_0: &\underset{G\times (K+1)}{\boldsymbol{R}} \underset{(K+1)\times 1}{\boldsymbol{\beta}} = \underset{G \times 1}{\boldsymbol{h}} \\ \text{H}_0: &\left[ \begin{matrix} \boldsymbol{r}'_1 \\ \boldsymbol{r}'_2 \\ \vdots \\ \boldsymbol{r}'_{G} \end{matrix} \right] \boldsymbol{\beta} = \left[ \begin{matrix} h_1 \\ h_2 \\ \vdots \\ h_G \end{matrix} \right] \\ \text{H}_0: &\left\{ \begin{matrix} \boldsymbol{r}'_1 \boldsymbol{\beta} = h_1 \\ \boldsymbol{r}'_2 \boldsymbol{\beta} = h_2 \\ \vdots \\ \boldsymbol{r}'_G \boldsymbol{\beta} = h_G \end{matrix} \right. \end{align}

A Single Linear Restriction

  • Consider the model: $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$

  • There are $K=2$ explanatory variables, hence 3 parameters.

  • One linear restriction implies $G=1$.

  • In this specific case, $$\boldsymbol{R} = \boldsymbol{r}'_1\ \implies\ \text{H}_0:\ \boldsymbol{r}'_1 \boldsymbol{\beta} = h_1 $$

Example 1: H$_0: \ \beta_1 = 4$

  • Here $h_1 = 4$
  • The vector $r'_1$ can be written as
$$ r'_1 = \left[ \begin{matrix} 0 & 1 & 0 \end{matrix} \right] $$
  • So the null hypothesis is $$\text{H}_0:\ \left[ \begin{matrix} 0 & 1 & 0 \end{matrix} \right] \left[ \begin{matrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{matrix} \right] = 4\ \iff\ \beta_1 = 4 $$

Example 2: H$_0: \ \beta_1 + \beta_2 = 2$

  • Here $h_1 = 2$
  • The vector $r'_1$ can be written as
$$ r'_1 = \left[ \begin{matrix} 0 & 1 & 1 \end{matrix} \right] $$
  • So the null hypothesis is $$\text{H}_0:\ \left[ \begin{matrix} 0 & 1 & 1 \end{matrix} \right] \left[ \begin{matrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{matrix} \right] = 2\ \iff\ \beta_1 + \beta_2 = 2 $$

Example 3: H$_0: \ \beta_1 = \beta_2$

  • Note that $$\beta_1 = \beta_2 \iff \beta_1 - \beta_2 = 0 $$

  • Therefore $h_1 = 0$

  • The vector $r'_1$ can be written as

$$ r'_1 = \left[ \begin{matrix} 0 & 1 & -1 \end{matrix} \right] $$
  • So the null hypothesis is $$\text{H}_0:\ \left[ \begin{matrix} 0 & 1 & -1 \end{matrix} \right] \left[ \begin{matrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{matrix} \right] = 0\ \iff\ \beta_1 - \beta_2 = 0 $$
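The three single-restriction examples above translate directly into R row vectors. A minimal sketch using base R only, assuming the coefficient ordering $(\beta_0, \beta_1, \beta_2)$ and purely illustrative coefficient values:

```r
# Each restriction is a 1 x (K+1) row vector; coefficient order is (beta0, beta1, beta2)
r1_ex1 = matrix(c(0, 1,  0), nrow=1)  # Example 1: selects beta1
r1_ex2 = matrix(c(0, 1,  1), nrow=1)  # Example 2: selects beta1 + beta2
r1_ex3 = matrix(c(0, 1, -1), nrow=1)  # Example 3: selects beta1 - beta2

# Applied to a hypothetical coefficient vector, each row picks out the tested combination
beta = matrix(c(1, 4, -2), ncol=1)    # illustrative values only
r1_ex1 %*% beta  # beta1         = 4
r1_ex2 %*% beta  # beta1 + beta2 = 2
r1_ex3 %*% beta  # beta1 - beta2 = 6
```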

Implementing It in R

Evaluating the Null with a Single Restriction

  • In the case of a single restriction, we assume that $$ \boldsymbol{r}'_1 \hat{\boldsymbol{\beta}} \sim N(\boldsymbol{r}'_1 \boldsymbol{\beta};\ \boldsymbol{r}'_1 \boldsymbol{V}_{\hat{\beta}} \boldsymbol{r}_1)$$

  • The corresponding t statistic is $$ t = \frac{\boldsymbol{r}'_1 \hat{\boldsymbol{\beta}} - h_1}{\sqrt{\boldsymbol{r}'_1 S^2 (\boldsymbol{X}'\boldsymbol{X})^{-1} \boldsymbol{r}_1}} = \frac{\boldsymbol{r}'_1 \hat{\boldsymbol{\beta}} - h_1}{\sqrt{\boldsymbol{r}'_1 \boldsymbol{V}_{\hat{\beta}} \boldsymbol{r}_1}} $$

  • In small samples, we also need to assume that $ u|x \sim N(0; \sigma^2) $.

  • Choose a significance level $\alpha$ and reject the null if $|t|$ exceeds the two-sided critical value.

(Continued) Example 7.5: The Log Wage Equation (Wooldridge, 2006)
  • Earlier, we estimated the following model:

\begin{align} \log(\text{wage}) = &\beta_0 + \beta_1 \text{female} + \beta_2 \text{married} + \delta_2 \text{female*married} + \beta_3 \text{educ} +\\ &\beta_4 \text{exper} + \beta_5 \text{exper}^2 + \beta_6 \text{tenure} + \beta_7 \text{tenure}^2 + u \end{align} where:

  • wage: average hourly wage
  • female: dummy equal to 1 for women and 0 for men
  • married: dummy equal to 1 for married individuals and 0 for unmarried individuals
  • female*married: interaction between the female and married dummies
  • educ: years of education
  • exper: years of experience (expersq = years squared)
  • tenure: years with the current employer (tenursq = years squared)
# Load the required dataset
data(wage1, package="wooldridge")

# Estimate the model
res_7.14 = lm(lwage ~ female*married + educ + exper + expersq + tenure + tenursq, data=wage1)
round( summary(res_7.14)$coef, 4 )
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)      0.3214     0.1000  3.2135   0.0014
## female          -0.1104     0.0557 -1.9797   0.0483
## married          0.2127     0.0554  3.8419   0.0001
## educ             0.0789     0.0067 11.7873   0.0000
## exper            0.0268     0.0052  5.1118   0.0000
## expersq         -0.0005     0.0001 -4.8471   0.0000
## tenure           0.0291     0.0068  4.3016   0.0000
## tenursq         -0.0005     0.0002 -2.3056   0.0215
## female:married  -0.3006     0.0718 -4.1885   0.0000
  • We already know that the effect of marriage differs between women and men because the coefficient on female:married ( $\delta_2$) is significant.
  • However, to assess whether the effect of marriage on women’s wages is itself significant, we need to test whether H$_0 :\ \beta_2 + \delta_2 = 0$.
  • Since there is only one restriction, the hypothesis can be evaluated with a t test:
# Extract regression objects
bhat = matrix(coef(res_7.14), ncol=1) # coefficients as a column vector
Vbhat = vcov(res_7.14) # variance-covariance matrix of the estimator
N = nrow(wage1) # number of observations
K = length(bhat) - 1 # number of covariates
uhat = residuals(res_7.14) # regression residuals

# Create the row vector that defines the restriction
r1prime = matrix(c(0, 0, 1, 0, 0, 0, 0, 0, 1), nrow=1) # restriction vector
h1 = 0 # constant under H0
G = 1 # number of restrictions

# Compute the t test
t = (r1prime %*% bhat - h1) / sqrt(r1prime %*% Vbhat %*% t(r1prime))
abs(t)
##          [,1]
## [1,] 1.679475
# Compute the 5% two-sided critical value
c = qt(1 - 0.05/2, df=N-K-1)
c
## [1] 1.964563
# Compute the p-value
p = pt(-abs(t), N-K-1) * 2
p
##            [,1]
## [1,] 0.09366368
  • Since $|t| = 1.68$ is below the 5% critical value ( $\approx 1.96$), we do not reject the null hypothesis and conclude that the effect of marriage on women's wages ( $\beta_2 + \delta_2$) is not statistically significant.

  • We can also evaluate the same restriction using the Wald test and the $\chi^2$ distribution with 1 degree of freedom, since there is only one restriction ( $G=1$).

  • Remember that the chi-squared test is right-tailed.

# Compute the Wald statistic
aux = r1prime %*% bhat - h1 # R beta - h
w = t(aux) %*% solve( r1prime %*% Vbhat %*% t(r1prime)) %*% aux
w
##          [,1]
## [1,] 2.820636
# Compute the 5% chi-squared critical value
c = qchisq(1-0.05, df=G)
c
## [1] 3.841459
# Compute the p-value of w
p = 1 - pchisq(w, df=G)
p
##            [,1]
## [1,] 0.09305951
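Note that the two approaches agree: with a single restriction, the Wald statistic is exactly the squared t statistic, and the p-values are nearly identical (the small gap arises because the t test uses the t distribution while the Wald test uses the normal-based chi-squared). A quick check with the values computed above:

```r
# With G = 1, the Wald statistic is the square of the t statistic
t_stat = 1.679475  # t statistic computed above
t_stat^2           # ~ 2.820636, the Wald statistic w
```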

Multiple Linear Restrictions

Example 4: H$_0: \ \beta_1 = 0\ \text{ and }\ \beta_1 + \beta_2 = 2$

  • Here $h_1 = 0 \text{ and } h_2 = 2$
  • The vectors $r'_1 \text{ and } r'_2$ can be written as
$$ r'_1 = \left[ \begin{matrix} 0 & 1 & 0 \end{matrix} \right] \quad \text{and} \quad r'_2 = \left[ \begin{matrix} 0 & 1 & 1 \end{matrix} \right] $$
  • Therefore, $\boldsymbol{R}$ is $$ \boldsymbol{R} = \left[ \begin{matrix} \boldsymbol{r}'_1 \\ \boldsymbol{r}'_2 \end{matrix} \right] = \left[ \begin{matrix} 0 & 1 & 0 \\ 0 & 1 & 1 \end{matrix} \right] $$

  • So the null hypothesis is $$\text{H}_0:\ \boldsymbol{R} \boldsymbol{\beta} = \left[ \begin{matrix} 0 & 1 & 0 \\ 0 & 1 & 1 \end{matrix} \right] \left[ \begin{matrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{matrix} \right] = \left[ \begin{matrix} h_1 \\ h_2 \end{matrix} \right]\ \iff\ \text{H}_0:\ \left\{ \begin{matrix} \beta_1 &= 0 \\ \beta_1 + \beta_2 &= 2 \end{matrix} \right. $$
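Example 4 translates into R as follows. A sketch of the matrix construction only, assuming the three-parameter model above:

```r
# Stack the two restriction rows into R and the constants into h
R = matrix(c(0, 1, 0,    # beta1 = 0
             0, 1, 1),   # beta1 + beta2 = 2
           nrow=2, byrow=TRUE)
h = matrix(c(0, 2), ncol=1)
R
h
```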

Evaluating the Null with Multiple Restrictions

  • In the case of $G$ restrictions, we assume that $$ \boldsymbol{R} \hat{\boldsymbol{\beta}} \sim N(\boldsymbol{R} \boldsymbol{\beta};\ \boldsymbol{R} \boldsymbol{V}_{\hat{\beta}} \boldsymbol{R}')$$

  • The Wald statistic is $$ w(\hat{\boldsymbol{\beta}}) = \left[ \boldsymbol{R}\hat{\boldsymbol{\beta}} - \boldsymbol{h} \right]' \left[ \boldsymbol{R V_{\hat{\beta}} R}' \right]^{-1} \left[ \boldsymbol{R}\hat{\boldsymbol{\beta}} - \boldsymbol{h} \right]\ \sim\ \chi^2_{(G)} $$

  • Choose a significance level $\alpha$ and reject the null if the statistic $ w(\hat{\boldsymbol{\beta}})$ exceeds the critical value.


Implementing It in R

  • As an example, we use the mlb1 dataset with statistics for Major League Baseball players (Wooldridge, 2006, Section 4.5).
  • We want to estimate the model: \begin{align} \log(\text{salary}) = &\beta_0 + \beta_1 \text{years} + \beta_2 \text{gamesyr} + \beta_3 \text{bavg} + \\ &\beta_4 \text{hrunsyr} + \beta_5 \text{rbisyr} + u \end{align}

where:

  • log(salary): log of 1993 salary
  • years: years playing in Major League Baseball
  • gamesyr: average number of games per year
  • bavg: career batting average
  • hrunsyr: average home runs per year
  • rbisyr: average runs batted in per year
data(mlb1, package="wooldridge")

# Estimate the full (unrestricted) model
resMLB = lm(log(salary) ~ years + gamesyr + bavg + hrunsyr + rbisyr, data=mlb1)
round(summary(resMLB)$coef, 5) # estimated coefficients
##             Estimate Std. Error  t value Pr(>|t|)
## (Intercept) 11.19242    0.28882 38.75184  0.00000
## years        0.06886    0.01211  5.68430  0.00000
## gamesyr      0.01255    0.00265  4.74244  0.00000
## bavg         0.00098    0.00110  0.88681  0.37579
## hrunsyr      0.01443    0.01606  0.89864  0.36947
## rbisyr       0.01077    0.00717  1.50046  0.13440
  • Notice that bavg, hrunsyr, and rbisyr are individually statistically insignificant.

  • We want to evaluate whether they are jointly significant, that is, $$ \text{H}_0:\ \left\{ \begin{matrix} \beta_3 = 0 \\ \beta_4 = 0 \\ \beta_5 = 0\end{matrix} \right. $$

  • Therefore, $$ \boldsymbol{R} = \left[ \begin{matrix} \boldsymbol{r}'_1 \\ \boldsymbol{r}'_2 \\ \boldsymbol{r}'_3 \end{matrix} \right] = \left[ \begin{matrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{matrix} \right] $$

Using the Wald.test() Function

# Extract the variance-covariance matrix of the estimator
Vbhat = vcov(resMLB)
round(Vbhat, 5)
##             (Intercept)    years  gamesyr     bavg  hrunsyr   rbisyr
## (Intercept)     0.08342  0.00001 -0.00027 -0.00029 -0.00148  0.00082
## years           0.00001  0.00015 -0.00001  0.00000 -0.00002  0.00001
## gamesyr        -0.00027 -0.00001  0.00001  0.00000  0.00002 -0.00002
## bavg           -0.00029  0.00000  0.00000  0.00000  0.00000  0.00000
## hrunsyr        -0.00148 -0.00002  0.00002  0.00000  0.00026 -0.00010
## rbisyr          0.00082  0.00001 -0.00002  0.00000 -0.00010  0.00005
# Compute the Wald statistic
# install.packages("aod") # install the required package
aod::wald.test(Sigma = Vbhat, # variance-covariance matrix
               b = coef(resMLB), # estimates
               Terms = 4:6, # positions of the parameters being tested
               H0 = c(0, 0, 0) # null hypothesis (all equal to zero)
               )
## Wald test:
## ----------
## 
## Chi-squared test:
## X2 = 28.7, df = 3, P(> X2) = 2.7e-06
# Alternative: supply the restriction matrix and constants directly
# aod::wald.test(b = coef(resMLB), Sigma = vcov(resMLB), L=R, H0=h)
  • We reject the null hypothesis and conclude that the parameters $\beta_3, \beta_4 \text{ and } \beta_5$ are jointly significant.

Computing It “By Hand”

  • Estimate the model
# Create the variable log_salary
mlb1$log_salary = log(mlb1$salary)
name_y = "log_salary"
names_X = c("years", "gamesyr", "bavg", "hrunsyr", "rbisyr")

# Create vector y
y = as.matrix(mlb1[,name_y]) # convert data-frame column into a matrix

# Create the covariate matrix X with a leading column of 1s
X = as.matrix( cbind( const=1, mlb1[,names_X] ) ) # bind 1s to the covariates

# Retrieve N and K
N = nrow(mlb1)
K = ncol(X) - 1

# Estimate the model
bhat = solve( t(X) %*% X ) %*% t(X) %*% y
round(bhat, 5)
##             [,1]
## const   11.19242
## years    0.06886
## gamesyr  0.01255
## bavg     0.00098
## hrunsyr  0.01443
## rbisyr   0.01077
# Compute residuals
uhat = y - X %*% bhat

# Error-term variance
S2 = as.numeric( t(uhat) %*% uhat / (N-K-1) )

# Variance-covariance matrix of the estimator
Vbhat = S2 * solve( t(X) %*% X )
round(Vbhat, 5)
##            const    years  gamesyr     bavg  hrunsyr   rbisyr
## const    0.08342  0.00001 -0.00027 -0.00029 -0.00148  0.00082
## years    0.00001  0.00015 -0.00001  0.00000 -0.00002  0.00001
## gamesyr -0.00027 -0.00001  0.00001  0.00000  0.00002 -0.00002
## bavg    -0.00029  0.00000  0.00000  0.00000  0.00000  0.00000
## hrunsyr -0.00148 -0.00002  0.00002  0.00000  0.00026 -0.00010
## rbisyr   0.00082  0.00001 -0.00002  0.00000 -0.00010  0.00005
  • Now create the restriction matrix:
# Number of restrictions
G = 3

# Restriction matrix
R = matrix(c(0, 0, 0, 1, 0, 0,
             0, 0, 0, 0, 1, 0,
             0, 0, 0, 0, 0, 1),
           nrow=G, byrow=TRUE)
R
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    0    0    1    0    0
## [2,]    0    0    0    0    1    0
## [3,]    0    0    0    0    0    1
# Vector of constants h
h = matrix(c(0, 0, 0),
           nrow=3, ncol=1)
h
##      [,1]
## [1,]    0
## [2,]    0
## [3,]    0
  • Remember that matrix() fills by column by default.
  • Here it is more intuitive to fill the restrictions by row, since each row represents one restriction. That is why we used byrow=TRUE.
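A quick base-R illustration of the two fill orders:

```r
# Default: matrix() fills column by column
matrix(1:6, nrow=2)
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

# byrow=TRUE: fills row by row
matrix(1:6, nrow=2, byrow=TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
```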
  • The Wald statistic is then $$ w(\hat{\boldsymbol{\beta}}) = \left[ \boldsymbol{R}\hat{\boldsymbol{\beta}} - \boldsymbol{h} \right]' \left[ \boldsymbol{R V_{\hat{\beta}} R}' \right]^{-1} \left[ \boldsymbol{R}\hat{\boldsymbol{\beta}} - \boldsymbol{h} \right]\ \sim\ \chi^2_{(G)} $$
# Wald statistic
w = t( R %*% bhat - h ) %*% solve( R %*% Vbhat %*% t(R) ) %*% (R %*% bhat - h)
w
##          [,1]
## [1,] 28.65076
# Find the 5% chi-squared critical value
alpha = 0.05
c = qchisq(1-alpha, df=G)
c
## [1] 7.814728
# Compare the Wald statistic with the critical value
w > c
##      [,1]
## [1,] TRUE
  • Since the Wald statistic (= 28.65) is greater than the critical value (= 7.81), we reject the joint null that all tested parameters are equal to zero.
  • We can also evaluate the p-value from the Wald statistic:
1 - pchisq(w, df=G)
##              [,1]
## [1,] 2.651604e-06
  • Because it is below 5%, we reject the null hypothesis.

F Test

  • Section 4.3 of Heiss (2020)
  • Another way to evaluate multiple restrictions is with the F test.
  • Here we estimate two models:
    • unrestricted: includes all explanatory variables of interest;
    • restricted: excludes some variables.
  • The F test compares the residual sum of squares (RSS) or the R$^2$ of the two models.
  • The intuition is straightforward: if the excluded variables are jointly significant, the unrestricted model should fit the data better.

  • The F statistic can be computed as
$$ F = \frac{\text{RSS}_{r} - \text{RSS}_{ur}}{\text{RSS}_{ur}} \cdot \frac{N-K-1}{G} = \frac{R^2_{ur} - R^2_{r}}{1 - R^2_{ur}} \cdot \frac{N-K-1}{G} \tag{4.10} $$

where ur denotes the unrestricted model and r denotes the restricted model.

  • We then evaluate the statistic with a right-tailed F test.

Implementing It in R

  • We continue using the mlb1 dataset from Section 4.5 of Wooldridge (2006).

  • The unrestricted model, with all explanatory variables, is \begin{align} \log(\text{salary}) = &\beta_0 + \beta_1 \text{years} + \beta_2 \text{gamesyr} + \beta_3 \text{bavg} + \\ &\beta_4 \text{hrunsyr} + \beta_5 \text{rbisyr} + u \end{align}

  • The restricted model, which excludes the tested variables, is \begin{align} \log(\text{salary}) = &\beta_0 + \beta_1 \text{years} + \beta_2 \text{gamesyr} + u \end{align}

Using linearHypothesis()

  • We can run the F test with the linearHypothesis() function from the car package.
  • In addition to the estimated model object, we provide a character vector listing the restrictions:
# Estimate the unrestricted model
res.ur = lm(log(salary) ~ years + gamesyr + bavg + hrunsyr + rbisyr, data=mlb1)

# Create a vector with the restrictions
myH0 = c("bavg = 0", "hrunsyr = 0", "rbisyr = 0")

# Apply the F test
# install.packages("car") # install the required package
car::linearHypothesis(res.ur, myH0)
## Linear hypothesis test
## 
## Hypothesis:
## bavg = 0
## hrunsyr = 0
## rbisyr = 0
## 
## Model 1: restricted model
## Model 2: log(salary) ~ years + gamesyr + bavg + hrunsyr + rbisyr
## 
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    350 198.31                                  
## 2    347 183.19  3    15.125 9.5503 4.474e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • Notice that in the second row, which corresponds to the unrestricted model, the residual sum of squares (RSS) is smaller than in the restricted model. Hence the larger set of covariates provides more explanatory power, as expected.
  • To evaluate the null hypothesis ( $\beta_3 = \beta_4 = \beta_5 = 0$), we can either compare the F statistic with a critical value or compare the p-value with the chosen significance level.
  • From the p-value criterion above, we reject the null hypothesis.
  • The 5% critical value can be obtained with:
qf(1-0.05, G, N-K-1)
## [1] 2.630641
  • Since 9.55 > 2.63, we reject the null hypothesis.
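As an aside, base R's anova() applied to two nested fitted models reproduces the same F test without any extra package. A sketch, re-estimating both models so the snippet stands on its own:

```r
# Base-R alternative: F test via comparison of nested models
data(mlb1, package="wooldridge")
res.ur = lm(log(salary) ~ years + gamesyr + bavg + hrunsyr + rbisyr, data=mlb1)
res.r  = lm(log(salary) ~ years + gamesyr, data=mlb1)
anova(res.r, res.ur)  # same F statistic (9.55) and p-value as above
```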

Computing It “By Hand”

  • Here we estimate both the unrestricted and the restricted model with lm(), so the only step left to do "by hand" is computing the F statistic itself.
# Estimate the unrestricted model
res.ur = lm(log(salary) ~ years + gamesyr + bavg + hrunsyr + rbisyr, data=mlb1)

# Estimate the restricted model
res.r = lm(log(salary) ~ years + gamesyr, data=mlb1)

# Extract the R2 from each fitted model
r2.ur = summary(res.ur)$r.squared
r2.ur
## [1] 0.6278028
r2.r = summary(res.r)$r.squared
r2.r
## [1] 0.5970716
# Compute the F statistic
F = ( r2.ur - r2.r ) / (1 - r2.ur) * (N-K-1) /  G
F
## [1] 9.550254
# p-value of the F test
1 - pf(F, G, N-K-1)
## [1] 4.473708e-06
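Equation (4.10) gives two equivalent expressions for the F statistic. As a check, the RSS-based version yields the same value as the R$^2$-based computation above (models re-estimated here so the block stands alone):

```r
data(mlb1, package="wooldridge")
res.ur = lm(log(salary) ~ years + gamesyr + bavg + hrunsyr + rbisyr, data=mlb1)
res.r  = lm(log(salary) ~ years + gamesyr, data=mlb1)

# RSS-based version of equation (4.10)
rss.ur = sum(residuals(res.ur)^2)  # unrestricted RSS
rss.r  = sum(residuals(res.r)^2)   # restricted RSS
F.rss = (rss.r - rss.ur) / rss.ur * (nrow(mlb1) - 5 - 1) / 3  # G = 3 restrictions
F.rss  # ~ 9.55, matching the R^2-based computation
```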