# Variance Estimation Testing Using A Simple Linear Specification

#### Mark Fredrickson, Ben Hansen

#### 2023-03-30

Source:`vignettes/LinearModelVarianceEstimation.Rmd`


The goal of this document is to consider a simple model that can be fit either as one combined regression or as two separate regressions, and to relate the variance of the second-stage coefficients to the variance of the corresponding coefficients in the combined model. With this information, we should be able to create tests of our variance estimation routines.

## Combined model

Consider two sets of background variables and treatment assignments, $x$ and $z$, with dimensions $p_x$ and $p_z$, respectively, to be treated as nonrandom or fixed by conditioning. Write $X$ and $Z$ for the matrices whose rows are the observed values of $x$ and $z$. While we allow correlation between the variables, we will assume that the matrix $\begin{pmatrix} X & Z \end{pmatrix}' \begin{pmatrix} X & Z \end{pmatrix}$ has an inverse (I think most of these results would work with generalized inverses, but if the inverse does not exist, the dimensionality of the coefficients becomes a little tricky).

$Y = \alpha'x + \beta'z + \epsilon, \quad E(\epsilon) = 0, \text{Var}(\epsilon) = \sigma^2$

Standard results give us

$\begin{pmatrix} \hat \alpha_1 \\ \hat \beta_1 \end{pmatrix} = \begin{pmatrix} X'X & X'Z \\ Z'X & Z'Z \end{pmatrix}^{-1} \begin{pmatrix} X'y \\ Z'y \end{pmatrix}$

and that the variance of the estimators is

$\sigma^2 \begin{pmatrix} X'X & X'Z \\ Z'X & Z'Z \end{pmatrix}^{-1}.$

Results for blocked matrices (e.g., The Matrix Cookbook) give the variance for just $\hat \beta_1$ as

$\text{Var}(\hat \beta_1) = \sigma^2 \left[Z'Z - Z' X (X'X)^{-1} X'Z\right]^{-1}.$

Write $H = I - X (X'X)^{-1} X'$, the matrix that creates the residuals of the regression on $x$ alone. Then,

$\text{Var}(\hat \beta_1) = \sigma^2 \left[Z'H Z\right]^{-1}.$

The same $\hat \beta_1$ arises as the coefficient of a regression of $HY$ on $HZ$ (by the so-called Frisch-Waugh-Lovell Theorem). In consequence, $\hat \beta_1 = (Z'HZ)^{-1}(Z'Hy) .$
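The combined-model identities above can be checked numerically. The following is a minimal sketch, not part of the package's test suite: it uses Python with NumPy for illustration, and the sample size, dimensions, coefficient values, and the "known" $\sigma^2$ are all arbitrary assumptions. It compares the $\hat \beta_1$ block of the joint fit with the Frisch-Waugh-Lovell form, and the lower-right block of $\sigma^2 (W'W)^{-1}$, where $W = \begin{pmatrix} X & Z \end{pmatrix}$, with $\sigma^2 (Z'HZ)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_x, p_z = 200, 3, 2          # illustrative sizes (assumption)
sigma2 = 1.5                     # error variance treated as known (assumption)

# Fixed design matrices; Z is deliberately correlated with X.
X = rng.normal(size=(n, p_x))
Z = 0.5 * X[:, :p_z] + rng.normal(size=(n, p_z))
y = (X @ np.array([1.0, -2.0, 0.5])
     + Z @ np.array([0.7, -0.3])
     + rng.normal(scale=np.sqrt(sigma2), size=n))

# Combined regression of y on (X, Z).
W = np.hstack([X, Z])
coef = np.linalg.solve(W.T @ W, W.T @ y)
beta1_joint = coef[p_x:]
var_joint = sigma2 * np.linalg.inv(W.T @ W)[p_x:, p_x:]

# Residual-maker matrix H = I - X (X'X)^{-1} X'.
H = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

# Frisch-Waugh-Lovell form of beta_1 and the partitioned-inverse variance.
ZHZ = Z.T @ H @ Z
beta1_fwl = np.linalg.solve(ZHZ, Z.T @ H @ y)
var_fwl = sigma2 * np.linalg.inv(ZHZ)

print(np.allclose(beta1_joint, beta1_fwl))
print(np.allclose(var_joint, var_fwl))
```

Both comparisons are algebraic identities, so they should hold for any full-rank choice of $X$ and $Z$, up to floating-point tolerance.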

## Two regressions

Let $\hat \alpha_2$ and $\hat \beta_2$ be the estimators obtained by first regressing $Y$ on $x$ alone and then regressing the residuals $Y - \hat \alpha_2'x$, that is $HY$ with $H$ as defined above, on $Z$. Standard results give

$\hat \alpha_2 = (X'X)^{-1} X' y$

and

$\hat \beta_2 = (Z'Z)^{-1} Z' (y - X \hat \alpha_2) = (Z'Z)^{-1} Z'(y - X (X'X)^{-1} X' y) = (Z'Z)^{-1} Z' H y.$
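As an illustration of the two-stage procedure, the sketch below fits the stages one after the other and compares the result with the closed form $(Z'Z)^{-1} Z' H y$. It uses Python with NumPy; the simulated data and dimensions are assumptions chosen only to make the example runnable. Note that only the outcome is residualized here, not $Z$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p_x, p_z = 150, 2, 2               # illustrative sizes (assumption)
X = rng.normal(size=(n, p_x))
Z = 0.4 * X + rng.normal(size=(n, p_z))
y = rng.normal(size=n)                # any outcome works for this identity

# Stage 1: regress y on X alone and keep the residuals.
alpha2, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ alpha2

# Stage 2: regress the stage-1 residuals on Z (Z itself is left as-is).
beta2, *_ = np.linalg.lstsq(Z, resid, rcond=None)

# Closed form from the text: (Z'Z)^{-1} Z' H y.
H = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
beta2_closed = np.linalg.solve(Z.T @ Z, Z.T @ H @ y)

print(np.allclose(resid, H @ y))
print(np.allclose(beta2, beta2_closed))
```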

As both $Z$ and $X$ are taken to be nonrandom, we may pass between $\hat \beta_1$ and $\hat \beta_2$ via nonrandom linear transformations, as follows:

$\hat \beta_1 = (Z'HZ)^{-1}(Z'Z) \hat \beta_2;\quad \hat \beta_2 = (Z'Z)^{-1}(Z'HZ) \hat \beta_1 .$

Accordingly,

$\operatorname{Cov}(\hat \beta_1) = (Z'HZ)^{-1}(Z'Z) \operatorname{Cov}(\hat \beta_{2})(Z'Z) (Z'HZ)^{-1},$

which may serve as a basis for tests.
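One way such a test might look is sketched below, in Python with NumPy. It rests on an additional assumption not stated above: the errors are uncorrelated with common variance, so that $\operatorname{Cov}(y) = \sigma^2 I$. Under that assumption, and because $H$ is symmetric and idempotent, $\operatorname{Cov}(\hat \beta_2) = \sigma^2 (Z'Z)^{-1} Z'HZ (Z'Z)^{-1}$, and mapping it through the transformation should recover $\sigma^2 (Z'HZ)^{-1}$. The helper name `cov_beta1_from_beta2` is hypothetical, and the data are simulated.

```python
import numpy as np

def cov_beta1_from_beta2(Z, H, cov_beta2):
    """Hypothetical helper: map Cov(beta_2) to Cov(beta_1) as A Cov(beta_2) A',
    with A = (Z'HZ)^{-1} (Z'Z)."""
    A = np.linalg.solve(Z.T @ H @ Z, Z.T @ Z)
    return A @ cov_beta2 @ A.T

rng = np.random.default_rng(2)
n, p_x, p_z, sigma2 = 120, 3, 2, 2.0   # illustrative values (assumption)
X = rng.normal(size=(n, p_x))
Z = 0.6 * X[:, :p_z] + rng.normal(size=(n, p_z))
y = rng.normal(size=n)
H = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

ZZ = Z.T @ Z
ZHZ = Z.T @ H @ Z
ZZ_inv = np.linalg.inv(ZZ)

# Point estimates and the linear maps between them.
beta1 = np.linalg.solve(ZHZ, Z.T @ H @ y)
beta2 = np.linalg.solve(ZZ, Z.T @ H @ y)
maps_ok = (np.allclose(beta1, np.linalg.solve(ZHZ, ZZ @ beta2))
           and np.allclose(beta2, ZZ_inv @ ZHZ @ beta1))

# Cov(beta_2) under Cov(y) = sigma^2 I (assumption), then mapped to Cov(beta_1).
cov_beta2 = sigma2 * ZZ_inv @ ZHZ @ ZZ_inv
cov_beta1_mapped = cov_beta1_from_beta2(Z, H, cov_beta2)
cov_beta1_target = sigma2 * np.linalg.inv(ZHZ)

print(maps_ok)
print(np.allclose(cov_beta1_mapped, cov_beta1_target))
```

In a test of a variance estimation routine, `cov_beta2` would instead be the routine's estimate for the second-stage coefficients, and the mapped result would be compared against the combined-model variance estimate.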