The motivation for developing Multiple Linear Regression is that Simple Linear Regression contains only one explanatory variable.

That is, it will face a serious Omitted Variable Bias (OVB) problem.

OLS under OVB

The true model:

$$ y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \epsilon_i $$

However, we incorrectly assume:

$$ y_i = \beta_0 + \beta_1 x_i + u_i $$

where

$$ u_i = \beta_2 z_i + \epsilon_i $$


From Simple Linear Regression in Matrix Notation, we can also use matrix notation to denote the model.

The initial model is:

$$ y_i = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \epsilon_i, \quad i = 1, \dots, n $$

We have $k$ unknowns and $n$ equations. Only when there are at least as many equations as unknowns can we solve for the unknowns.

However, once we stack the unknowns and the data into vectors and matrices:

$$ y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{12} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n2} & \cdots & x_{nk} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}, \quad \epsilon = \begin{pmatrix} \epsilon_1 \\ \vdots \\ \epsilon_n \end{pmatrix} $$

we can denote the model in a much simpler form:

$$ y = X\beta + \epsilon $$

Recall from Simple Linear Regression in Matrix Notation that the FOC of the least-squares problem is exactly the same, thanks to the matrix notation:

$$ X'(y - Xb) = 0 \quad \Longrightarrow \quad b = (X'X)^{-1}X'y $$

The difference is that $X$ is now an $n \times k$ matrix, not $n \times 2$.
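The normal equations above can be checked numerically. A minimal sketch with made-up data (assuming `numpy` is available): compute $b = (X'X)^{-1}X'y$ directly and compare with a library least-squares solver.

```python
import numpy as np

# Illustrative data, not from the text: n observations, k regressors.
rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # n x k, with intercept
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

# Solve the normal equations X'X b = X'y (avoids forming the inverse).
b = np.linalg.solve(X.T @ X, X.T @ y)
# Library least-squares solver gives the same answer.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

assert np.allclose(b, b_lstsq)
```

Using `np.linalg.solve` on the normal equations, rather than computing `inv(X.T @ X)` explicitly, is the numerically preferred way to evaluate $(X'X)^{-1}X'y$.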

We can now derive $e$:

$$ e = y - Xb = y - X(X'X)^{-1}X'y = \underbrace{ (I_n - X(X'X)^{-1}X') }_{ M }y = My $$

We call $M$ the "residual maker matrix".

Thus,

$$ MX = (I_n - X(X'X)^{-1}X')X = X - X(X'X)^{-1}(X'X) = X - X = 0 $$

Thus we prove that the residual vector $e$ is orthogonal to the matrix of explanatory variables $X$: since $M$ is symmetric, $X'e = X'My = (MX)'y = 0$.
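A quick numerical check of these facts, with made-up data (`numpy` assumed): build $M$ and verify that $MX = 0$ and that the residuals are orthogonal to $X$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = rng.normal(size=n)

# Residual maker matrix M = I - X(X'X)^{-1}X'.
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
e = M @ y                       # residuals e = My

assert np.allclose(M @ X, 0)    # M annihilates the columns of X
assert np.allclose(X.T @ e, 0)  # X'e = 0: residuals orthogonal to X
```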

OLS part

Ordinary Least Squares

We can then write $y$:

$$ y = \hat{y} + e = Xb + e $$

We denote $P = X(X'X)^{-1}X'$, so that $\hat{y} = Xb = Py$ and

$$ y = Py + My $$

Try to figure out that $\hat{y}'e = 0$ ($\hat{y}$ and $e$ are orthogonal).

Hint:

  1. $P + M = I$ (identity matrix)
  2. $P$ is the orthogonal projection matrix onto the column space of $X$

$P$ and $M$ are both idempotent; see Idempotency.
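These hints can be verified numerically. A minimal sketch with made-up data (`numpy` assumed): $P$ and $M$ are idempotent, sum to the identity, and are mutually orthogonal, so the fitted values and residuals are orthogonal.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 4
X = rng.normal(size=(n, k))
y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection (hat) matrix
M = np.eye(n) - P                      # residual maker matrix

assert np.allclose(P @ P, P)           # P idempotent
assert np.allclose(M @ M, M)           # M idempotent
assert np.allclose(P + M, np.eye(n))   # P + M = I
assert np.allclose(P @ M, 0)           # P and M orthogonal
assert np.isclose((P @ y) @ (M @ y), 0)  # y_hat'e = 0
```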

The Least Squares estimator is unbiased

See Unbiasedness.

To see why the OLS estimator is most efficient, see attached: Proof of Gauss-Markov Theorem.

Estimating the Disturbance Variance

That is, to estimate $\sigma^2 = \operatorname{Var}(\epsilon_i)$.

Recall that we have

$$ \mathbf{e = (I_n - X (X'X)^{-1}X' )y} $$

We denote $M = I_n - X (X'X)^{-1}X'$ so $e = My$. We already know that $MX = 0$, so

$$ e = My = M(X\beta + \epsilon) = \underbrace{ MX\beta }_{ =0 } + M\epsilon = M\epsilon $$

Thus

$$ e'e = (M\epsilon)'(M\epsilon) = \epsilon'M'M\epsilon = \epsilon'M\epsilon $$

Under the current assumptions, $M$ is a fixed (non-stochastic) matrix, and $E[e'e] = E[\epsilon'M\epsilon]$.

Hint: $M$ is idempotent, so $M'M = M$. See Idempotency.

So we know that:

$$ E[e'e] = E[\epsilon'M\epsilon] = E[\operatorname{tr}(\epsilon'M\epsilon)] = E[\operatorname{tr}(M\epsilon\epsilon')] = \operatorname{tr}(M\,E[\epsilon\epsilon']) = \sigma^{2}\operatorname{tr}(M) $$

A simple proof: to see why a scalar quadratic form is equal to its trace (so that expectation and trace can be exchanged), see here

This shows that $\operatorname{tr}(M) = \operatorname{tr}(I_n) - \operatorname{tr}(X(X'X)^{-1}X') = n - k$, so that:

$$ E[e'e] = \sigma^{2}(n - k) \quad \Longrightarrow \quad E\left[ \frac{e'e}{n-k} \right] = \sigma^{2} $$

$s^2 = \dfrac{e'e}{n-k}$ is an unbiased estimator of $\sigma^2$; $s$ is called the standard error of the regression.

$n - k$ is also the degrees of freedom.
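A small sketch (made-up design matrix, `numpy` assumed) checks both facts: $\operatorname{tr}(M) = n - k$, and a short Monte Carlo shows that the average of $s^2 = e'e/(n-k)$ is close to the true $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma2 = 25, 3, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

assert np.isclose(np.trace(M), n - k)        # tr(M) = n - k

s2_draws = []
for _ in range(5000):
    eps = rng.normal(scale=np.sqrt(sigma2), size=n)
    e = M @ eps                              # e = M.eps: beta drops out
    s2_draws.append(e @ e / (n - k))         # s^2 = e'e / (n - k)
print(np.mean(s2_draws))                     # close to the true sigma2 = 4.0
```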

Omitting Relevant Variables

The source of omitted relevant variables: if we leave out explanatory variables that are significantly related to the dependent variable and correlated with the included regressors, we cause bias in the estimated coefficients.

Eg: we are focusing on the relation between wage and education, but we forget to include experience as an explanatory variable.

Discussion

Suppose the true model is:

$$ y = X_1\beta_1 + X_2\beta_2 + \epsilon $$

Suppose we omit $X_2$; the model now becomes:

$$ y = X_1\beta_1 + u, \quad \text{where } u = X_2\beta_2 + \epsilon $$

Now the restricted estimator is:

$$ b_1 = (X_1'X_1)^{-1}X_1'y = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'\epsilon $$

so that, taking expectations,

$$ E[b_1] = \beta_1 + \underbrace{ (X_1'X_1)^{-1}X_1'X_2 }_{ P_{1.2} }\beta_2 $$

It shows that $b_1$ will be biased in a predictable direction: the bias is equal to $\beta_2$ times $P_{1.2}$, the coefficients from regressing the columns of $X_2$ on $X_1$.
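The bias formula can be illustrated numerically. A sketch with made-up data (`numpy` assumed): with the noise set to zero, the short regression of $y$ on the included regressors alone recovers exactly $\beta_1$ plus the omitted-variable bias term.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])       # included regressors
X2 = (0.8 * X1[:, 1] + rng.normal(size=n)).reshape(-1, 1)    # omitted, correlated with X1
beta1 = np.array([1.0, 2.0])
beta2 = np.array([3.0])

y = X1 @ beta1 + X2 @ beta2                  # true model, epsilon set to 0
b1 = np.linalg.solve(X1.T @ X1, X1.T @ y)    # restricted (short) regression
bias = np.linalg.solve(X1.T @ X1, X1.T @ X2) @ beta2

assert np.allclose(b1, beta1 + bias)         # b1 = beta1 + (X1'X1)^{-1} X1'X2 beta2
print(bias)                                  # nonzero: the slope estimate is biased
```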

We use $e_*$ to represent the corresponding restricted residuals.

Now we have a clear sense that there must be a difference between $e$ and $e_*$.

Comparison between residuals

Now we compare $e'e$ and $e_*'e_*$:

First, we try to get $e_*$:

$$ e_* = M_1 y, \quad \text{where } M_1 = I_n - X_1(X_1'X_1)^{-1}X_1' $$

How should we intuitively understand $M_1$? Simply put, $M_1$ is a projection matrix: it projects a vector onto the orthogonal complement of the column space of $X_1$. In other words, $M_1$ removes any component lying in the column space of $X_1$.

After calculating the results, we can figure out:

$$ e_*'e_* = e'e + b_2'(X_2'M_1X_2)b_2 \geq e'e $$

where $b_2$ is the OLS coefficient on $X_2$ in the unrestricted regression. Only when $b_2 = 0$ is the equal sign satisfied.
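A numerical illustration of this inequality, with made-up data (`numpy` assumed): the restricted regression (on the included regressors only) never achieves a smaller residual sum of squares than the full regression.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # included regressors
X2 = rng.normal(size=(n, 1))                             # omitted regressor
X = np.hstack([X1, X2])                                  # full design matrix
y = X1 @ np.array([1.0, 2.0]) + X2 @ np.array([1.5]) + rng.normal(size=n)

def ssr(Z, y):
    """Residual sum of squares from an OLS fit of y on Z."""
    e = y - Z @ np.linalg.solve(Z.T @ Z, Z.T @ y)
    return e @ e

assert ssr(X1, y) >= ssr(X, y)   # restricted SSR >= unrestricted SSR
```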