The motivation for developing Multiple Linear Regression is that Simple Linear Regression contains only one explanatory variable. That is, it is prone to a serious Omitted Variable Bias (OVB) problem.
OLS under OVB
The true model:
$$ \mathbf{y = X_1\beta_1 + X_2\beta_2 + \epsilon} $$
However, we incorrectly assume:
$$ \mathbf{y = X_1\beta_1 + u} $$
where $\mathbf{u = X_2\beta_2 + \epsilon}$ absorbs the omitted regressors.
As in Simple Linear Regression in Matrix Notation, we can also write the model in matrix form.
The initial model, written observation by observation, is:
$$ y_{i} = \beta_{1}x_{i1} + \beta_{2}x_{i2} + \dots + \beta_{k}x_{ik} + \epsilon_{i}, \qquad i = 1, \dots, n $$
We have $k$ unknowns and $n$ equations. Only when there are at least as many equations as unknowns can we pin down the coefficients.
However, once we stack the unknowns and the data into vectors and matrices:
$$ \mathbf{y} = \begin{pmatrix} y_{1} \\ \vdots \\ y_{n} \end{pmatrix}, \quad \mathbf{X} = \begin{pmatrix} x_{11} & \dots & x_{1k} \\ \vdots & & \vdots \\ x_{n1} & \dots & x_{nk} \end{pmatrix}, \quad \boldsymbol{\beta} = \begin{pmatrix} \beta_{1} \\ \vdots \\ \beta_{k} \end{pmatrix} $$
we can denote the model in a much simpler form:
$$ \mathbf{y = X\beta + \epsilon} $$
Recall from Simple Linear Regression in Matrix Notation that the FOC of the least-squares problem is exactly the same, thanks to the matrix notation:
$$ \mathbf{X'X\hat{\beta} = X'y} \quad \Longrightarrow \quad \mathbf{\hat{\beta} = (X'X)^{-1}X'y} $$
The difference is that $\mathbf{X}$ is now an $n \times k$ matrix, not a single column.
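As a quick sanity check, the normal equations can be verified numerically. A minimal NumPy sketch (the simulated data and all variable names are purely illustrative):

```python
import numpy as np

# Simulated data (illustrative): n observations, k regressors
rng = np.random.default_rng(0)
n, k = 60, 3
X = rng.normal(size=(n, k))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)

# Solve the normal equations X'X b = X'y directly
b = np.linalg.solve(X.T @ X, X.T @ y)

# Same answer as NumPy's standard least-squares routine
b_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(b, b_lstsq))  # True
```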
We can now derive $\mathbf{e}$:
$$ \mathbf{e = y - X\hat{\beta} = y - X(X'X)^{-1}X'y = (I_{n} - X(X'X)^{-1}X')y = My} $$
We call $\mathbf{M = I_{n} - X(X'X)^{-1}X'}$ the "residual maker matrix".
Thus,
$$ \mathbf{MX = (I_{n} - X(X'X)^{-1}X')X = X - X = 0} $$
Thus we prove that the residual vector is orthogonal to the matrix of explanatory variables $\mathbf{X}$, i.e. $\mathbf{X'e = 0}$.
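The two facts $\mathbf{MX = 0}$ and $\mathbf{X'e = 0}$ are easy to confirm numerically. A small NumPy sketch (data simulated only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))
y = rng.normal(size=n)

# Residual maker matrix M = I - X(X'X)^{-1}X'
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
e = M @ y  # residual vector

print(np.allclose(M @ X, 0))   # True: M annihilates X
print(np.allclose(X.T @ e, 0)) # True: residuals orthogonal to regressors
```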
OLS part
We can then write $\mathbf{\hat{y}}$:
$$ \mathbf{\hat{y} = X\hat{\beta} = X(X'X)^{-1}X'y = Py} $$
We denote $\mathbf{P = X(X'X)^{-1}X'}$, the projection matrix onto the column space of $\mathbf{X}$, so that:
$$ \mathbf{y = \hat{y} + e = Py + My} $$
Try to figure out that $\mathbf{PM = MP = 0}$ ($\mathbf{P}$ and $\mathbf{M}$ are orthogonal).
Hint:
- $\mathbf{I}$ is the identity matrix
- $\mathbf{P}$ is an orthogonal projection matrix
- $\mathbf{P}$ and $\mathbf{M}$ are both idempotent (see Idempotency).
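Idempotency and the orthogonality $\mathbf{PM = 0}$ can likewise be checked in a few lines (NumPy sketch with simulated data):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 4
X = rng.normal(size=(n, k))

P = X @ np.linalg.inv(X.T @ X) @ X.T  # projection onto the column space of X
M = np.eye(n) - P                     # residual maker

print(np.allclose(P @ P, P))  # True: P is idempotent
print(np.allclose(M @ M, M))  # True: M is idempotent
print(np.allclose(P @ M, 0))  # True: P and M are orthogonal
```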
The Least Squares estimator is unbiased
See Unbiasedness.
To see why the OLS estimator is most efficient, see attached: Proof of Gauss-Markov Theorem.
Estimating the disturbance Variance
That is, to find an unbiased estimator of $\sigma^{2} = \operatorname{Var}[\epsilon_{i}]$.
Recall that we have
$$ \mathbf{e = (I_{n \times n} - X (X'X)^{-1}X' )y} $$
We denote $M = I_{n \times n} - X (X'X)^{-1}X'$ so $e = My$. We already know that $MX = 0$, so
$$ e = My = M(X\beta + \epsilon) = \underbrace{ MX\beta }_{ =0 } + M\epsilon = M\epsilon $$
Thus,
under the current assumptions, $M$ is a fixed matrix, and
$$ \mathbf{e'e = \epsilon'M'M\epsilon = \epsilon'M\epsilon} $$
Hint: $\mathbf{M}$ is symmetric and idempotent. See Idempotency.
So we know that:
$$ E[\mathbf{e'e}] = E[\boldsymbol{\epsilon}'\mathbf{M}\boldsymbol{\epsilon}] = \sigma^{2}\operatorname{tr}(\mathbf{M}) $$
A simple proof: to see why the expectation of this quadratic form reduces to $\sigma^{2}$ times the trace of $\mathbf{M}$, see here
This shows that
$$ \operatorname{tr}(\mathbf{M}) = \operatorname{tr}(\mathbf{I}_{n}) - \operatorname{tr}(\mathbf{X(X'X)^{-1}X'}) = n - \operatorname{tr}(\mathbf{(X'X)^{-1}X'X}) = n - k $$
so that:
$$ E\!\left[ \frac{\mathbf{e'e}}{n-k} \right] = \sigma^{2} $$
$s^{2} = \dfrac{\mathbf{e'e}}{n-k}$ is an unbiased estimator of $\sigma^{2}$. $s$ is called the standard error of the regression.
$n - k$ is also the degrees of freedom.
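A Monte Carlo sketch can illustrate the unbiasedness of $s^{2}$ (NumPy; the sample size, number of replications, and true parameter values are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma2 = 30, 3, 4.0
X = rng.normal(size=(n, k))  # fixed design across replications
beta = np.ones(k)

# Average s^2 = e'e / (n - k) over many simulated samples
s2_draws = []
for _ in range(5000):
    eps = rng.normal(scale=np.sqrt(sigma2), size=n)
    y = X @ beta + eps
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2_draws.append(e @ e / (n - k))  # divide by the degrees of freedom

print(np.mean(s2_draws))  # should land close to sigma2 = 4.0
```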
Omitting Relevant Variables
The source of omitted relevant variables: if we leave out explanatory variables that both affect the dependent variable and are correlated with the included regressors, we cause bias in the estimated coefficients.
E.g.: we are focusing on the relation between wage and education, but we forget to include Experience as an explanatory variable.
Discussion
Suppose the true model is:
$$ \mathbf{y = X_1\beta_1 + X_2\beta_2 + \epsilon} $$
Suppose we omit $\mathbf{X_2}$; the model now becomes:
$$ \mathbf{y = X_1\beta_1 + u} $$
Now the restricted estimator is:
$$ \mathbf{b_1 = (X_1'X_1)^{-1}X_1'y = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'\epsilon} $$
Taking expectations,
$$ E[\mathbf{b_1}] = \boldsymbol{\beta}_1 + \underbrace{ \mathbf{(X_1'X_1)^{-1}X_1'X_2} }_{ \mathbf{P_{1.2}} }\boldsymbol{\beta}_2 $$
It shows that $\mathbf{b_1}$ will be biased in a predictable direction: the bias equals $\mathbf{P_{1.2}\beta_2}$, where $\mathbf{P_{1.2}}$ collects the coefficients from a regression of $\mathbf{X_2}$ on $\mathbf{X_1}$.
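This bias formula can be checked by simulation. A NumPy sketch (the data-generating process is made up for illustration; $X_2$ is deliberately constructed to be correlated with $X_1$):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
X1 = rng.normal(size=(n, 2))
# X2 is correlated with X1, so omitting it biases b1
X2 = X1 @ np.array([[0.8], [0.5]]) + rng.normal(size=(n, 1))
beta1, beta2 = np.array([1.0, -2.0]), np.array([3.0])

# Predicted bias: P_{1.2} beta2, where P_{1.2} = (X1'X1)^{-1} X1'X2
P12 = np.linalg.solve(X1.T @ X1, X1.T @ X2)
expected_b1 = beta1 + P12 @ beta2

# Monte Carlo average of the restricted estimator (design held fixed)
b1_draws = []
for _ in range(5000):
    eps = rng.normal(size=n)
    y = X1 @ beta1 + X2 @ beta2 + eps
    b1_draws.append(np.linalg.solve(X1.T @ X1, X1.T @ y))

print(np.mean(b1_draws, axis=0) - expected_b1)  # both entries near zero
```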
We use $\mathbf{e_1}$ to represent the corresponding restricted residuals, $\mathbf{e_1 = y - X_1 b_1}$.
Now it is clear that there must be a difference between $\mathbf{e_1}$ and $\mathbf{e}$, the residuals from the full model.
Comparison between residuals
Now we compare $\mathbf{e_1'e_1}$ and $\mathbf{e'e}$:
First, we try to get $\mathbf{e_1}$:
$$ \mathbf{e_1 = y - X_1 b_1 = (I_{n} - X_1(X_1'X_1)^{-1}X_1')y = M_1 y} $$
How should we intuitively understand $\mathbf{M_1}$? Simply put, $\mathbf{M_1}$ is a projection matrix: it projects a vector onto the orthogonal complement of the column space of $\mathbf{X_1}$. In short, it removes any component that lies in the column space of $\mathbf{X_1}$.
After calculating the results, we can figure out:
$$ \mathbf{e_1'e_1 = e'e + b_2'X_2'M_1X_2b_2 \geq e'e} $$
Only when $\mathbf{b_2 = 0}$ is the equal sign satisfied.
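Finally, the inequality $\mathbf{e_1'e_1 \geq e'e}$ says that dropping a regressor can never reduce the sum of squared residuals. A quick NumPy check (simulated data; the helper `ssr` is introduced just for this example):

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from regressing y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

rng = np.random.default_rng(4)
n = 25
X1 = rng.normal(size=(n, 2))
X2 = rng.normal(size=(n, 1))
y = X1 @ np.array([1.0, 2.0]) + X2 @ np.array([0.5]) + rng.normal(size=n)

full = ssr(np.hstack([X1, X2]), y)  # e'e from the full model
restricted = ssr(X1, y)             # e1'e1 from the restricted model
print(restricted >= full)  # True
```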