Econometrics I
Professor William Greene
Stern School of Business, Department of Economics
Part 5: Regression Algebra and Fit

Gauss-Markov Theorem

A theorem of Gauss and Markov: least squares is the minimum variance linear unbiased estimator (MVLUE) of $\beta$.
1. Linear estimator: $b = (X'X)^{-1}X'y$ is a linear function of y; each element has the form $\sum_{i=1}^{n} v_i y_i$.
2. Unbiased: $E[b|X] = \beta$.
Theorem: $Var[b^*|X] - Var[b|X]$ is nonnegative definite for any other linear and unbiased estimator $b^*$ that is not equal to b.
Definition: b is efficient in this class of estimators.

Implications of the Gauss-Markov Theorem

$Var[b^*|X] - Var[b|X]$ is nonnegative definite for any other linear and unbiased estimator $b^*$ that is not equal to b. This implies:
  $b_k$ = the kth element of b; $Var[b_k|X]$ = the kth diagonal element of $Var[b|X]$.
  $Var[b_k|X] \le Var[b_k^*|X]$ for each coefficient.
  $c'b$ = any linear combination of the elements of b; $Var[c'b|X] \le Var[c'b^*|X]$ for any nonzero c and any $b^*$ that is not equal to b.

Summary: Finite Sample Properties of b

Unbiased: $E[b|X] = \beta$.
Variance: $Var[b|X] = \sigma^2 (X'X)^{-1}$.
Efficiency: Gauss-Markov theorem, with all of its implications.
Distribution: under normality, $b|X \sim N[\beta, \sigma^2 (X'X)^{-1}]$. (Without normality, the exact distribution is generally unknown.)

Model Comparison

We can compare models without using statistical tests, using measures known as information criteria that summarize a model's goodness of fit. For these indicators, the central ingredient is a measure of the absolute magnitude of the errors.

Measure of Fit

$$R^2 = \frac{b'X'M^0Xb}{y'M^0y} = \frac{\text{regression variation}}{\text{total variation}} = 1 - \frac{e'e}{\sum_{i=1}^{N}(y_i - \bar{y})^2}$$

(Very important result.) $R^2$ is bounded by zero and one only if (a) there is a constant term in X and (b) the line is computed by linear least squares.

Comparing Fits of Regressions

Make sure the denominator in $R^2$ is the same, i.e., the same left-hand-side variable. Example: linear vs. loglinear. The loglinear model will almost always appear to fit better because taking logs reduces the variation.

Adjusted R Squared

The adjusted $R^2$ (adjusted for degrees of freedom),
$$\bar{R}^2 = 1 - \frac{n-1}{n-K}(1 - R^2),$$
includes a penalty for variables that do not add much fit. It can fall when a variable is added to the equation.

Adjusted R2

What is being adjusted? The penalty is for using up degrees of freedom.
$$\bar{R}^2 = 1 - \frac{e'e/(n-K)}{y'M^0y/(n-1)}$$
uses the ratio of two "unbiased" estimators. Is the ratio itself unbiased? Equivalently, $\bar{R}^2 = 1 - [(n-1)/(n-K)](1 - R^2)$.
Will $\bar{R}^2$ rise when a variable z is added to the regression? $\bar{R}^2$ is higher with z than without z if and only if the t ratio on z, when it is added, is larger than one in absolute value.
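To make these fit measures concrete, here is a minimal numerical sketch in Python with NumPy. The simulated data, sample size, and coefficient values are hypothetical (they are not the gasoline-demand regression reported next); the point is only the mechanics of $R^2$ and $\bar{R}^2$.

```python
# Minimal sketch: R-squared and adjusted R-squared from a least squares fit.
# All data below are simulated; nothing here reproduces the slides' examples.
import numpy as np

rng = np.random.default_rng(0)
n, K = 36, 3                                   # n observations, K columns including the constant
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.5, size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)      # least squares coefficients
e = y - X @ b                                  # residuals
SSE = e @ e                                    # e'e
SST = np.sum((y - y.mean())**2)                # total variation about the mean, y'M0y

R2 = 1.0 - SSE / SST
R2_adj = 1.0 - (n - 1) / (n - K) * (1.0 - R2)  # penalizes the degrees of freedom used up
print(R2, R2_adj)
```

Adding an irrelevant column to X will typically raise R2 but can lower R2_adj, which is exactly the point of the gasoline-demand example that follows.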
Full Regression (Without PD)

Ordinary least squares regression
LHS=G        Mean                  =   226.09444
             Standard deviation    =    50.59182
             Number of observs.    =          36
Model size   Parameters            =           9
             Degrees of freedom    =          27
Residuals    Sum of squares        =   596.68995
             Standard error of e   =     4.70102
Fit          R-squared             =      .99334  <**********
             Adjusted R-squared    =      .99137  <**********
Info criter. LogAmemiya Prd. Crt.  =     3.31870  <**********
             Akaike Info. Criter.  =     3.30788  <**********
Model test   F[  8,    27] (prob)  =   503.3(.0000)
--------+---------------------------------------------------------------
Variable|  Coefficient   Standard Error   t-ratio   P[|T|>t]   Mean of X
--------+---------------------------------------------------------------
Constant|  -8220.38**       3629.309       -2.265     .0317
      PG|   -26.8313***        5.76403     -4.655     .0001      2.31661
       Y|      .02214***       .00711       3.116     .0043      9232.86
     PNC|    36.2027          21.54563      1.680     .1044      1.67078
     PUC|    -6.23235          5.01098     -1.244     .2243      2.34364
     PPT|     9.35681          8.94549      1.046     .3048      2.74486
      PN|    53.5879*         30.61384      1.750     .0914      2.08511
      PS|   -65.4897***       23.58819     -2.776     .0099      2.36898
    YEAR|     4.18510**        1.87283      2.235     .0339      1977.50
--------+---------------------------------------------------------------

PD added to the model: R-squared rises, adjusted R-squared falls.

Ordinary least squares regression
LHS=G        Mean                  =   226.09444
             Standard deviation    =    50.59182
             Number of observs.    =          36
Model size   Parameters            =          10
             Degrees of freedom    =          26
Residuals    Sum of squares        =   594.54206
             Standard error of e   =     4.78195
Fit          R-squared             =      .99336   Was .99334
             Adjusted R-squared    =      .99107   Was .99137
--------+---------------------------------------------------------------
Variable|  Coefficient   Standard Error   t-ratio   P[|T|>t]   Mean of X
--------+---------------------------------------------------------------
Constant|  -7916.51**       3822.602       -2.071     .0484
      PG|   -26.8077***        5.86376     -4.572     .0001      2.31661
       Y|      .02231***       .00725       3.077     .0049      9232.86
     PNC|    30.0618          29.69543      1.012     .3207      1.67078
     PUC|    -7.44699          6.45668     -1.153     .2592      2.34364
     PPT|     9.05542          9.15246       .989     .3316      2.74486
      PD|    11.8023          38.50913       .306     .7617      1.65056   (note the low t ratio)
      PN|    47.3306          37.23680      1.271     .2150      2.08511
      PS|   -60.6202**        28.77798     -2.106     .0450      2.36898
    YEAR|     4.02861*         1.97231      2.043     .0514      1977.50
--------+---------------------------------------------------------------

Other Measures of Fit

For non-nested alternatives. These include a degrees-of-freedom penalty.
Information criteria:
  Schwarz (BIC): $n \log(e'e) + K \log(n)$
  Akaike (AIC): $n \log(e'e) + 2K$
When choosing between two non-nested models, the better model is the one that produces the smaller value of the criterion. The BIC penalty for including something irrelevant is heavier than the AIC penalty.

Other Measures of Fit (continued)

Both AIC and BIC increase as the residual sum of squares increases. Both criteria also penalize models with many variables, and smaller values of AIC and BIC are preferred. Since models with more variables tend to produce a smaller residual sum of squares but use more parameters, the best choice balances fit against the number of variables. Again, the BIC penalty for including something irrelevant is heavier than the AIC penalty.

Multicollinearity

Functional Forms

Specification and Functional Form: Nonlinearity

Population: $y = \beta_1 + \beta_2 x + \beta_3 x^2 + \beta_4 z + \varepsilon$
Estimator:  $\hat{y} = b_1 + b_2 x + b_3 x^2 + b_4 z$
Partial effect of x: $\delta_x = \partial E[y|x,z]/\partial x = \beta_2 + 2\beta_3 x$, estimated by $\hat{\delta}_x = b_2 + 2 b_3 x$.
Estimator of the variance of $\hat{\delta}_x$:
$$Est.Var[\hat{\delta}_x] = Var[b_2] + 4x^2\,Var[b_3] + 4x\,Cov[b_2, b_3]$$
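The partial-effect and delta-method formulas above can be computed directly from the coefficient vector and the estimated covariance matrix. The sketch below uses simulated data with hypothetical variables x and z; it is not the income regression reported next.

```python
# Minimal sketch: partial effect of x in y = b1 + b2*x + b3*x^2 + b4*z + e,
# evaluated at the sample mean of x, with its delta-method standard error.
# Simulated, hypothetical data; not the income equation from the slides.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(40.0, 10.0, size=n)
z = rng.normal(size=n)
y = 1.0 + 0.06 * x - 0.0007 * x**2 + 0.3 * z + rng.normal(scale=0.4, size=n)

X = np.column_stack([np.ones(n), x, x**2, z])       # columns: const, x, x^2, z
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
K = X.shape[1]
s2 = e @ e / (n - K)                                # s^2 = e'e/(n - K)
V = s2 * np.linalg.inv(X.T @ X)                     # estimated Var[b|X]

x0 = x.mean()
pe = b[1] + 2 * b[2] * x0                           # b2 + 2*b3*x, at x = mean
var_pe = V[1, 1] + 4 * x0**2 * V[2, 2] + 4 * x0 * V[1, 2]
print(pe, np.sqrt(var_pe))                          # estimate and standard error
```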
Log Income Equation

Ordinary least squares regression
LHS=LOGY     Mean                  =    -1.15746
             Standard deviation    =      .49149
             Number of observs.    =       27322
Model size   Parameters            =           7
             Degrees of freedom    =       27315
Residuals    Sum of squares        =  5462.03686
             Standard error of e   =      .44717
Fit          R-squared             =      .17237
--------+---------------------------------------------------------------
Variable|  Coefficient   Standard Error   b/St.Er.  P[|Z|>z]   Mean of X
--------+---------------------------------------------------------------
     AGE|     .06225***       .00213        29.189    .0000     43.5272
   AGESQ|    -.00074***       .242482D-04  -30.576    .0000     2022.99
Constant|   -3.19130***       .04567       -69.884    .0000
 MARRIED|     .32153***       .00703        45.767    .0000      .75869
  HHKIDS|    -.11134***       .00655       -17.002    .0000      .40272
  FEMALE|    -.00491          .00552         -.889    .3739      .47881
    EDUC|     .05542***       .00120        46.050    .0000     11.3202
--------+---------------------------------------------------------------
Average Age = 43.5272.
Estimated partial effect   = .066225 - 2(.00074)(43.5272) = .00018.
Estimated variance         = 4.54799e-6 + 4(43.5272)^2(5.87973e-10) + 4(43.5272)(-5.1285e-8) = 7.4755086e-08.
Estimated standard error   = .00027341.

Specification and Functional Form: Interaction Effect

Population: $y = \beta_1 + \beta_2 x + \beta_3 z + \beta_4 xz + \varepsilon$
Estimator:  $\hat{y} = b_1 + b_2 x + b_3 z + b_4 xz$
Partial effect of x: $\delta_x = \partial E[y|x,z]/\partial x = \beta_2 + \beta_4 z$, estimated by $\hat{\delta}_x = b_2 + b_4 z$.
Estimator of the variance of $\hat{\delta}_x$:
$$Est.Var[\hat{\delta}_x] = Var[b_2] + z^2\,Var[b_4] + 2z\,Cov[b_2, b_4]$$

Interaction Effect

Ordinary least squares regression
LHS=LOGY     Mean                  =    -1.15746
             Standard deviation    =      .49149
             Number of observs.    =       27322
Model size   Parameters            =           4
             Degrees of freedom    =       27318
Residuals    Sum of squares        =  6540.45988
             Standard error of e   =      .48931
Fit          R-squared             =      .00896
             Adjusted R-squared    =      .00885
Model test   F[  3, 27318] (prob)  =    82.4(.0000)
--------+---------------------------------------------------------------
Variable|  Coefficient   Standard Error   b/St.Er.  P[|Z|>z]   Mean of X
--------+---------------------------------------------------------------
Constant|   -1.22592***       .01605       -76.376    .0000
     AGE|     .00227***       .00036         6.240    .0000     43.5272
  FEMALE|     .21239***       .02363         8.987    .0000      .47881
 AGE_FEM|    -.00620***       .00052       -11.819    .0000     21.2960
--------+---------------------------------------------------------------
Do women earn more than men (in this sample)? The +.21239 coefficient on FEMALE would suggest so. But the female "difference" is +.21239 - .00620(Age). At the average Age, the effect is .21239 - .00620(43.5272) = -.05748.

Structural Break

Linear Restrictions

Context: how do linear restrictions affect the properties of the least squares estimator?
Model: $y = X\beta + \varepsilon$
Theory (information): $R\beta - q = 0$
Restricted least squares estimator:
$$b_* = b - (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rb - q)$$
Expected value: $E[b_*|X] = \beta - (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(R\beta - q)$
Variance: $\sigma^2(X'X)^{-1} - \sigma^2(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}R(X'X)^{-1}$ = Var[b] minus a nonnegative definite matrix, which is $\le$ Var[b].
Implication: (as before) nonsample information reduces the variance of the estimator.

Interpretation

Case 1: The theory is correct, $R\beta - q = 0$ (the restrictions do hold). Then b* is unbiased and Var[b*] is smaller than Var[b]. How do we know this?
Case 2: The theory is incorrect, $R\beta - q \ne 0$ (the restrictions do not hold). Then b* is biased (what does this mean?), yet Var[b*] is still smaller than Var[b].
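Here is a minimal sketch of the restricted least squares estimator just defined, $b_* = b - (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rb - q)$. The data are simulated and the single adding-up restriction $\beta_2 + \beta_3 = 1$ is purely illustrative.

```python
# Minimal sketch: restricted least squares via
#   b* = b - (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (Rb - q).
# Simulated data; the restriction beta2 + beta3 = 1 is hypothetical.
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.5, 0.7, 0.3]) + rng.normal(scale=0.3, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                           # unrestricted least squares

R = np.array([[0.0, 1.0, 1.0]])                 # R beta = q encodes beta2 + beta3 = 1
q = np.array([1.0])
m = R @ b - q                                   # discrepancy vector, Rb - q
C = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T)
b_star = b - C @ m                              # restricted estimator
print(b, b_star, R @ b_star - q)                # the restriction holds exactly for b*
```

If the unrestricted coefficients already satisfied the restriction (m = 0), b* would equal b; otherwise b* is pulled away from b just enough that Rb* = q holds exactly.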
Linear Least Squares Subject to Restrictions

Restrictions: theory imposes certain restrictions on the parameters.
Some common applications:
  Dropping variables from the equation: certain coefficients in b are forced to equal 0. (Probably the most common testing situation: "Is a certain variable significant?")
  Adding-up conditions: sums of certain coefficients must equal fixed values. Examples: adding-up conditions in demand systems; constant returns to scale in production functions.
  Equality restrictions: certain coefficients must equal other coefficients. Example: using real vs. nominal variables in equations.
General formulation for linear restrictions: minimize the sum of squares, $e'e$, subject to the linear constraint $R\beta = q$.

Restricted Least Squares

In practice, restrictions can usually be imposed by solving them out.
1. Force a coefficient to equal zero (drop the variable from the equation).
   Problem: minimize over $\beta_1, \beta_2, \beta_3$: $\sum_{i=1}^{n}(y_i - \beta_1 x_{i1} - \beta_2 x_{i2} - \beta_3 x_{i3})^2$ subject to $\beta_3 = 0$.
   Solution: minimize over $\beta_1, \beta_2$: $\sum_{i=1}^{n}(y_i - \beta_1 x_{i1} - \beta_2 x_{i2})^2$.
2. Adding-up restriction. Impose $\beta_1 + \beta_2 + \beta_3 = 1$. Strategy: substitute $\beta_3 = 1 - \beta_1 - \beta_2$.
   Solution: minimize over $\beta_1, \beta_2$: $\sum_{i=1}^{n}(y_i - \beta_1 x_{i1} - \beta_2 x_{i2} - (1 - \beta_1 - \beta_2)x_{i3})^2 = \sum_{i=1}^{n}[(y_i - x_{i3}) - \beta_1(x_{i1} - x_{i3}) - \beta_2(x_{i2} - x_{i3})]^2$.
3. Equality restriction. Impose $\beta_3 = \beta_2$.
   Problem: minimize over $\beta_1, \beta_2, \beta_3$: $\sum_{i=1}^{n}(y_i - \beta_1 x_{i1} - \beta_2 x_{i2} - \beta_3 x_{i3})^2$ subject to $\beta_3 = \beta_2$.
   Solution: minimize over $\beta_1, \beta_2$: $\sum_{i=1}^{n}[y_i - \beta_1 x_{i1} - \beta_2(x_{i2} + x_{i3})]^2$.
In each case, the result is least squares using transformations of the data.

Restricted Least Squares Solution

General approach: a programming problem. Minimize over $\beta$: $L = (y - X\beta)'(y - X\beta)$ subject to $R\beta = q$.
Each row of R holds the K coefficients of one restriction; with J restrictions, R has J rows.
  $\beta_3 = 0$:  R = [0, 0, 1, 0, ...], q = (0).
  $\beta_2 = \beta_3$:  R = [0, 1, -1, 0, ...], q = (0).
  $\beta_2 = 0$ and $\beta_3 = 0$:  R = [0, 1, 0, 0, ...; 0, 0, 1, 0, ...], q = (0, 0)'.

Solution Strategy

Quadratic program: minimize a quadratic criterion subject to linear restrictions. All restrictions are binding. Solve using the Lagrangean formulation: minimize over $(\beta, \lambda)$
$$L^* = (y - X\beta)'(y - X\beta) + 2\lambda'(R\beta - q).$$
(The 2 is for convenience; see below.)

Restricted LS Solution: Necessary Conditions

$$\frac{\partial L^*}{\partial \beta} = -2X'(y - X\beta) + 2R'\lambda = 0$$
$$\frac{\partial L^*}{\partial \lambda} = 2(R\beta - q) = 0$$
Divide everything by 2 and collect in matrix form:
$$\begin{bmatrix} X'X & R' \\ R & 0 \end{bmatrix}\begin{bmatrix} \hat{\beta} \\ \hat{\lambda} \end{bmatrix} = \begin{bmatrix} X'y \\ q \end{bmatrix}, \qquad \text{i.e., } A\hat{\gamma} = w, \text{ with solution } \hat{\gamma} = A^{-1}w.$$
This does not rely on full rank of X; it relies on A having full column rank K + J.

Restricted Least Squares

If X has full rank, there is a partitioned solution for $\beta_*$ and $\lambda_*$:
$$\beta_* = b - (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rb - q)$$
$$\lambda_* = [R(X'X)^{-1}R']^{-1}(Rb - q)$$
where b is the simple least squares coefficient vector, $b = (X'X)^{-1}X'y$.
There are cases in which X does not have full rank. E.g., $X = [1, x_1, x_2, d_1, d_2, d_3, d_4]$, where $d_1, d_2, d_3, d_4$ are a complete set of dummy variables with coefficients $a_1, a_2, a_3, a_4$. The unrestricted b cannot be computed, but restricted LS with $a_1 + a_2 + a_3 + a_4 = 0$ can be.

Aspects of Restricted LS

1. $b_* = b - Cm$, where $C = (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}$ and m is the "discrepancy vector" $Rb - q$. Note what happens if m = 0. What does m = 0 mean?
2. $\lambda_* = [R(X'X)^{-1}R']^{-1}(Rb - q) = [R(X'X)^{-1}R']^{-1}m$. When does $\lambda_* = 0$? What does this mean?
3. Combining results: $b_* = b - (X'X)^{-1}R'\lambda_*$. How could $b_* = b$?

Restrictions and the Criterion Function

Assume the full rank X case (the usual case). $b = (X'X)^{-1}X'y$ uniquely minimizes $(y - X\beta)'(y - X\beta)$ over $\beta$:
$$(y - Xb)'(y - Xb) < (y - Xb_*)'(y - Xb_*) \text{ for any } b_* \ne b.$$
Imposing restrictions cannot improve the criterion value. It follows that $R^2_* \le R^2$: restrictions must degrade the fit.
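To connect the pieces above, here is a minimal sketch of the Lagrangean (bordered) system [X'X, R'; R, 0][beta; lambda] = [X'y; q]. It mirrors, in spirit, the dummy-variable example: X contains a constant plus a complete set of four dummies, so unrestricted least squares cannot be computed, but the restricted problem with $a_1 + a_2 + a_3 + a_4 = 0$ can. The data and group effects are simulated and hypothetical.

```python
# Minimal sketch: restricted LS from the bordered system
#   [X'X  R'] [beta]   [X'y]
#   [ R   0 ] [lam ] = [ q ] ,
# which is solvable even though X itself is short-ranked (dummy-variable trap).
# Simulated, hypothetical data.
import numpy as np

rng = np.random.default_rng(3)
n = 400
group = rng.integers(0, 4, size=n)                  # four exhaustive groups
D = np.eye(4)[group]                                # d1..d4, a complete set of dummies
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x, D])             # constant + x + d1..d4: rank deficient
y = 1.0 + 0.5 * x + D @ np.array([0.2, -0.1, 0.3, -0.4]) + rng.normal(scale=0.3, size=n)

R = np.array([[0.0, 0.0, 1.0, 1.0, 1.0, 1.0]])      # restriction a1 + a2 + a3 + a4 = 0
q = np.array([0.0])
K, J = X.shape[1], R.shape[0]

A = np.block([[X.T @ X, R.T],
              [R, np.zeros((J, J))]])               # bordered matrix; full rank even though X'X is not
w = np.concatenate([X.T @ y, q])
sol = np.linalg.solve(A, w)
b_star, lam = sol[:K], sol[K:]
print(b_star, R @ b_star - q)                       # coefficients and (numerically zero) discrepancy
```

Here the restriction supplies exactly the information lost to the collinearity between the constant and the dummies, so A has full rank K + J even though X'X is singular.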