Linear Statistical Models


Linear Statistical Models as a First Statistics Course for Math Majors. George W. Cobb, Mount Holyoke College (GCobb@MtHolyoke.edu). CAUSE Webinar, October 12, 2010.

Presentation Transcript

Slide1

Linear Statistical Models as a First Statistics Course for Math Majors

George W. Cobb

Mount Holyoke College

GCobb@MtHolyoke.edu

CAUSE Webinar

October 12, 2010

Slide2

Overview

A. Goals for a first stat course for math majors

B. Example of a modeling challenge

C. Examples of methodological challenges

D. Some important tensions

E. Two geometries

F. Gauss-Markov Theorem

G. Conclusion

Slide3

A. Goals for a First Statistics Course for Math Majors:

1. Minimize prerequisites

2. Teach what we want students to learn

- Data analysis and modeling

- Methodological challenges

- Current practice

3. Appeal to the mathematical mind

- Mathematical substance

- Abstraction as process

- Math for its own sake

Slide4

B. A Modeling Challenge: Pattern only: How many groups?

Slide5

B. A Modeling Challenge: Pattern plus context: two lines?

Slide6

B. A Modeling Challenge: Lurking variable 1: The five solid dots

Slide7

B. A Modeling Challenge: Lurking variable 1: The five solid dots

Slide8

B. A Modeling Challenge: Lurking variable 2: confounding

Slide9

B. A Modeling Challenge: Lurking variable 2: confounding

Slide10

B. A Modeling Challenge: Lurking variable 2: confounding

Slide11

B. The Modeling Challenge: summary

Do the data provide evidence of discrimination?

Alternative explanations based on classical economics

Additional variables:

- percent unemployed in the subject

- percent non-academic jobs in the subject

- median non-academic salary in the subject

Which model(s) are most useful?

Slide12

C. Methodological Challenges

How to “solve” an inconsistent linear system?

(Stigler, 1990: The History of Statistics)

How to measure goodness of fit?

(Invariance issues)

How to identify influential points?

(Exploratory plots)

How to measure multicollinearity?

Note that none of these requires any assumptions about probability distributions.
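Three of these challenges can be illustrated without any probability model. The following is a minimal Python sketch, not part of the original slides: it fits a line to an inconsistent system by least squares, measures fit with R², and flags a potentially influential point by its leverage (the diagonal of the hat matrix). The data values and function names are hypothetical, chosen only for illustration.

```python
# Hypothetical data (not from the slides): five cases, the last one far out in x.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 6.0]


def fit_line(x, y):
    """Least-squares fit of y ~ b0 + b1*x, solved from the normal equations."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def r_squared(x, y, b0, b1):
    """Goodness of fit: 1 - SSE/SST."""
    ybar = sum(y) / len(y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - sse / sst


def leverages(x):
    """Diagonal of the hat matrix for a line with intercept."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]


b0, b1 = fit_line(x, y)
print("intercept, slope:", b0, b1)
print("R^2:", r_squared(x, y, b0, b1))
print("leverages:", leverages(x))
```

The leverages sum to 2, the number of fitted coefficients, and the case at x = 10 carries most of that total, which is what marks it as potentially influential.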

Slide13

D. Some Important Tensions

1. Data analysis v. methodological challenges

2. Abstraction: Top down v. bottom up

3. Math as tool v. math as aesthetic object

4. Structure by dimension v. structure by assumptions

Number of covariates (columns): One, Two, Many

Distribution assumptions (rows): A. None, B. Moments, C. Normality

5. Two geometries

Slide14

D4. Structure by assumptions

A. No distribution assumptions about errors

1. Inconsistent linear systems; OLS Theorem

2. Measuring fit and correlation

3. Measuring influence and the Hat matrix

4. Measuring multicollinearity

B. Moment assumptions: E{ε} = 0, Var{ε} = σ²I

1. Moment Theorem: EV and Var for OLS estimators

2. Variance Estimation Theorem: E{MSE} = σ²

3. Gauss-Markov Theorem: OLS = BLUE

C. Normality assumption: ε ~ N
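For reference, the three results listed under the moment assumptions can be written out as below. This is a sketch in standard notation that the slide itself does not spell out: the model y = Xβ + ε with X an n × p design matrix of full column rank, and MSE defined as the residual sum of squares divided by n − p, are assumptions made here.

```latex
\begin{align*}
\text{Model:} \quad
  & y = X\beta + \varepsilon, \qquad
    E\{\varepsilon\} = 0, \qquad
    \operatorname{Var}\{\varepsilon\} = \sigma^2 I \\
\text{OLS estimator:} \quad
  & \hat\beta = (X^T X)^{-1} X^T y \\
\text{Moment Theorem:} \quad
  & E\{\hat\beta\} = \beta, \qquad
    \operatorname{Var}\{\hat\beta\} = \sigma^2 (X^T X)^{-1} \\
\text{Variance Estimation Theorem:} \quad
  & E\{\mathrm{MSE}\} = \sigma^2, \qquad
    \mathrm{MSE} = \frac{\lVert y - X\hat\beta \rVert^2}{n - p} \\
\text{Gauss-Markov Theorem:} \quad
  & \operatorname{Var}\{a^T y\} \ge \operatorname{Var}\{c^T \hat\beta\}
    \quad \text{for every linear unbiased estimator } a^T y \text{ of } c^T \beta
\end{align*}
```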

Slide15

D4. Structure by assumptions

C. Normality assumption: ε ~ N

1. Herschel-Maxwell Theorem

2. Distribution of OLS estimators

3. t-distribution and confidence intervals for β_j

4. Chi-square distribution and confidence interval for σ²

5. F-distribution and nested F-test
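The intervals and test named in items 3 to 5 take the following standard forms. This is a sketch under assumed notation that does not appear on the slide: n cases, p coefficients, MSE as the residual sum of squares over n − p, and subscripts R and F for the reduced and full models in the nested comparison.

```latex
\begin{align*}
\text{CI for } \beta_j: \quad
  & \hat\beta_j \pm t_{n-p,\,1-\alpha/2}\,
    \sqrt{\mathrm{MSE}\,\bigl[(X^T X)^{-1}\bigr]_{jj}} \\
\text{CI for } \sigma^2: \quad
  & \frac{(n-p)\,\mathrm{MSE}}{\sigma^2} \sim \chi^2_{n-p}
    \quad\Longrightarrow\quad
    \left[ \frac{(n-p)\,\mathrm{MSE}}{\chi^2_{n-p,\,1-\alpha/2}},\;
           \frac{(n-p)\,\mathrm{MSE}}{\chi^2_{n-p,\,\alpha/2}} \right] \\
\text{Nested F-test:} \quad
  & F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_F)/(\mathrm{df}_R - \mathrm{df}_F)}
             {\mathrm{SSE}_F/\mathrm{df}_F}
    \sim F_{\mathrm{df}_R - \mathrm{df}_F,\;\mathrm{df}_F}
    \quad \text{when the reduced model holds}
\end{align*}
```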

Slide16

E. Two Geometries: the Crystal Problem (Tom Moore, Primus, 1992)

y₁ = β + ε₁

y₂ = 2β + ε₂
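The crystal problem is a small inconsistent linear system: two measurements and one unknown β. A minimal Python sketch of its least-squares solution follows. The measurement values and the function name are hypothetical and are not taken from the slides or from Moore's article.

```python
def crystal_ols(y1, y2):
    """Least-squares estimate of beta for y1 = beta + e1, y2 = 2*beta + e2.

    Minimizing (y1 - b)**2 + (y2 - 2*b)**2 over b gives the normal
    equation 5*b = y1 + 2*y2.
    """
    return (y1 + 2 * y2) / 5


# Hypothetical measurements, for illustration only.
y1, y2 = 1.2, 1.9
b = crystal_ols(y1, y2)

# Fitted vector lies along the model direction (1, 2);
# the residual vector is what is left over.
fitted = (b, 2 * b)
resid = (y1 - fitted[0], y2 - fitted[1])

# The residual is orthogonal to the model direction (1, 2),
# so this dot product is zero up to rounding.
dot = resid[0] * 1 + resid[1] * 2
print("estimate:", b)
print("residual:", resid)
print("residual . (1, 2):", dot)
```

In variable space, the fitted vector is the projection of (y₁, y₂) onto the line spanned by (1, 2), which is why the residual comes out perpendicular to that line.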

Slide17

E. Two Geometries: Crystal Problem

Individual Space      Variable Space

Point = Case          Vector = Variable

Axis = Variable       Axis = Case

Slide18

F. Gauss-Markov Theorem: Crystal Example

y₁ = β + ε₁, y₂ = 2β + ε₂

LINEAR: estimator = a₁y₁ + a₂y₂

UNBIASED: a₁β + a₂(2β) = β, i.e. a₁ + 2a₂ = 1.

Ex: y₁ = 1y₁ + 0y₂, a = (1, 0)ᵀ

Ex: (1/2)y₂ = 0y₁ + (1/2)y₂, a = (0, 1/2)ᵀ

Ex: (1/5)y₁ + (2/5)y₂, a = (1/5, 2/5)ᵀ

BEST: SD²(aᵀy) = σ²|a|², so best = shortest a

THEOREM: OLS = BLUE
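The three example estimators can be compared directly by their squared coefficient lengths. The following Python sketch, which is an illustration and not part of the slides, checks the unbiasedness condition a₁ + 2a₂ = 1 for each, computes |a|² exactly with fractions, and then searches a grid of other linear unbiased estimators. The helper names and the grid are assumptions made for this example.

```python
from fractions import Fraction as F


def is_unbiased(a):
    """a1*y1 + a2*y2 is unbiased for beta exactly when a1 + 2*a2 == 1."""
    return a[0] + 2 * a[1] == 1


def variance_factor(a):
    """Var(a^T y) / sigma^2 = |a|^2 when Var{e} = sigma^2 * I."""
    return a[0] ** 2 + a[1] ** 2


# The three estimators from the slide, as coefficient vectors a.
estimators = {
    "y1": (F(1), F(0)),
    "y2/2": (F(0), F(1, 2)),
    "OLS": (F(1, 5), F(2, 5)),
}

for name, a in estimators.items():
    print(name, "unbiased:", is_unbiased(a), "|a|^2 =", variance_factor(a))

# Every linear unbiased estimator has the form a = (1 - 2t, t).
# Scan t over a grid and keep the shortest coefficient vector.
grid = [(1 - 2 * F(k, 100), F(k, 100)) for k in range(-100, 201)]
best = min(grid, key=variance_factor)
print("shortest a on the grid:", best)
```

The grid search is only a numerical check. The theorem itself says the minimum over all of the line a₁ + 2a₂ = 1 is attained at the OLS coefficients.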

Slide19

F. Gauss-Markov Theorem: Coefficient Space for the Crystal Problem

Slide20

F. Gauss-Markov Theorem: Estimator y₁ = (1, 0)(y₁, y₂)ᵀ

Slide21

F. Gauss-Markov Theorem: Estimator y₂/2 = (0, 1/2)(y₁, y₂)ᵀ

Slide22

F. Gauss-Markov Theorem: OLS estimator = (1/5, 2/5)(y₁, y₂)ᵀ

Slide23

F. Gauss-Markov Theorem: The Set of Linear Unbiased Estimators

Slide24

F. Gauss-Markov Theorem: LUEs form a translate of error space

Slide25

F. Gauss-Markov Theorem: OLS estimator lies in model space

Slide26

F. Gauss-Markov Theorem: Four Steps plus Pythagoras

1. OLS estimator is an LUE

2. LUEs of β_j are a flat set in n-space

3. LUEs of 0 = error space

4. OLS estimator lies in model space
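One way the four steps can combine with Pythagoras is sketched below. This is a reconstruction under the moment assumptions E{ε} = 0, Var{ε} = σ²I, not the slide's own wording. The symbols a for the coefficient vector of an arbitrary LUE, a_OLS for the OLS coefficient vector, and d for their difference are introduced here for illustration.

```latex
\begin{align*}
& \text{Let } a^T y \text{ be any LUE of } \beta_j
  \text{ and } a_{\mathrm{OLS}}^T y \text{ the OLS estimator (Step 1: also an LUE).} \\
& d = a - a_{\mathrm{OLS}} \text{ gives an LUE of } 0
  \text{ (Step 2), so } d \in \text{error space (Step 3).} \\
& a_{\mathrm{OLS}} \in \text{model space (Step 4), and model space} \perp \text{error space, so }
  a_{\mathrm{OLS}} \perp d. \\
& \text{Pythagoras:} \quad
  \lVert a \rVert^2
  = \lVert a_{\mathrm{OLS}} \rVert^2 + \lVert d \rVert^2
  \ge \lVert a_{\mathrm{OLS}} \rVert^2 . \\
& \text{Hence} \quad
  \operatorname{Var}\{a^T y\} = \sigma^2 \lVert a \rVert^2
  \ge \sigma^2 \lVert a_{\mathrm{OLS}} \rVert^2
  = \operatorname{Var}\{a_{\mathrm{OLS}}^T y\},
  \text{ with equality only when } d = 0 .
\end{align*}
```

In the crystal example, model space is the line spanned by (1, 2), error space is the line a₁ + 2a₂ = 0, and a_OLS = (1/5, 2/5) is the point of the LUE line a₁ + 2a₂ = 1 closest to the origin.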

Slide27

G. Conclusion: A Least Squares Course can be

1. Accessible

- Requires only Calc. I and matrix algebra

2. A good vehicle for teaching data modeling

3. A sequence of methodological challenges

4. Mathematically attractive

- Mathematical substance

- Abstraction as process

- Math as tool and for its own sake

5. A direct route to current practice

- Generalized linear models

- Correlated data, time-to-event data

- Hierarchical Bayes