AMS 572 Group #2: Multiple Linear Regression (Presentation Transcript)

Slide1

AMS 572 Group #2

Multiple Linear Regression

1Slide2

2Slide3

3Slide4

Outline

Jinmiao Fu - Introduction and History
Ning Ma - Establishment and Fitting of the Model
Ruoyu Zhou - Multiple Regression Model in Matrix Notation
Dawei Xu and Yuan Shang - Statistical Inference for Multiple Regression
Yu Mu - Regression Diagnostics
Chen Wang and Tianyu Lu - Topics in Regression Modeling
Tian Feng - Variable Selection Methods
Hua Mo - Chapter Summary and Modern Application

4Slide5

Introduction

Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Each set of values of the explanatory variables x1, ..., xk is associated with a value of the dependent variable y.

5Slide6

Example: The relationship between an adult’s health and his/her daily eating amount of wheat, vegetable and meat.

6Slide7

History

7Slide8

Karl Pearson (1857–1936)

Lawyer, Germanist, eugenicist, mathematician and statistician.

Contributions: the correlation coefficient, the method of moments, Pearson's system of continuous curves, chi distance, the p-value, statistical hypothesis testing theory, statistical decision theory, Pearson's chi-square test, and principal component analysis.

8Slide9

Sir Francis Galton FRS (16 February 1822 – 17 January 1911) was known for anthropology and polymathy; his doctoral students included Karl Pearson.

In the late 1860s, Galton conceived the standard deviation. He created the statistical concept of correlation and also discovered the properties of the bivariate normal distribution and its relationship to regression analysis.

9Slide10

Galton invented the use of the regression line (Bulmer 2003, p. 184), and was the first to describe and explain the common phenomenon of regression toward the mean, which he first observed in his experiments on the size of the seeds of successive generations of sweet peas.

10Slide11

The publication by

his cousin Charles Darwin of The Origin of Species in 1859 was an event that changed Galton's life. He came to be gripped by the work, especially the first chapter on "Variation under Domestication" concerning the breeding of domestic animals.

11Slide12

Adrien-Marie Legendre (18 September 1752 – 10 January 1833) was a French mathematician. He made important contributions to statistics, number theory, abstract algebra and mathematical analysis. He developed the least squares method, which has broad application in linear regression, signal processing, statistics, and curve fitting.

12Slide13

Johann Carl Friedrich Gauss

(30 April 1777 – 23 February 1855) was a German mathematician and scientist who contributed significantly to many fields, including number theory, statistics, analysis, differential geometry, geodesy, geophysics, electrostatics, astronomy and optics.

13Slide14

Gauss, who was 23 at the time, heard about the problem and tackled it. After three months of intense work, he predicted a position for Ceres in December 1801, just about a year after its first sighting, and this turned out to be accurate within a half-degree. In the process, he so streamlined the cumbersome mathematics of 18th-century orbital prediction that his work, published a few years later as Theory of Celestial Movement, remains a cornerstone of astronomical computation.

14Slide15

It introduced the Gaussian gravitational constant, and contained an influential treatment of the method of least squares, a procedure used in all sciences to this day to minimize the impact of measurement error. Gauss was able to prove the method in 1809 under the assumption of normally distributed errors (see Gauss–Markov theorem; see also Gaussian). The method had been described earlier by Adrien-Marie Legendre in 1805, but Gauss claimed that he had been using it since 1795.

15Slide16

Sir

Ronald Aylmer Fisher FRS (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, eugenicist and geneticist. He was described by Anders Hald as "a genius who almost single-handedly created the foundations for modern statistical science," and Richard Dawkins described him as "the greatest of Darwin's successors".

16Slide17

In addition to "analysis of variance", Fisher invented the technique of

maximum likelihood and originated the concepts of sufficiency, ancillarity, Fisher's linear discriminator and Fisher information.

17Slide18

Establishment and Fitting of the Model

18Slide19

Probabilistic Model

Y_i = β0 + β1·x_i1 + β2·x_i2 + ... + βk·x_ik + ε_i,   i = 1, 2, ..., n

where Y_i is the observed value of the random variable (r.v.) Y for the ith observation, β0, β1, ..., βk are the unknown model parameters, the x_i1, ..., x_ik are fixed predictor values, n is the number of observations, and the random errors ε_i ~ N(0, σ²) are i.i.d.

19Slide20

Fitting the Model

The least squares (LS) method provides estimates of the unknown model parameters by minimizing

Q = Σ_{i=1}^{n} [y_i − (β0 + β1·x_i1 + ... + βk·x_ik)]².

Setting the partial derivatives ∂Q/∂βj = 0 (j = 0, 1, ..., k) gives the normal equations, whose solution is the LS estimates β̂0, β̂1, ..., β̂k.

20Slide21

Tire tread wear vs. mileage (Example 11.1 in the textbook)

Mileage (in 1000 miles)   Groove Depth (in mils)
 0                        394.33
 4                        329.50
 8                        291.00
12                        255.17
16                        229.33
20                        204.83
24                        179.00
28                        163.83
32                        150.33

The table gives the measurements on the groove depth of one tire after every 4000 miles. Our goal: to build a model for the relation between the mileage and the groove depth of the tire.

21Slide22

SAS code for fitting the model:

data example;
  input mile depth @@;
  sqmile = mile*mile;
  datalines;
0 394.33 4 329.5 8 291 12 255.17 16 229.33
20 204.83 24 179 28 163.83 32 150.33
;
run;

proc reg data=example;
  model depth = mile sqmile;
run;

22Slide23

The fitted model is depth = 386.26 − 12.77·mile + 0.172·sqmile.

23Slide24

Goodness of Fit of the Model

Residuals: e_i = y_i − ŷ_i, where the ŷ_i are the fitted values.

Total sum of squares (SST): SST = Σ (y_i − ȳ)²
Regression sum of squares (SSR): SSR = Σ (ŷ_i − ȳ)²
Error sum of squares (SSE): SSE = Σ (y_i − ŷ_i)²

An overall measure of the goodness of fit is the coefficient of multiple determination r² = SSR/SST = 1 − SSE/SST.

24Slide25

Multiple Regression Model

In Matrix Notation

25Slide26

1. Transform the Formulas to Matrix Notation

26Slide27

The first column of X denotes the constant term (we can treat this as a predictor x_i0 with x_i0 = 1 for all i).

27Slide28

Finally, let β = (β0, β1, ..., βk)' and β̂ = (β̂0, β̂1, ..., β̂k)' denote the (k+1)×1 vectors of the unknown parameters and their LS estimates, respectively.

28Slide29

The model formula then becomes

y = Xβ + ε.

Simultaneously, the normal equations are changed to

X'X β̂ = X'y.

Solving this equation with respect to β̂, we get

β̂ = (X'X)^{-1} X'y

(if the inverse of the matrix X'X exists).

29Slide30

2. Example 11.2 (Tire Wear Data: Quadratic Fit Using Hand Calculations)

We will do Example 11.1 again in this part using the matrix approach. For the quadratic model to be fitted, the columns of X are the constant term, the mileage, and the squared mileage, and y is the vector of groove depths.

30Slide31

According to the formula β̂ = (X'X)^{-1}X'y, we need to calculate X'X first and then invert it to get (X'X)^{-1}.

31Slide32

Finally, we calculate the vector of LS estimates

 

32Slide33

Therefore, the LS quadratic model is

depth = 386.26 − 12.77·mile + 0.172·mile²,

the same model as we obtained in Example 11.1.
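As a cross-check, the same estimates can be computed directly from β̂ = (X'X)^{-1}X'y, for example with PROC IML. This is a sketch, not part of the original slides; it assumes SAS/IML is available and re-enters the tire data as matrix literals:

proc iml;
  /* tire data: mileage (in 1000 miles) and groove depth (in mils) */
  mile  = {0, 4, 8, 12, 16, 20, 24, 28, 32};
  depth = {394.33, 329.50, 291.00, 255.17, 229.33, 204.83, 179.00, 163.83, 150.33};
  /* design matrix: intercept, mile, mile squared */
  X = j(nrow(mile), 1, 1) || mile || mile##2;
  betahat = inv(X`*X) * X` * depth;   /* LS estimates (beta0, beta1, beta2) */
  print betahat;
quit;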

 

33Slide34

Statistical

Inference

for

Multiple

Regression

34Slide35

Statistical Inference for Multiple Regression

Goal: determine which predictor variables have statistically significant effects. For each predictor we test the hypotheses

H_0j: βj = 0  vs.  H_1j: βj ≠ 0.

If we cannot reject H_0j, then x_j is not a significant predictor of y.

35Slide36

Review statistical inference for

Simple Linear Regression

Statistical Inference on

36Slide37

Statistical Inference on the βj's

What about multiple regression? The steps are similar.

37Slide38

Statistical Inference on the βj's

What is Vjj, and why does it appear?

1. Mean

Recall from simple linear regression that the least squares estimators of the regression parameters β0 and β1 are unbiased. Here, the vector β̂ of least squares estimators is also unbiased: E(β̂) = β.

38Slide39

Statistical Inference on the βj's

2. Variance

Under the constant variance assumption, Var(ε_i) = σ² for all i, the covariance matrix of the LS estimators is Var(β̂) = σ²(X'X)^{-1}.

39Slide40

Statistical Inference on the βj's

Let Vjj be the jth diagonal entry of the matrix V = (X'X)^{-1}. Then Var(β̂j) = σ²·Vjj.

40Slide41

Statistical Inference on

41Slide42

Statistical Inference on

42Slide43

Statistical Inference on the βj's

Therefore, (β̂j − βj) / SE(β̂j) has a t-distribution with n − (k+1) degrees of freedom, where SE(β̂j) = s·√Vjj and s² = MSE = SSE/[n − (k+1)].

43Slide44

Statistical Inference on the βj's

Derivation of the confidence interval for βj: the 100(1 − α)% confidence interval for βj is

β̂j ± t_{n-(k+1), α/2} · SE(β̂j).

44Slide45

Statistical Inference on the βj's

The α-level t-test rejects H_0j if

|t_j| = |β̂j| / SE(β̂j) > t_{n-(k+1), α/2}.
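In SAS, the parameter estimates, standard errors, t statistics and the corresponding confidence limits can be requested directly from PROC REG (a sketch reusing the tire data set example defined earlier; CLB is a standard model option):

proc reg data=example;
  model depth = mile sqmile / clb alpha=0.05;  /* CLB prints 95% confidence limits for the betas */
run;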

45Slide46

Prediction of Future Observation

Having fitted a multiple regression model, suppose we wish to predict the future value of Y for a specified vector of predictor values x* = (x0*, x1*, ..., xk*)'. One way is to estimate E(Y*) by a confidence interval (CI); another is to give a prediction interval (PI) for Y* itself.
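The standard intervals under the normal-errors model above (a notational sketch, with s² = MSE and x* including the leading 1 for the intercept):

CI for E(Y*):  x*'β̂ ± t_{n-(k+1), α/2} · s · √( x*'(X'X)^{-1}x* )
PI for Y*:     x*'β̂ ± t_{n-(k+1), α/2} · s · √( 1 + x*'(X'X)^{-1}x* )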

46Slide47

Prediction of Future Observation

47Slide48

F-Test for the Overall Fit

Consider

H0: β1 = β2 = ... = βk = 0  vs.  H1: at least one βj ≠ 0.

Here H0 is the overall null hypothesis, which states that none of the predictor variables are related to y. The alternative states that at least one is related.

48Slide49

How to Build an F-Test

The test statistic F = MSR/MSE follows an F-distribution with k and n − (k+1) d.f. The α-level test rejects H0 if F > f_{k, n-(k+1), α}. Recall that MSR = SSR/k is the regression mean square and MSE = SSE/[n − (k+1)] is the error mean square, with n − (k+1) degrees of freedom.

49Slide50

The Relation Between F and r²

F can be written as a function of r². Using r² = SSR/SST, F can be expressed as

F = [r²/k] / [(1 − r²)/(n − (k+1))].

We see that F is an increasing function of r², so the F-test also tests the significance of r².

50Slide51

Analysis of Variance (ANOVA)

The relation between SST, SSR and SSE is SST = SSR + SSE, where

SST = Σ(y_i − ȳ)²,  SSR = Σ(ŷ_i − ȳ)²,  SSE = Σ(y_i − ŷ_i)².

The corresponding degrees of freedom satisfy n − 1 = k + [n − (k+1)].

51Slide52

ANOVA Table for Multiple Regression

Source of Variation   Sum of Squares (SS)   Degrees of Freedom (d.f.)   Mean Square (MS)        F
Regression            SSR                   k                           MSR = SSR/k             MSR/MSE
Error                 SSE                   n-(k+1)                     MSE = SSE/[n-(k+1)]
Total                 SST                   n-1

This table gives us a clear view of the analysis of variance for multiple regression.

52Slide53

Extra Sum of Squares Method for Testing Subsets of Parameters

Before, we considered the full model with k predictors. Now we consider the partial model in which the last m coefficients are set to zero, and we test these m coefficients for significance:

H0: β_{k-m+1} = ... = β_k = 0  vs.  H1: at least one of them ≠ 0.

53Slide54

Building an F-Test by Using the Extra Sum of Squares Method

Let SSR_p and SSE_p be the regression and error sums of squares for the partial model. Since SST is fixed regardless of the particular model, SSR_p + SSE_p = SSR + SSE = SST, so SSE_p − SSE = SSR − SSR_p. Then we have the test statistic

F = [(SSE_p − SSE)/m] / [SSE/(n − (k+1))] = [(SSE_p − SSE)/m] / MSE.

The α-level F-test rejects the null hypothesis if F > f_{m, n-(k+1), α}.
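In SAS, this partial F-test can be carried out with the TEST statement of PROC REG (a sketch using the cement data set example1 that appears later in these slides; testing x3 and x4 is just an illustrative choice of subset):

proc reg data=example1;
  model y = x1 x2 x3 x4;
  test x3 = 0, x4 = 0;   /* extra sum of squares F-test for H0: beta3 = beta4 = 0 */
run;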

54Slide55

Remarks on the F-test

The numerator d.f. is m, the number of coefficients set to zero, while the denominator d.f. is n − (k+1), the error d.f. for the full model. The MSE in the denominator is the normalizing factor and is an estimate of σ² for the full model. If the ratio is large, we reject H0.

55Slide56

Links Between ANOVA and the Extra Sum of Squares Method

Setting m = 1 gives the test of a single coefficient, which is equivalent to the t-test since F = t². Setting m = k, the partial model contains only the intercept, so SSE_p = SST and SSE_p − SSE = SSR. Hence the F-ratio equals

F = (SSR/k) / MSE = MSR/MSE

with k and n − (k+1) d.f., which is the overall ANOVA F-test.

56Slide57

Regression Diagnostics

57Slide58

5. Regression Diagnostics

5.1 Checking the Model Assumptions

Plots of the residuals against individual predictor variables: check for linearity.
A plot of the residuals against fitted values: check for constant variance.
A normal plot of the residuals: check for normality.

58Slide59

A run chart of the residuals: check whether the random errors are autocorrelated.
Plots of the residuals against any omitted predictor variables: check whether any of the omitted predictor variables should be included in the model.

59Slide60

Example: Plots of the residuals against individual predictor variables

60Slide61

SAS code
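The code shown on the original slide is not preserved in this transcript; a minimal sketch that produces this plot for the tire-data model (assuming the data set example defined earlier) is:

proc reg data=example;
  model depth = mile sqmile;
  output out=diag r=resid p=fitted;   /* save residuals and fitted values */
run;

proc sgplot data=diag;
  scatter x=mile y=resid;             /* residuals vs. the predictor mile */
  refline 0 / axis=y;
run;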

61Slide62

Example: plot of the residuals against fitted values

62Slide63

SAS code
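Again, the slide's code is not preserved; a sketch of the corresponding plot, reusing the diag data set created above:

proc sgplot data=diag;
  scatter x=fitted y=resid;   /* residuals vs. fitted values */
  refline 0 / axis=y;
run;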

63Slide64

Example: normal plot of the residuals

64Slide65

SAS code
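A sketch of one way to obtain the normal plot of the residuals from the same diag data set:

proc univariate data=diag normal;
  qqplot resid / normal(mu=est sigma=est);   /* normal Q-Q plot of the residuals */
run;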

65Slide66

5.2 Checking for Outliers and Influential Observations

Standardized residuals: e_i* = e_i / (s·√(1 − h_ii)); large values (say |e_i*| > 2) indicate outlier observations.

Hat matrix: H = X(X'X)^{-1}X'. If the hat matrix diagonal h_ii > 2(k+1)/n, then the ith observation is influential.
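These quantities can be obtained in SAS with PROC REG options (a sketch for the tire-data model; the output data set name infl is an arbitrary choice):

proc reg data=example;
  model depth = mile sqmile / r influence;    /* R and INFLUENCE print residual and leverage diagnostics */
  output out=infl h=leverage student=stdres;  /* save hat diagonals and studentized residuals */
run;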

66Slide67

Example: graphical exploration of outliers

67Slide68

Example: leverage plot

68Slide69

5.3 Data transformation

Transformations of the variables (both y and the x's) are often necessary to satisfy the assumptions of linearity, normality, and constant error variance. Many seemingly nonlinear models can be written in the multiple linear regression form after a suitable transformation. For example, the multiplicative model y = β0·x1^{β1}·x2^{β2}·ε becomes linear after taking logs, log y = log β0 + β1·log x1 + β2·log x2 + log ε, and the exponential model y = exp(β0 + β1x1 + β2x2)·ε likewise becomes linear in the x's after taking logs.

69Slide70

Topics in Regression Modeling

70Slide71

Multicollinearity

Multicollinearity occurs when two or more predictors in the model are correlated and provide redundant information about the response. Examples of multicollinear predictors are the height and weight of a person, years of education and income, and the assessed value and square footage of a home.

Consequences of high multicollinearity:
a. Increased standard errors of the estimates of the β's.
b. Often confusing and misleading results.

71Slide72

Detecting Multicollinearity

Easy way: compute correlations between all pairs of predictors. If some r are close to 1 or -1, remove one of the two correlated predictors from the model.

(Diagram: X1 and X2 are collinear, with correlation close to 1, while X3 is independent of X2.)

72Slide73

Detecting Multicollinearity

Another way: calculate the variance inflation factor for each predictor x_j,

VIF_j = 1 / (1 − R_j²),

where R_j² is the coefficient of determination of the model that regresses x_j on all predictors except the jth. If VIF_j > 10, then there is a problem of multicollinearity.
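In SAS, the VIFs can be requested as a model option in PROC REG (a sketch using the cement data set example1 defined later in these slides):

proc reg data=example1;
  model y = x1 x2 x3 x4 / vif tol;   /* VIF and tolerance (1/VIF) for each predictor */
run;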

73Slide74

Multicollinearity - Example

See Example 11.5 on page 416. The response is the heat of cement on a per-gram basis (y) and the predictors are tricalcium aluminate (x1), tricalcium silicate (x2), tetracalcium aluminoferrite (x3) and dicalcium silicate (x4).

74Slide75

Multicollinearity - Example

Estimated parameters in the first-order model: ŷ = 62.4 + 1.55x1 + 0.510x2 + 0.102x3 − 0.144x4.

F = 111.48 with p-value below 0.0001. Individual t-statistics and p-values: 2.08 (0.071), 0.70 (0.501), 0.14 (0.896), −0.20 (0.844).

Note that the sign on β̂4 is opposite of what is expected, and such a high F would suggest more than just one significant predictor.

75Slide76

Multicollinearity - Example

Correlations: r13 = −0.824 and r24 = −0.973. Also, the VIFs were all greater than 10. So there is a multicollinearity problem in this model, and we need a suitable algorithm to help us select the necessary variables.

76Slide77

Multicollinearity - Subset Selection

Algorithms for selecting subsets:

All possible subsets
Only feasible with a small number of potential predictors (maybe 10 or fewer); one or more of the possible numerical criteria can then be used to find the overall best subset.

Leaps and bounds method
Identifies the best subsets for each value of p; requires fewer variables than observations; can be quite effective for medium-sized data sets. Advantage: it gives several slightly different models to compare.

77Slide78

Multicollinearity - Subset Selection

Forward stepwise regression
Start with no predictors. First include the predictor with the highest correlation with the response. In subsequent steps, add the predictor with the highest partial correlation with the response, controlling for the variables already in the equation. Stop when the numerical criterion signals a maximum (or minimum); sometimes variables are eliminated when their t-value gets too small. This is the only feasible method for very large predictor pools, but it performs only a local optimization at each step, with no guarantee of finding the overall optimum.

Backward elimination
Start with all predictors in the equation. Remove the predictor with the smallest t-value. Continue until the numerical criterion signals a maximum (or minimum). This often produces a different final model than the forward stepwise method.

78Slide79

Multicollinearity - Best Subsets Criteria

Numerical criteria for choosing the best subsets: there is no single generally accepted criterion, and none should be followed too mindlessly. Most common criteria combine a measure of fit with a penalty for increasing complexity (number of predictors).

Coefficient of determination (ordinary multiple R-square)
Always increases with an increasing number of predictors, so it is not very good for comparing models with different numbers of predictors.

Adjusted R-square
Will decrease if the increase in R-square with increasing p is small.

79Slide80

Multicollinearity - Best Subsets Criteria

Residual mean square (MSEp)
Equivalent to adjusted R-square, except that we look for the minimum. The minimum occurs when an added variable does not decrease the error sum of squares enough to offset the loss of an error degree of freedom.

Mallows' Cp statistic
Should be about equal to p; look for small values near p. Requires an estimate of the overall error variance.

PRESS statistic
The subset associated with the minimum value of PRESSp is chosen. Intuitively easier to grasp than the Cp-criterion.

80Slide81

Multicollinearity - Forward Stepwise

First include the predictor with the highest correlation with the response (it enters if its partial F-statistic exceeds F_IN = 4).

81Slide82

Multicollinearity - Forward Stepwise

In subsequent steps, add the predictor with the highest partial correlation with the response, controlling for the variables already in the equation (if F_i > F_IN = 4, enter x_i; if F_i < F_OUT = 4, remove x_i).

82Slide83

Multicollinearity - Forward Stepwise

(SAS output: a variable enters when its partial F > F_IN = 4 and is removed when its partial F < F_OUT = 4.)

83Slide84

Multicollinearity - Forward Stepwise

Summarizing the stepwise algorithm: our "best model" should include only x1 and x2, which gives ŷ = 52.5773 + 1.4683x1 + 0.6623x2.

84Slide85

Multicollinearity - Forward Stepwise

Check the significance of the model and of the individual parameters again. We find that the p-values are all small and each VIF is far less than 10.

85Slide86

Multicollinearity - Best Subsets

We can also stop when a numerical criterion signals a maximum (or minimum), and sometimes eliminate variables when their t-value gets too small.

86Slide87

Multicollinearity - Best Subsets

The largest R-square value, 0.9824, is associated with the full model. The best subset that minimizes the Cp-criterion includes x1 and x2. The subset that maximizes the adjusted R-square (or equivalently minimizes MSEp) is x1, x2, x4. However, the adjusted R-square increases only from 0.9744 to 0.9763 by the addition of x4 to the model already containing x1 and x2. Thus the simpler model chosen by the Cp-criterion is preferred; its fitted equation is ŷ = 52.5773 + 1.4683x1 + 0.6623x2.

87Slide88

Polynomial Model

Polynomial models are useful in situations where the analyst knows that curvilinear effects are present in the true response function. We can do this with more than one explanatory variable using a polynomial regression model, for example the second-degree model in two variables

y = β0 + β1x1 + β2x2 + β3x1² + β4x2² + β5x1x2 + ε.

88Slide89

Multicollinearity-Polynomial Models

Multicollinearity is a problem in polynomial regression (with terms of second and higher order): x and x² tend to be highly correlated. A special solution in polynomial models is to use z_i = x_i − x̄_i instead of just x_i. That is, first subtract from each predictor its mean, and then use the deviations in the model.
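A sketch of this centering for the tire data fitted earlier (the sample mean of the mileage values 0, 4, ..., 32 is 16):

data poly;
  set example;             /* tire data set from the earlier example */
  zmile  = mile - 16;      /* center mileage at its sample mean */
  zmile2 = zmile*zmile;    /* squared centered term, nearly uncorrelated with zmile */
run;

proc reg data=poly;
  model depth = zmile zmile2;
run;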

89Slide90

Multicollinearity - Polynomial Model

Example: x = 2, 3, 4, 5, 6 and x² = 4, 9, 16, 25, 36. As x increases, so does x²; r_{x,x²} = 0.98. With x̄ = 4, z = x − x̄ = −2, −1, 0, 1, 2 and z² = 4, 1, 0, 1, 4. Thus z and z² are no longer correlated: r_{z,z²} = 0.

If the model is fitted in terms of z with coefficients γ, we can get the estimates of the β's from the estimates of the γ's, since the two parameterizations are related by substituting z = x − x̄.

90Slide91

Dummy Predictor Variable

The dummy variable is a simple and useful method of introducing into a regression analysis information contained in variables that are not conventionally measured on a numerical scale, e.g., race, gender, region, etc.

91Slide92

Dummy Predictor Variable

The categories of an ordinal variable can be assigned suitable numerical scores. A nominal variable with c ≥ 2 categories can be coded using c − 1 indicator variables, X1, ..., X_{c-1}, called dummy variables:

X_i = 1 for the ith category and 0 otherwise (i = 1, ..., c − 1);
X_1 = ... = X_{c-1} = 0 for the cth category.

92Slide93

Dummy Predictor Variable

If y is a worker's salary and D_i = 1 for a non-smoker, D_i = 0 for a smoker, we can model this in the following way:

y_i = α + β·D_i + ε_i.

93Slide94

Dummy Predictor Variable

Equally, we could use the dummy variable in a model with other explanatory variables. In addition to the dummy variable, we could also add years of experience (x), to give

y_i = α + β·D_i + γ·x_i + ε_i,

so the mean salary is α + γx for a smoker and (α + β) + γx for a non-smoker.

94Slide95

Dummy Predictor Variable

(Plot: salary y against experience x; two parallel lines with intercept α for smokers and α + β for non-smokers.)

95Slide96

Dummy Predictor Variable

We can also add an interaction between smoking and experience with respect to their effects on salary:

y_i = α + β·D_i + γ·x_i + δ·(D_i·x_i) + ε_i,

so the mean salary is α + γx for a smoker and (α + β) + (γ + δ)x for a non-smoker.
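A sketch of how such a dummy-plus-interaction model could be fitted in SAS (the data set salary and its variables salary, exper and the 0/1 dummy d are hypothetical names, not from the original slides):

data salary2;
  set salary;        /* hypothetical data set with salary, exper, and dummy d */
  dx = d*exper;      /* interaction between the dummy and experience */
run;

proc reg data=salary2;
  model salary = d exper dx;
run;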

96Slide97

Dummy Predictor Variable

(Plot: with the interaction term, the smoker and non-smoker lines have different intercepts, α and α + β, and different slopes.)

97Slide98

Standardized Regression Coefficients

We typically want to compare predictors in terms of the magnitudes of their effects on the response variable. We use standardized regression coefficients to judge the effects of predictors measured in different units.

98Slide99

Standardized Regression Coefficients

They are the LS parameter estimates obtained by running a regression on standardized variables, defined as follows:

y_i* = (y_i − ȳ)/s_y,   x_ij* = (x_ij − x̄_j)/s_j,

where s_y and the s_j are the sample SDs of y and of x_j, respectively.

99Slide100

Standardized Regression Coefficients

Letting β_j* = β̂_j·(s_j/s_y) for j = 1, ..., k, the magnitudes of the β_j* can be directly compared to judge the relative effects of the x_j on y.

100Slide101

Standardized Regression Coefficients

Since β_0* = 0, the constant can be dropped from the standardized model. Let r be the vector of sample correlations between y and each of the x_j, and R be the matrix of sample correlations among the x_j's.

101Slide102

Standardized Regression Coefficients

So we can get β* = R^{-1} r. This method of computing the β_j* is numerically more stable than computing β̂ = (X'X)^{-1}X'y directly, because all entries of R and r are between −1 and 1.
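In SAS, standardized regression coefficients can be requested with the STB model option (a sketch, using the cement data set example1 from these slides purely as an illustration):

proc reg data=example1;
  model y = x1 x2 / stb;   /* STB prints the standardized regression coefficients */
run;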

102Slide103

Standardized Regression Coefficients

Example (given on page 424): from the calculation we obtain the LS estimates β̂1 and β̂2 and the sample standard deviations of x1, x2 and y, and then compute β_1* and β_2*. Note that β_1* is larger in magnitude than β_2*, although the unstandardized β̂1 is the smaller of the two. Thus x1 has a larger effect than x2 on y.

103Slide104

Standardized Regression Coefficients

We can also use the matrix method to compute the standardized regression coefficients. First we compute the correlation matrix between x1, x2 and y, which gives R and r. Next we calculate R^{-1}, and hence β* = R^{-1}r, which is the same result as before.

104Slide105

Variable Selection Methods

105Slide106

How to decide their salaries?

Lionel Messi: 10,000,000 EURO/yr; age 23; attacker; 5 years; more than 20 goals per year.
Carles Puyol: 5,000,000 EURO/yr; age 32; defender; 11 years; less than 1 goal per year.

106Slide107

How to select variables?

1) Stepwise Regression
2) Best Subset Regression

107Slide108

Stepwise Regression

Partial F-test
Partial correlation coefficients
How to do it in SAS?
Drawbacks

108Slide109

Partial F-test

(p−1)-variable model: y = β0 + β1x1 + ... + β_{p-1}x_{p-1} + ε
p-variable model: y = β0 + β1x1 + ... + β_{p-1}x_{p-1} + β_p·x_p + ε

We test H_0p: β_p = 0 vs. H_1p: β_p ≠ 0.

109Slide110

How to do the test?

We reject H_0p in favor of H_1p at level α if

F_p = (SSE_{p-1} − SSE_p) / [SSE_p / (n − p − 1)] > f_{1, n-p-1, α}.

110Slide111

Another way to interpret the test: the test statistic t_p = β̂_p / SE(β̂_p) satisfies t_p² = F_p, and we reject H_0p at level α if |t_p| > t_{n-p-1, α/2}.

111Slide112

Partial Correlation Coefficients

The partial F-statistic can equivalently be expressed through the partial correlation coefficient between y and x_p, controlling for x1, ..., x_{p-1}:

r²_{yx_p · x1,...,x_{p-1}} = (SSE_{p-1} − SSE_p) / SSE_{p-1}.

*Add x_p to the regression equation that includes x1, ..., x_{p-1} only if this partial correlation (equivalently F_p) is large enough.

112Slide113

How to do it in SAS? (Ex 9, a continuation of Ex 5)

No.  X1  X2  X3  X4      Y
 1    7  26   6  60   78.5
 2    1  29  15  52   74.3
 3   11  56   8  20  104.3
 4   11  31   8  47   87.6
 5    7  52   6  33   95.9
 6   11  55   9  22  109.2
 7    3  71  17   6  102.7
 8    1  31  22  44   72.5
 9    2  54  18  22   93.1
10   21  47   4  26  115.9
11    1  40  23  34   83.8
12   11  66   9  12  113.3
13   10  68   8  12  109.4

The table shows data on the heat evolved in calories during the hardening of cement on a per-gram basis (y), along with the percentages of four ingredients: tricalcium aluminate (x1), tricalcium silicate (x2), tetracalcium aluminoferrite (x3), and dicalcium silicate (x4).

113Slide114

SAS Code

data example1;
  input x1 x2 x3 x4 y;
  datalines;
 7 26  6 60  78.5
 1 29 15 52  74.3
11 56  8 20 104.3
11 31  8 47  87.6
 7 52  6 33  95.9
11 55  9 22 109.2
 3 71 17  6 102.7
 1 31 22 44  72.5
 2 54 18 22  93.1
21 47  4 26 115.9
 1 40 23 34  83.8
11 66  9 12 113.3
10 68  8 12 109.4
;
run;

proc reg data=example1;
  model y = x1 x2 x3 x4 / selection=stepwise;
run;

114Slide115

SAS output

115Slide116

SAS output

116Slide117

Interpretation

At the first step, x4 is chosen into the equation because it has the largest correlation with y among the 4 predictors. At the second step, we choose x1 into the equation because it has the highest partial correlation with y controlling for x4. At the third step, since the partial F-statistic of x2 is greater than that of x3, x2 is chosen into the equation rather than x3.

117Slide118

Interpretation

At the 4th step, we removed x4 from the model since its partial F-statistic is too small. From Ex 11.5, we know that x4 is highly correlated with x2. Note that in Step 4 the R-square is 0.9787, which is slightly higher than 0.9725, the R-square of Step 2. This indicates that even though x4 is the best single predictor of y, the pair (x1, x2) is a better predictor than the pair (x1, x4).

118Slide119

Drawbacks

The final model is not guaranteed to be optimal in any specified sense. It yields a single final model, while in practice there are often several equally good models.

119Slide120

Best Subset Regression

Comparison to the Stepwise Method
Optimality Criteria
How to do it in SAS?

120Slide121

Comparison to Stepwise Regression

In best subsets regression, a subset of variables is chosen from the full set of predictors so as to optimize a well-defined objective criterion. The best subsets algorithm permits determination of a specified number of best subsets, from which the choice of the final model can be made by the investigator.

121Slide122

Optimality Criteria

122Slide123

Optimality Criteria

The standardized mean square error of prediction involves unknown parameters such as the βj's, so we minimize a sample estimate of it, Mallows' Cp statistic:

Cp = SSEp / σ̂² − n + 2(p+1),

where σ̂² is usually the MSE of the full model and p is the number of predictors in the subset.

123Slide124

Optimality Criteria

In practice, we use the Cp-criterion because of its ease of computation and its ability to judge the predictive power of a model.

124Slide125

How to do it in SAS? (Ex 11.9)

proc reg data=example1;
  model y = x1 x2 x3 x4 / selection=adjrsq mse cp;
run;

125Slide126

SAS output

126Slide127

Interpretation

The best subset that minimizes Cp is {x1, x2}, which is the same model selected by stepwise regression in the former example. The subset that maximizes the adjusted R-square is {x1, x2, x4}. However, the adjusted R-square increases only from 0.9744 to 0.9763 by the addition of x4 to the model that already contains x1 and x2. Thus, the model chosen by the Cp-criterion is preferred.

127Slide128

Chapter Summary

and Modern Application

128Slide129

Model (extension of simple regression): Y_i = β0 + β1·x_i1 + ... + βk·x_ik + ε_i, where β0, β1, ..., βk are unknown parameters and the ε_i are i.i.d. N(0, σ²).

Least squares method: minimize Q = Σ [y_i − (β0 + β1·x_i1 + ... + βk·x_ik)]²; in matrix form, β̂ = (X'X)^{-1}X'y.

Goodness of fit of the model: r² = SSR/SST = 1 − SSE/SST.

129Slide130

Statistical inference on the βj's:
Hypotheses: H_0j: βj = 0 vs. H_1j: βj ≠ 0; test statistic: t = β̂j/SE(β̂j) with n − (k+1) d.f.

Overall F-test:
Hypotheses: H0: β1 = ... = βk = 0 vs. H1: at least one βj ≠ 0; test statistic: F = MSR/MSE with k and n − (k+1) d.f.

Residual Analysis

Data Transformation

130Slide131

The General Hypothesis Test:

Compare the full model with the partial model.
Hypotheses: H0: the m coefficients dropped from the full model are zero vs. H1: at least one of them is not.
Test statistic: F = [(SSE_partial − SSE_full)/m] / MSE_full; reject H0 when F > f_{m, n-(k+1), α}.

Estimating and Predicting Future Observations:

Let Ŷ* = x*'β̂ and SE(Ŷ*) = s·√(x*'(X'X)^{-1}x*).
CI for the estimated mean E(Y*): Ŷ* ± t_{n-(k+1), α/2}·SE(Ŷ*).
PI for the estimated Y*: Ŷ* ± t_{n-(k+1), α/2}·s·√(1 + x*'(X'X)^{-1}x*).

131Slide132

partial F-test

partial Correlation Coefficient

132Slide133

Application of the MLR model

Linear regression is widely used in the biological, chemical, financial and social sciences to describe possible relationships between variables. It ranks as one of the most important tools in these disciplines.

133Slide134

Application areas: chemistry, heredity, financial markets, biology, housing prices.

134Slide135

Example

Broadly speaking, an asset pricing model can be expressed as

E(R_i) = β_{i1}·f_1 + β_{i2}·f_2 + ... + β_{ik}·f_k + u_i,

where E(R_i), f_k and k denote the expected return on asset i, the kth risk factor and the number of risk factors, respectively, and u_i denotes the specific return on asset i.

135Slide136

The equation can also be expressed in matrix notation as E(R) = B·f + u, where B = [β_{ik}] is called the factor loading matrix.

136Slide137

What are the most important factors?

Interest rate

Inflation rate

Employment rate

Rate of return on the market portfolio

Government policies

GDP

137Slide138

Method

Step 1: Find the efficient factors (EM algorithm, maximum likelihood).
Step 2: Fit the model and estimate the factor loadings (multiple linear regression).

138Slide139

By running the multiple linear regression on the data in SAS, we can get the factor loadings and the coefficient of multiple determination. From the SAS output we can identify the factors that most affect the return, build an appropriate multiple-factor model, and then use the model to predict the future return and make a good choice!

139Slide140

Questions

Thank you

140