AMS 572 Group #2
Multiple Linear Regression
Outline
Jinmiao Fu: Introduction and History
Ning Ma: Establishing and Fitting the Model
Ruoyu Zhou: Multiple Regression Model in Matrix Notation
Dawei Xu and Yuan Shang: Statistical Inference for Multiple Regression
Yu Mu: Regression Diagnostics
Chen Wang and Tianyu Lu: Topics in Regression Modeling
Tian Feng: Variable Selection Methods
Hua Mo: Chapter Summary and Modern Application
Introduction
Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Every value of the independent variable x is associated with a value of the dependent variable y.
Example: the relationship between an adult's health and his/her daily intake of wheat, vegetables and meat.
History
Karl Pearson (1857–1936)
Lawyer, Germanist, eugenicist, mathematician and statistician.
Contributions: the correlation coefficient, the method of moments, Pearson's system of continuous curves, chi distance, the p-value, statistical hypothesis testing theory, statistical decision theory, Pearson's chi-square test, and principal component analysis.
Sir Francis Galton FRS (16 February 1822 – 17 January 1911)
Anthropology and polymathy; doctoral student: Karl Pearson.
In the late 1860s, Galton conceived the standard deviation. He created the statistical concept of correlation and also discovered the properties of the bivariate normal distribution and its relationship to regression analysis.
Galton invented the use of the regression line (Bulmer 2003, p. 184), and was the first to describe and explain the common phenomenon of regression toward the mean, which he first observed in his experiments on the size of the seeds of successive generations of sweet peas.
The publication by
his cousin Charles Darwin of The Origin of Species in 1859 was an event that changed Galton's life. He came to be gripped by the work, especially the first chapter on "Variation under Domestication" concerning the breeding of domestic animals.
Adrien-Marie Legendre (18 September 1752 – 10 January 1833) was a French mathematician. He made important contributions to statistics, number theory, abstract algebra and mathematical analysis. He developed the least squares method, which has broad application in linear regression, signal processing, statistics, and curve fitting.
Johann Carl Friedrich Gauss
(30 April 1777 – 23 February 1855) was a German mathematician and scientist who contributed significantly to many fields, including number theory, statistics, analysis, differential geometry, geodesy, geophysics, electrostatics, astronomy and optics.
Gauss, who was 23 at the time, heard about the problem and tackled it. After three months of intense work, he predicted a position for Ceres in December 1801—just about a year after its first sighting—and this turned out to be accurate within a half-degree. In the process, he so streamlined the cumbersome mathematics of 18th century orbital prediction that his work—published a few years later as
Theory of Celestial Movement—remains a cornerstone of astronomical computation.
It introduced the Gaussian gravitational constant, and contained an influential treatment of the method of least squares, a procedure used in all sciences to this day to minimize the impact of measurement error. Gauss was able to prove the method in 1809 under the assumption of normally distributed errors (see Gauss–Markov theorem; see also Gaussian). The method had been described earlier by Adrien-Marie Legendre in 1805, but Gauss claimed that he had been using it since 1795.
Sir Ronald Aylmer Fisher FRS (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, eugenicist and geneticist. He was described by Anders Hald as "a genius who almost single-handedly created the foundations for modern statistical science," and Richard Dawkins described him as "the greatest of Darwin's successors".
In addition to "analysis of variance", Fisher invented the technique of
maximum likelihood and originated the concepts of sufficiency, ancillarity, Fisher's linear discriminator and Fisher information.
Establishing and Fitting the Model
Probabilistic Model:
y_i = β_0 + β_1 x_i1 + … + β_k x_ik + ε_i,  i = 1, 2, …, n,
where y_i is the observed value of the random variable (r.v.) Y_i, which depends on the fixed predictor values x_i1, …, x_ik; β_0, β_1, …, β_k are the unknown model parameters; n is the number of observations; and the random errors ε_i are i.i.d. N(0, σ²).
Fitting the Model
The least squares (LS) method provides estimates of the unknown model parameters β_0, β_1, …, β_k by minimizing
Q = Σ_{i=1}^{n} [y_i − (β_0 + β_1 x_i1 + … + β_k x_ik)]².
Setting the partial derivatives ∂Q/∂β_j = 0 (j = 0, 1, …, k) gives the normal equations, whose solution is the vector of LS estimates.
Tire tread wear vs. mileage (Example 11.1 in the textbook)

Mileage (in 1000 miles)    Groove Depth (in mils)
 0                         394.33
 4                         329.50
 8                         291.00
12                         255.17
16                         229.33
20                         204.83
24                         179.00
28                         163.83
32                         150.33

The table gives the measurements on the groove depth of one tire after every 4000 miles. Our goal: build a model for the relation between the mileage and the groove depth of the tire.
SAS code for fitting the model:

data example;
input mile depth @@;
sqmile = mile*mile;
datalines;
0 394.33 4 329.5 8 291 12 255.17 16 229.33 20 204.83 24 179 28 163.83 32 150.33
;
run;

proc reg data=example;
model depth = mile sqmile;
run;
The fitted model is Depth = 386.26 − 12.77 mile + 0.172 sqmile.
Goodness of Fit of the Model
Residuals: e_i = y_i − ŷ_i, where the ŷ_i are the fitted values.
Total sum of squares (SST): SST = Σ (y_i − ȳ)².
Regression sum of squares (SSR): SSR = Σ (ŷ_i − ȳ)².
Error sum of squares (SSE): SSE = Σ (y_i − ŷ_i)².
An overall measure of the goodness of fit is the coefficient of multiple determination r² = SSR/SST = 1 − SSE/SST.
Multiple Regression Model in Matrix Notation
1. Transform the Formulas to Matrix Notation
Write the n responses as a vector y and the predictor values as an n × (k+1) matrix X. The first column of X corresponds to the constant term (we can treat it as a predictor x_i0 with x_i0 ≡ 1 for all i).
Finally, let β = (β_0, β_1, …, β_k)′ and β̂ = (β̂_0, β̂_1, …, β̂_k)′ denote the (k+1) × 1 vectors of the unknown parameters and of their LS estimates, respectively.
In this notation the model formula becomes
y = Xβ + ε.
Simultaneously, the linear (normal) equations are changed to
X′X β̂ = X′y.
Solving this equation with respect to β̂, we get
β̂ = (X′X)⁻¹ X′y  (if the inverse of the matrix X′X exists).
2. Example 11.2 (Tire Wear Data: Quadratic Fit Using Hand Calculations)
We will do Example 11.1 again in this part using the matrix approach. For the quadratic model to be fitted, the columns of X are the constant 1, the mileage x_i, and the squared mileage x_i².
According to the formula β̂ = (X′X)⁻¹X′y, we need to calculate X′X first and then invert it to get (X′X)⁻¹.
Finally, we calculate the vector of LS estimates β̂ = (X′X)⁻¹X′y.
Therefore, the LS quadratic model is Depth = 386.26 − 12.77 mile + 0.172 sqmile. This model is the same as the one we obtained in Example 11.1.
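As a side note, the same matrix calculation could be reproduced in SAS with PROC IML; the following is a minimal sketch (not part of the original slides), using the tire data from Example 11.1:

proc iml;                                   /* matrix version of the hand calculation */
x = {1  0    0,
     1  4   16,
     1  8   64,
     1 12  144,
     1 16  256,
     1 20  400,
     1 24  576,
     1 28  784,
     1 32 1024};                            /* columns: 1, mileage, mileage squared */
y = {394.33, 329.50, 291.00, 255.17, 229.33, 204.83, 179.00, 163.83, 150.33};
beta = inv(x`*x) * x` * y;                  /* beta-hat = (X'X)^(-1) X'y */
print beta;
quit;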
Statistical Inference for Multiple Regression
Statistical Inference for Multiple Regression
Goal: determine which predictor variables have statistically significant effects. For each j we test the hypotheses
H_0j: β_j = 0 vs. H_1j: β_j ≠ 0.
If we can't reject H_0j, then x_j is not a significant predictor of y.
Review: statistical inference for simple linear regression, i.e., inference on β_0 and β_1.
Statistical Inference on β_j
What about multiple regression? The steps are similar.
Statistical Inference on β_j
What is V_jj, and why do we need it?
1. Mean. Recall from simple linear regression that the least squares estimators for the regression parameters β_0 and β_1 are unbiased. Here, the vector β̂ of least squares estimators is also unbiased: E(β̂) = β.
Statistical Inference on β_j
2. Variance. Under the constant variance assumption Var(ε_i) = σ², the covariance matrix of the LS estimators is Var(β̂) = σ² (X′X)⁻¹.
Statistical Inference on β_j
Let V_jj be the jth diagonal entry of the matrix (X′X)⁻¹. Then Var(β̂_j) = σ² V_jj.
Statistical Inference on β_j
Since σ² is unknown, we estimate it by s² = MSE = SSE/[n − (k + 1)], so that the estimated standard error of β̂_j is SE(β̂_j) = s√V_jj. Therefore,
(β̂_j − β_j)/(s√V_jj) ~ t_{n−(k+1)}.
Statistical Inference on β_j
Derivation of the confidence interval for β_j: the 100(1 − α)% confidence interval for β_j is
β̂_j ± t_{n−(k+1), α/2} · s√V_jj.
Statistical Inference on β_j
Reject H_0j at level α if
|t_j| = |β̂_j|/(s√V_jj) > t_{n−(k+1), α/2}.
Prediction of Future Observations
Having fitted a multiple regression model, suppose we wish to predict the future value Y* for a specified vector of predictor values x* = (x_0*, x_1*, …, x_k*)′ with x_0* = 1. One way is to estimate E(Y*) by a confidence interval (CI).
Prediction of Future Observations
The point prediction is ŷ* = x*′β̂. The 100(1 − α)% CI for E(Y*) is
ŷ* ± t_{n−(k+1), α/2} · s√(x*′(X′X)⁻¹x*),
and the 100(1 − α)% prediction interval (PI) for Y* is
ŷ* ± t_{n−(k+1), α/2} · s√(1 + x*′(X′X)⁻¹x*).
F-Test for the Overall Model
Consider H_0: β_1 = β_2 = … = β_k = 0 vs. H_1: at least one β_j ≠ 0. Here H_0 is the overall null hypothesis, which states that none of the predictor variables is related to y; the alternative states that at least one is related.
How to Build an F-Test
The test statistic F = MSR/MSE follows an F-distribution with k and n − (k + 1) d.f. The α-level test rejects H_0 if F > f_{k, n−(k+1), α}. Recall that MSR = SSR/k and that MSE (the error mean square) equals SSE/[n − (k + 1)], with n − (k + 1) degrees of freedom.
The relation between F and r²
F can be written as a function of r². Using r² = SSR/SST, F can be expressed as
F = [r²/(1 − r²)] · [n − (k + 1)]/k.
We see that F is an increasing function of r² and tests its significance.
Analysis of Variance (ANOVA)
The relation between SST, SSR and SSE is SST = SSR + SSE, where
SST = Σ (y_i − ȳ)², SSR = Σ (ŷ_i − ȳ)², SSE = Σ (y_i − ŷ_i)².
The corresponding degrees of freedom satisfy (n − 1) = k + [n − (k + 1)].
ANOVA Table for Multiple Regression

Source of Variation    Sum of Squares (SS)    Degrees of Freedom (d.f.)    Mean Square (MS)           F
Regression             SSR                    k                            MSR = SSR/k                F = MSR/MSE
Error                  SSE                    n − (k + 1)                  MSE = SSE/[n − (k + 1)]
Total                  SST                    n − 1

This table gives us a clear view of the analysis of variance for multiple regression.
Extra Sum of Squares Method for Testing Subsets of Parameters
Before, we considered the full model with all k predictors. Now consider the partial model in which the last m coefficients are set to zero, and test these m coefficients for significance:
H_0: β_{k−m+1} = … = β_k = 0 vs. H_1: at least one of them ≠ 0.
Building the F-Test Using the Extra Sum of Squares Method
Let SSR_partial and SSE_partial be the regression and error sums of squares for the partial model, and SSR_full and SSE_full the corresponding quantities for the full model. Since SST is fixed regardless of the particular model, SSE_partial − SSE_full = SSR_full − SSR_partial. Then we have
F = [(SSE_partial − SSE_full)/m] / [SSE_full/(n − (k + 1))].
The α-level F-test rejects the null hypothesis if F > f_{m, n−(k+1), α}.
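In SAS, this kind of extra-sum-of-squares test can be requested with the TEST statement of PROC REG; a hedged sketch, assuming a dataset example1 with predictors x1 through x4 and response y (the cement data listed later in the Variable Selection section):

proc reg data=example1;
model y = x1 x2 x3 x4;
test x3 = 0, x4 = 0;   /* joint F-test of H0: beta3 = beta4 = 0 against the full model */
run;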
Remarks on the F-Test
The numerator d.f. is m, the number of coefficients set to zero, while the denominator d.f. is n − (k + 1), the error d.f. for the full model. The MSE in the denominator is the normalizing factor, which is an estimate of σ² for the full model. If the ratio is large, we reject H_0.
Links between ANOVA and the Extra Sum of Squares Method
Letting m = k, the partial model contains only the intercept, so SSR_partial = 0 and SSE_partial = SST. From the above we can derive SSE_partial − SSE_full = SST − SSE = SSR. Hence, the F-ratio equals
F = (SSR/k)/MSE = MSR/MSE
with k and n − (k + 1) d.f., which is exactly the overall ANOVA F-test. Letting m = 1 gives the test of a single coefficient, equivalent to the t-test.
Regression Diagnostics
5 Regression Diagnostics
5.1 Checking the Model Assumptions
Plots of the residuals against individual predictor variables: check for linearity.
A plot of the residuals against the fitted values: check for constant variance.
A normal plot of the residuals: check for normality.
A run chart of the residuals: check whether the random errors are autocorrelated.
Plots of the residuals against any omitted predictor variables: check whether any of the omitted predictor variables should be included in the model.
Examples (tire wear data): plots of the residuals against individual predictor variables, a plot of the residuals against the fitted values, and a normal plot of the residuals, together with the corresponding SAS code (figures and code shown on the original slides).
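A minimal SAS sketch (not the code from the slides) of how such diagnostic plots could be produced for the tire data, assuming the dataset and model from the earlier example:

proc reg data=example;
model depth = mile sqmile;
output out=diag r=resid p=fitted;   /* save residuals and fitted values */
run;

proc sgplot data=diag;              /* residuals vs. a predictor: check linearity */
scatter x=mile y=resid;
refline 0 / axis=y;
run;

proc sgplot data=diag;              /* residuals vs. fitted values: check constant variance */
scatter x=fitted y=resid;
refline 0 / axis=y;
run;

proc univariate data=diag;          /* normal (Q-Q) plot of the residuals: check normality */
var resid;
qqplot resid / normal(mu=est sigma=est);
run;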
5.2 Checking for Outliers and Influential Observations
Standardized residuals: e_i* = e_i/(s√(1 − h_ii)); large values indicate outlier observations.
Hat matrix: H = X(X′X)⁻¹X′. If the hat matrix diagonal h_ii > 2(k + 1)/n, then the ith observation is influential.
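As a hedged sketch, PROC REG can print these quantities directly via its R and INFLUENCE model options (again using the tire data as an assumed example):

proc reg data=example;
model depth = mile sqmile / r influence;   /* R: residual analysis; INFLUENCE: hat diagonals, DFFITS, etc. */
run;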
Examples: graphical exploration of outliers and a leverage plot (figures shown on the original slides).
5.3 Data Transformation
Transformations of the variables (both y and the x's) are often necessary to satisfy the assumptions of linearity, normality, and constant error variance. Many seemingly nonlinear models can be written in the multiple linear regression form after a suitable transformation. For example, the multiplicative model y = β_0 x_1^{β_1} x_2^{β_2} ε becomes linear after taking logarithms: log y = log β_0 + β_1 log x_1 + β_2 log x_2 + log ε.
Topics in Regression Modeling
Multicollinearity
Multicollinearity occurs when two or more predictors in the model are correlated and provide redundant information about the response. Examples of multicollinear predictors are height and weight of a person, years of education and income, and assessed value and square footage of a home.
Consequences of high multicollinearity:
a. Increased standard errors of the estimates of the β's.
b. Often confusing and misleading results.
Detecting Multicollinearity
Easy way: compute the correlations between all pairs of predictors. If some of the correlations are close to 1 or −1, remove one of the two correlated predictors from the model.
Detecting Multicollinearity
Another way: calculate the variance inflation factor (VIF) for each predictor x_j:
VIF_j = 1/(1 − R_j²),
where R_j² is the coefficient of determination of the model that includes all predictors except the jth predictor. If VIF_j ≥ 10, then there is a problem of multicollinearity.
Multicollinearity: Example
See Example 11.5 on page 416. The response is the heat evolved by cement on a per-gram basis (y), and the predictors are tricalcium aluminate (x1), tricalcium silicate (x2), tetracalcium aluminoferrite (x3) and dicalcium silicate (x4).
Multicollinearity: Example
Estimated parameters in the first-order model: ŷ = 62.4 + 1.55x1 + 0.510x2 + 0.102x3 − 0.144x4.
F = 111.48 with p-value below 0.0001. Individual t-statistics and p-values: 2.08 (0.071), 0.70 (0.501), 0.14 (0.896), −0.20 (0.844).
Note that the sign of the estimate of β_4 is the opposite of what is expected, and such a high F-value would suggest more than just one significant predictor.
Multicollinearity: Example
Correlations: r13 = −0.824 and r24 = −0.973. Also, the VIFs were all greater than 10. So there is a multicollinearity problem in this model, and we need a variable selection algorithm to help us choose the necessary variables.
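A hedged sketch of how the VIFs for this model could be requested in SAS, assuming the cement data are stored in the dataset example1 defined later in the Variable Selection section:

proc reg data=example1;
model y = x1 x2 x3 x4 / vif;   /* prints a variance inflation factor for each predictor */
run;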
Multicollinearity: Subset Selection
Algorithms for selecting subsets:
All possible subsets: only feasible with a small number of potential predictors (maybe 10 or fewer); one or more of the possible numerical criteria can then be used to find the overall best subset.
Leaps-and-bounds method: identifies the best subsets for each value of p; requires fewer variables than observations; can be quite effective for medium-sized data sets; an advantage is having several slightly different models to compare.
Multicollinearity: Subset Selection
Forward stepwise regression: start with no predictors; first include the predictor with the highest correlation with the response; in subsequent steps add the predictor with the highest partial correlation with the response, controlling for the variables already in the equation; stop when the numerical criterion signals a maximum (minimum); sometimes eliminate variables when their t-value gets too small. This is the only feasible method for very large predictor pools, but it optimizes locally at each step, with no guarantee of finding the overall optimum.
Backward elimination: start with all predictors in the equation; remove the predictor with the smallest t-value; continue until the numerical criterion signals a maximum (minimum). It often produces a different final model than the forward stepwise method.
Multicollinearity: Best Subsets Criteria
Numerical criteria for choosing the best subsets: there is no single generally accepted criterion, and none should be followed too mindlessly. The most common criteria combine a measure of fit with a penalty for increasing complexity (number of predictors).
Coefficient of determination (ordinary multiple R-square): always increases with the number of predictors, so it is not very good for comparing models with different numbers of predictors.
Adjusted R-square: will decrease if the increase in R-square with increasing p is small.
Multicollinearity: Best Subsets Criteria
Residual mean square (MSEp): equivalent to adjusted R-square, except that we look for the minimum; the minimum occurs when an added variable does not decrease the error sum of squares enough to offset the loss of an error degree of freedom.
Mallows' Cp statistic: should be about equal to p; look for small values near p; requires an estimate of the overall error variance.
PRESS statistic: the model with the minimum PRESSp is chosen; intuitively easier to grasp than the Cp criterion.
Multicollinearity: Forward Stepwise
Step 1: include the predictor with the highest correlation with the response, provided its partial F-statistic exceeds F_IN = 4 (SAS output shown on the slide).
Multicollinearity: Forward Stepwise
In subsequent steps, add the predictor with the highest partial correlation with the response, controlling for the variables already in the equation (if F_i > F_IN = 4, enter x_i; if F_i < F_OUT = 4, remove x_i).
Multicollinearity: Forward Stepwise
At a later step, a previously entered predictor whose partial F-statistic falls below F_OUT = 4 is removed, while predictors with partial F above F_IN = 4 are entered (SAS output shown on the slide).
Multicollinearity: Forward Stepwise
Summarizing the stepwise algorithm: our "best model" should include only x1 and x2, that is, ŷ = 52.5773 + 1.4683x1 + 0.6623x2.
Multicollinearity: Forward Stepwise
Check the significance of the model and of the individual parameters again: the p-values are all small, and each VIF is far less than 10.
Multicollinearity: Best Subsets
Alternatively, we can stop when the numerical criterion signals a maximum (minimum), and sometimes eliminate variables when their t-value gets too small.
Multicollinearity: Best Subsets
The largest R-squared value, 0.9824, is associated with the full model. The best subset that minimizes the Cp criterion includes x1 and x2. The subset that maximizes the adjusted R-squared (or, equivalently, minimizes MSEp) is {x1, x2, x4}, but the adjusted R-squared increases only from 0.9744 to 0.9763 by the addition of x4 to the model already containing x1 and x2. Thus the simpler model chosen by the Cp criterion is preferred; the fitted model is ŷ = 52.5773 + 1.4683x1 + 0.6623x2.
Polynomial Models
Polynomial models are useful in situations where the analyst knows that curvilinear effects are present in the true response function. We can do this with more than one explanatory variable using a polynomial regression model, for example the second-order model in two predictors
y = β_0 + β_1 x_1 + β_2 x_2 + β_3 x_1² + β_4 x_2² + β_5 x_1 x_2 + ε.
Multicollinearity in Polynomial Models
Multicollinearity is a problem in polynomial regression (with terms of second and higher order): x and x² tend to be highly correlated. A special solution in polynomial models is to use z_i = x_i − x̄_i instead of just x_i; that is, first subtract its mean from each predictor and then use the deviations in the model.
Multicollinearity in Polynomial Models: Example
x = 2, 3, 4, 5, 6 and x² = 4, 9, 16, 25, 36. As x increases, so does x²; r_{x,x²} = 0.98. With x̄ = 4, z = −2, −1, 0, 1, 2 and z² = 4, 1, 0, 1, 4. Thus z and z² are no longer correlated: r_{z,z²} = 0. We can get the estimates of the β's from the estimates of the γ's (the coefficients of the centered model), since the centered model is just a reparametrization of the original one.
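A minimal SAS sketch of the centering idea, assuming a hypothetical dataset poly with predictor x and response y:

data centered;
set poly;        /* hypothetical input dataset */
z = x - 4;       /* center x at its mean (4 in this example) */
z2 = z*z;        /* squared centered term replaces x*x */
run;

proc reg data=centered;
model y = z z2;
run;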
Dummy Predictor Variables
The dummy variable is a simple and useful method of introducing into a regression analysis information contained in variables that are not conventionally measured on a numerical scale, e.g., race, gender, region, etc.
The categories of an ordinal variable could be assigned suitable numerical scores. A nominal variable with c ≥ 2 categories can be coded using c − 1 indicator variables X_1, …, X_{c−1}, called dummy variables: X_i = 1 for the ith category and 0 otherwise, and X_1 = … = X_{c−1} = 0 for the cth category.
Dummy Predictor Variables
If y is a worker's salary and D = 1 for a non-smoker, D = 0 for a smoker, we can model this in the following way:
y = α + βD + ε.
Dummy Predictor Variables
Equally, we could have used the dummy variable in a model with other explanatory variables. In addition to the dummy variable we could also add years of experience (x), to give
y = α + βD + γx + ε,
so that E(y) = α + γx for a smoker (D = 0) and E(y) = (α + β) + γx for a non-smoker (D = 1).
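A hedged SAS sketch of this coding, using hypothetical dataset and variable names (salary data with a smoker flag and years of experience):

data salary2;
set salary;                    /* hypothetical dataset with smoker ('Y'/'N'), exper and pay */
nonsmoker = (smoker = 'N');    /* dummy: 1 for non-smokers, 0 for smokers */
run;

proc reg data=salary2;
model pay = nonsmoker exper;   /* y = alpha + beta*D + gamma*x + error */
run;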
Dummy Predictor Variables
(Figure: two parallel regression lines of y on x, with intercept α for smokers and α + β for non-smokers.)
Dummy Predictor Variables
We can also add the interaction between smoking and experience with respect to their effects on salary:
y = α + βD + γx + δ(D·x) + ε,
so that E(y) = α + γx for a smoker and E(y) = (α + β) + (γ + δ)x for a non-smoker.
Dummy Predictor Variables
(Figure: with the interaction term, the two regression lines have different intercepts and different slopes.)
Standardized Regression Coefficients
We typically want to compare predictors in terms of the magnitudes of their effects on the response variable. We use standardized regression coefficients to judge the effects of predictors measured in different units.
They are the LS parameter estimates obtained by running a regression on the standardized variables, defined as follows:
y_i* = (y_i − ȳ)/s_y and x_ij* = (x_ij − x̄_j)/s_j,
where s_y and s_j are the sample SDs of y and x_j.
Standardized Regression Coefficients
Let β̂_j* = β̂_j (s_j/s_y) for j = 1, …, k. The magnitudes of the β̂_j* can be directly compared to judge the relative effects of the x_j on y.
Standardized Regression Coefficients
Since the standardized variables have mean zero, the constant term vanishes and can be dropped from the model. Let r be the vector of sample correlations between y and each x_j, and R be the sample correlation matrix of the x_j's.
Standardized Regression Coefficients
Then we get β̂* = R⁻¹ r. This method of computing the standardized coefficients is numerically more stable than computing β̂ directly, because all entries of R and r are between −1 and 1.
Standardized Regression Coefficients
Example (given on page 424): from the calculation we obtain the LS estimates and the sample standard deviations of x1, x2 and y, and then compute the standardized coefficients. Comparing their magnitudes shows that x1 has a larger effect than x2 on y, a conclusion that is not obvious from the unstandardized estimates.
Standardized Regression Coefficients
We can also use the matrix method to compute the standardized regression coefficients: first compute the correlation matrix R of x1 and x2 and the vector r of their correlations with y, then calculate β̂* = R⁻¹ r. This gives the same result as before.
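As a hedged practical note, PROC REG can print standardized estimates directly via the STB model option; a sketch using the cement data from the selection example:

proc reg data=example1;
model y = x1 x2 / stb;   /* STB prints standardized coefficients alongside the raw estimates */
run;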
Variable Selection Methods
How to decide their salaries?
Lionel Messi: 10,000,000 EURO/yr; age 23; attacker; 5 years with the club; more than 20 goals per year.
Carles Puyol: 5,000,000 EURO/yr; age 32; defender; 11 years with the club; less than 1 goal per year.
How to select variables?
1) Stepwise regression
2) Best subsets regression
Stepwise Regression
Partial F-test; partial correlation coefficients; how to do it in SAS; drawbacks.
Partial F-test
(p − 1)-variable model: y = β_0 + β_1 x_1 + … + β_{p−1} x_{p−1} + ε.
p-variable model: y = β_0 + β_1 x_1 + … + β_{p−1} x_{p−1} + β_p x_p + ε.
How to do the test?
We test H_0: β_p = 0 vs. H_1: β_p ≠ 0, and we reject H_0 in favor of H_1 at level α if
F_p = (SSE_{p−1} − SSE_p) / [SSE_p/(n − p − 1)] > f_{1, n−p−1, α}.
Another way to interpret the test: the test statistic t_p = β̂_p/SE(β̂_p) satisfies t_p² = F_p, and we reject H_0 at level α if |t_p| > t_{n−p−1, α/2}.
Partial Correlation Coefficients
The partial correlation between y and x_p, adjusting for the predictors x_1, …, x_{p−1} already in the model, is related to the partial F-statistic by
F_p = (n − p − 1) r²_{yx_p·x_1…x_{p−1}} / (1 − r²_{yx_p·x_1…x_{p−1}}).
Add x_p to the regression equation that includes x_1, …, x_{p−1} only if this quantity is large enough.
How to do it in SAS? (Ex. 9, a continuation of Ex. 5)

No.   x1   x2   x3   x4      y
 1     7   26    6   60    78.5
 2     1   29   15   52    74.3
 3    11   56    8   20   104.3
 4    11   31    8   47    87.6
 5     7   52    6   33    95.9
 6    11   55    9   22   109.2
 7     3   71   17    6   102.7
 8     1   31   22   44    72.5
 9     2   54   18   22    93.1
10    21   47    4   26   115.9
11     1   40   23   34    83.8
12    11   66    9   12   113.3
13    10   68    8   12   109.4

The table shows data on the heat evolved in calories during the hardening of cement on a per-gram basis (y), along with the percentages of four ingredients: tricalcium aluminate (x1), tricalcium silicate (x2), tetracalcium aluminoferrite (x3), and dicalcium silicate (x4).
SAS Code

data example1;
input x1 x2 x3 x4 y;
datalines;
7 26 6 60 78.5
1 29 15 52 74.3
11 56 8 20 104.3
11 31 8 47 87.6
7 52 6 33 95.9
11 55 9 22 109.2
3 71 17 6 102.7
1 31 22 44 72.5
2 54 18 22 93.1
21 47 4 26 115.9
1 40 23 34 83.8
11 66 9 12 113.3
10 68 8 12 109.4
;
run;

proc reg data=example1;
model y = x1 x2 x3 x4 / selection=stepwise;
run;
SAS output (stepwise selection summary shown on the original slides).
Interpretation
At the first step, x4 is chosen into the equation because it has the largest correlation with y among the 4 predictors. At the second step, we choose x1 because it has the highest partial correlation with y controlling for x4. At the third step, since the partial F-statistic of x2 is greater than that of x3, x2 is chosen into the equation rather than x3.
Interpretation
At the 4th step, we remove x4 from the model since its partial F-statistic is too small. From Ex. 11.5, we know that x4 is highly correlated with x2. Note that in Step 4 the R-square is 0.9787, which is slightly higher than 0.9725, the R-square of Step 2. This indicates that even though x4 is the best single predictor of y, the pair (x1, x2) is a better predictor than the pair (x1, x4).
Drawbacks
The final model is not guaranteed to be optimal in any specified sense. It yields a single final model, while in practice there are often several equally good models.
Best Subsets Regression
Comparison to the stepwise method; optimality criteria; how to do it in SAS.
Comparison to Stepwise Regression
In best subsets regression, a subset of variables is chosen that optimizes a well-defined objective criterion. The best subsets algorithm permits determination of a specified number of best subsets, from which the choice of the final model can be made by the investigator.
Optimality Criteria
Standardized mean square error of prediction: involves unknown parameters such as the β_j's, so we minimize a sample estimate of it instead.
Mallows' C_p: C_p = SSE_p/σ̂² − (n − 2p), where SSE_p is the error sum of squares of the candidate model with p parameters and σ̂² is the MSE of the full model; subsets with C_p close to p are preferred.
In practice, we use the C_p criterion because of its ease of computation and its ability to judge the predictive power of a model.
How to do it in SAS? (Ex. 11.9)

proc reg data=example1;
model y = x1 x2 x3 x4 / selection=adjrsq mse cp;
run;
SAS output (best subsets summary shown on the original slide).
Interpretation
The best subset that minimizes C_p is {x1, x2}, which is the same model selected by stepwise regression in the former example. The subset that maximizes the adjusted R-square is {x1, x2, x4}. However, the adjusted R-square increases only from 0.9744 to 0.9763 by the addition of x4 to the model which already contains x1 and x2. Thus, the model chosen by the C_p criterion is preferred.
Chapter Summary and Modern Application
Model (extension of simple regression):
y_i = β_0 + β_1 x_i1 + … + β_k x_ik + ε_i, where β_0, …, β_k are unknown parameters and the ε_i are i.i.d. N(0, σ²).
Least squares method: minimize Q = Σ [y_i − (β_0 + β_1 x_i1 + … + β_k x_ik)]²; in matrix form, β̂ = (X′X)⁻¹X′y.
Goodness of fit of the model: r² = SSR/SST = 1 − SSE/SST.
Statistical inference on β_j:
Hypotheses: H_0j: β_j = 0 vs. H_1j: β_j ≠ 0. Test statistic: t = β̂_j/SE(β̂_j) with n − (k + 1) d.f.
Overall F-test:
Hypotheses: H_0: β_1 = … = β_k = 0 vs. H_1: at least one β_j ≠ 0. Test statistic: F = MSR/MSE with k and n − (k + 1) d.f.
Residual analysis and data transformation are used to check the model assumptions.
The general hypothesis test: compare the full model y_i = β_0 + β_1 x_i1 + … + β_k x_ik + ε_i with the partial model in which the last m coefficients are set to zero.
Hypotheses: H_0: β_{k−m+1} = … = β_k = 0 vs. H_1: at least one of them ≠ 0.
Test statistic: F = [(SSE_partial − SSE_full)/m] / [SSE_full/(n − (k + 1))]. Reject H_0 when F > f_{m, n−(k+1), α}.
Estimating and predicting future observations:
Let x* = (1, x_1*, …, x_k*)′ and ŷ* = x*′β̂.
CI for the estimated mean E(Y*): ŷ* ± t_{n−(k+1), α/2} · s√(x*′(X′X)⁻¹x*).
PI for the future observation Y*: ŷ* ± t_{n−(k+1), α/2} · s√(1 + x*′(X′X)⁻¹x*).
Variable selection: partial F-test; partial correlation coefficient.
Application of the MLR Model
Linear regression is widely used in biology, chemistry, finance and the social sciences to describe possible relationships between variables. It ranks as one of the most important tools used in these disciplines.
Examples of application areas: chemistry, heredity, financial markets, biology, housing prices.
Example
Broadly speaking, an asset pricing model can be expressed as
R_i = a_i + b_{i1} f_1 + b_{i2} f_2 + … + b_{ik} f_k + ε_i,
where R_i, f_k and k denote the expected return on asset i, the kth risk factor and the number of risk factors, respectively, and ε_i denotes the specific return on asset i.
The equation can also be expressed in matrix notation as R = a + Bf + ε, where B is called the factor loading matrix.
What are the most important factors?
Interest rate, inflation rate, employment rate, rate of return on the market portfolio, government policies, GDP.
Method
Step 1: find the relevant factors (EM algorithm, maximum likelihood).
Step 2: fit the model and estimate the factor loadings (multiple linear regression).
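Step 2 is an ordinary multiple linear regression; a hedged SAS sketch with hypothetical dataset and factor names:

proc reg data=returns;             /* hypothetical dataset of one asset's returns and factor series */
model ret = mkt inflation gdp;     /* regress the asset's return on candidate risk factors */
run;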
By running the multiple linear regression on the data in SAS, we can obtain the factor loadings and the coefficient of multiple determination. From the SAS output we can identify the factors that most affect the return, build an appropriate multiple-factor model, and use it to predict future returns and make a good investment choice.
Questions
Thank you