Multiple and non-linear regression - PowerPoint Presentation

inventco . @inventco
Uploaded On 2020-06-15
Presentation Transcript

Slide1

Multiple and non-linear regression

Slide2

What is what?

Regression: One variable is considered dependent on the other(s)
Correlation: No variables are considered dependent on the other(s)
Multiple regression: More than one independent variable
Linear regression: The dependent variable is scalar and linearly dependent on the independent variable(s)
Logistic regression: The dependent variable is categorical (ideally with only two levels) and follows an s-shaped relation to the independent variable(s)

2

Slide3

Remember the simple linear regression?

If Y is linearly dependent on X, simple linear regression is used: Y = α + βX + ε
α is the intercept, the value of Y when X = 0
β is the slope, the rate at which Y increases when X increases

3
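As a sketch of the slide's model, the intercept and slope can be estimated by least squares with NumPy; the data below are made up for illustration (here Y is exactly linear in X, so the fit recovers the intercept 2.0 and slope 0.5):

```python
import numpy as np

# Hypothetical data: Y depends exactly linearly on X (intercept 2.0, slope 0.5).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = 2.0 + 0.5 * X

# polyfit with degree 1 returns [slope, intercept] of the least-squares line.
b, a = np.polyfit(X, Y, 1)
print(a, b)  # intercept close to 2.0, slope close to 0.5
```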

Slide4

Is the relation linear?

4

Slide5

Multiple linear regression

If Y is linearly dependent on more than one independent variable: Y = α + β1X1 + β2X2 + ε
α is the intercept, the value of Y when X1 and X2 = 0
β1 and β2 are termed partial regression coefficients
β1 expresses the change of Y for one unit of X1 when X2 is kept constant

5

Slide6

Multiple linear regression – residual error and estimations

As the collected data are not expected to fall exactly in a plane, an error term must be added: Y = α + β1X1 + β2X2 + ε
The error terms sum to zero.
Estimating the dependent factor and the population parameters: Ŷ = a + b1X1 + b2X2

6

Slide7

Multiple linear regression – general equations

In general, any finite number (m) of independent variables may be used to estimate the hyperplane: Y = α + β1X1 + β2X2 + … + βmXm + ε
The number of sample points must be at least two more than the number of variables

7

Slide8

Multiple linear regression – least sum of squares

The principle of the least sum of squares is usually used to perform the fit: minimize Σ(Yi − Ŷi)²

8
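The least-squares fit with two independent variables can be sketched with NumPy's `lstsq`; the data below are hypothetical, constructed so that the true coefficients (α = 1.0, β1 = 2.0, β2 = −0.5) are recovered exactly:

```python
import numpy as np

# Hypothetical data for two independent variables and the response.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = 1.0 + 2.0 * X1 - 0.5 * X2  # alpha = 1.0, beta1 = 2.0, beta2 = -0.5

# Design matrix with a leading column of ones for the intercept.
A = np.column_stack([np.ones_like(X1), X1, X2])

# lstsq minimises sum((Y - A @ coeffs)**2), the least sum of squares.
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)
a, b1, b2 = coeffs
```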

Slide9

Multiple linear regression – An example

9

Slide10

Multiple linear regression – The fitted equation

10

Slide11

Multiple linear regression – Are any of the coefficients significant?

F = regression MS / residual MS

11

Slide12

Multiple linear regression – Is it a good fit?

R2 = regression SS / total SS = 1 − residual SS / total SS
It is an expression of how much of the variation can be described by the model
When comparing models with different numbers of variables, the adjusted R-square should be used: Ra2 = 1 − residual MS / total MS
The multiple regression coefficient: R = sqrt(R2)
The standard error of the estimate = sqrt(residual MS)

12
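The fit statistics above follow directly from the sums of squares; a minimal sketch with made-up values (the sums of squares below are hypothetical, not from the slide's example):

```python
import math

n, m = 20, 2          # hypothetical sample size and number of independent variables
total_SS = 100.0      # hypothetical total sum of squares
residual_SS = 25.0    # hypothetical residual sum of squares

R2 = 1 - residual_SS / total_SS           # coefficient of determination
residual_MS = residual_SS / (n - m - 1)   # residual mean square (residual DF = n - m - 1)
total_MS = total_SS / (n - 1)             # total mean square
Ra2 = 1 - residual_MS / total_MS          # adjusted R-square
R = math.sqrt(R2)                         # multiple regression coefficient
SE = math.sqrt(residual_MS)               # standard error of the estimate
```

The adjusted R-square is always a little below R2 because it penalizes extra variables via the degrees of freedom.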

Slide13

Multiple linear regression – Which of the coefficients are significant?

s_bi is the standard error of the regression parameter bi
A t-test tests if bi is different from 0: t = bi / s_bi
The degrees of freedom is the residual DF
p values can be found in a table

13
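The test statistic itself is a one-line computation; the coefficient and standard error below are hypothetical values chosen only to illustrate the ratio:

```python
# t-test for one regression parameter: t = b_i / s_bi,
# compared against a t distribution with the residual DF.
b_i = 0.109   # hypothetical estimated coefficient
s_bi = 0.010  # hypothetical standard error of the coefficient

t = b_i / s_bi  # a large |t| suggests b_i differs from 0
```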

Slide14

Multiple linear regression – Which of the coefficients are most important?

The standardized regression coefficient, b’, is a normalized version of b

14
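One common definition (the slide does not state the formula) rescales b by the sample standard deviations of X and Y, so coefficients measured in different units become comparable; the data below are hypothetical:

```python
import numpy as np

# Hypothetical data where Y is exactly linear in X.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.5, 3.0, 3.5, 4.0, 4.5])

b = np.polyfit(X, Y, 1)[0]                  # raw slope
b_std = b * X.std(ddof=1) / Y.std(ddof=1)   # standardized coefficient b'
```

For a perfectly linear relation, b' equals 1 regardless of the units of X and Y.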

Slide15

Multiple linear regression - multicollinearity

If two factors are well correlated, the estimated b’s become inaccurate. This is also called collinearity, intercorrelation, nonorthogonality, or ill-conditioning.
Tolerance or variance inflation factors can be computed.
Extreme correlation is called singularity, and one of the correlated variables must be removed.

15
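A variance inflation factor can be sketched as follows: for a predictor, VIF = 1 / (1 − R²), where R² comes from regressing that predictor on the others; with only two predictors this reduces to the squared pairwise correlation. The data below are hypothetical, chosen to be strongly correlated:

```python
import numpy as np

# Two hypothetical, strongly correlated predictors.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([1.1, 2.3, 2.8, 4.2, 4.9, 6.1])

r = np.corrcoef(X1, X2)[0, 1]   # with two predictors, R^2 is just r^2
vif = 1.0 / (1.0 - r ** 2)      # large VIF flags multicollinearity
```

A rule of thumb often cited is that VIF values above about 10 indicate problematic collinearity.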

Slide16

Multiple linear regression – Pairwise correlation coefficients

16

Slide17

Multiple linear regression – Assumptions

The same as for simple linear regression:
The Y’s are randomly sampled
The residuals are normally distributed
The residuals have equal variance
The X’s are fixed factors (their errors are small)
The X’s are not perfectly correlated

17

Slide18

Logistic regression

18

Slide19

Logistic Regression

What if the dependent variable is categorical, and especially binary?
Use some interpolation method?
Linear regression cannot help us.

19

Slide20

20

The sigmoidal curve

Slide21

21

The sigmoidal curve

The intercept basically just ‘scales’ the input variable

Slide22

22

The sigmoidal curve

The intercept basically just ‘scales’ the input variable
A large regression coefficient means the risk factor strongly influences the probability

Slide23

23

The sigmoidal curve

The intercept basically just ‘scales’ the input variable
A large regression coefficient means the risk factor strongly influences the probability
A positive regression coefficient means the risk factor increases the probability
Logistic regression uses maximum likelihood estimation, not least squares estimation
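The points above can be illustrated with the logistic curve p = 1 / (1 + e^−(a + b·x)); the intercept a and coefficient b below are hypothetical values chosen to show the effect of a larger, positive b:

```python
import math

def logistic_p(x, a, b):
    """Probability from a logistic model with intercept a and coefficient b."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# A larger positive b makes the curve steeper and pushes p further from 0.5
# for the same x, i.e. the risk factor influences the probability more strongly.
p_small_b = logistic_p(1.0, 0.0, 0.5)
p_large_b = logistic_p(1.0, 0.0, 2.0)
```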

Slide24

Does age influence the diagnosis? Continuous independent variable

24

Variables in the Equation

                       B      S.E.     Wald   df   Sig.   Exp(B)   95% C.I. for Exp(B)
                                                                   Lower     Upper
Step 1a   Age        0.109   0.010  108.745    1  0.000    1.115    1.092     1.138
          Constant  -4.213   0.423   99.097    1  0.000    0.015

a. Variable(s) entered on step 1: Age.

Slide25

Does previous intake of OCP influence the diagnosis? Categorical independent variable

Variables in the Equation

                       B      S.E.     Wald   df   Sig.   Exp(B)   95% C.I. for Exp(B)
                                                                   Lower     Upper
Step 1a   OCP(1)    -0.311   0.180    2.979    1  0.084    0.733    0.515     1.043
          Constant   0.233   0.123    3.583    1  0.058    1.263

a. Variable(s) entered on step 1: OCP.

25

Slide26

Odds ratio

26
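In logistic regression the Exp(B) column is the odds ratio for that predictor: the factor by which the odds change per unit increase. Using the OCP coefficient from the table on slide 25 (B = −0.311):

```python
import math

# Odds ratio = Exp(B); B is the fitted OCP coefficient from the SPSS table.
B = -0.311
odds_ratio = math.exp(B)
print(round(odds_ratio, 3))  # ~0.733, matching Exp(B) in the output
```

An odds ratio below 1 means the factor is associated with decreased odds of the outcome.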

Slide27

Multiple logistic regression

Variables in the Equation

                       B      S.E.     Wald   df   Sig.   Exp(B)   95% C.I. for Exp(B)
                                                                   Lower     Upper
Step 1a   Age        0.123   0.011  115.343    1  0.000    1.131    1.106     1.157
          BMI        0.083   0.019   18.732    1  0.000    1.087    1.046     1.128
          OCP        0.528   0.219    5.808    1  0.016    1.695    1.104     2.603
          Constant  -6.974   0.762   83.777    1  0.000    0.001

a. Variable(s) entered on step 1: Age, BMI, OCP.

27

Slide28

Predicting the diagnosis by logistic regression

What is the probability that the tumor of a 50 year old woman who has been using OCP and has a BMI of 26 is malignant?

z = -6.974 + 0.123*50 + 0.083*26 + 0.528*1 = 1.862

p = 1/(1 + e^-1.862) = 0.866

28
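The prediction is just the linear predictor pushed through the logistic function, using the fitted coefficients from the table on slide 27 (Age 50, BMI 26, OCP user, so OCP = 1):

```python
import math

# Linear predictor z from the fitted multiple logistic model.
z = -6.974 + 0.123 * 50 + 0.083 * 26 + 0.528 * 1

# Logistic transform gives the predicted probability of malignancy.
p = 1.0 / (1.0 + math.exp(-z))
```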

(The coefficient table from slide 27 is repeated on this slide.)

Slide29

Exercises

20.1, 20.2

29