
Multiple Regression - PowerPoint Presentation

Uploaded On 2016-05-04


Analysis of Biological Data. Ryan McEwan and Julia Chapman, Department of Biology, University of Dayton (ryan.mcewan@udayton.edu). Simple linear regression is a way of understanding the relationship between two variables.



Presentation Transcript

Slide1

Multiple Regression

Analysis of Biological Data

Ryan McEwan and Julia Chapman

Department of Biology

University of Dayton

ryan.mcewan@udayton.edu

Slide2

Simple linear regression is a way of understanding the relationship between two variables where the data analyst assumes that one variable (predictor; independent variable) drives a second variable (response; dependent variable).

This is extremely useful, and yet in most biological situations any given response variable is likely to be determined by more than just a single predictor.

In this case, wing length is related to age, but you can imagine that nutritional status or gender could be important as well.

Slide3

Here is aboveground biomass (AGB; Y axis) in a forest and stem density in that forest. You see a relationship, but a messy one. Maybe adding other variables would help explain AGB.

How about soil nitrogen? How about species diversity? How about mean temperature at each point? Etc.

Slide4

Slide5

In biology you may be collecting a slew of values that might serve as predictors for a potential response.

Slide6

Consider a correlation matrix!!

Slide7
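A correlation matrix of the kind suggested here can be computed in a few lines. The sketch below is a minimal illustration in plain Python. The variable names (herb_cover, soil_n, canopy_open) and all of the numbers are made up for this example and do not come from the slides.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(data):
    """Correlate every variable with every other one (nested dict of r values)."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Hypothetical plot-level measurements (illustrative numbers only)
plots = {
    "herb_cover": [12.0, 18.0, 25.0, 31.0, 40.0, 44.0],
    "soil_n": [1.1, 1.6, 2.0, 2.7, 3.1, 3.8],
    "canopy_open": [5.0, 9.0, 8.0, 15.0, 14.0, 20.0],
}

matrix = correlation_matrix(plots)
for name, row in matrix.items():
    print(name, [round(row[other], 2) for other in plots])
```

Each row shows how strongly one variable tracks the others. Values near 1 or -1 between two candidate predictors are a warning that they carry much the same information.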
Slide8

Herbaceous cover = [predictor image]

Slide9

Herbaceous cover = [predictor image] + [predictor image]

You are building a model!!

Slide10

Herbaceous cover = [predictor images joined by +]

Slide11

Multiple regression is a process of figuring out statistically what suite of variables best predicts a particular response…
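To make "building a model" concrete, here is a minimal sketch of fitting herbaceous cover = b0 + b1*(predictor 1) + b2*(predictor 2) by ordinary least squares in plain Python. The predictors on the slides are shown only as images, so the names soil_n and canopy_open are hypothetical, and the numbers are constructed so that cover = 2 + 3*soil_n + 0.5*canopy_open exactly. The fit should therefore recover coefficients close to 2, 3 and 0.5.

```python
def fit_ols(columns, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ...; returns [b0, b1, b2, ...]."""
    n = len(y)
    # Design matrix: a leading 1.0 in each row stands for the intercept
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations (X'X | X'y)
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]

# Hypothetical data, built so that herb_cover = 2 + 3*soil_n + 0.5*canopy_open
soil_n = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
canopy_open = [5.0, 3.0, 8.0, 6.0, 9.0, 7.0]
herb_cover = [7.5, 9.5, 15.0, 17.0, 21.5, 23.5]

coefficients = fit_ols([soil_n, canopy_open], herb_cover)
print([round(b, 3) for b in coefficients])
```

Real field data will not fit exactly, which is why the question of which predictors to keep matters.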

…okay, how do you proceed?

Slide12

Forward selection:

(1) Select the variable that forms the best regression relationship with the response variable.

(2) Add each of the remaining variables in the pool, in a stepwise fashion, to find the best relationship, throwing the weaker ones back into the pool.

(3) Repeat step 2 until adding variables no longer makes a stronger relationship.
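The forward-selection steps can be sketched in plain Python as follows. The slides do not say which statistic defines a "stronger relationship", so this sketch assumes AIC (lower is better), computed for a least-squares fit as n*ln(RSS/n) + 2k up to an additive constant. The function names, variable names and data are all hypothetical.

```python
from math import log

def residual_ss(columns, y):
    """Residual sum of squares of the least-squares fit y = b0 + b1*x1 + ..."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def aic(columns, y):
    """AIC up to a constant; k counts slopes, the intercept and the error variance."""
    n = len(y)
    return n * log(residual_ss(columns, y) / n) + 2 * (len(columns) + 2)

def forward_select(pool, y):
    selected = []
    best = aic([], y)  # start from the intercept-only model
    while True:
        remaining = [v for v in pool if v not in selected]
        if not remaining:
            break
        # Try adding each remaining variable in turn
        scores = {v: aic([pool[s] for s in selected + [v]], y) for v in remaining}
        candidate = min(scores, key=scores.get)
        if scores[candidate] >= best:  # no addition improves the model: stop
            break
        selected.append(candidate)
        best = scores[candidate]
    return selected, best

# Hypothetical data: cover depends on soil_n and canopy_open, not on litter
pool = {
    "soil_n": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "canopy_open": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "litter": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
herb_cover = [12.3, 13.8, 21.1, 21.6, 30.2, 37.9, 35.4, 42.7]

selected, final_aic = forward_select(pool, herb_cover)
print(selected, round(final_aic, 2))
```

With so few observations an unrelated variable can still slip in by chance, so the printed selection is worth inspecting rather than trusting blindly.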

Slide13

Backward selection:

(1) Start with all variables in the model.

(2) Eject each one in turn and test the relationship.

(3) Throw back into the pool the variable(s) that weaken, or fail to strengthen, the relationship.
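A comparable sketch for backward selection is given below. As before, AIC (lower is better) is an assumed stand-in for "strength of the relationship", and the function names, variable names and data are hypothetical. A variable is ejected when the model without it is at least as good as the model with it.

```python
from math import log

def residual_ss(columns, y):
    """Residual sum of squares of the least-squares fit y = b0 + b1*x1 + ..."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def aic(columns, y):
    """AIC up to a constant; k counts slopes, the intercept and the error variance."""
    n = len(y)
    return n * log(residual_ss(columns, y) / n) + 2 * (len(columns) + 2)

def backward_eliminate(pool, y):
    kept = list(pool)  # start with every variable in the model
    best = aic([pool[v] for v in kept], y)
    while kept:
        # Eject each variable in turn and score the reduced model
        scores = {v: aic([pool[k] for k in kept if k != v], y) for v in kept}
        candidate = min(scores, key=scores.get)
        if scores[candidate] > best:  # every ejection makes the model worse: stop
            break
        kept.remove(candidate)
        best = scores[candidate]
    return kept, best

# Hypothetical data: cover depends on soil_n and canopy_open, not on litter
pool = {
    "soil_n": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "canopy_open": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "litter": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
herb_cover = [12.3, 13.8, 21.1, 21.6, 30.2, 37.9, 35.4, 42.7]

kept, final_aic = backward_eliminate(pool, herb_cover)
print(kept, round(final_aic, 2))
```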

Slide14

Slide15

A few more things to cover:

How to evaluate models?

What about correlated variables?

What about categorical variables?

Slide16

How to evaluate models?

[Figure: two candidate models for herbaceous cover, one with more predictor images than the other]

Slide17

How to evaluate models?

P-value

R²

Akaike Information Criterion (AIC)

Slide18

AIC is a way of comparing the information content of different models. It does not provide a statistical test, per se, but rather provides a quantitative way to assess model fit vs. model complexity. The best model is the one with the lowest AIC.

Slide19
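As a rough illustration of the evaluation measures named above, the sketch below computes R², adjusted R² and AIC for two candidate models in plain Python. It assumes a least-squares fit with normal errors, for which AIC can be written as n*ln(RSS/n) + 2k up to an additive constant. P-values are left out because they need an F or t distribution, which is not in the standard library. All names and numbers are hypothetical.

```python
from math import log

def residual_ss(columns, y):
    """Residual sum of squares of the least-squares fit y = b0 + b1*x1 + ..."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def evaluate(columns, y):
    """Return R^2, adjusted R^2 and AIC (up to a constant) for one model."""
    n, k = len(y), len(columns)
    rss = residual_ss(columns, y)
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)  # total sum of squares
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    # k slopes + intercept + error variance = k + 2 estimated parameters
    aic = n * log(rss / n) + 2 * (k + 2)
    return {"r2": r2, "adj_r2": adj_r2, "aic": aic}

# Hypothetical data (illustrative numbers only)
soil_n = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
canopy_open = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
herb_cover = [12.3, 13.8, 21.1, 21.6, 30.2, 37.9, 35.4, 42.7]

small = evaluate([soil_n], herb_cover)
large = evaluate([soil_n, canopy_open], herb_cover)
for label, result in (("soil_n only", small), ("soil_n + canopy_open", large)):
    print(label, {name: round(value, 3) for name, value in result.items()})
```

R² can only go up when a predictor is added, so on its own it always favors the bigger model. AIC charges 2 units per extra parameter, which is what lets it weigh fit against complexity.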

Slide20

Slide21

What about correlated variables?

Strongly correlated variables effectively contain the same information, and thus should not be inserted into the same model.

The data analyst needs to assess “multicollinearity” among the variables in the model. One simple way to think about it is a correlation matrix. Formally, a model-building procedure generally includes calculating “Variance Inflation Factors” and ejecting from the model one of two variables that are highly correlated.

Slide22
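A variance inflation factor (VIF) for one predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The sketch below is a minimal plain-Python illustration. The predictor names and numbers are hypothetical, with soil_c deliberately built to track soil_n almost perfectly. Cutoffs such as 5 or 10 are commonly cited rules of thumb and are not stated in the slides.

```python
def residual_ss(columns, y):
    """Residual sum of squares of the least-squares fit y = b0 + b1*x1 + ..."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def variance_inflation_factors(predictors):
    """VIF for each predictor: regress it on all the others."""
    vifs = {}
    for name, target in predictors.items():
        others = [col for other, col in predictors.items() if other != name]
        rss = residual_ss(others, target)
        mean_t = sum(target) / len(target)
        tss = sum((v - mean_t) ** 2 for v in target)
        vifs[name] = tss / rss  # algebraically the same as 1 / (1 - R^2)
    return vifs

# Hypothetical predictors: soil_c is close to 2 * soil_n (nearly redundant)
predictors = {
    "soil_n": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "soil_c": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1, 14.2, 15.8],
    "canopy_open": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}

vifs = variance_inflation_factors(predictors)
for name, value in vifs.items():
    print(name, round(value, 1))
```

The two soil variables should come out with very large VIFs, which is the signal to eject one of them from the model.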

What about categorical variables?

Multiple regression models CAN incorporate yes/no variables or even categorical variables as predictors. (A yes/no response variable calls for logistic regression.)

Burned vs. Unburned

H vs. M vs. N invaded

Slide23

Burned vs. Unburned

H vs. M vs. N invaded
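One common way to feed such variables into a regression is indicator ("dummy") coding: each level except a reference level becomes a 0/1 column. The sketch below is a minimal plain-Python illustration. The function name, the choice of "unburned" and "N" as reference levels, and the sample values are assumptions made for this example, and the slides do not spell out what H, M and N stand for.

```python
def dummy_code(values, reference):
    """Turn a categorical variable into 0/1 columns, one per non-reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1.0 if v == level else 0.0 for v in values] for level in levels}

# Hypothetical observations for four plots
burned = ["burned", "unburned", "burned", "unburned"]
invaded = ["H", "M", "N", "H"]

burn_cols = dummy_code(burned, reference="unburned")  # one yes/no column
inv_cols = dummy_code(invaded, reference="N")         # two columns for three levels

print(burn_cols)
print(inv_cols)
```

The resulting columns can be used like any other numeric predictor. Each fitted coefficient is then read as the difference from the reference level.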