Generalized Linear Models II


yoshiko-marsland
Uploaded On 2016-06-17

Presentation Transcript

Slide1

Generalized Linear Models II

Distributions, link functions, diagnostics (linearity, homoscedasticity, leverage)

Slide2

Dichotomous key: picking a distribution for your data

Slide3

Discrete or continuous?

Discrete. Possible values: 0/1 or 0,1,2,… etc.
    0/1: Binomial (logistic regression)
    0,1,2,…: Poisson or Binomial. Check for overdispersion:
        Resid. deviance ~= Resid. df (~n-p): Poisson ok
        Resid. deviance >> Resid. df (~n-p): compare fit w/ quasi-Poisson, quasi-binomial, or negative binomial, then check Resid. deviance = Resid. df (~n-p) again and compare s.dev. resids to normality

Continuous. Range of data:
    -∞ to +∞: Normal. Check residuals for normality
    >0 to +∞: Gamma or Inverse-Gaussian. Check s.dev. residuals for normality

If distributional checks fail, examine the data/residuals and try to determine source of deviance! Bimodality? Linearity? Fat tails? Excess zeros?
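The overdispersion branch of the key can be sketched in R roughly as follows. This is a minimal illustration on simulated counts, not data from the slides; the variable names (`x`, `y`, `fit_pois`, and so on) are assumptions made up for the example.

```r
# Simulated, overdispersed counts; every name here is an illustrative assumption.
set.seed(1)
n <- 200
x <- runif(n)
mu <- exp(0.5 + 1.2 * x)
y <- rnbinom(n, mu = mu, size = 1)   # variance = mu + mu^2, well above the Poisson variance

# Fit a Poisson GLM and compare residual deviance to residual df (n - p).
fit_pois <- glm(y ~ x, family = poisson)
disp_ratio <- deviance(fit_pois) / df.residual(fit_pois)
disp_ratio                           # values well above 1 suggest overdispersion

# Quasi-Poisson refit: same coefficients, standard errors scaled by the
# estimated dispersion.
fit_quasi <- glm(y ~ x, family = quasipoisson)
summary(fit_quasi)$dispersion

# Negative binomial alternative (MASS ships with standard R installations).
if (requireNamespace("MASS", quietly = TRUE)) {
  fit_nb <- MASS::glm.nb(y ~ x)
  print(deviance(fit_nb) / df.residual(fit_nb))
}
```

If the ratio for the refitted model is still far from 1, that points back to the checks listed above (bimodality, linearity, fat tails, excess zeros).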

Common distributions (but see next slide for others and additional details)

Slide4

Possible values:

Discrete
    0/1: Bernoulli (success/failure, logistic regression)
    0,1,2,… N (known): Binomial (# successes in fixed # trials); Multinomial (more than 2 categories, fixed # trials)
    0,1,2,… infinity: Geometric (# trials to 1st success); Poisson (# successes in large # trials); Negative Binomial (# trials to nth success, or over-dispersed Poisson)

Continuous
    -∞ to +∞: Normal
    >0 to +∞: Exponential (time to 1st success); Gamma (time to nth success); Inverse-Gaussian (first-passage time of Brownian motion with drift; despite the name, 1/x is not normal)
    0 to 1: Beta (fraction of total, proportions)

Check out Wikipedia pages for each distribution for more info!

Slide5

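As sample sizes get large, many distributions converge on the normal distribution (e.g. a binomial with many trials, a Poisson with a large mean, or a gamma with a large shape parameter)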
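One way to see this convergence numerically is to compare a Poisson CDF with a normal CDF that has the same mean and variance. The sketch below uses the Poisson only as a convenient example; the helper name `max_cdf_gap` is hypothetical.

```r
# Largest gap between the Poisson CDF and a normal CDF with matching mean and
# variance, using a continuity correction of 0.5.
max_cdf_gap <- function(lambda) {
  k <- 0:ceiling(lambda + 10 * sqrt(lambda))
  max(abs(ppois(k, lambda) - pnorm(k + 0.5, mean = lambda, sd = sqrt(lambda))))
}

lambdas <- c(1, 10, 100, 1000)
gaps <- sapply(lambdas, max_cdf_gap)
data.frame(lambda = lambdas, max_gap = gaps)   # the gap shrinks as lambda grows
```

For small means the normal approximation is visibly off, which is one reason to prefer a Poisson or negative binomial GLM for low counts.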

See, e.g.

http://en.wikipedia.org/wiki/Negative_binomial_distribution

http://en.wikipedia.org/wiki/Gamma_distribution

Slide6

Group exercise

Get a partner

Describe a real dataset to your partner

Partner picks a potentially appropriate distribution

Switch roles

Repeat!

Slide7

Link Functions

Enforce appropriate range for expected response (e.g. between 0 and 1 for 'probability of success', >0 for counts, etc.)

Linearize relationship between expected response and predictors

$G(E(y)) = b_0 + b_1 x_1 + b_2 x_2 + \dots$

Be careful to interpret coefficients properly given a link function!

$E(y) = G^{-1}(b_0 + b_1 x_1 + b_2 x_2 + \dots)$

E.g., writing $\eta = b_0 + b_1 x_1 + b_2 x_2 + \dots$ for the linear predictor:

Link     Constraint       Inverse
Log      E(y) > 0         $E(y) = e^{\eta}$
Logit    E(y) in (0,1)    $E(y) = e^{\eta} / (1 + e^{\eta})$
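The log and logit links and their inverses can be explored directly in R. The sketch below uses `make.link()` from the base `stats` package; the coefficient values `b0` and `b1` are made up purely for illustration.

```r
# make.link() returns a link function (linkfun) and its inverse (linkinv).
log_link   <- make.link("log")
logit_link <- make.link("logit")

eta <- c(-2, 0, 2)             # example values of the linear predictor

log_link$linkinv(eta)          # exp(eta): always > 0
logit_link$linkinv(eta)        # exp(eta) / (1 + exp(eta)): always in (0, 1)

# Applying the link to the inverse link recovers the linear predictor.
all.equal(logit_link$linkfun(logit_link$linkinv(eta)), eta)

# Interpreting a coefficient under a log link: a one-unit increase in x
# multiplies E(y) by exp(b1). b0 and b1 are invented values.
b0 <- 0.5
b1 <- 0.3
mean_at <- function(x) log_link$linkinv(b0 + b1 * x)
mean_at(1) / mean_at(0)        # equals exp(b1)
```

Under a logit link the analogous statement is about odds: a one-unit increase in x multiplies the odds of success by exp(b1).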

See Table 15.1 in GLM chapter for lots more!

Slide8

Canonical link functions

Slide9

Sample problems for count data

Binomial vs. Poisson
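As a rough illustration of how the two count distributions relate, the sketch below compares binomial probabilities with a fixed expected count against the matching Poisson probabilities as the number of trials grows. The numbers (`lambda = 2`, the trial counts) are arbitrary choices for the example.

```r
# Binomial(n, lambda / n) approaches Poisson(lambda) as n grows with the
# expected count lambda held fixed.
lambda <- 2
k <- 0:10
n_trials <- c(10, 100, 1000)

max_diff <- sapply(n_trials, function(n) {
  max(abs(dbinom(k, size = n, prob = lambda / n) - dpois(k, lambda)))
})
data.frame(n_trials = n_trials, max_diff = max_diff)
```

When the number of trials is known and small, or the success probability is not small, the binomial is usually the safer choice.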

http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/se202/poiss_bin.html

Slide10

Leverage (see diagnostic plots & websites on next slide)
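Leverage can be inspected in R with `hatvalues()` and the residuals-vs-leverage diagnostic plot. The sketch below is a minimal example on simulated data with one deliberately extreme predictor value; the data and the 2p/n cutoff (a common rule of thumb, not a rule from the slides) are assumptions.

```r
# Simulated counts; the last observation sits far from the rest in x.
set.seed(2)
n <- 50
x <- c(runif(n - 1, 0, 1), 5)
y <- rpois(n, lambda = exp(0.2 + 0.5 * x))
fit <- glm(y ~ x, family = poisson)

h <- hatvalues(fit)                 # leverage of each observation
p <- length(coef(fit))
which(h > 2 * p / n)                # flag points above the 2p/n rule of thumb
cooks.distance(fit)[which.max(h)]   # influence of the highest-leverage point

plot(fit, which = 5)                # residuals vs leverage
```

A high-leverage point is not automatically a problem; it matters when it also has a large residual, which is what Cook's distance summarizes.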

Xxx et al 2006, PLoS Biology

Slide11

R: example GLM with data

# read in data
bd = read.csv("c:/marm/teaching/293qe/bat_lambda.csv")
str(bd)
head(bd)

# What not to do - run models blindly!
b1 = glm(Lambda ~ PreWNS_Pop, family = Gamma, data = bd)
summary(b1)

# What to do - plot data
plot(Lambda ~ PreWNS_Pop, data = bd)

# What does it suggest would be a good idea?
bd$Lpop = log(bd$PreWNS_Pop)
plot(Lambda ~ Lpop, data = bd)

b1 = glm(Lambda ~ Lpop, family = Gamma, data = bd)
summary(b1)
b2 = glm(Lambda ~ Lpop + Species, family = Gamma, data = bd)
summary(b2)
b3 = glm(Lambda ~ Lpop * Species, family = Gamma, data = bd)
summary(b3)

anova(b1, b2, b3, test = "Chisq")
AIC(b1, b2, b3)
plot(b3)
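Because the CSV file is not part of this transcript, the following sketch reruns the simplest model on simulated data. Only the column names `Lambda`, `PreWNS_Pop`, and `Lpop` follow the slide; the data frame `bd_sim`, the simulated values, and the coefficients used to generate them are invented for illustration.

```r
# Simulated stand-in for bat_lambda.csv; all values are made up.
set.seed(3)
n <- 120
bd_sim <- data.frame(PreWNS_Pop = round(exp(runif(n, log(10), log(10000)))))
bd_sim$Lpop <- log(bd_sim$PreWNS_Pop)
mu <- 1 / (0.8 + 0.1 * bd_sim$Lpop)            # inverse link: 1/E(y) is linear in Lpop
bd_sim$Lambda <- rgamma(n, shape = 20, rate = 20 / mu)

m1 <- glm(Lambda ~ Lpop, family = Gamma, data = bd_sim)
family(m1)$link                                # Gamma defaults to the "inverse" link

# Coefficients are on the 1/E(y) scale, so a positive slope means that
# Lambda decreases as Lpop increases.
coef(m1)

# Back-transform to the response scale with type = "response".
new <- data.frame(Lpop = log(c(10, 1000)))
pred <- predict(m1, newdata = new, type = "response")
pred

# Distributional check: standardized deviance residuals vs normal quantiles.
r <- rstandard(m1)                             # deviance residuals by default for glm
qqnorm(r)
qqline(r)
```

Passing `family = Gamma(link = "log")` instead would put the coefficients on the log scale, where they read as multiplicative effects on the mean.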

http://stats.stackexchange.com/questions/52089/what-does-having-constant-variance-in-a-linear-regression-model-mean

http://stats.stackexchange.com/questions/58141/interpreting-plot-lm