Slide1
Generalized Linear Models II
Distributions, link functions, diagnostics (linearity, homoscedasticity, leverage)
Slide2
Dichotomous key: picking a distribution for your data
Slide3
Common distributions (but see next slide for others and additional details)
Discrete or continuous?
  Discrete. Possible values: 0/1 or 0,1,2,… etc.
    0/1: Binomial (logistic regression)
    0,1,2,…: Poisson or Binomial. Check for overdispersion:
      Resid. deviance ~= Resid. df (~n-p): Poisson ok
      Resid. deviance >> Resid. df (~n-p): Compare fit w/ quasi-Poisson, quasi-binomial, or negative binomial
        Check Resid. deviance ~= Resid. df (~n-p) again and compare s.dev. resids to normality
  Continuous. Range of data:
    -∞ to +∞: Normal. Check residuals for normality
    >0 to +∞: Gamma or Inverse-Gaussian. Check s.dev. residuals for normality
If distributional checks fail, examine the data/residuals and try to determine source of deviance! Bimodality? Linearity? Fat tails? Excess zeros?
Slide4
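The overdispersion branch of the key can be sketched in R as follows. This is a minimal illustration on simulated data, not part of the original slides: the variable names, the simulation settings, and the choice of `MASS::glm.nb` for the negative binomial fit are all assumptions made for the example.

```r
# Hypothetical simulated counts that are overdispersed relative to Poisson
library(MASS)  # provides glm.nb(); MASS is a recommended package shipped with R

set.seed(1)
n  <- 200
x  <- runif(n)
mu <- exp(0.5 + 0.8 * x)
y  <- rnbinom(n, size = 1, mu = mu)   # variance = mu + mu^2, larger than the mean
d  <- data.frame(x = x, y = y)

m_pois <- glm(y ~ x, family = poisson, data = d)

# Rule of thumb from the key: residual deviance should be close to residual df
ratio <- deviance(m_pois) / df.residual(m_pois)
ratio

# If the ratio is well above 1, compare alternative fits
m_quasi <- glm(y ~ x, family = quasipoisson, data = d)
m_nb    <- glm.nb(y ~ x, data = d)

summary(m_quasi)$dispersion          # estimated dispersion parameter
AIC(m_pois, m_nb)                    # quasi-likelihood fits have no AIC
deviance(m_nb) / df.residual(m_nb)   # re-check the ratio for the new fit
```

A ratio near 1 is consistent with "Poisson ok"; a ratio well above 1 points toward the quasi-Poisson or negative binomial comparison.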
Possible values:
  Discrete
    0/1: Bernoulli (success/failure, logistic regression)
    0,1,2,… N (known): Binomial (# successes in fixed # trials); Multinomial (more than 2 categories, fixed # trials)
    0,1,2,… infinity: Geometric (# trials to 1st success); Poisson (# successes in large # trials); Negative Binomial (# trials to nth success, or over-dispersed Poisson)
  Continuous
    -∞ to +∞: Normal
    >0 to +∞: Exponential (time to 1st success); Gamma (time to nth success); Inverse-Gaussian (time for Brownian motion with drift to first reach a fixed level; despite the name, 1/x is not normally distributed)
    0 to 1: Beta (fraction of total, proportions)
Check out Wikipedia pages for each distribution for more info!
Slide5
As sample sizes get large, many distributions converge on the normal distribution
See, e.g.
http://en.wikipedia.org/wiki/Negative_binomial_distribution
http://en.wikipedia.org/wiki/Gamma_distribution
Slide6
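The convergence toward the normal distribution can be illustrated with a short simulation. This sketch uses the Poisson distribution as one example; the `skew` helper and the chosen means are assumptions made for the illustration.

```r
set.seed(2)

# Sample skewness; a normal distribution has skewness 0
skew <- function(z) mean((z - mean(z))^3) / sd(z)^3

# Poisson skewness is 1/sqrt(lambda), so it shrinks as the mean grows
small <- rpois(1e5, lambda = 2)
large <- rpois(1e5, lambda = 200)
c(small = skew(small), large = skew(large))

# Compare Poisson quantiles with a normal of matching mean and variance
p <- c(0.025, 0.5, 0.975)
rbind(poisson = qpois(p, lambda = 200),
      normal  = qnorm(p, mean = 200, sd = sqrt(200)))
```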
Group exercise
Get a partner
Describe a real dataset to your partner
Partner picks a potentially appropriate distribution
Switch roles
Repeat!
Slide7
Link Functions
Enforce appropriate range for expected response (e.g. (0,1) for ‘probability of success’, >0 for counts, etc.)
Linearize relationship between expected response and predictors:
  G(E(y)) = b0 + b1x1 + b2x2 + etc.
Be careful to interpret coefficients properly given a link function!
  E(y) = G^-1(b0 + b1x1 + b2x2 + etc.)
E.g.
  Link    Constraint       Inverse
  Log     E(y) > 0         E(y) = exp(b0 + b1x1 + etc.)
  Logit   E(y) in (0,1)    E(y) = 1 / (1 + exp(-(b0 + b1x1 + etc.)))
See Table 15.1 in GLM chapter for lots more!
Slide8
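The link and inverse-link relationship can be checked directly in R with `make.link()`, which returns both functions. The sketch below is illustrative only: the linear-predictor values and the simulated Poisson data are assumptions, chosen to show how coefficients under a log link are read on the multiplicative scale.

```r
# make.link() returns the link (linkfun) and its inverse (linkinv)
logit_link <- make.link("logit")
log_link   <- make.link("log")

eta <- c(-2, 0, 2)                 # hypothetical values of b0 + b1*x1 + ...
p_hat  <- logit_link$linkinv(eta)  # inverse logit: always inside (0, 1)
mu_hat <- log_link$linkinv(eta)    # inverse log: always > 0
logit_link$linkfun(p_hat)          # applying the link again recovers eta

# Coefficient interpretation under a log link (simulated data)
set.seed(3)
x <- runif(500)
y <- rpois(500, lambda = exp(1 + 0.7 * x))
m <- glm(y ~ x, family = poisson(link = "log"))
coef(m)        # additive change in log E(y) per unit change in x
exp(coef(m))   # multiplicative change in E(y) per unit change in x
```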
Canonical link functions
Slide9
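For the common families, R's default link is the canonical one, so the canonical links can be listed from the family objects themselves. A small sketch (the `fams` list is just a convenience name for this example):

```r
# Family objects for common GLM distributions, built with their default links
fams <- list(gaussian         = gaussian(),
             binomial         = binomial(),
             poisson          = poisson(),
             Gamma            = Gamma(),
             inverse.gaussian = inverse.gaussian())

# Extract the name of each default (canonical) link
links <- sapply(fams, function(f) f$link)
links
```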
Sample problems for count data
Binomial vs. Poisson
http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/se202/poiss_bin.html
Slide10
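One way to see when the binomial and Poisson distributions are interchangeable is to compare their probability mass functions directly. The sketch below uses illustrative values of n and p that are assumptions, not taken from the linked page.

```r
k <- 0:10

# Large n, small p: Binomial(n, p) is close to Poisson(n * p)
diff_rare <- max(abs(dbinom(k, size = 1000, prob = 0.003) - dpois(k, lambda = 3)))

# Small n, large p: same mean (3), but the Poisson approximation is poor
diff_common <- max(abs(dbinom(k, size = 10, prob = 0.3) - dpois(k, lambda = 3)))

c(rare_events = diff_rare, common_events = diff_common)
```

When the number of trials is known and successes are not rare, the binomial is usually the safer choice; when counts have no fixed upper bound, the Poisson is the natural starting point.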
Leverage (see diagnostic plots & websites on next slide)
Xxx et al 2006 PLoS Biology
Slide11
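Leverage can be inspected numerically as well as in diagnostic plots. The following is a minimal sketch on simulated data, where one observation is deliberately placed far from the others in x; the data and the 2p/n cutoff (a common rule of thumb, not something stated in the slides) are assumptions.

```r
set.seed(4)
x <- c(runif(30, min = 0, max = 10), 30)   # last point is far from the other x values
y <- rpois(31, lambda = exp(0.5 + 0.1 * x))
m <- glm(y ~ x, family = poisson)

h <- hatvalues(m)                  # leverage of each observation
which.max(h)                       # observation with the highest leverage
cooks.distance(m)[which.max(h)]    # influence combines leverage and residual size

# Rule of thumb: flag leverage above 2p/n, where p is the number of coefficients
p <- length(coef(m))
n <- length(y)
which(h > 2 * p / n)
```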
R: example GLM with data
# read in data
bd = read.csv("c:/marm/teaching/293qe/bat_lambda.csv")
str(bd); head(bd)
# What not to do - run models blindly!
b1 = glm(Lambda ~ PreWNS_Pop, family = Gamma, data = bd); summary(b1)
# What to do - plot data
plot(Lambda ~ PreWNS_Pop, data = bd)
# What does it suggest would be a good idea?
bd$Lpop = log(bd$PreWNS_Pop)
plot(Lambda ~ Lpop, data = bd)
b1 = glm(Lambda ~ Lpop, family = Gamma, data = bd); summary(b1)
b2 = glm(Lambda ~ Lpop + Species, family = Gamma, data = bd); summary(b2)
b3 = glm(Lambda ~ Lpop * Species, family = Gamma, data = bd); summary(b3)
anova(b1, b2, b3, test = "Chisq")
AIC(b1, b2, b3)
plot(b3)
http://stats.stackexchange.com/questions/52089/what-does-having-constant-variance-in-a-linear-regression-model-mean
http://stats.stackexchange.com/questions/58141/interpreting-plot-lm
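The residual checks that `plot()` draws for a fitted GLM can also be built by hand. Because `bat_lambda.csv` is not available here, this sketch uses simulated Gamma data; the simulation settings and the log link are assumptions for illustration (the models above use the Gamma family's default inverse link).

```r
set.seed(5)
n  <- 150
x  <- runif(n, min = 1, max = 5)
mu <- exp(0.2 + 0.3 * x)
y  <- rgamma(n, shape = 5, rate = 5 / mu)   # Gamma response with mean mu
m  <- glm(y ~ x, family = Gamma(link = "log"))

# Standardized deviance residuals
r <- rstandard(m, type = "deviance")

# Normal Q-Q plot: points near the line suggest the distribution is adequate
qqnorm(r); qqline(r)

# Residuals vs fitted: look for trends (linearity) and funnels (non-constant variance)
plot(fitted(m), r, xlab = "Fitted values", ylab = "Std. deviance residuals")
abline(h = 0, lty = 2)
```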