Slide1
Two Distribution Families
for Modelling Over- and Underdispersed
Binomial Frequencies
Feirer V., Hirn U., Friedl H., Bauer W.
Institute for Paper, Pulp and Fiber Technology
& Institute for Statistics
Graz University of Technology
Slide2
Agenda
Motivation
Generalized Linear Models
Multiplicative Binomial Distribution
Double Binomial Distribution
Application of the Two Distributions
Summary
Slide3
Motivation
consider the problem of successful ink transfer on paper
explain occurrence of unprinted regions
…part of a larger, industry-funded project at the IPZ.
(No. of datapoints in sample: roughly 9 × 10^6; sample size: 3 × 6 mm²)
Slide4
Predictor Variables
Topography
Formation
…the way fibres are arranged
Slide5
Response
true colour image
Slide6
Generalized Linear Models
Basics
Slide7
Distribution of the Response
response
model for the response
here: binomial distribution
…part of the Exponential Family
with the probability for successful ink transmission as parameter
Slide8
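As a hedged illustration of the response model named above, the classic binomial distribution can be written in exponential-family form. The symbols y (number of successes), n (number of trials) and π (probability for successful ink transmission) are notation assumed here, not taken from the slides.

```latex
P(Y = y) = \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}
         = \binom{n}{y}\exp\!\left\{ y \log\frac{\pi}{1-\pi} + n\log(1-\pi) \right\},
\qquad y = 0, 1, \dots, n .
```

In this form the natural parameter is log(π / (1 − π)), which is why the logit link is the usual choice for binomial frequencies.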
the Generalized Linear Model*
model for
is linked to the mean by
* Nelder & Wedderburn (1972). Generalized Linear Models. Journal of the Royal Statistical Society, Series A, 135, 370-384
linear predictor
advantages over a linear model:
distribution of the relative frequencies … member of the Exponential Family
mean lies between 0 and 1
Slide9
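To make the link between linear predictor and mean concrete, here is a minimal sketch of fitting a binomial GLM with a logit link by Newton-Raphson (equivalent to iteratively reweighted least squares for this link). The function names `expit` and `fit_binomial_logit`, the restriction to an intercept plus one predictor, and the toy data are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def expit(eta):
    """Inverse logit: maps a linear predictor to a mean between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_binomial_logit(x, y, n, iterations=25):
    """Fit logit(pi_i) = b0 + b1 * x_i for counts y_i out of n_i trials."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the score vector and the 2x2 Fisher information.
        s0 = s1 = i00 = i01 = i11 = 0.0
        for xi, yi, ni in zip(x, y, n):
            pi = expit(b0 + b1 * xi)
            resid = yi - ni * pi            # observed minus expected count
            w = ni * pi * (1.0 - pi)        # binomial variance = IRLS weight
            s0 += resid
            s1 += resid * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        # Newton step: solve the 2x2 system (information) * delta = score.
        det = i00 * i11 - i01 * i01
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1


# Hypothetical toy data: four covariate values, 36 trials each.
x_demo = [-1.0, 0.0, 1.0, 2.0]
n_demo = [36, 36, 36, 36]
y_demo = [30, 22, 14, 6]
print(fit_binomial_logit(x_demo, y_demo, n_demo))
```

Because the fitted mean is `expit` of the linear predictor, it always lies between 0 and 1, which a plain linear model cannot guarantee.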
Model Deviance
Deviance = -2 × (maximized log-likelihood of considered model - maximized log-likelihood of saturated model)
under certain regularity conditions, the deviance approximately follows a χ² distribution with the residual degrees of freedom
…a test for goodness-of-fit
Underdispersion: variance of data smaller than assumed by the model
Overdispersion: variance of data larger than assumed by the model
Slide10
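The deviance definition above can be sketched for the binomial case as follows. The function names are hypothetical, and comparing the ratio deviance / degrees of freedom with 1 is a common rule of thumb assumed here, not a criterion quoted from the slides.

```python
import math


def binomial_deviance(y, n, p_hat):
    """Binomial deviance: -2 * (log-lik of fitted model - log-lik of saturated model)."""

    def xlogy(a, b):
        # Convention: 0 * log(0 / b) = 0.
        return 0.0 if a == 0 else a * math.log(a / b)

    total = 0.0
    for yi, ni, pi in zip(y, n, p_hat):
        total += xlogy(yi, ni * pi) + xlogy(ni - yi, ni * (1.0 - pi))
    return 2.0 * total


def dispersion_ratio(deviance, dof):
    """Deviance per degree of freedom; values far from 1 hint at mis-specified variance."""
    return deviance / dof


# Hypothetical counts out of 36 trials with fitted probabilities of 0.5.
d = binomial_deviance([9, 18, 30], [36, 36, 36], [0.5, 0.5, 0.5])
print(d, dispersion_ratio(d, dof=2))
```

A ratio well above 1 points towards overdispersion and a ratio well below 1 towards underdispersion, provided the mean structure itself is adequate.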
Deviances of the Printability Datasets
distinct deviations from a binomial variance!
(plot axis: few … many unprinted areas)
…values from 11 different data sets
Slide11
Multiplicative Binomial Distribution
A Generalization of the Binomial Distribution
Slide12
Definition
introduced by Altham* as "multiplicative generalization of the binomial distribution"
considers litters of rabbits
animals within one litter are treated with the same dose of a certain drug
n… litter size
y… number of surviving animals
outcomes of animals within one litter are not mutually independent
Altham introduces an interaction parameter ω
* Altham (1978). Two Generalizations of the Binomial Distribution. Journal of the Royal Statistical Society, Series C, 27, 162-167
Slide13
Properties
Member of the 2-parameter Exponential Family
For ω = 1, it corresponds to the Binomial Distribution
For n = 1, it reduces to the Bernoulli distribution
Slide14
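The two properties above can be checked numerically. The sketch below assumes the commonly cited form of Altham's multiplicative binomial, with probabilities proportional to C(n, y) p^y (1 − p)^(n − y) ω^(y(n − y)); the slides do not show the formula, so this form, the symbol p and the function names are assumptions.

```python
import math


def multiplicative_binomial_pmf(n, p, omega):
    """Probabilities for y = 0..n under the assumed multiplicative binomial form."""
    weights = [
        math.comb(n, y) * p**y * (1.0 - p) ** (n - y) * omega ** (y * (n - y))
        for y in range(n + 1)
    ]
    total = sum(weights)  # normalising constant
    return [w / total for w in weights]


def mean_and_variance(pmf):
    """Mean and variance of a distribution on 0..len(pmf)-1."""
    mean = sum(y * q for y, q in enumerate(pmf))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(pmf))
    return mean, var


# omega = 1 recovers the binomial; other values change the spread and,
# for p != 0.5, the mean as well.
for omega in (0.95, 1.0, 1.05):
    print(omega, mean_and_variance(multiplicative_binomial_pmf(36, 0.8, omega)))
```

Under this assumed form, ω < 1 pushes probability mass towards the extremes (overdispersion) and ω > 1 pulls it towards the centre (underdispersion).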
Comparison With Classic Binomial pdf
n = 36
probability parameter = 0.8
ω = 1 gives the classic binomial distribution
Slide15
Comparison of the Variances
n = 36
ω = 1 gives the classic binomial distribution
Slide16
Integration into GLM Context
log-likelihood function of distribution
probability parameter between 0 and 1: logit-link
ω > 0: log-linear link
Slide17
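A sketch of how the two links and the log-likelihood could be written out, assuming the multiplicative binomial has probabilities proportional to C(n, y) π^y (1 − π)^(n − y) ω^(y(n − y)). The symbols π_i, x_i, z_i, β, γ and the normalising constant c are notation assumed here for illustration.

```latex
\operatorname{logit}(\pi_i) = \log\frac{\pi_i}{1-\pi_i} = x_i^{\top}\beta ,
\qquad
\log \omega_i = z_i^{\top}\gamma ,

\ell(\beta,\gamma) = \sum_i \Big[ \log\binom{n_i}{y_i} + y_i\log\pi_i + (n_i-y_i)\log(1-\pi_i)
      + y_i(n_i-y_i)\log\omega_i - \log c(n_i,\pi_i,\omega_i) \Big],

c(n,\pi,\omega) = \sum_{k=0}^{n} \binom{n}{k}\,\pi^{k}(1-\pi)^{n-k}\,\omega^{k(n-k)} .
```

The logit link keeps π_i inside (0, 1) and the log link keeps ω_i positive for any real values of the linear predictors.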
Double Binomial Distribution
A Second Generalization of the Binomial Distribution
Slide18
Definition
introduced by Efron* as part of the Double Exponential Family
second parameter allows variation of variance: variance is smaller than binomial if the second parameter lies between 0 and 1 and larger than binomial if it exceeds 1
second parameter = 1 gives the classic binomial distribution
* Efron (1986). Double Exponential Families and their Use in Generalized Linear Regression. Journal of the American Statistical Association, 81, 709-721
Slide19
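The behaviour described above can be explored with a small sketch built on Efron's double exponential family, where the probabilities are proportional to C(n, y) [μ^y (1 − μ)^(n − y)]^θ [(y/n)^y (1 − y/n)^(n − y)]^(1 − θ). In Efron's form, θ < 1 inflates the variance. To match the direction stated on the slide, the sketch takes the slide's second parameter (called `psi` here) to be 1/θ. That mapping, the symbols μ and ψ, and the function names are assumptions.

```python
import math


def double_binomial_pmf(n, mu, psi):
    """Probabilities for y = 0..n; psi = 1 gives the classic binomial (assumed psi = 1/theta)."""
    theta = 1.0 / psi  # assumption: slide's second parameter is the inverse of Efron's theta

    def log_weight(y):
        # Binomial kernel at the model mean mu.
        at_mu = y * math.log(mu) + (n - y) * math.log(1.0 - mu)
        # Binomial kernel at the saturated value y / n, with 0^0 = 1.
        at_sat = (y * math.log(y / n) if y > 0 else 0.0) + (
            (n - y) * math.log((n - y) / n) if y < n else 0.0
        )
        return math.log(math.comb(n, y)) + theta * at_mu + (1.0 - theta) * at_sat

    logs = [log_weight(y) for y in range(n + 1)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


def variance(pmf):
    """Variance of a distribution on 0..len(pmf)-1."""
    mean = sum(y * q for y, q in enumerate(pmf))
    return sum((y - mean) ** 2 * q for y, q in enumerate(pmf))


for psi in (0.5, 1.0, 2.0):
    print(psi, variance(double_binomial_pmf(36, 0.5, psi)))
```

Under this parametrisation the variance is roughly ψ times the binomial variance, so values below 1 give underdispersion and values above 1 give overdispersion.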
Comparison With Classic Binomial pdf
n = 36
probability parameter = 0.8
second parameter = 1 gives the classic binomial distribution
Slide20
Comparison of the Variances
n = 36
second parameter = 1 gives the classic binomial distribution
Slide21
Integration into GLM Context
member of the 2-parameter exponential family
log-likelihood function of distribution
probability parameter between 0 and 1: logit-link
second parameter > 0: log-linear link
Slide22
An Application
The Printability Dataset
Slide23
Response and Explanatory Variables
occurrence of unprinted areas ~ topography + formation
where ~ reads "explained by"
Slide24
Comparison of Three Models
Distribution    classic binomial    multiplicative binomial    double binomial
Deviance        17071               8452                       11632
DoF             2483                2482                       2482
Deviance        6614                5836                       4117
DoF             2481                2480                       2480
AIC             6620                5845                       4125
Slide25
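One way to read the reported numbers is to relate each value to its degrees of freedom. The sketch below assumes that the values 6614, 5836 and 4117 are residual deviances of the fitted models and that the DoF values 2481, 2480 and 2480 belong to them. That reading is an interpretation of the slide, and the variable names are hypothetical.

```python
# (value, DoF) pairs and AIC as reported on the slide; treating the values
# as residual deviances is an assumption.
reported = {
    "classic binomial": (6614, 2481),
    "multiplicative binomial": (5836, 2480),
    "double binomial": (4117, 2480),
}
aic = {
    "classic binomial": 6620,
    "multiplicative binomial": 5845,
    "double binomial": 4125,
}

# Value per degree of freedom: closer to 1 suggests the assumed variance fits better.
ratios = {name: value / dof for name, (value, dof) in reported.items()}
best_by_aic = min(aic, key=aic.get)

for name, ratio in ratios.items():
    print(f"{name}: ratio = {ratio:.2f}, AIC = {aic[name]}")
print("lowest AIC:", best_by_aic)
```

On this reading both two-parameter models reduce the ratio compared with the classic binomial, and the double binomial has the lowest AIC of the three.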
Comparison of the Means
Slide26
Comparison of the Means
Slide27
Comparison of the Means
The second parameter influences the mean, too.
Slide28
Comparison of the Standard Deviations
Slide29
Comparison of the Standard Deviations
Slide30
Comparison of the Variances
binomial Std. Dev. at n=36: cannot be larger than 3
empirical Std. Deviations: up to 11
Multiplicative and Double Binomial Standard Deviations fit the empirical results much better
Slide31
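The bound of 3 follows directly from the binomial variance, which is largest when the success probability is 0.5. The symbol π for that probability is notation assumed here.

```latex
\sqrt{\operatorname{Var}(Y)} = \sqrt{n\,\pi(1-\pi)} \;\le\; \sqrt{\frac{n}{4}} = \sqrt{\frac{36}{4}} = 3 .
```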
Summary
Two generalizations of the binomial distribution may compensate for over- or underdispersion relative to the classic binomial distribution.
Multiplicative Binomial Distribution (Altham, 1978)
second parameter ω
in GLM context: model the probability parameter with the logistic link and ω with the log-linear link function
Slide32
Summary 2
Double Binomial Distribution (Efron, 1986)
in GLM context: model the probability parameter with the logistic link and the second parameter with the log-linear link function
Slide33
Thank You for Your Attention