An introduction to Bayesian inference and model comparison
J. Daunizeau
ICM, Paris, France & TNU, Zurich, Switzerland
Overview of the talk
- An introduction to probabilistic modelling
- Bayesian model comparison
- SPM applications
Part 1: An introduction to probabilistic modelling
Probability theory: basics

Degrees of plausibility desiderata:
- should be represented using real numbers (D1)
- should conform with intuition (D2)
- should be consistent (D3)

Basic rules, for a (e.g., continuous) probability distribution p:
- normalization: ∫ p(x) dx = 1
- marginalization: p(x) = ∫ p(x, y) dy
- conditioning (Bayes rule): p(x | y) = p(x, y) / p(y) = p(y | x) p(x) / p(y)

(figure: example densities with parameters a = 2, b = 5)
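The three basic rules can be checked numerically on a small discrete joint distribution; the numbers below are purely illustrative:

```python
import numpy as np

# Joint distribution p(x, y) over two binary variables (rows: x, cols: y).
# Hypothetical numbers chosen only for illustration.
p_xy = np.array([[0.30, 0.10],
                 [0.20, 0.40]])

# normalization: the joint sums to 1
assert np.isclose(p_xy.sum(), 1.0)

# marginalization: p(x) = sum over y of p(x, y)
p_x = p_xy.sum(axis=1)          # [0.4, 0.6]
p_y = p_xy.sum(axis=0)          # [0.5, 0.5]

# conditioning: p(x | y) = p(x, y) / p(y)
p_x_given_y = p_xy / p_y        # broadcasting divides each column by p(y)

# Bayes rule in its usual form: p(x | y) = p(y | x) p(x) / p(y)
p_y_given_x = p_xy / p_x[:, None]
bayes = p_y_given_x * p_x[:, None] / p_y
assert np.allclose(bayes, p_x_given_y)
print(p_x_given_y[:, 0])        # p(x | y=0) = [0.6, 0.4]
```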
Deriving the likelihood function
- Model of data with unknown parameters θ: y = f(θ); e.g., GLM: f(θ) = Xθ
- But data are noisy: y = f(θ) + ε
- Assume noise/residuals ε are 'small', e.g. Gaussian: ε ~ N(0, σ²)
→ Distribution of data, given fixed parameters θ: p(y | θ) = N(f(θ), σ²)
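A minimal sketch of this construction for a toy GLM, assuming Gaussian noise with known variance (all names and values below are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy GLM: y = X @ theta + eps, eps ~ N(0, sigma^2 I)  (values illustrative)
n, p = 50, 2
X = np.column_stack([np.ones(n), rng.standard_normal(n)])  # design matrix
theta_true = np.array([1.0, 2.0])
sigma = 0.5
y = X @ theta_true + sigma * rng.standard_normal(n)

def log_likelihood(theta, y, X, sigma):
    """Gaussian log-likelihood ln p(y | theta): data given fixed parameters."""
    resid = y - X @ theta
    return -0.5 * (len(y) * np.log(2 * np.pi * sigma**2)
                   + resid @ resid / sigma**2)

# The likelihood is (here) far higher near the true parameters:
print(log_likelihood(theta_true, y, X, sigma))
print(log_likelihood(np.zeros(2), y, X, sigma))  # much lower
```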
Forward and inverse problems
- forward problem (model → data): the likelihood p(y | θ, m)
- inverse problem (data → model): the posterior distribution p(θ | y, m)
Likelihood, priors and the model evidence
- Likelihood: p(y | θ, m)
- Prior: p(θ | m)
- Bayes rule: p(θ | y, m) = p(y | θ, m) p(θ | m) / p(y | m)
Together, the likelihood and the prior specify the generative model m.
Bayesian model comparison

Principle of parsimony: "plurality should not be assumed without necessity"

"Occam's razor": the model evidence p(y | m) = ∫ p(y | θ, m) p(θ | m) dθ embodies this principle automatically. A flexible model must spread its probability mass over a large space of all data sets, so it assigns lower evidence to any one typical data set than a simpler model that concentrates on it.
Hierarchical models
Hierarchical models chain conditional dependencies across levels, supporting inference (from observed data back to their causes) and expressing causality (from causes down to data).
Directed acyclic graphs (DAGs)
Variational approximations (VB, EM, ReML)
→ VB: maximize the free energy F(q) = <ln p(y, θ | m)>_q − <ln q(θ)>_q with respect to the approximate posterior q(θ), under some simplifying constraint (e.g., mean field, Laplace). Since F(q) = ln p(y | m) − KL[q(θ) || p(θ | y, m)], the free energy is a lower bound on the log model evidence, and the bound is tight when q matches the exact posterior.
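The free-energy identity above can be verified numerically on a tiny discrete problem (the joint probabilities below are illustrative):

```python
import numpy as np

# Discrete toy: theta takes 3 values; joint p(y, theta) for one observed y.
p_joint = np.array([0.10, 0.30, 0.15])        # p(y, theta), illustrative
p_y = p_joint.sum()                            # model evidence p(y | m)
posterior = p_joint / p_y                      # p(theta | y, m)

def free_energy(q):
    """F(q) = <ln p(y, theta)>_q + entropy of q."""
    return q @ np.log(p_joint) - q @ np.log(q)

def kl(q, p):
    """KL divergence KL[q || p] for discrete distributions."""
    return q @ np.log(q / p)

q = np.array([0.2, 0.5, 0.3])                  # some approximate posterior
# Identity: F(q) = ln p(y) - KL[q || p(theta | y)], so F lower-bounds ln p(y)
assert np.isclose(free_energy(q), np.log(p_y) - kl(q, posterior))
# F(q) is maximized when q equals the exact posterior:
assert free_energy(posterior) >= free_energy(q)
print(free_energy(posterior), np.log(p_y))     # equal at the optimum
```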
Type, role and impact of priors

Types of priors:
- Explicit priors on model parameters (e.g., Gaussian)
- Implicit priors on model functional form (e.g., evolution & observation functions)
- Choice of "interesting" data features (e.g., response magnitude vs response profile)

Impact of priors:
- On parameter posterior distributions (cf. "shrinkage to the mean" effect)
- On model evidence (cf. "Occam's razor")

Role of explicit priors (on model parameters):
- Resolving the ill-posedness of the inverse problem
- Avoiding overfitting (cf. generalization error)
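The "shrinkage to the mean" effect can be sketched with conjugate Gaussian algebra (a standard result, with illustrative numbers): the posterior mean is a precision-weighted average of prior mean and sample mean.

```python
import numpy as np

def posterior_mean(y, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Posterior over theta for y_i ~ N(theta, noise_var), theta ~ N(prior_mean, prior_var).
    Returns the precision-weighted average of prior mean and data, plus its variance."""
    n = len(y)
    post_prec = 1 / prior_var + n / noise_var
    post_mean = (prior_mean / prior_var + y.sum() / noise_var) / post_prec
    return post_mean, 1 / post_prec

y = np.array([2.0, 2.4, 1.6])
m, v = posterior_mean(y)
print(m, y.mean())   # posterior mean lies between 0 (prior mean) and 2.0 (MLE)
```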
Part 2: Bayesian model comparison
Frequentist versus Bayesian inference

Classical (null) hypothesis testing:
- define the null, e.g.: H0: θ = 0
- estimate parameters (obtain a test statistic t(y))
- apply decision rule, i.e.: if p(t(y) > t* | H0) ≤ α, then reject H0

Bayesian model comparison:
- define two alternative models, e.g.: m0: θ = 0 versus m1: θ ≠ 0
- apply decision rule, e.g.: if p(m0 | y) ≥ 0.95, then accept m0

(figure: sampling distribution over the space of all data sets)
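The two decision rules can be put side by side on one illustrative data set (the scenario, prior on the effect, and numbers below are assumptions for this sketch, not from the slides):

```python
import numpy as np
from math import erf, sqrt

# One-sample setting with known unit variance: y_i ~ N(mu, 1).
# Null m0: mu = 0; alternative m1: mu ~ N(0, 1) (a prior on the effect size).
rng = np.random.default_rng(1)
y = rng.normal(0.3, 1.0, size=40)       # illustrative data, true mu = 0.3
n, ybar = len(y), y.mean()

# Classical z-test: two-sided p-value for H0: mu = 0
z = ybar * sqrt(n)
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

def normal_pdf(x, sd):
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# Bayesian comparison via the marginal likelihood of the sample mean:
# under m0: ybar ~ N(0, 1/n); under m1 (mu integrated out): ybar ~ N(0, 1 + 1/n)
ev0 = normal_pdf(ybar, np.sqrt(1 / n))
ev1 = normal_pdf(ybar, np.sqrt(1 + 1 / n))
post_m0 = ev0 / (ev0 + ev1)             # equal prior model probabilities
print(p_value, post_m0)
```

Note that the p-value conditions only on the null, whereas the posterior model probability weighs both models' marginal likelihoods.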
Family-level inference

Consider four models m1–m4 of a two-region network (A, B), differing in their connections and in the presence of the input u, with posterior probabilities:
p(m1 | y) = 0.04, p(m2 | y) = 0.25, p(m3 | y) = 0.70, p(m4 | y) = 0.01

Model selection error risk: p(error | y) = 1 − max_m p(m | y) = 0.30
Grouping the same four models into two families and pooling the statistical evidence:
p(f1 | y) = p(m1 | y) + p(m4 | y) = 0.05
p(f2 | y) = p(m2 | y) + p(m3 | y) = 0.95

Family inference pools statistical evidence across models, reducing the selection error risk from 0.30 (model level) to 0.05 (family level).
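The pooling step is a simple sum of model posteriors within each family; here it is spelled out with the slide's numbers (the assignment of models to families is inferred from the 0.05/0.95 family posteriors):

```python
# Pooling model posteriors into family posteriors (numbers from the slide;
# the grouping of models into families is an assumption consistent with them).
post = {"m1": 0.04, "m2": 0.25, "m3": 0.70, "m4": 0.01}
families = {"f1": ["m1", "m4"], "f2": ["m2", "m3"]}

family_post = {f: sum(post[m] for m in ms) for f, ms in families.items()}
print(family_post)                    # f1 -> 0.05, f2 -> 0.95

# Model-selection error risk drops from about 0.30 at the model level...
print(1 - max(post.values()))
# ...to about 0.05 at the family level:
print(1 - max(family_post.values()))
```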
Sampling subjects as marbles in an urn
- r_i = 1 → the i-th marble is blue
- r_i = 0 → the i-th marble is purple
- (binomial) probability of drawing a set of n marbles: p(r | f) = ∏ f^(r_i) (1 − f)^(1 − r_i), where f = frequency of blue marbles in the urn
Thus, our belief about the frequency of blue marbles is: p(f | r) ∝ p(r | f) p(f)
RFX group-level model comparison
At least, we can measure how likely the i-th subject's data are under each model: p(y_i | m).
Treating the model as a random effect across subjects (like the marbles above), our belief about the frequency r of each model in the population combines these subject-wise evidences: p(r | y_1, …, y_n).
Exceedance probability: φ_k = p(r_k > r_j for all j ≠ k | y), the probability that model k is the most frequent one in the population.
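A deliberately simplified sketch of the exceedance probability: SPM's RFX BMS integrates over the subject-wise evidences with a variational scheme, whereas here we assume hard model attributions per subject (a hypothetical scenario), giving a Dirichlet posterior over model frequencies that can be sampled:

```python
import numpy as np

rng = np.random.default_rng(0)

# Suppose, for illustration, hard model attributions across 12 subjects:
# 8 subjects' data favour model 1, 4 favour model 2.
counts = np.array([8, 4])

# Dirichlet(1, 1) prior over model frequencies r -> Dirichlet posterior:
alpha_post = 1 + counts

# Exceedance probability: P(r1 > r2 | y), estimated by Monte Carlo
samples = rng.dirichlet(alpha_post, size=100_000)
xp = (samples[:, 0] > samples[:, 1]).mean()
print(xp)   # probability that model 1 is the more frequent one (~0.87 here)
```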
SPM: frequentist vs Bayesian RFX analysis
(figure: subjects' parameter estimates)
Part 3: SPM applications
The SPM pipeline
realignment → smoothing → normalisation (to a template, via segmentation and normalisation) → general linear model → statistical inference (Gaussian field theory, p < 0.05)
Bayesian components: posterior probability maps (PPMs), dynamic causal modelling, multivariate decoding
aMRI segmentation: mixture of Gaussians (MoG) model
- the i-th voxel value is modelled conditional on the i-th voxel label (grey matter, white matter, CSF)
- unknown parameters: class means, class variances, class frequencies
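A minimal 1-D sketch of fitting such a mixture with expectation-maximization, assuming two classes and synthetic "intensities" (SPM's actual segmentation adds spatial priors and bias-field correction; none of that is modelled here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "voxel intensities" from two tissue-like classes (values illustrative).
x = np.concatenate([rng.normal(0.0, 1.0, 300),   # darker class
                    rng.normal(5.0, 1.0, 700)])  # brighter class

# EM for a 2-class mixture of Gaussians
mu, var, pi = np.array([-1.0, 6.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(100):
    # E-step: responsibility of each class for each voxel (labels are latent)
    lik = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    resp = lik / lik.sum(axis=1, keepdims=True)
    # M-step: update class frequencies, means and variances
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print(mu, pi)   # means near (0, 5); class frequencies near (0.3, 0.7)
```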
Decoding of brain images: recognizing brain states from fMRI
(paradigm: fixation cross → pace → response)
- log-evidence of X-Y sparse mappings: effect of lateralization
- log-evidence of X-Y bilateral mappings: effect of spatial deployment
fMRI time series analysis: spatial priors and model comparison

Generative model: the fMRI time series is explained by a design matrix (X) times GLM coefficients, plus AR-modelled (correlated) noise, with hierarchical priors on the prior variance of the GLM coefficients and on the prior variance of the data noise.

Comparing a short-term memory design matrix against a long-term memory design matrix yields posterior probability maps:
- PPM: regions best explained by the short-term memory model
- PPM: regions best explained by the long-term memory model
Dynamic Causal Modelling: network structure identification

Four candidate models m1–m4, each a network of regions V1, V5 and PPC driven by the stimulus (stim), differing in which connection the attention input modulates.

(figure: the models' marginal likelihoods; m4 wins)

Estimated effective synaptic strengths for the best model (m4): 1.25, 0.13, 0.46, 0.39, 0.26, 0.26, 0.10.
I thank you for your attention.
A note on statistical significance: lessons from the Neyman–Pearson lemma

Neyman–Pearson lemma: the likelihood ratio (or Bayes factor) test is the most powerful test of a given size for testing the null.

ROC analysis: what is the threshold u, above which the Bayes factor test yields a type I error rate of 5%?
- MVB (Bayes factor): u = 1.09, power = 56%
- CCA (F-statistic): F = 2.20, power = 20%

(figure: ROC curves, type I error rate vs 1 − type II error rate)
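The threshold-calibration question can be answered by Monte Carlo in the simple-vs-simple setting of the lemma; the hypotheses and effect size below are assumptions for illustration and do not reproduce the MVB/CCA numbers above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Neyman-Pearson setting (simple vs simple): H0: x_i ~ N(0,1), H1: x_i ~ N(0.5,1)
n, n_sim, mu1 = 20, 20_000, 0.5

def log_lr(data):
    """Log likelihood ratio ln p(data|H1) - ln p(data|H0) (Gaussian, unit var)."""
    return mu1 * data.sum(axis=1) - n * mu1**2 / 2

lr_null = log_lr(rng.normal(0.0, 1.0, (n_sim, n)))
lr_alt = log_lr(rng.normal(mu1, 1.0, (n_sim, n)))

# Calibrate the threshold u for a 5% type I error rate...
u = np.quantile(lr_null, 0.95)
# ...then read off the power (1 - type II error rate) at that threshold:
power = (lr_alt > u).mean()
print(u, power)
```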