MMBIOS April 2015 Gregory R Johnson Nuclear shape Cell shape Object pos probability Object number Object appearance Microtubule distribution Object positions Object distribution CellOrganizer ID: 784287
Slide1
Model Selection in Parameterizing Cell Images and Populations
MMBIOS, April 2015
Gregory R. Johnson
Slide2
Diagram: CellOrganizer. Training: Cell Images → Model Parameters. Synthesis: Model Parameters → Synthetic Images.
Model components: nuclear shape, cell shape, object position probability, object number, object appearance, microtubule distribution, object positions, object distribution
Slide3
CellOrganizer Models Cell Populations
Learn how spatial relationships of cell compartments vary across cell populations
Generate high-quality in silico representations (i.e., images) of cell shape and the relationships of the compartments within it
Diagram: Images x_1, …, x_n → Parameterizations p_1, …, p_n → Cell Morphology Distribution P(p_i|Ɵ) → Sampled Parameterizations p_1*, …, p_m* → Synthesized Images x_1*, …, x_m*
f(x) = p
d({p_1,…,p_n}) = Ɵ
b(Ɵ) = p*
g(p) = x
Slide4
CellOrganizer Models Cell Populations
Represent cell morphology and organization of components in an invertible, compact manner
Learn a distribution over these compact parameterizations
Diagram: Images x_1, …, x_n → Parameterizations p_1, …, p_n → Cell Morphology Distribution P(p_i|Ɵ) → Sampled Parameterizations p_1*, …, p_m* → Synthesized Images x_1*, …, x_m*
f(x) = p
d({p_1,…,p_n}) = Ɵ
b(Ɵ) = p*
g(p) = x
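The four functions named on these slides (f, d, b, g) can be illustrated with a deliberately tiny example. This is only a sketch under strong assumptions, not CellOrganizer's implementation: the "image" is a hypothetical 1-D binary strip, the parameterization is a single number (its length), and the distribution is a Gaussian. The constant `WIDTH` and all function bodies are made up for illustration.

```python
import random
import statistics

WIDTH = 32  # hypothetical image width for this toy example


def f(x):
    """Parameterize: summarize a toy 1-D binary 'cell' by its length."""
    return sum(x)


def g(p):
    """Synthesize: rebuild the toy image from its parameter (inverse of f)."""
    p = max(0, min(WIDTH, int(round(p))))
    return [1] * p + [0] * (WIDTH - p)


def d(params):
    """Learn the distribution: here a Gaussian with theta = (mean, std)."""
    return statistics.mean(params), statistics.stdev(params)


def b(theta, rng):
    """Sample a new parameterization p* from P(p | theta)."""
    mean, std = theta
    return rng.gauss(mean, std)


rng = random.Random(0)
images = [g(rng.randint(8, 24)) for _ in range(50)]  # stand-in training images
params = [f(x) for x in images]                      # f(x) = p
theta = d(params)                                    # d({p_1,...,p_n}) = theta
samples = [b(theta, rng) for _ in range(5)]          # b(theta) = p*
synthetic = [g(p) for p in samples]                  # g(p) = x
```

In this toy case g(f(x)) returns x exactly; for real images the round trip is lossy, which is what the following slides discuss.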
Slide5
Image To Parameterization
Represent cell morphology in a compact set of parameters
We also desire an invertible function such that we can recover the original image
Diagram: Images x_1 → Parameterizations p_1 = [cell, nucleus, protein pattern]
f(x_i) = p_i ⟺ g(p_i) = x_i, with p_i = [p_i,1, p_i,2, p_i,3]
Slide6
Image parameterization is lossy
Image parameterizations vs number of parameters
Becomes a likelihood maximization problem if K (the number of mixture components) is known
GMM parameters
Represent the mixture from parameters
Figure panels: LAMP2 protein pattern; full covariance matrix Gaussian fit; spherical covariance matrix Gaussian fit
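To make the trade-off between fit quality and number of parameters concrete, the sketch below counts the free parameters of a K-component Gaussian mixture with full versus spherical covariance. The function name and the choice of 3 dimensions are assumptions for illustration; the counting itself is the standard one (K - 1 mixing weights, K mean vectors, plus the covariance terms).

```python
def gmm_free_parameters(k, dim, covariance="full"):
    """Count free parameters of a k-component Gaussian mixture in `dim` dimensions."""
    weights = k - 1          # mixing weights sum to 1
    means = k * dim          # one mean vector per component
    if covariance == "full":
        cov = k * dim * (dim + 1) // 2  # symmetric matrix per component
    elif covariance == "spherical":
        cov = k                          # one variance per component
    else:
        raise ValueError(f"unknown covariance type: {covariance}")
    return weights + means + cov


for k in (1, 5, 20):
    full = gmm_free_parameters(k, 3, "full")
    spherical = gmm_free_parameters(k, 3, "spherical")
    print(f"K={k}: full={full}, spherical={spherical}")
```

A spherical fit uses fewer parameters per component but cannot represent elongated or rotated objects, so the choice is itself a model selection problem.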
Slide7
Shape Space Modeling Pipeline
Figure: pipeline panels a. to d., including an MDS step; values shown in the figure: 0.85, 0.63, 0.74, 0.90
Slide8
Image parameterization is lossy (contd.)
Fig 2 from T. Peng et al., “Instance-based generative biological shape modeling”, 2009.
Figure labels: x_1, x_2, x_3, x_4; g(p_1), g(p_2), g(p_3), g(p_4)
Slide9
Multidimensional Scaling
The objective is defined in terms of:
D_i,j, the measured distance between shapes i, j
the Euclidean embeddings for all shapes
the Euclidean distance between embedding coordinates for shapes i, j
an indicator for whether D_i,j is observed
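The slide's equation did not survive extraction. A common objective that matches the four terms listed is a weighted stress: the sum, over observed pairs only, of the squared difference between the measured distance and the distance in the embedding. The sketch below assumes that form; the names `stress`, `distances`, `observed`, and `coords` are hypothetical.

```python
import math


def stress(distances, observed, coords):
    """Sum over observed pairs (i, j) of (D_ij - ||x_i - x_j||)^2.

    distances: square matrix of measured distances D_ij
    observed:  square boolean matrix, True where D_ij was measured
    coords:    embedding coordinates, one tuple per shape
    """
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if observed[i][j]:
                embedded = math.dist(coords[i], coords[j])
                total += (distances[i][j] - embedded) ** 2
    return total


# Three shapes whose measured distances fit exactly on a line.
D = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
all_observed = [[True] * 3 for _ in range(3)]
# Same distances, but the pair (0, 2) was never measured.
partial = [[True, True, False], [True, True, True], [False, True, True]]

print(stress(D, all_observed, [(0.0,), (1.0,), (3.0,)]))
print(stress(D, partial, [(0.0,), (1.0,), (5.0,)]))
```

Unobserved pairs contribute nothing to the objective, so the embedding is constrained only by the distances that were actually measured.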
Slide10
Shape space dimensionality vs Reconstruction
Reconstruction depends on the number of observed distances and the dimensionality of the embedding
blue = 1-dimensional embedding
red = “complete” embedding
Slide11
Prediction of cell and nuclear dependency
Slide12
The “goodness” of a cell parameterization
Many ways to do this
Pixel-pixel mean squared error
Sørensen-Dice coefficient for binary images and shapes
Likelihood function
…
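Two of the measures listed above are easy to state in code. The sketch below compares an original image with its reconstruction using pixel-wise mean squared error and the Sørensen-Dice coefficient, 2|A ∩ B| / (|A| + |B|). Images are assumed to be small nested lists; the function names and example masks are illustrative only.

```python
def mean_squared_error(a, b):
    """Pixel-wise mean squared error between two equally sized images."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def dice_coefficient(a, b):
    """Sørensen-Dice coefficient between two binary masks."""
    flat_a = [bool(v) for row in a for v in row]
    flat_b = [bool(v) for row in b for v in row]
    intersection = sum(1 for x, y in zip(flat_a, flat_b) if x and y)
    size = sum(flat_a) + sum(flat_b)
    # Two empty masks are treated as identical.
    return 1.0 if size == 0 else 2.0 * intersection / size


original = [[0, 1, 1],
            [0, 1, 1],
            [0, 0, 0]]
reconstruction = [[0, 1, 1],
                  [0, 1, 0],
                  [0, 0, 0]]

print(mean_squared_error(original, reconstruction))
print(dice_coefficient(original, reconstruction))
```

The two scores need not rank reconstructions the same way, which is part of why "goodness" has no single definition here.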
Slide13
Parameters to distribution
Diagram: p_1, …, p_n → P(p_i|Ɵ) → p*
d({p_1,…,p_n}) = Ɵ
b(p|Ɵ) = p*
Slide14
Parameters to distribution
“Straightforward” distribution learning and model selection
Some parameterizations may overfit (i.e. point-mass)
Many models cannot be learned via closed-form solutions
Predictive maximum likelihood, i.e. the likelihood of held-out data under the trained model, where n is the number of hold-outs, x_n is some hold-out subset, and Ɵ_n is the corresponding trained model
Diagram: p_1, …, p_n → P(p_i|Ɵ) → p*
d({p_1,…,p_n}) = Ɵ
b(p|Ɵ) = p*
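One way to read predictive maximum likelihood is as a sum of held-out log-likelihoods: each hold-out subset x_n is scored under the model Ɵ_n trained on the remaining data. The sketch below assumes that reading and uses a 1-D Gaussian as a stand-in model. `fit_narrow` is a hypothetical, deliberately overconfident fit that mimics the point-mass overfitting mentioned on the slide.

```python
import math
import random
import statistics


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def heldout_log_likelihood(data, fit, n_folds=5):
    """Sum over folds of log P(x_n | theta_n), theta_n trained without fold n."""
    total = 0.0
    for k in range(n_folds):
        held_out = data[k::n_folds]
        train = [x for i, x in enumerate(data) if i % n_folds != k]
        mean, std = fit(train)
        total += sum(gaussian_logpdf(x, mean, std) for x in held_out)
    return total


def fit_gaussian(train):
    """Maximum-likelihood-style fit: sample mean and standard deviation."""
    return statistics.mean(train), statistics.stdev(train)


def fit_narrow(train):
    """Overconfident fit: nearly all mass at the training mean."""
    return statistics.mean(train), 0.01


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100)]
print("gaussian:", heldout_log_likelihood(data, fit_gaussian))
print("narrow:  ", heldout_log_likelihood(data, fit_narrow))
```

The narrow model would look excellent if scored only at its own mean, but it is heavily penalized on held-out points, which is the behavior the hold-out criterion is meant to expose.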
Slide15
Distributions of object position
SEC23B
ACBD5
HIP1
Slide16
Possible Models
Puncta are dependent on organelles, but independent of each other
Poisson process
Puncta are dependent on organelles and each other
Fiksel point process
Slide17
The model with no puncta-puncta spatial interaction has the greater likelihood!
Five-fold cross-validation to choose the best model
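For the Poisson-process model, the log-likelihood of a point pattern has a closed form: the sum of log-intensities at the observed points minus the integral of the intensity over the region. The sketch below evaluates it on a pixel grid. The intensity maps, the grid size, and the function name are assumptions for illustration, not the models fitted in the talk.

```python
import math


def poisson_process_loglik(points, intensity, cell_area):
    """Log-likelihood of a point pattern under an inhomogeneous Poisson process.

    points:    list of (row, col) grid cells containing one punctum each
    intensity: 2-D list of intensity values (expected puncta per unit area)
    cell_area: area of one grid cell
    """
    integral = sum(lam * cell_area for row in intensity for lam in row)
    return sum(math.log(intensity[r][c]) for r, c in points) - integral


# Four puncta, clustered in the top-left of a 2 x 2 grid.
points = [(0, 0), (0, 0), (0, 0), (0, 1)]

# Hypothetical intensity that ignores organelle position entirely.
uniform = [[1.0, 1.0], [1.0, 1.0]]
# Hypothetical intensity that is higher near an organelle in the top-left.
near_organelle = [[2.5, 1.0], [0.25, 0.25]]

print(poisson_process_loglik(points, uniform, 1.0))
print(poisson_process_loglik(points, near_organelle, 1.0))
```

Both intensity maps integrate to the same expected count, so the difference in log-likelihood comes only from where the puncta fall. In a cross-validated comparison, the intensity would be fitted on the training folds and this quantity evaluated on the held-out fold.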
Slide18
Toward Spatial Network Models
Colocalization is a complex network with interdependencies
Simplify it by using one-direction dependencies (network -> DAG)
A diagram of a simplified spatial interaction network
A spatial network exhibiting negative colocalization
Figure: panels a) to c); node labels: dprot, dcell, dnuc, pprot, nprot, sprot, iprot, dprot, Protein N
Slide19
Pattern Modeling contd.
Generative Models
Add parameters to account for spatial dependency of arbitrary numbers of protein patterns
P(Chloroplast | Cell)
P(ER | Cell, Chloroplast)
3D rendering of a protoplast
P(Chloroplast | Cell)
P(ER | Cell)
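The two sets of conditional terms above correspond to two dependency graphs. Once the dependencies form a DAG, a topological order gives a valid order in which to write the factors, and in which to generate each pattern. The sketch below derives the factor list from a dictionary mapping each pattern to the patterns it is conditioned on. The graph dictionaries and the `factorization` helper are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each node maps to the nodes it is conditioned on (its parents in the DAG).
dependent = {
    "Cell": [],
    "Chloroplast": ["Cell"],
    "ER": ["Cell", "Chloroplast"],
}
independent = {
    "Cell": [],
    "Chloroplast": ["Cell"],
    "ER": ["Cell"],
}


def factorization(parents):
    """Return the conditional factors in an order where parents come first."""
    order = TopologicalSorter(parents).static_order()
    terms = []
    for node in order:
        given = parents[node]
        if given:
            terms.append(f"P({node} | {', '.join(given)})")
        else:
            terms.append(f"P({node})")
    return terms


print(factorization(dependent))
print(factorization(independent))
```

If the dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which is the practical reason for reducing the interaction network to a DAG first.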
Slide20
Big Picture…
Want the most precise cell parameterization
f(x) = p, g(p) = x
Best-generalizing distribution
d({p_1,…,p_n}) = Ɵ
Diagram: Images x_1, …, x_n → Parameterizations p_1, …, p_n → Cell Morphology Distribution P(p_i|Ɵ) → Sampled Parameterizations p_1*, …, p_m* → Synthesized Images x_1*, …, x_m*
f(x) = p
d({p_1,…,p_n}) = Ɵ
b(Ɵ) = p*
g(p) = x
Slide21
Master Modeling function
How to build a master model-selection model
g(p_i) with least error between x_i and g(p_i)
d({p_1,…,p_n}) = Ɵ with greatest likelihood
Even if err_tot is some sort of probabilistic model, it is not clear how to balance err_tot and the likelihood of the model
ESPECIALLY BECAUSE G(X) DRASTICALLY CHANGES VALUES OF Ɵ