Latent Tree Models
Part II: Definition and Properties
Nevin L. Zhang
Dept. of Computer Science & Engineering
The Hong Kong Univ. of Sci. & Tech.
http://www.cse.ust.hk/~lzhang
AAAI 2014 Tutorial
Part II: Concept and Properties
Latent Tree Models
- Definition
- Relationship with finite mixture models
- Relationship with phylogenetic trees
- Basic Properties
Basic Latent Tree Models (LTM)
A Bayesian network where:
- All variables are discrete
- The structure is a rooted tree
- Leaf nodes are observed (manifest variables)
- Internal nodes are not observed (latent variables)
Parameters: P(Y1), P(Y2|Y1), P(X1|Y2), P(X2|Y2), …
Semantics: the joint distribution factorizes as the product of each variable's distribution given its parent, e.g., P(Y1, Y2, X1, X2) = P(Y1) P(Y2|Y1) P(X1|Y2) P(X2|Y2).
Also known as hierarchical latent class (HLC) models (Zhang, JMLR 2004).
Joint Distribution over Observed Variables
Marginalizing out the latent variables in the joint distribution, we get a joint distribution over the observed variables.
In comparison with Bayesian networks without latent variables, an LTM:
- Is computationally very simple to work with.
- Represents complex relationships among the manifest variables.
What does the structure look like without the latent variables?
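As a concrete illustration of the marginalization above, here is a minimal sketch of a four-variable LTM with latent Y1, Y2 and observed X1, X2, computed by brute-force enumeration. All CPT numbers below are made up for illustration; they are not from the tutorial.

```python
import itertools

# Toy LTM: latent Y1 -> Y2, with observed X1 and X2 as children of Y2.
# All variables are binary. CPTs are indexed as table[child_state][parent_state].
p_y1 = [0.6, 0.4]                      # P(Y1)
p_y2_y1 = [[0.7, 0.2], [0.3, 0.8]]     # P(Y2=j | Y1=i) -> p_y2_y1[j][i]
p_x1_y2 = [[0.9, 0.1], [0.1, 0.9]]     # P(X1=j | Y2=i)
p_x2_y2 = [[0.8, 0.3], [0.2, 0.7]]     # P(X2=j | Y2=i)

def joint(y1, y2, x1, x2):
    """Full joint: P(Y1) P(Y2|Y1) P(X1|Y2) P(X2|Y2)."""
    return p_y1[y1] * p_y2_y1[y2][y1] * p_x1_y2[x1][y2] * p_x2_y2[x2][y2]

def marginal(x1, x2):
    """P(X1, X2): sum the latent variables Y1, Y2 out of the joint."""
    return sum(joint(y1, y2, x1, x2)
               for y1, y2 in itertools.product((0, 1), repeat=2))

# The marginal is a proper distribution over the observed variables.
total = sum(marginal(x1, x2) for x1, x2 in itertools.product((0, 1), repeat=2))
```

Brute-force enumeration is exponential in the number of latent variables; in practice one exploits the tree structure (message passing), which is why LTMs are computationally simple to work with.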
Pouch Latent Tree Models (PLTM)
An extension of the basic LTM:
- Rooted tree
- Internal nodes represent discrete latent variables
- Each leaf node consists of one or more continuous observed variables, called a pouch.
(Poon et al., ICML 2010)
More General Latent Variable Tree Models
- Some internal nodes can be observed
- Internal nodes can be continuous
- The structure can be a forest
Primary focus of this tutorial: the basic LTM.
(Choi et al., JMLR 2011)
Part II: Concept and Properties
Latent Tree Models
- Definition
- Relationship with finite mixture models
- Relationship with phylogenetic trees
- Basic Properties
Finite Mixture Models (FMM)
Gaussian Mixture Models (GMM): continuous attributes
[Graphical model]
Finite Mixture Models (FMM)
GMM with independence assumption: block-diagonal covariance matrix.
[Graphical model]
Finite Mixture Models
Latent class models (LCM): discrete attributes.
Distribution for cluster k: product multinomial distribution.
All FMMs:
- Have one latent variable
- Yield one partition of the data
[Graphical model]
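The product-multinomial mixture above can be sketched in a few lines. The class proportions and per-attribute multinomials below are illustrative values, not from the tutorial:

```python
# Latent class model sketch: one latent class variable C with K classes;
# given C=k, the D discrete attributes are independent (product multinomial).
pi = [0.5, 0.5]                    # P(C=k), K=2 classes
theta = [                          # theta[k][d][v] = P(X_d = v | C = k)
    [[0.9, 0.1], [0.8, 0.2]],      # class 0, two binary attributes
    [[0.2, 0.8], [0.3, 0.7]],      # class 1
]

def class_joint(k, x):
    """P(C=k, x) = pi_k * prod_d P(x_d | C=k)."""
    like = pi[k]
    for d, v in enumerate(x):
        like *= theta[k][d][v]
    return like

def lcm_prob(x):
    """P(x): marginalize out the single latent class variable."""
    return sum(class_joint(k, x) for k in range(len(pi)))

def posterior(x):
    """P(C=k | x): class responsibilities, i.e., a soft partition of the data."""
    joint = [class_joint(k, x) for k in range(len(pi))]
    z = sum(joint)
    return [j / z for j in joint]
```

The posterior is what yields the "one partition of the data": each data point is assigned (softly or by argmax) to one of the K classes.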
From FMMs to LTMs
Start with several GMMs:
- Each based on a distinct subset of attributes
- Each partitions the data from a certain perspective
- Different partitions are independent of each other
Link them up to form a tree model, obtaining a pouch LTM:
- Considers different perspectives in a single model
- Multiple partitions of the data that are correlated.
From FMMs to LTMs
Start with several LCMs:
- Each based on a distinct subset of attributes
- Each partitions the data from a certain perspective
- Different partitions are independent of each other
Link them up to form a tree model, obtaining an LTM:
- Considers different perspectives in a single model
- Multiple partitions of the data that are correlated.
Summary: An LTM can be viewed as a collection of FMMs, with their latent variables linked up to form a tree structure.
Part II: Concept and Properties
Latent Tree Models
- Definition
- Relationship with finite mixture models
- Relationship with phylogenetic trees
- Basic Properties
Phylogenetic Trees
- Taxa (sequences) identify species
- Edge lengths represent evolution time
- Usually a bifurcating tree topology
Durbin et al. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press.
Probabilistic Models of Evolution
Two assumptions:
- There are only substitutions, no insertions/deletions (sequences are aligned), so there is a one-to-one correspondence between sites in different sequences.
- Each site evolves independently and identically:
P(x | y, t) = ∏_{i=1}^{m} P(x(i) | y(i), t), where m is the sequence length.
The site model P(x(i) | y(i), t) is the Jukes-Cantor (character evolution) model [1969], with rate of substitution α.
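The Jukes-Cantor substitution probability and the site-independence product can be sketched as follows. These are the standard JC69 closed-form probabilities; the rate α and time t values in any call are illustrative:

```python
import math

def jc_prob(x, y, t, alpha=1.0):
    """Jukes-Cantor P(x | y, t): probability that a nucleotide in state y
    is in state x after time t, with substitution rate alpha."""
    e = math.exp(-4.0 * alpha * t)
    if x == y:
        return 0.25 + 0.75 * e   # no net change
    return 0.25 - 0.25 * e       # change to one of the 3 other nucleotides

def seq_prob(x_seq, y_seq, t, alpha=1.0):
    """P(x | y, t) = prod over sites, using the site-independence assumption."""
    assert len(x_seq) == len(y_seq)  # aligned sequences, one-to-one sites
    p = 1.0
    for x, y in zip(x_seq, y_seq):
        p *= jc_prob(x, y, t, alpha)
    return p
```

At t = 0 the sequence is unchanged with probability 1; as t grows, the distribution over each site approaches the uniform 1/4 per nucleotide, as the Jukes-Cantor model prescribes.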
Phylogenetic Trees are Special LTMs
When we focus on one site, phylogenetic trees are special latent tree models:
- The structure is a binary tree
- The variables share the same state space
- Each conditional distribution is characterized by only one parameter, i.e., the length of the corresponding edge
Hidden Markov Models
Hidden Markov models are also special latent tree models:
- All latent variables share the same state space.
- All observed variables share the same state space.
- P(y_t | s_t) and P(s_{t+1} | s_t) are the same for different t's.
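A sketch of this parameter tying: a single transition matrix A and emission matrix B are reused at every time step, and summing out the latent chain yields a distribution over observations. The matrices below are made-up binary examples:

```python
import itertools

# HMM as a latent tree: latent chain s_1 -> s_2 -> ... with one observed
# leaf y_t per latent node; A, B, and pi are shared (tied) across all t.
A  = [[0.9, 0.1], [0.2, 0.8]]   # A[i][j] = P(s_{t+1}=j | s_t=i)
B  = [[0.7, 0.3], [0.1, 0.9]]   # B[i][k] = P(y_t=k | s_t=i)
pi = [0.5, 0.5]                 # P(s_1)

def hmm_joint(states, obs):
    """P(s_1..s_T, y_1..y_T) using the same A and B at every step."""
    p = pi[states[0]] * B[states[0]][obs[0]]
    for t in range(1, len(states)):
        p *= A[states[t - 1]][states[t]] * B[states[t]][obs[t]]
    return p

def obs_likelihood(obs):
    """P(y_1..y_T): sum out the latent chain (brute force; fine for short T)."""
    return sum(hmm_joint(list(s), obs)
               for s in itertools.product(range(2), repeat=len(obs)))
```

A general LTM would allow a different CPT on every edge; the HMM special case reuses one transition and one emission table everywhere.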
Part II: Concept and Properties
Latent Tree Models
- Definition
- Relationship with finite mixture models
- Relationship with phylogenetic trees
- Basic Properties
Two Concepts of Models
So far, a model has consisted of:
- Observed and latent variables
- Connections among the variables
- Probability values
For the rest of Part II, a model consists of:
- Observed and latent variables
- Connections among the variables
- Probability parameters
Model Inclusion
A model m includes another model m' over the same observed variables if every distribution over the observed variables representable by m' is also representable by m.
Model Equivalence
If m includes m' and vice versa, then they are marginally equivalent. If they also have the same number of free parameters, then they are equivalent. It is not possible to distinguish between equivalent models based on data.
Root Walking
Root Walking Example
Root walks to X2; root walks to X3.
Root Walking
Theorem: Root walking leads to equivalent latent tree models. (Zhang, JMLR 2004)
This is a special case of covered arc reversal in general Bayesian networks: Chickering, D. M. (1995). A transformational characterization of equivalent Bayesian network structures. UAI.
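A quick numeric check of the equivalence claim on a single arc (CPT numbers made up): walking the root across the arc Y1 → Y2 amounts to reversing it with Bayes' rule, which leaves the joint distribution, and hence the model's marginal over observed variables, unchanged:

```python
# One arc Y1 -> Y2 of a toy LTM, both variables binary.
p_y1 = [0.6, 0.4]                    # P(Y1)
p_y2_y1 = [[0.7, 0.2], [0.3, 0.8]]   # p_y2_y1[j][i] = P(Y2=j | Y1=i)

# Root walks from Y1 to Y2: compute the reversed parameterization
# P(Y2) and P(Y1 | Y2) via Bayes' rule.
p_y2 = [sum(p_y1[i] * p_y2_y1[j][i] for i in range(2)) for j in range(2)]
p_y1_y2 = [[p_y1[i] * p_y2_y1[j][i] / p_y2[j] for j in range(2)]
           for i in range(2)]        # p_y1_y2[i][j] = P(Y1=i | Y2=j)

# Both rootings define the same joint: P(Y1)P(Y2|Y1) == P(Y2)P(Y1|Y2).
for i in range(2):
    for j in range(2):
        assert abs(p_y1[i] * p_y2_y1[j][i] - p_y2[j] * p_y1_y2[i][j]) < 1e-12
```

The same Bayes-rule reversal applied along a path is what the root-walking operation does edge by edge, which is why the resulting directed models are equivalent.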
Implication
Edge orientations in latent tree models are not identifiable. Technically, it is better to start with an alternative definition of LTM: a latent tree model is a Markov random field over an undirected tree, i.e., a tree-structured Markov network where variables at leaf nodes are observed and variables at internal nodes are hidden.
Implication
For technical convenience, we often root an LTM at one of its latent nodes and regard it as a directed graphical model. Rooting the model at different latent nodes leads to equivalent directed models. This is why we introduced LTMs as directed models.
Regularity
|X|: cardinality of variable X, i.e., its number of states.
We can focus on regular models only:
- Irregular models can be made regular
- Regularized models are better than irregular models
Theorem: The set of all regular models for a given set of observed variables is finite. (Zhang, JMLR 2004)
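A sketch of the cardinality condition behind regularity, paraphrasing Zhang (JMLR 2004): each latent variable Z with neighbors Z1..Zk must satisfy |Z| ≤ ∏ᵢ|Zi| / maxᵢ|Zi|, with strict inequality when Z has exactly two neighbors. The function name and dictionary encoding below are my own:

```python
from math import prod

def is_regular(card, latent_neighbors):
    """Check the regularity cardinality condition for each latent variable.

    card: dict mapping each variable name to its cardinality |X|.
    latent_neighbors: dict mapping each latent variable to the list of its
    neighbors in the undirected tree.
    """
    for z, nbrs in latent_neighbors.items():
        cards = [card[n] for n in nbrs]
        bound = prod(cards) / max(cards)   # prod of neighbor cards / largest
        if len(nbrs) == 2 and not card[z] < bound:
            return False                   # two neighbors: strict inequality
        if not card[z] <= bound:
            return False
    return True

# A binary latent variable with three binary observed neighbors is regular
# (bound 2*2*2/2 = 4); the same variable with only two binary neighbors is
# not (bound 2, and strict inequality is required).
ok  = is_regular({'Y': 2, 'X1': 2, 'X2': 2, 'X3': 2}, {'Y': ['X1', 'X2', 'X3']})
bad = is_regular({'Y': 2, 'X1': 2, 'X2': 2}, {'Y': ['X1', 'X2']})
```

Intuitively, a latent variable whose cardinality exceeds the bound carries more states than the data can ever distinguish, and can be reduced without changing the marginal over the observed variables, which is how irregular models are made regular.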