NEURAL VARIATIONAL IDENTIFICATION AND FILTERING

Henning Lange, Mario Bergés, Zico Kolter

Uploaded 2022-07-28
Presentation Transcript

Slide1

NEURAL VARIATIONAL IDENTIFICATION AND FILTERING

Henning Lange, Mario Bergés, Zico Kolter

Slide2

Variational Filtering

Statistical Inference

(Expectation Maximization, Variational Inference)

Deep Learning

Dynamical Systems

Variational Filtering

Slide3

Variational Filtering

Statistical Inference

(Expectation Maximization, Variational Inference)

Deep Learning

Dynamical Systems

This makes it unsupervised

Slide4

Variational Filtering

Statistical Inference

(Expectation Maximization, Variational Inference)

Deep Learning

Dynamical Systems

This provides the structure

Slide5

Variational Filtering

Statistical Inference

(Expectation Maximization, Variational Inference)

Deep Learning

Dynamical Systems

This is the optimization engine

Slide6

Variational Filtering

Expectation Maximization…

… but with a Neural Network that tells us where to look.

For the statistician:

Slide7

Variational Filtering

Deep Neural Network…

… that learns to perform posterior inference.

For the ML researcher:

Slide8

Variational Filtering

Non-linear Kalman filter…

… that is unbiased* and quite fast to evaluate.

For the dynamical systems guy:

Slide9

Recap

Monte Carlo Integration:

$\mathbb{E}_{p(x)}[f(x)] \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i)$ with $x_i \sim p(x)$

Importance sampling:

$\mathbb{E}_{p(x)}[f(x)] = \mathbb{E}_{q(x)}\!\left[f(x)\,\frac{p(x)}{q(x)}\right] \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i)\,\frac{p(x_i)}{q(x_i)}$ with $x_i \sim q(x)$
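The two recapped estimators can be sketched numerically. This is a minimal example with an assumed integrand $f(x) = x^2$ and assumed Gaussian target and proposal, chosen so the true answer (1.0) is known:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: E_p[f(x)] with p = N(0, 1) and f(x) = x^2 (true value: 1.0).
def f(x):
    return x ** 2

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Plain Monte Carlo integration: sample from p directly and average.
x_p = rng.normal(0.0, 1.0, size=100_000)
mc_estimate = f(x_p).mean()

# Importance sampling: sample from a broader proposal q = N(0, 2)
# and reweight each sample by w(x) = p(x) / q(x).
x_q = rng.normal(0.0, 2.0, size=100_000)
w = normal_pdf(x_q, 0.0, 1.0) / normal_pdf(x_q, 0.0, 2.0)
is_estimate = (w * f(x_q)).mean()

print(mc_estimate, is_estimate)  # both close to 1.0
```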

 

Slide10

Outline

1. Statistics
   Expectation Maximization
   Variational Inference

2. Deep Learning
   Distributions parameterized by Neural Nets

3. Dynamical Systems
   Additional challenges from intractable joint distributions

4. Variance Reduction

Slide11

Expectation Maximization in one slide

EM is a technique to perform maximum-likelihood (ML) inference of parameters $\theta$ in a latent variable model (unsupervised learning).

Latent variable $z$: state of appliances (on/off)

Coordinate ascent on the free energy $\mathcal{F}(q, \theta) = \mathbb{E}_{q(z)}[\log p(x, z \mid \theta)] + H(q)$:

E-Step: set $q(z) = p(z \mid x, \theta)$

M-Step: increase $\mathbb{E}_{q(z)}[\log p(x, z \mid \theta)]$ w.r.t. $\theta$

Neal, Radford M., and Geoffrey E. Hinton. "A view of the EM algorithm that justifies incremental, sparse, and other variants." Learning in Graphical Models (1998).
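The E/M alternation above can be run end-to-end on a hypothetical one-appliance example (not from the slides): the appliance is on ($z=1$) with probability $\pi$ and then draws `mu` watts; observations add unit Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a single on/off appliance: on with probability true_pi,
# drawing true_mu watts; observations are corrupted by N(0, 1) noise.
true_mu, true_pi, sigma = 5.0, 0.3, 1.0
z = rng.random(5000) < true_pi
x = np.where(z, true_mu, 0.0) + rng.normal(0.0, sigma, 5000)

def gauss(v, mean):
    return np.exp(-0.5 * ((v - mean) / sigma) ** 2)

mu, pi = 1.0, 0.5                      # deliberately bad initial guesses
for _ in range(50):
    # E-step: responsibilities r = p(z=1 | x) under current parameters.
    r = pi * gauss(x, mu) / (pi * gauss(x, mu) + (1 - pi) * gauss(x, 0.0))
    # M-step: maximize the expected complete-data log-likelihood.
    mu = (r * x).sum() / r.sum()
    pi = r.mean()

print(round(mu, 2), round(pi, 2))  # recovers roughly 5.0 and 0.3
```

With only one appliance the E-step is exact; the next slides show why this stops scaling.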

Slide12

Example: Non-Intrusive Load Monitoring

$p(z)$ = some prior, e.g. sparsity

Expectation Maximization allows for learning the parameters $\theta$

$\theta$ could constitute reactive/active power of appliances or waveforms

 

 

Slide13

Intractable posterior distributions

EM requires computation of the posterior $p(z \mid x, \theta)$.

For many interesting latent variable models, computing $p(z \mid x, \theta)$ is intractable.

 

 

Slide14

Intractable posterior distributions

For many interesting latent variable models, computing $p(z \mid x, \theta)$ is intractable.

NILM is one of them: the latent domain grows exponentially with the number of appliances.
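The exponential blow-up can be made concrete with an assumed toy NILM model (aggregate power = sum of per-appliance powers + Gaussian noise, σ = 5): exact posterior inference by enumeration costs $O(2^K)$.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)

# Brute-force posterior over K binary appliance states under an assumed
# toy model.  Enumeration is feasible for K = 10 (1024 states) but
# hopeless for a realistic number of appliances.
K = 10
powers = rng.uniform(50.0, 500.0, size=K)          # watts per appliance
z_true = (rng.random(K) < 0.5).astype(float)
x = powers @ z_true + rng.normal(0.0, 5.0)         # observed aggregate

states = np.array(list(itertools.product([0, 1], repeat=K)), dtype=float)
log_lik = -0.5 * ((x - states @ powers) / 5.0) ** 2   # flat prior over states
post = np.exp(log_lik - log_lik.max())
post /= post.sum()

print(states.shape[0])  # 1024 latent configurations for just 10 appliances
```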

 

 

Slide15

Variational Inference in two slides

Expectation Maximization: set $q(z) = p(z \mid x, \theta)$

Variational Inference: set $q(z) = \arg\min_{q \in \mathcal{Q}} \mathrm{KL}\big(q(z) \,\big\|\, p(z \mid x, \theta)\big)$

Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., & Saul, L. K. (1999). An introduction to variational methods for graphical models. Machine Learning, 37(2).

Slide16

Variational Inference in two slides

Variational Inference:

$\log p(x \mid \theta) \ge \mathbb{E}_{q(z)}[\log p(x, z \mid \theta)] + H(q)$

Evidence Lower BOund (ELBO)

Slide17

Variational Inference in two slides

Variational Inference:

 

 

Extract waveforms that best explain data!

Slide18

Variational Inference in two slides

Variational Inference:

 

 

Posterior Inference!
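As a sanity check on the ELBO, here is a hypothetical two-state latent model (not from the slides) where evidence, ELBO, and KL are all exactly computable, verifying $\log p(x) = \mathrm{ELBO} + \mathrm{KL}(q \,\|\, p(z \mid x))$:

```python
import numpy as np

# Two-state latent model, chosen so everything is exactly computable.
p_z = np.array([0.6, 0.4])             # prior p(z)
p_x_given_z = np.array([0.2, 0.9])     # likelihood of the observed x
p_xz = p_z * p_x_given_z               # joint p(x, z)
log_px = np.log(p_xz.sum())            # exact evidence

q = np.array([0.5, 0.5])               # some variational distribution
elbo = (q * (np.log(p_xz) - np.log(q))).sum()
post = p_xz / p_xz.sum()               # exact posterior p(z | x)
kl = (q * (np.log(q) - np.log(post))).sum()

# log p(x) = ELBO + KL(q || p(z|x)), so ELBO <= log p(x).
print(log_px, elbo + kl)
```

Since the KL term is non-negative, maximizing the ELBO over $q$ drives $q$ toward the true posterior.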

Slide19

Connection: Deep Learning

We choose $q(z \mid x)$ to be parameterized by a neural network.
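A minimal sketch of such an amortized posterior (the architecture here is illustrative, not the one from the paper): a small network maps an aggregate power reading $x$ to independent Bernoulli on-probabilities for $K$ appliances.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical inference network for q(z | x).  Weights are random here;
# in practice they would be trained to maximize the ELBO.
K, H = 5, 16
W1, b1 = rng.normal(0.0, 0.1, (1, H)), np.zeros(H)
W2, b2 = rng.normal(0.0, 0.1, (H, K)), np.zeros(K)

def q_params(x):
    h = np.tanh(x @ W1 + b1)                        # hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # sigmoid -> on-probabilities

probs = q_params(np.array([[230.0]]))   # one observation -> K probabilities
print(probs.shape)  # (1, 5)
```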

More detail:

 

Slide20

Connection: Dynamical Systems

Appliances evolve over time

The temporal dynamics are important (ignoring them leads to overfitting).

Slide21

Variational Filtering

Variational Filtering:

 

 

Slide22

Variational Filtering

Variational Filtering:

 

 

 

Slide23

Intractable Joint distribution

When modeling temporal dependencies, even the joint becomes intractable

 

Slide24

Intractable Joint distribution

When modeling temporal dependencies, even the joint becomes intractable

 

Intractable for two reasons!

Slide25

Reason 1: Intractable Joint distribution

When modeling temporal dependencies, even the joint becomes intractable

 

Importance sampling and MC integration!

Slide26

Reason 1: Intractable Joint distribution

When modeling temporal dependencies, even the joint becomes intractable

 

Importance sampling and MC integration!

Slide27

Reason 2: Intractable Joint distribution

When modeling temporal dependencies, even the joint becomes intractable

 

Importance sampling and MC integration!
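When a joint distribution is known only up to a normalizing constant, self-normalized importance sampling still applies: dividing the weights by their sum cancels the unknown constant. A toy demonstration with an assumed unnormalized Gaussian target:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy target: N(2, 1) without its normalizer; proposal: N(0, 9).
def unnorm_p(x):
    return np.exp(-0.5 * (x - 2.0) ** 2)      # missing 1/sqrt(2*pi)

def q_pdf(x):
    return np.exp(-0.5 * (x / 3.0) ** 2) / (3.0 * np.sqrt(2 * np.pi))

x = rng.normal(0.0, 3.0, size=200_000)
w = unnorm_p(x) / q_pdf(x)
w /= w.sum()                                   # self-normalization
est_mean = (w * x).sum()
print(est_mean)  # close to the true mean 2.0
```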

Slide28

Reason 2: Approximating the data likelihood

$p(x \mid \theta) = \int p(x, z \mid \theta)\, dz = \mathbb{E}_{q(z)}\!\left[\frac{p(x, z \mid \theta)}{q(z)}\right]$

Importance sampling and MC integration!

Slide29

Reason 2: Approximating the data likelihood

$p(x \mid \theta) \approx \frac{1}{N} \sum_{i=1}^{N} \frac{p(x, z_i \mid \theta)}{q(z_i)}, \qquad z_i \sim q(z)$
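This likelihood approximation can be checked on a hypothetical model where the evidence is known in closed form: $z \sim \mathcal{N}(0,1)$, $x \mid z \sim \mathcal{N}(z,1)$, so $p(x) = \mathcal{N}(x; 0, 2)$.

```python
import numpy as np

rng = np.random.default_rng(5)

# Estimate p(x) = E_q[ p(x, z) / q(z) ] by averaging importance weights.
x_obs = 1.0
z = rng.normal(0.0, 1.0, size=200_000)         # proposal q = the prior
# With q equal to the prior, p(x, z)/q(z) reduces to the likelihood p(x|z).
w = np.exp(-0.5 * (x_obs - z) ** 2) / np.sqrt(2 * np.pi)
p_x_hat = w.mean()
p_x_true = np.exp(-0.25 * x_obs ** 2) / np.sqrt(4 * np.pi)

print(p_x_hat, p_x_true)  # both about 0.2197
```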

 

 

 

Slide30

Putting the pieces together

 

 

Slide31

Putting the pieces together

 

 

 

This is tractable!

Slide32

Are we done?

Sadly no: the gradient estimator w.r.t. the parameters of $q$ has high variance.

However, there is a remedy.

 

Slide33

VI: Variance

 

Slide34

VI: Variance

$\nabla_\phi\, \mathbb{E}_{q_\phi(z)}[f(z)] = \mathbb{E}_{q_\phi(z)}\big[f(z)\, \nabla_\phi \log q_\phi(z)\big]$

Unbiased but high variance!

Slide35

VI: Variance

More generally, if $b$ is independent of $z$:

$\mathbb{E}_{q_\phi(z)}\big[(f(z) - b)\, \nabla_\phi \log q_\phi(z)\big] = \mathbb{E}_{q_\phi(z)}\big[f(z)\, \nabla_\phi \log q_\phi(z)\big]$

 

Slide36

VI: Variance

More generally, if $b$ is independent of $z$:

$\mathbb{E}_{q_\phi(z)}\big[(f(z) - b)\, \nabla_\phi \log q_\phi(z)\big] = \mathbb{E}_{q_\phi(z)}\big[f(z)\, \nabla_\phi \log q_\phi(z)\big]$

What's an appropriate $b$?

 

Slide37

VI: Variance Reduction

The inability to compute

causes high variance

Why don’t we just use an approximation of

as a control variate?

 

Slide38

Variance reduction

Samples are drawn without replacement from Q

This is not a trivial problem!
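One standard construction for drawing $k$ distinct samples from a discrete $Q$ without replacement is the Gumbel-top-$k$ trick: perturb each log-probability with Gumbel noise and keep the $k$ largest. (Shown here as a general technique; the paper's exact sampling scheme may differ.)

```python
import numpy as np

rng = np.random.default_rng(6)

# Gumbel-top-k: k distinct indices from a discrete distribution Q.
q = np.array([0.5, 0.2, 0.15, 0.1, 0.05])
k = 3
gumbel = -np.log(-np.log(rng.random(q.size)))   # standard Gumbel noise
topk = np.argsort(np.log(q) + gumbel)[-k:]      # k largest perturbed logits
print(sorted(topk.tolist()))  # k distinct indices into Q's support
```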

Slide39

Variance reduction

Samples are drawn without replacement from Q

In order to reduce the variance of the estimator, we subtract

(control variate)

 

 

Slide40

Variance reduction

Samples are drawn without replacement from Q

In order to reduce the variance of the estimator, we subtract

(control variate)
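The effect of subtracting a control variate can be checked numerically on an assumed Bernoulli example: for $q_\theta = \mathrm{Bernoulli}(\theta)$ and $f(z) = z$, the true gradient of $\mathbb{E}_{q_\theta}[f(z)]$ is exactly 1, and a constant baseline leaves the score-function estimator unbiased while shrinking its variance.

```python
import numpy as np

rng = np.random.default_rng(7)

# Score-function gradient samples; the score has zero mean, so
# subtracting a constant b from f changes variance, not the mean.
theta, b = 0.6, 0.5
z = (rng.random(200_000) < theta).astype(float)
score = z / theta - (1.0 - z) / (1.0 - theta)   # d/dtheta log q_theta(z)

plain = z * score              # vanilla REINFORCE samples
with_cv = (z - b) * score      # control-variate / baseline version

print(plain.mean(), with_cv.mean())   # both close to 1.0
print(plain.var(), with_cv.var())     # variance drops with the baseline
```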

 

 

 

Slide41

Variational Filtering: algorithmically

[Animated diagram, built up over Slides 41–48: the filtering step from time t-1 to time t, including the quantity computed at each step.]
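An illustrative sketch of such a per-step filtering loop (not the paper's exact algorithm): a toy linear-Gaussian system, with a hand-picked Gaussian proposal standing in for the learned inference network. Each step proposes latents, weights them by transition × likelihood / proposal, accumulates the evidence estimate, and resamples.

```python
import numpy as np

rng = np.random.default_rng(8)

T, N = 50, 64
z_true, x = np.zeros(T), np.zeros(T)
for t in range(1, T):
    z_true[t] = 0.9 * z_true[t - 1] + rng.normal(0.0, 0.5)   # dynamics
    x[t] = z_true[t] + rng.normal(0.0, 0.3)                  # observation

def log_gauss(v, mu, sigma):
    return -0.5 * ((v - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

particles = np.zeros(N)
log_evidence = 0.0
for t in range(1, T):
    z_new = rng.normal(x[t], 0.4, size=N)             # proposal centered on x_t
    log_w = (log_gauss(z_new, 0.9 * particles, 0.5)   # transition prior
             + log_gauss(x[t], z_new, 0.3)            # likelihood
             - log_gauss(z_new, x[t], 0.4))           # minus proposal density
    m = log_w.max()
    log_evidence += m + np.log(np.exp(log_w - m).mean())  # log-mean-exp
    w = np.exp(log_w - m)
    w /= w.sum()
    particles = z_new[rng.choice(N, size=N, p=w)]     # resample step

per_step = log_evidence / (T - 1)
print(per_step)  # average per-step log-evidence (negative)
```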

 

Slide49

NVIF: Results

Slide50

Performance NVIF