
Visual Dynamics: Probabilistic Future - PowerPoint Presentation

Uploaded On 2017-06-11


Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks. Tianfan Xue*, Jiajun Wu*, Katie Bouman, Bill Freeman (* indicates equal contribution). Task: future frame prediction. ID: 558522




Presentation Transcript

Slide1

Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks

Tianfan Xue*, Jiajun Wu*, Katie Bouman, Bill Freeman

* Indicates equal contribution

Slide2

Task: future frame prediction

[Figure labels: Frame 1, Frame 2, ?]

Slide3

Deterministic predictions fail to model uncertainty

[Figure labels: Frame 1, Deterministic neural network, Frame 2]

Slide4

Deterministic predictions fail to model uncertainty

[Figure labels: Frame 1, Deterministic neural network, Prediction, Reality]

Slide5

 

Sampling a motion field from a prior distribution

Only a few motion fields are consistent with the input image

[Figure labels: Warp, Unrealistic Motion, Realistic Motion]

Slide6

Related work

Deterministic prediction

?

Sample from prior distribution

Motion prediction: [Pintea et al., 2014], [Walker et al. 2015]

Visual feature prediction: [Vondrick et al., 2014]

Future frame synthesis: [Mathieu et al., 2014]

Image prior: [Simoncelli 2001], [Zoran 2012]

Motion prior: [Weiss & Adelson, 1998], [Fleet 2000]

Image synthesis: [Portilla and Simoncelli, 2000], [Kingma and Welling, 2014], [Radford 2015], [Oord 2016]

Probabilistic prediction: [Walker et al., 2016]

Slide7

Related work

Deterministic prediction

?

Sample from prior distribution

Our approach

[Figure labels: Input frame, Sampled future frames]

Slide8

Task: sample future frames consistent with the input

[Figure labels: Input frame, Sampled future frames]

Outline

Main idea

Network structure

What the network learns

Result

Slide9

Segment-based synthesis

[Figure labels: Input frame, ?, Sampled future frame]

Slide10

Segment-based synthesis

[Figure labels: Input frame, Segments, Transformed segments, Sampled future frame]

Slide11

Synthesize using different transformations

[Figure labels: Input frame, Segments, Input random motion vector, Transformed segments, Another sampled future frame]

Slide12

Synthesis network

[Figure labels: Input frame, Input random motion vector, Synthesis network, Sampled future frame]

Slide13

Sample different future frames

[Figure labels: Input frame, Input random motion vector, Synthesis network, Sampled future frame]

Slide14

Sample different future frames

[Figure labels: Input frame, Input random motion vector, Synthesis network, Sampled future frame]

Slide15

Sample different future frames

[Figure labels: Input frame, Input random motion vector, Synthesis network, Sampled future frame]

Slide16

Training

[Figure labels: Input frame, Future frame (ground truth), Encoding network, Motion vector, Synthesis network, Sampled future frame]

Slide17

Training

Training samples (label-free)

[Figure labels: Input frame, Future frame (ground truth), Encoding network, Motion vector, Synthesis network, Future frame (prediction)]

Slide18

Training

Objective function:

Reconstruction loss

[Figure labels: Input frame, Future frame (ground truth), Encoding network, Motion vector, Synthesis network, Future frame (prediction)]

Slide19

Training

Objective function:

KL-divergence loss

Variational Autoencoder [Kingma and Welling, 2014]

[Figure labels: Input frame, Future frame (ground truth), Encoding network, Motion vector, Synthesis network, Future frame (prediction)]

Slide20

Testing

Real output from our network

[Figure labels: Input frame, Encoding network, Future frame (ground truth), Input random motion vector, Synthesis network, Future frame (prediction)]

Slide21
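Slides 16 to 20 describe training with a reconstruction loss plus a KL-divergence loss, in the style of a variational autoencoder, and testing by feeding a random motion vector to the synthesis network. The objective formula itself is not preserved in this transcript, so the following is only a minimal sketch of a standard VAE-style objective. It assumes a diagonal Gaussian posterior over the motion vector, a squared-error reconstruction term, and a standard normal prior. All function names and the `kl_weight` parameter are hypothetical and not taken from the slides.

```python
import math
import random


def kl_to_standard_normal(mean, log_var):
    """KL divergence from N(mean, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mean, log_var)
    )


def reconstruction_loss(predicted, target):
    """Sum of squared differences between two flattened frames."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


def training_objective(predicted, target, mean, log_var, kl_weight=1.0):
    """Reconstruction loss plus weighted KL loss (kl_weight is an assumption)."""
    kl = kl_to_standard_normal(mean, log_var)
    return reconstruction_loss(predicted, target) + kl_weight * kl


def sample_training_vector(mean, log_var, rng):
    """Training: draw z = mean + sigma * eps from the encoder's Gaussian."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, log_var)
    ]


def sample_test_vector(dim, rng):
    """Testing: no encoder output is available, so draw z from N(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
z_train = sample_training_vector([0.0, 0.0], [0.0, 0.0], rng)
z_test = sample_test_vector(2, rng)
# Toy flattened frames: one pixel differs by 2, and the posterior equals the prior.
loss = training_objective([1.0, 2.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

In this toy call the KL term is zero because the posterior matches the prior, so the loss is just the squared pixel error. Drawing several test vectors and passing each to a synthesis network would give several different future frames for the same input frame.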

How do we design the synthesis network?

[Figure labels: Input frame, Input random motion vector, Synthesis network, Future frame]

Slide22

How do we design the synthesis network?

[Figure labels: Input frame, Input random motion vector, Synthesis network, Future frame]

Slide23

Synthesize by transforming segments

[Figure labels: Input frame, Input random motion vector, Synthesis network, Find segments, Transform segments, Future frame]

Slide24

Synthesize by transforming segments

[Figure labels: Input frame, Input random motion vector, Find segments, Image segments, Transform segments, Future frame]

Slide25

Synthesize by transforming segments

[Figure labels: Input frame, Input random motion vector, Find segments, Image segments, Convolution, Transform segments, Future frame]

Slide26

Movement can be synthesized through convolution

Slide27

Movement can be synthesized through convolution

[Figure: two 3x3 convolution kernels]

0 0 0
0 1 0
0 0 0

0 0 1
0 0 0
0 0 0

Slide28
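Slides 26 and 27 state that movement can be synthesized through convolution. As a small illustration, a kernel with a single 1 at its center leaves an image unchanged, while a kernel with the 1 placed off-center translates it. The sketch below assumes the cross-correlation convention common in deep learning libraries and zero padding at the borders. The helper name `correlate_same` is hypothetical.

```python
def correlate_same(image, kernel):
    """2D cross-correlation with zero padding; output has the input's size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    r, c = i + a - ph, j + b - pw
                    # Pixels outside the image count as zero.
                    if 0 <= r < h and 0 <= c < w:
                        total += kernel[a][b] * image[r][c]
            out[i][j] = total
    return out


# A single bright pixel in the center of a 3x3 image.
image = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]

identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

shift = [[0, 0, 1],
         [0, 0, 0],
         [0, 0, 0]]

unchanged = correlate_same(image, identity)
# Under this convention out[i][j] = image[i - 1][j + 1], so the pixel
# moves from (row 1, col 1) to (row 2, col 0).
moved = correlate_same(image, shift)
```

With true convolution (kernel flipped) the same kernel would move the pixel in the opposite direction, so the direction of motion depends on the convention used.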

Transforming segments via cross-convolution

[Figure labels: Input frame, Input random motion vector, Find segments, Image segments, Motion kernels, Convolution, Transform segments, Future frame]

Slide29

Applying motion to each segment

[Figure labels: Input frame, Input random motion vector, Find segments, Segment 1, Motion kernels for segment 1, Convolution, Transform segments, Future frame]

Slide30

Applying motion to each segment

[Figure labels: Input frame, Input random motion vector, Find segments, Segment 2, Motion kernels for segment 2, Convolution, Transform segments, Future frame]

Slide31

Applying motion to each segment

[Figure labels: Input frame, Input random motion vector, Decoding net, Find segments, Segment 3, Motion kernels for segment 3, Convolution, Transform segments, Future frame]

Slide32

Applying motion to each segment

The decoding network generates a motion kernel for each corresponding segment

[Brabandere et al. 2016], [Finn et al. 2016]

[Figure labels: Motion vector, Decoding net, Motion kernels, Image segments]

Slide33
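Slides 28 to 32 describe applying a separate motion kernel to each image segment. The following is a rough sketch of that idea under several simplifying assumptions: each segment is a single-channel map, the kernels are given directly (on the slides they come from the decoding net, which is not modeled here), and the transformed segments are simply summed into one frame. The summation step and all names are hypothetical stand-ins, not the authors' implementation.

```python
def correlate_same(image, kernel):
    """2D cross-correlation with zero padding; output has the input's size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    r, c = i + a - ph, j + b - pw
                    if 0 <= r < h and 0 <= c < w:
                        out[i][j] += kernel[a][b] * image[r][c]
    return out


def cross_convolve(segments, kernels):
    """Apply kernels[k] to segments[k], then sum the transformed segments."""
    if len(segments) != len(kernels):
        raise ValueError("need exactly one kernel per segment")
    h, w = len(segments[0]), len(segments[0][0])
    frame = [[0.0] * w for _ in range(h)]
    for segment, kernel in zip(segments, kernels):
        moved = correlate_same(segment, kernel)
        for i in range(h):
            for j in range(w):
                frame[i][j] += moved[i][j]
    return frame


# Two toy segments, each holding one bright pixel.
segment_1 = [[0, 0, 0],
             [0, 1, 0],
             [0, 0, 0]]
segment_2 = [[1, 0, 0],
             [0, 0, 0],
             [0, 0, 0]]

# out[i][j] = in[i][j + 1]: moves content one column to the left.
shift_left = [[0, 0, 0],
              [0, 0, 1],
              [0, 0, 0]]
# Identity kernel: leaves the segment where it is.
stay = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]

frame = cross_convolve([segment_1, segment_2], [shift_left, stay])
```

Here the first segment moves one pixel to the left while the second stays in place, which shows how per-segment kernels let different parts of a frame move differently.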

Synthesize by transforming segments

[Figure labels: Input frame, Future frame, Encoding network, Motion vector, Decoding net, Motion kernels, Synthesis network, Find segments, Image segments, Convolution, Transform segments, Future frame]

Slide34

What is encoded in the motion vector?

[Figure labels: Input frame, Future frame, Encoding network, Motion vector, Synthesis network, Future frame]

Slide35

What is encoded in the motion vector?

[Figure labels: Input frame, Future frame, Encoding network, Motion vector, Synthesis network, Future frame]

Slide36

Each dimension encodes a type of motion

Upward motion when changing this dimension

[Figure label: Motion vector]

Slide37

Each dimension encodes a type of motion

Leg motion when changing this dimension

[Figure label: Motion vector]

Slide38
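Slides 36 and 37 show that changing a single dimension of the motion vector changes one type of motion. One common way to probe a latent vector like this is to hold all entries fixed and sweep one of them. The sketch below only builds the swept vectors. The function name, the vector length, and the sweep values are hypothetical, and feeding each vector to a synthesis network is left out.

```python
def traverse_dimension(motion_vector, dim, values):
    """Return copies of motion_vector with entry `dim` set to each value."""
    if not 0 <= dim < len(motion_vector):
        raise IndexError("dim is out of range for the motion vector")
    variants = []
    for value in values:
        variant = list(motion_vector)  # copy so the base vector is untouched
        variant[dim] = value
        variants.append(variant)
    return variants


base = [0.0, 0.0, 0.0, 0.0]
variants = traverse_dimension(base, dim=2, values=[-2.0, 0.0, 2.0])
```

Synthesizing one frame per variant and comparing them side by side would show which motion, if any, that dimension controls.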

Results: toy example

[Figure labels: Simulated shapes, Training samples]

Slide39

Network automatically detects segments

[Figure labels: Input, Learned segments, Triangles, Circles]

Slide40

Network learns the correlation between appearance and motion

[Figure labels: Input, Sampled next frame, Ground truth distribution, Sample distribution]

Slide41

Results: real-world images

[Figure labels: Input, Sampled future frames]

Slide42

Challenge: large motion

Artifacts appear when motion is large

[Figure labels: Input, Two sampled future frames]

Slide43

Mechanical Turk study to assess synthesis quality

Percentage labeled as real:

Baseline (Transfer flow): 25.5%

Our method: 31.3%

Ideal synthesis algorithm achieves 50%

Slide44

Contributions

Sample multiple future frames that are consistent with the input

Synthesize frames by transforming segments

Learn a motion representation without supervision

Slide45

Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks

Tianfan Xue*, Jiajun Wu*, Katie Bouman, Bill Freeman

http://visualdynamics.csail.mit.edu