
Neural Adaptive Video Streaming with - PowerPoint Presentation

Uploaded On 2018-01-19



Presentation Transcript

Slide1

Neural Adaptive Video Streaming with Pensieve

Hongzi Mao, Ravi Netravali, Mohammad Alizadeh

Slide2

Users start leaving if video doesn’t play in 2 seconds

https://gigaom.com/2012/11/09/online-viewers-start-leaving-if-video-doesnt-play-in-2-seconds-says-study/

Video: La Luna (Pixar 2011)

Slide3

Dynamic Adaptive Streaming over HTTP (DASH)

Video Client, Video Server
Request: next video chunk at bitrate r
Response: video content

Adaptive Bitrate (ABR) Algorithms: choose the bitrate of each request

Animation borrowed from Te-Yuan Huang (SIGCOMM ‘14): http://conferences.sigcomm.org/sigcomm/2014/doc/slides/38.pdf

[Figure: client playback buffer. Input: 1 sec of video content at the chosen bitrate. Output: 1 sec/sec]
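The playback buffer on this slide can be sketched in a few lines. This is a minimal illustration, assuming each chunk holds a fixed duration of video and that playback drains the buffer at 1 sec/sec while the chunk downloads. The function name and its parameters are hypothetical and do not come from the talk.

```python
def download_chunk(buffer_sec, bitrate_kbps, throughput_kbps, chunk_sec=1.0):
    """Return (new_buffer_sec, rebuffer_sec) after fetching one chunk.

    Assumption: throughput is constant while this chunk downloads.
    """
    chunk_kbits = bitrate_kbps * chunk_sec
    download_sec = chunk_kbits / throughput_kbps
    # Playback stalls for whatever part of the download the buffer cannot cover.
    rebuffer_sec = max(0.0, download_sec - buffer_sec)
    # The buffer drains during the download, then gains one chunk of video.
    new_buffer_sec = max(0.0, buffer_sec - download_sec) + chunk_sec
    return new_buffer_sec, rebuffer_sec
```

Requesting a bitrate above the available throughput makes the download take longer than the chunk plays, so the buffer shrinks and eventually the player stalls.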

Slide4

Why is ABR Challenging?

Network throughput is variable & uncertain
Conflicting QoE goals: bitrate, rebuffering time, smoothness
Cascading effects of decisions

[Figure: throughput and video bitrate (Mbps); buffer size (sec)]

Slide5

Our Contribution: Pensieve

Pensieve learns the ABR algorithm automatically through experience

[Figure: ABR agent. Inputs: buffer, network and video measurements (bandwidth, bit rate). Output bitrates: 240P, 480P, 720P, 1080P]

First network control system using modern “deep” reinforcement learning
Delivers 12-25% better QoE, with 10-30% less rebuffering than previous ABR algorithms
Tailors ABR decisions for different network conditions in a data-driven way

Slide6

Rate-based: pick bitrate based on predicted throughput
FESTIVE [CoNEXT’12], PANDA [JSAC’14], CS2P [SIGCOMM’16]

Buffer-based: pick bitrate based on buffer occupancy
BBA [SIGCOMM’14], BOLA [INFOCOM’16]

Hybrid: use both throughput prediction & buffer occupancy
PBA [HotMobile’15], MPC [SIGCOMM’15]

Simplified, inaccurate models lead to suboptimal performance
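To make the rate-based and buffer-based families concrete, here is a hedged sketch of one fixed rule from each. It does not reproduce any of the cited algorithms. The harmonic-mean predictor, the `reservoir_sec` and `cushion_sec` thresholds, and all function names are illustrative assumptions; the bitrate ladder is the one listed later in the evaluation slide.

```python
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]

def highest_at_most(rate_kbps):
    """Highest ladder bitrate not exceeding rate_kbps (lowest rung as fallback)."""
    feasible = [b for b in BITRATES_KBPS if b <= rate_kbps]
    return feasible[-1] if feasible else BITRATES_KBPS[0]

def rate_based(throughputs_kbps, window=5):
    """Predict throughput as the harmonic mean of recent samples, then fit under it."""
    recent = throughputs_kbps[-window:]
    harmonic_mean = len(recent) / sum(1.0 / t for t in recent)
    return highest_at_most(harmonic_mean)

def buffer_based(buffer_sec, reservoir_sec=5.0, cushion_sec=10.0):
    """Map buffer occupancy linearly onto the ladder between two thresholds."""
    if buffer_sec <= reservoir_sec:
        return BITRATES_KBPS[0]
    if buffer_sec >= reservoir_sec + cushion_sec:
        return BITRATES_KBPS[-1]
    fraction = (buffer_sec - reservoir_sec) / cushion_sec
    target = BITRATES_KBPS[0] + fraction * (BITRATES_KBPS[-1] - BITRATES_KBPS[0])
    return highest_at_most(target)
```

Both rules hard-code a model of the environment (a predictor in one case, fixed thresholds in the other), which is the kind of simplification the slide criticizes.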

Previous Fixed ABR Algorithms

Slide7

Example: Model Predictive Control

maximize QoE(t, t + T) subject to system dynamics

Problem: needs an accurate throughput model

[Figure: throughput and video bitrate (Mbps), and buffer size (sec), from t to t + T; annotation: conservative throughput prediction]
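The optimization on this slide can be sketched as a brute-force lookahead. This is an illustration under stated assumptions, not the MPC implementation from the cited work: throughput is assumed constant at `predicted_kbps` over the horizon, and `chunk_sec`, `rebuf_penalty`, and the QoE weights are hypothetical values chosen for the example.

```python
from itertools import product

BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]

def mpc_choose(buffer_sec, last_kbps, predicted_kbps, horizon=3,
               chunk_sec=4.0, rebuf_penalty=4.3):
    """Return the first bitrate of the plan that maximizes QoE over the horizon."""
    best_qoe, best_first = float("-inf"), BITRATES_KBPS[0]
    for plan in product(BITRATES_KBPS, repeat=horizon):
        buf, prev, qoe = buffer_sec, last_kbps, 0.0
        for kbps in plan:
            # System dynamics: the buffer drains while the chunk downloads.
            download_sec = kbps * chunk_sec / predicted_kbps
            rebuf = max(0.0, download_sec - buf)
            buf = max(0.0, buf - download_sec) + chunk_sec
            # QoE: + bitrate - rebuffering - smoothness (weights are assumptions).
            qoe += kbps / 1000.0 - rebuf_penalty * rebuf - abs(kbps - prev) / 1000.0
            prev = kbps
        if qoe > best_qoe:
            best_qoe, best_first = qoe, plan[0]
    return best_first
```

Every plan is scored against `predicted_kbps`, so an error in that single number propagates into every decision. A controller that guards against this by predicting conservatively gives up bitrate when the network is actually good.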

Solution: learn from video streaming sessions in actual network conditions

Slide8

Reinforcement Learning

Goal: maximize the cumulative reward

[Figure: the Agent observes state from the Environment, takes an action, and receives a reward]

Slide9

Pensieve Design

State, Action, Environment

Reward r_t = + (bitrate) - (rebuffering) - (smoothness)
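The slide gives only the signs of the three reward terms. A minimal per-chunk sketch follows, assuming a linear bitrate utility and treating the two weights as hypothetical tuning parameters rather than values stated in the talk.

```python
def chunk_reward(bitrate_mbps, rebuffer_sec, prev_bitrate_mbps,
                 rebuf_weight=4.3, smooth_weight=1.0):
    """r_t = + (bitrate) - (rebuffering) - (smoothness) for one chunk.

    Assumptions: utility is the bitrate itself, and smoothness is the
    absolute bitrate change from the previous chunk.
    """
    return (bitrate_mbps
            - rebuf_weight * rebuffer_sec
            - smooth_weight * abs(bitrate_mbps - prev_bitrate_mbps))
```

Summing this quantity over all chunks of a session gives the cumulative reward that the agent tries to maximize.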

[Figure: the Agent takes Action a_t, choosing among the bitrates 240P, 360P, 720P, 1080P, and receives a Reward]

Slide10

How to Train the ABR Agent

[Figure: the ABR agent observes state s and feeds it to a neural network, which outputs the policy π_θ(s, a) over the bitrates 240P, 480P, 720P, 1080P; the agent takes action a, the next bitrate]

Parameter θ: estimate from empirical data
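One way to estimate θ from empirical data is a policy-gradient step. The sketch below uses a linear softmax policy and plain REINFORCE, which is a deliberately simple stand-in and should not be read as the network architecture or training algorithm Pensieve actually uses. All names, the learning rate `lr`, and the discount `gamma` are assumptions.

```python
import math
import random

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy(theta, state):
    """pi_theta(s, .): one probability per bitrate; theta has one weight row per action."""
    logits = [sum(w * x for w, x in zip(row, state)) for row in theta]
    return softmax(logits)

def sample_action(theta, state, rng):
    probs = policy(theta, state)
    return rng.choices(range(len(probs)), weights=probs)[0]

def reinforce_update(theta, trajectory, lr=0.01, gamma=0.99):
    """One gradient-ascent step from a trajectory of (state, action, reward)."""
    # Discounted return that follows each step.
    ret, returns = 0.0, []
    for _, _, reward in reversed(trajectory):
        ret = reward + gamma * ret
        returns.append(ret)
    returns.reverse()
    # Accumulate the gradient of sum_t G_t * log pi_theta(a_t | s_t).
    grad = [[0.0] * len(row) for row in theta]
    for (state, action, _), g in zip(trajectory, returns):
        probs = policy(theta, state)
        for a in range(len(theta)):
            coeff = g * ((1.0 if a == action else 0.0) - probs[a])
            for i, x in enumerate(state):
                grad[a][i] += coeff * x
    for a, row in enumerate(theta):
        for i in range(len(row)):
            row[i] += lr * grad[a][i]
    return theta
```

Actions followed by a high return become more probable in the states where they were taken, so the policy improves without any explicit model of the network.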

Training: collect experience data (trajectory of [state, action, reward])

Slide11

What Pensieve is good at

Learn the dynamics directly from experience
Optimize the high-level QoE objective end-to-end
Extract control rules from raw high-dimensional signals

Slide12

Pensieve Training System

Pensieve workers (several in parallel): each runs video playback in a fast chunk-level simulator and sends {state, action, reward} experiences to the master
Pensieve master: performs the model update (TensorFlow) and returns updated neural network parameters to the workers
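The core of a chunk-level simulator is computing how long one chunk takes to download over a recorded throughput trace. The sketch below shows one possible way to do that. It assumes a trace with one throughput sample per second that wraps around at the end; that format, and the function name, are illustrative choices rather than details from the talk.

```python
def download_time(trace_kbps, start_sec, chunk_kbits):
    """Seconds needed to fetch chunk_kbits when the download starts at start_sec.

    Assumption: trace_kbps[i] is the throughput during second i, and the
    trace repeats once it runs out.
    """
    if chunk_kbits <= 0:
        return 0.0
    if max(trace_kbps) <= 0:
        raise ValueError("trace has no capacity")
    t = start_sec
    remaining = chunk_kbits
    while True:
        second = int(t)
        rate = trace_kbps[second % len(trace_kbps)]
        slot_left = (second + 1) - t      # time left in the current second
        can_send = rate * slot_left
        if can_send >= remaining:
            return t + remaining / rate - start_sec
        remaining -= can_send
        t = second + 1                    # jump to the next trace sample
```

Because the simulator advances one chunk at a time instead of modelling packets, a worker can replay many sessions quickly, which is what makes training over a large trace corpus practical.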

Large corpus of network traces: cellular, broadband, synthetic

Slide13

Demo: Pensieve, MPC

[Figure: Pensieve buffer (sec), MPC buffer (sec), and throughput (Mbps); annotations: rebuffering, chances of outage]

Slide14

Trace-driven Evaluation

Dataset: two datasets, each consisting of 1000 traces; each trace is 320 seconds long
Video: 193 seconds, encoded at bitrates {300, 750, 1200, 1850, 2850, 4300} kbps
Video player: Google Chrome browser
Video server: Apache server

[Figures: Norway 3G cellular dataset; FCC broadband dataset; each marked with the "better" direction]

Pensieve improves on the best previous scheme by 12-25% and is within 9-14% of the offline optimal

Slide15

QoE Breakdown

Reward/QoE = + bitrate utility - rebuffering penalty - smoothness penalty

[Figures: breakdown panels, each marked with the "better" direction]

Pensieve reduces rebuffering by 10-32% over the second-best algorithm

Slide16

Does Pensieve Generalize?

[Figure: 3G network trace]

Trace generated from a hidden Markov model
Covers a wide range of average throughput and network variation
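A synthetic trace of this kind can be generated as follows. This is a sketch under assumptions: the hidden states are mean throughput levels, transitions move only to a neighbouring state, and each sample is Gaussian around the current mean. The state means, `stay_prob`, and `noise_frac` are hypothetical parameters, not the ones used to build the talk's dataset.

```python
import random

def hmm_trace(num_seconds, state_means_mbps, stay_prob=0.9,
              noise_frac=0.1, seed=0):
    """Generate one throughput sample (Mbps) per second from a simple HMM."""
    rng = random.Random(seed)
    state = rng.randrange(len(state_means_mbps))
    trace = []
    for _ in range(num_seconds):
        if rng.random() > stay_prob:
            # Hidden-state transition: step to a neighbouring throughput level.
            step = rng.choice([-1, 1])
            state = min(max(state + step, 0), len(state_means_mbps) - 1)
        mean = state_means_mbps[state]
        # Emission: Gaussian noise around the hidden state's mean, clipped at zero.
        sample = rng.gauss(mean, noise_frac * mean)
        trace.append(max(sample, 0.0))
    return trace
```

Varying the state means and `stay_prob` across generated traces is one way to cover a wide range of average throughput and network variation.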

[Figure: synthetic trace]

Slide17

Does Pensieve Generalize?

Train on synthetic traces, then test on real 3G network traces
Only 5% degradation compared with Pensieve trained on real network traces

Slide18

Other Evaluations

Experiments in the wild (LTE, public WiFi, international link)
Controlled experiment for testing optimality
Multi-video extension
Sensitivity analysis

Slide19

Lessons We Learned

1. Build a fast experimentation/simulation platform
   [Figure: Pensieve agent, coarse-grain chunk simulator]
2. Data diversity is more important than “accuracy”
3. Think carefully about controller state space (observation signals)
   Too large a state space ⟶ slow & difficult learning
   Too small a state space ⟶ loss of information
   ⟶ When in doubt, include rather than cut the signal

Slide20

Summary

Pensieve uses reinforcement learning to generate ABR algorithms
Pensieve optimizes for different network conditions through experience
Pensieve outperforms existing approaches across a wide range of network environments and QoE preferences
Policies generated by Pensieve have a strong ability to generalize

http://web.mit.edu/pensieve/