Blazes: coordination analysis for distributed programs

Uploaded On 2016-09-12



Presentation Transcript

Slide1

Blazes: coordination analysis for distributed programs

Peter Alvaro, Neil Conway, Joseph M. Hellerstein (UC Berkeley)

David Maier (Portland State)

Slide2

Distributed systems are hard

Asynchrony

Partial Failure

Slide3

Asynchrony isn’t that hard

Amelioration:

Logical timestamps

Deterministic interleaving

Slide4
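As a rough illustration of the two ameliorations for asynchrony, the sketch below stamps messages with Lamport-style logical timestamps and then interleaves them by (timestamp, sender), so the processing order no longer depends on arrival order. The `Message` struct, the `LamportClock` class, and the `interleave` helper are hypothetical names chosen for this example; none of them comes from the talk.

```ruby
# Hypothetical message record: logical timestamp, sender id, payload.
Message = Struct.new(:ts, :sender, :payload)

# Minimal Lamport clock: tick on local events, advance past received stamps.
class LamportClock
  attr_reader :time

  def initialize
    @time = 0
  end

  # Local event (for example, a send): advance and return the new time.
  def tick
    @time += 1
  end

  # Receive event: jump past the remote timestamp.
  def observe(remote_ts)
    @time = [@time, remote_ts].max + 1
  end
end

# Deterministic interleaving: order by (timestamp, sender), not by arrival.
def interleave(messages)
  messages.sort_by { |m| [m.ts, m.sender] }
end
```

Any two receivers that apply `interleave` to the same set of messages process them in the same order, whatever order the network delivered them in.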

Partial failure isn’t that hard

Amelioration:

Replication

Replay

Slide5

Asynchrony * partial failure is hard²

Logical timestamps

Deterministic interleaving

Replication

Replay

Slide6

Asynchrony * partial failure is hard²

Replication

Replay

Today:

Consistency criteria for fault-tolerant distributed systems

Blazes: analysis and enforcement

Slide7

This talk is all setup

Frame of mind:

    Dataflow: a model of distributed computation
    Anomalies: what can go wrong?
    Remediation strategies
        Component properties
        Delivery mechanisms

Framework:

    Blazes: coordination analysis and synthesis

Slide8

Little boxes: the dataflow model

Generalization of distributed services

Components interact via asynchronous calls (streams)

Slide9

Components

Input interfaces

Output interface

Slide10

Streams

Nondeterministic order

Slide11

Example: a join operator

[Figure: a join operator with streams labeled R, S, and T]

Slide12
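To make the join operator example concrete, here is a minimal sketch of a symmetric hash join over two input streams. It assumes R and S are the inputs and that the output is collected as a set; the `StreamJoin` class and `run_join` helper are hypothetical names for this illustration only.

```ruby
require 'set'

# Symmetric hash join sketch: each arriving tuple is stored, then matched
# against everything seen so far on the other input.
class StreamJoin
  attr_reader :output

  def initialize
    @seen = {
      r: Hash.new { |h, k| h[k] = [] },
      s: Hash.new { |h, k| h[k] = [] }
    }
    @output = Set.new
  end

  # side is :r or :s; each tuple is a key plus a value.
  def insert(side, key, value)
    @seen[side][key] << value
    other = side == :r ? :s : :r
    @seen[other].fetch(key, []).each do |match|
      r_val, s_val = side == :r ? [value, match] : [match, value]
      @output << [key, r_val, s_val]
    end
  end
end

# Replay one arrival order (a list of [side, key, value]) and return the output set.
def run_join(events)
  join = StreamJoin.new
  events.each { |side, key, value| join.insert(side, key, value) }
  join.output
end
```

Because every matching pair is emitted when the later of its two tuples arrives, the contents of the output set are the same for every interleaving of the inputs, even though the order in which results appear is not.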

Example: a key/value store

[Figure: a key/value store with interfaces put, get, and response]

Slide13

Example: a pub/sub service

[Figure: a pub/sub service with interfaces publish, subscribe, and deliver]

Slide14

Logical dataflow

“Software architecture”

[Figure: dataflow graph with components Data source, client, Service X, filter, and cache; streams labeled a, b, and c]

Slide15

Dataflow is compositional

Components are recursively defined

[Figure: dataflow graph with components Data source, client, Service X, filter, and aggregator]

Slide16

Dataflow exhibits self-similarity

Slide17

Dataflow exhibits self-similarity

[Figure: nested dataflow with components DB, HDFS, Hadoop, Index, Combine, Static, HTTP, App1, App2, Buy, and Content; streams labeled User requests, App1 answers, and App2 answers]

Slide18

Physical dataflow

Slide19

Physical dataflow

[Figure: dataflow graph with components Data source, client, Service X, filter, and aggregator; streams labeled a, b, and c]

Slide20

Physical dataflow

[Figure: dataflow graph with components Data source, Service X, filter, aggregator, and client]

“System architecture”

Slide21

What could go wrong?

Slide22

Cross-run nondeterminism

[Figure: Run 1 of the dataflow with components Data source, client, Service X, filter, and aggregator; streams labeled a, b, and c]

Nondeterministic replays

Slide23

Cross-run nondeterminism

[Figure: Run 2 of the dataflow with components Data source, client, Service X, filter, and aggregator; streams labeled a, b, and c]

Nondeterministic replays

Slide24

Cross-instance nondeterminism

[Figure: Data source, Service X, and client]

Transient replica disagreement

Slide25

Divergence

[Figure: Data source, Service X, and client]

Permanent replica disagreement

Slide26

Hazards

[Figure: dataflow graph with components Data source, client, Service X, filter, and aggregator; streams labeled a, b, and c]

Order

Contents?

Slide27

Preventing the anomalies

Understand component semantics

(And disallow certain compositions)

Slide28

Component properties

Convergence

Component replicas receiving the same messages reach the same state

Rules out divergence

Slide29

[Figure: a convergent data structure (e.g., Set CRDT) with Insert and Read interfaces]

Convergence

    Commutativity: tolerant to reordering
    Associativity: tolerant to batching
    Idempotence: tolerant to retry/duplication

Slide30
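The three convergence properties (commutativity, associativity, idempotence) can be checked on a grow-only set, one of the simplest set CRDTs. This is a minimal sketch; the `GSet` class and its method names are assumptions made for illustration.

```ruby
require 'set'

# Grow-only set (G-Set): the only update is insert, and merge is set union.
class GSet
  attr_reader :items

  def initialize(items = [])
    @items = Set.new(items)
  end

  # Returns a new replica state that also contains item.
  def insert(item)
    GSet.new(@items | [item])
  end

  # Merging two replica states is set union.
  def merge(other)
    GSet.new(@items | other.items)
  end

  def ==(other)
    other.is_a?(GSet) && items == other.items
  end
end
```

Since union is commutative, associative, and idempotent, replicas that merge the same updates end in the same state even when the updates are reordered, batched, or delivered more than once.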

Convergence isn’t compositional

[Figure: Data source, client, and a component marked Convergent]

Convergent (identical input contents ⇒ identical state)

Slide31

Component properties

Convergence

    Component replicas receiving the same messages reach the same state
    Rules out divergence

Confluence

    Output streams have deterministic contents
    Rules out all stream anomalies

Confluent ⇒ convergent

Slide32
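The gap between convergence and confluence can be seen in a small simulation. In this sketch, under the assumption that a read emits a snapshot of the current state, two replicas that receive the same messages in different orders reach the same state yet emit different responses. `SetReplica` and `run_replica` are hypothetical names.

```ruby
require 'set'

# Hypothetical replica: inserts add to a set, reads emit a snapshot of it.
class SetReplica
  attr_reader :state, :responses

  def initialize
    @state = Set.new
    @responses = []
  end

  # message is [:insert, value] or [:read, nil]
  def deliver(message)
    kind, value = message
    case kind
    when :insert
      @state << value
    when :read
      @responses << @state.to_a.sort
    end
    self
  end
end

# Replay one delivery order against a fresh replica.
def run_replica(messages)
  messages.reduce(SetReplica.new) { |replica, m| replica.deliver(m) }
end
```

The state is convergent because it depends only on which inserts arrived; the response stream is not confluent because its contents depend on where the read fell among the inserts.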

Confluence

output set = f(input set)

[Figure: two output sets shown to be equal]

Slide33

Confluence is compositional

output set = f(g(input set))

Slide34
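A small sketch of why confluence composes: if each stage's output contents depend only on its input contents, so do the contents of the pipeline's output. The stage functions `g_stream` and `f_stream` and the `pipeline` helper are illustrative assumptions (a filter followed by a map).

```ruby
require 'set'

# Stage g: a filter that forwards even numbers as they arrive.
def g_stream(arrivals)
  arrivals.select(&:even?)
end

# Stage f: a map that scales each element it receives.
def f_stream(arrivals)
  arrivals.map { |x| x * 10 }
end

# Compose the stages and view the result as a set of contents.
def pipeline(arrivals)
  f_stream(g_stream(arrivals)).to_set
end
```

Feeding `pipeline` any permutation of the same inputs changes the order in which results are produced but not the resulting set.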

Preventing the anomalies

Understand component semantics

(And disallow certain compositions)

Constrain message delivery orders

Ordering

Slide35

Ordering – global coordination

[Figure labels: Deterministic outputs; Order-sensitive]

Slide36

Ordering – global coordination

[Figure: Data source and client]

“The first principle of successful scalability is to batter the consistency mechanisms down to a minimum.”

– James Hamilton

Slide37

Preventing the anomalies

Understand component semantics

(And disallow certain compositions)

Constrain message delivery orders

Ordering

Barriers and sealing

Slide38

Barriers – local coordination

[Figure: Data source and client, with labels Order-sensitive and Deterministic outputs]

Slide39

Barriers – local coordination

[Figure: Data source and client]

Slide40
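One way to picture a barrier as local coordination: the order-sensitive component sees nothing until its producer signals that the input is complete, and the buffered input is then released in a canonical order. This is only a sketch under assumptions: the `:done` marker, the sorted release order, and the `FirstWriterWins` component are all invented for the example.

```ruby
# Hypothetical order-sensitive component: keeps the first value it is given.
class FirstWriterWins
  attr_reader :value

  def initialize
    @value = nil
  end

  def write(v)
    @value = v if @value.nil?
  end
end

# Barrier: buffer every message until the producer signals completion
# (assumed here to be a final :done message), then release the buffer
# in sorted order so the result depends only on the input contents.
class Barrier
  def initialize(downstream)
    @downstream = downstream
    @buffer = []
  end

  def deliver(message)
    if message == :done
      @buffer.sort.each { |m| @downstream.write(m) }
      @buffer.clear
    else
      @buffer << message
    end
  end
end

def run_with_barrier(messages)
  component = FirstWriterWins.new
  barrier = Barrier.new(component)
  (messages + [:done]).each { |m| barrier.deliver(m) }
  component.value
end

def run_without_barrier(messages)
  component = FirstWriterWins.new
  messages.each { |m| component.write(m) }
  component.value
end
```

Only the producer and this one consumer take part in the barrier, which is the sense in which it is local rather than global coordination.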

Sealing – continuous barriers

Do partitions of (infinite) input streams “end”?

Can components produce deterministic results given “complete” input partitions?

Sealing: partition barriers for infinite streams

Slide41

Sealing – continuous barriers

Finite partitions of infinite inputs are common

…in distributed systems

    Sessions
    Transactions
    Epochs / views

…and applications

    Auctions
    Chats
    Shopping carts

Slide42
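Sealing can be sketched with the shopping cart case: the stream of items never ends, but each cart does. The example assumes that a `[:seal, cart]` message arrives after every item for that cart; the `CartTotals` class and the message shapes are hypothetical.

```ruby
# Sealing sketch: a per-partition barrier over an unbounded stream.
class CartTotals
  attr_reader :results

  def initialize
    @pending = Hash.new { |h, k| h[k] = [] }
    @results = {}
  end

  # message is [:item, cart, price] or [:seal, cart].
  # Assumes the seal for a cart follows all of that cart's items.
  def deliver(message)
    kind, cart, price = message
    case kind
    when :item
      @pending[cart] << price
    when :seal
      # The partition is complete, so its total can be emitted safely.
      @results[cart] = @pending.delete(cart).to_a.sum
    end
    self
  end
end
```

A total is produced only for sealed carts, and it does not depend on how items from different carts were interleaved; no cart has to wait for any other.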

Blazes: consistency analysis + coordination selection

Slide43

Blazes:

Mode 1: Grey boxes

Slide44

Grey boxes

Example: pub/sub

x = publish; y = subscribe; z = deliver

[Figure: component with inputs x and y and output z]

Deterministic but unordered

Severity  Label    Confluent  Stateless
1         CR       X          X
2         CW       X
3         OR_gate             X
4         OW_gate

x->z: CW

y->z: CW

Slide45

Grey boxes

Example: key/value store

x = put; y = get; z = response

[Figure: component with inputs x and y and output z]

Deterministic but unordered

Severity  Label    Confluent  Stateless
1         CR       X          X
2         CW       X
3         OR_gate             X
4         OW_gate

x->z: OW_key

y->z: OR

Slide46

Label propagation – confluent composition

[Figure: dataflow graph annotated with the labels CW, CR, CR, CR, CR, and CW]

Deterministic outputs

Slide47

Label propagation – unsafe composition

[Figure: dataflow graph annotated with the labels OW, CR, CR, CR, and CR]

Tainted outputs

Interposition point

Slide48

Label propagation – sealing

[Figure: dataflow graph annotated with the labels OW_key, CR, CR, CR, CR, and OW_key; streams annotated Seal(key=x)]

Deterministic outputs

Slide49
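The three label propagation cases (confluent composition, unsafe composition, sealing) can be approximated with a very small check. This is a deliberately simplified sketch and not the inference procedure Blazes uses: it assumes a path is a list of `[label, gate]` pairs and that an order-sensitive label is harmless exactly when its input is sealed on its gate. All names here are hypothetical.

```ruby
# Labels from the grey-box table, ordered by severity.
SEVERITY = { cr: 1, cw: 2, or_gate: 3, ow_gate: 4 }.freeze
CONFLUENT = [:cr, :cw].freeze

# path: list of [label, gate] pairs; gate is nil for confluent labels.
# seals: keys on which the input stream is sealed (assumed notation).
def path_deterministic?(path, seals = [])
  path.all? do |label, gate|
    CONFLUENT.include?(label) || seals.include?(gate)
  end
end

# The most severe label found along a path.
def worst_label(path)
  path.map(&:first).max_by { |label| SEVERITY.fetch(label) }
end
```

Under these assumptions, a path of CW and CR components is deterministic, a path containing an unsealed OW component is tainted, and the same path becomes deterministic again once its input is sealed on the matching key.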

Blazes:

Mode 2: White boxes

Slide50

white boxes

    module KVS
      state do
        interface input, :put, [:key, :val]
        interface input, :get, [:ident, :key]
        interface output, :response, [:response_id, :key, :val]
        table :log, [:key, :val]
      end

      bloom do
        log <+ put
        log <- (put * log).rights(:key => :key)
        response <= (log * get).pairs(:key => :key) do |s,l|
          [l.ident, s.key, s.val]
        end
      end
    end

put → response: OW_key

get → response: OR_key

Negation (⇒ order sensitive)

Partitioned by :key

Slide51
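The order sensitivity of the KVS module can be reproduced with a plain-Ruby stand-in. This sketch does not run Bloom and ignores its timestep semantics; it only assumes that a put replaces any earlier value for its key and that a get reads whatever value is current. `run_kvs` and the message shapes are hypothetical.

```ruby
# Replay a sequence of [:put, key, val] and [:get, ident, key] messages.
# Assumption: a put overwrites the key; a get answers from the current log.
def run_kvs(messages)
  log = {}
  responses = []
  messages.each do |kind, a, b|
    case kind
    when :put
      log[a] = b
    when :get
      responses << [a, b, log[b]] if log.key?(b)
    end
  end
  { log: log, responses: responses }
end
```

Two puts to the same key leave different final states depending on their order, while puts to different keys commute, which matches the annotation that the order sensitivity is partitioned by `:key`.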

white boxes

    module PubSub
      state do
        interface input, :publish, [:key, :val]
        interface input, :subscribe, [:ident, :key]
        interface output, :response, [:response_id, :key, :val]
        table :log, [:key, :val]
        table :sub_log, [:ident, :key]
      end

      bloom do
        log <= publish
        sub_log <= subscribe
        response <= (log * sub_log).pairs(:key => :key) do |s,l|
          [l.ident, s.key, s.val]
        end
      end
    end

publish → response: CW

subscribe → response: CR

Slide52
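For contrast with the key/value store, a plain-Ruby stand-in for the PubSub module shows confluent behavior: both tables only grow, and the response is a join over them. As before, this is a sketch that ignores Bloom's timestep semantics; `pubsub_responses` and the message shapes are hypothetical.

```ruby
require 'set'

# Replay [:publish, key, val] and [:subscribe, ident, key] messages and
# return the set of [ident, key, val] responses produced along the way.
def pubsub_responses(events)
  log = Set.new      # accumulated [key, val] publications
  sub_log = Set.new  # accumulated [ident, key] subscriptions
  responses = Set.new
  events.each do |kind, a, b|
    case kind
    when :publish
      log << [a, b]
    when :subscribe
      sub_log << [a, b]
    end
    # Join the two tables on key, as the Bloom rule does.
    log.each do |key, val|
      sub_log.each do |ident, sub_key|
        responses << [ident, key, val] if key == sub_key
      end
    end
  end
  responses
end
```

Because nothing is ever deleted, every interleaving of the same publishes and subscribes yields the same set of responses.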

The Blazes frame of mind:

Asynchronous dataflow model

Focus on consistency of data in motion

Component semantics

Delivery mechanisms and costs

Automatic, minimal coordination

Slide53

Queries?

Slide54