Clustering in OpenDaylight
Presentation Transcript

Slide 1

Clustering in OpenDaylight

Colin Dixon
Technical Steering Committee Chair, OpenDaylight
Distinguished Engineer, Brocade

Borrowed ideas and content from Jan Medved, Moiz Raja, and Tom Pantelis

Slide 2

Multi-Protocol SDN Controller Architecture

[Architecture diagram: southbound protocol plugins/adapters (OpenFlow, OVSDB, Netconf client, ...) connect the controller core (Service Adaptation Layer / SAL) to the network devices; applications run on the core; northbound interfaces (Netconf server, RESTCONF, REST) expose the controller to OSS/BSS and external apps.]

Slide 3

Model-Driven SAL (MD-SAL)

[Architecture diagram: in the software architecture, MD-SAL is the "kernel", providing a shared namespace with a Data Store, RPCs, Notifications, and Data Change Notifications; protocol plugins (e.g., a Netconf client) face the network devices, while apps/services (Netconf server, RESTCONF, REST applications, OSS/BSS, external apps) plug in on top.]
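To make the data store's role concrete, here is a minimal sketch of an application writing into MD-SAL through the binding-aware DataBroker; package names assume the Beryllium-era org.opendaylight.controller.md.sal APIs (later releases moved them under org.opendaylight.mdsal), and the generic DataObject parameter stands in for any YANG-generated type.

// Minimal sketch (not from the slides): writing to the MD-SAL data store
// via the binding-aware DataBroker. Beryllium-era package names assumed.
import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.md.sal.binding.api.WriteTransaction;
import org.opendaylight.controller.md.sal.common.api.data.LogicalDatastoreType;
import org.opendaylight.yangtools.yang.binding.DataObject;
import org.opendaylight.yangtools.yang.binding.InstanceIdentifier;

public final class MdsalWriteSketch {
    private MdsalWriteSketch() {
    }

    // Writes 'data' at 'path' in the operational data store. The path is
    // resolved to the shard owning that subtree, and the write is ordered
    // by that shard's leader (see the sharding and replication slides).
    public static <T extends DataObject> void writeOperational(
            DataBroker broker, InstanceIdentifier<T> path, T data) {
        WriteTransaction tx = broker.newWriteOnlyTransaction();
        tx.put(LogicalDatastoreType.OPERATIONAL, path, data);
        tx.submit(); // asynchronous; returns a future the caller can check
    }
}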

Slide 4

Data Store Sharding

Select data subtrees
Currently, can only pick a subtree directly under the root
Working on subtrees at arbitrary levels
Map subtrees onto shards
Map shards onto nodes

[Diagram: the data tree root is divided into subtrees that map onto shards (Shard1.1, Shard1.2, Shard1.3, Shard2.1, ..., ShardN.1); a shard layout algorithm then places the shards on M nodes (Node1, Node2, ..., NodeM). Naming: ShardX.Y means shard Y within service X.]
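As a toy illustration of the two mappings on this slide (subtrees onto shards, shards onto nodes), and not the actual OpenDaylight configuration mechanism, a shard layout can be pictured as two lookup tables; the subtree and node names below are purely illustrative.

// Toy illustration only: subtree-to-shard and shard-to-node mappings as
// plain maps. OpenDaylight's real layout comes from its clustering
// configuration, not from code like this.
import java.util.List;
import java.util.Map;

public final class ShardLayoutSketch {
    public static void main(String[] args) {
        // Map subtrees (children of the data tree root) onto shards.
        Map<String, String> subtreeToShard = Map.of(
                "/inventory", "Shard1.1",
                "/topology", "Shard1.2",
                "/flows", "Shard2.1");

        // Map shards onto the cluster nodes holding their replicas;
        // picture the first member listed as the leader.
        Map<String, List<String>> shardToNodes = Map.of(
                "Shard1.1", List.of("Node1", "Node2", "Node3"),
                "Shard1.2", List.of("Node2", "Node3", "Node1"),
                "Shard2.1", List.of("Node3", "Node1", "Node2"));

        String shard = subtreeToShard.get("/inventory");
        System.out.println("/inventory -> " + shard + " on " + shardToNodes.get(shard));
    }
}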

Slide 5

Shard Replication

Replication using Raft [1]
Provides strong consistency
Transactional data access: snapshot reads, snapshot writes, read/write transactions
Tolerates f failures with 2f+1 nodes (3 nodes => 1 failure, 5 => 2, etc.)
Leaders handle all writes, sending them to followers before committing
Leaders are distributed across nodes to spread load

[Diagram: three nodes, each holding a mix of leader (L) and follower (F) shard replicas.]

[1] https://raftconsensus.github.io/
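A quick worked check of the fault-tolerance claim: a Raft commit needs a strict majority of replicas, so with n = 2f+1 replicas a majority of f+1 survives the loss of any f nodes. The small sketch below is illustrative only.

// Illustrative only: how many replica failures a Raft-replicated shard
// tolerates, given that commits need a strict majority of replicas.
public final class QuorumSketch {
    static int majority(int replicas) {
        return replicas / 2 + 1;              // e.g., 3 -> 2, 5 -> 3
    }

    static int failuresTolerated(int replicas) {
        return replicas - majority(replicas); // (2f+1) - (f+1) = f
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5, 7}) {
            System.out.println(n + " nodes -> tolerates "
                    + failuresTolerated(n) + " failure(s)");
        }
    }
}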

Slide 6

Strong Consistency / Serializability
Everyone always reads the most recent write. Loosely, "everyone is at the same point on the same timeline."

Causal Consistency
Loosely, "you won't see the result of any event without seeing everything that could have caused that event." [Typically in the confines of reads/writes to a single datastore.]

Eventual Consistency
Loosely, "everyone will eventually see the same events in some order, but not necessarily the same order." [Eventual in the same way that your kid will "eventually" clean their room.]

Slide 7

Why strong consistency matters

A flapping port generates events: port up, port down, port up, port down, port up, port down, ...
Is the port up or down? If they're not ordered, you have no idea.
...sure, but these are events from a single device, so you can keep them ordered easily
More complex examples: switches (or ports on those switches) attached to different nodes go up and down
If you don't have ordering, different nodes will now come to different conclusions about reachability, paths, etc.

Slide 8

Why everyone doesn't use it

Strong consistency can't work in the face of partitions
If your network splits in two, either: one side stops, or you lose strong consistency
Strong consistency requires coordination
Effectively, you need some single entity to order everything
This has performance costs, i.e., a single strongly consistent data store is limited by the performance of a single node
Question is: do we start with strong and relax it, or start weak and strengthen it?
OpenDaylight has started strong

Slide 9

Service/Application Model

Logically, each service or application (code) has a primary subtree (YANG model) and shard it is associated with (data)
One instance of the code is co-located with each replica of the data
All instances are stateless; all state is stored in the data store
The instance that is co-located with the shard leader handles writes

[Diagram: three nodes (Node1, Node2, Node3), each running Service1, Service2, and Service3 on top of the Data Store API and each holding replicas of shards S1.1, S1.2, S2.9, S3.1, and S3.7; for each shard SX.Y, one node holds the leader replica and the others hold follower replicas.]

Slide 10

Service/Application Model (cont'd)

Entity Ownership Service allows related tasks to be co-located
e.g., the tasks related to a given OpenFlow switch should happen where it's connected
also handles HA/failover, with automatic election of a new entity owner
RPCs and Notifications are directed to the entity owner
New cluster-aware data change listeners provide integration into the data store

[Diagram: the same three-node layout as the previous slide, with Service1, Service2, and Service3 on each node above the Data Store API and leader/follower replicas of shards S1.1, S1.2, S2.9, S3.1, and S3.7.]
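A hedged sketch of how a service might use the Entity Ownership Service: it registers as a candidate owner for an entity (say, an OpenFlow switch) and reacts when it becomes, or stops being, the owner. Package and type names assume the Beryllium-era binding API; the entity type and id strings are illustrative, not taken from the slides.

// Hedged sketch: registering for ownership of an entity and reacting to
// ownership changes. Beryllium-era API names are assumed and may differ
// in later releases.
import org.opendaylight.controller.md.sal.common.api.clustering.CandidateAlreadyRegisteredException;
import org.opendaylight.controller.md.sal.common.api.clustering.Entity;
import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipChange;
import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipListener;
import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipService;

public class SwitchOwnershipHandler implements EntityOwnershipListener {
    // "openflow" / "openflow:1" are illustrative entity type and id strings.
    private static final String ENTITY_TYPE = "openflow";

    public void register(EntityOwnershipService ownershipService)
            throws CandidateAlreadyRegisteredException {
        // Listen for ownership changes of all entities of this type, then
        // volunteer this node as a candidate owner for one switch.
        ownershipService.registerListener(ENTITY_TYPE, this);
        ownershipService.registerCandidate(new Entity(ENTITY_TYPE, "openflow:1"));
    }

    @Override
    public void ownershipChanged(EntityOwnershipChange change) {
        if (change.isOwner()) {
            // This node now owns the entity: run the switch-related tasks here.
        } else if (change.wasOwner()) {
            // Ownership moved to another node: stop local processing.
        }
    }
}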

Slide 11

Handling RPCs and Notifications in a Cluster

Data Change Notifications: flexibly delivered to the shard leader, or any subset of nodes
YANG-modeled Notifications: delivered to the node on which they were generated; typically guided to the entity owner
Global RPCs: delivered to the node where called
Routed RPCs: delivered to the node which registered to handle them

[Diagram: three nodes holding leader (L) and follower (F) shard replicas.]

Slide 12

Service/Shard Interactions

Service-x "resolves" a read or write to a subtree/shard
Reads are sent to the leader (working on allowing for local reads)
Writes are sent to the leader to be ordered
Notifications of changed data are sent to the shard leader, and to anyone registered for remote notification

[Diagram: numbered message flow across three nodes with leader/follower replicas of shards S1.1, S1.2, S2.9, S3.1, and S3.7 behind the Data Store API: Service-x resolves the target shard, its reads and writes are forwarded to that shard's leader replica, and change notifications are delivered to Service-y and other registered listeners.]
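From the application's point of view, the resolve-and-forward steps above are hidden behind an ordinary read transaction; a minimal sketch, again assuming the Beryllium-era binding API, looks like this.

// Minimal sketch, not from the slides: a read against the MD-SAL data
// store. The store resolves the path to a shard and forwards the read to
// that shard's leader. Beryllium-era package names assumed.
import com.google.common.base.Optional;
import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.md.sal.binding.api.ReadOnlyTransaction;
import org.opendaylight.controller.md.sal.common.api.data.LogicalDatastoreType;
import org.opendaylight.controller.md.sal.common.api.data.ReadFailedException;
import org.opendaylight.yangtools.yang.binding.DataObject;
import org.opendaylight.yangtools.yang.binding.InstanceIdentifier;

public final class MdsalReadSketch {
    private MdsalReadSketch() {
    }

    // Returns the data at 'path' from the operational store, or null if absent.
    public static <T extends DataObject> T readOperational(
            DataBroker broker, InstanceIdentifier<T> path) throws ReadFailedException {
        ReadOnlyTransaction tx = broker.newReadOnlyTransaction();
        Optional<T> result = tx.read(LogicalDatastoreType.OPERATIONAL, path).checkedGet();
        tx.close();
        return result.orNull();
    }
}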

Slide 13

Major additions in Beryllium

Entity Ownership Service: EntityOwnershipService
Clustered Data Change Listeners: ClusteredDataChangeListener and ClusteredDataTreeChangeListener
Significant application/plugin adoption: OVSDB, OpenFlow, NETCONF, Neutron, etc.
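A hedged sketch of one of the Beryllium additions named above: implementing ClusteredDataTreeChangeListener so that data tree changes are delivered to this instance on every node where it registers, not only where the shard leader lives. Beryllium-era binding package names are assumed, and the DataObject type parameter stands in for any YANG-generated node.

// Hedged sketch, not from the slides: a cluster-aware data tree change
// listener. Registering a ClusteredDataTreeChangeListener (rather than a
// plain DataTreeChangeListener) asks the data store to notify this node
// even when it is not the shard leader. Beryllium-era package names assumed.
import java.util.Collection;
import org.opendaylight.controller.md.sal.binding.api.ClusteredDataTreeChangeListener;
import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.md.sal.binding.api.DataTreeIdentifier;
import org.opendaylight.controller.md.sal.binding.api.DataTreeModification;
import org.opendaylight.controller.md.sal.common.api.data.LogicalDatastoreType;
import org.opendaylight.yangtools.concepts.ListenerRegistration;
import org.opendaylight.yangtools.yang.binding.DataObject;
import org.opendaylight.yangtools.yang.binding.InstanceIdentifier;

public class ClusteredListenerSketch<T extends DataObject>
        implements ClusteredDataTreeChangeListener<T> {

    public ListenerRegistration<?> register(DataBroker broker, InstanceIdentifier<T> path) {
        // Watch the given subtree of the operational store on every node.
        return broker.registerDataTreeChangeListener(
                new DataTreeIdentifier<>(LogicalDatastoreType.OPERATIONAL, path), this);
    }

    @Override
    public void onDataTreeChanged(Collection<DataTreeModification<T>> changes) {
        // React to the batched changes; runs wherever this listener registered.
    }
}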

Slide 14

Work in progress

Dynamic, multi-level sharding
Multi-level, e.g., OpenFlow should be able to say its subtrees start at the "switch" node
Dynamic, e.g., an OpenFlow subtree should be moved if the connection moves
Improve performance, scale, stability, etc. as always
Faster, but "stale", reads from the local replica vs. always reading from the leader
Pipelined transactions for better cluster write throughput
Whatever else you're interested in helping with

Slide 15

Longer term things

Helper code for building common app patterns
run once in the cluster and fail over if that node goes down
run everywhere and handle things
Different consistency models
Giving people the knob is easy; dealing with the ramifications is hard
Federated/hierarchical clustering