SDN Controller Challenges


yoshiko-marsland
Uploaded On 2017-11-06



Presentation Transcript

Slide1

SDN Controller Challenges

Slide2

The Story Thus Far

SDN: centralize the network’s control plane

The controller is effectively the brain of the network

The controller determines what to do and tells the switches how to do it

Slide3

The Story Thus Far

Slide4

The Story Thus Far

Something happened!

Slide5

The Story Thus Far

Let’s ask the brain!

Slide6

The Story Thus Far

Think about what happened…

Maybe come up with a solution

Slide7

The Story Thus Far

The controller runs a control function

The control function creates switch state

F(global network state) → switch state

The global network state can be a graph of the network
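As a rough illustration of a control function F that maps a graph of the network to switch state, the sketch below computes a next-hop table for every switch using breadth-first search. The function name, the dictionary-based graph, and the three-switch topology are all assumptions made for this example; a real controller application would use its own data model.

```python
from collections import deque


def control_function(graph):
    """Map a global network view to per-switch state.

    graph: dict mapping each switch to a list of its neighbors.
    Returns {switch: {destination: next_hop}} along shortest hop-count paths.
    """
    switch_state = {}
    for src in graph:
        first_hop = {}      # destination -> neighbor of src to forward to
        visited = {src}
        queue = deque()
        # Seed the search with src's direct neighbors.
        for nbr in graph[src]:
            if nbr not in visited:
                visited.add(nbr)
                first_hop[nbr] = nbr
                queue.append(nbr)
        # Breadth-first search, inheriting the first hop from the parent.
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in visited:
                    visited.add(nbr)
                    first_hop[nbr] = first_hop[node]
                    queue.append(nbr)
        switch_state[src] = first_hop
    return switch_state


# Hypothetical line topology: s1 - s2 - s3
topology = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
state = control_function(topology)
```

Here `state["s1"]` tells switch s1 to reach both s2 and s3 through s2, which is the kind of per-switch output the controller would then push to the network.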

Tell the network what to do

Slide8

Challenges with Centralization

Single point of failure

Fault tolerance

Performance bottleneck

Scalability

Efficiency (switch-controller latency)

Single point for security violations

Slide9

Motivation for Distributed Controllers

Wide-area network

Wide distribution of switches: from the USA to Australia

High latency between a single controller and all switches

Application + network growth

Higher CPU load for the controller

More memory for storing FIB entries and calculations

High availability

Slide10

Class Outline

Fault Tolerance

Google’s B4 paper

Controller Scalability

Ways to scale the controller

Distributed controllers: mesh versus hierarchy

Implications of controller placement

Slide11

Fault Tolerance

Slide12

Google’s B4 Network

Provides connectivity between DC sites

Uses SDN to control edge switches

Goal: high utilization of links

Insight: fine-grained control over the edge and the network can lead to higher utilization

Distributed controllers

One set of controllers for each data center (site)

Slide13

Google’s B4 Network

Provides connectivity between DC sites

Uses SDN to control edge switches

Goal: high utilization of links

Distributed controllers

One set of controllers for each data center (site)

Slide14

Fault Tolerance in B4

Each site runs a set of controllers

Paxos is run between the controllers in a site to determine the master

Slide15

Quick Overview of Paxos

Given N controllers

1 acts as leader, and N-1 as workers

All N controllers maintain the same state

Switches interact with the leader

A change doesn't happen until the group agrees (in Paxos, a majority of it)

Failure of the primary

The remaining N-1 work together to elect a new leader
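The leader/worker behavior above can be sketched in a few lines. This is not Paxos itself: it leaves out proposal numbers, the two protocol phases, and catch-up for recovering replicas. It only shows a change being applied when a majority is reachable and a new leader being chosen after a failure. The class names and the "lowest id wins" election rule are assumptions made for the example.

```python
class Replica:
    def __init__(self, rid):
        self.rid = rid
        self.alive = True
        self.log = []  # state changes this replica has applied, in order


class ControllerGroup:
    def __init__(self, n):
        self.replicas = [Replica(i) for i in range(n)]
        self.leader = self.replicas[0]

    def propose(self, change):
        """Apply a change only if a majority of the group can acknowledge it."""
        live = [r for r in self.replicas if r.alive]
        if len(live) < len(self.replicas) // 2 + 1:
            return False  # no majority: the change does not happen
        if not self.leader.alive:
            # Toy election rule (assumption): lowest-numbered live replica wins.
            self.leader = min(live, key=lambda r: r.rid)
        for r in live:
            r.log.append(change)
        return True


group = ControllerGroup(3)
ok_a = group.propose("install rule A")
group.replicas[0].alive = False          # the leader fails
ok_b = group.propose("install rule B")   # survivors elect replica 1
group.replicas[1].alive = False          # only 1 of 3 replicas left
ok_c = group.propose("install rule C")   # rejected: no majority
```

The last call illustrates the cost side of the design: once fewer than a majority of controllers remain, the group stops accepting changes rather than risk diverging state.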

[Figure labels: network events; propagate state changes]

Slide16

Pros-Cons of Paxos

Pros

Well understood and studied; gives good fault tolerance

Many implementations in the wild

E.g. ZooKeeper

Cons

Time to recover

Impacts the throughput of the entire system

Slide17

Controller Scalability

Slide18

What limits a controller’s scalability?

Number of control messages from switches

Depends on the application logic

E.g. MicroTE/Hedera periodically query all switches for stats

A reactive controller, as evaluated in NOX, requires each switch to send a message for every new flow
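A back-of-the-envelope estimate shows how application logic drives message load. The sketch below counts only packet-in messages for reactive apps and stats replies for polling apps; the function name and all input numbers are hypothetical, and flow time-out notifications are left out for brevity.

```python
def control_messages_per_second(num_switches, new_flows_per_switch_per_s,
                                poll_interval_s, reactive=True):
    """Rough count of switch-to-controller messages per second."""
    # Reactive apps: one packet-in for each new flow seen at each switch.
    packet_ins = num_switches * new_flows_per_switch_per_s if reactive else 0
    # Polling apps: one stats reply per switch per polling interval.
    stats_replies = num_switches / poll_interval_s
    return packet_ins + stats_replies


# Hypothetical network: 100 switches, 50 new flows/s each, stats polled every 5 s.
reactive_load = control_messages_per_second(100, 50, 5)
proactive_load = control_messages_per_second(100, 50, 5, reactive=False)
```

With these assumed inputs the reactive design produces 5020 messages per second against 20 for polling alone, which is why the per-flow packet-in path tends to dominate.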

Message types: packet-in (for reactive apps), flow stats, flow time-outs

Slide19

What limits a controller’s scalability?

Application processing overhead

The controller runs a set of applications

Similar to a server running a set of programs

CPU/memory constraints limit how the apps run

Slide20

What limits a controller’s scalability?

Distance between controller and the switches
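To put a number on the distance concern, the sketch below computes propagation-only round-trip time over fiber. The signal speed of about 2 x 10^8 m/s is roughly two thirds of the speed of light in vacuum; the 12,000 km path length is an assumed round figure for an intercontinental link, and queuing and processing delays are ignored.

```python
FIBER_SPEED_M_PER_S = 2.0e8  # roughly two thirds of the speed of light in vacuum


def round_trip_ms(distance_km):
    """Propagation-only round-trip time over fiber, in milliseconds."""
    one_way_s = (distance_km * 1000.0) / FIBER_SPEED_M_PER_S
    return 2 * one_way_s * 1000.0


same_site_rtt = round_trip_ms(1)            # controller next to the switch
intercontinental_rtt = round_trip_ms(12000) # assumed long-haul path
```

Under these assumptions a reactive flow setup across the long path waits about 120 ms for the controller, compared with about 0.01 ms when the controller sits in the same site.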

[Figure: Controller 1 running the Hedera, L3, and FW applications]

Slide21

How to Scale the Controller.

Obvious: add more controllers.

BUT: how about the applications?

Synchronization/concurrency problems. Who controls which switch?

Who reacts to which events?

[Figure: Controllers 1 to N, each running Hedera, L3, and FW. Labels: "?", "Stats + Install OF entries"]

Slide22

Medium Sized Networks

Assumption:

A controller can't store all forwarding table entries in memory

But it can process all events and run all apps

Each controller

Gets the same network events + runs the same apps → same output

But stores the output for only a fraction of the network and configures only that fraction
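One way to picture "same output, but keep only a fraction" is a deterministic ownership rule that every controller evaluates identically. The sketch below uses a CRC32 hash of the switch id as that rule; the hashing scheme, the function names, and the stand-in application are assumptions for illustration, not something the slides specify.

```python
import zlib


def owner(switch_id, num_controllers):
    """Deterministically assign a switch to one controller (assumed scheme)."""
    return zlib.crc32(switch_id.encode()) % num_controllers


def compute_all_entries(events):
    """Stand-in for the shared application: same events in, same output out."""
    # Each event is (switch_id, match, action); a real app would do more work.
    return sorted(events)


def entries_for(controller_id, num_controllers, events):
    """Run the full app, but keep only entries for switches this controller owns."""
    return [e for e in compute_all_entries(events)
            if owner(e[0], num_controllers) == controller_id]


events = [("s1", "dst=10.0.0.1", "out:1"),
          ("s2", "dst=10.0.0.1", "out:2"),
          ("s3", "dst=10.0.0.2", "out:1"),
          ("s4", "dst=10.0.0.2", "out:3")]
shares = [entries_for(c, 3, events) for c in range(3)]
```

Because every controller applies the same rule to the same output, the shares are disjoint and together cover all entries without any coordination between controllers.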

[Figure: Controllers 1 to N, each running Hedera, L3, and FW. Label: "Stats + Install OF entries"]

Slide23

Medium Sized Networks: HyperFlow

Each controller

Pushes its state to every other controller

Each controller thinks it's the only one in the network
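The state-pushing idea can be sketched with a tiny in-process publish-subscribe bus: each controller publishes events from its own switches, and every controller replays all published events into its own copy of the network view. The `EventBus` and `Controller` classes are hypothetical stand-ins, not HyperFlow's actual interfaces, and a real deployment would use a distributed channel rather than direct function calls.

```python
class EventBus:
    """In-process stand-in for a publish-subscribe system."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        for callback in self.subscribers:
            callback(event)


class Controller:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.view = {}  # this controller's copy of the network-wide state
        bus.subscribe(self.replay)

    def on_local_event(self, key, value):
        """An event from one of this controller's own switches."""
        self.bus.publish((key, value))

    def replay(self, event):
        """Apply a published event, whichever controller it came from."""
        key, value = event
        self.view[key] = value


bus = EventBus()
c1 = Controller("c1", bus)
c2 = Controller("c2", bus)
c1.on_local_event("link s1-s2", "up")
c2.on_local_event("host h1", "s3 port 1")
```

After both events, `c1.view` and `c2.view` are identical, so an application on either controller can behave as if it were running on the only controller in the network.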

[Figure: Controllers 1 to N, each running Hedera, L3, and FW, connected by a publish-subscribe system. Label: "Stats + Install OF entries"]

Slide24

Large Sized Networks

Assumptions

Each controller can’t store all the FIB entries

Each controller can’t run the entire application or handle all events

Need to partition the application

But how?

Slide25

Application partition 1

Approach 1: each controller runs a specific application

How do you resolve conflicts in FW entries?

Apps can conflict in the rules they install
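A minimal sketch of the conflict problem: if each app proposes rules independently, two apps can propose different actions for the same match on the same switch. The code below detects only exact-match collisions (overlapping wildcard matches are ignored), and the priority list used to pick a winner is an assumed policy, not something the slides prescribe.

```python
from collections import defaultdict


def find_conflicts(rules):
    """rules: list of (app, switch, match, action) tuples.

    Returns [(switch, match, {app: action})] where apps disagree on the action.
    """
    by_key = defaultdict(dict)
    for app, switch, match, action in rules:
        by_key[(switch, match)][app] = action
    conflicts = []
    for (switch, match), actions in sorted(by_key.items()):
        if len(set(actions.values())) > 1:
            conflicts.append((switch, match, actions))
    return conflicts


def resolve(conflict, priority):
    """Pick the action of the highest-priority app (assumed policy)."""
    _, _, actions = conflict
    winner = min(actions, key=priority.index)
    return actions[winner]


rules = [("Hedera", "s1", "dst=10.0.0.2", "out:3"),
         ("FW", "s1", "dst=10.0.0.2", "drop"),
         ("L3", "s2", "dst=10.0.0.2", "out:1")]
conflicts = find_conflicts(rules)
chosen = resolve(conflicts[0], priority=["FW", "Hedera", "L3"])
```

With the firewall given top priority, the colliding entry on s1 resolves to `drop`. Note that detecting the collision already requires some component that sees the rules from every controller.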

[Figure: Controller 1 runs Hedera, Controller 2 runs L3, Controller N runs FW]

Slide26

Application partition 2

Approach 2: all controllers run the same application but for a subset of devices

Results in a Distributed Mesh control plane

[Figure: Controllers 1 to N, each running Hedera, L3, and FW, exchanging an abstract network view]

Slide27

Application Partition 2

Abstract views are exchanged between controllers

An abstract view reduces the network information used by each controller
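One possible form of abstraction is to collapse a controller's whole domain into a single logical node and export only the links that cross the domain boundary. The sketch below does exactly that; the function name, the tuple-based link list, and the four-switch topology are assumptions for the example.

```python
def abstract_view(domain_name, domain_switches, links):
    """Collapse one controller's domain into a single logical node.

    links: iterable of (switch_a, switch_b) pairs for the real network.
    Returns the border links this controller exports, with every switch in
    the domain replaced by domain_name and internal links hidden.
    """
    exported = set()
    for a, b in links:
        a_in, b_in = a in domain_switches, b in domain_switches
        if a_in and b_in:
            continue  # internal link: hidden from other controllers
        if a_in:
            exported.add((domain_name, b))
        elif b_in:
            exported.add((a, domain_name))
    return exported


# Hypothetical line topology s1 - s2 - s3 - s4; Controller 1 owns s1 and s2.
links = [("s1", "s2"), ("s2", "s3"), ("s3", "s4")]
view = abstract_view("D1", {"s1", "s2"}, links)
```

Other controllers then see the domain as the single node `D1` attached to s3, and never learn about s1, s2, or the link between them.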

[Figure: the real network next to Controller 2's view of the network, in which the other domains appear as abstractions provided by Controller 1 and Controller N]

Slide28

How to Deal with State + Concurrency Issues?

Controllers synchronize through a DB or DHT

So each app needs synchronization code

How do you deal with concurrency?

Each switch has a table/row in a DB

The table/row reflects switch state

The programmer interacts directly with the DB

Onix takes care of synchronization between the DB and the switch

Slide29
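The database-centric model described for Onix (each switch as a table or row, with the platform keeping database and switch in sync) could be pictured as follows. `SwitchTable` and `sync` are hypothetical names invented for this sketch and do not correspond to Onix's real API; the point is only that the application writes desired state and a separate step derives the switch operations.

```python
class SwitchTable:
    """Toy stand-in for the per-switch table held in the shared database."""

    def __init__(self):
        self.desired = {}    # rows written by the application
        self.installed = {}  # rows currently on the switch


def sync(table):
    """Return the operations needed to make the switch match the database."""
    ops = []
    for match, action in table.desired.items():
        if table.installed.get(match) != action:
            ops.append(("add", match, action))
    for match in table.installed:
        if match not in table.desired:
            ops.append(("delete", match, None))
    # Record the result so the switch now mirrors the database.
    table.installed = dict(table.desired)
    return ops


db = {"s1": SwitchTable()}
db["s1"].desired["dst=10.0.0.1"] = "out:2"   # the app only touches the DB
first = sync(db["s1"])
second = sync(db["s1"])                      # nothing changed, nothing to do
del db["s1"].desired["dst=10.0.0.1"]
third = sync(db["s1"])
```

The application code never talks to the switch directly, which is what lets the platform own the synchronization logic.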

ONIX to the SDN Programmer

How to synchronize between domains?

How many domains? Or controllers?

How many switches in a domain?

Slide30

Application partition 3

Approach 3: divide the application into local and global parts

Results in a hierarchical control plane

Global controller and local controllers

Applications that do not need network-wide state

Can be run locally without communicating with other controllers

Slide31

Are Hierarchical Controllers Feasible?

Examples of local applications:

Link discovery, learning switch, local policies

Examples of local portions of a global algorithm:

Data center traffic engineering

Elephant flow detection (Hedera)

Predictability detection (MicroTE)

Local apps/controllers have other benefits

High parallelism

Can be run closer to the devices

Slide32

Kandoo: Hierarchical controllers

[Figure: local Controllers 1 to N, each running Hedera, L3, and FW, beneath a Global Controller running Hedera]

2 levels of controllers: global and local

Local applications are embarrassingly parallel

Local shields global from network events

Slide33

Kandoo: Hierarchical controllers

[Figure: local Controllers 1 to N, each running Hedera, L3, and FW, beneath a Global Controller running Hedera]

Local controllers: run local apps

Return an abstract view to the global controller

Reduce the number of events sent to the global controller and the size of the network it sees

Slide34

Kandoo: Hierarchical controllers

[Figure: local Controllers 1 to N, each running Hedera, L3, and FW, beneath a Global Controller running Hedera]

Global controller

Runs global apps, i.e. apps that need network-wide state

Slide35

Hedera Reminder

Goal: reduce network contention

Insight: contention happens when elephants share paths.

Solution:

Detect elephant flows

Place elephant flows on different paths

Slide36

Implementing Hedera in Onix

[Figure: Controller 1 and Controller 2 each run Hedera (detection + placement). Each collects stats from its switches and installs flow table entries on them, and the two controllers exchange TM + detection results]

2 levels of controllers: global and local

Local applications are embarrassingly parallel

Local shields global from network events

Slide37

Implementing Hedera in Kandoo

[Figure: local Controllers 1, 2, and N each run elephant detection; the Global Controller runs Hedera global placement]

Local controllers: get stats from the network + elephant detection

Global controller: decide flow placement + flow installation
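The local/global split for Hedera can be sketched as two small pieces: a local function that filters flow stats down to elephants, and a global object that places the reported elephants. The threshold (10% of an assumed 1 Gb/s link) and the greedy least-loaded placement are simplifying assumptions for this example, not Hedera's actual placement algorithm, and all names are hypothetical.

```python
ELEPHANT_BYTES_PER_S = 12_500_000  # assumed threshold: 10% of a 1 Gb/s link


def detect_elephants(flow_stats):
    """Local controller: flow_stats maps flow id -> measured bytes per second."""
    return {f: rate for f, rate in flow_stats.items()
            if rate >= ELEPHANT_BYTES_PER_S}


class GlobalController:
    def __init__(self, paths):
        self.load = {p: 0 for p in paths}  # bytes per second placed per path
        self.placement = {}

    def on_elephants(self, elephants):
        """Place each reported elephant on the currently least-loaded path."""
        for flow, rate in sorted(elephants.items(), key=lambda kv: -kv[1]):
            path = min(self.load, key=self.load.get)
            self.load[path] += rate
            self.placement[flow] = path
        return self.placement


# Hypothetical stats seen by one local controller.
stats = {"f1": 50_000_000, "f2": 40_000_000, "f3": 1_000}
elephants = detect_elephants(stats)          # mice such as f3 stay local
global_ctrl = GlobalController(["core1", "core2"])
placement = global_ctrl.on_elephants(elephants)
```

Only two of the three flows ever reach the global controller, which is the shielding effect the hierarchy is meant to provide.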

[Figure labels: stats; inform of elephant flows; install new flow table entries]

Slide38

Implementing B4 in a Kandoo-like architecture

[Figure: Site Controllers 1, 2, and N each run elephant detection, collect stats, and install OF entries. They inform the Global Controller (TE + BW allocator, with a TE DB) of flow demands, and the Global Controller installs TE ops]

Local controllers: get stats from the network + determine demand

Global controller: calculate paths for traffic

Slide39

Kandoo to the SDN Programmer

Think about what is local and what is global

When apps are written, annotate them with a local flag

Kandoo will automatically place local apps

And place global apps

Kandoo restricts messages between global and local controllers

You can't send OF-style messages

You must send Kandoo-style messages

Slide40
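The local flag described for Kandoo apps might look like the following in spirit. The `local` decorator, the `App` base class, and the `place` function are hypothetical and are not Kandoo's real programming interface; they only illustrate how a single annotation can drive automatic placement.

```python
def local(app_cls):
    """Hypothetical annotation: the app needs no network-wide state."""
    app_cls.is_local = True
    return app_cls


class App:
    is_local = False  # apps are treated as global unless annotated


@local
class LearningSwitch(App):
    pass


@local
class ElephantDetection(App):
    pass


class FlowPlacement(App):
    pass


def place(apps):
    """Split apps between local controllers and the global controller."""
    plan = {"local": [], "global": []}
    for app in apps:
        plan["local" if app.is_local else "global"].append(app.__name__)
    return plan


plan = place([LearningSwitch, ElephantDetection, FlowPlacement])
```

The programmer's only decision is the annotation; where each app actually runs is derived from it.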

Summary

Centralization provides simplicity at the cost of reliability and scalability

Replication can improve reliability and scalability

For reliability, Paxos is an option

For scalability, divide and conquer

Partition the applications

Kandoo: local apps and global apps

Partition the network

Onix: each controller controls a subset of switches (a domain)