Presentation Transcript

Slide 1

Ananta: Cloud Scale Load Balancing

Presenter: Donghwi Kim

Slide 2

Background: Datacenter

Each server has a hypervisor and VMs

Each VM is assigned a Direct IP (DIP).

Each service has zero or more external end-points.

Each service is assigned one Virtual IP (VIP).

Slide 3

Background: Datacenter

Each datacenter has many services. A service may work with another service in the same datacenter, another service in a different datacenter, or a client over the internet.

Slide 4

Background: Load-balancer

The load balancer is the entrance to a server pool: it distributes workload across worker servers and hides the server pool from clients with a network address translator (NAT).

Slide 5

Inbound VIP Communication

The load balancer does destination address translation (DNAT).

[Diagram: the Internet sends packets (src: Client, dst: VIP) to the LB; the LB rewrites the destination and forwards each packet to one of the front-end VMs as (src: Client, dst: DIP1/DIP2/DIP3), leaving the payload unchanged.]
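The rewrite itself is simple. Below is a minimal Python sketch of per-packet DNAT, assuming a toy packet model; Packet, VIP_POOLS, and dnat are illustrative names, not Ananta's actual code:

```python
import random

# Hypothetical packet model: only the fields the slide's diagram shows.
class Packet:
    def __init__(self, src, dst, payload):
        self.src, self.dst, self.payload = src, dst, payload

# VIP -> pool of DIPs, as in the slide (one VIP fronting three front-end VMs).
VIP_POOLS = {"VIP": ["DIP1", "DIP2", "DIP3"]}

def dnat(packet):
    """Rewrite dst from the VIP to a chosen DIP; src and payload are untouched."""
    pool = VIP_POOLS[packet.dst]
    packet.dst = random.choice(pool)   # real LBs pick per connection, not per packet
    return packet

p = dnat(Packet(src="Client", dst="VIP", payload=b"GET /"))
print(p.src, p.dst)   # Client DIP1 (or DIP2/DIP3)
```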

Slide 6

Outbound VIP Communication

The load balancer does source address translation (SNAT).

[Diagram: a back-end VM (DIP2) of Service 1 sends (src: DIP2, dst: VIP2) toward Service 2; Service 1's LB rewrites it to (src: VIP1, dst: VIP2) before it crosses the datacenter network, so Service 2 (front-end VMs DIP3, DIP4, DIP5 behind VIP2) never sees the internal DIP.]
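A minimal sketch of the source-side rewrite, again with illustrative names (SNAT_MAP, snat_outbound); real SNAT also allocates VIP ports, which later slides cover:

```python
# Minimal SNAT sketch (hypothetical names): outbound packets have their
# source rewritten from the VM's DIP to the service's VIP, so the other
# side only ever sees VIP1 <-> VIP2, never the internal DIPs.
SNAT_MAP = {"DIP2": "VIP1"}                      # DIP -> its service's VIP
REVERSE = {v: k for k, v in SNAT_MAP.items()}    # VIP -> DIP, for return traffic

def snat_outbound(src, dst, payload):
    return SNAT_MAP[src], dst, payload           # (VIP1, VIP2, payload)

def snat_return(src, dst, payload):
    return src, REVERSE[dst], payload            # deliver back to DIP2

print(snat_outbound("DIP2", "VIP2", b"hello"))   # ('VIP1', 'VIP2', b'hello')
print(snat_return("VIP2", "VIP1", b"reply"))     # ('VIP2', 'DIP2', b'reply')
```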

Slide 7

State of the Art

A load balancer is a hardware device: expensive, slow failover, no scalability.

Slide 8

Cloud Requirements

Scale
Requirement: ~40 Tbps throughput using 400 servers; 100 Gbps for a single VIP.
State-of-the-art: 20 Gbps for $80,000; up to 20 Gbps per VIP.

Reliability
Requirement: N+1 redundancy; quick failover.
State-of-the-art: 1+1 redundancy, or slow failover.

Slide 9

Cloud Requirements

Any service anywhere
Requirement: servers and LB/NAT are placed across L2 boundaries.
State-of-the-art: NAT supported only in the same L2 domain.

Tenant isolation
Requirement: an overloaded or abusive tenant cannot affect other tenants.
State-of-the-art: excessive SNAT from one tenant causes a complete outage.

Slide 10

Ananta

Slide 11

SDN

SDN: managing a flexible data plane via a centralized control plane.

[Diagram: a Controller in the control plane manages a Switch in the data plane.]

Slide 12

Breaking Down the Load-balancer's Functionality

Control plane: VIP configuration, monitoring.

Data plane: destination/source selection, address translation.

Slide 13

Design

Ananta Manager: source selection; not scalable (like an SDN controller).

Multiplexer (Mux): destination selection.

Host Agent: address translation; resides in each server's hypervisor.

Slide 14

Data Plane

[Diagram: routers spread incoming packets (dst: VIP1, dst: VIP2) across a row of Multiplexers; each Multiplexer maps them to hosts (dst: DIP1, DIP2, DIP3), where the VM switch and Host Agent deliver them to VMs 1..N.]

1st tier (Router): packet-level load spreading via ECMP.

2nd tier (Multiplexer): connection-level load spreading, destination selection (see the sketch below).

3rd tier (Host Agent): stateful NAT.
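A sketch of how connection-level spreading can work, assuming hash-based destination selection over the connection 5-tuple; the function names are hypothetical, and the real Mux also keeps a flow table so existing connections survive DIP-pool changes:

```python
import hashlib

DIPS = ["DIP1", "DIP2", "DIP3"]   # DIPs behind a VIP, as in the diagram

def select_dip(five_tuple, dips=DIPS):
    """Connection-level spreading: every packet of a flow (same 5-tuple)
    hashes to the same DIP, and every Mux computes the same answer."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return dips[int.from_bytes(digest[:8], "big") % len(dips)]

flow = ("client_ip", 51000, "VIP1", 80, "tcp")
assert select_dip(flow) == select_dip(flow)   # deterministic across Muxes
print(select_dip(flow))
```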

Slide 15

Inbound Connections

[Diagram: (1) the client sends s: CLI, d: VIP; (2) a router picks a Mux; (3) the Mux encapsulates the packet as s: MUX, d: DIP and (4) tunnels it to the host; (5) the Host Agent decapsulates and rewrites it to s: CLI, d: DIP, and (6) delivers it to the VM; (7) the reply s: DIP, d: CLI is rewritten by the Host Agent to (8) s: VIP, d: CLI and sent directly back to the client, bypassing the Mux.]
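The packet transformations in the diagram can be summarized in a few lines. This is a schematic trace with hypothetical helper names, not the Host Agent's actual code:

```python
# Sketch of the inbound walk above. The Mux never rewrites the inner
# packet; it wraps it in an outer header addressed to the chosen DIP
# (IP-in-IP encapsulation).

def mux_forward(inner):                       # inner = {"s": "CLI", "d": "VIP"}
    return {"s": "MUX", "d": "DIP", "inner": inner}

def host_agent_receive(outer):
    inner = outer["inner"]                    # decapsulate
    inner["d"] = "DIP"                        # DNAT: VIP -> local DIP
    return inner                              # handed to the VM

def host_agent_return(pkt):                   # the VM replied s: DIP, d: CLI
    pkt["s"] = "VIP"                          # reverse NAT
    return pkt                                # goes straight to the client,
                                              # bypassing the Mux

delivered = host_agent_receive(mux_forward({"s": "CLI", "d": "VIP"}))
reply = host_agent_return({"s": "DIP", "d": "CLI"})
print(delivered, reply)   # {'s': 'CLI', 'd': 'DIP'} {'s': 'VIP', 'd': 'CLI'}
```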

Slide 16

Outbound (SNAT) Connections

[Diagram: a VM sends s: DIP:555, d: SVR:80; the Host Agent has no VIP port for this flow and asks Ananta Manager ("Port??"); the Manager maps VIP:777 to the DIP and programs the mapping into the Muxes; the Host Agent rewrites the packet to s: VIP:777, d: SVR:80; the server's reply s: SVR:80, d: VIP:777 reaches a Mux, which encapsulates it as s: MUX, d: DIP:555; the Host Agent restores it to s: SVR:80, d: DIP:555 and delivers it to the VM.]
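A schematic of the same exchange, with a hypothetical MUX_TABLE standing in for the state Ananta Manager programs into the Muxes; structure and names are assumptions, not the real RPC interface:

```python
MUX_TABLE = {}                                 # (VIP, port) -> (DIP, port)

def manager_allocate_port(vip, dip, dip_port):
    port = 777                                 # the port picked in the slide
    MUX_TABLE[(vip, port)] = (dip, dip_port)   # programmed into all Muxes
    return port

def host_agent_outbound(pkt):                  # {"s": ("DIP", 555), "d": ("SVR", 80)}
    port = manager_allocate_port("VIP", *pkt["s"])
    pkt["s"] = ("VIP", port)                   # s: VIP:777, d: SVR:80
    return pkt

def mux_inbound_return(pkt):                   # {"s": ("SVR", 80), "d": ("VIP", 777)}
    pkt["d"] = MUX_TABLE[pkt["d"]]             # back to DIP:555
    return pkt

out = host_agent_outbound({"s": ("DIP", 555), "d": ("SVR", 80)})
back = mux_inbound_return({"s": ("SVR", 80), "d": ("VIP", 777)})
print(out, back)
```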

Slide 17

Reducing Load of Ananta Manager

Optimizations (see the sketch below):
Batching: allocate 8 ports instead of one.
Pre-allocation: 160 ports per VM.
Demand prediction: consider recent request history.

Less than 1% of outbound connections ever hit Ananta Manager, and SNAT request latency is reduced.
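A sketch of the batching and pre-allocation idea; PortAllocator and its methods are illustrative, using the slide's numbers 8 and 160 as the batch and pre-allocation sizes:

```python
from collections import deque

BATCH = 8          # ports handed out per Manager request (slide's number)
PREALLOC = 160     # ports pre-allocated per VM (slide's number)

class PortAllocator:
    """The Host Agent keeps a local pool, so most SNAT connections
    never reach Ananta Manager."""
    def __init__(self):
        self._next = 1024
        self.local_pool = deque(self._from_manager(PREALLOC))  # pre-allocation

    def _from_manager(self, n):                 # one (expensive) Manager round-trip
        start, self._next = self._next, self._next + n
        return range(start, start + n)

    def get_port(self):
        if not self.local_pool:                 # rare: <1% of connections
            self.local_pool.extend(self._from_manager(BATCH))   # batching
        return self.local_pool.popleft()

alloc = PortAllocator()
print([alloc.get_port() for _ in range(3)])    # served locally, no Manager call
```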

Slide 18

VIP Traffic in a Datacenter

A large portion of the traffic through the load balancer is intra-DC.

Slide 19

Step 1: Forward Traffic

[Diagram: DIP1 (behind VIP1) sends data packets destined to VIP2; they pass through MUX2, which forwards them to DIP2.]

Slide 20

Step 2: Return Traffic

[Diagram: DIP2's return packets, destined to VIP1, travel through MUX1 back to DIP1; both directions still traverse the Muxes.]

Slide 21

Step 3: Redirect Messages

[Diagram: the Muxes send redirect packets to the Host Agents of DIP1 and DIP2, telling each side the other's actual DIP.]

Slide 22

Step 4: Direct Connection

[Diagram: after the redirects, data packets flow directly between DIP1 and DIP2, bypassing the Muxes entirely.]
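A sketch of the net effect of the redirect messages, with a hypothetical redirect_cache on the Host Agent; the names and data structures are assumptions:

```python
# Once the Muxes exchange redirect messages, both Host Agents learn the
# peer's DIP, and later packets skip the Muxes entirely.

redirect_cache = {}                     # (src DIP, dst VIP) -> peer DIP

def on_redirect(src_dip, dst_vip, peer_dip):   # Step 3 in the slides
    redirect_cache[(src_dip, dst_vip)] = peer_dip

def next_hop(src_dip, dst_vip):
    # Step 4: if we saw a redirect, go VM-to-VM; otherwise via a Mux.
    return redirect_cache.get((src_dip, dst_vip), "MUX2")

print(next_hop("DIP1", "VIP2"))         # MUX2 (before redirect)
on_redirect("DIP1", "VIP2", "DIP2")
print(next_hop("DIP1", "VIP2"))         # DIP2 (direct connection)
```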

Slide 23

SNAT Fairness

Ananta Manager is not scalable: more VMs mean more resources.

[Diagram: pending SNAT requests per DIP (at most one per DIP) feed per-VIP queues; a global SNAT processing queue dequeues round-robin from the VIP queues and is processed by a thread pool. A sketch follows below.]
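A sketch of that queueing discipline; the names are illustrative:

```python
from collections import deque

# At most one pending SNAT request per DIP, a queue per VIP, and a
# global round-robin over the VIP queues so one tenant cannot starve
# the others.

vip_queues = {"VIP1": deque(), "VIP2": deque()}
pending_dips = set()

def submit(vip, dip):
    if dip in pending_dips:            # at most one pending request per DIP
        return False
    pending_dips.add(dip)
    vip_queues[vip].append(dip)
    return True

def round_robin():
    """Dequeue one request per VIP per pass (done by a thread pool)."""
    while any(vip_queues.values()):
        for vip, q in vip_queues.items():
            if q:
                dip = q.popleft()
                pending_dips.discard(dip)
                yield vip, dip

submit("VIP1", "DIP1"); submit("VIP1", "DIP2"); submit("VIP2", "DIP3")
print(list(round_robin()))   # alternates VIP1/VIP2 despite VIP1's longer queue
```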

Slide 24

Packet Rate Fairness

Each Mux keeps track of its top-talkers (the VIPs with the highest packet rates). When packet drops happen, Ananta Manager withdraws the topmost top-talker from all Muxes.
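A sketch of the mechanism, assuming a per-Mux packet counter and a hypothetical manager_on_drop hook; in reality the Manager aggregates top-talker reports from the Muxes:

```python
from collections import Counter

class Mux:
    def __init__(self):
        self.packet_counts = Counter()   # VIP -> packets seen
        self.withdrawn = set()

    def on_packet(self, vip):
        if vip not in self.withdrawn:
            self.packet_counts[vip] += 1

    def top_talker(self):
        return self.packet_counts.most_common(1)[0][0]

def manager_on_drop(muxes):
    victim = muxes[0].top_talker()          # highest packet-rate VIP
    for m in muxes:                         # withdrawn from ALL Muxes
        m.withdrawn.add(victim)

muxes = [Mux(), Mux()]
for _ in range(100): muxes[0].on_packet("VIP_heavy")
muxes[0].on_packet("VIP_normal")
manager_on_drop(muxes)
print(muxes[1].withdrawn)                   # {'VIP_heavy'}
```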

Slide 25

Reliability

When Ananta Manager fails: Paxos provides fault tolerance by replication, typically with 5 replicas.

When a Mux fails: 1st-tier routers detect the failure via BGP and stop sending traffic to that Mux.

Slide 26

Evaluation

Slide 27

Impact of Fastpath

Experiment: one 20-VM tenant as the server, two 10-VM tenants as clients; each VM sets up 10 connections and uploads 1 MB of data.

Slide 28

Ananta Manager’s SNAT latency

Ananta Manager's port allocation latency over a 24-hour observation.

Slide 29

SNAT Fairness

Normal users (N) make 150 outbound connections per minute, while a heavy user (H) keeps increasing its outbound connection rate. Observing SYN retransmits and SNAT latency shows that normal users are not affected by the heavy user.

Slide 30

Overall Availability

Average availability over a month: 99.95%.

Slide 31

Summary

How Ananta meets the cloud requirements:

Scale: Mux uses ECMP; Host Agents scale out naturally.
Reliability: Ananta Manager uses Paxos; Mux relies on BGP.
Any service anywhere: Ananta operates at layer 4 (the transport layer).
Tenant isolation: SNAT fairness and packet rate fairness.

Slide 32

Discussion

Ananta may lose some connections when it recovers from a MUX failure, because there is no way to copy a MUX's internal state.

[Diagram: a MUX maps the 5-tuples of TCP flows to DIPs (DIP1, DIP2); when the 1st-tier router redirects flows to a new MUX, the new MUX's 5-tuple-to-DIP table is empty ("???").]

Slide 33

Discussion

Detection of MUX failure takes up to 30 seconds (the BGP hold timer). Why not use additional health monitoring?

Fastpath does not preserve the order of packets.

Passing through a software component (the MUX) may increase connection-establishment latency.* (Fastpath does not relieve this.)

The scale of the evaluation is too small (e.g., bandwidth of 2.5 Gbps, not Tbps). Another paper argues that Ananta would require 8,000 MUXes to cover a mid-size datacenter.*

*DUET: Cloud Scale Load Balancing with Hardware and Software, SIGCOMM '14

Slide 34

Thanks! Any questions?

Slide 35

Lessons learnt

Centralized controllers work: there are significant challenges in doing per-flow processing (e.g., SNAT), but they provide an overall more reliable and easier-to-manage system.

Co-location of control plane and data plane provides faster local recovery; fate sharing eliminates the need for a separate, highly available management channel.

Protocol semantics are violated on the Internet: bugs in external code forced us to change the network MTU.

Owning our own software has been a key enabler for faster turn-around on bugs, DoS detection, flexibility to design new features, and better monitoring and management.

Slide 36

Backup: ECMP

Equal-Cost Multi-Path Routing: hash the packet header and choose one of the equal-cost paths.
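A minimal sketch of ECMP's hash-and-pick, with hypothetical path names:

```python
import hashlib

# Hash the packet header fields and use the result to pick one of several
# equal-cost paths: a given flow always takes the same path, while
# different flows spread across all of them.

PATHS = ["path-A", "path-B", "path-C", "path-D"]

def ecmp_path(src_ip, dst_ip, src_port, dst_port, proto):
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return PATHS[h % len(PATHS)]

print(ecmp_path("10.0.0.1", "VIP1", 51000, 80, "tcp"))
```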

Slide 37

Backup: SEDA

Slide 38

Backup: SNAT

Slide 39

VIP traffic in a data center

Slide 40

CPU Usage of Mux

CPU usage over a typical 24-hour period by 14 Muxes in a single Ananta instance.

Slide 41

Remarkable Points

The first middlebox architecture that moves parts of the middlebox to the host. Deployed and serving Microsoft datacenters for more than 2 years.
