Slide1
MOZART: Temporal Coordination of Measurement (SOSR '16)
Xuemei Liu, Meral Shirazipour, Minlan Yu, Ying Zhang
Slide2
Measurement in data center
Motivating examples of measurement
  Fault diagnosis: Capture root causes for failures.
  Traffic engineering: Capture statistics for big flows.
  Attack detection: Capture signatures of attacks.
Essence of measurement
  Capture data related to events.
Slide3
Different views/abilities of devices
Hosts
  View: per source/destination traffic
  Abilities: end-to-end loss, latency, etc.
Switches
  View: per link traffic
  Abilities: per link volume, latency, etc.
Slide4
No coordination of measurement
Figure label: Controller
Limited resources may be used by flows not related to the event.
Too much reporting overhead.
We propose temporal coordination of measurement.
Slide5
Example 1 - loss detection
No-coordination
Packet loss affects performance. Operators want to locate the loss.
Figure labels: S0, S1, S2; traffic flow
Measure & report loss of all flows.
Measure & report flow volume of all flows.
Slide6
Example 1 - loss detection
Coordination
Packet loss affects performance. Operators want to locate the loss.
Figure labels: S0, S1, S2; traffic flow; selected flows
Detect high loss for some flows.
Measure & report flow volume of only lossy flows.
Sender needs to coordinate the lossy flows with switches.
Slide7
Example 2 - port scan
No-coordination
Compromised servers detect vulnerable servers.
Figure labels: S0, S1; compromised server; ports 123, 456, 789; traffic flow
Count & report number of destinations for all senders.
Slide8
Example 2 - port scan
Coordination
Compromised servers detect vulnerable servers.
Figure labels: S0, S1; compromised server; HTTP server (80); ports 123, 456, 789; traffic flow; selected flows
Detect senders with unwanted traffic sent to secure ports.
Count & report number of destinations for detected senders.
Egress switch coordinates candidate compromised senders with ingress switch.
Slide9
Example 3 - ECMP flow
No-coordination
Facebook reported congestion caused by unbalanced ECMP traffic distribution.
Figure labels: S0, S1, S2; traffic flow
Measure & report volume of all flows.
Slide10
Example 3 - ECMP flow
Coordination
Facebook reported congestion caused by unbalanced ECMP traffic distribution.
Figure labels: S0, S1, S2; traffic flow; selected flows
Detect elephant flows.
Measure & report volume of elephant flows.
Switches coordinate elephant flows with each other.
Slide11
MOZART
MOnitor flowZ At the Right Time
Slide12
MOZART framework
MOZART controller: configures selectors and monitors.
Selector: detects events and passes the selected flows to monitors.
Monitor: captures data related to events and reports data of selected flows.
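As a rough illustration of the division of labor above, the following minimal Python sketch wires one selector to one monitor. The class names, the byte-count statistic, and the 10% loss threshold are assumptions made for illustration; the slides do not specify them.

```python
from collections import defaultdict


class Monitor:
    """Captures statistics only for flows that a selector has marked as selected."""

    def __init__(self):
        self.selected = set()
        self.volume = defaultdict(int)  # bytes counted per selected flow

    def select(self, flow_id):
        self.selected.add(flow_id)

    def on_packet(self, flow_id, size):
        if flow_id in self.selected:
            self.volume[flow_id] += size

    def report(self):
        """Data of selected flows, as it would be reported to the controller."""
        return dict(self.volume)


class Selector:
    """Detects an event (assumed here: loss rate above a threshold)."""

    def __init__(self, monitors, loss_threshold=0.1):
        self.monitors = monitors
        self.loss_threshold = loss_threshold  # hypothetical parameter

    def on_loss_sample(self, flow_id, sent, lost):
        if sent > 0 and lost / sent > self.loss_threshold:
            for m in self.monitors:
                m.select(flow_id)


monitor = Monitor()
selector = Selector([monitor])
monitor.on_packet("f1", 100)             # f1 not selected yet: ignored
selector.on_loss_sample("f1", 100, 20)   # 20% loss: f1 becomes selected
monitor.on_packet("f1", 150)             # counted
monitor.on_packet("f2", 300)             # f2 never selected: ignored
print(monitor.report())                  # {'f1': 150}
```

Note that the first f1 packet is missed because it arrives before the selection.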
Slide13
MOZART design challenges
  Coordination of measurement
  Placement of MOZART tasks
Slide14
MOZART design challenges
  Coordination of measurement
  Placement of tasks
Slide15
Strawman coordination
Figure: timeline (TIME) of f1 packets in the selector and in the monitor; legend: normal packet
In the selector: f1 satisfies the event.
In the monitor: f1 is selected.
Slide16
Strawman coordination
Figure: timeline (TIME) of f1 packets in the selector and in the monitor; legend: normal packet, captured packet
In the selector: f1 satisfies the event.
In the monitor: f1 is selected.
Traffic before selection is not captured.
Slide17
Two-mode coordination
Figure: timeline (TIME) of f1 packets in the selector and in the monitor; legend: normal packet, sampled packet, captured packet
In the selector: f1 satisfies the event.
In the monitor: f1 is selected.
Normal Mode (before f1 is selected): sampling.
Event Mode (after f1 is selected): packets are captured.
Traffic before selection has a chance to be captured.
Slide18
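A minimal sketch of the two-mode coordination from the preceding slide, in Python. It assumes deterministic 1-in-N sampling in normal mode so that the example is reproducible; the slides do not state the sampling scheme, and the class and parameter names are hypothetical.

```python
class TwoModeMonitor:
    """Samples packets in normal mode and captures every packet in event mode."""

    def __init__(self, sample_every=4):
        self.sample_every = sample_every  # assumed 1-in-N sampling rate
        self.seen = {}        # packets seen per flow
        self.captured = {}    # packets recorded per flow
        self.event_mode = set()

    def select(self, flow_id):
        """Called when the selector reports that the flow satisfies the event."""
        self.event_mode.add(flow_id)

    def on_packet(self, flow_id):
        n = self.seen.get(flow_id, 0) + 1
        self.seen[flow_id] = n
        # Event mode records everything; normal mode records every N-th packet.
        if flow_id in self.event_mode or n % self.sample_every == 0:
            self.captured[flow_id] = self.captured.get(flow_id, 0) + 1


m = TwoModeMonitor(sample_every=4)
for _ in range(8):      # normal mode: packets 4 and 8 are sampled
    m.on_packet("f1")
m.select("f1")          # f1 is selected: switch to event mode
for _ in range(5):      # event mode: all 5 packets are captured
    m.on_packet("f1")
print(m.captured["f1"])  # 7
```

Unlike the strawman, two of the eight packets sent before the selection are already recorded.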
Memory management in monitors
Flow ID   Selected flow?   Flow statistics
f1        1                10240
f2        1                2048
f3        0                500
Selected flows: f7
Selected flows, non-selected flows coexist in hash table.
Limited memory in devices.
Collision may happen in hash table.
Slide19
Memory management in monitors
Flow ID   Selected flow?   Flow statistics
f1        1                10240
f2        1                2048
f7        1                1024
Selected flows: f7
Selected flows, non-selected flows coexist in hash table.
Limited memory in devices.
Collision may happen in hash table.
Slide20
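The slides do not spell out the collision policy. One reading of the two memory management tables is that a newly selected flow (f7) takes over the slot of a non-selected flow (f3). The Python sketch below implements that reading only; the table size, the toy hash function, and all names are assumptions.

```python
class FlowTable:
    """Fixed-size hash table in which selected flows may evict non-selected ones."""

    def __init__(self, size):
        self.slots = [None] * size  # each slot holds [flow_id, selected, stat]

    def _index(self, flow_id):
        # Toy hash for the example: numeric suffix of "fN" modulo the table size.
        return int(flow_id[1:]) % len(self.slots)

    def update(self, flow_id, nbytes, selected):
        """Add nbytes to the flow's entry; return False if the packet is not recorded."""
        i = self._index(flow_id)
        slot = self.slots[i]
        if slot is None:
            self.slots[i] = [flow_id, selected, nbytes]
            return True
        if slot[0] == flow_id:
            slot[1] = slot[1] or selected
            slot[2] += nbytes
            return True
        if selected and not slot[1]:
            # Collision: a selected flow replaces a non-selected one.
            self.slots[i] = [flow_id, True, nbytes]
            return True
        return False  # collision with no eviction: the packet is not recorded


table = FlowTable(size=4)
table.update("f1", 10240, selected=True)
table.update("f2", 2048, selected=True)
table.update("f3", 500, selected=False)
table.update("f7", 1024, selected=True)   # hashes to the same slot as f3, evicts it
print(table.slots[3])                     # ['f7', True, 1024]
```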
Memory management in monitors
Flow ID   Selected flow?   Flow statistics
f1        1                10240
f2        1                2048
f7        1                1024
Figure labels: f5, f7, f6; selected flows; non-selected flows
More memory is allocated to selected flows.
Selected flows, non-selected flows coexist in hash table.
Limited memory in devices.
Collision may happen in hash table.
Slide21
MOZART design challenges
  Coordination of measurement
  Placement of MOZART tasks
Slide22
Placement of MOZART tasks
Many candidate MOZART tasks to run
  Operators want to detect many events.
Device resource constraints
  Switches: limited memory; hosts: limited CPU.
  Measurement can only use leftover resources.
Latency constraint within one MOZART task
  Timely communication is critical.
  Latency between selectors/monitors should be small.
Slide23
Placement of MOZART tasks
Strawman algorithm
  Maximize Allocated Modules (MAM).
Challenges
  One task - Selectors and monitors should all be placed.
  Multiple tasks - Joint placement to maximize running tasks.
MOZART - Binary Integer Linear Programming
  Objective - Maximize the number of tasks to run.
  Subject to resource and latency constraints.
Slide24
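A minimal sketch of the placement problem stated on the preceding slide, in Python. The slides do not give the actual BILP formulation, so this example swaps in an exhaustive search over the same kind of binary decisions (which module goes on which device), which is only practical for tiny instances. It assumes a single scalar resource per device and a pairwise latency bound among the modules of a task; all names and numbers are hypothetical.

```python
from itertools import product


def place(tasks, capacity, latency, max_latency):
    """Return (number of tasks run, placement) maximizing the number of tasks.

    tasks:    {task: [resource demand of each module]}
    capacity: {device: leftover resource}
    latency:  {(dev_a, dev_b): ms}, keyed by the sorted device pair
    """
    devices = sorted(capacity)
    names = sorted(tasks)
    # Per task: None (task not run) or one device for each of its modules.
    options = [[None] + list(product(devices, repeat=len(tasks[t]))) for t in names]
    best = (0, {})
    for choice in product(*options):
        used = dict.fromkeys(devices, 0)
        ok = True
        for t, assign in zip(names, choice):
            if assign is None:
                continue
            for demand, dev in zip(tasks[t], assign):
                used[dev] += demand
            # Latency constraint between every pair of modules of the same task.
            for a in assign:
                for b in assign:
                    if a != b and latency[tuple(sorted((a, b)))] > max_latency:
                        ok = False
        if ok and all(used[d] <= capacity[d] for d in devices):
            running = sum(a is not None for a in choice)
            if running > best[0]:
                best = (running, {t: a for t, a in zip(names, choice) if a is not None})
    return best


tasks = {"loss": [1, 1], "scan": [2, 1]}
latency = {("h1", "s1"): 10, ("h1", "s2"): 300, ("s1", "s2"): 20}
count, placement = place(tasks, {"h1": 2, "s1": 2, "s2": 1}, latency, max_latency=250)
print(count)  # 2
```

A task counts only when all of its modules are placed, which mirrors the "one task" challenge. For realistic sizes an integer programming solver would replace the enumeration.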
Evaluation setup
Topology & traffic
  B4 topology (12 switches, 12 hosts).
  Implemented in Mininet.
  Switches run Open vSwitch.
  2-hour CAIDA trace.
Compared algorithms
  No-coordination - Just Sample and Hold (SH) in monitors.
  Coordination - Selectors send selected flows; SH in monitors.
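For readers unfamiliar with Sample and Hold, the Python sketch below shows the basic idea: a flow enters the table when one of its packets is sampled, and from then on every packet of that flow is counted. It uses a per-packet sampling probability as a simplification (the technique is commonly described with byte-level sampling). The function and parameter names are assumptions, not taken from the slides.

```python
import random


def sample_and_hold(packets, p, rng=None):
    """Return {flow_id: bytes counted} for a stream of (flow_id, size) packets."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    table = {}
    for flow_id, size in packets:
        if flow_id in table:
            table[flow_id] += size   # hold: count every later packet of the flow
        elif rng.random() < p:
            table[flow_id] = size    # sample: start tracking this flow
    return table


packets = [("f1", 100), ("f1", 100), ("f1", 100), ("f2", 50)]
print(sample_and_hold(packets, p=1.0))  # {'f1': 300, 'f2': 50}
print(sample_and_hold(packets, p=0.0))  # {}
```

With a small p, large flows are likely to be sampled early and then counted almost completely, while most small flows never enter the table.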
Slide25
Example - loss detection
Figure labels: S0, S1, S2; selector; three monitors; traffic flow; selected flows from selector
High loss for some flows.
Measure flow volume of lossy flows.
Slide26
MOZART achieves high accuracy
Figure: ratio of selected flows not captured vs. memory size in each monitor for measurement; annotated values: 15%, 1.3%
Slide27
MOZART supports more tasks
Algorithms                      Tasks assigned (%)   Avg. latency (ms)
Maximize Allocated Modules      77%                  94
MOZART (Latency <= infinite)    100%                 110
Slide28
MOZART supports more tasks
Algorithms                      Tasks assigned (%)   Avg. latency (ms)
Maximize Allocated Modules      77%                  94
MOZART (Latency <= infinite)    100%                 110
MOZART (Latency <= 250ms)       98%                  64
Slide29
Conclusion
Temporal coordination is important
  Collect data related to events.
  Different views/abilities of devices.
MOZART design highlights
  Coordination algorithms.
  Placement algorithm for maximizing tasks to run.
Benefits
  High measurement accuracy.
  Support more tasks.
  Meet memory constraints in devices.
Slide30
Communication between selectors and monitors
Same path
  Tag following packets of selected flows.
Reverse path
  Tag reverse packets of selected flows.
Different path
  Send explicit packets.
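The same-path case above can be sketched in Python as follows. The one-bit `tagged` field stands in for whatever header field would carry the mark, which the slides do not specify; the class and function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    flow_id: str
    size: int
    tagged: bool = False  # hypothetical one-bit mark carried in the packet header


def selector_forward(pkt, selected_flows):
    """Selector side: tag the following packets of selected flows."""
    if pkt.flow_id in selected_flows:
        pkt.tagged = True
    return pkt


class TagMonitor:
    """Monitor side: a tagged packet tells the monitor that its flow is selected."""

    def __init__(self):
        self.selected = set()
        self.volume = {}

    def on_packet(self, pkt):
        if pkt.tagged:
            self.selected.add(pkt.flow_id)
        if pkt.flow_id in self.selected:
            self.volume[pkt.flow_id] = self.volume.get(pkt.flow_id, 0) + pkt.size


monitor = TagMonitor()
selected_flows = {"f1"}
for pkt in [Packet("f1", 100), Packet("f2", 200), Packet("f1", 50)]:
    monitor.on_packet(selector_forward(pkt, selected_flows))
print(monitor.volume)  # {'f1': 150}
```

Tagging needs no extra packets, which is why it fits the same-path and reverse-path cases. When the selector and monitor are on different paths, no data packet passes through both, so an explicit message is needed.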