
Presentation Transcript

1. Data Center Fabrics
Lecture 12
Aditya Akella

2. PortLand: a scalable, fault-tolerant layer-2 network
c-Through: augmenting DCs with an optical circuit switch

3. PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
In a nutshell:
- PortLand is a single "logical layer 2" data center network fabric that scales to millions of endpoints
- PortLand internally separates host identity from host location
  - uses the IP address as the host identifier
  - introduces "Pseudo MAC" (PMAC) addresses internally to encode endpoint location
- PortLand runs on commodity switch hardware with unmodified hosts

4. Design Goals for the Network Fabric
Support for agility!
- Easy configuration and management: plug-&-play
- Fault tolerance, routing, and addressing: scalability
- Commodity switch hardware: small switch state
- Virtualization support: seamless VM migration

5. Forwarding Today
Layer 3 approach:
- Assign IP addresses to hosts hierarchically, based on their directly connected switch
- Use standard intra-domain routing protocols, e.g., OSPF
- Large administration overhead
Layer 2 approach:
- Forwarding on flat MAC addresses
- Less administrative overhead
- Bad scalability, low performance
Middle ground between layer 2 and layer 3: VLANs
- Feasible for smaller-scale topologies
- Resource-partitioning problem

6. Requirements Due to Virtualization
End-host virtualization:
- Needs support for large address spaces and VM migration
- In a layer 3 fabric, migrating a VM to a different switch changes the VM's IP address
- In a layer 2 fabric, migrating a VM means scaling ARP and performing routing/forwarding on millions of flat MAC addresses

7. Background: Fat-Tree
Interconnect racks (of servers) using a fat-tree topology
Fat-tree: a special type of Clos network (after C. Clos)
K-ary fat tree: three-layer topology (edge, aggregation, and core)
- each pod consists of (k/2)^2 servers and 2 layers of k/2 k-port switches
- each edge switch connects to k/2 servers and k/2 aggregation switches
- each aggregation switch connects to k/2 edge and k/2 core switches
- (k/2)^2 core switches: each connects to k pods
[Figure: fat-tree with K = 2]
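A minimal Python sketch of those counts (the function name and return layout are mine, not from the lecture):

```python
def fat_tree_sizes(k):
    """Component counts for a k-ary fat tree built from k-port switches.

    Follows the relationships on the slide: each of the k pods has
    k/2 edge and k/2 aggregation switches, there are (k/2)^2 core
    switches, and each edge switch serves k/2 hosts.
    """
    assert k % 2 == 0, "k must be even"
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,
        "aggregation_switches": k * half,
        "core_switches": half * half,
        "hosts": k * half * half,   # = k^3 / 4, as on the next slide
    }

print(fat_tree_sizes(4))   # 16 hosts, 4 core switches
print(fat_tree_sizes(48))  # 27,648 hosts from 48-port switches
```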

8. Why Fat-Tree?
- A fat tree has identical bandwidth at any bisection
- Each layer has the same aggregate bandwidth
- It can be built from cheap devices with uniform capacity
  - each port supports the same speed as an end host
  - all devices can transmit at line speed if packets are distributed uniformly along the available paths
- Great scalability: a k-port switch supports k^3/4 servers
[Figure: fat-tree network with K = 3 supporting 54 hosts]

9. PortLand
Assuming a fat-tree network topology for the DC:
- Introduces "pseudo MAC addresses" to balance the pros and cons of flat vs. topology-dependent addressing
- PMACs are "topology-dependent," hierarchical addresses
  - but they are used only as "host locators," not "host identities"
  - IP addresses are used as "host identities" (for compatibility with applications)
- Pros: small switch state and seamless VM migration
- Pros: "eliminates" flooding in both the data and control planes
- But it requires an IP-to-PMAC mapping and name resolution: a location directory service
- And a location discovery protocol and a fabric manager to support "plug-&-play"

10. PMAC Addressing Scheme
PMAC (48 bits): pod.position.port.vmid
- pod: 16 bits; position and port: 8 bits each; vmid: 16 bits
- Assigned only to servers (end hosts), by the switches
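A small sketch of packing and unpacking the 48-bit layout above; the helper names are hypothetical, and the field order follows the slide:

```python
def make_pmac(pod, position, port, vmid):
    """Pack pod.position.port.vmid into a 48-bit PMAC (16/8/8/16 bits)."""
    assert pod < 2**16 and position < 2**8 and port < 2**8 and vmid < 2**16
    return (pod << 32) | (position << 24) | (port << 16) | vmid

def parse_pmac(pmac):
    """Unpack a 48-bit PMAC back into its four fields."""
    return {
        "pod":      (pmac >> 32) & 0xFFFF,
        "position": (pmac >> 24) & 0xFF,
        "port":     (pmac >> 16) & 0xFF,
        "vmid":     pmac & 0xFFFF,
    }

pmac = make_pmac(pod=2, position=1, port=3, vmid=7)
print(f"{pmac:012x}")   # 000201030007
print(parse_pmac(pmac))
```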

11. Location Discovery Protocol
- Location Discovery Messages (LDMs) are exchanged between neighboring switches
- Switches self-discover their location on boot-up

Location characteristic            Technique
Tree level (edge, aggr., core)     auto-discovery via neighbor connectivity
Position #                         aggregation switches help edge switches decide
Pod #                              request to the fabric manager (by the position-0 switch only)
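One illustrative reading of the tree-level rule in the table, as a toy Python heuristic; this is an assumption-laden sketch of the idea (deriving the level from neighbor connectivity alone), not PortLand's actual LDP logic:

```python
def infer_tree_level(ldm_heard_per_port, neighbor_levels):
    """Crude bootstrap heuristic, loosely following the slide:
    - a port that never hears an LDM is assumed to face a host -> edge switch
    - a switch whose identified neighbors include edge switches -> aggregation
    - a switch whose identified neighbors are all aggregation -> core
    """
    if not all(ldm_heard_per_port.values()):
        return "edge"
    if "edge" in neighbor_levels:
        return "aggregation"
    if neighbor_levels and all(lvl == "aggregation" for lvl in neighbor_levels):
        return "core"
    return "unknown"

print(infer_tree_level({1: False, 2: True}, []))                   # 'edge'
print(infer_tree_level({1: True, 2: True}, ["edge", "edge"]))      # 'aggregation'
print(infer_tree_level({1: True, 2: True}, ["aggregation"] * 2))   # 'core'
```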

12. PortLand: Name Resolution
- An edge switch listens to its end hosts and discovers new source MACs
- It installs <IP, PMAC> mappings and informs the fabric manager

13. PortLand: Name Resolution (cont.)
- The edge switch intercepts ARP messages from end hosts
- It sends a request to the fabric manager, which replies with the PMAC
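A toy sketch of the proxy-ARP flow on slides 12-13; FabricManager, EdgeSwitch, and their methods are hypothetical names used only to illustrate the message flow:

```python
class FabricManager:
    """Toy stand-in for PortLand's fabric manager directory."""
    def __init__(self):
        self.ip_to_pmac = {}            # soft-state <IP, PMAC> directory

    def register(self, ip, pmac):
        self.ip_to_pmac[ip] = pmac      # an edge switch reports a new host

    def resolve(self, ip):
        return self.ip_to_pmac.get(ip)  # None -> fall back to flooding

class EdgeSwitch:
    def __init__(self, fm):
        self.fm = fm

    def on_new_host(self, ip, pmac):
        # Slide 12: learn the mapping and inform the fabric manager.
        self.fm.register(ip, pmac)

    def on_arp_request(self, target_ip):
        # Slide 13: intercept ARP, ask the fabric manager, answer with the PMAC.
        return self.fm.resolve(target_ip)

fm = FabricManager()
sw = EdgeSwitch(fm)
sw.on_new_host("10.2.1.7", 0x000201030007)
print(hex(sw.on_arp_request("10.2.1.7")))
```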

14. PortLand: Fabric Manager
- The fabric manager is a logically centralized, multi-homed server
- It maintains the topology and the <IP, PMAC> mappings in "soft state"

15. Loop-free Forwarding and Fault-Tolerant Routing
- Switches build forwarding tables based on their position: edge, aggregation, or core
- Strict "up-down semantics" ensure loop-free forwarding
- Load balancing: use any ECMP path, with flow hashing to ensure packet ordering
- Fault-tolerant routing:
  - mostly concerned with detecting failures
  - the fabric manager maintains a logical fault matrix with per-link connectivity info and informs the affected switches
  - affected switches recompute their forwarding tables
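A minimal illustration of flow hashing for ECMP as described above: hashing the five-tuple pins each flow to one uplink, so intra-flow packet order is preserved while different flows spread across paths. The hash choice (MD5) is arbitrary here:

```python
import hashlib

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto, uplinks):
    """Pick an uplink by hashing the flow five-tuple.

    All packets of a flow hash to the same uplink, preserving
    intra-flow ordering while flows spread across ECMP paths.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.md5(key).digest()
    return uplinks[int.from_bytes(digest[:4], "big") % len(uplinks)]

uplinks = ["aggr-0", "aggr-1"]
print(ecmp_next_hop("10.0.1.2", "10.3.0.9", 5555, 80, "tcp", uplinks))
```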

16. c-Through: Part-time Optics in Data Centers

17. Current Solutions for Increasing Data Center Network Bandwidth
Examples: FatTree and BCube
1. Hard to construct
2. Hard to expand

18. An Alternative: a Hybrid Packet/Circuit Switched Data Center Network
Goals of this work:
- Feasibility: a software design that enables efficient use of optical circuits
- Applicability: application performance over a hybrid network

19. Optical Circuit Switching vs. Electrical Packet Switching

                       Electrical packet switching     Optical circuit switching
Switching technology   store and forward               circuit switching
Switching capacity     16x40 Gbps at the high end      320x100 Gbps on the market
                       (e.g., Cisco CRS-1)             (e.g., Calient FiberConnect)
Switching time         packet granularity              less than 10 ms
                                                       (e.g., MEMS optical switch)

20. Optical circuit switching is promising despite its slow switching time
Full bisection bandwidth at packet granularity may not be necessary:
- [WREN09]: "...we find that traffic at the five edge switches exhibit an ON/OFF pattern..."
- [IMC09][HotNets09]: "Only a few ToRs are hot and most their traffic goes to a few other ToRs. ..."

21. Hybrid Packet/Circuit Switched Network Architecture
- Optical circuit-switched network for high-capacity transfer
- Electrical packet-switched network for low-latency delivery
- Optical paths are provisioned rack-to-rack
  - a simple and cost-effective choice
  - traffic is aggregated on a per-rack basis to better utilize the optical circuits

22. Design Requirements
Control plane:
- traffic demand estimation
- optical circuit configuration
Data plane:
- dynamic traffic de-multiplexing
- optimizing circuit utilization (optional)

23. c-Through (a specific design)
- No modification to applications or switches
- Leverages end hosts for traffic management
- Centralized control for circuit configuration

24. c-Through: Traffic Demand Estimation and Traffic Batching
Enlarged per-host socket buffers accomplish both requirements:
- traffic demand estimation: buffer occupancy yields a per-rack traffic demand vector
- pre-batching data to improve optical circuit utilization
Notes: 1. transparent to applications; 2. packets are buffered per-flow to avoid head-of-line (HOL) blocking
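A sketch of how buffer occupancy could be folded into a per-rack demand vector; the data layout is invented for illustration:

```python
from collections import defaultdict

def rack_demand_vector(flows, host_to_rack):
    """Aggregate per-flow buffered bytes into a per-rack demand vector.

    `flows` maps (src_host, dst_host) -> bytes currently queued in the
    (enlarged) socket buffer; occupancy serves as the demand estimate.
    """
    demand = defaultdict(int)
    for (src, dst), queued_bytes in flows.items():
        demand[(host_to_rack[src], host_to_rack[dst])] += queued_bytes
    return dict(demand)

flows = {("h1", "h9"): 4_000_000, ("h2", "h9"): 1_500_000}
host_to_rack = {"h1": "rackA", "h2": "rackA", "h9": "rackC"}
print(rack_demand_vector(flows, host_to_rack))  # {('rackA', 'rackC'): 5500000}
```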

25. c-Through: Optical Circuit Configuration
- Use Edmonds' algorithm to compute the optimal configuration (a maximum-weight matching of racks)
- Many ways to reduce the control-traffic overhead
[Figure: hosts report traffic demands to the controller, which pushes circuit configurations back]
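A sketch of the matching step using networkx's implementation of Edmonds' maximum-weight matching; the demand numbers are made up:

```python
import networkx as nx

demand = {("rackA", "rackC"): 5_500_000,
          ("rackA", "rackB"): 1_000_000,
          ("rackB", "rackD"): 3_000_000,
          ("rackC", "rackD"):   200_000}

# Build a graph whose edge weights are the rack-to-rack demands.
G = nx.Graph()
for (src, dst), byte_count in demand.items():
    G.add_edge(src, dst, weight=byte_count)

# Each rack gets at most one circuit; total matched demand is maximized.
circuits = nx.max_weight_matching(G)
print(circuits)  # e.g. {('rackA', 'rackC'), ('rackB', 'rackD')}
```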

26. c-Through: Traffic De-multiplexing
VLAN-based network isolation (VLAN #1 for the packet-switched network, VLAN #2 for the circuit-switched network):
- no need to modify switches
- avoids the instability caused by circuit reconfiguration
Traffic control on hosts:
- the controller informs hosts about the circuit configuration
- end hosts tag packets accordingly (see the sketch below)
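A toy host-side de-multiplexer in the spirit of this slide; the circuit-set representation is an assumption, and the VLAN IDs follow the slide:

```python
PACKET_VLAN, CIRCUIT_VLAN = 1, 2

def vlan_for(dst_rack, my_rack, circuits):
    """Tag traffic for the optical path only when the controller has
    provisioned a circuit between the two racks; otherwise use the
    packet-switched network."""
    if (my_rack, dst_rack) in circuits or (dst_rack, my_rack) in circuits:
        return CIRCUIT_VLAN
    return PACKET_VLAN

circuits = {("rackA", "rackC")}
print(vlan_for("rackC", "rackA", circuits))  # 2 -> optical path
print(vlan_for("rackB", "rackA", circuits))  # 1 -> packet-switched path
```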

27. Fat-Tree: Special Routing
Enforce a special (IP) addressing scheme in the DC: unused.PodNumber.switchnumber.Endhost
- allows hosts attached to the same switch to route only through that switch
- allows intra-pod traffic to stay within the pod
Use two-level lookups to distribute traffic and maintain packet ordering (see the sketch below):
- first level: a prefix lookup, used to route down the topology to servers
- second level: a suffix lookup, used to route up toward the core; it maintains packet ordering by using the same ports for the same server
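A toy two-level lookup in the spirit of this scheme: try a prefix match first (route down toward a server); on a miss, fall back to a suffix match on the host byte (route up toward the core). Table contents are invented for illustration:

```python
def route(dst_ip, prefix_table, suffix_table):
    octets = dst_ip.split(".")
    pod_prefix = ".".join(octets[:3])   # e.g. "10.2.1" -> down-route
    if pod_prefix in prefix_table:
        return prefix_table[pod_prefix]
    host_suffix = int(octets[3])        # spread up-routes by host byte;
    return suffix_table[host_suffix % len(suffix_table)]  # same host -> same port

prefix_table = {"10.2.0": "port0", "10.2.1": "port1"}  # subnets below us
suffix_table = ["port2", "port3"]                      # uplinks toward core

print(route("10.2.1.5", prefix_table, suffix_table))   # port1 (down)
print(route("10.4.0.5", prefix_table, suffix_table))   # port3 (up)
```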