Integrating Verification and Repair into the Control Plane


Costin Raiciu (University Politehnica of Bucharest), Aaron Gember-Jacobson (Colgate University), Laurent Vanbever (ETH Zurich). HotNets 2017. Thanks to: Superfluidity H2020, NSF CCF-1637427.

Presentation Transcript

1. Integrating Verification and Repair into the Control Plane
   Costin Raiciu, University Politehnica of Bucharest
   Aaron Gember-Jacobson, Colgate University
   Laurent Vanbever, ETH Zurich
   HotNets 2017
   Thanks to: Superfluidity H2020, NSF CCF-1637427

2. Incorrect networks ground airplanes
   Incorrect networks affect millions of users
   Incorrect networks affect people

3. Goal: correct networks
   Network administrator specifies policy.
   Want network to always obey policy.
   [Figure labels: B; C; D; X; A]

4. Network operation 101
   [Figure labels: network operator; configure; update; network devices R1, R2, R3, R4; control plane (BGP, OSPF); data plane; route update]

5. Network configuration is difficult
   [Figure labels: network operator; "?"; many devices, each labeled with a control plane and a data plane]

6. Data plane verification
   Create snapshot. Generate model. Check policy.
   + Fast, accurate
   - Faults installed, caught reactively.
   - Is the data plane snapshot consistent?
   - When an error is found, what is the root cause?
   [Figure labels: network operator; many devices, each labeled with a control plane and a data plane]
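
   As an illustration of the snapshot, model and policy check steps on this slide, here is a minimal Python sketch. The snapshot format (router name mapped to a per-prefix next hop), the "direct" marker, and the function names are assumptions made for this example; they are not taken from the talk or from any specific verifier.

```python
from typing import Dict, List, Optional

# Hypothetical snapshot format: router -> {prefix -> next hop}.
# The next hop "direct" means the router delivers the prefix itself.
Snapshot = Dict[str, Dict[str, str]]


def forwarding_path(snapshot: Snapshot, start: str, prefix: str) -> Optional[List[str]]:
    """Follow next hops from `start`; return the router path,
    or None if the walk hits a loop or a router with no entry."""
    path = [start]
    seen = {start}
    node = start
    while True:
        next_hop = snapshot.get(node, {}).get(prefix)
        if next_hop is None:
            return None          # black hole: no FIB entry for the prefix
        if next_hop == "direct":
            return path          # delivered by the current router
        if next_hop in seen:
            return None          # forwarding loop
        path.append(next_hop)
        seen.add(next_hop)
        node = next_hop


def obeys_policy(snapshot: Snapshot, start: str, prefix: str, required_exit: str) -> bool:
    """Policy: traffic from `start` to `prefix` must leave via `required_exit`."""
    path = forwarding_path(snapshot, start, prefix)
    return path is not None and path[-1] == required_exit


# Illustrative snapshots only; the topology is made up for the example.
good = {"R1": {"P": "R2"}, "R2": {"P": "direct"}}
looping = {"R1": {"P": "R2"}, "R2": {"P": "R1"}}

print(obeys_policy(good, "R1", "P", "R2"))
print(forwarding_path(looping, "R1", "P"))
```

   A check like this only reports that the policy is violated. It says nothing about which configuration change caused the violation, which is the gap the slide's last question points at.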

7. Control plane verification
   Create model of control plane. Simulate proposed change. Generate data plane. Verify against policy.
   + Can pinpoint the root cause of errors
   + No problems with consistency
   - Big gap to reality: vendor implementation quirks and bugs are not modeled.
   [Figure labels: network operator; "?"; many devices, each labeled with a control plane and a data plane]

8. Goal: accurate verification, provenance and automatic repair
   Problem: today verification is a separate system that works before or after the control plane.
   Proposal: integrate verification into the operation of the distributed control plane.

9. Our approach
   Capture control plane I/Os.
   Trace provenance.
   Block I/Os.
   [Figure labels: BGP instance; OSPF instance; BGP RIB; OSPF RIB; FIB; configuration changes; route updates; RIB updates; FIB updates; Data Plane Verifier; data plane snapshot; bad FIB updates; root cause]

10. A running example
   Policy: A should reach P directly via R2 when link is up.
   [Figure labels: R1; R2; P; Network A; iBGP (x2); eBGP (x3); route labels "P", "P, Pref=30", "P, Ext", "P, R2", "P, Pref=30, Ext", "P, Pref=30, R2", "P", "P, Pref=20, Ext"]

11. A running example
   Policy: A should reach P directly via R2 when link is up.
   [Figure labels: R1; R2; P; Network A; iBGP (x2); eBGP (x3); fault; route labels "P, R2", "P, Pref=30, R2", "P, Pref=20, Ext", "P, Ext", "P, Pref=30, Ext", "50", "P, EXT", "P, Pref=50", "P, R1", "P, Pref=50, Ext"]

12. Building a global Happens-Before Graph
   R1 configuration change
   R1 update P ➡ R1, LP=50 in BGP RIB
   R1 send iBGP ad P ➡ R1, LP=50
   R1 recv iBGP ad P ➡ R2, LP=10
   R1 update P ➡ R2, LP=10 in BGP RIB
   R1 install P ➡ Ext in FIB
   fault
   [Figure labels: R1 Log; R2 Log; Router R1; Router R2; Config TTY0; R1 localpref=50; P: soft reconfiguration; FIB: P via R1; FIB: P Ext; Route: P via R1 (x2); 8ms; 8ms; 25s; 4ms; 0ms; R1 configuration change; fault]
   Legend: Messages; Same prefix; Timestamps; Inferred HBRs
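
   The slide's legend names three sources of happens-before edges: messages, same prefix, and timestamps. The Python sketch below shows one way such edges could be inferred from router logs and then walked backwards to a root cause. The `Event` fields, the three inference rules as coded, the 30-second window, and the timestamps in the example are all assumptions for illustration, not the authors' implementation. The example events follow the R2 chain listed on slide 18.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    eid: int
    time_ms: int             # hypothetical log timestamp
    router: str
    kind: str                # "config", "rib", "send", "recv" or "fib"
    prefix: Optional[str]    # None for configuration changes
    detail: str


def build_hbg(events: List[Event], window_ms: int = 30_000) -> Dict[int, List[int]]:
    """Map each event id to the ids of events inferred to happen before it."""
    parents: Dict[int, List[int]] = {e.eid: [] for e in events}
    ordered = sorted(events, key=lambda e: e.time_ms)
    for i, e in enumerate(ordered):
        if e.kind == "config":
            continue  # configuration changes are treated as roots
        earlier = list(reversed(ordered[:i]))  # most recent first
        if e.kind == "recv":
            # Message edge: the latest matching advertisement sent by another router.
            cause = next((p for p in earlier
                          if p.kind == "send" and p.router != e.router
                          and p.prefix == e.prefix and p.detail == e.detail), None)
        else:
            # Same-prefix edge: the previous event for this prefix on this router.
            cause = next((p for p in earlier
                          if p.router == e.router and p.prefix == e.prefix), None)
            if cause is None:
                # Timestamp edge: a recent configuration change on this router.
                cause = next((p for p in earlier
                              if p.router == e.router and p.kind == "config"
                              and e.time_ms - p.time_ms <= window_ms), None)
        if cause is not None:
            parents[e.eid].append(cause.eid)
    return parents


def root_causes(parents: Dict[int, List[int]], eid: int) -> List[int]:
    """Walk edges backwards from `eid`; return ancestors that have no parent."""
    roots, stack, seen = [], [eid], set()
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.add(cur)
        if not parents[cur]:
            roots.append(cur)
        stack.extend(parents[cur])
    return roots


# Events modeled on slide 18; the timestamps are invented for the example.
events = [
    Event(1, 0, "R2", "config", None, "R2 configuration change"),
    Event(2, 8, "R2", "rib", "P", "R2, LP=10"),
    Event(3, 16, "R2", "send", "P", "R2, LP=10"),
    Event(4, 20, "R1", "recv", "P", "R2, LP=10"),
    Event(5, 24, "R1", "rib", "P", "R2, LP=10"),
    Event(6, 28, "R1", "fib", "P", "Ext"),
]
hbg = build_hbg(events)
print(root_causes(hbg, 6))  # ids of the root events behind the FIB install
```

   Because the timestamp rule is a heuristic, a graph built this way can name a wrong root cause, which matches the "probabilistic provenance" caveat on the Limitations slide.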

13. Dealing with faults
   Alert operator.
   Block update.
   Revert root cause.
   [Figure labels: Router R2; BGP instance; BGP RIB; FIB; P direct; P via R1; X]

14. Consistent snapshots
   Use HBG to check data plane consistency.
   Postpone checking FIB entries with no root cause.
   [Figure labels: Router 1 Log; Router 2 Log; Config TTY0; R1 localpref=200; P: soft reconfiguration; FIB: P direct; Route: P via R1 (x2); FIB: P via R1; Config change]
   [Verifier view labels: R1 FIB: P direct ✔; R2 FIB: P via R1; R1 FIB: P direct; ✖; Loop! ✖]
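
   One possible reading of "postpone checking FIB entries with no root cause" is sketched below in Python: a FIB update is verified only once walking the happens-before graph backwards from it reaches a known root event, such as a logged configuration change. The data layout (a `parents` map of event ids) and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def has_root_cause(parents: Dict[int, List[int]], roots: Set[int], eid: int) -> bool:
    """True if walking happens-before edges backwards from `eid` reaches a known root."""
    stack, seen = [eid], set()
    while stack:
        cur = stack.pop()
        if cur in roots:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(parents.get(cur, []))
    return False


def partition_fib_updates(
    fib_ids: List[int], parents: Dict[int, List[int]], roots: Set[int]
) -> Tuple[List[int], List[int]]:
    """Split FIB updates into those safe to verify now and those to postpone."""
    ready = [f for f in fib_ids if has_root_cause(parents, roots, f)]
    postponed = [f for f in fib_ids if f not in ready]
    return ready, postponed


# Event 3 traces back to configuration change 1. Event 4 has no known cause yet,
# for example because the log line that explains it has not reached the verifier.
parents = {1: [], 2: [1], 3: [2], 4: []}
ready, postponed = partition_fib_updates([3, 4], parents, roots={1})
print(ready, postponed)
```

   Postponed updates would be re-examined once more log entries arrive, so the verifier avoids flagging a loop that exists only in its partial view.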

15. Limitations
   We will capture transient faults: too many? Wait a bit before alerting operator.
   Probabilistic provenance: false root causes.
   Undoing effects automatically may not always be safe or feasible.
   Route withdrawals due to link failures cannot be blocked or fixed.
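
   The "wait a bit before alerting operator" idea can be sketched as a hold-down filter: a violation is reported only if it persists for a minimum time. The Python class below is a hypothetical illustration; the talk does not say how long to wait, so `hold_ms` is an assumed parameter.

```python
from typing import Optional


class TransientFilter:
    """Suppress alerts for policy violations that clear within `hold_ms`."""

    def __init__(self, hold_ms: int) -> None:
        self.hold_ms = hold_ms
        self.first_seen: Optional[int] = None  # when the current violation started

    def observe(self, now_ms: int, violated: bool) -> bool:
        """Record one verifier result; return True when the operator should be alerted."""
        if not violated:
            self.first_seen = None   # the fault cleared, so it was transient
            return False
        if self.first_seen is None:
            self.first_seen = now_ms
        return now_ms - self.first_seen >= self.hold_ms


f = TransientFilter(hold_ms=100)
results = [
    f.observe(0, True),     # violation starts
    f.observe(50, False),   # clears before the hold time: transient
    f.observe(60, True),    # a new violation starts
    f.observe(160, True),   # still violated 100 ms later: alert
]
print(results)
```

   A longer hold time hides more transients but delays alerts for real faults, which ties into the open question about tolerable false positive and false negative rates.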

16. What next?
   Is passive mode useful / enough?
   How to avoid policy specification?
   What is a tolerable false positive/negative threshold?
   Distributed verification: safe to check some properties locally?

17. A running example
   [Figure labels: R1; R2; P; iBGP (x2); eBGP (x3); route labels "P, R2", "P, Pref=30, R2", "P, Pref=20, Ext", "P, Ext", "P, Pref=30, Ext", "10", "P, Pref=10", "P, Pref=10, R2", "P, EXT", "P, Pref=20", "P, R1"]

18. Building a Global Happens-Before Graph
   R2 configuration change
   R2 update P ➡ R2, LP=10 in BGP RIB
   R2 send iBGP ad P ➡ R2, LP=10
   R1 recv iBGP ad P ➡ R2, LP=10
   R1 update P ➡ R2, LP=10 in BGP RIB
   R1 install P ➡ Ext in FIB
   fault
   [Figure labels: Router R1; R2]