Conceiving “Availability”

myesha-ticknor
Uploaded On 2016-09-02

Presentation Transcript

Slide1

Conceiving “Availability”

Slide2

It seems like the basic objective

“All” a network does is make stuff available.

We view with suspicion networks that transform what they transport.

We are ok with networks that store it—that just makes it more available.

But our field seems to lack a theory of availability, or even a general definition.

Slide3

In the early days…

Pragmatically, we knew things failed.

Links and routers would fail.

So we did dynamic routing.

Routers would drop packets.

So we did retransmission.

We addressed specific aspects of availability.

We had no general theory.

Slide4

We can generalize this

Assume that a network without failures will be available.

Begs definition of “failure”.

Includes deliberate adverse intervention.

To deal with failures:

Must detect them.

Must localize them to a region or component.

Must reconfigure so as not to depend on that region or component.

Must trigger repair of the failed device.
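
The four steps above can be sketched as a small control loop. This is only an illustration, not a method from the slides: the topology `links`, the per-node `health` map, and the function names `reachable` and `handle_failures` are all hypothetical, and detection is assumed to be already solved by whatever fills in `health`.

```python
from collections import deque

def reachable(links, failed, src, dst):
    """Return True if dst can be reached from src without crossing a failed node."""
    if src in failed or dst in failed:
        return False
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in links.get(node, ()):
            if nxt not in failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def handle_failures(links, health, src, dst):
    # Steps 1 and 2: detect failures and localize them to components.
    # Here both are assumed to be given by the (hypothetical) health map.
    failed = {node for node, ok in health.items() if not ok}
    # Step 3: reconfigure, i.e. check whether a path exists that avoids them.
    still_available = reachable(links, failed, src, dst)
    # Step 4: trigger repair by listing the failed components for an operator.
    repair_queue = sorted(failed)
    return still_available, repair_queue

links = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
health = {"A": True, "B": False, "C": True, "D": True}
available, repairs = handle_failures(links, health, "A", "D")
```

With router B down, traffic from A to D can still go through C, so the service stays available while B is queued for repair. If C were down as well, no path would remain.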

Slide5

In practice

We do ok with simple failures

Fail-stop.

Routers talk to each other—if they stop talking we assume they failed.

Localization is implied.

We do much less well with more general or Byzantine failures.

Devices that succeed at the “I’m ok” protocol but don’t actually perform their function.
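
A minimal sketch of the "if they stop talking we assume they failed" rule shows why it misses the second kind of failure. The class name `HeartbeatMonitor`, the timeout value, and the router names are assumptions made for this example only.

```python
class HeartbeatMonitor:
    """Suspects a node only when it has been silent for longer than the timeout."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def heard_from(self, node, now):
        # Record the time of the latest "I'm ok" message from this node.
        self.last_seen[node] = now

    def suspected_failed(self, now):
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )

monitor = HeartbeatMonitor(timeout=3.0)
monitor.heard_from("r1", now=0.0)  # r1 is fail-stop: silent after t=0
monitor.heard_from("r2", now=0.0)
monitor.heard_from("r2", now=4.0)  # r2 still says "I'm ok"; suppose it forwards nothing
suspects = monitor.suspected_failed(now=5.0)
```

Only `r1` is suspected. Nothing in the heartbeat exchange reveals whether `r2` actually forwards packets, so a device that passes the "I'm ok" protocol without doing its job goes unnoticed by this detector.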

Slide6

Example: email

If a mail relay agent fails to receive mail, go back to DNS and find another agent.

However:

If it receives mail but does not forward it, no detection of failure.

No way to tell the operator of the agent that it has failed. Just wait until a human figures it out.
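
The two cases can be contrasted in a toy simulation. No real DNS or SMTP is involved: the `Relay` class, the `send` function, and the relay names are hypothetical, and the ordered relay list simply stands in for whatever DNS would return.

```python
class Relay:
    def __init__(self, name, accepts=True, forwards=True):
        self.name = name
        self.accepts = accepts
        self.forwards = forwards

    def submit(self, message, delivered):
        if not self.accepts:
            raise ConnectionError(f"{self.name} did not accept the message")
        if self.forwards:
            delivered.append(message)
        # A relay that accepts but does not forward fails silently here.

def send(message, relays, delivered):
    """Try relays in preference order; return the name of the first that accepts."""
    for relay in relays:
        try:
            relay.submit(message, delivered)
            return relay.name
        except ConnectionError:
            continue  # visible failure: fall back to the next relay
    return None

down = Relay("mx1", accepts=False)
good = Relay("mx2")
black_hole = Relay("mx1", forwards=False)

delivered_ok = []
first = send("hello", [down, good], delivered_ok)

delivered_lost = []
second = send("hello", [black_hole, good], delivered_lost)
```

In the first run the sender sees the refusal and falls back to `mx2`, so the message arrives. In the second run the sender believes `mx1` succeeded, never tries `mx2`, and the message is lost with no signal to anyone.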

Slide7

Security makes it worse

Attacks do not normally manifest as simple failures.

Comcast attacked BitTorrent by injecting resets.

Redirection of packets due to rerouting.

Encryption can prevent disclosure.

Encryption turns attacks on integrity into a successful attack on availability.

Users care about availability. One reason they “click through” warnings.

Slide8

Detecting failure

By the end-to-end principle, the only locus that can, in general, detect failures is the end-points.

They understand what the correct function is.

But how can they localize the problem?

And how can they avoid the affected region?

And can they tell anyone about the failure?
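
One way to make the localization question concrete is a sketch in which an end-point compares paths that worked with paths that did not. It rests on strong assumptions that the slides do not make: the end-point knows which links each path used, exactly one link has failed, and any link on a working path is healthy. The function name `suspect_links` and the link labels are hypothetical.

```python
def suspect_links(observations):
    """Return links that lie on every failed path and on no successful path.

    observations: list of (path, ok) pairs, where path is a list of link names.
    """
    known_good = set()
    for path, ok in observations:
        if ok:
            known_good.update(path)

    suspects = None
    for path, ok in observations:
        if not ok:
            candidates = set(path) - known_good
            # Under the single-failure assumption, intersect across failed paths.
            suspects = candidates if suspects is None else suspects & candidates
    return suspects or set()

observations = [
    (["a-b", "b-d"], False),
    (["a-c", "c-d"], True),
    (["a-b", "b-e"], False),
]
suspects = suspect_links(observations)
```

Here the only link shared by both failed paths is `a-b`, so the end-point could try to avoid it. Without path knowledge or control over routing, which end-points usually lack, neither step is possible.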

Slide9

Is this logic true in general?

Perhaps for some architecture, this problem is mitigated by design.

Any failure that affects correct delivery can be detected by the network.

No end-node correction required.

Or there is a well-defined end-node response to all classes of failure.

Is there an exhaustive classification of failure modes?

Encryption may help.

Does not prevent failures.

Reduces the kinds of failures.

Slide10

Quantification

Are there aspects of availability that are amenable to quantification?

Can we talk in a meaningful way about a system that is “more” available?

Are any such measures useful to compare different architectural approaches with respect to availability?
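
As one possible starting point, and not a measure proposed in the slides, availability could be computed as the fraction of an observation window not lost to outages, with each outage weighted by the share of users it affected. The function name, the record layout, and the sample numbers below are all made up for illustration.

```python
def weighted_availability(outages, window):
    """Fraction of the window that was not lost to outages.

    outages: list of (start, end, fraction_affected) records, with start and
    end in the same time unit as window. Assumes the records do not overlap
    and all fall inside the window.
    """
    lost = sum((end - start) * fraction for start, end, fraction in outages)
    return 1.0 - lost / window

minutes_per_day = 1440
# Hypothetical system A: one full 30-minute outage and one half outage.
system_a = weighted_availability([(10, 40, 1.0), (500, 530, 0.5)], minutes_per_day)
# Hypothetical system B: one full two-hour outage.
system_b = weighted_availability([(0, 120, 1.0)], minutes_per_day)
more_available = "A" if system_a > system_b else "B"
```

A single number like this lets two designs be ranked, but it hides what was unreachable and for whom.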

Slide11

Metrics of availability

Is there enough redundancy to allow reconfiguration?

Cut-sets as a metric.

But how does this apply if the definition of success is access to a service or to content?

Outage: the opposite of available.

How much went down for how long?

To what part of the Internet?
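
To illustrate the cut-set metric mentioned above, the sketch below counts the smallest number of links whose removal separates two nodes, using a unit-capacity max-flow computation (by the max-flow min-cut theorem the two values are equal). The function name `min_cut_size` and the example topology are assumptions, and the graph is treated as undirected.

```python
from collections import deque

def min_cut_size(edges, src, dst):
    """Smallest number of links whose removal disconnects src from dst."""
    if src == dst:
        raise ValueError("src and dst must differ")
    cap, adj = {}, {}
    for u, v in edges:
        # Model each undirected link as two directed arcs of capacity 1.
        for a, b in ((u, v), (v, u)):
            cap[(a, b)] = cap.get((a, b), 0) + 1
            adj.setdefault(a, set()).add(b)
    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in parent and cap[(node, nxt)] > 0:
                    parent[nxt] = node
                    queue.append(nxt)
        if dst not in parent:
            return flow  # no augmenting path left: flow equals the min cut
        # Push one unit of flow back along the path that was found.
        node = dst
        while parent[node] is not None:
            prev = parent[node]
            cap[(prev, node)] -= 1
            cap[(node, prev)] += 1
            node = prev
        flow += 1

edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
cut_a_to_d = min_cut_size(edges, "A", "D")
cut_a_to_e = min_cut_size(edges, "A", "E")
```

Two link failures are needed to separate A from D, but a single failure of the D-E link isolates E. The number describes topology only; it says nothing about whether a service or piece of content reachable through those links is itself working.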

Slide12

Definitions of availability

How do regulators define availability?

Current discussion at FCC, etc.

Builds (perhaps improperly) on definitions from phone era

Is the network “available” if your access ISP is working?

How do availability and censorship relate?

How much of the Internet must be reachable for it to be “available”?

Slide13

What can architecture do?

Can architectural features improve the components of availability?

Detect and localize faults at the network layer.

Provide means to reconfigure.

Must not be a new attack vector.

Allow reporting of failed components.

To which part of the ecosystem?

Is there a relation between architecture and redundancy?

Slide14

At every level…

Design at every level must build in detection, localization, reconfiguration, recovery.

A higher level may be designed to recover from unrecovered failures at lower layers.

Applications can be available even in the face of some lack of lower-level availability.

At what layer should availability be evaluated?

If you can call 911, is the phone system available?
