Implementing A Simple Storage Case

conchita-marotz
Uploaded On 2017-03-20

Presentation Transcript

Slide1

Implementing A Simple Storage Case

Consider a simple case for distributed storage
I want to back up files from machine A on machine B
Avoids many tricky issues
  Multiple nodes
  Multiple writers
  Most synchronization
How might I design such a system?

Slide2

Distributed Backup

Keeping it simple, there’s only one live machine
  Machine A
And one machine holding the backups
  Machine B
Periodically I want to back up A’s files on B
Occasionally I want to fetch a backed up file from B to A

Slide3

Illustrating the Situation

[Diagram: Machine A and Machine B]

We want to copy files from A to B on command

Preferably only files not yet on B (or files that have changed)

Stored on B in a way that will make it possible to recover them

Slide4

Basic Approach

Separate processes run on A and B
B runs a server process
  Always active
  Waiting for requests from A
A runs a process on demand
  When the user wants to perform backup
A’s process contacts B’s to tell it what to do

Slide5

The Process On A (Backing Up)

Needs to know what should be backed up
Three choices:
  A stores information about what was already backed up
  A asks B for information on what was already backed up
  A sends B an inventory of what A has and B asks for what it needs
Which is best?
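
As one way to picture the third choice, here is a minimal sketch in which each side summarizes its files as a mapping from path to content digest, and B works out what to request. The function names and the use of SHA-256 digests are assumptions made for illustration; the slides do not prescribe an inventory format.

```python
import hashlib
import os


def build_inventory(root):
    """Map each file's path (relative to root) to a SHA-256 digest of its contents."""
    inventory = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            inventory[os.path.relpath(full, root)] = digest
    return inventory


def files_needed(inventory_a, inventory_b):
    """Return the paths B should ask for: missing on B, or different from A's copy."""
    return sorted(
        path
        for path, digest in inventory_a.items()
        if inventory_b.get(path) != digest
    )
```

The same comparison would serve the first two choices as well; what changes is which machine stores the second inventory and which one runs `files_needed`.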

Slide6

Illustrating the Choices for A

1. A keeps an inventory

2. A asks B

3. A tells B what he’s got

Slide7

The Process on B (Backing Up)

Server process waiting for A to contact it
  Or should it be pro-active . . . ?
Assuming A knows what is to be backed up,
  Should A send B an inventory of what’s coming?
  Or should A simply feed files to B, one by one?

Slide8

Illustrating The Choices

[Diagram: A sends files foo, bar, foobar, . . . to B under each option]

1. Ship an inventory first

2. Just send the files

Slide9

Backing Up One File

A decides file foo needs to be backed up
Does he:
  Send all of file foo at once?
  Figure out what’s changed in file foo since he last backed it up and only send the diffs?
In the second case, how does A know what to do?
  And how does he tell it to B?
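
One possible answer to the diff question, sketched under stated assumptions: A remembers a digest for each fixed-size block of the copy it last backed up, then sends only the blocks whose digest has changed, each tagged with its block index. The block size, the function names, and the (index, block) message shape are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_digests, new_data, block_size=BLOCK_SIZE):
    """Return (index, block) pairs for blocks of new_data that differ from the old copy."""
    changes = []
    for index, digest in enumerate(block_digests(new_data, block_size)):
        if index >= len(old_digests) or old_digests[index] != digest:
            block = new_data[index * block_size:(index + 1) * block_size]
            changes.append((index, block))
    return changes
```

This sketch has known gaps: A would also have to tell B the new file length so that B can truncate a file that shrank, and inserting a single byte near the start shifts every later block, so nearly the whole file gets resent.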

Slide10

Sending the Entire File

Assuming A is going to send the entire file, does he:
  Send a message telling B what he’s going to do (e.g., name of file, length of file, etc.) and wait for B to say “OK”
  Just start sending pieces of the file (maybe with control info in the first message?)
Should A use TCP or UDP?
Can B ask for flow control?
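
The first option (announce, wait for "OK", then send) might look like the sketch below over a reliable byte stream such as TCP. The JSON header, the 4-byte length prefix, and every function name here are assumptions for illustration, not a protocol taken from the slides. `demo()` runs both ends in one process over a connected socket pair.

```python
import json
import socket
import struct
import threading


def send_msg(sock, payload):
    """Send one length-prefixed message (4-byte big-endian length, then payload)."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def send_file(sock, name, data):
    """A's side: announce the file, wait for B's OK, then send the contents."""
    header = json.dumps({"name": name, "length": len(data)}).encode()
    send_msg(sock, header)
    if recv_msg(sock) != b"OK":
        raise RuntimeError("B refused the transfer")
    sock.sendall(data)


def receive_file(sock):
    """B's side: read the announcement, accept it, then read exactly `length` bytes."""
    header = json.loads(recv_msg(sock))
    send_msg(sock, b"OK")
    return header["name"], recv_exact(sock, header["length"])


def demo():
    """Run A and B in one process over a connected stream socket pair."""
    a_sock, b_sock = socket.socketpair()
    received = {}
    worker = threading.Thread(
        target=lambda: received.update(result=receive_file(b_sock))
    )
    worker.start()
    send_file(a_sock, "foo", b"contents of foo")
    worker.join()
    a_sock.close()
    b_sock.close()
    return received["result"]
```

Because the length travels in the header, B knows when foo ends without a separate end-of-file message. With UDP, A and B would also have to handle lost and reordered pieces themselves.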

Slide11

Sending Multiple Files

Should A send all of file foo before sending file bar?
  When does he start sending bar?
    After the last part of foo is sent?
    After B tells him foo is taken care of?
Or start sending multiple files off at once?
  All in parallel?
  Some number >1 but less than all?
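
The "some number >1 but less than all" option can be sketched with a bounded worker pool. Here `send_one` is a hypothetical callback that transfers a single file and returns when B has dealt with it, and `max_in_flight` is an assumed tuning knob.

```python
from concurrent.futures import ThreadPoolExecutor


def send_all(files, send_one, max_in_flight=4):
    """Send every file using at most max_in_flight concurrent transfers."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # map() yields results in input order, even if transfers finish out of order.
        return list(pool.map(send_one, files))
```

Setting `max_in_flight=1` gives strictly sequential sending, and setting it to the number of files gives the all-in-parallel option, so one knob covers the range of choices on the slide.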

Slide12

Handling An Incoming File on B

File foo starts arriving on B
What does B do?
  Buffer all the data of foo till he has a complete file?
  Write the contents of each incoming message to the appropriate place in the file on disk?
  Fill a fixed size buffer up, then write it to disk?
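
The fixed-size-buffer option could be sketched as follows. The class name, the default buffer size, and the `flushes` counter are assumptions added for illustration. Python file objects already buffer writes by default, so the explicit buffer here only makes the design choice visible.

```python
import io


class BufferedFileWriter:
    """Collect incoming pieces in memory and write to `out` each time the buffer fills."""

    def __init__(self, out, buffer_size=64 * 1024):
        self.out = out
        self.buffer_size = buffer_size
        self.buffer = bytearray()
        self.flushes = 0  # number of writes issued to `out`, kept for illustration

    def feed(self, piece):
        """Accept one incoming piece; write out every full buffer's worth."""
        self.buffer.extend(piece)
        while len(self.buffer) >= self.buffer_size:
            self._write(self.buffer[:self.buffer_size])
            del self.buffer[:self.buffer_size]

    def close(self):
        """Write whatever is left once the sender says the file is complete."""
        if self.buffer:
            self._write(self.buffer)
            self.buffer.clear()

    def _write(self, data):
        self.out.write(bytes(data))
        self.flushes += 1
```

For example, feeding the pieces `b"ab"`, `b"cde"`, `b"fghij"` into a writer with `buffer_size=4` issues two full-buffer writes, and `close()` writes the final two bytes. The trade-off is bounded memory use on B against losing up to one buffer of data if B fails before a write.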

Slide13

How Does B Handle Multiple File Transfers?

What is used to indicate to B that file foo is done and file bar is coming?
Does B confirm handling of an entire file to A?
  Even if A is using TCP?
If multiple files are moving at once, how does B tell which part belongs to which file?
Can B request that A slow down?
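
One way B could tell which part belongs to which file is for A to tag every piece with a file id. In the sketch below, each frame carries a 2-byte file id and a 4-byte payload length, and a zero-length frame marks the end of that file. The frame layout and the end-of-file convention are assumptions, not something the slides specify.

```python
import struct

FRAME_HEADER = struct.Struct("!HI")  # file id (2 bytes), payload length (4 bytes)


def encode_frame(file_id, payload):
    """Build one frame: header followed by the payload bytes."""
    return FRAME_HEADER.pack(file_id, len(payload)) + payload


def demultiplex(stream):
    """Split a byte string of frames into per-file contents and the set of finished ids."""
    files, finished = {}, set()
    offset = 0
    while offset < len(stream):
        file_id, length = FRAME_HEADER.unpack_from(stream, offset)
        offset += FRAME_HEADER.size
        data = files.setdefault(file_id, bytearray())
        if length == 0:
            finished.add(file_id)  # zero-length frame: this file is complete
        else:
            data.extend(stream[offset:offset + length])
            offset += length
    return {fid: bytes(data) for fid, data in files.items()}, finished
```

A file whose id is not in `finished` when the connection ends is known to be incomplete, which bears on the failure questions as well.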

Slide14

What Records Does B Keep?

Nothing but the files themselves?
The files plus an inventory of what he’s got?
A transcript of a backup session?
  If so, how many kept and for how long?

Slide15

How About Failures?

What happens if B fails during backup?
  During the setup phase?
  During transfer of one/multiple files?
What happens if A fails during backup?
  During setup phase?
  While transferring files?
In either case, what happens on recovery?
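
One common technique for the "fails during transfer" cases, offered here as a sketch and not as the slides' answer: B writes incoming data to a temporary file and renames it into place only after the last piece arrives, so a crash never leaves a half-written file under the real name. The function name and the `.partial` suffix are hypothetical, and `name` is assumed to be a plain file name with no directory part.

```python
import os
import tempfile


def store_atomically(directory, name, chunks):
    """Write chunks to a temporary file, then rename it into place once complete."""
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=name + ".", suffix=".partial"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # os.replace overwrites any previous backup of the file in a single step.
        os.replace(tmp_path, os.path.join(directory, name))
    except BaseException:
        # The transfer did not finish: discard the partial file and keep the old backup.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process itself dies, the cleanup code never runs, so recovery on B would also need to sweep up leftover `.partial` files, and A would still need some way to learn which files never completed.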

Slide16

Restoring Files

Does A ask for restore of full set of files?
  Or for selected files?
How does A know what is backed up on B?
  Does A keep records?
  Does he ask B for an inventory?
How does A handle moving an individual file?
  Akin to the questions about B receiving it

Slide17

Restoration From B’s Side

Just on a per-file basis?
  I.e., A asks for what he wants and B provides it
Or some big “whole set” backup?
  Done one file at a time or in parallel?
Does A provide flow control information to B?
Does B keep records about restorations?

Slide18

Failures During Restoration

What if A fails in the middle of a restore operation?
  What happens to files being restored at time of failure?
What if B fails in the middle of a restore operation?
  What’s different than the actions taken on A’s failure?
What happens on recovery?
  Complete restart or from where the failure occurred?

Slide19

Extending the System

What if B is a backup machine servicing several other machines?
  How does B’s software differ?
  Do the machines B services run the same software as before?
  Or does their client software need to be altered?

Slide20

Going From Here

We will start talking about these design questions in class
Think about what you consider the best answers to each
Be prepared to suggest alternatives and discuss why to use them
Concentrate on distributed systems aspects
  But some local issues may impact the distributed issues