Implementing A Simple Storage Case
Consider a simple case for distributed storage
I want to back up files from machine A on machine B
Avoids many tricky issues
  Multiple nodes
  Multiple writers
  Most synchronization
How might I design such a system?
Distributed Backup
Keeping it simple, there’s only one live machine
  Machine A
And one machine holding the backups
  Machine B
Periodically I want to back up A’s files on B
Occasionally I want to fetch a backed up file from B to A
Illustrating the Situation
[Figure: Machine A and Machine B]
We want to copy files from A to B on command
Preferably only files not yet on B (or files that have changed)
Stored on B in a way that will make it possible to recover them
Basic Approach
Separate processes run on A and B
B runs a server process
  Always active
  Waiting for requests from A
A runs a process on demand
  When the user wants to perform backup
A’s process contacts B’s to tell it what to do
The Process On A (Backing Up)
Needs to know what should be backed up
Three choices:
  A stores information about what was already backed up
  A asks B for information on what was already backed up
  A sends B an inventory of what A has and B asks for what it needs
Which is best?
Illustrating the Choices for A
1. A keeps an inventory
2. A asks B
3. A tells B what he’s got
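The third choice (A sends an inventory, B asks for what it needs) can be sketched as follows. This is a minimal illustration, not a recommended answer to the design question. The inventory format (relative path mapped to a SHA-256 digest) and the function names build_inventory and files_needed are assumptions made for the example.

```python
import hashlib
import os


def build_inventory(root):
    """Map each relative file path under root to a SHA-256 digest of its contents."""
    inventory = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            inventory[os.path.relpath(full, root)] = digest
    return inventory


def files_needed(a_inventory, b_inventory):
    """Return the sorted paths that are missing on B or whose contents differ."""
    return sorted(
        path for path, digest in a_inventory.items()
        if b_inventory.get(path) != digest
    )
```

The same comparison works for the other two choices; what changes is which machine stores the second inventory and which machine runs files_needed.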
The Process on B (Backing Up)
Server process waiting for A to contact it
  Or should it be pro-active . . . ?
Assuming A knows what is to be backed up,
  Should A send B an inventory of what’s coming?
  Or should A simply feed files to B, one by one?
Illustrating The Choices
[Figure: two ways for A to send files foo, bar, foobar, . . . to B]
1. Ship an inventory first
2. Just send the files
Backing Up One File
A decides file foo needs to be backed up
Does he:
  Send all of file foo at once?
  Figure out what’s changed in file foo since he last backed it up and only send the diffs?
In the second case, how does A know what to do?
  And how does he tell it to B?
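One possible way for A to know what changed is to remember a hash per fixed-size block from the last backup and compare it against the current contents. The sketch below assumes that approach. The block size and the names block_hashes and changed_blocks are illustrative choices, not something the slides prescribe.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return (block_index, block_bytes) for blocks that differ from old_hashes."""
    changes = []
    for index, digest in enumerate(block_hashes(new_data, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            changes.append((index, new_data[start:start + block_size]))
    return changes
```

A would then tell B the block index and new bytes for each change, plus the new file length so B can truncate a file that shrank. Note the weakness of fixed blocks: inserting a byte near the start shifts every later block, so almost the whole file looks changed.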
Sending the Entire File
Assuming A is going to send the entire file, does he:
  Send a message telling B what he’s going to do (e.g., name of file, length of file, etc.) and wait for B to say “OK”
  Just start sending pieces of the file (maybe with control info in the first message?)
Should A use TCP or UDP?
Can B ask for flow control?
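Either option needs some agreed layout for the control information. As a sketch, assume the first message carries the file name and length in a small binary header. The layout below (2-byte name length, 8-byte file length, then the UTF-8 name) is invented for illustration.

```python
import struct

# Assumed header layout: name length (2 bytes), file length (8 bytes), network byte order.
HEADER = struct.Struct("!HQ")


def encode_header(name, size):
    """Build the announcement A sends before the file contents."""
    encoded = name.encode("utf-8")
    return HEADER.pack(len(encoded), size) + encoded


def decode_header(buf):
    """Parse an announcement on B; return (name, size, bytes_consumed)."""
    name_len, size = HEADER.unpack_from(buf)
    start = HEADER.size
    name = buf[start:start + name_len].decode("utf-8")
    return name, size, start + name_len
```

Because the header states the length, B knows exactly how many bytes belong to the file even on a TCP byte stream, which has no message boundaries of its own.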
Sending Multiple Files
Should A send all of file foo before sending file bar?
When does he start sending bar?
  After the last part of foo is sent?
  After B tells him foo is taken care of?
Or start sending multiple files off at once?
  All in parallel?
  Some number >1 but less than all?
Handling An Incoming File on B
File foo starts arriving on B
What does B do?
  Buffer all the data of foo till he has a complete file?
  Write the contents of each incoming message to the appropriate place in the file on disk?
  Fill a fixed size buffer up, then write it to disk?
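The third option (fill a fixed-size buffer, then write it to disk) might look like the sketch below. The class name BufferedFileWriter, the default capacity, and the flushes counter are assumptions added for illustration. The out argument is any writable binary file object.

```python
import io


class BufferedFileWriter:
    """Collect incoming chunks and write to out only in capacity-sized pieces."""

    def __init__(self, out, capacity=64 * 1024):
        self.out = out
        self.capacity = capacity
        self.buffer = bytearray()
        self.flushes = 0  # number of writes issued, kept only to show the effect

    def receive(self, chunk):
        """Add one incoming message; write out every full buffer's worth."""
        self.buffer.extend(chunk)
        while len(self.buffer) >= self.capacity:
            self.out.write(bytes(self.buffer[:self.capacity]))
            del self.buffer[:self.capacity]
            self.flushes += 1

    def close(self):
        """Write whatever is left once the sender says the file is complete."""
        if self.buffer:
            self.out.write(bytes(self.buffer))
            self.buffer.clear()
            self.flushes += 1
```

This bounds B's memory use per file regardless of file size, at the cost of losing up to one buffer of data if B fails before the next write.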
How Does B Handle Multiple File Transfers?
What is used to indicate to B that file foo is done and file bar is coming?
Does B confirm handling of an entire file to A?
  Even if A is using TCP?
If multiple files are moving at once, how does B tell which part belongs to which file?
Can B request that A slow down?
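One way to answer both the "which part belongs to which file" and the "foo is done" questions is to tag every piece with a file id and an end-of-file flag. The frame layout and the names encode_frame and demultiplex below are assumptions for this sketch only.

```python
import struct

# Assumed frame layout: file id (4 bytes), end-of-file flag (1 byte), payload length (4 bytes).
FRAME = struct.Struct("!IBI")


def encode_frame(file_id, payload, last=False):
    """Wrap one piece of a file so B can tell which file it belongs to."""
    return FRAME.pack(file_id, 1 if last else 0, len(payload)) + payload


def demultiplex(stream):
    """Split a byte string of frames into ({file_id: contents}, set of finished ids)."""
    files, finished, offset = {}, set(), 0
    while offset < len(stream):
        file_id, last, length = FRAME.unpack_from(stream, offset)
        offset += FRAME.size
        files.setdefault(file_id, bytearray()).extend(stream[offset:offset + length])
        offset += length
        if last:
            finished.add(file_id)
    return {fid: bytes(data) for fid, data in files.items()}, finished
```

A file whose id never appears in the finished set is incomplete, which gives B something concrete to report back to A or to clean up after a failure.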
What Records Does B Keep?
Nothing but the files themselves?
The files plus an inventory of what he’s got?
A transcript of a backup session?
  If so, how many kept and for how long?
How About Failures?
What happens if B fails during backup?
  During the setup phase?
  During transfer of one/multiple files?
What happens if A fails during backup?
  During setup phase?
  While transferring files?
In either case, what happens on recovery?
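A common local technique that simplifies recovery on B is to receive each file under a temporary name and rename it only once it is complete. The sketch below assumes that approach. The function name store_atomically and the ".incoming-" prefix are made up for the example.

```python
import os
import tempfile


def store_atomically(directory, name, chunks):
    """Write chunks to a temp file in directory, then rename it to name when complete."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".incoming-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before it becomes visible
        final_path = os.path.join(directory, name)
        os.replace(tmp_path, final_path)  # replaces any older copy in one step
        return final_path
    except BaseException:
        # The transfer failed part way: discard the partial temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this scheme a failure during transfer never leaves a half-written file under its real name, so the previous backup of foo stays intact. If B itself crashes, leftover ".incoming-" files can be deleted on restart.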
Restoring Files
Does A ask for restore of full set of files?
  Or for selected files?
How does A know what is backed up on B?
  Does A keep records?
  Does he ask B for an inventory?
How does A handle moving an individual file?
  Akin to the questions about B receiving it
Restoration From B’s Side
Just on a per-file basis?
  I.e., A asks for what he wants and B provides it
Or some big “whole set” backup?
  Done one file at a time or in parallel?
Does A provide flow control information to B?
Does B keep records about restorations?
Failures During Restoration
What if A fails in the middle of a restore operation?
  What happens to files being restored at time of failure?
What if B fails in the middle of a restore operation?
  What’s different than the actions taken on A’s failure?
What happens on recovery?
  Complete restart or from where the failure occurred?
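Restarting "from where the failure occurred" could be sketched as A checking how much of the file it already has and asking B for the rest. In the sketch below, read_from is a hypothetical callable standing in for the request to B: given a byte offset, it yields the remaining bytes of the file. The sketch also assumes the bytes already on A's disk are intact and that B's copy has not changed since the failed attempt.

```python
import os


def resume_offset(partial_path):
    """Return how many bytes of the file A already holds (0 if none)."""
    return os.path.getsize(partial_path) if os.path.exists(partial_path) else 0


def resume_restore(partial_path, read_from):
    """Append the missing tail of a file; return the offset the restore resumed from."""
    offset = resume_offset(partial_path)
    with open(partial_path, "ab") as f:
        for chunk in read_from(offset):  # read_from is a stand-in for asking B
            f.write(chunk)
    return offset
```

A complete restart is simply the case where the partial file is deleted first, so the offset is 0.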
Extending the System
What if B is a backup machine servicing several other machines?
How does B’s software differ?
Do the machines B services run the same software as before?
  Or does their client software need to be altered?
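One difference in B's software is that files from different machines must not collide. A minimal sketch, assuming B keeps each client's files under its own directory: the "backups" root, the client ids, and the function name storage_path are all hypothetical.

```python
from pathlib import PurePosixPath


def storage_path(client_id, relative_path):
    """Return where B stores relative_path for the given client."""
    if not client_id or "/" in client_id or client_id in (".", ".."):
        raise ValueError("bad client id")
    path = PurePosixPath(relative_path)
    # Reject names that could escape the client's own directory.
    if path.is_absolute() or ".." in path.parts or not path.parts:
        raise ValueError("bad path")
    return str(PurePosixPath("backups", client_id, *path.parts))
```

Under this assumption the clients could keep running the same software as before, provided each one identifies itself when it contacts B.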
Going From Here
We will start talking about these design questions in class
Think about what you consider the best answers to each
Be prepared to suggest alternatives and discuss why to use them
Concentrate on distributed systems aspects
  But some local issues may impact the distributed issues