Deadlocks: Problems and Solutions. CS 111 Operating Systems, Peter Reiher. Outline: the deadlock problem; approaches to handling the problem; handling general synchronization bugs; simplifying synchronization.
Slide1
Operating System Principles: Deadlocks – Problems and Solutions
CS 111 Operating Systems
Peter Reiher
Slide2
Outline
The deadlock problem
Approaches to handling the problem
Handling general synchronization bugs
Simplifying synchronization
Slide3
Deadlock
What is a deadlock?
  A situation where two entities have each locked some resource
  Each needs the other’s locked resource to continue
  Neither will unlock until it has locked both resources
  Hence, neither can ever make progress
Slide4
The Dining Philosophers Problem
Philosophers eat whenever they choose to
A philosopher needs two forks to eat pasta, but must pick them up one at a time
The problem demands an absolute solution
Five philosophers, five plates of pasta, five forks
Philosophers will not negotiate with one another
Slide5
Dining Philosophers and Deadlock
This problem is the classical illustration of deadlocking
It was created to illustrate deadlock problems
It is a very artificial problem
  It was carefully designed to cause deadlocks
  Changing the rules eliminates deadlocks
  But then it couldn't be used to illustrate deadlocks
Actually, one point of it is to see how changing the rules solves the problem
Slide6
Why Are Deadlocks Important?
A major peril in cooperating parallel processes
  They are relatively common in complex applications
  They result in catastrophic system failures
Finding them through debugging is very difficult
  They happen intermittently and are hard to diagnose
  They are much easier to prevent at design time
Once you understand them, you can avoid them
  Most deadlocks result from careless/ignorant design
  An ounce of prevention is worth a pound of cure
Slide7
Deadlocks May Not Be Obvious
Process resource needs are ever-changing
  Depending on what data they are operating on
  Depending on where in computation they are
  Depending on what errors have happened
Modern software depends on many services
  Most of which are ignorant of one another
  Each of which requires numerous resources
Services encapsulate much complexity
  We do not know what resources they require
  We do not know when/how they are serialized
Slide8
Deadlocks and Different Resource Types
Commodity Resources
  Clients need an amount of it (e.g., memory)
  Deadlocks result from over-commitment
  Avoidance can be done in resource manager
General Resources
  Clients need a specific instance of something
    A particular file or semaphore
    A particular message or request completion
  Deadlocks result from specific dependency relationships
  Prevention is usually done at design time
Slide9
Four Basic Conditions For Deadlocks
For a deadlock to occur, these conditions must hold:
  1. Mutual exclusion
  2. Incremental allocation
  3. No pre-emption
  4. Circular waiting
Slide10
Deadlock Condition 1: Mutual Exclusion
The resources in question can each only be used by one entity at a time
If multiple entities can use a resource, then just give it to all of them
If only one can use it, once you’ve given it to one, no one else gets it
  Until the resource holder releases it
Slide11
Deadlock Condition 2: Incremental Allocation
Processes/threads are allowed to ask for resources whenever they want
  As opposed to getting everything they need before they start
If they must pre-allocate all resources, either:
  They get all they need and run to completion
  They don’t get all they need and abort
In either case, no deadlock
Slide12
Deadlock Condition 3: No Pre-emption
When an entity has reserved a resource, you can’t take it away from that entity
  Not even temporarily
If you can, deadlocks are simply resolved by taking someone’s resource away
  To give to someone else
But if you can’t take it away from anyone, you’re stuck
Slide13
Deadlock Condition 4: Circular Waiting
A waits on B which waits on A
In graph terms, there’s a cycle in a graph of resource requests
Could involve a lot more than two entities
But if there is no such cycle, someone can complete without anyone releasing a resource
  Allowing even a long chain of dependencies to eventually unwind
  Maybe not very fast, though . . .
Slide14
A Wait-For Graph
[Figure: wait-for graph with Thread 1, Thread 2, Critical Section A, and Critical Section B]
1. Thread 1 acquires a lock for Critical Section A
2. Thread 2 acquires a lock for Critical Section B
3. Thread 1 requests a lock for Critical Section B
   We can’t give him the lock right now, but . . . no problem!
4. Thread 2 requests a lock for Critical Section A
   Hmmmm . . . deadlock!
Slide15
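The four-step scenario on the wait-for graph slide can be replayed in code. What follows is a minimal Java sketch, not something taken from the lecture: the class `WaitForGraph`, its `request` method, and the string names for threads and locks are all hypothetical, and it assumes each lock has at most one owner and each thread waits on at most one lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: records who owns each lock and who is waiting,
// and reports whether a new request would close a cycle of waiters.
class WaitForGraph {
    enum Result { GRANTED, MUST_WAIT, DEADLOCK }

    private final Map<String, String> owner = new HashMap<>();      // lock -> thread holding it
    private final Map<String, String> waitingFor = new HashMap<>(); // thread -> lock it waits for

    Result request(String thread, String lock) {
        String holder = owner.get(lock);
        if (holder == null || holder.equals(thread)) {
            owner.put(lock, thread);          // lock is free (or already ours)
            return Result.GRANTED;
        }
        // Walk the chain: holder -> lock the holder waits for -> that lock's holder ...
        String current = holder;
        while (current != null) {
            if (current.equals(thread)) {
                return Result.DEADLOCK;       // the chain leads back to the requester
            }
            String wanted = waitingFor.get(current);
            current = (wanted == null) ? null : owner.get(wanted);
        }
        waitingFor.put(thread, lock);         // no cycle: the requester can safely wait
        return Result.MUST_WAIT;
    }
}
```

Feeding it the slide's sequence (T1 takes A, T2 takes B, T1 asks for B, T2 asks for A) gives GRANTED, GRANTED, MUST_WAIT, DEADLOCK. The sketch refuses to record the last edge, which is one possible policy; a real system might instead block and break the deadlock later.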
Deadlock Avoidance
Use methods that guarantee that no deadlock can occur, by their nature
Advance reservations
  The problems of under/over-booking
  The Bankers’ Algorithm
Practical commodity resource management
Dealing with rejection
Reserving critical resources
Slide16
Avoiding Deadlock Using Reservations
Advance reservations for commodity resources
  Resource manager tracks outstanding reservations
  Only grants reservations if resources are available
Over-subscriptions are detected early
  Before processes ever get the resources
Client must be prepared to deal with failures
  But these do not result in deadlocks
Dilemma: over-booking vs. under-utilization
Slide17
Overbooking vs. Under-Utilization
Processes generally cannot perfectly predict their resource needs
To ensure they have enough, they tend to ask for more than they will ever need
Either the OS:
  Grants requests until everything’s reserved
    In which case most of it won’t be used
  Or grants requests beyond the available amount
    In which case sometimes someone won’t get a resource he reserved
Slide18
Handling Reservation Problems
Clients seldom need all resources all the time
All clients won't need max allocation at the same time
Question: can one safely over-book resources?
  For example, seats on an airplane
What is a “safe” resource allocation?
  One where everyone will be able to complete
  Some people may have to wait for others to complete
  We must be sure there are no deadlocks
Slide19
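The definition of a "safe" allocation above (everyone can complete, possibly after waiting for others) is the test at the heart of the Bankers' Algorithm named on the Deadlock Avoidance slide. Here is a minimal Java sketch for a single commodity resource. The names `SafeStateCheck`, `isSafe`, `held`, and `maxClaim` are assumptions made for illustration, and the sketch assumes each client declares its maximum need in advance.

```java
// Hypothetical sketch of a safe-state check for one commodity resource
// (for example, units of memory). Not the lecture's own code.
class SafeStateCheck {
    // available: units currently free
    // held[i]: units client i holds now; maxClaim[i]: most client i may ever need
    // Returns true if some completion order lets every client reach its maximum.
    static boolean isSafe(int available, int[] held, int[] maxClaim) {
        boolean[] done = new boolean[held.length];
        int free = available;
        int finished = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int i = 0; i < held.length; i++) {
                // Client i can finish if its remaining need fits in what is free.
                if (!done[i] && maxClaim[i] - held[i] <= free) {
                    free += held[i];   // on completion it returns everything it held
                    done[i] = true;
                    finished++;
                    progress = true;
                }
            }
        }
        return finished == held.length;
    }
}
```

With 3 free units, holdings of {5, 2, 2} and maximum claims of {10, 4, 9}, the state is safe: the second client can finish first, then the first, then the third. If the third client held 3 units and only 2 were free, no completion order works and the check returns false, so a resource manager following this approach would refuse the grant that leads there.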
Commodity Resource Management in Real Systems
Advance reservation mechanisms are common
  Memory reservations
  Disk quotas, Quality of Service contracts
Once granted, system must guarantee reservations
  Allocation failures only happen at reservation time
  Hopefully before the new computation has begun
  Failures will not happen at request time
  System behavior more predictable, easier to handle
But clients must deal with reservation failures
Slide20
Dealing With Reservation Failures
Resource reservation eliminates deadlock
Apps must still deal with reservation failures
  Application design should handle failures gracefully
    E.g., refuse to perform new request, but continue running
  App must have a way of reporting failure to requester
    E.g., error messages or return codes
  App must be able to continue running
    All critical resources must be reserved at start-up time
Slide21
Isn’t Rejecting App Requests Bad?
It’s not great, but it’s better than failing later
With advance notice, app may be able to adjust service not to need the unavailable resource
If app is in the middle of servicing a request, we may have other resources allocated
  And the request half-performed
  If we fail then, all of this will have to be unwound
  Could be complex, or even impossible
Slide22
System Services and Reservations
System services must never deadlock for memory
Potential deadlock: swap manager
  Invoked to swap out processes to free up memory
  May need to allocate memory to build I/O request
  If no memory available, unable to swap out processes
  So it can’t free up memory, and system wedges
Solution:
  Pre-allocate and hoard a few request buffers
  Keep reusing the same ones over and over again
  Little bit of hoarded memory is a small price to pay to avoid deadlock
That’s just one example system service, of course
Slide23
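The swap manager's "pre-allocate and hoard" solution amounts to a fixed pool of buffers that is filled once and never grows. A minimal Java sketch follows; the class `RequestBufferPool` and its method names are hypothetical, and a real kernel would do this in its own allocator rather than with Java collections.

```java
import java.util.ArrayDeque;

// Hypothetical sketch: all buffers are allocated at start-up, so taking one
// later never needs fresh memory and cannot wedge on an allocation.
class RequestBufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();

    RequestBufferPool(int count, int size) {
        for (int i = 0; i < count; i++) {
            free.push(new byte[size]);   // the only place memory is allocated
        }
    }

    // Returns a hoarded buffer, or null if all are in use. Never allocates.
    synchronized byte[] take() {
        return free.poll();
    }

    // Returns a buffer to the pool so it can be reused over and over.
    synchronized void giveBack(byte[] buffer) {
        free.push(buffer);
    }

    synchronized int available() {
        return free.size();
    }
}
```

When `take()` returns null the caller has to wait for an outstanding request to finish and return its buffer; that wait depends only on I/O completing, not on memory becoming free, which is what removes the circular dependency.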
Deadlock Prevention
Deadlock avoidance tries to ensure no lock ever causes deadlock
Deadlock prevention tries to ensure that a particular lock doesn’t cause deadlock
By attacking one of the four necessary conditions for deadlock
If any one of these conditions doesn’t hold, no deadlock
Slide24
Four Basic Conditions For Deadlocks
For a deadlock to occur, these conditions must hold:
  1. Mutual exclusion
  2. Incremental allocation
  3. No pre-emption
  4. Circular waiting
Slide25
1. Mutual Exclusion
Deadlock requires mutual exclusion
  P1 having the resource precludes P2 from getting it
You can't deadlock over a shareable resource
  Perhaps maintained with atomic instructions
  Even reader/writer locking can help
    Readers can share, writers may be handled other ways
You can't deadlock on your private resources
  Can we give each process its own private resource?
Slide26
2. Incremental Allocation
Deadlock requires you to block holding resources while you ask for others
Allocate all of your resources in a single operation
  If you can’t get everything, system returns failure and locks nothing
  When you return, you have all or nothing
Non-blocking requests
  A request that can't be satisfied immediately will fail
Disallow blocking while holding resources
  You must release all held locks prior to blocking
  Reacquire them again after you return
Slide27
Releasing Locks Before Blocking
Could be blocking for a reason not related to resource locking
How can releasing locks before you block help?
Won’t the deadlock just occur when you attempt to reacquire them?
  When you reacquire them, you will be required to do so in a single all-or-none transaction
  Such a transaction does not involve hold-and-block, and so cannot result in a deadlock
Slide28
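The "single all-or-none transaction" described above can be sketched with non-blocking requests. This is a minimal Java illustration rather than the lecture's mechanism: modelling each resource as a binary `java.util.concurrent.Semaphore` is an assumption, and `AllOrNothing`, `acquireAll`, and `releaseAll` are hypothetical names. `Semaphore.tryAcquire()` returns immediately instead of blocking.

```java
import java.util.List;
import java.util.concurrent.Semaphore;

// Hypothetical sketch: take every resource or none, never blocking while
// holding something. Each resource is modelled as a one-permit semaphore.
class AllOrNothing {
    static boolean acquireAll(List<Semaphore> resources) {
        int taken = 0;
        for (Semaphore r : resources) {
            if (!r.tryAcquire()) {
                // One resource is busy: undo everything taken so far and fail.
                for (int i = 0; i < taken; i++) {
                    resources.get(i).release();
                }
                return false;
            }
            taken++;
        }
        return true;   // the caller now holds every resource in the list
    }

    static void releaseAll(List<Semaphore> resources) {
        for (Semaphore r : resources) {
            r.release();
        }
    }
}
```

Because a failed attempt leaves the caller holding nothing, there is no hold-and-block, so this request cannot be part of a deadlock. The cost is that the caller must be ready to retry or report failure, and repeated retries can still starve.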
3. No Pre-emption
Deadlock can be broken by resource confiscation
  Resource “leases” with time-outs and “lock breaking”
  Resource can be seized & reallocated to new client
Revocation must be enforced
  Invalidate previous owner's resource handle
  If revocation is not possible, kill previous owner
Some resources may be damaged by lock breaking
  Previous owner was in the middle of critical section
  May need mechanisms to audit/repair resource
Resources must be designed with revocation in mind
Slide29
When Can The OS “Seize” a Resource?
When it can revoke access by invalidating a process’ resource handle
  If process has to use a system service to access the resource, that service can no longer honor requests
When is it not possible to revoke a process’ access to a resource?
  If the process has direct access to the object
    E.g., the object is part of the process’ address space
  Revoking access requires destroying the address space
    Usually killing the process
Slide30
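Leases with time-outs, and revocation by having the service stop honoring an old handle, can be sketched as follows. This is a minimal Java illustration under stated assumptions: `LeasedResource`, `acquire`, and `use` are hypothetical names, clients are identified by strings, and time is passed in explicitly as a `long` so the behavior is easy to follow.

```java
// Hypothetical sketch: every access goes through this service object, so once
// a lease expires the service simply refuses the previous owner's requests.
class LeasedResource {
    private String owner;      // current lease holder, or null if never leased
    private long expiresAt;    // time at which the current lease ends

    // Grants a lease unless another lease is still valid at time 'now'.
    synchronized boolean acquire(String client, long now, long duration) {
        if (owner != null && now < expiresAt) {
            return false;                  // still validly held by someone
        }
        owner = client;                    // an expired lease is seized here
        expiresAt = now + duration;
        return true;
    }

    // A request is honored only for the current owner of an unexpired lease.
    synchronized boolean use(String client, long now) {
        return client.equals(owner) && now < expiresAt;
    }
}
```

The sketch only covers the case the slide calls revocable: the client reaches the resource through a service. It does nothing about a previous owner that was interrupted mid-update, which is why the slides note that resources may need audit and repair mechanisms.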
4. Circular Dependencies
Use total resource ordering
  All requesters allocate resources in same order
  First allocate R1 and then R2 afterwards
  Someone else may have R2 but he doesn't need R1
Assumes we know how to order the resources
  Order by resource type (e.g., groups before members)
  Order by relationship (e.g., parents before children)
May require a lock dance
  Release R2, allocate R1, reacquire R2
Slide31
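Total resource ordering, as described on the circular dependencies slide, can be sketched by giving every resource a rank and always locking the lower rank first. This is a minimal Java illustration: the `rank` field, `OrderedResource`, and `withBoth` are hypothetical names, and the sketch assumes ranks are unique.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a resource with a position in the global lock order.
class OrderedResource {
    final int rank;                                  // assumed unique per resource
    final ReentrantLock lock = new ReentrantLock();

    OrderedResource(int rank) {
        this.rank = rank;
    }
}

class OrderedLocking {
    // Locks both resources in rank order regardless of argument order,
    // runs the action, then unlocks in reverse order.
    static void withBoth(OrderedResource x, OrderedResource y, Runnable action) {
        OrderedResource first = (x.rank <= y.rank) ? x : y;
        OrderedResource second = (first == x) ? y : x;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                action.run();
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

Two threads calling `withBoth(r1, r2, ...)` and `withBoth(r2, r1, ...)` both take `r1` first, so neither can hold `r2` while waiting for `r1`, and the cycle in the wait-for graph cannot form.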
Lock Dances
[Figure: a list head pointing to a chain of buffers]
list head must be locked for searching, adding & deleting
individual buffers must be locked to perform I/O & other operations
To avoid deadlock, we must always lock the list head before we lock an individual buffer.
To find a desired buffer:
  read lock list head
  search for desired buffer
  lock desired buffer
  unlock list head
  return (locked) buffer
To delete a (locked) buffer from list:
  unlock buffer
  write lock list head
  search for desired buffer
  lock desired buffer
  remove from list
  unlock list head
Slide32
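The find and delete steps on the Lock Dances slide map fairly directly onto Java's `ReentrantReadWriteLock` (for the list head) and `ReentrantLock` (for each buffer). The sketch below is an illustration under assumptions, not the lecture's code: `Buffer`, `BufferList`, and the integer buffer ids are hypothetical, and releasing the buffer lock after removal is a choice the slide leaves open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class Buffer {
    final int id;
    final ReentrantLock lock = new ReentrantLock();

    Buffer(int id) {
        this.id = id;
    }
}

// Hypothetical sketch. Lock order is always: list head first, then a buffer.
class BufferList {
    private final ReentrantReadWriteLock headLock = new ReentrantReadWriteLock();
    private final List<Buffer> buffers = new ArrayList<>();

    void add(Buffer b) {
        headLock.writeLock().lock();
        try {
            buffers.add(b);
        } finally {
            headLock.writeLock().unlock();
        }
    }

    // Returns the buffer with its lock held, or null if it is not in the list.
    Buffer find(int id) {
        headLock.readLock().lock();
        try {
            for (Buffer b : buffers) {
                if (b.id == id) {
                    b.lock.lock();       // list head is already held: order respected
                    return b;
                }
            }
            return null;
        } finally {
            headLock.readLock().unlock();
        }
    }

    // Precondition: the caller holds b.lock (for example, from find()).
    boolean delete(Buffer b) {
        b.lock.unlock();                 // the dance: give up the buffer ...
        headLock.writeLock().lock();     // ... so the list head can be taken first
        try {
            if (!buffers.contains(b)) {
                return false;            // someone else removed it in the meantime
            }
            b.lock.lock();               // reacquire in the correct order
            try {
                return buffers.remove(b);
            } finally {
                b.lock.unlock();
            }
        } finally {
            headLock.writeLock().unlock();
        }
    }
}
```

The second search inside `delete` matters: between releasing the buffer and taking the list head, another thread may have changed the list, so nothing learned before the dance can be trusted afterwards.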
An Example of Breaking Deadlocks
The problem: urban traffic gridlock
  “Resource” is the ability to pass through intersection
  Deadlock happens when nobody can get through
Slide33
Using Attack Approach 1 To Prevent Deadlock
Avoid mutual exclusion
Build overpass bridges for east/west traffic
Slide34
Using Attack Approach 2 To Prevent Deadlock
Make it illegal to enter the intersection if you can’t exit it
Thus, preventing “holding” of the intersection
Slide35
Using Attack Approach 3 To Prevent Deadlock
Allow preemption
Force some car to pull over to the side
Slide36
Using Attack Approach 4 To Prevent Deadlock
Avoid circular dependencies by decreeing a totally ordered right of way
E.g., North beats West beats South beats East
Slide37
Which Approach Should You Use?
There is no one universal solution to all deadlocks
  Fortunately, we don't need one solution for all resources
  We only need a solution for each resource
Solve each individual problem any way you can
  Make resources sharable wherever possible
  Use reservations for commodity resources
  Ordered locking or no hold-and-block where possible
  As a last resort, leases and lock breaking
OS must prevent deadlocks in all system services
  Applications are responsible for their own behavior
Slide38
One More Deadlock “Solution”
Ignore the problem
In many cases, deadlocks are very improbable
Doing anything to avoid or prevent them might be very expensive
So just forget about them and hope for the best
But what if the best doesn’t happen?
Slide39
Deadlock Detection and Recovery
Allow deadlocks to occur
Detect them once they have happened
  Preferably as soon as possible after they occur
Do something to break the deadlock and allow someone to make progress
Is this a good approach?
  Either in general or when you don’t want to avoid or prevent deadlocks?
Slide40
Implementing Deadlock Detection
Need to identify all resources that can be locked
Need to maintain wait-for graph or equivalent structure
When a lock is requested, the structure is updated and checked for deadlock
  If granting it would cause deadlock, might it not be better just to reject the lock request?
  And not let the requester block?
Slide41
Dealing With General Synchronization Bugs
Deadlock detection seldom makes sense
  It is extremely complex to implement
  Only detects true deadlocks for a known resource
  Not always clear cut what you should do if you detect one
Service/application health monitoring is better
  Monitor application progress/submit test transactions
  If response takes too long, declare service “hung”
Health monitoring is easy to implement
It can detect a wide range of problems
  Deadlocks, live-locks, infinite loops & waits, crashes
Slide42
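The rule "if response takes too long, declare service hung" can be sketched as a heartbeat check. This is a minimal Java illustration under assumptions: `HeartbeatMonitor`, `beat`, and `isHung` are hypothetical names, and timestamps are passed in as plain `long` milliseconds rather than read from a clock, which keeps the logic visible.

```java
// Hypothetical sketch: the monitored service calls beat() whenever it makes
// progress (for example, after answering a test request); the monitor calls
// isHung() periodically and declares the service hung after a silent period.
class HeartbeatMonitor {
    private final long timeoutMillis;
    private volatile long lastBeat;   // written by the service, read by the monitor

    HeartbeatMonitor(long timeoutMillis, long now) {
        this.timeoutMillis = timeoutMillis;
        this.lastBeat = now;
    }

    void beat(long now) {
        lastBeat = now;
    }

    boolean isHung(long now) {
        return now - lastBeat > timeoutMillis;
    }
}
```

Nothing here knows why the service went quiet, which is the point of the slide: the same check fires for a deadlock, a live-lock, an infinite loop, or a crash. Choosing the timeout is the hard part, since too short a value declares slow but healthy services hung.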
Related Problems Health Monitoring Can Handle
Live-lock
  Process is running, but won't free R1 until it gets message
  Process that will send the message is blocked for R1
Sleeping Beauty, waiting for “Prince Charming”
  A process is blocked, awaiting some completion that will never happen
Priority inversion hangs
  Which we talked about before
None of these is a true deadlock
  Wouldn't be found by deadlock detection algorithm
  All leave the system just as hung as a deadlock
Health monitoring handles them
Slide43
How To Monitor Process Health
Look for obvious failures
  Process exits or core dumps
Passive observation to detect hangs
  Is process consuming CPU time, or is it blocked?
  Is process doing network and/or disk I/O?
External health monitoring
  “Pings”, null requests, standard test requests
Internal instrumentation
  White box audits, exercisers, and monitoring
Slide44
What To Do With “Unhealthy” Processes?
Kill and restart “all of the affected software”
How many and which processes to kill?
  As many as necessary, but as few as possible
  The hung processes may not be the ones that are broken
How will kills and restarts affect current clients?
  That depends on the service APIs and/or protocols
  Apps must be designed for cold/warm/partial restarts
Highly available systems define restart groups
  Groups of processes to be started/killed as a group
  Define inter-group dependencies (restart B after A)
Slide45
Failure Recovery Methodology
Retry if possible ... but not forever
  Client should not be kept waiting indefinitely
  Resources are being held while waiting to retry
Roll-back failed operations and return an error
Continue with reduced capacity or functionality
  Accept requests you can handle, reject those you can't
Automatic restarts (cold, warm, partial)
Escalation mechanisms for failed recoveries
  Restart more groups, reboot more machines
Slide46
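"Retry if possible, but not forever" followed by "roll back and return an error" can be sketched in a few lines. This is a minimal Java illustration; `BoundedRetry`, `attempt`, and the idea of passing the rollback as a `Runnable` are assumptions made for the example.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: try an operation a bounded number of times; if it
// never succeeds, undo any partial work and report failure to the caller.
class BoundedRetry {
    static boolean attempt(int maxAttempts, BooleanSupplier operation, Runnable rollback) {
        for (int i = 0; i < maxAttempts; i++) {
            if (operation.getAsBoolean()) {
                return true;       // success: nothing to roll back
            }
        }
        rollback.run();            // give up: release whatever was being held
        return false;              // the caller turns this into an error reply
    }
}
```

A production version would normally pause between attempts and put an overall deadline on the loop, since the client is kept waiting and resources stay held for as long as the retries continue.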
Making Synchronization Easier
Locks, semaphores, mutexes are hard to use correctly
  Might not be used when needed
  Might be used incorrectly
  Might lead to deadlock, livelock, etc.
We need to make synchronization easier for programmers
  But how?
Slide47
One Approach
We identify shared resources
  Objects whose methods may require serialization
We write code to operate on those objects
  Just write the code
  Assume all critical sections will be serialized
Compiler generates the serialization
  Automatically generated locks and releases
  Using appropriate mechanisms
  Correct code in all required places
Slide48
Monitors: Protected Classes
Each monitor class has a semaphore
  Automatically acquired on method invocation
  Automatically released on method return
  Automatically released/acquired around CV waits
Good encapsulation
  Developers need not identify critical sections
  Clients need not be concerned with locking
  Protection is completely automatic
High confidence of adequate protection
Slide49
Monitors: Use
monitor CheckBook {
    // class is locked when any method is invoked
    private int balance;
    public int balance() {
        return(balance);
    }
    public int debit(int amount) {
        balance -= amount;
        return(balance);
    }
}
Slide50
Monitors: Simplicity vs. Performance
Monitor locking is very conservative
  Lock the entire class (not merely a specific object)
  Lock for entire duration of any method invocations
This can create performance problems
  They eliminate conflicts by eliminating parallelism
  If a thread blocks in a monitor, a convoy can form
TANSTAAFL
  Fine-grained locking is difficult and error-prone
  Coarse-grained locking creates bottlenecks
Slide51
Evaluating Monitors
Correctness
  Complete mutual exclusion is assured
Fairness
  Semaphore queue prevents starvation
Progress
  Inter-class dependencies can cause deadlocks
Performance
  Coarse grained locking is not scalable
Slide52
Java Synchronized Methods
Each object has an associated mutex
  Acquired before calling a synchronized method
  Nested calls (by same thread) do not reacquire
  Automatically released upon final return
Static synchronized methods lock class mutex
Advantages
  Finer lock granularity, reduced deadlock risk
Costs
  Developer must identify serialized methods
Slide53
Using Java Synchronized Methods
class CheckBook {
    private int balance;
    public int balance() {
        return(balance);
    }
    // object is locked when this method is invoked
    public synchronized int debit(int amount) {
        balance -= amount;
        return(balance);
    }
}
Slide54
Evaluating Java Synchronized Methods
Correctness
  Correct if developer chose the right methods
Fairness
  Priority thread scheduling (potential starvation)
Progress
  Safe from single thread deadlocks
Performance
  Fine grained (per object) locking
  Selecting which methods to synchronize