Lecture 5: Synchronization. Topics: synchronization primitives and optimizations

danya . @danya

Uploaded On 2021-10-08


Presentation Transcript

1. Lecture 5: Synchronization
Topics: synchronization primitives and optimizations

2. Synchronization
The simplest hardware primitive that greatly facilitates synchronization implementations (locks, barriers, etc.) is an atomic read-modify-write.
Atomic exchange: swap contents of register and memory.
Special case of atomic exchange: test & set: transfer memory location into register and write 1 into memory.

    lock:   t&s  register, location
            bnz  register, lock
            CS                          ; critical section
            st   location, #0

3. Improving Lock Algorithms
The basic lock implementation is inefficient because the waiting process is constantly attempting writes, which causes heavy invalidate traffic.
Test & Set with exponential back-off: if you fail again, double your wait time and try again.
Test & Test & Set: read the value; if it has not changed, don't bother doing the test&set, so heavy bus traffic occurs only when the lock is released.
Different implementations trade off one of these lock properties: latency, traffic, scalability, storage, fairness.

4. Load-Linked and Store Conditional
LL-SC is an implementation of atomic read-modify-write with very high flexibility.
LL: read a value and update a table indicating you have read this address, then perform any amount of computation.
SC: attempt to store a result into the same memory location; the store will succeed only if the table indicates that no other process attempted a store since the local LL.
SC implementations may not generate bus traffic if the SC fails; hence, more efficient than test&test&set.

5. Load-Linked and Store Conditional

    lockit: LL      R2, 0(R1)     ; load linked, generates no coherence traffic
            BNEZ    R2, lockit    ; not available, keep spinning
            DADDUI  R2, R0, #1    ; put value 1 in R2
            SC      R2, 0(R1)     ; store-conditional succeeds if no one
                                  ; updated the lock since the last LL
            BEQZ    R2, lockit    ; confirm that SC succeeded, else keep trying

6. Further Reducing Bandwidth Needs
Even with LL-SC, heavy traffic is generated on a lock release and there are no fairness guarantees.
Ticket lock: every arriving process atomically picks up a ticket and increments the ticket counter (with an LL-SC). The process then keeps checking the now-serving variable to see if its turn has arrived; after finishing its turn, it increments the now-serving variable. Is this really better than the LL-SC implementation?
Array-based lock: instead of using a now-serving variable, use a now-serving array, and each process waits on a different variable: fair, low latency, low bandwidth, high scalability, but higher storage.

7. Barriers
Barriers require each process to execute a lock and unlock to increment the counter and then spin on a shared variable.
If multiple barriers use the same variable, deadlock can arise because some process may not have left the earlier barrier; sense-reversing barriers can solve this problem.
A tree can be employed to reduce contention for the lock and shared variable.
When one process issues a read request, other processes can snoop and update their invalid entries.

8. Barrier Implementation

    LOCK(bar.lock);
    if (bar.counter == 0)
        bar.flag = 0;
    mycount = ++bar.counter;    /* pre-increment, so the last of p arrivals sees mycount == p */
    UNLOCK(bar.lock);
    if (mycount == p) {
        bar.counter = 0;
        bar.flag = 1;
    }
    else
        while (bar.flag == 0) { };

9. Sense-Reversing Barrier Implementation

    local_sense = !(local_sense);
    LOCK(bar.lock);
    mycount = ++bar.counter;    /* pre-increment, so the last of p arrivals sees mycount == p */
    UNLOCK(bar.lock);
    if (mycount == p) {
        bar.counter = 0;
        bar.flag = local_sense;
    }
    else {
        while (bar.flag != local_sense) { };
    }

10. Tit
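
The two ideas on the "Improving Lock Algorithms" slide, Test & Test & Set and exponential back-off, can be combined in one spin lock. The following is a minimal sketch in C11 atomics, not the lecture's own code: the names `ttas_lock_t`, `ttas_try_acquire`, `next_backoff`, and the back-off bounds are hypothetical, and the delay loop is a crude busy-wait chosen only to keep the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type: false = lock free, true = lock taken. */
typedef struct {
    atomic_bool held;
} ttas_lock_t;

/* Assumed back-off bounds, in iterations of an empty delay loop. */
enum { BACKOFF_MIN = 1, BACKOFF_MAX = 1024 };

/* One attempt: plain read first ("test"), atomic exchange only if the
 * lock looked free ("test&set"). Returns true if the lock was acquired. */
static bool ttas_try_acquire(ttas_lock_t *l) {
    if (atomic_load_explicit(&l->held, memory_order_relaxed))
        return false;   /* still taken: no write, so no invalidations */
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

/* Double the wait time, capped at BACKOFF_MAX. */
static unsigned next_backoff(unsigned delay) {
    return delay >= BACKOFF_MAX / 2 ? BACKOFF_MAX : delay * 2;
}

static void ttas_acquire(ttas_lock_t *l) {
    unsigned delay = BACKOFF_MIN;
    while (!ttas_try_acquire(l)) {
        for (volatile unsigned i = 0; i < delay; i++) { }  /* wait */
        delay = next_backoff(delay);                        /* then wait longer */
    }
}

static void ttas_release(ttas_lock_t *l) {
    assert(atomic_load_explicit(&l->held, memory_order_relaxed));
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

A caller would initialize `held` to false (for example with `atomic_init`), then bracket the critical section with `ttas_acquire` and `ttas_release`. The read in `ttas_try_acquire` is what keeps waiters spinning in their own caches until the holder's release invalidates the line.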
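
The ticket lock from the "Further Reducing Bandwidth Needs" slide can be sketched with C11 atomics as follows. This is an illustration under assumptions, not the lecture's implementation: `ticket_lock_t` and the function names are hypothetical, and `atomic_fetch_add_explicit` stands in for the LL-SC sequence the slide mentions (on LL/SC machines a compiler may lower it to such a loop).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical type holding the two variables named on the slide. */
typedef struct {
    atomic_uint next_ticket;   /* the ticket counter */
    atomic_uint now_serving;   /* ticket currently allowed in */
} ticket_lock_t;

static void ticket_init(ticket_lock_t *l) {
    atomic_init(&l->next_ticket, 0);
    atomic_init(&l->now_serving, 0);
}

/* Atomically pick up a ticket and increment the ticket counter. */
static unsigned ticket_take(ticket_lock_t *l) {
    return atomic_fetch_add_explicit(&l->next_ticket, 1, memory_order_relaxed);
}

/* Spin with plain reads until our ticket is being served. */
static void ticket_wait(ticket_lock_t *l, unsigned my_ticket) {
    while (atomic_load_explicit(&l->now_serving, memory_order_acquire) != my_ticket) { }
}

static void ticket_acquire(ticket_lock_t *l) {
    ticket_wait(l, ticket_take(l));
}

/* Only the holder writes now_serving, so a load plus a store is enough. */
static void ticket_release(ticket_lock_t *l) {
    unsigned cur = atomic_load_explicit(&l->now_serving, memory_order_relaxed);
    /* Debug check: at least one ticket must be outstanding. */
    assert(cur != atomic_load_explicit(&l->next_ticket, memory_order_relaxed));
    atomic_store_explicit(&l->now_serving, cur + 1, memory_order_release);
}
```

Tickets are served in the order they were taken, which is the fairness guarantee the plain LL-SC lock lacks. All waiters still spin on the same `now_serving` variable, so each release still invalidates every waiter's copy; the array-based lock on the same slide addresses that by giving each waiter its own location.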
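
For comparison with the "Sense-Reversing Barrier Implementation" slide, here is a hedged C11 sketch of the same idea. It swaps the slide's LOCK/UNLOCK pair for a single atomic fetch-and-add on the counter, which is a substitution made for this example and not what the slide shows. The names `barrier_t`, `barrier_init`, and `barrier_wait` are hypothetical, and `local_sense` is assumed to be per-thread state that starts out false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type mirroring bar.counter and bar.flag from the slide. */
typedef struct {
    atomic_uint counter;   /* arrivals in the current barrier episode */
    atomic_bool flag;      /* release flag, compared against local_sense */
    unsigned    p;         /* number of participating threads */
} barrier_t;

static void barrier_init(barrier_t *b, unsigned p) {
    assert(p > 0);
    atomic_init(&b->counter, 0);
    atomic_init(&b->flag, false);
    b->p = p;
}

/* local_sense points at the calling thread's private sense bit. */
static void barrier_wait(barrier_t *b, bool *local_sense) {
    *local_sense = !*local_sense;   /* flip the sense for this episode */

    /* fetch_add returns the old value; +1 gives this thread's arrival rank. */
    unsigned mycount =
        atomic_fetch_add_explicit(&b->counter, 1, memory_order_acq_rel) + 1;

    if (mycount == b->p) {
        /* Last arrival: reset the counter, then release everyone. */
        atomic_store_explicit(&b->counter, 0, memory_order_relaxed);
        atomic_store_explicit(&b->flag, *local_sense, memory_order_release);
    } else {
        /* Spin until the flag matches this episode's sense. */
        while (atomic_load_explicit(&b->flag, memory_order_acquire) != *local_sense) { }
    }
}
```

Because each episode waits for the flag to equal a freshly flipped sense bit, a thread that races ahead into the next barrier cannot be released by the previous episode's flag value, which is the reuse problem described on the "Barriers" slide.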