Techniques and Structures in Concurrent Programming
Wilfredo Velazquez
Outline
Basics of Concurrency
Concepts and Terminology
Advantages and Disadvantages
Amdahl’s Law
Synchronization Techniques
Concurrent Data Structures
Parallel Correctness
Threading APIs
Basics of Concurrency
A concurrent program is one in which two or more of its modules or sections run in separate processes or threads
Historically given little attention
Concurrent programs are much more difficult to reason about and implement
The physical limits of modern processors are being reached; Moore's Law no longer applies
Instead of faster processors, use more of them
Concepts and Terminology
Process
A ‘program’, which has its own memory space, stack, etc.
Difficult to communicate between processes – Message-Passing Communication
Thread
A ‘sub-program’
Threads share their parent process's resources – the same memory space, open files, etc. (though each thread has its own stack)
Easy to communicate between threads – Shared-Memory Communication
Concepts and Terminology
Concurrent Program
Processes/threads that execute tasks in no defined order relative to each other
Essentially covers all multi-process/multi-threaded programs
Parallelism
Processes/threads that execute completely simultaneously
Parallelism is more readily applied to sections of a program
Impossible in single-core processors (those still exist?)
Increased parallelism = more processors used
Atomic action
An action (instruction) that either happens, completely without interruption, or not at all
For many purposes, the idea that an action ‘looks’ atomic is enough to classify it as such
Advantages and Disadvantages
Advantages:
Concurrent Programs + More Processors = Faster Programs
Some problems more easily described in parallel environments
General Multitasking
Non-Determinism
Disadvantages:
Concurrent Programs + Few Processors = Slower Programs
Most problems more difficult to implement in parallel environments
Non-Determinism
Amdahl’s Law
Relates a program's speed-up to the number of processors added
Has very limiting implications
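The slides do not reproduce the formula, so for reference, Amdahl's Law in its standard form, where P is the fraction of the program that can be parallelized and N is the number of processors:

    S(N) = \frac{1}{(1 - P) + \frac{P}{N}}

As N grows without bound, S(N) approaches 1/(1 - P). For example, a program that is 90% parallelizable (P = 0.9) can never exceed a 10x speed-up, no matter how many processors are added; hence the very limiting implications.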
Outline
Basics of Concurrency
Synchronization Techniques
Mutual Exclusion and Locks
The Mighty C.A.S.
Lock-free and Wait-free Algorithms
Transactional Algorithms
Concurrent Data Structures
Threading APIs
Synchronization Techniques
These are techniques that assure program correctness in areas where the non-determinism inherent in a concurrent environment would cause undesirable behavior
Example: Let T1 and T2 be threads, x be a shared variable between them
x = 0; //initially
T1::x++;
T2::x++;
Value of x?
Synchronization Techniques
x++ becomes
read x;
add 1;
write x;
So T1 and T2’s instructions could occur in the following order:
T1::read x //reading 0
T2::read x //reading 0
T1::add 1 //0+1
T2::add 1 //0+1
T1::write x //writing 1
T2::write x //writing 1
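A minimal C sketch (mine, not from the slides) that exhibits this lost update; compiled with cc -pthread, it will often print a total below 200000:

    #include <pthread.h>
    #include <stdio.h>

    static int x = 0;                 /* shared, unsynchronized */

    static void *increment(void *arg) {
        for (int i = 0; i < 100000; i++)
            x++;                      /* read x; add 1; write x */
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, increment, NULL);
        pthread_create(&t2, NULL, increment, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("x = %d\n", x);        /* often < 200000: updates were lost */
        return 0;
    }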
Mutual Exclusion and Locks
Algorithm that allows only one thread to execute a certain ‘area’ of code at a time
It essentially ‘locks out’ all other threads from accessing the area, thus ‘mutex’ and ‘lock’ are typically used synonymously
Varying algorithms exist for implementation, differing in robustness and performance
Typically easy to reason about their use
High overhead compared to other synchronization techniques
Can cause problems such as Deadlock, Livelock, and Starvation
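A minimal sketch of the earlier x++ race repaired with a pthreads mutex (pthreads itself is covered at the end of the talk):

    #include <pthread.h>

    static int x = 0;
    static pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *increment(void *arg) {
        pthread_mutex_lock(&x_lock);   /* all other threads are locked out */
        x++;                           /* critical section */
        pthread_mutex_unlock(&x_lock); /* release for the next thread */
        return NULL;
    }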
The Mighty C.A.S.
Compare And Swap
Native instruction on many modern multiprocessors
Widely used in synchronizing threads
Cheap, compared to using locking algorithms
Expensive compared to plain loads and stores, as it uses a hardware lock
Vulnerable to the ABA problem
boolean CAS(memoryLocation, old, new)
{
    if (*memoryLocation == old)
    {
        *memoryLocation = new;
        return true;
    }
    return false;
}
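As an illustration (mine, not the slides'), the usual CAS idiom applied to the earlier x++ race, sketched with C11 atomics standing in for the hardware instruction:

    #include <stdatomic.h>

    static atomic_int x = 0;

    static void atomic_increment(void) {
        int old = atomic_load(&x);
        /* If another thread changed x first, the CAS fails,
           'old' is refreshed to the current value, and we retry. */
        while (!atomic_compare_exchange_weak(&x, &old, old + 1))
            ;
    }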
Lock-Free and Wait-Free Algorithms
Wait-Free Algorithm
An algorithm is defined to be ‘wait-free’ if it guarantees that for any number of threads, all of them will make progress in a finite number of steps
Deadlock-free, Livelock-free, Starvation-free
Lock-Free Algorithm
An algorithm is defined to be ‘lock-free’ if it guarantees that for any number of threads, at least one will make progress in a finite number of steps
Deadlock-free, Livelock-free
All wait-free algorithms are also lock-free, though not vice versa
Note that neither definition actually forbids the use of locks, thus a lock-free algorithm could be implemented with locks
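A sketch (mine, using C11 atomics) of the distinction: the CAS retry loop is lock-free but not wait-free, because an unlucky thread can lose the race indefinitely, while a hardware fetch-and-add is wait-free, because every call completes in a bounded number of steps:

    #include <stdatomic.h>

    static atomic_int counter = 0;

    /* Lock-free: some thread always makes progress, but a given
       thread may lose the CAS race an unbounded number of times. */
    static void inc_lock_free(void) {
        int old = atomic_load(&counter);
        while (!atomic_compare_exchange_weak(&counter, &old, old + 1))
            ;  /* retry */
    }

    /* Wait-free on hardware with fetch-and-add: every caller
       finishes in a bounded number of steps, with no retry loop. */
    static void inc_wait_free(void) {
        atomic_fetch_add(&counter, 1);
    }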
Transactional Algorithms
Inspired by database systems
Gather data from memory locations (optional)
Make local changes to the locations
Commit changes to the actual locations as an atomic step
If commit fails (another transaction occurred), start again
Essentially a generalization of CAS, except that no prior knowledge of the data is needed (for CAS we needed an ‘expected’ value)
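One way to approximate this pattern with nothing but CAS is copy-on-write on a shared pointer; this sketch (mine; real systems such as the WSTM/OSTM designs cited at the end generalize it to many independent locations) shows the gather/modify/commit/retry cycle:

    #include <stdatomic.h>
    #include <stdlib.h>

    typedef struct { int balance_a, balance_b; } accounts;

    static _Atomic(accounts *) shared;

    void transfer(int amount) {
        for (;;) {
            accounts *old = atomic_load(&shared);   /* gather data */
            accounts *copy = malloc(sizeof *copy);
            *copy = *old;                           /* local changes */
            copy->balance_a -= amount;
            copy->balance_b += amount;
            if (atomic_compare_exchange_weak(&shared, &old, copy))
                break;                              /* atomic commit */
            free(copy);      /* another transaction occurred: retry */
        }
        /* 'old' is leaked here; real systems need safe reclamation. */
    }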
Outline
Basics of Concurrency
Synchronization Techniques
Concurrent Data Structures
Safety and Liveness Properties
Differing Semantics
Threading APIs
Concurrent Data Structures
In sequential programming, data structures are invaluable as programming abstractions as they:
Provide abstraction of the inner-workings via interfaces
Provide a set of properties and guarantees as per what happens when certain operations are performed
Increase modularity of code
In concurrent programming they provide similar benefits, in addition to:
Allowing threads to communicate in a simple and maintainable manner
Serving as a focal point for the work done by multiple threads
Safety and Liveness Properties
Safety
Assures that ‘nothing bad will happen’, for example, two calls to the ‘push’ function of a stack should result in two elements being added to the stack
Liveness
Assures that progress continues
Deadlock
Livelock
Starvation
All bad!
Differing Semantics
Structures must share properties and guarantees with the sequential versions which they mimic, thus their operations must be deterministic (with a few exceptions)
Semantics of use and implementation differ greatly purely due to the concurrent environment
Example:
The result obtained from popping the stack is non-deterministic, even though the implementations of the interface operations themselves are deterministic
Differing Semantics
So how can we write the program in such a way that it is well-behaved for our purposes?
De-Facto standard: Use a lock
Parallelism suffers, as other threads may not operate at all during the entire given section of code
Introduces liveness problems
Constructing Concurrent Data Structures
A concurrent data structure must abide by its sequential counterpart's properties and guarantees when operations are performed on it
It must be ‘thread-safe’: no matter how many parallel calls are made to it, the data structure will never be corrupted
It should be free from any liveness issues such as Deadlock
Just as sequential data structures are built for abstraction, concurrent ones should be opaque in their implementation
Constructing Concurrent Data Structures
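The original slide showed code here that the transcript lost; a minimal sketch of the kind of sequential, linked-list stack it presumably depicted (my reconstruction, in C):

    #include <stdlib.h>

    typedef struct node {
        void *value;
        struct node *next;
    } node;

    typedef struct stack {
        node *top;
    } stack;

    void push(stack *s, void *value) {
        node *n = malloc(sizeof *n);
        n->value = value;
        n->next = s->top;   /* sequential assumption: top cannot */
        s->top = n;         /* change between these two steps    */
    }

    void *pop(stack *s) {
        node *n = s->top;
        if (n == NULL)
            return NULL;
        s->top = n->next;   /* same assumption here */
        void *v = n->value;
        free(n);
        return v;
    }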
Constructing Concurrent Data Structures
The sequential version of this data structure
Not suitable as-is for concurrent programming
Lacks any safety properties, though it has no liveness issues
How can we resolve the issue?
Lock it
Constructing Concurrent Data Structures
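Again the slide's code is missing from the transcript; a sketch of the locked version (reusing node from the sketch above and assuming pthreads):

    #include <pthread.h>

    typedef struct locked_stack {
        node *top;
        pthread_mutex_t lock;
    } locked_stack;

    void locked_push(locked_stack *s, void *value) {
        node *n = malloc(sizeof *n);
        n->value = value;
        pthread_mutex_lock(&s->lock);   /* one thread at a time */
        n->next = s->top;
        s->top = n;
        pthread_mutex_unlock(&s->lock);
    }

    void *locked_pop(locked_stack *s) {
        pthread_mutex_lock(&s->lock);
        node *n = s->top;
        void *v = NULL;
        if (n != NULL) {
            s->top = n->next;
            v = n->value;
        }
        pthread_mutex_unlock(&s->lock);
        free(n);                        /* free(NULL) is a no-op */
        return v;
    }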
Constructing Concurrent Data Structures
Safety is no longer a concern, though liveness now is
Deadlock possible should a thread die during execution
Starvation in case of an interrupt
Lock overhead will overwhelm applications with many pushes/pops
Look back at the original implementation; what sequential assumptions were made? (push)
Constructing Concurrent Data Structures
Correct, but an original property is lost: pushing onto the stack does not always place the element on the stack
Easy solution: Keep trying
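A sketch of the resulting push (my reconstruction, using C11 atomics as a stand-in for the slide's CAS): the CAS fails whenever another thread changed top first, and the loop simply keeps trying:

    #include <stdatomic.h>

    typedef struct lf_stack {
        _Atomic(node *) top;    /* node as defined earlier */
    } lf_stack;

    void lf_push(lf_stack *s, void *value) {
        node *n = malloc(sizeof *n);
        n->value = value;
        node *old = atomic_load(&s->top);
        do {
            n->next = old;      /* re-link against the current top */
        } while (!atomic_compare_exchange_weak(&s->top, &old, n));
    }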
Constructing Concurrent Data Structures
Pop implemented using the same logic:
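A matching sketch (same caveats as the push above):

    void *lf_pop(lf_stack *s) {
        node *old = atomic_load(&s->top);
        do {
            if (old == NULL)
                return NULL;    /* empty stack */
        } while (!atomic_compare_exchange_weak(&s->top, &old, old->next));
        /* 'old' is deliberately not freed: safe memory reclamation,
           and the ABA problem mentioned earlier, are exactly what
           make a production lock-free pop hard. */
        return old->value;
    }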
Outline
Basics of Concurrency
Synchronization Techniques
Concurrent Data Structures
Threading APIs
pthreads
M.C.A.S., W.S.T.M., O.S.T.M.
Threading API’s
pthreads
C library for multithreading. Contains utilities such as
mutexes
, semaphores, and others
Available on *nix platforms, though subset ports exist for windows
MCAS
A C API that allows the use of a software-built MCAS (Multiple-Compare-And-Swap) function
Very powerful, though larger overhead than CAS
WSTM
Word-Based Software Transactional Memory
API for easy use of the Transactional Model
Mixes normal objects with WSTM datatypes
Easy to implement on existing systems
OSTM
Object-Based Software Transactional Memory
Similar to WSTM, except that it is more streamlined in its implementation due to operating exclusively on its own data types
More difficult to implement on existing systems
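The pthreads sketch referenced above: thread creation, joining, and a mutex, all part of the standard pthreads API:

    #include <pthread.h>
    #include <stdio.h>

    static int total = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        pthread_mutex_lock(&lock);
        total += *(int *)arg;
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void) {
        pthread_t threads[4];
        int inputs[4] = {1, 2, 3, 4};
        for (int i = 0; i < 4; i++)
            pthread_create(&threads[i], NULL, worker, &inputs[i]);
        for (int i = 0; i < 4; i++)
            pthread_join(threads[i], NULL);
        printf("total = %d\n", total);   /* always 10 */
        return 0;
    }

Compile with cc -pthread.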
References
Concurrent Programming Without Locks
http://research.microsoft.com/en-us/um/people/tharris/papers/2007-tocs.pdf
MCAS, WSTM, and OSTM are implemented in this paper
The Art of Multiprocessor Programming, by Maurice Herlihy and Nir Shavit
http://books.google.com/books?id=pFSwuqtJgxYC&printsec=frontcover#v=onepage&q&f=false
DCAS is not a Silver Bullet for Nonblocking Algorithm Design
http://labs.oracle.com/scalable/pubs/SPAA04.pdf