Slide1
CSC 536 Lecture 2
Slide2Outline
Concurrency on the JVM (and between JVMs)
Working problem
Java concurrency tools (review)
Solution using traditional Java concurrency tools
Solution using Akka concurrency tools
Overview of Akka
Slide3Working problem
Compute the total size of all regular files stored, directly or indirectly, in a directory
sbt:FileSize> runMain sequential.Sequential C:\
[info] Running sequential.Sequential C:\
Total size: 58504721987
Time taken: 30.15794637
[Figure: example directory tree containing Users, docs, etc, ellie, sammy, foo.txt, bar.txt, xyz.txt, abc.txt]
Slide4A recursive solution
Basis step: if input is a regular file, return its size
Recursive step: if input is a directory, call function recursively on every item in the directory, add up the returned values and return the sum
(Depth-First Traversal)
Sequential.java
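The basis and recursive steps above can be sketched in Java as follows. This is a minimal sketch and not the course's Sequential.java: the class name SequentialSketch and the method totalSize are hypothetical, and it assumes java.io.File is used for the traversal (symbolic links are not treated specially).

```java
import java.io.File;

class SequentialSketch {
    // Basis step: a regular file contributes its own size.
    // Recursive step: a directory contributes the sum over all of its entries.
    static long totalSize(File file) {
        if (file.isFile()) {
            return file.length();
        }
        long total = 0;
        File[] children = file.listFiles(); // null if not a directory or not readable
        if (children != null) {
            for (File child : children) {
                total += totalSize(child); // depth-first traversal
            }
        }
        return total;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        long start = System.nanoTime();
        long total = totalSize(root);
        double seconds = (System.nanoTime() - start) / 1.0e9;
        System.out.println("Total size: " + total);
        System.out.println("Time taken: " + seconds);
    }
}
```

The traversal is single-threaded, which is why the rest of the lecture looks at ways to process subdirectories concurrently.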
Slide5Threads
We should use threads to traverse the filesystem in parallel
A thread is a “lightweight process”
A thread lives inside a process
A thread has its own:
program counter
stack
register set
A thread shares with other threads in the process:
code
global variables
Slide6Interface Runnable
Must be implemented by any class that will be executed by a thread
Implement method run() with the code the thread will run
Anonymous class example:
new Runnable() {
    public void run() {
        // code to be run by thread
    }
}
Slide7Class Thread
Encapsulates a thread of execution in a program
To execute a thread:
An instance of a Runnable class is passed as an argument when creating the thread
The thread is started with method start()
Example:
Runnable r = new Runnable() {
    public void run() {
        // code executed by thread
    }
};
new Thread(r).start();
Slide8Class Thread
Issue with threads: synchronizing access to shared data
Slide9Simple synchronization problem
Setup
A shared memory integer buffer
Two Adder threads increment the buffer 100000 times concurrently
AddersTest.java, UnsyncBuffer.java
Slide10Simple synchronization problem
Problem: race condition that causes increments to overlap
Need to synchronize (coordinate) the processes
Slide11Synchronization
Mechanisms that ensure that concurrent threads/processes do not render shared data inconsistent
The three most widely used synchronization mechanisms in centralized systems are
Semaphores
Locks
Monitors
Slide12Monitors
Monitor = set of operations + set of variables + lock
The set of variables is the monitor’s state
Variables can be accessed only by the monitor’s operations
At most one thread can be active within the monitor at a time
To execute a monitor’s operation, thread A must acquire the monitor’s lock; if the lock is held by another thread B then thread A goes into the BLOCKED STATE
When thread B releases the lock, thread A will compete with other threads to acquire the lock
Slide13Synchronization in Java
Each Java class becomes a monitor when at least one of its methods uses the synchronized modifier
The synchronized modifier is used to write code blocks and methods that require a thread to obtain a lock
Synchronization is always done with respect to an object
AddersTest.java, SyncBuffer.java
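A minimal sketch of the synchronized version of the two-adders setup. The class names SyncBufferSketch and AddersSketch are hypothetical stand-ins, not the course's SyncBuffer.java or AddersTest.java. Both methods lock the buffer object itself, so the two adders can never overlap an increment.

```java
class SyncBufferSketch {
    private int value = 0;

    // synchronized: a thread must hold this object's lock to run the method
    public synchronized void increment() {
        value++;
    }

    public synchronized int get() {
        return value;
    }
}

class AddersSketch {
    // Starts two adder threads, waits for both, and returns the final buffer value.
    static int runTwoAdders(int incrementsPerThread) {
        SyncBufferSketch buffer = new SyncBufferSketch();
        Runnable adder = new Runnable() {
            public void run() {
                for (int i = 0; i < incrementsPerThread; i++) {
                    buffer.increment();
                }
            }
        };
        Thread t1 = new Thread(adder);
        Thread t2 = new Thread(adder);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return buffer.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoAdders(100000)); // prints 200000
    }
}
```

Removing the synchronized modifier from increment() turns this back into the unsynchronized setup, where the printed value is typically smaller and varies between runs.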
Slide14Producer-Consumer example
Can get more complicated
Setup
A shared memory buffer
Producer puts objects into the buffer
Consumer reads objects from the buffer
ProducerConsumerTest.java, UnsyncBuffer.java
Slide15Producer-Consumer example
Problem: producer can over-produce, consumer can over-consume (another example of a race condition)
Need to synchronize (coordinate) the processes even more
Need to keep track of whose turn it is
Slide16Monitors (wait/notify)
Monitor = set of operations + set of variables + lock
…
If thread A holds the monitor’s lock but it is out of turn, it must release the lock and wait on the monitor’s queue (wait); its state is now WAITING
Then thread B can acquire the lock and do its job since it is its turn; when done, it must notify one or more threads that are in the WAITING state so that they start competing for the lock again (notify)
ProducerConsumerTest.java, SyncBuffer.java
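The wait/notify protocol described above can be sketched with a single-slot buffer. This is a hedged illustration: TurnBufferSketch and ProducerConsumerSketch are hypothetical names, and the boolean full is one assumed way of tracking whose turn it is. The sketch uses notifyAll() rather than notify(), which is the more conservative choice when several threads may be waiting.

```java
class TurnBufferSketch {
    private int slot;
    private boolean full = false; // false: producer's turn, true: consumer's turn

    public synchronized void put(int item) throws InterruptedException {
        while (full) {   // out of turn: release the lock and wait
            wait();
        }
        slot = item;
        full = true;
        notifyAll();     // waiting threads start competing for the lock again
    }

    public synchronized int take() throws InterruptedException {
        while (!full) {  // nothing to consume yet
            wait();
        }
        full = false;
        notifyAll();
        return slot;
    }
}

class ProducerConsumerSketch {
    // Producer puts 1..n, consumer takes n items; returns the sum the consumer saw.
    static long produceAndConsume(int n) {
        TurnBufferSketch buffer = new TurnBufferSketch();
        long[] sum = new long[1];
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) buffer.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) sum[0] += buffer.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(produceAndConsume(100)); // prints 5050
    }
}
```

The condition is re-checked in a while loop because a woken thread must compete for the lock again and may find that it is still out of turn.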
Slide17Java Memory model (before Java 5)
Before Java 5: ill-defined; problems included
a thread not seeing values written by other threads
a thread observing impossible behaviors by other threads
Java 5 and later
Monitor lock rule: a release of a lock happens before the subsequent acquire of the same lock
Volatile variable rule: a write of a volatile variable happens before every subsequent read of the same volatile variable
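A small sketch of the volatile variable rule, under the assumption of a hypothetical class VolatileFlagSketch. The writer stores into a plain field and then sets a volatile flag; once the reader observes the flag, the happens-before rule guarantees it also sees the earlier write to the plain field.

```java
class VolatileFlagSketch {
    private int data = 0;                   // plain (non-volatile) field
    private volatile boolean ready = false; // volatile flag

    static int publishAndRead() {
        VolatileFlagSketch shared = new VolatileFlagSketch();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!shared.ready) {  // volatile read
                Thread.yield();
            }
            seen[0] = shared.data;   // guaranteed to see 42
        });
        reader.start();
        shared.data = 42;            // ordinary write
        shared.ready = true;         // volatile write publishes the write above
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

Without the volatile modifier on ready, the memory model would not guarantee that the reader ever sees the flag change, so the loop might never terminate.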
Slide18Disadvantages of synchronization
Disadvantages:
Synchronization is error-prone
Synchronization blocks threads and takes time
Improper synchronization can result in deadlocks
Creating a thread is not a low-overhead operation
Too many threads slow down the system
Slide20Thread pooling
Thread pooling is a solution to the thread creation and management problem
The main idea is to create a bunch of threads in advance and have them wait for something to do
The same thread can be recycled for different operations
Thread pool components:
A blocking queue (of tasks)
A pool of (worker) threads
Slide21Blocking queue
A queue is a sequence of objects
Two basic operations:
enqueue
dequeue
Blocking queue:
A thread invoking dequeue must block if the queue is empty
A thread invoking enqueue must add an object to the queue and notify blocked threads
Blocking queue must be thread safe
Slide22Blocking Queue dequeue
To dequeue an object from the queue:
Block until the lock on the queue is obtained
If queue is empty, release lock and wait
If queue is not empty, remove the first element and return it
To enqueue an object to the queue:
Block until the lock on the queue is obtained
Add object at the end of the queue
Notify any waiting thread
BlockingQueue.java
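The dequeue and enqueue steps above map directly onto a Java monitor. The sketch below is an assumption about how such a class could look, not the course's BlockingQueue.java; the names BlockingQueueSketch and BlockingQueueDemo are hypothetical and the queue is unbounded.

```java
import java.util.LinkedList;

class BlockingQueueSketch<T> {
    private final LinkedList<T> items = new LinkedList<>();

    // Entering the synchronized method blocks until the queue's lock is obtained.
    public synchronized void enqueue(T item) {
        items.addLast(item); // add object at the end of the queue
        notifyAll();         // notify any waiting thread
    }

    public synchronized T dequeue() throws InterruptedException {
        while (items.isEmpty()) {
            wait();          // queue is empty: release the lock and wait
        }
        return items.removeFirst();
    }
}

class BlockingQueueDemo {
    // A consumer thread blocks in dequeue() until the calling thread enqueues a message.
    static String handOff(String message) {
        BlockingQueueSketch<String> queue = new BlockingQueueSketch<>();
        String[] received = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = queue.dequeue();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        queue.enqueue(message);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff("task-1")); // prints task-1
    }
}
```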
Slide23Thread Pool = threads + tasks
Thread pool = group of threads + queue of Runnable tasks
Thread pool starts by creating the group of threads
Each thread loops indefinitely
In every iteration, each thread attempts to dequeue a task from the task queue
If the task queue is empty, block on the queue
If a task is dequeued, run the task
Thread pool method execute(task) simply adds the task to the task queue
ThreadPool.java, ThreadPoolTest.java
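A minimal thread pool along the lines described above. This is a sketch, not the course's ThreadPool.java: the names ThreadPoolSketch and ThreadPoolDemo are hypothetical, the task queue is assumed to be a java.util.concurrent.LinkedBlockingQueue instead of a hand-written blocking queue, and the workers are marked as daemon threads so that the endlessly looping workers do not keep the JVM alive.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadPoolSketch {
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();

    ThreadPoolSketch(int nThreads) {
        for (int i = 0; i < nThreads; i++) {
            Thread worker = new Thread(() -> {
                while (true) {               // each worker loops indefinitely
                    try {
                        tasks.take().run();  // blocks while the task queue is empty
                    } catch (InterruptedException e) {
                        return;              // interrupted: let the worker exit
                    }
                }
            });
            worker.setDaemon(true);
            worker.start();
        }
    }

    // execute(task) simply adds the task to the task queue
    public void execute(Runnable task) {
        tasks.add(task);
    }
}

class ThreadPoolDemo {
    // Runs nTasks small tasks on nThreads workers and returns how many completed.
    static int runTasks(int nThreads, int nTasks) {
        ThreadPoolSketch pool = new ThreadPoolSketch(nThreads);
        AtomicInteger completed = new AtomicInteger(0);
        CountDownLatch allDone = new CountDownLatch(nTasks);
        for (int i = 0; i < nTasks; i++) {
            pool.execute(() -> {
                completed.incrementAndGet();
                allDone.countDown();
            });
        }
        try {
            allDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(4, 100)); // prints 100
    }
}
```

One limitation of this sketch is that a task throwing an unchecked exception terminates its worker thread; a production pool would catch and report such failures.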
Slide24Java thread pool API
Interface ExecutorService defines objects that run Runnable tasks
Using method execute()
Class Executors defines factory methods for obtaining a thread pool (i.e. an ExecutorService object)
newFixedThreadPool(n) creates a pool of n threads
ExecutorService service = Executors.newFixedThreadPool(10);
service.execute(new Runnable() {
    public void run() {
        // task code
    }
});
Slide25Back to working problem
Compute the total size of all regular files stored, directly or indirectly, in a directory
[Figure: example directory tree containing Users, docs, etc, ellie, sammy, foo.txt, bar.txt, xyz.txt, abc.txt]
Slide26Modern Java Concurrent solution
Use Runnable objects
Create Runnable object for every (sub)directory
Use thread pool
Keeps the number of threads manageable
Keep overhead of thread creation low
Reuse threads
Avoid sharing state
Variable totalSize only
Access must be synchronized
Slide27AtomicLong
Accumulator variable totalSize is incremented by all threads
Must ensure that the incrementing operation (the critical section) is not interrupted by a context switch
Solution 1: Use a Java lock to synchronize access to the critical section
Solution 2: Use class AtomicLong; its method addAndGet() executes as a single atomic operation
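A short sketch of Solution 2, assuming a hypothetical class AtomicTotalSketch. Two threads add to a shared AtomicLong without any explicit lock; addAndGet() performs the read, add and write as one atomic operation, so no update is lost.

```java
import java.util.concurrent.atomic.AtomicLong;

class AtomicTotalSketch {
    // Two threads each add `amount` to the shared total `addsPerThread` times.
    static long addFromTwoThreads(int addsPerThread, long amount) {
        AtomicLong totalSize = new AtomicLong(0);
        Runnable adder = () -> {
            for (int i = 0; i < addsPerThread; i++) {
                totalSize.addAndGet(amount); // atomic read-modify-write, no lock needed
            }
        };
        Thread t1 = new Thread(adder);
        Thread t2 = new Thread(adder);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return totalSize.get();
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(100000, 3)); // prints 600000
    }
}
```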
Slide28Modern Java Concurrent solution
Use Runnable objects
Create Runnable object for every (sub)directory
Use thread pool
Keeps the number of threads manageable
Keep overhead of thread creation low
Reuse threads
Avoid sharing state
Variable totalSize only
Access must be synchronized
Concurrent1.java
Does not work
Slide29Concurrent1 problem
The main thread must wait until all (sub)directories have been processed
No way to know when that happens
Need to:
keep track of pending tasks, i.e. (directory processing) task creation and termination
Block the main thread until the number of pending tasks is 0
Slide30Modern Java Concurrent solution
Use Runnable objects
Create Runnable object for every (sub)directory
Use thread pool
Keeps the number of threads manageable
Keep overhead of thread creation low
Reuse threads
Avoid sharing state
Variable totalSize only
Access must be synchronized
Require synchronization variables
To terminate the application
Concurrent2.java
Slide31CountDownLatch
Synchronization tool that allows one or more threads to wait until a set of operations being performed in other threads completes.
initialized with a given count
method await() blocks until count reaches 0
method countDown() decrements count by 1
After count reaches 0, any subsequent invocations of await return immediately.
A CountDownLatch initialized with a count of 1 serves as a simple on/off gate: all threads invoking await() wait at the gate until it is opened by a thread invoking countDown().
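Putting the pieces together, the sketch below shows one possible way to combine a thread pool, an AtomicLong accumulator, a pending-task counter and a CountDownLatch used as an on/off gate. It is an assumption about the overall shape of such a solution, not the course's Concurrent2.java; the class name ConcurrentSizeSketch and its members are hypothetical.

```java
import java.io.File;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

class ConcurrentSizeSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(10);
    private final AtomicLong totalSize = new AtomicLong(0);
    private final AtomicLong pending = new AtomicLong(0);      // tasks created but not finished
    private final CountDownLatch done = new CountDownLatch(1); // gate for the main thread

    private void submit(File dir) {
        pending.incrementAndGet();        // record task creation before it can run
        pool.execute(() -> process(dir));
    }

    private void process(File dir) {
        try {
            long size = 0;
            File[] children = dir.listFiles(); // null if unreadable or not a directory
            if (children != null) {
                for (File child : children) {
                    if (child.isFile()) {
                        size += child.length();
                    } else if (child.isDirectory()) {
                        submit(child);     // one task per subdirectory
                    }
                }
            }
            totalSize.addAndGet(size);
        } finally {
            // Record task termination; the last task to finish opens the gate.
            if (pending.decrementAndGet() == 0) {
                done.countDown();
            }
        }
    }

    static long compute(File root) {
        if (root.isFile()) {
            return root.length();
        }
        ConcurrentSizeSketch sketch = new ConcurrentSizeSketch();
        sketch.submit(root);
        try {
            sketch.done.await();           // block until the pending count reaches 0
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            sketch.pool.shutdown();
        }
        return sketch.totalSize.get();
    }

    public static void main(String[] args) {
        System.out.println("Total size: " + compute(new File(args.length > 0 ? args[0] : ".")));
    }
}
```

A parent task increments the pending count for each subdirectory before decrementing its own entry, so the count cannot reach 0 while any directory is still unprocessed.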
Slide32An Akka/Scala concurrent solution
Use Akka Actors
Task of processing a directory is given to a worker actor by a master actor
Worker actor processes directory
computes the total size of all the regular files and sends it to master
sends to master the (path)name of every sub-directory
Master actor
Initiates the process
sends tasks to worker actors
collects the total size
keeps track of pending tasks
ConcurrentAkka.java
Slide33Akka
Actor-based concurrency framework
Provides solutions for non-blocking concurrency
Written in Scala, but also has a Java API
Each actor has a state that is invisible to other actors
Each actor has a message queue
Actors receive and handle messages sequentially, therefore no synchronization issues
Actors should rarely block
Actors are lightweight and asynchronous
650 bytes
can have millions of actors running on a few threads on a single machine
Slide34Why use Akka in DSII?
Distributed computing
Actors do not share state and interact through messages
Actor locations (local vs remote) are transparent
Akka developed for distributed applications from ground up
Group membership
Akka Cluster provides a fault-tolerant membership service
Uses gossip protocols and automatic failure-detectors
Fault tolerance
Akka implements “let-it-crash” semantics model
Uses supervisor hierarchies that self-heal
Slide35Why use Akka in DSII?
Reliable communication
Akka includes an implementation of reactive streams
Replicated distributed data
Akka includes an implementation of Conflict Free Replicated Data Types (CRDTs).
Slide36Actors
State
Supposed to be invisible to other actors
Behavior
The actions to be taken in reaction to a message
Mailbox
Actors process messages from mailbox sequentially
Children
Actors can create other actors
A hierarchy of actors
Supervisor strategy
An actor is supervised by its parent
Slide37Actors
import akka.actor.{Actor, ActorSystem, Props}

class First extends Actor {
  def receive = {
    case "hello" => println("Hello world!")
    case msg: String => println("Got " + msg + " from " + sender)
    case _ => println("Unknown message")
  }
}

object Server extends App {
  val system = ActorSystem("FirstExample")
  val first = system.actorOf(Props[First], name = "first")
  println("The path associated with first is " + first.path)
  first ! "hello"
  first ! "Goodbye"
  first ! 4
}
First.scala
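To make the mailbox idea concrete without the Akka library, here is a plain-Java sketch of an actor-like object. It is emphatically not Akka's API: MiniActorSketch, tell and stopAndGetLog are hypothetical names, and the sketch dedicates one thread per actor, which Akka does not do. It only illustrates that private state plus a queue drained by a single thread needs no locks around the state.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MiniActorSketch {
    private static final Object STOP = new Object();       // internal shutdown marker
    private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
    private final StringBuilder log = new StringBuilder(); // state: touched only by the actor thread
    private final Thread thread = new Thread(this::loop);

    MiniActorSketch() {
        thread.start();
    }

    // Fire-and-forget send: enqueue the message and return immediately.
    public void tell(Object message) {
        mailbox.add(message);
    }

    private void loop() {
        try {
            while (true) {
                Object message = mailbox.take(); // messages are handled one at a time
                if (message == STOP) {
                    return;
                }
                receive(message);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Behavior: mirrors the three cases of the First actor.
    private void receive(Object message) {
        if ("hello".equals(message)) {
            log.append("Hello world!\n");
        } else if (message instanceof String) {
            log.append("Got ").append(message).append("\n");
        } else {
            log.append("Unknown message\n");
        }
    }

    // Stops the actor after all earlier messages are processed and returns what it logged.
    public String stopAndGetLog() {
        tell(STOP);
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        MiniActorSketch first = new MiniActorSketch();
        first.tell("hello");
        first.tell("Goodbye");
        first.tell(4);
        System.out.print(first.stopAndGetLog());
    }
}
```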
Slide38Using sbt
Simple Build Tool (http://www.scala-sbt.org/)
Easy to set up
Sample build.sbt configuration file
lazy val root = (project in file(".")).
  settings (
    name := "First Example",
    version := "1.0",
    scalaVersion := "2.12.8",
    scalacOptions in ThisBuild ++= Seq("-unchecked", "-deprecation"),
    resolvers += "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/",
    libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.5.22"
  )
Slide39Abstract Class Actor
Extend Actor class and implement method receive
Method receive should have case statements that
define the messages the actor handles
implement the logic of how messages are handled
use Scala pattern matching
class First extends Actor {
  def receive = {
    case "hello" => println("Hello world!")
    case msg: String => println("Got " + msg)
    case _ => println("Unknown message")
  }
}
Slide40Class ActorSystem
Actors form hierarchies, i.e. a system
Class ActorSystem encapsulates a hierarchy of actors
Class ActorSystem provides methods for
creating actors
looking up actors.
At least the first actor in the system is created using it
Slide41Class ActorContext
Class ActorContext also provides methods for
creating actors
looking up actors.
Each actor has its own instance of ActorContext that allows it to create (child) actors and look up references to actors
Slide42Obtaining actor references
Creating actors
ActorSystem.actorOf()
ActorContext.actorOf()
Both methods return an ActorRef reference to the new actor
Looking up existing actor by concrete path
ActorSystem.actorSelection()
ActorContext.actorSelection()
Both methods return an ActorSelection reference to the existing actor
ActorRef or ActorSelection references can be used to send a message to the actor
Slide43Class ActorRef
Immutable and serializable handle to an actor
actor could be in the same ActorSystem, a different one, or even another, remote JVM
obtained from ActorSystem (or indirectly from ActorContext)
ActorRefs can be shared among actors by message passing
you can serialize it, send it over the wire and use it on a remote host and it will still be representing the same Actor on the original node, across the network.
In fact, every message carries the ActorRef of the sender
Conversely, message passing is their only purpose
Slide44Actor System
Slide45Class Props
Props is an Actor configuration object
recipe for creating an actor including associated deployment info
Hides the instantiation of the actor so reference to it is unavailable
Used when creating new actors through
ActorSystem.actorOf
ActorContext.actorOf
Slide46Sending messages
Messages are sent to an Actor through one of
method tell or simply !
means “fire-and-forget”, i.e. send a message asynchronously and return immediately.
method ask or simply ?
sends a message asynchronously and returns a Future representing a possible reply
Message ordering is guaranteed on a per-sender basis
Tell is the preferred way of sending messages.
No blocking waiting for a message
Best concurrency and scalability characteristics
Slide47Message ordering
For a given pair of actors, messages sent from the first to the second will be received in the order they were sent
Causality between messages is not guaranteed!
Actor A sends message M1 to actor C
Actor A then sends message M2 to actor B
Actor B forwards message M2 to actor C
Actor C may receive M1 and M2 in any order
Also, message delivery is “at-most-once delivery”
i.e. no guaranteed delivery
Slide48Message ordering
Akka also guarantees
The actor send rule
The send of the message to an actor happens before the receive of that message by the same actor.
The actor subsequent processing rule
processing of one message happens before processing of the next message by the same actor.
Both rules only apply for the same actor instance and are not valid if different actors are used
Slide49Messages and immutability
Messages can be any kind of object but have to be immutable.
Scala can’t enforce immutability (yet) so this has to be by convention.
Simple types like String, Int, Boolean are always immutable.
Apart from these the recommended approach is to use Scala case classes which are immutable (if you don’t explicitly expose the state) and work great with pattern matching at the receiver side
Other good message types are scala.Tuple2, scala.List, scala.Map which are all immutable and great for pattern matching
Slide50Actor API
Scala trait (think partially implemented Java Interface) that defines one abstract method: receive()
Offers useful references:
self: reference to the ActorRef of actor
sender: reference to sender Actor of the last received message
typically used for replying to messages
context: reference to ActorContext of actor that includes references to
factory methods to create child actors (actorOf)
system that the actor belongs to
parent supervisor
supervised children
Slide51Ping Pong examples
Second.scala
Third.scala
Slide52Scala pattern matching
Scala has a built-in general pattern matching mechanism
It allows to match on any sort of data with a first-match policy
object MatchTest1 extends App {
  def matchTest(x: Int): String = x match {
    case 1 => "one"
    case 2 => "two"
    case _ => "many"
  }
  println(matchTest(3))
  println(matchTest(2))
  println(matchTest(1))
}
Slide53Scala pattern matching
Scala has a built-in general pattern matching mechanism
It allows to match on any sort of data with a first-match policy
object MatchTest2 extends App {
  def matchTest(x: Any): Any = x match {
    case 1 => "one"
    case "two" => 2
    case y: Int => "scala.Int: " + y
  }
  println(matchTest(1))
  println(matchTest("two"))
  println(matchTest(3))
  println(matchTest("four")) // no case matches: throws scala.MatchError
}
Slide54Scala case classes
Case classes are regular classes with special conveniences
automatically have factory methods with the name of the class
all constructor parameters become immutable public fields of the class
have natural implementations of toString, hashCode, and equals
are serializable by default
provide a decomposition mechanism via pattern matching
case class Start(secondPath: String)
case object PING
case object PONG
Slide55Scala pattern matching
Scala has a built-in general pattern matching mechanism
It allows to match on any sort of data with a first-match policy
case class Start(secondPath: String)
case object PING
case object PONG
object MatchTest3 extends App {
  def matchTest(x: Any): Any = x match {
    case Start(secondPath) => "got " + secondPath
    case PING => "got ping"
    case PONG => "got pong"
  }
  println(matchTest(Start("path")))
  println(matchTest(PING))
}
Slide56Scala pattern matching
Scala has a built-in general pattern matching mechanism
It allows to match on any sort of data with a first-match policy
object MatchTest4 extends App {
  def length[X](xs: List[X]): Int = xs match {
    case Nil => 0
    case y :: ys => 1 + length(ys)
  }
  println(length(List()))
  println(length(List(1, 2)))
  println(length(List("one", "two", "three")))
}
Slide57Scala pattern matching
sealed trait Op
case object OpAdd extends Op
case object OpSub extends Op
case object OpMul extends Op
case object OpDiv extends Op
sealed trait Exp
case class ExpNum(n: Double) extends Exp
case class ExpOp(e1: Exp, op: Op, e2: Exp) extends Exp
object MatchTest5 extends App {
  def evaluate(e: Exp): Double = e match {
    case ExpNum(v) => v
    case ExpOp(e1, op, e2) =>
      val n1: Double = evaluate(e1)
      val n2: Double = evaluate(e2)
      op match {
        case OpAdd => n1 + n2
        case OpSub => n1 - n2
        case OpMul => n1 * n2
        case OpDiv => n1 / n2
      }
  }
}
Slide58Defining Akka message classes
Use Scala case classes
case class Start(secondPath: String)
case object PING
case object PONG
class PingPong extends Actor {
  def receive = {
    case PING => ...
    case PONG => ...
    case Start(secondPath) => ...
  }
}
Slide59An Akka/Scala concurrent solution,in more detail
Use Akka Actors
Task of processing a directory is given to a worker actor by a master actor
Worker actor processes directory
computes the total size of all the regular files and sends it to master
sends to master the (path)name of every sub-directory
Master actor
Initiates the process
sends tasks to worker actors
collects the total size
keeps track of pending tasks
ConcurrentAkka.scala
Slide60class RoundRobinPool
Creating a new worker actor for every task (processing a directory) is not efficient.
tasks are very small so Actor creation overhead is relatively large
Instead, create a pool of worker actors (routees) managed by a router actor of type RoundRobinPool
the router is the parent of the routees
a message (task) sent by some actor A to the router is forwarded to a routee chosen in a round-robin fashion
The routee sees actor A as the sender of the message
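The round-robin choice itself is simple to state in code. The plain-Java sketch below is not Akka's RoundRobinPool and makes no claim about how Akka implements routing; RoundRobinSketch and pick are hypothetical names used only to show the rotation over a fixed list of routees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RoundRobinSketch<T> {
    private final List<T> routees;
    private int next = 0; // index of the routee that receives the next message

    RoundRobinSketch(List<T> routees) {
        this.routees = new ArrayList<>(routees); // defensive copy of the pool
    }

    // synchronized so that concurrent senders still observe a strict rotation
    synchronized T pick() {
        T chosen = routees.get(next);
        next = (next + 1) % routees.size();
        return chosen;
    }

    public static void main(String[] args) {
        RoundRobinSketch<String> router =
                new RoundRobinSketch<>(Arrays.asList("worker-0", "worker-1", "worker-2"));
        for (int task = 0; task < 4; task++) {
            System.out.println("task " + task + " -> " + router.pick());
        }
        // The fourth task wraps around to worker-0.
    }
}
```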
context.actorOf(RoundRobinPool(50).props(Props[FileProcessor]), name = "workerRouter")