Slide1
March 9, 2011
http://csg.csail.mit.edu/6.375
L11-1
Implementing for Correct Concurrency
Nirav Dave
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology
Slide2
Dealing with Conflicts
When do conflicts arise?
How do we analyze them?
How do we fix them?
How do we make sure we’re okay?
Slide3
SFIFO
interface SFIFO#(type t, type tr, type v);
  method Action enq(t x);        // enqueue an item
  method Action deq();           // remove oldest entry
  method t first();              // inspect oldest item
  method Action clear();         // make FIFO empty
  method Maybe#(v) find(tr r);   // search FIFO
endinterface
n = # of bits needed to represent the values of type "t"
m = # of bits needed to represent the values of type "tr"
v = # of bits needed to represent the values of type "v"
[Figure: SFIFO module ports. enq: n-bit input, enab, rdy (not full); deq: enab, rdy (not empty); first: n-bit output, rdy (not empty); clear: enab; find: m-bit input, bool and V outputs]
Slide4
Processor Example
[Figure: CPU pipeline with stages fetch, decode, execute, memory, write-back; state elements pc, rf, iMem, dMem]
5-stage processor with 1-element FIFOs in between stages
Let’s add bypassing
Slide5
Decode Rule
rule decode (!newStallFunc(instr, d2eQ, e2mQ, m2wQ));
  let fetInst = f2dQ.first();
  f2dQ.deq();
  match {.ra, .rb} = getRARB(fetInst);
  // fromMaybe(default, maybe): a bypassed value overrides the older one
  let va0 = rf[ra];
  let va1 = fromMaybe(va0, m2wQ.find(ra));
  let va2 = fromMaybe(va1, e2mQ.find(ra));
  let vb0 = rf[rb];
  let vb1 = fromMaybe(vb0, m2wQ.find(rb));
  let vb2 = fromMaybe(vb1, e2mQ.find(rb));
  let newInst = case (fetInst) matches
                  Add: return (DAdd .va2 .vb2);
                  …
                endcase;
  d2eQ.enq(newInst);
endrule
When do we want it to execute?
Decode is also correct any time it is allowed to execute
Bypassing searches each place in the design that may hold a newer value (e2mQ, then m2wQ, then rf)
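The bypass priority in the decode rule can be sketched outside BSV as a chain of "use the newer value if one is found" lookups. This is a minimal Python model, not the course code: `from_maybe`, `find`, and `read_operand` are hypothetical names, each FIFO is modeled as either `None` (empty) or a `(dest_reg, value)` pair, and `None` stands in for an invalid Maybe.

```python
from typing import Optional, Tuple

def from_maybe(default: int, maybe: Optional[int]) -> int:
    """Return the wrapped value if present, otherwise the default."""
    return default if maybe is None else maybe

def find(fifo: Optional[Tuple[int, int]], reg: int) -> Optional[int]:
    """Search a one-element FIFO holding (dest_reg, value) for reg."""
    if fifo is not None and fifo[0] == reg:
        return fifo[1]
    return None

def read_operand(reg, rf, m2w, e2m):
    """Pick the newest available value for reg: e2m, then m2w, then rf."""
    v0 = rf[reg]                          # oldest: register file
    v1 = from_maybe(v0, find(m2w, reg))   # newer: waiting for write-back
    v2 = from_maybe(v1, find(e2m, reg))   # newest: just executed
    return v2
```

The later a lookup appears in the chain, the higher its priority, which matches the order va0, va1, va2 in the rule.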
Slide6
Some insight into concurrent rule firing
There are more intermediate states in the rule semantics (a state after each rule step)
In the HW, states change only at clock edges
[Figure: rule steps Ri, Rj, Rk in the rule semantics, compared with the same rules grouped into clocks in the HW]
Slide7
Parallel execution reorders reads and writes
In the rule semantics, each rule sees (reads) the effects (writes) of previous rules
In the HW, rules only see the effects from previous clocks, and only affect subsequent clocks
[Figure: reads and writes interleaved per rule step in the rule semantics, versus all reads followed by all writes within each clock in the HW]
Slide8
Correctness
Rules are allowed to fire in parallel only if the net state change is equivalent to sequential rule execution
Consequence: the HW can never reach a state unexpected in the rule semantics
[Figure: rules Ri, Rj, Rk firing in one clock in the HW map to a sequence of rule steps in the rule semantics]
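The correctness condition can be illustrated with a toy Python model. It is only a sketch under simplifying assumptions: state is a dictionary, a rule is a function returning the registers it writes, and the check is done for a single starting state by brute force (a real compiler reasons statically). The names `run_sequential`, `run_one_clock`, and `equivalent_order` are hypothetical.

```python
from itertools import permutations

def run_sequential(state, rules):
    """Rule semantics: each rule reads the state left by the previous rule."""
    for rule in rules:
        state = {**state, **rule(state)}
    return state

def run_one_clock(state, rules):
    """HW: every rule reads the pre-clock state; all writes land together."""
    updates = {}
    for rule in rules:
        updates.update(rule(state))
    return {**state, **updates}

def equivalent_order(state, rules):
    """Return a sequential order matching one-clock execution, or None."""
    target = run_one_clock(state, rules)
    for order in permutations(rules):
        if run_sequential(state, order) == target:
            return [r.__name__ for r in order]
    return None

def incr_a(s):      return {"a": s["a"] + 1}
def copy_a_to_b(s): return {"b": s["a"]}
def copy_b_to_a(s): return {"a": s["b"]}
```

In this model, firing `incr_a` and `copy_a_to_b` in one clock behaves like the sequential order copy first, then increment, so it is allowed. Firing `copy_a_to_b` and `copy_b_to_a` together swaps the registers, which no sequential order of the two rules produces, so those rules must not fire in the same clock.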
Slide9
Upshot
Given the concurrency of the methods/rules in a system, we can determine viable schedules
Some variation due to applicability
BUT we know what schedule we want (mostly)
We should be able to back-propagate results to submodules
Slide10
Determining Concurrency Properties
Slide11
Processor: Concurrencies
In-order: F < D < E < M < W
Pipelined: W < M < E < D < F
[Figure: CPU pipeline with stages fetch, decode, execute, memory, write-back; state elements pc, rf, iMem, dMem]
Slide12
Concurrency requirements for Full Pipelining – Reg File
In-Order RF: (D calls sub) < (W calls upd)
Pipelined RF: (W calls upd) < (D calls sub)
[Figure: CPU pipeline with stages fetch, decode, execute, memory, write-back; state elements pc, rf, iMem, dMem]
Slide13
Concurrency requirements for Full Pipelining – FIFOs
In-Order FIFOs:
1. m2wQ, e2mQ: find < enq < first < deq
2. d2eQ: find < enq < first < deq, clear
Pipeline FIFOs:
3. m2wQ, e2mQ: first < deq < enq < find
4. d2eQ: first < deq < find < enq
[Figure: CPU pipeline with stages fetch, decode, execute, memory, write-back; state elements pc, rf, iMem, dMem]
Slide14
Constructing Appropriately concurrent submodules
Slide15
From Analysis to Design
We need to create modules which behave as needed
Construct modules using “unsafe” primitives to have “safe” behaviors
Three major concepts:
Use primitives which remove “false” concurrency orderings (e.g. ConfigRegs vs. Regs)
Add RWires for forwarding values intra-cycle
Reason carefully to assure that execution appears “atomic”
Slide16
ConfigReg and RWire
mkReg requires that read < write
mkConfigReg is a Reg without this restriction
Allows us to read stale values (dangerous)
RWire is a "wire"
wset :: a -> Action writes a value
wget :: Maybe#(a) returns the written value if a write happened this cycle
wset happens before wget each cycle
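The two primitives can be pictured with a small cycle-based Python model. This is a sketch of the behavior described above, not the Bluespec library: the class names mirror the primitives, but `end_cycle`, `read`, and `write` are hypothetical helpers standing in for the clock edge and the register methods, and `wget` returns a `(valid, value)` pair in place of a Maybe.

```python
class RWire:
    """Toy RWire: carries a value only within the current cycle."""
    def __init__(self):
        self._valid = False
        self._value = None

    def wset(self, value):
        self._valid, self._value = True, value

    def wget(self):
        """Return (valid, value); valid is False if nothing was written."""
        return (self._valid, self._value)

    def end_cycle(self):
        # A wire keeps no state across clock edges.
        self._valid, self._value = False, None


class ConfigReg:
    """Toy register whose reads always return the start-of-cycle value."""
    def __init__(self, init):
        self._cur = init
        self._next = init

    def read(self):
        # Stale if a logically earlier method already wrote this cycle.
        return self._cur

    def write(self, value):
        self._next = value

    def end_cycle(self):
        self._cur = self._next
```

The stale read is exactly why ConfigReg is flagged as dangerous: a reader scheduled after a writer still sees the old value unless the new one is forwarded, for example through an RWire.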
Slide17
Let’s implement some modules
Slide18
Processor Redux
In-order: F < D < E < M < W
Pipelined: W < M < E < D < F
[Figure: CPU pipeline with stages fetch, decode, execute, memory, write-back; state elements pc, rf, iMem, dMem]
Slide19
Concurrency: RegFile
The standard library RegFile is implemented with concurrency (sub < upd)
This handles the in-order case
We need to build a RegisterFile for the pipelined case
Slide20
BypassRegFile
module mkBypassRegFile#(a l, a h)(RegFile#(a,d))
    provisos (Bits#(a,asz), Bits#(d,dsz), Eq#(a));
  RegFile#(a,d) rfInt <- mkRegFileWCF(l,h);
  RWire#(Tuple2#(a,d)) curWrite <- mkRWire();
  method Action upd(a x, d v);
    rfInt.upd(x,v);
    curWrite.wset(tuple2(x,v));
  endmethod
  method d sub(a x);
    // a write to x in this cycle is bypassed to the reader
    case (curWrite.wget()) matches
      tagged Valid {.wa, .wd} &&& (wa == x): return wd;
      default: return rfInt.sub(x);
    endcase
  endmethod
endmodule
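The intended upd < sub behavior can be checked with a small Python model. It is a sketch under assumptions: one write port, registers initialized to 0, and a hypothetical `end_cycle` method standing in for the clock edge; the `_cur_write` field plays the role of the curWrite RWire.

```python
class BypassRegFile:
    """Toy model of a register file where upd is logically before sub."""
    def __init__(self, size):
        self._regs = [0] * size
        self._cur_write = None      # (addr, value) written this cycle

    def upd(self, addr, value):
        self._cur_write = (addr, value)

    def sub(self, addr):
        if self._cur_write is not None and self._cur_write[0] == addr:
            return self._cur_write[1]   # bypass the in-flight write
        return self._regs[addr]

    def end_cycle(self):
        if self._cur_write is not None:
            addr, value = self._cur_write
            self._regs[addr] = value    # the write lands at the clock edge
        self._cur_write = None
```

A read of the address written in the same cycle returns the new value, while reads of other addresses fall through to the stored contents.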
Slide21
Processor Redux
In-order: F < D < E < M < W
Pipelined: W < M < E < D < F
[Figure: CPU pipeline with stages fetch, decode, execute, memory, write-back; state elements pc, rf, iMem, dMem]
Slide22
One Element SFIFO (Naïve)
module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
  Reg#(t)    data <- mkRegU();
  Reg#(Bool) full <- mkReg(False);
  method Action enq(t x) if (!full);
    full <= True; data <= x;
  endmethod
  method Action deq() if (full);
    full <= False;
  endmethod
  method t first() if (full);
    return (data);
  endmethod
  method Maybe#(v) find(tr r);
    return (full ? findf(r, data) : tagged Invalid);
  endmethod
endmodule
Concurrency: find < first < (enq C deq), where C means the two methods conflict
Slide23
One Element SFIFO (In-Order d2eQ #1)
module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
  Reg#(t)    data <- mkConfigRegU();
  Reg#(Bool) full <- mkConfigReg(False);
  RWire#(t)  enqv <- mkRWire();
  method Action enq(t x) if (!full);
    full <= True; data <= x;
    enqv.wset(x);
  endmethod
  method Action deq() if (full || isValid(enqv.wget()));
    full <= False;
  endmethod
  method t first() if (full);
    return data;
  endmethod
  method Maybe#(v) find(tr r);
    return full ? findf(r,data) : tagged Invalid;
  endmethod
endmodule
Concurrency: find < first < enq < deq
Slide24
One Element SFIFO (In-Order e2mQ, m2wQ #2)
module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
  Reg#(t)    data <- mkRegU();
  Reg#(Bool) full <- mkConfigReg(False);
  RWire#(t)  enqv <- mkRWire();
  method Action enq(t x) if (!full);
    full <= True; data <= x;
    enqv.wset(x);
  endmethod
  method Action deq() if (full || isValid(enqv.wget()));
    full <= False;
  endmethod
  method t first() if (full || isValid(enqv.wget()));
    // fromMaybe(default, maybe): prefer the value enqueued this cycle
    return (fromMaybe(data, enqv.wget()));
  endmethod
  method Maybe#(v) find(tr r);
    return full ? findf(r,data) : tagged Invalid;
  endmethod
endmodule
Concurrency: find < enq < first < deq
Slide25
One Element Searchable SFIFO (Pipelined #3)
module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
  Reg#(t)      data <- mkConfigRegU();
  Reg#(Bool)   full <- mkConfigReg(False);
  RWire#(void) deqw <- mkRWire();
  RWire#(t)    enqw <- mkRWire();   // carries the enqueued value to find
  method Action enq(t x) if (!full || isValid(deqw.wget()));
    full <= True; data <= x;
    enqw.wset(x);
  endmethod
  method Action deq() if (full);
    full <= False;
    deqw.wset(?);
  endmethod
  method t first() if (full);
    return (data);
  endmethod
  method Maybe#(v) find(tr r);
    return (full && !isValid(deqw.wget())) ? findf(r, data)
         : isValid(enqw.wget()) ? findf(r, fromMaybe(?, enqw.wget()))
         : tagged Invalid;
  endmethod
endmodule
Concurrency: first < deq < enq < find
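The ordering first < deq < enq < find can be exercised with a cycle-based Python model of the one-element FIFO. This is a sketch, not the course code: method guards are modeled as assertions, the two flags and `_enq_value` stand in for the deqw and enqw RWires, `end_cycle` is a hypothetical stand-in for the clock edge, and `findf` returns `None` for "not found".

```python
class PipelinedSFIFO1:
    """Toy one-element searchable FIFO with first < deq < enq < find."""
    def __init__(self, findf):
        self._findf = findf         # findf(key, item) -> value or None
        self._data = None
        self._full = False
        self._deq_called = False    # models deqw
        self._enq_called = False    # models enqw being valid
        self._enq_value = None      # models the value carried by enqw

    def first(self):
        assert self._full, "first: FIFO empty"
        return self._data

    def deq(self):
        assert self._full, "deq: FIFO empty"
        self._deq_called = True

    def enq(self, item):
        assert (not self._full) or self._deq_called, "enq: FIFO full"
        self._enq_called, self._enq_value = True, item

    def find(self, key):
        if self._full and not self._deq_called:
            return self._findf(key, self._data)
        if self._enq_called:
            return self._findf(key, self._enq_value)
        return None

    def end_cycle(self):
        if self._enq_called:
            self._data, self._full = self._enq_value, True
        elif self._deq_called:
            self._full = False
        self._deq_called = self._enq_called = False
        self._enq_value = None
```

With a full FIFO, a cycle can dequeue the old item, enqueue a new one, and have find see only the new one, which is the behavior the pipelined m2wQ and e2mQ need.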
Slide26
One Element Searchable SFIFO (Pipelined #4)
module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
  Reg#(t)      data <- mkConfigRegU();
  Reg#(Bool)   full <- mkConfigReg(False);
  RWire#(void) deqw <- mkRWire();
  method Action enq(t x) if (!full || isValid(deqw.wget()));
    full <= True; data <= x;
  endmethod
  method Action deq() if (full);
    full <= False;
    deqw.wset(?);
  endmethod
  method t first() if (full);
    return (data);
  endmethod
  method Maybe#(v) find(tr r);
    return (full && !isValid(deqw.wget())) ? findf(r, data) : tagged Invalid;
  endmethod
endmodule
Concurrency: first < deq < find < enq
Slide27
One Element Searchable SFIFO (Pipelined #4)
module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
  Reg#(t)      data  <- mkRegU();
  Reg#(Bool)   full  <- mkConfigReg(False);
  RWire#(void) deqEN <- mkRWire();
  Bool deqp = isValid(deqEN.wget());
  method Action enq(t x) if (!full || deqp);
    full <= True; data <= x;
  endmethod
  method Action deq() if (full);
    full <= False;
    deqEN.wset(?);
  endmethod
  method t first() if (full);
    return (data);
  endmethod
  method Maybe#(v) find(tr r);
    return (full && !deqp) ? findf(r, data) : tagged Invalid;
  endmethod
endmodule
Concurrency: first < deq < find < enq
Slide28
Up-Down Counter
Slide29
Counter Module Interface
interface Counter;
  method Action up();
  method Action down();
  method Bit#(32) _read();
endinterface
Concurrency: up and down should be independent
Slide30
Naïve Counter Example
module mkCounter(Counter);
  Reg#(Bit#(32)) r <- mkReg(0);
  method Bit#(32) _read();
    return r;
  endmethod
  // up and down both read and write r, so they conflict
  method Action up();
    r <= r + 1;
  endmethod
  method Action down();
    r <= r - 1;
  endmethod
endmodule
Slide31
Counter Example
module mkCounter(Counter);
  Reg#(Bit#(32)) r     <- mkConfigReg(0);
  RWire#(void)   upW   <- mkRWire();
  RWire#(void)   downW <- mkRWire();
  method Bit#(32) _read();
    return r;
  endmethod
  method Action up();
    upW.wset(?);
  endmethod
  method Action down();
    downW.wset(?);
  endmethod
  // a single rule combines both requests, so up and down no longer conflict
  rule updateR(True);
    r <= r + (isValid(upW.wget())   ? 1 : 0)
           - (isValid(downW.wget()) ? 1 : 0);
  endrule
endmodule
What if we want to call up and then _read?
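The closing question can be explored with a cycle-based Python model of this counter. It is a sketch under assumptions: the two flags stand in for upW and downW, and the hypothetical `end_cycle` method plays the role of the updateR rule followed by the clock edge.

```python
class Counter:
    """Toy model of the RWire-based up/down counter."""
    def __init__(self):
        self._r = 0
        self._up = False      # models upW
        self._down = False    # models downW

    def up(self):
        self._up = True

    def down(self):
        self._down = True

    def read(self):
        return self._r        # start-of-cycle value only

    def end_cycle(self):
        # Combine both requests in one update, as rule updateR does.
        self._r += int(self._up) - int(self._down)
        self._up = self._down = False
```

In this model a `read` in the same cycle as `up` still returns the old count. Supporting up < _read would require `read` to also look at the pending requests, for example by adding the same increment that `end_cycle` applies.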
Slide32
Completion Buffer
Slide33
Completion buffer: Interface
interface CBuffer#(type t);
  method ActionValue#(Token) getToken();
  method Action put(Token tok, t d);
  method ActionValue#(t) getResult();
endinterface
typedef Bit#(TLog#(n)) TokenN#(numeric type n);
typedef TokenN#(16) Token;
[Figure: cbuf with getToken, put (result & token), and getResult]
Slide34
IP-Lookup module with the completion buffer
module mkIPLookup(IPLookup);
  rule recirculate… ;
  rule exit …;
  method Action enter (IP ip);
    Token tok <- cbuf.getToken();
    ram.req(ip[31:16]);
    fifo.enq(tuple2(tok,ip[15:0]));
  endmethod
  method ActionValue#(Msg) getResult();
    let result <- cbuf.getResult();
    return result;
  endmethod
endmodule
[Figure: IP-lookup datapath with enter, getToken, RAM, fifo, a "done?" check (yes/no), cbuf, and getResult]
For enter and getResult to execute simultaneously, cbuf.getToken and cbuf.getResult must execute simultaneously
Slide35
IP Lookup rules with completion buffer
rule recirculate (!isLeaf(ram.peek()));
  match {.tok, .rip} = fifo.first();
  fifo.enq(tuple2(tok, (rip << 8)));
  ram.req(ram.peek() + rip[15:8]);
  fifo.deq();
  ram.deq();
endrule
rule exit (isLeaf(ram.peek()));
  match {.tok, .rip} = fifo.first();   // put(tok, d) needs the token
  cbuf.put(tok, ram.peek());
  fifo.deq();
  ram.deq();
endrule
For rule exit and method enter to execute simultaneously, cbuf.put and cbuf.getToken must execute simultaneously
For no dead cycles, cbuf.getToken, cbuf.put, and cbuf.getResult must all be able to execute simultaneously
Slide36
Naïve Completion Buffer
module mkCBuffer(CBuffer#(t)) provisos (Bits#(t,tsz));
  // Max and nextPointer are used as on the slide; their definitions are not shown
  Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));
  RegFile#(Token, t) data <- mkRegFileFull();
  Reg#(Token) rdP <- mkReg(0);
  Reg#(Token) wrP <- mkReg(0);
  Reg#(Token) cnt <- mkReg(0);
  method ActionValue#(Token) getToken() if (cnt < Max);
    cnt <= cnt + 1;
    rdP <= nextPointer(rdP);
    valids[rdP] <= False;
    return rdP;
  endmethod
  method Action put(Token tok, t d);
    valids[tok] <= True;
    data.upd(tok, d);
  endmethod
  method ActionValue#(t) getResult() if (valids[wrP]);
    cnt <= cnt - 1;
    wrP <= nextPointer(wrP);
    return (data.sub(wrP));
  endmethod
endmodule
Slide37
Completion buffer: Interface Requirements
[Figure: cbuf with getToken, put (result & token), and getResult]
Rules and methods concurrency requirement to avoid dead cycles: exit < getResult < enter
cbuf methods' concurrency: cbuf.getResult < cbuf.put < cbuf.getToken
Slide38
Completion Buffer
module mkCBuffer(CBuffer#(t)) provisos (Bits#(t,tsz));
  // Max is used as on the slide; its definition is not shown
  Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));
  RegFile#(Token, t) data <- mkRegFileFull();
  Reg#(Token) rdP <- mkConfigReg(0);
  Reg#(Token) wrP <- mkConfigReg(0);
  Counter cnt <- mkCounter();
  method ActionValue#(Token) getToken() if (cnt < Max);
    cnt.up();
    rdP <= rdP + 1;
    valids[rdP] <= False;
    return rdP;
  endmethod
  method Action put(Token tok, t d);
    valids[tok] <= True;
    data.upd(tok, d);
  endmethod
  method ActionValue#(t) getResult() if (valids[wrP]);
    cnt.down();
    wrP <= wrP + 1;
    return (data.sub(wrP));
  endmethod
endmodule
getResult < put < getToken
Is the ordering correct?
Is valids okay?
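A sequential Python model can help when reasoning about what the buffer must do before worrying about same-cycle ordering. It is only a behavioral sketch: it ignores concurrency entirely, uses hypothetical names (`_alloc`, `_retire`, `can_get_result`), and adds a `count > 0` check to the getResult guard that is not on the slide, so that a leftover valid bit from an earlier use of a slot is never mistaken for a fresh result.

```python
class CompletionBuffer:
    """Toy in-order completion buffer: tokens out, results back in order."""
    def __init__(self, size):
        self._size = size
        self._valid = [False] * size
        self._data = [None] * size
        self._alloc = 0      # next token to hand out
        self._retire = 0     # oldest outstanding token
        self._count = 0      # number of outstanding tokens

    def get_token(self):
        assert self._count < self._size, "getToken: buffer full"
        tok = self._alloc
        self._valid[tok] = False
        self._alloc = (self._alloc + 1) % self._size
        self._count += 1
        return tok

    def put(self, tok, value):
        self._valid[tok] = True
        self._data[tok] = value

    def can_get_result(self):
        # Assumption: also require an outstanding token (not on the slide).
        return self._count > 0 and self._valid[self._retire]

    def get_result(self):
        assert self.can_get_result(), "getResult: oldest result not ready"
        value = self._data[self._retire]
        self._retire = (self._retire + 1) % self._size
        self._count -= 1
        return value
```

Results may be put out of order, but they are only released in token order. The two questions above concern the hardware version: whether getToken and put can both write valids in one cycle, and whether the chosen method ordering preserves this sequential behavior.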