The basic “Adaptive Logic Module (ALM) Block Diagram”

Uploaded On 2017-11-15


Note the fast adder carry chain does not require going out to programmable switch boxes. Altera Stratix II FPGA Architecture. Each ALM can be configured to one or two logic functions. ALM Flexibility.


Presentation Transcript

Slide1

No Block Memory Bits

Your bitcoin_hash.flow.rpt file must report that “Total block memory bits” is 0; otherwise your design will not pass. If it does not, go to “Assignments → Settings” in Quartus, open “Compiler Settings”, click “Advanced Settings (Synthesis)”, and turn OFF “Auto RAM Replacement” and “Auto Shift Register Replacement”.

Slide2

No Inferred Megafunctions/Latches

Check your Quartus compilation messages for the following:

No inferred megafunctions: These are most likely caused by block memories or shift-register replacement. You can turn OFF “Auto RAM Replacement” and “Auto Shift Register Replacement” in “Advanced Settings (Synthesis)”. If you still see “inferred megafunctions”, contact the professor. Your design will not pass if it has inferred megafunctions.

No inferred latches: Your design will not pass if it has inferred latches.

Slide3
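To illustrate the “No inferred latches” requirement, here is a minimal sketch of how a latch typically gets inferred and one common way to avoid it. The module and signal names are hypothetical and are not taken from the course code.

```systemverilog
// Hypothetical illustration, not part of the bitcoin_hash design.
module latch_example (
    input  logic       sel,
    input  logic [7:0] d,
    output logic [7:0] q_latch,  // synthesis infers a latch for this output
    output logic [7:0] q_comb    // purely combinational
);
    // Incomplete assignment: when sel is 0, q_latch must hold its old
    // value, so the synthesis tool infers a latch.
    always @(*) begin
        if (sel) q_latch = d;
    end

    // Fix: give the output a default value so every path assigns it.
    always_comb begin
        q_comb = '0;
        if (sel) q_comb = d;
    end
endmodule
```

Giving every combinational output a default value at the top of the block (or covering every branch with an else/default) is usually enough to remove the latch warning.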

Only posedge clk

You must include the following assignment:

assign mem_clk = clk;

Your bitcoin_hash.sta.rpt must show that “clk” is the only clock.

Slide4

More Tips

For most people, t1 + t2 in sha256_op will be the critical path.

// SHA256 hash round
function logic [255:0] sha256_op(input logic [31:0] a, b, c, d, e, f, g, h, w,
                                 input logic [ 7:0] t);
    logic [31:0] S1, S0, ch, maj, t1, t2; // internal signals
begin
    S1 = rightrotate(e, 6) ^ rightrotate(e, 11) ^ rightrotate(e, 25);
    ch = (e & f) | ((~e) & g);
    t1 = ch + S1 + h + k[t] + w;
    S0 = rightrotate(a, 2) ^ rightrotate(a, 13) ^ rightrotate(a, 22);
    maj = (a & b) | (a & c) | (b & c);
    t2 = maj + S0;
    sha256_op = {t1 + t2, a, b, c, d + t1, e, f, g};
end
endfunction

As discussed before, you can pipeline (pre-compute) the following:

t1 = ch + S1 + h + k[t] + w;

However, this requires pre-computing “w” another cycle ahead (which requires a non-trivial code change). Instead, you can just pre-compute part of the sum one cycle early, e.g.:

p = g + k[t+1]; // g is the same as h[t+1]

The next round can then compute t1 = ch + S1 + p + w.

Slide5
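A minimal sketch of the p = g + k[t+1] pre-computation for sha256_op follows. It is only one possible arrangement: the module name, the ports (load, init, k0, k_next), and the function name sha256_op_p are hypothetical, and the round constant is passed in as a port rather than read from the k array used in the original code.

```systemverilog
// Hypothetical sketch, not the course's reference design.
module sha256_round_precomp (
    input  logic         clk,
    input  logic         load,    // assumed control: load the round-0 state
    input  logic [255:0] init,    // {a, b, c, d, e, f, g, h} before round 0
    input  logic [31:0]  k0,      // k[0], needed only to seed p
    input  logic [31:0]  k_next,  // k[t+1], supplied by the surrounding design
    input  logic [31:0]  w,       // message-schedule word for round t
    output logic [255:0] state    // {a, b, c, d, e, f, g, h} after each round
);
    logic [31:0] A, B, C, D, E, F, G, H;
    logic [31:0] p;               // registered h + k[t] for the current round

    assign state = {A, B, C, D, E, F, G, H};

    function automatic logic [31:0] rightrotate(input logic [31:0] x,
                                                input logic [ 4:0] r);
        rightrotate = (x >> r) | (x << (32 - r));
    endfunction

    // Same as sha256_op, except that h + k[t] arrives pre-added in p_t.
    function automatic logic [255:0] sha256_op_p(input logic [31:0] a, b, c, d,
                                                 e, f, g, w_t, p_t);
        logic [31:0] S1, S0, ch, maj, t1, t2;
        S1  = rightrotate(e, 6) ^ rightrotate(e, 11) ^ rightrotate(e, 25);
        ch  = (e & f) | ((~e) & g);
        t1  = ch + S1 + p_t + w_t;   // one fewer adder on the critical path
        S0  = rightrotate(a, 2) ^ rightrotate(a, 13) ^ rightrotate(a, 22);
        maj = (a & b) | (a & c) | (b & c);
        t2  = maj + S0;
        sha256_op_p = {t1 + t2, a, b, c, d + t1, e, f, g};
    endfunction

    always_ff @(posedge clk) begin
        if (load) begin
            {A, B, C, D, E, F, G, H} <= init;
            p <= init[31:0] + k0;    // h + k[0] for round 0
        end else begin
            {A, B, C, D, E, F, G, H} <= sha256_op_p(A, B, C, D, E, F, G, w, p);
            p <= G + k_next;         // G becomes h in the next round
        end
    end
endmodule
```

The register p always holds h + k[t] for the round currently being computed, so the adder for h + k[t] moves off the t1 path into the previous cycle. Whether this actually improves Fmax depends on the rest of the design.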

More Tips

Some of you have found that inline expanding function calls with the corresponding logic has been effective in improving area and Fmax. For example, instead of doing

w[15] <= wtnew();

you might inline expand it into

w[15] <= w[0] + (rightrotate(w[1], 7) ^ rightrotate(w[1], 18) ^ (w[1] >> 3))
              + w[9] + (rightrotate(w[14], 17) ^ rightrotate(w[14], 19) ^ (w[14] >> 10));

Inline expanding function calls does not always help, but it may be worth trying in the end, especially if you have many levels of nested function calls.

Slide6

More Tips

It is very important to do a “Delay-optimized” version with parallel execution of all 16 nonces.

If your “Delay-optimized” design has Area*Delay > 60,000 millisec*area, it should be possible to do a separate “Area*Delay optimized” design with better results (e.g., by performing the 16 nonces one after another sequentially).