EE 194: Advanced VLSI, Spring 2018, Tufts University
Instructor: Joel Grodstein (joel.grodstein@tufts.edu)
Verification

What is verification?
- The design process (highly simplified):
  - Talk to your customer
  - Write the product spec
  - Implement the product
- Validation checks to see if you are building what the customer wants
- Verification compares the spec vs. the implementation
- For us, typically the “implementation” is RTL code
EE 194/Adv. VLSI Joel Grodstein

What is RTL?
- “Register-Transfer Level”
- Written at the level of registers and gates
- A 64-bit bus, for example, is usually just one signal; ditto for the registers that store it
- Does not show individual transistors

PC offset adder example
- Function: add a shifted offset to the PC
- Uses? Branches, immediate-field offsets
- Note the pipe-stage naming convention
- Shift amount can be 0, 1, 2 (not 3)
  - Maybe we don’t want to build shift-by-3 hardware
- There’s a valid bit too (but it doesn’t fit on the slide!)
[Figure: block diagram of the PC offset adder. Blocks: shifter, two D flip-flop registers, adder. Signals: adr_I0[39:0], offset_I0[7:0], shiftL_I0[1:0], SO_I0[10:0], SO_I1[10:0], address_I1[39:0], adr_I1[39:0], CLK_I1.]

SystemVerilog

    module pc_adder (input  logic        CLK_I1,
                     input  logic [39:0] adr_I0,
                     input  logic [7:0]  offset_I0,
                     input  logic [1:0]  shiftL_I0,
                     output logic [39:0] adr_I1);
      logic [39:0] address_I1;    // adr_I0 after the pipeline flop
      logic [10:0] SO_I0, SO_I1;  // shifted offset before/after the flop

      assign SO_I0 = offset_I0 << shiftL_I0;  // stage I0: shifter

      always_ff @(posedge CLK_I1) begin       // I0 -> I1 pipeline registers
        address_I1 <= adr_I0;
        SO_I1      <= SO_I0;
      end

      assign adr_I1 = address_I1 + SO_I1;     // stage I1: adder
    endmodule

[Figure: block diagram of the PC offset adder.]

Why is verification so important?
- How many of you have ever written a non-trivial computer program?
- How many of you always have your programs work perfectly the first time?
- Designing things is easy. Designing things that work is not so easy!
- We all agree that verifying stuff we design is important. But why build virtual models of it?
  - Why not just build the real thing, try it out and iterate?

Pentium FDIV bug
- Reported publicly in October 1994 by Thomas Nicely, a professor of math, who hit it while doing computations with prime numbers
- Discovered in May 1994 by an Intel verification co-op
- Estimates of severity:
  - Intel: 1 mistake every 27,000 years
  - IBM: 1 mistake every 24 days
  - Byte magazine: 1 in 9 billion floating-point divides with random parameters would produce inaccurate results
- Consequences:
  - December 1994: Intel recalled the processors
  - January 1995: Intel announced a $475M charge against earnings

Jobs doing validation
- “Trends in Functional Verification: A 2014 Industry Study” by Mentor Graphics
- Survey of about 2000 people in the VLSI chip industry
- Conclusions:
  - Average of 11 verification engineers per project, vs. 10 design engineers (and design engineers spend 47% of their time in verification)
  - 3.7% compound annual growth rate (CAGR) for design engineers and 12.5% for verification engineers
  - 30% of products ship on first tapeout

Who should do verification?
- Should an RTL designer do their own verification?
- Pros:
  - They are probably the person who best understands the RTL model
  - They understand what the chip is supposed to do
- Cons:
  - What if the RTL architect misunderstands the spec, and thus codes the wrong RTL, and thus writes tests that merely validate that they correctly implemented the wrong thing?
  - Hopefully a separate verification person is unlikely to misunderstand the spec in the same way as the architect
  - The RTL architect may not understand verification tooling
- Typically, the “cons” are more important.

What skills do you need for verification?
- Verification people must…
  - understand hardware, software, and architecture
  - think about how to break things
  - have a jack-of-all-trades mentality (though arguably not that much circuit design)

How do you test a CPU RTL?
- First idea:
  - Write the RTL. Write some more. Eventually you’re done.
  - Run the RTL on some assembly code. Does it get the right answer?
- Will that idea work? Sure.
- How much effort is it to test the RTL? Not much.
- Problems?
  - You have to wait until you’re fully done with the RTL before you can test it
- Consider: 100 people, 20 teams of 5 people, each team writing code for one unit.
  - Do you want to test the entire 1M lines of code at once, or write unit tests first? Clearly the latter!
  - But now you have a problem: no unit will run assembly code by itself! So what do you do?

Unit testing
- How do you test a single unit? (E.g., the PC offset-adder unit)
  - By itself, it cannot run assembly code at all
  - Its behavior can (in general) be quite hard to specify
  - This problem is not just CPU-specific; many systems are easier to understand at the top level
- We now need multiple pieces:
  - First, the RTL (e.g., for our offset adder)
  - Second, some way to generate tests
  - Third, some way to know if our RTL got the right answer
- Let’s look at those pieces.

Ways to generate tests
- Focused tests
  - A verification engineer hand-writes a specific test
  - Pro: you get a test targeted at the specific feature you want to test
  - Con: takes a long time to write each test
  - What tests might you write for our PC offset adder?
    - Try every shift amount
    - If it’s a carry-bypass adder, perhaps try input values that swing all interior muxes both ways
  - But what if the actual bug is something you can’t predict? How can you find it without generating lots of tests?
- Random tests
- Constrained-random tests

Ways to generate tests
- Focused tests
  - A verification engineer hand-writes a specific test
- Random tests
  - Get more tests the easy way: generate them randomly
  - Just generate random values for all of the inputs
  - Easy to generate a ton of tests
  - Any issues?
    - Shift amount = 3 is not valid
    - Might pick values that overflow 40 bits
    - What if it’s not easy to know the right answer?
- Constrained-random tests

Ways to generate tests
- Focused tests
  - A verification engineer hand-writes a specific test
- Random tests
  - Get more tests the easy way: generate them randomly
- Constrained-random tests
  - Generate tests randomly, but within constraints
  - E.g., constrain shift amount to 0, 1 or 2 (but not 3)
  - Constrain address to be in the range [0, 2^40 - (offset << shift_amount) - 1]
- Any issues now? Or is that the end of the story?
  - Again: how do you know if you got the right answer?
  - How do you know if/when you’re done?
  - How do you get it to target parts of the design that you know are quite complex?
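The two constraints on this slide can be sketched as a small generator. This is a minimal Python illustration, not the course's actual test bench; the function name `gen_constrained_test` and the seed value are made up for the example.

```python
import random

ADDR_BITS = 40
VALID_SHIFTS = (0, 1, 2)  # shift-by-3 is not a legal input for this unit


def gen_constrained_test(rng):
    """Return one legal (adr, offset, shift) stimulus for the PC offset adder."""
    offset = rng.randrange(1 << 8)       # any 8-bit offset
    shift = rng.choice(VALID_SHIFTS)     # constraint 1: never 3
    # Constraint 2: adr + (offset << shift) must still fit in 40 bits.
    max_adr = (1 << ADDR_BITS) - (offset << shift) - 1
    adr = rng.randint(0, max_adr)        # randint is inclusive at both ends
    return adr, offset, shift


# A fixed seed (hypothetical value) makes a failing test replayable.
rng = random.Random(194)
tests = [gen_constrained_test(rng) for _ in range(1000)]
```

Seeding the generator is a deliberate choice: when one of the thousand tests fails, the same seed regenerates exactly the same stimulus for debugging.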

How do you know if it worked?
- Hand-design a focused test → you (believe you) know the answer
- What if you have a random or constrained-random test?
  - Reference model: run the test on a golden reference
  - Predictor: much like a reference model (but not always a full model); enough to predict a test outcome
  - Monitor: monitors the test, does coverage checking and sanity checking

Monitors
- Like assertions in a program
- Check that error conditions don’t occur
- Are they any better than just waiting for the test to fail and debugging it?
  - Catch error conditions closer to the source for easier debug
  - Often catch bugs even if the test still passes
- Example:

    assert property (@(posedge CLK_I1) shiftL_I0 != 3);
    assert property (clk_I1 -> valid_I0);  // roughly
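The same idea can be sketched outside the simulator as a passive checker that watches a trace. The class name `PcAdderMonitor`, the trace format, and the rule that shift = 3 only counts as an error when the valid bit is set are all assumptions made for this illustration.

```python
class PcAdderMonitor:
    """Passive checker: it records violations and never drives any input."""

    def __init__(self):
        self.violations = []  # list of (cycle, message) pairs
        self.cycles_seen = 0

    def sample(self, cycle, valid, shift):
        """Call once per clock cycle with the sampled input values."""
        self.cycles_seen += 1
        # Assumed rule: an illegal shift amount only matters when valid is set.
        if valid and shift == 3:
            self.violations.append((cycle, "shiftL_I0 == 3 while valid"))


monitor = PcAdderMonitor()
# Hypothetical trace of (cycle, valid, shift) samples.
trace = [(0, True, 0), (1, True, 2), (2, False, 3), (3, True, 3)]
for cycle, valid, shift in trace:
    monitor.sample(cycle, valid, shift)
```

The monitor flags cycle 3 at the moment the bad input arrives, which is the "closer to the source" benefit: nobody has to trace a wrong address back through the pipeline later.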

What would a reference model look like?

    module pc_adder (input  logic        CLK_I1,
                     input  logic [39:0] adr_I0,
                     input  logic [7:0]  offset_I0,
                     input  logic [1:0]  shiftL_I0,
                     output logic [39:0] adr_I1);
      logic [39:0] address_I1;    // adr_I0 after the pipeline flop
      logic [10:0] SO_I0, SO_I1;  // shifted offset before/after the flop

      assign SO_I0 = offset_I0 << shiftL_I0;

      always_ff @(posedge CLK_I1) begin
        address_I1 <= adr_I0;
        SO_I1      <= SO_I0;
      end

      assign adr_I1 = address_I1 + SO_I1;
    endmodule

- Would it look any different from the model itself?
- Why write a reference model that’s just the model?
[Figure: block diagram of the PC offset adder.]

What would a reference model look like?
- If our adder is a carry-bypass adder:
  - should we write it as a single “+”?
  - should we include all of the details of the full adders and bypass muxes?
- Issues:
  - The lower the level we write it at, the more likely it is to be wrong
  - If we write just a “+”, and the schematics implement a carry-bypass adder, who checks that they’re equivalent?
- Practical answer:
  - write the RTL at the highest level that RLS supports
    - often this is still fairly low
  - write a high-level reference model that is likely correct
  - tests compare the two
[Figure: block diagram of the PC offset adder.]
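A small Python sketch of "tests compare the two": a one-line reference model using `+`, checked against a bit-level model. The ripple-carry loop here merely stands in for whatever low-level structure the real RTL uses (it is not a carry-bypass adder), and all names are hypothetical.

```python
import random

WIDTH = 40
MASK = (1 << WIDTH) - 1


def ref_add(a, b):
    """High-level reference model: just '+', truncated to 40 bits."""
    return (a + b) & MASK


def ripple_add(a, b):
    """Low-level stand-in for the implementation: one full adder per bit."""
    carry = 0
    result = 0
    for i in range(WIDTH):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry
        carry = (x & y) | (x & carry) | (y & carry)
        result |= s << i
    return result


rng = random.Random(0)
mismatches = 0
for _ in range(10000):
    adr = rng.randrange(1 << WIDTH)
    shifted_offset = rng.randrange(1 << 8) << rng.choice((0, 1, 2))
    if ripple_add(adr, shifted_offset) != ref_add(adr, shifted_offset):
        mismatches += 1
```

The reference model is short enough to trust by inspection; the low-level model is where bugs are likely to hide, so random tests do the comparing.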

Back to FDIV
- For a floating-point divide, what might the RTL and the golden reference model look like?
  - Reference model is just a divide (one machine instruction)
  - RTL implements the particular division algorithm we’ve chosen (Newton-Raphson, SRT, etc.)

Moving up the hierarchy
- Go up the hierarchy one level at a time
  - Drop the internal generators
  - Keep the monitors
  - Probably keep the predictors
- Industry mostly uses UVM
  - Universal Verification Methodology
  - A set of SystemVerilog classes and functions to support everything we’ve talked about, and more
  - Can also be used with VHDL models

Any issues now? Or is that the end of the story?
- Again: how do you know if you got the right answer?
- How do you know if/when you’re done?
- How do you get it to target parts of the design that you know are quite complex?

Coverage checking
- Where we are:
  - you write a lot of RTL code
  - you write a lot of tests for it at various levels of the hierarchy
- How do you know how much/which RTL you did/didn’t test?
  - What you think you tested ≠ what you actually tested!
- Code coverage: first and simplest metric
  - I.e., how many lines of your RTL ever even got executed? (code in an if/then/else may not be executed)
- Related metrics:
  - Did each signal get set to both 0 and 1?
  - For every state machine, did every state get reached and every arc get traversed? (depends on your compiler recognizing state machines, which usually isn’t hard)
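The "did each signal get set to both 0 and 1?" metric can be sketched in a few lines of Python. Real simulators collect this for you; the class name `ToggleCoverage` and the sample values are invented for the illustration.

```python
class ToggleCoverage:
    """Tracks, per bit of one signal, whether it has been seen at 0 and at 1."""

    def __init__(self, width):
        self.width = width
        self.mask = (1 << width) - 1
        self.seen_one = 0   # bit i set => bit i was observed as 1
        self.seen_zero = 0  # bit i set => bit i was observed as 0

    def sample(self, value):
        self.seen_one |= value & self.mask
        self.seen_zero |= ~value & self.mask

    def uncovered_bits(self):
        """Bit positions that have not yet been seen at both values."""
        both = self.seen_one & self.seen_zero
        return [i for i in range(self.width) if not (both >> i) & 1]


# Hypothetical test run that only ever drives small offsets.
offset_cov = ToggleCoverage(width=8)
for value in (0x00, 0x0F, 0x03):
    offset_cov.sample(value)

gaps = offset_cov.uncovered_bits()  # upper four bits never went to 1
```

Here the report shows bits 4 through 7 of the offset never toggled, which is exactly the "what you think you tested ≠ what you actually tested" situation.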

Code coverage
- Thoughts on those code coverage metrics?
  - clearly necessary
  - clearly not sufficient! (Executing code ≠ testing code)
- More sophisticated metrics:
  - not only check whether nodes have toggled
  - also check if there’s a path from the node being at the wrong value to that wrong value being captured/reported
- How do you know when you’re done?
  - Coverage checking just tells you what you’ve covered, not if what you’ve covered is “enough”
  - Certainly doesn’t guarantee that the chip works

Feedback
- 90-10 rule applies as usual
  - 90% of the work is in hitting 10% of the coverage
- How do we improve that?
- Common methodology:
  - Start with a few focused tests; see if the unit is alive
  - Add constrained-random tests
  - Measure coverage, find gaps
  - Write more focused tests targeted at the gaps
  - Steer your constrained-random tests
- Steering:
  - Bias your random-number generators
  - So they (hopefully) target the code you’re missing
  - Example: you tried to bias your generator to target the bypass muxes, but coverage data tells you that you didn’t
    - So you try again
    - Or, do a better job of targeting whatever other issue is indicated
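Steering can be illustrated with a toy coverage bin that uniform random stimulus almost never reaches. In this Python sketch the bin (a carry out of bit 10 that ripples through bits 11 to 30), the bias knob, and the function names are assumptions for illustration; addresses are kept below 2^39 so the sum cannot overflow 40 bits.

```python
import random


def long_carry(adr, so):
    """Coverage bin: a carry out of bit 10 ripples through bits 11..30."""
    carry_into_11 = ((adr & 0x7FF) + so) >> 11
    upper_all_ones = ((adr >> 11) & 0xFFFFF) == 0xFFFFF
    return bool(carry_into_11) and upper_all_ones


def gen(rng, bias):
    """With probability `bias`, force bits 11..30 of the address to ones."""
    adr = rng.randrange(1 << 39)
    if rng.random() < bias:
        adr |= 0xFFFFF << 11
    so = rng.randrange(1 << 8) << rng.choice((0, 1, 2))
    return adr, so


def count_hits(bias, trials=2000, seed=1):
    rng = random.Random(seed)
    return sum(long_carry(*gen(rng, bias)) for _ in range(trials))


uniform_hits = count_hits(bias=0.0)  # essentially never hits the bin
biased_hits = count_hits(bias=0.5)   # hits it on a useful fraction of tests
```

Comparing the two counts is the feedback loop in miniature: measure the bin, adjust the bias, and measure again to confirm the steering actually worked.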

How do you know you’re done?
- In general, you’re never 100% sure!
  - Other than formal methods (see later)
- So what do people actually do?
  - Track bugs found
  - Don’t stop until the bug rate drops “low enough” and stays there “long enough”
  - And usually that’s “good enough”
  - But remember the statistic about how often we need multiple tapeouts

How do you avoid the FDIV bug?
- FDIV on two 32-bit operands. How long to test them all?
  - 2^32 * 2^32 = 2^64 combinations, about 10^19
  - Assume we test them using silicon, at 1 GHz
  - That’s about 10^10 seconds, or roughly 300 years (2^64 is closer to 1.8 * 10^19, so nearer 600 years; hopeless either way)
  - There’s no way to exhaustively test an FDIV
- How do we not have another $500M charge against earnings?
  - Better public acceptance of bugs
  - Not try any new division algorithms
  - Formal techniques
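The arithmetic on this slide is easy to reproduce. This short Python sketch assumes one divide tested per cycle at 1 GHz and a 365.25-day year; it computes both the rounded figure and the exact one.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length
TESTS_PER_SECOND = 1e9                 # one divide per cycle at 1 GHz

combinations = 2**32 * 2**32           # every pair of 32-bit operands

# Rounded estimate: treat 2**64 as 10**19.
rounded_years = 10**19 / TESTS_PER_SECOND / SECONDS_PER_YEAR

# Exact count: 2**64 is about 1.8e19, so the time roughly doubles.
exact_years = combinations / TESTS_PER_SECOND / SECONDS_PER_YEAR
```

Rounding gives a bit over 300 years and the exact count gives a bit under 600; the conclusion that exhaustive testing is impossible does not depend on which one you use.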

Formal verification of arithmetic
- Definition of a single-bit full adder:
    sum  = a ^ b ^ cin
    cout = a&b | a&cin | b&cin
- Can we prove that the following implementation works?
    p = a ^ b
    g = a & b
    s = p ^ cin
    cout = g | p&cin
- Will that type of strategy work for a carry-bypass adder?
  - In fact, yes! (but it’s quite a lot of Boolean algebra)
- Can you do the same thing for FDIV?
  - It’s lots harder, but you can
  - Lots of effort has gone into this; FDIV gave people religion
- Can you prove that a CPU executes an instruction set?
  - Not even close
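For a single-bit full adder the proof can be done by brute force, since there are only eight input combinations. The Python sketch below checks the propagate/generate implementation against the definition. It proves equivalence by enumeration rather than Boolean algebra, which stops being practical as the input count grows.

```python
from itertools import product


def spec(a, b, cin):
    """Full-adder definition: returns (sum, cout)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)


def impl(a, b, cin):
    """Propagate/generate implementation: returns (s, cout)."""
    p = a ^ b
    g = a & b
    return p ^ cin, g | (p & cin)


# Exhaustive check over all 2**3 input combinations.
equivalent = all(spec(*bits) == impl(*bits)
                 for bits in product((0, 1), repeat=3))
```

Unlike a random test, this covers every case, so a result of `True` is a proof for this tiny circuit.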

    s1 = a & b;
    s2 = a & !b;
    s3 = !a;
    unique case ({s1,s2,s3})
      3'b100: out = in1;
      3'b010: out = in2;
      3'b001: out = in3;
    endcase

- SystemVerilog supports a unique case statement
  - automatically checks that exactly one choice is active during simulation
  - checks during every cycle of every simulation you run
- Is simulation good enough, in general?
  - No. What if there’s some situation that makes all of s1, s2 and s3 high, but we never test that case?

Formal property verification
- Can we do a Boolean proof?
  - Sure. Try it on the board if you want
- This is called formal property verification
- From the Mentor paper: 26% of products do it
- If you have to work through too much logic to do the proof, the tools explode

    s1 = a && b;
    s2 = a && !b;
    s3 = !a;
    unique case ({s1,s2,s3})
      3'b100: out = in1;
      3'b010: out = in2;
      3'b001: out = in3;
    endcase
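With only two inputs, the property "exactly one of s1, s2, s3 is high" can be proved by enumeration. This Python sketch does that, and also shows what a counterexample looks like for a deliberately broken variant. The broken select logic is invented for the illustration and is not from the slides.

```python
from itertools import product


def selects(a, b):
    """Select signals from the slide."""
    return (a and b, a and not b, not a)


def broken_selects(a, b):
    """Hypothetical bug: s3 uses !b instead of !a."""
    return (a and b, a and not b, not b)


def counterexamples(select_fn):
    """All (a, b) inputs for which the selects are not exactly one-hot."""
    return [(a, b) for a, b in product((False, True), repeat=2)
            if sum(select_fn(a, b)) != 1]


proof_ok = counterexamples(selects) == []     # property holds for all inputs
failures = counterexamples(broken_selects)    # inputs that violate it
```

A formal tool reports results in the same shape: either the property holds for every input, or it hands back a concrete input that breaks it.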

Formal protocol verification
- Verifying protocols: networks, cache coherence
  - Deadlock, livelock, forward progress
  - These are very hard to check by normal testing methods
- Arguably the most successful use of formal verification
- Mentor paper: 21% of projects do this

A few formal checkers
- TLA, Murphi and Spin: public-domain tools
  - TLA: Leslie Lamport
  - Murphi: David Dill (Stanford)
  - Spin: primarily targeted at software verification, but also used for protocols. See www.spinroot.com
- JasperGold: commercial software from Cadence

Emulators
- RTL simulation may get ≈ 50 cycles/second
  - Simulating a test with 1000 assembly instructions is fine
  - Simulating O/S boot is not!
- What do you do about that?
- Emulators are one answer
  - Specialized, dedicated (and > $1M!) hardware for simulation
  - Or you can just put your design into multiple FPGAs (much cheaper, but not great if your design is big)
  - Becoming more and more common for big designs

Post-silicon verification
- We’ve been discussing pre-silicon verification
  - RTL model vs. customer spec
- Time to talk about the post-silicon version
  - Silicon vs. customer spec
- Why do we need this?
  - Sometimes silicon ≠ RTL
  - More to the point, pre-silicon verification still left bugs
- Why might we find bugs more easily post-silicon than pre-?
  - Pre-silicon RTL simulator ≈ 50 cycles/second
  - Silicon ≈ 2 GHz
- Run the numbers:
  - (Farm of 100 machines) * (50 cycles/sec) * (31M sec/year) ≈ 155 * 10^9 cycles
  - (Quad-socket system) * (2 GHz) * (20 seconds) ≈ 160 * 10^9 cycles
  - More testing in the first minute post-silicon than in the entire pre-silicon effort!
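The "run the numbers" comparison, spelled out as a few lines of Python using the slide's own figures (the variable names are just for the illustration):

```python
# One year of pre-silicon simulation on a farm of machines.
farm_machines = 100
sim_cycles_per_sec = 50
seconds_per_year = 31_000_000  # the slide's "31M sec/year"
pre_silicon_cycles = farm_machines * sim_cycles_per_sec * seconds_per_year

# Twenty seconds on one quad-socket system of real silicon.
sockets = 4
silicon_hz = 2_000_000_000     # 2 GHz
run_seconds = 20
post_silicon_cycles = sockets * silicon_hz * run_seconds

silicon_wins = post_silicon_cycles > pre_silicon_cycles
```

Twenty seconds of silicon slightly out-runs a full year of the simulation farm, which is why the first minute after power-on tends to exercise the design more than everything that came before.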

The usual questions
- Where do the tests come from, and how do you know if they pass?
- First one: can you boot an O/S?
  - Probably a good 1-20 minutes of code, and mostly if it fails then you don’t boot
  - But you might still boot even if some of the instructions it exercised don’t work; and there’s a lot that it doesn’t test
- Where else do you collect up code?
  - We all have a ton of code lying around, but is it self-checking?
  - That’s the hard part; with a lot of code, you may not easily know if it passed or failed!
  - But you can collect up any self-checking assembly code over the entire history of programming
  - Or really, any self-checking C++ code that you can compile to assembly

Where can we get new tests?
- Old tests may not be the best for a new microarchitecture
- Where do you get new tests?
  - Random code generator (RCG): “randomly” generate assembly code (actually it’s constrained)
- The usual problem:
  - It’s easy to write random code. But how do you know if it worked?
  - Is it even deterministic? (What if you branch on a register or memory that’s not set?)
- A few resolutions:
  - Initialize all of the registers, and all of the memory you plan to use
  - Generate pseudo-random rather than random
  - Use an assembly-code simulator to check the results for all regs
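The three resolutions can be sketched together on a toy machine. In this Python illustration the four-register, three-instruction ISA, the function names, and the seed are all invented; a real RCG targets the actual instruction set and handles memory and branches too.

```python
import random

NUM_REGS = 4
MASK32 = (1 << 32) - 1
OPS = {
    "add": lambda x, y: (x + y) & MASK32,
    "sub": lambda x, y: (x - y) & MASK32,
    "xor": lambda x, y: x ^ y,
}


def gen_program(seed, length=20):
    """Pseudo-random test: the same seed always yields the same program."""
    rng = random.Random(seed)
    # Initialize every register so no result depends on an unset value.
    init_regs = [rng.randrange(1 << 32) for _ in range(NUM_REGS)]
    program = [(rng.choice(sorted(OPS)),      # opcode
                rng.randrange(NUM_REGS),      # destination register
                rng.randrange(NUM_REGS),      # source register 1
                rng.randrange(NUM_REGS))      # source register 2
               for _ in range(length)]
    return init_regs, program


def simulate(init_regs, program):
    """Instruction-level reference simulator: returns the final registers."""
    regs = list(init_regs)
    for op, rd, rs1, rs2 in program:
        regs[rd] = OPS[op](regs[rs1], regs[rs2])
    return regs


init_regs, program = gen_program(seed=7)
expected_regs = simulate(init_regs, program)  # compare silicon against this
```

Because the generator is seeded and every register is initialized, the expected register values are fully determined, so the test becomes self-checking: run the program on silicon and compare all registers with `expected_regs`.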

Papers to read
- “Trends in functional verification: a 2016 industry study”, DAC 2017
- “Functional verification of a multiple-issue, out-of-order, superscalar Alpha processor: the DEC Alpha 21264 microprocessor”, DAC 1998
  - How does this compare to what we discussed in class? To what is done today?
- “Efficient and exhaustive floating point verification using sequential equivalence checking”, DVCon 2017
  - How does this compare to what we discussed in class?
  - What do you think about the amount of human time the process took, and about the formal-verification numbers in the Mentor paper?
- “Post-silicon validation of the IBM POWER8 processor”, DAC 2014
  - What new things did they do that we did not discuss in class?
  - Describe their use of accelerators and irritators; describe the final issue in their “future challenges”
- Guidelines:
  - Everyone reads the Mentor paper
  - 2 people read each of the other 3 papers

- First choice: debug breakpoints; find when the wrong answer commits to a register
- So now you know roughly when the problem happens. Now what?
- Debugging tests: restart-replay, scan for debug, debug triggers

In-class paper discussion
- What statistics from the paper are most interesting or surprising to you?
- If the trends in the paper continue, what does that imply for the future?
- Why do you think changes in specification are such a big problem?
- Do you think some of the trends will get worse or better in the future?
- Do you trust the methodology?
- Terms: hardware emulation, formal validation