Slide1
SMT based predictable analysis of systems code
Shuvendu Lahiri, Microsoft Research, Redmond
Joint work with: S. Qadeer (MSR), J. Condit, B. Hackett, Z. Rakamaric, T. Wies, J. Voung, J. Galeotti
Slide2
Problem
Modular property checking of C modules
  Device drivers, file systems, kernel components, …
  Double-free, lock usage, memory safety, user-provided assertions
Goal: Predictable analysis using SMT solvers
  Efficiently decidable logics
Slide3
HAVOC
Property checker for C programs
Active [’06-’09]
Found 100+ errors in various kernel components
Slide4
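[Figure: HAVOC modular checker. A C program with annotations is translated (C → Boogie, using the memory model) into a Boogie program; Boogie VC generation turns it into an SMT formula; the SMT solver (Z3), with decision procedures for types, lists, and arrays, reports Verified or Warning.]
Slide5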
Challenges in analyzing C
Additional challenges (over Java/C#)
  Lack of type safety
  Presence of low-level data structures
  Explicit memory management (free)
  Bit-wise operations
  …
Slide6
Types
Slide7
Example: Type Checking
typedef struct _LIST_ENTRY {
  struct _LIST_ENTRY *Flink, *Blink;
} LIST_ENTRY, *PLIST_ENTRY;
typedef struct _IRP {
  …
  LIST_ENTRY ListEntry;
  …
} IRP, *PIRP;
[Figure: two IRP objects, each with an embedded ListEntry, linked through Flink and Blink; p points to a ListEntry inside an IRP.]
Slide8
Example: Type Checking
q = CONTAINING_RECORD(p, IRP, ListEntry)
  = (IRP*)((char*)p - (size_t)&((IRP*)0)->ListEntry)
Type Checker: Does variable q have type IRP*?
[Figure: the same two linked IRP objects; p points to the embedded ListEntry and q to the start of the enclosing IRP.]
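The pointer arithmetic behind CONTAINING_RECORD can be sketched in portable C as follows. This is a minimal illustration, not the Windows header definition: the IRP layout (including the Data1 and Data2 fields) and the helper name irp_of_entry are assumptions made for the example, and offsetof stands in for the null-pointer idiom shown above.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

/* Simplified stand-ins for the Windows types on the slide. */
typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink, *Blink;
} LIST_ENTRY;

typedef struct _IRP {
    int        Data1;       /* assumed payload field */
    LIST_ENTRY ListEntry;
    int        Data2;       /* assumed payload field */
} IRP;

/* Step back from the address of `field` to the start of the enclosing
   `type`. offsetof replaces the &((type*)0)->field idiom. */
#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

/* Hypothetical helper: recover the IRP that embeds a given list entry.
   The result is only meaningful if p really points at the ListEntry
   field of an IRP, which is what the type checker has to establish. */
IRP *irp_of_entry(LIST_ENTRY *p) {
    assert(p != NULL);
    return CONTAINING_RECORD(p, IRP, ListEntry);
}
```

Calling irp_of_entry(&irp.ListEntry) on an IRP variable irp yields &irp. Nothing in the C type system stops a caller from passing a LIST_ENTRY that is not embedded in an IRP, which is why the question on the slide cannot be answered by a conventional C type checker.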
Slide9
Example: Property Checking
...
q->Data2 = 42;
Property Checker: Is r->Data1 unchanged?
[Figure: two linked IRP objects, each with fields Data1, Data2, and ListEntry (Flink, Blink); q and r point to the two IRPs.]
Slide10
Example: Property Checking
For all we know, Data1 and Data2 could be aliased!
[Figure: the two objects pointed to by q and r, drawn without their IRP types; each has ListEntry (Flink, Blink) and data fields, and one slot is labeled "Data2 / Data1".]
Slide11
Types in C programs
Types in C programs cannot be trusted
  Unsafe type casts, pointer arithmetic
  Typical type checking in C compilers cannot ensure memory safety
Lack of types hurts property checking
  Disambiguation
Slide12
Lists
Slide13
Simple type-state property
Allocation type-state of DEV_OBJ
  Device Objects (DEV_OBJ) allocated and freed
Property to check for a module
  IoDeleteDevice() only called on MyDevObj
[Figure: type-state diagram with states ~MyDevObj and MyDevObj and transitions labeled IoCreateDevice() and IoDeleteDevice().]
Slide14
Simple property
simple invariants
NT_STATUS Unload(…) {
  ….
  iter = hd->First;
  while (iter != null) {
    RemoveEntryList(iter);
    iter = iter->Next;
    IoDeleteDevice(iter->Self);
  }
  ….
}
[Figure: hd->First leads to a list of DEV_EXT nodes linked by Next; each DEV_EXT has a Self pointer to a DEV_OBJ, whose DevExt pointer leads back to it.]
Pointers from the list point to distinct objects
Slide15
Lists
Prevalent in most systems code
Manipulated by explicit pointer operations
  Updates to next fields
Slide16
This talk
Focus on two of these challenges
  Lack of type-safety
  Presence of low-level data structures
Solution
  New efficient SMT theories for the above problems
Slide17
Overview
Motivation
Background
Exploiting types [POPL’09]
Logic for lists [POPL’08]
Application [CAV’09]
Slide18
Program Correctness: Floyd-Hoare Triple
Floyd-Hoare triple {P} S {Q}
  P, Q: predicates/properties
  S: a program
From a state satisfying P, if S executes,
  No assertion in S fails, and
  Terminating executions end up in a state satisfying Q
Slide19
Program verification → Formula
{ b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid
iff
Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10 is valid
  theory of equality: f, =
  theory of arithmetic: 5, 10, +
  theory of arrays: Select, Store
[Nelson & Oppen ’79]
Slide20
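The validity claim for the triple { b.f = 5 } a.f = 5 { a.f + b.f = 10 } on the preceding slide can be checked by hand with the two standard array axioms. The derivation below is a worked sketch added for illustration; the case split on a = b is the only step the slide leaves implicit.

```latex
\begin{align*}
\text{Assume: } & \mathrm{Select}(f_1, b) = 5, \qquad f_2 = \mathrm{Store}(f_1, a, 5) \\
\text{Array axiom (read over write, same index): } & \mathrm{Select}(f_2, a) = 5 \\
\text{Array axiom (read over write, other index): } & \mathrm{Select}(f_2, b) =
  \begin{cases}
    5 & \text{if } a = b \\
    \mathrm{Select}(f_1, b) = 5 & \text{if } a \neq b
  \end{cases} \\
\text{Arithmetic: } & \mathrm{Select}(f_2, a) + \mathrm{Select}(f_2, b) = 5 + 5 = 10
\end{align*}
```

Both cases give Select(f2, b) = 5, so the formula holds whether or not a and b alias. Each line uses exactly one of the three theories listed on the slide.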
Satisfiability-Modulo-Theory (SMT)
Boolean satisfiability solving + theory reasoning
Ground theories
  Equality, arithmetic, arrays, bit-vectors, ….
Powerful methods to combine decision procedures for theories [Nelson & Oppen ’79]
Phenomenal progress in the past few years
  Z3, MathSAT, Yices, ….
Works best for NP-complete theories
Slide21
Overview
Motivation
Background
Exploiting types
Logic for lists
Case study
Slide22
Memory model for C
Each pointer is an integer
Heap as a map
  // Mutable
  Mem: int → int
  Alloc: int → {UNALLOCATED, ALLOCATED, FREED}
  // Immutable
  Base: int → int   // base address of each pointer
Slide23
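The three maps on the preceding slide can be mimicked in a small executable model. This is a hedged sketch for intuition only, not HAVOC's encoding: the heap size, the bump allocator, and the names model_malloc, model_free, safe_to_access, model_store, and model_load are all assumptions. The slide treats Base as immutable; here it is filled in at allocation time, which is equivalent because the allocator never reuses addresses.

```c
#include <assert.h>
#include <stdbool.h>

#define HEAP_WORDS 64   /* assumed heap size; address 0 plays the role of NULL */

typedef enum { UNALLOCATED, ALLOCATED, FREED } AllocState;

static int        Mem[HEAP_WORDS];    /* mutable: value stored at each address */
static AllocState Alloc[HEAP_WORDS];  /* mutable: allocation state of each base address */
static int        Base[HEAP_WORDS];   /* base address of the object containing each address */
static int        next_free = 1;

/* Bump allocator: returns the base address of a fresh object, or 0 if full. */
int model_malloc(int words) {
    assert(words > 0);
    if (next_free + words > HEAP_WORDS) return 0;
    int base = next_free;
    for (int i = 0; i < words; i++) Base[base + i] = base;
    Alloc[base] = ALLOCATED;
    next_free += words;
    return base;
}

/* Returns false (a "warning") on a double free or a free of a non-base pointer. */
bool model_free(int p) {
    if (p <= 0 || p >= HEAP_WORDS) return false;
    if (Base[p] != p || Alloc[p] != ALLOCATED) return false;
    Alloc[p] = FREED;
    return true;
}

/* An access is memory safe only while the enclosing object is allocated. */
bool safe_to_access(int p) {
    return p > 0 && p < HEAP_WORDS && Base[p] > 0 && Alloc[Base[p]] == ALLOCATED;
}

/* Checked write and read through the Mem map. */
bool model_store(int p, int v) {
    if (!safe_to_access(p)) return false;
    Mem[p] = v;
    return true;
}

int model_load(int p) {
    assert(safe_to_access(p));
    return Mem[p];
}
```

In this model a double free shows up as a second model_free on the same base address returning false, and a use after free as safe_to_access returning false. A verifier would state the same conditions as assertions over Alloc and Base rather than execute them.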
C → Boogie
Boogie:
function f_DATA: int -> int;
forall u: int:: f_DATA(u) = u + 40;
procedure create() returns d:int{
  var @a: int;
  @a := malloc(4);
  d := call malloc(44);
  call init(g_DATA(d), 10, @a);
  Mem[f_DATA(d)] := Mem[@a];
  Mem[g_DATA(d) + 1*4] := 2;
  free(@a);
  return;
}
C:
typedef struct {
  int g[10]; int f;
} DATA;
DATA *create() {
  int a;
  DATA *d = (DATA*) malloc(sizeof(DATA));
  init(d->g, 10, &a);
  d->f = a;
  d->g[1] = 2;
  return d;
}
Slide24
Missing part: Types?
Types in C programs can’t be trusted
Lack of types hurts property checking
Slide25
Our Approach [POPL’09]
Type checking → assertion checking
Provide formal semantics for C and its types
Use types to improve the property checker
Provide Java-style field disambiguation
Provide decision procedures for the assertion checking
Slide26
Formalizing Type Safety
A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type.
Mem : addr -> value
Type : addr -> type
HasType : value x type -> bool
for all a in addr, HasType(Mem(a), Type(a))
Slide27
Modeling the Heap
Mem : addr -> value
  Gives value stored at each heap location
  Values are integers
Type : addr -> type
  Gives declared type for each heap location
  Types include Int, Ptr(Int), …
Slide28
“Match” Predicate
Match : addr x type -> bool
Lifts the Type map to multi-word types
Match(a, t) holds iff Type[a … n] matches t
Type map:
  address:  …  99   100  101       102  103       …
  Type:     …  Int  Int  Ptr(Int)  Int  Ptr(Foo)  …
C Type and HAVOC Axiom:
  int
    Match(a, Int) <==> Type[a] == Int
  int *
    Match(a, Ptr(Int)) <==> Type[a] == Ptr(Int)
  struct foo { int n; int m; int *p; }
    Match(a, Foo) <==> Match(a, Int) && Match(a+1, Int) && Match(a+2, Ptr(Int))
Examples:
  Match(99, Int)
  Match(101, Ptr(Int))
  Match(99, Foo)
  ¬Match(101, Foo)
Slide29
“HasType” Predicate
HasType : value x type -> bool
Defines which values belong to each type
HasType(v, t) holds iff v is a value of type t
C Type and HAVOC Axiom:
  int
    HasType(v, Int) <==> true
  t*
    HasType(v, Ptr(t)) <==> v == 0 || (v > 0 && Match(v, t))
Type map:
  address:  …  99   100  101       102  103       …
  Type:     …  Int  Int  Ptr(Int)  Int  Ptr(Foo)  …
Examples:
  HasType(99, Ptr(Foo))
  ¬HasType(101, Ptr(Foo))
Slide30
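The Match and HasType axioms from the two preceding slides can be executed directly against the example Type map. The sketch below is an illustration under assumptions: the enum encoding of types, the heap size, and the helper init_slide_heap are invented for the example, and only the types that appear on the slides are covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding of the types that appear on the slides. */
typedef enum { T_NONE, T_INT, T_PTR_INT, T_FOO, T_PTR_FOO } Ty;

#define HEAP_WORDS 128
static Ty Type[HEAP_WORDS];   /* declared type of each heap word; T_NONE if untyped */

/* Fill in the Type map drawn on the slides (addresses 99..103). */
void init_slide_heap(void) {
    Type[99]  = T_INT;
    Type[100] = T_INT;
    Type[101] = T_PTR_INT;
    Type[102] = T_INT;
    Type[103] = T_PTR_FOO;
}

/* Match(a, t): the words starting at address a are laid out as a value of type t. */
bool Match(int a, Ty t) {
    assert(t != T_NONE);                          /* T_NONE is not a queryable type */
    if (a < 0 || a >= HEAP_WORDS) return false;
    switch (t) {
    case T_INT: case T_PTR_INT: case T_PTR_FOO:
        return Type[a] == t;                      /* one-word types */
    case T_FOO:                                   /* struct foo { int n; int m; int *p; } */
        return Match(a, T_INT) && Match(a + 1, T_INT) && Match(a + 2, T_PTR_INT);
    default:
        return false;
    }
}

/* HasType(v, t): the integer v is a legal value of type t. */
bool HasType(int v, Ty t) {
    switch (t) {
    case T_INT:     return true;                  /* every integer is an Int */
    case T_PTR_INT: return v == 0 || (v > 0 && Match(v, T_INT));
    case T_PTR_FOO: return v == 0 || (v > 0 && Match(v, T_FOO));
    default:        return false;
    }
}
```

After init_slide_heap(), the four Match examples and the two HasType examples from the slides evaluate as stated there: address 99 starts a Foo, address 101 does not. In HAVOC these definitions are axioms handed to the SMT solver rather than functions that run, so the code only mirrors their meaning.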
Type Safety Invariant
for all a in addr, HasType(Mem(a), Type(a))
  Part of preconditions, postconditions, loop invariants
  Assert at every program point
Add similar assertions for locals (if desired)
Slide31
Decision Procedure
Verification conditions refer to Mem, Type, Match, HasType, Type-safety invariant
Decision problem: NP-complete
Provide decision procedure using an SMT solver
  Suffices to instantiate the quantifiers in these axioms on a fixed set of terms
Slide32
Example: Type Checking
q = CONTAINING_RECORD(p, IRP, ListEntry)
  = (IRP*)((char*)p - (size_t)&((IRP*)0)->ListEntry)
Type Checker: Does variable q have type IRP*?
[Figure: the same two linked IRP objects; p points to the embedded ListEntry and q to the start of the enclosing IRP.]
Slide33
Solution: Add Preconditions
requires(HasType(ENCL(p), record*) && ENCL(p) != NULL)
void init_record(list *p) {
  record *r = CONTAINING_RECORD(p, record, node);
  r->data2 = 42;
}
#define ENCL(x) CONTAINING_RECORD(x, record, node)
Slide34
Field Safety Invariant
Field safety
  Refinement of type safety
  Disambiguate two fields of same type
Change
  HasType/Match are refined to distinguish different field names of same type
Slide35
Adding Field Names
struct list { list *prev; list *next; }
struct record { int data1; list node; int data2; }
Match(a, List) <==> Match(a, Ptr(List)) && Match(a+1, Ptr(List))
Match(a, Record) <==> Match(a, int) && Match(a+1, List) && Match(a+3, int)
Match(a, Ptr(List)) <==> Type[a] == Ptr(List)
HasType(v, Ptr(List)) <==> v == 0 || (v > 0 && Match(v, List))
Match(a, int) <==> Type[a] == int
HasType(v, int) <==> true
Slide36
Adding Field Names
struct list { list *prev; list *next; }
struct record { int data1; list node; int data2; }
Match(a, List) <==> Match(a, Prev) && Match(a+1, Next)
Match(a, Record) <==> Match(a, Data1) && Match(a+1, List) && Match(a+3, Data2)
Match(a, Prev) <==> Type[a] == Prev
HasType(v, Prev) <==> v == 0 || (v > 0 && Match(v, List))
Match(a, Data1) <==> Type[a] == Data1
HasType(v, Data1) <==> true   (same definition as Int)
… same for Next and Data2 …
Slide37
Experiments
Implementation supports full C language
  Supports polymorphism
  Supports user-defined, dependent types
Annotated and checked four Windows drivers
  Sample drivers provided with Windows DDK
Slide38
Enables field splitting
Can split the heap for “field-safe” programs
  One heap map per word-type field and pointer type (almost!)
  Mem_f : addr → val
  Mem_g : addr → val
  Mem_T* : addr → val
Simple example
  C code: x->f = 1;
  Boogie code: Mem_f[x + Offset(f)] := 1;
Disambiguates writes to fields + faster checking
Slide39
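The effect of the field splitting described on the preceding slide can be seen by comparing a single heap map with one map per field. This is a toy sketch under assumptions: the record layout struct rec { int f; int g; }, the offsets, and all function names are invented for the example.

```c
#include <assert.h>

#define HEAP_WORDS 32

/* Assumed word offsets for: struct rec { int f; int g; }; */
enum { OFFSET_F = 0, OFFSET_G = 1 };

static int Mem[HEAP_WORDS];     /* unified model: one map for every heap word */
static int Mem_f[HEAP_WORDS];   /* split model: one map per field */
static int Mem_g[HEAP_WORDS];

/* x->f = v and y->g in the unified model. */
void unified_write_f(int x, int v) {
    assert(x >= 0 && x + OFFSET_F < HEAP_WORDS);
    Mem[x + OFFSET_F] = v;
}
int unified_read_g(int y) {
    assert(y >= 0 && y + OFFSET_G < HEAP_WORDS);
    return Mem[y + OFFSET_G];
}

/* The same two operations in the split model. */
void split_write_f(int x, int v) {
    assert(x >= 0 && x + OFFSET_F < HEAP_WORDS);
    Mem_f[x + OFFSET_F] = v;
}
int split_read_g(int y) {
    assert(y >= 0 && y + OFFSET_G < HEAP_WORDS);
    return Mem_g[y + OFFSET_G];
}
```

With y = 4 and x = 5, a unified write to x->f lands on the word that y->g reads, so a prover working over Mem must first rule out x == y + 1. In the split model the write to Mem_f cannot affect Mem_g at all. The split is only sound when the program is field safe, because field safety is what excludes an address such as x = y + 1 from being used as a record pointer.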
Why almost?
struct A { int a; int b; };
struct B { int c; int d; int e; };
void P(struct B *x) {
  struct A *y = (struct A*) x;
  y->a = 1;
  assert(x->c == 1);
}
Field safety assertion will fail
Have to merge {a, c} {b, d}
Slide40
Summary
Types as an additional part of the state
Type safety checking → assertion checking
  Efficiently decidable (NP) logic
Separation of concerns for property checking
Can exploit field disambiguation for “field-safe” programs
Slide41
Overview
Motivation
Background
Exploiting types
Logic for lists
Case study
Slide42
Logic for lists
SMT theory with new predicate symbols
Slide43
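Reachability predicate: Btwn_f
[Figure: a doubly linked list of three nodes, each with next, prev, and data fields, with pointers x and y into the list; the nodes from x to y along next form Btwn_next(x, y), and the nodes from y to x along prev form Btwn_prev(y, x).]
Slide44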
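The reachability predicate Btwn_f from the preceding slide can be given an operational reading on a concrete heap. The sketch below assumes the usual interpretation: Btwn_f(x, y) is the set of cells on the f-path from x up to the first occurrence of y, both endpoints included, and it is empty when y is not reachable from x. The array encoding and the names next_of, btwn, and build_list are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define NODES 8
#define NIL   0                 /* index 0 plays the role of null */

/* Assumed heap encoding: next_of[i] is the f-successor of node i. */
static int next_of[NODES];

/* btwn(z, x, y): is z in Btwn_f(x, y)?
   Walk from x along f; z qualifies if it is met no later than the first
   occurrence of y. If y is never reached, the set is empty. */
bool btwn(int z, int x, int y) {
    bool seen_z = false;
    int cur = x;
    for (int steps = 0; steps <= NODES; steps++) {   /* bound guards against cycles */
        assert(cur >= 0 && cur < NODES);
        if (cur == z) seen_z = true;
        if (cur == y) return seen_z;
        cur = next_of[cur];
    }
    return false;                                    /* y not reachable from x */
}

/* Build the acyclic list 1 -> 2 -> 3 -> NIL. */
void build_list(void) {
    next_of[1] = 2;
    next_of[2] = 3;
    next_of[3] = NIL;
    next_of[NIL] = NIL;
}
```

On the list built by build_list(), node 2 lies in Btwn_f(1, 3) but node 3 does not lie in Btwn_f(1, 2), and nothing lies in Btwn_f(3, 1) because 1 is not reachable from 3. The decision procedure in the talk reasons about this predicate symbolically through rewrite rules; the loop above only fixes the intended meaning on one heap.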
Inverse of a function: f⁻¹
[Figure: the same three-node doubly linked list with pointers x and y; the data fields of the nodes at x and y both refer to w.]
data⁻¹(w) = {x, y}
Slide45
Expressive logic
Express properties of collections
  ∀x ∈ Btwn_f(f(hd), hd). state(x) = LOCKED   // cyclic
Arithmetic reasoning on data (e.g. sortedness)
  ∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y)
Type/object invariants
  ∀x ∈ Type⁻¹(“__logentry”). logtype(x) > 0 ⇒ file_name(x) != null
Slide46
Can express desired invariants
NT_STATUS Unload(…) {
  ….
  iter = hd->First;
  while (iter != null) {
    RemoveEntryList(iter);
    iter = iter->Next;
    IoDeleteDevice(iter->Self);
  }
  ….
}
[Figure: hd->First leads to a list of DEV_EXT nodes linked by Next; each DEV_EXT has a Self pointer to a DEV_OBJ, whose DevExt pointer leads back to it.]
∀x ∈ Btwn_Next(hd->First, NULL). x->Self->DevExt = x
OR
∀x ∈ Btwn_Next(hd->First, NULL). Self⁻¹(x->Self) = {&x->Self}
Slide47
Precise and efficient [POPL ‘08]
Precision
  Given a Floyd-Hoare triple {P} S {Q}, where P and Q are in the assertion logic and S is a loop-free, call-free code fragment
  There is a formula in the assertion logic
    Linear in the size of the triple
    Valid iff the triple holds
Efficiency
  The decision problem is NP-complete
Slide48
Ground Logic
t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t)
G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G
Logic
S ∈ Set ::= f⁻¹(t) | Btwn_f(t1, t2)
F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F
Slide49
Ground decision procedure
Provide a set of 10 rewrite rules for Btwn_f
  Sound, complete and terminating
E.g. Transitivity3
  t1 ∈ Btwn_f(t0, t2)    t ∈ Btwn_f(t0, t1)
  ------------------------------------------
  t ∈ Btwn_f(t0, t2),  t1 ∈ Btwn_f(t, t2)
Slide50
Logic
t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t)
G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G
S ∈ Set ::= f⁻¹(t) | Btwn_f(t1, t2)
F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F
Bounded quantification over interpreted sets
Slide51
Sort restriction
The unsorted logic is undecidable
Unsorted logic
Sorted logic
Each term has a sort D, each function f has a sort D → E
There is a partial order on the sorts
Sort-restriction on ∀x ∈ S. F
  sort(x) should be less than sort(t[x]) for any term t[x] inside F
Slide52
Sort restriction
Sort-restriction on ∀x ∈ S. F
  sort(x) should be less than sort(t[x]) for any term t[x] inside F
Sorts are quite natural
  Come from program types
Most interesting specifications can be expressed
  See paper for exceptions
Slide53
Evaluation
Compare with an incomplete axiomatization of reachability predicate [TACAS’07]
Greatly improved the predictability of the verifier
  Reduced runtimes (2X – 100X)
  Eliminate need for carefully crafted axioms and invariants
  Can handle newer examples
Slide54
Assertion logic of HAVOC
Theories
  Ground logics (linear arithmetic, uninterpreted functions, arrays, …), lists
  Can quantify over lists, arrays, types
Decision procedures (based on rewriting)
  Implemented using quantifiers + triggers in Boogie/Z3
Slide55
Experience
Quite encouraging compared to previous efforts based on axiomatizing lists
  On medium benchmarks (list insert, delete, sort, …)
Slide56
Overview
Motivation
Background
Exploiting types
Logic for lists
Case study
Slide57
Case study
Can we verify properties on reasonably large modules in the presence of lists?
  With high automation
Challenge
  Synthesizing the quantified invariants starting from a property is difficult for real code
Answer: sometimes
  If we can identify how to split the burden between a user and annotation inference
Slide58
Intra module inference [CAV ’09]
Setting
  A module with a few public and a lot of private methods (e.g. device drivers)
High level idea
  Let the user specify the module invariant on public APIs
  An inference engine infers “exceptions” from the module invariant for private methods
Slide59
Module invariant broken
TypeInvDO: ∀x ∈ MyDevObj. x->DevExt->Self = x
TypeInvDOExcept(y): ∀x ∈ MyDevObj. x = y || x->DevExt->Self = x
requires(TypeInvDO)
ensures(TypeInvDO)
void publicFoo() {
  PDEV_OBJ do = NewDEV_OBJ();
  privateBar(do);
}
requires(TypeInvDOExcept(do))
requires(TypeInvDO)
ensures(TypeInvDO)
void privateBar(PDEV_OBJ do) {
  do->DevExt->Self = do;
}
[Figure: a DEV_OBJ x whose DevExt field points to a DEV_EXT, whose Self field points back to x.]
Slide60
Intra module inference
User specifies the module invariants
  Module invariants are simple, as they are on the “steady state”
  Usually 1 or 2 such invariants
Add “candidate annotations” for module invariant exceptions
  Simple heuristics to infer the “exceptions”
Run Houdini [Flanagan & Leino ’01] to perform monomial predicate abstraction
Slide61
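The Houdini step named on the preceding slide prunes a set of candidate annotations down to the largest subset that the verifier can prove when all members are assumed together. The loop below is a minimal sketch of that fixpoint, not HAVOC's implementation: the CheckFn callback stands in for a call to the verifier, and toy_checker is an invented example with hard-coded dependencies between four candidates.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CANDIDATES 16

/* Assumed stand-in for the verifier: returns true when candidate `c` can be
   proved while assuming every candidate still marked in `kept`. */
typedef bool (*CheckFn)(int c, const bool kept[], int n);

/* Houdini-style fixpoint: start from all candidates and drop every one the
   checker refutes, repeating until a full pass removes nothing.
   Returns the number of surviving candidates. */
int houdini(bool kept[], int n, CheckFn holds) {
    assert(n >= 0 && n <= MAX_CANDIDATES);
    for (int i = 0; i < n; i++) kept[i] = true;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int c = 0; c < n; c++) {
            if (kept[c] && !holds(c, kept, n)) {
                kept[c] = false;
                changed = true;
            }
        }
    }
    int survivors = 0;
    for (int i = 0; i < n; i++) survivors += kept[i] ? 1 : 0;
    return survivors;
}

/* Toy checker: candidate 0 is always provable, candidate 1 never is,
   candidate 2 is provable only while 1 is assumed, and candidate 3 only
   while 0 is assumed. */
bool toy_checker(int c, const bool kept[], int n) {
    (void)n;
    switch (c) {
    case 0:  return true;
    case 1:  return false;
    case 2:  return kept[1];
    case 3:  return kept[0];
    default: return false;
    }
}
```

Running houdini with toy_checker on four candidates drops candidate 1 outright and then candidate 2, whose proof depended on it, leaving candidates 0 and 3. Termination follows because each pass either removes a candidate or stops, so the number of checker calls is at most quadratic in the number of candidates.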
Intra-module inference
Applied to 4 device drivers
  Lack of double-free, lock API usage
  ~7KLOC, ~50 loops/procedures
Module invariants over lists and arrays
Verified the properties with ~10 manual annotations, ~1000 inferred annotations
Slide62
Conclusion
Difficulty for reasoning about C programs
  Lack of type safety
  Low level lists
Provided new efficient logics on top of SMT solvers
Able to apply them to verify programs several thousand lines large
Slide63
Download
New version to be released shortly
Current version at
  http://research.microsoft.com/en-us/projects/havoc/
Try out VCC
  A functional verifier for concurrent C
  http://vcc.codeplex.com/
Visit Rise4Fun webpage for other tools
  http://rise4fun.com
Slide64
HAVOC references
C memory model
  Basic memory model [TACAS ‘07]
  Encoding types [POPL ’09]
Decision procedures for lists [POPL ‘06, ‘08]
Annotation inference
  Complexity [CADE ‘09]
  Intra-module inference [CAV ‘09]
  Transparent inference [VMCAI ‘11]
Local reasoning
  Call invariants [NFM ‘11]
  Linear maps [PLPV ‘11]
Property checking in the large [VSTTE ‘10]
Slide65
Related work
Logics for data structures
  Lists [Nelson POPL‘83]
  PALE [Moller et al. PLDI’01]
  ….
  Strand [Madhusudan et al. POPL’11]
  Trees [Wies et al. CADE’11]
C memory model + types
  VCC [Cohen et al. SSV‘09]
  …
Separation logic
  [O’ Hearn, Reynolds, Yang CSL‘01]
Slide66
Questions