SMT based predictable analysis of systems code

Uploaded On 2017-08-09



Presentation Transcript

Slide1

SMT based predictable analysis of systems code

Shuvendu Lahiri
Microsoft Research, Redmond

Joint work with: S. Qadeer (MSR), J. Condit, B. Hackett, Z. Rakamaric, T. Wies, J. Voung, J. Galeotti

Slide2

Problem

Modular property checking of C modules
Device drivers, file systems, kernel components, …
Double-free, lock usage, memory safety, user-provided assertions

Goal: Predictable analysis using SMT solvers
Efficiently decidable logics

Slide3

HAVOC

Property checker for C programs
Active [’06-’09]
Found 100+ errors in various kernel components

Slide4

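[Figure: HAVOC tool flow, C → Boogie → SMT. A C program with annotations is passed to the HAVOC modular checker (memory model), which produces a Boogie program; Boogie VC gen produces an SMT formula; the SMT solver (Z3), with decision procedures for types, lists, arrays, reports Verified or Warning.]

Slide5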

Challenges imposed for analyzing C

Additional challenges (over Java/C#)
Lack of type safety
Presence of low-level data structures
Explicit memory management (free)
Bit-wise operations
……

Slide6

Types

Slide7

Example: Type Checking

[Figure: two IRP structures, each with an embedded ListEntry holding Flink and Blink pointers; p points at a ListEntry.]

typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink, *Blink;
} LIST_ENTRY, *PLIST_ENTRY;

typedef struct _IRP {
    ….
    LIST_ENTRY ListEntry;
} IRP, *PIRP;

Slide8

Example: Type Checking

[Figure: the same two IRP structures; p points at a ListEntry, q points at the IRP that encloses it.]

q = CONTAINING_RECORD(p, IRP, ListEntry)
  = (IRP*)((char*)p - (char*)&((IRP*)0)->ListEntry)

Type Checker: Does variable q have type IRP*?
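The CONTAINING_RECORD computation above can be tried out in plain C. The sketch below is an illustration only: it uses the standard `offsetof` macro rather than the null-pointer idiom shown on the slide, and the `Data1`/`Data2` fields and the helper `containing_record_roundtrip` are assumptions made up for this example, not the real Windows IRP layout.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

/* Simplified stand-ins for the slide's types (hypothetical layout). */
typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink, *Blink;
} LIST_ENTRY;

typedef struct _IRP {
    int Data1;
    LIST_ENTRY ListEntry;
    int Data2;
} IRP;

/* Recover the enclosing record from a pointer to one of its fields. */
#define CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

/* Returns 1 if the macro maps &irp.ListEntry back to &irp. */
int containing_record_roundtrip(void) {
    IRP irp = { 1, { NULL, NULL }, 2 };
    LIST_ENTRY *p = &irp.ListEntry;
    IRP *q = CONTAINING_RECORD(p, IRP, ListEntry);
    return q == &irp && q->Data2 == 2;
}
```

The C compiler accepts the cast whether or not p really points into an IRP, which is exactly the question the type checker has to answer.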

Slide9

Example: Property Checking

[Figure: two IRP structures, each with ListEntry (Flink, Blink), Data1 and Data2 fields; q points at one structure and r at the other.]

...
q->Data2 = 42;

Property Checker: Is r->Data1 unchanged?

Slide10

Example: Property Checking

[Figure: the structures pointed to by q and r drawn overlapping in memory, so that the Data2 slot of one coincides with the Data1 slot of the other.]

For all we know, Data1 and Data2 could be aliased!
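A small sketch of why the checker cannot rule this out without type information. It assumes a word-addressed heap in which pointers are plain integer indices; the field offsets and the function name are hypothetical choices for the illustration.

```c
#include <assert.h>

/* Hypothetical word offsets of the fields inside one record. */
enum { DATA1 = 0, FLINK = 1, BLINK = 2, DATA2 = 3, HEAP_WORDS = 16 };

static int Mem[HEAP_WORDS];   /* the whole heap as one untyped map */

/* Performs "q->Data2 = 42" and reports whether "r->Data1" changed.
   Callers must keep q + DATA2 and r + DATA1 below HEAP_WORDS. */
int write_changes_r_data1(int q, int r) {
    Mem[r + DATA1] = 7;            /* known starting value */
    Mem[q + DATA2] = 42;           /* q->Data2 = 42 */
    return Mem[r + DATA1] != 7;
}
```

With q = 0 and r = 8 the write leaves r->Data1 alone, but with q = 5 and r = 8 both accesses hit address 8. Nothing in an untyped heap forbids the second situation.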

Slide11

Types in C programs

Types in C programs cannot be trusted
Unsafe type casts, pointer arithmetic
Typical type checking in C compilers cannot ensure memory safety

Lack of types hurts property checking
Disambiguation

Slide12

Lists

Slide13

Simple type-state property

Allocation type-state of DEV_OBJ
Device Objects (DEV_OBJ) allocated and freed

Property to check for a module
IoDeleteDevice() only called on MyDevObj

[Figure: type-state diagram with states ~MyDevObj and MyDevObj and transitions labeled IoCreateDevice() and IoDeleteDevice().]

Slide14

Simple property

simple invariants

NT_STATUS Unload(…){
    ….
    iter = hd->First;
    while (iter != null) {
        RemoveEntryList(iter);
        iter = iter->Next;
        IoDeleteDevice(iter->Self);
    }
    ….
}

[Figure: hd->First points to a list of DEV_EXT nodes linked by Next; each DEV_EXT has a Self pointer to a DEV_OBJ, and each DEV_OBJ has a DevExt pointer back to its DEV_EXT.]

Pointers from the list point to distinct objects

Slide15

Lists

Prevalent in most systems code
Manipulated by explicit pointer operations
Updates to next fields

Slide16

This talk

Focus on two of these challenges
Lack of type-safety
Presence of low-level data structures

Solution
New efficient SMT theories for the above problems

Slide17

Overview

Motivation
Background
Exploiting types [POPL’09]
Logic for lists [POPL’08]
Application [CAV’09]

Slide18

Program Correctness: Floyd-Hoare Triple

Floyd-Hoare triple {P} S {Q}

P, Q: predicates/properties
S: a program

From a state satisfying P, if S executes,
No assertion in S fails, and
Terminating executions end up in a state satisfying Q

Slide19

Program verification → Formula

{ b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid

iff

Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10 is valid

theory of equality: f, =
theory of arithmetic: 5, 10, +
theory of arrays: Select, Store

[Nelson & Oppen ’79]

Slide20
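The Select/Store formula from the Floyd-Hoare example can be checked by brute force on a tiny domain. This is only a sanity-check sketch: an SMT solver proves validity symbolically for all maps, whereas the code below enumerates maps over three objects with field values in {4, 5, 6}. The `Map` type and all function names are assumptions introduced for the example.

```c
#include <assert.h>

enum { N = 3 };                        /* objects are addresses 0..N-1 */

typedef struct { int cell[N]; } Map;   /* a field map: object -> value */

static int Select(Map m, int i) { return m.cell[i]; }

/* Store returns an updated copy and leaves the original map untouched. */
static Map Store(Map m, int i, int v) { m.cell[i] = v; return m; }

/* Select(f1,b) = 5 && f2 = Store(f1,a,5) ==> Select(f2,a) + Select(f2,b) = 10 */
static int vc_holds(Map f1, int a, int b) {
    if (Select(f1, b) != 5) return 1;  /* antecedent false: implication holds */
    Map f2 = Store(f1, a, 5);
    return Select(f2, a) + Select(f2, b) == 10;
}

/* Enumerates all 27 maps and all pairs (a, b), including a == b. */
int vc_valid_on_small_domain(void) {
    for (int code = 0; code < 27; code++) {
        Map f1;
        int c = code;
        for (int i = 0; i < N; i++) { f1.cell[i] = 4 + c % 3; c /= 3; }
        for (int a = 0; a < N; a++)
            for (int b = 0; b < N; b++)
                if (!vc_holds(f1, a, b)) return 0;
    }
    return 1;
}
```

The case a == b matters: the formula stays valid there because the store then also sets b.f to 5.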

Satisfiability-Modulo-Theory (SMT)

Boolean satisfiability solving + theory reasoning

Ground theories
Equality, arithmetic, arrays, bit-vectors, ….

Powerful methods to combine decision procedures for theories [Nelson & Oppen ’79]

Phenomenal progress in the past few years
Z3, Mathsat, Yices, ….

Works best for NP-complete theories

Slide21

Overview

Motivation
Background
Exploiting types
Logic for lists
Case study

Slide22

Memory model for C

Each pointer is an integer
Heap as a map

// Mutable
Mem: int -> int
Alloc: int -> {UNALLOCATED, ALLOCATED, FREED}

// Immutable
Base: int -> int   // base address of each pointer

Slide23
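The three maps of the memory model (Mem, Alloc, Base) can be mimicked with arrays. The sketch below is an assumption-laden toy, not HAVOC's encoding: the bump allocator, the heap size and the `model_*` function names are invented for illustration, and `Base` is filled in at allocation time here although the slide treats it as an immutable map.

```c
#include <assert.h>

enum { HEAP_WORDS = 64 };
typedef enum { UNALLOCATED, ALLOCATED, FREED } AllocState;

static int        Mem[HEAP_WORDS];    /* value stored at each address          */
static AllocState Alloc[HEAP_WORDS];  /* allocation state of each address      */
static int        Base[HEAP_WORDS];   /* base address of the enclosing block   */
static int        next_free = 1;      /* address 0 plays the role of NULL      */

/* Hypothetical bump allocator: returns the base of a fresh block, or 0. */
int model_malloc(int words) {
    if (words <= 0 || next_free + words > HEAP_WORDS) return 0;
    int base = next_free;
    for (int a = base; a < base + words; a++) { Alloc[a] = ALLOCATED; Base[a] = base; }
    next_free += words;
    return base;
}

/* Returns 1 on success, 0 if p is not the base of a live block (e.g. double free). */
int model_free(int p) {
    if (p <= 0 || p >= HEAP_WORDS || Base[p] != p || Alloc[p] != ALLOCATED) return 0;
    for (int a = p; a < HEAP_WORDS && Base[a] == p; a++) Alloc[a] = FREED;
    return 1;
}

/* Writes v at address a; returns 0 if a is not currently allocated. */
int model_store(int a, int v) {
    if (a <= 0 || a >= HEAP_WORDS || Alloc[a] != ALLOCATED) return 0;
    Mem[a] = v;
    return 1;
}

/* Reads address a; returns 0 for out-of-range addresses. */
int model_load(int a) { return (a > 0 && a < HEAP_WORDS) ? Mem[a] : 0; }
```

In this model a double free or a write after free is simply a state in which an operation returns 0; a checker would instead assert that this never happens.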

C → Boogie

C program:

typedef struct {
    int g[10];
    int f;
} DATA;

DATA *create() {
    int a;
    DATA *d = (DATA*) malloc(sizeof(DATA));
    init(d->g, 10, &a);
    d->f = a;
    d->g[1] = 2;
    return d;
}

Boogie program:

function f_DATA: int -> int;
forall u: int:: f_DATA(u) = u + 40;

procedure create() returns d:int {
    var @a: int;
    @a := malloc(4);
    d := call malloc(44);
    call init(g_DATA(d), 10, @a);
    Mem[f_DATA(d)] := Mem[@a];
    Mem[g_DATA(d) + 1*4] := 2;
    free(@a);
    return;
}

Slide24

Missing part: Types?

Types in C programs can’t be trusted

Lack of types hurts property checking

Slide25

Our Approach

[POPL’09]

Type checking → assertion checking
Provide formal semantics for C and its types

Use types to improve the property checker
Provide Java-style field disambiguation

Provide decision procedures for the assertion checking

Slide26

Formalizing Type Safety

A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type.

Mem : addr -> value
Type : addr -> type
HasType : value x type -> bool

for all a in addr, HasType(Mem(a), Type(a))

Slide27

Modeling the Heap

Mem : addr -> value
Gives value stored at each heap location
Values are integers

Type : addr -> type
Gives declared type for each heap location
Types include Int, Ptr(Int), …

Slide28

“Match” Predicate

Lifts the Type map to multi-word types
Match(a, t) holds iff Type[a … n] matches t

Match : addr x type -> bool

Type map (address: type)
99: Int
100: Int
101: Ptr(Int)
102: Int
103: Ptr(Foo)

C Type: int
HAVOC Axiom: Match(a, Int) <==> Type[a] == Int

C Type: int *
HAVOC Axiom: Match(a, Ptr(Int)) <==> Type[a] == Ptr(Int)

C Type: struct foo { int n; int m; int *p; }
HAVOC Axiom: Match(a, Foo) <==> Match(a, Int) && Match(a+1, Int) && Match(a+2, Ptr(Int))

Examples on the Type map above:
Match(99, Int)
Match(101, Ptr(Int))
Match(99, Foo)
¬Match(101, Foo)

Slide29

“HasType” Predicate

Defines which values belong to each type
HasType(v, t) holds iff v is a value of type t

HasType : value x type -> bool

C Type: int
HAVOC Axiom: HasType(v, Int) <==> true

C Type: t*
HAVOC Axiom: HasType(v, Ptr(t)) <==> v == 0 || (v > 0 && Match(v, t))

Type map (address: type)
99: Int
100: Int
101: Ptr(Int)
102: Int
103: Ptr(Foo)

Examples on the Type map above:
HasType(99, Ptr(Foo))
¬HasType(101, Ptr(Foo))

Slide30
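The Match and HasType axioms can be written down directly as C predicates over the Type map used in the examples (addresses 99 to 103). This is a sketch under simplifying assumptions: only the types Int, Ptr(Int), Ptr(Foo) and Foo are modeled, the enum `Ty` and the bounds check are additions for the illustration, and a real checker hands the axioms to the SMT solver instead of executing them.

```c
#include <assert.h>

typedef enum { T_NONE, T_INT, T_PTR_INT, T_PTR_FOO, T_FOO } Ty;

enum { HEAP_WORDS = 128 };

/* The Type map from the examples; every other address is T_NONE. */
static const Ty Type[HEAP_WORDS] = {
    [99] = T_INT, [100] = T_INT, [101] = T_PTR_INT,
    [102] = T_INT, [103] = T_PTR_FOO
};

/* Match(a, t): the Type map starting at address a matches the layout of t. */
int Match(int a, Ty t) {
    if (a < 0 || a >= HEAP_WORDS) return 0;
    if (t == T_FOO)   /* struct foo { int n; int m; int *p; } */
        return Match(a, T_INT) && Match(a + 1, T_INT) && Match(a + 2, T_PTR_INT);
    return Type[a] == t;                      /* word-sized types */
}

/* HasType(v, t): v is a value of type t. */
int HasType(int v, Ty t) {
    switch (t) {
    case T_INT:     return 1;                                   /* any integer */
    case T_PTR_INT: return v == 0 || (v > 0 && Match(v, T_INT));
    case T_PTR_FOO: return v == 0 || (v > 0 && Match(v, T_FOO));
    default:        return 0;                 /* not a word-sized type */
    }
}
```

The value 99 is a valid Ptr(Foo) because a Foo layout starts there, while 101 is not, which reproduces the four example facts listed with the axioms.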

Type Safety Invariant

for all a in addr, HasType(Mem(a), Type(a))

Part of preconditions, postconditions, loop invariants
Assert at every program point

Add similar assertions for locals (if desired)

Slide31

Decision Procedure

Verification conditions refer to Mem, Type, Match, HasType, Type-safety invariant

Decision problem: NP-complete

Provide decision procedure using an SMT solver
Suffices to instantiate the quantifiers in these axioms on a fixed set of terms

Slide32

Example: Type Checking

[Figure: two IRP structures, each with an embedded ListEntry (Flink, Blink); p points at a ListEntry, q points at the IRP that encloses it.]

q = CONTAINING_RECORD(p, IRP, ListEntry)
  = (IRP*)((char*)p - (char*)&((IRP*)0)->ListEntry)

Type Checker: Does variable q have type IRP*?

Slide33

Solution: Add Preconditions

#define ENCL(x) CONTAINING_RECORD(x, record, node)

requires(HasType(ENCL(p), record*) && ENCL(p) != NULL)
void init_record(list *p) {
    record *r = CONTAINING_RECORD(p, record, node);
    r->data2 = 42;
}

Slide34

Field Safety Invariant

Field safety
Refinement of type safety
Disambiguate two fields of same type

Change
HasType/Match are refined to distinguish different field names of same type

Slide35

Adding Field Names

struct list { list *prev; list *next; }
struct record { int data1; list node; int data2; }

Match(a, List) <==> Match(a, Ptr(List)) && Match(a+1, Ptr(List))
Match(a, Record) <==> Match(a, int) && Match(a+1, List) && Match(a+3, int)

Match(a, Ptr(List)) <==> Type[a] == Ptr(List)
HasType(v, Ptr(List)) <==> v == 0 || (v > 0 && Match(v, List))

Match(a, int) <==> Type[a] == int
HasType(v, int) <==> true

… same for Next and Data2
same definition as Int

Slide36

Adding Field Names

struct list { list *prev; list *next; }
struct record { int data1; list node; int data2; }

Match(a, List) <==> Match(a, Prev) && Match(a+1, Next)
Match(a, Record) <==> Match(a, Data1) && Match(a+1, List) && Match(a+3, Data2)

Match(a, Prev) <==> Type[a] == Prev
HasType(v, Prev) <==> v == 0 || (v > 0 && Match(v, List))

Match(a, Data1) <==> Type[a] == Data1
HasType(v, Data1) <==> true   (same definition as Int)

… same for Next and Data2

Slide37

Experiments

Implementation supports full C language
Supports polymorphism
Supports user-defined, dependent types

Annotated and checked four Windows drivers
Sample drivers provided with Windows DDK

Slide38

Enables field splitting

Can split the heap for “field-safe” programs
One heap map per word-type field and pointer type (almost!)

Mem_f : addr -> val
Mem_g : addr -> val
Mem_T* : addr -> val

Simple example
C code: x->f = 1;
Boogie code: Mem_f[x + Offset(f)] := 1;

Disambiguates writes to fields + faster checking

Slide39
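A sketch of what field splitting buys, assuming a word-addressed heap with integer pointers. The arrays `Mem_Data1` and `Mem_Data2`, the offsets and the function name are hypothetical; the split is only sound for programs that satisfy the field-safety invariant.

```c
#include <assert.h>

/* Hypothetical word offsets of two fields inside one record. */
enum { HEAP_WORDS = 16, OFF_DATA1 = 0, OFF_DATA2 = 3 };

static int Mem_Data1[HEAP_WORDS];   /* heap map for field Data1 only */
static int Mem_Data2[HEAP_WORDS];   /* heap map for field Data2 only */

/* Performs "q->Data2 = 42" on the split heap and returns 1 if
   "r->Data1" kept its value and the write itself is visible. */
int split_write_preserves_r_data1(int q, int r) {
    Mem_Data1[r + OFF_DATA1] = 7;             /* known starting value */
    Mem_Data2[q + OFF_DATA2] = 42;            /* q->Data2 = 42 */
    return Mem_Data1[r + OFF_DATA1] == 7 && Mem_Data2[q + OFF_DATA2] == 42;
}
```

Even when q + 3 equals r numerically (q = 5, r = 8), the write goes to a different map, so the prover never has to consider that the two fields overlap.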

Why almost?

struct A { int a; int b; };
struct B { int c; int d; int e; };

void P(struct B *x) {
    struct A *y = (struct A*) x;
    y->a = 1;
    assert(x->c == 1);
}

Field safety assertion will fail

Have to merge {a, c} {b, d}

Slide40

Summary

Types as an additional part of the state
Type safety checking → assertion checking
Efficiently decidable (NP) logic

Separation of concern for property checking
Can exploit field disambiguation for “field-safe” programs

Slide41

Overview

Motivation
Background
Exploiting types
Logic for lists
Case study

Slide42

Logic for lists

SMT theory with new predicate symbols

Slide43

Reachability predicate: Btwn_f

[Figure: doubly linked list of three cells, each with next, prev and data fields; x points at an earlier cell and y at a later one.]

Btwn_next(x, y)
Btwn_prev(y, x)

Slide44

Inverse of a function: f^-1

[Figure: the same list of cells with next, prev and data fields; the data fields of cells x and y both point to w.]

data^-1(w) = {x, y}

Slide45
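Membership in the two interpreted sets, Btwn_f and f^-1, can be sketched executably. The code assumes that Btwn_f(x, y) denotes the cells on the f-path from x up to and including y and is empty when y is not reachable from x; this reading is an assumption chosen to be consistent with the "\ {null}" that appears in the deck's list formulas, and the precise definition is in the POPL ’08 paper. Cells are array indices, index 0 stands for null, and the arrays `next` and `data` are made-up test data.

```c
#include <assert.h>

enum { NIL = 0, CELLS = 8 };

/* The list is 1 -> 2 -> 3 -> null; cell 4 points into it but is not on it. */
static const int next[CELLS] = { [1] = 2, [2] = 3, [3] = NIL, [4] = 3 };
static const int data[CELLS] = { [1] = 9, [2] = 5, [3] = 9, [4] = 5 };

/* Is z in Btwn_f(x, y)? Walks f from x and succeeds only if z is met no
   later than y. The step bound keeps the walk finite on cyclic structures. */
int in_btwn(const int *f, int z, int x, int y) {
    int seen_z = 0, cur = x;
    for (int steps = 0; steps <= CELLS; steps++) {
        if (cur == z) seen_z = 1;
        if (cur == y) return seen_z;
        cur = f[cur];
    }
    return 0;   /* y is not reachable from x: the set is empty */
}

/* Is x in f^-1(w)? That is, does f(x) = w hold? */
int in_inverse(const int *f, int x, int w) { return f[x] == w; }
```

Here data^-1(9) is {1, 3}, mirroring the shape of data^-1(w) = {x, y} in the figure.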

Expressive logic

Express properties of collections
∀x ∈ Btwn_f(f(hd), hd). state(x) = LOCKED   //cyclic

Arithmetic reasoning on data (e.g. sortedness)
∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y)

Type/object invariants
∀x ∈ Type^-1(“__logentry”). logtype(x) > 0 ⇒ file_name(x) != null

Slide46
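The sortedness formula is a pair of bounded quantifiers over list segments, so on a concrete acyclic list it unrolls into two nested loops. The sketch assumes cells are array indices with 0 as null and that the comparison in the formula is ≤; the arrays below are invented test data.

```c
#include <assert.h>

enum { NIL = 0, CELLS = 8 };

/* Checks: forall x in Btwn_next(hd, null) \ {null}.
           forall y in Btwn_next(x, null) \ {null}.  d(x) <= d(y)
   Assumes the list starting at hd is acyclic and null-terminated. */
int is_sorted(const int *next, const int *d, int hd) {
    for (int x = hd; x != NIL; x = next[x])
        for (int y = x; y != NIL; y = next[y])
            if (d[x] > d[y]) return 0;
    return 1;
}

/* The list 1 -> 2 -> 3 -> null with two different data assignments. */
static const int next_demo[CELLS]  = { [1] = 2, [2] = 3, [3] = NIL };
static const int sorted_d[CELLS]   = { [1] = 1, [2] = 4, [3] = 4 };
static const int unsorted_d[CELLS] = { [1] = 1, [2] = 5, [3] = 4 };
```

The verifier reasons about this property for every possible heap; the loops only evaluate it on one.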

Can express desired invariants

NT_STATUS Unload(…){
    ….
    iter = hd->First;
    while (iter != null) {
        RemoveEntryList(iter);
        iter = iter->Next;
        IoDeleteDevice(iter->Self);
    }
    ….
}

[Figure: hd->First points to a list of DEV_EXT nodes linked by Next; each DEV_EXT has a Self pointer to a DEV_OBJ, and each DEV_OBJ has a DevExt pointer back to its DEV_EXT.]

∀x ∈ Btwn_Next(hd->First, NULL). x->Self->DevExt = x

OR

∀x ∈ Btwn_Next(hd->First, NULL). Self^-1(x->Self) = {&x->Self}

Slide47

Precise and efficient

[POPL ‘08]

Precision
Given a Floyd-Hoare triple {P} S {Q}, where P and Q are in the assertion logic and S is a loop-free, call-free code fragment,
there is a formula in the assertion logic that is
linear in the size of the triple, and
valid iff the triple holds

Efficiency
The decision problem is NP-complete

Slide48

Ground Logic

t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t)
G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G

Logic

S ∈ Set ::= f^-1(t) | Btwn_f(t1, t2)
F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F

Slide49

Ground decision procedure

Provide a set of 10 rewrite rules for Btwn_f
Sound, complete and terminating

E.g. Transitivity3

t1 ∈ Btwn_f(t0, t2)    t ∈ Btwn_f(t0, t1)
------------------------------------------
t ∈ Btwn_f(t0, t2),  t1 ∈ Btwn_f(t, t2)

Slide50

Logic

t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t)
G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G
S ∈ Set ::= f^-1(t) | Btwn_f(t1, t2)
F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F

Bounded quantification over interpreted sets

Slide51

Sort restriction

Unsorted logic
The unsorted logic is undecidable

Sorted logic
Each term has a sort D, each function f has a sort D → E
There is a partial order on the sorts

Sort-restriction on ∀x ∈ S. F
sort(x) should be less than the sort(t[x]) for any term t[x] inside F

Slide52

Sort restriction

Sort-restriction on ∀x ∈ S. F
sort(x) should be less than the sort(t[x]) for any term t[x] inside F

Sorts are quite natural
Come from program types

Most interesting specifications can be expressed
See paper for exceptions

Slide53

Evaluation

Compare with an incomplete axiomatization of reachability predicate [TACAS’07]

Greatly improved the predictability of the verifier
Reduced runtimes (2X – 100X)
Eliminate need for carefully crafted axioms and invariants
Can handle newer examples

Slide54

Assertion logic of HAVOC

Theories
Ground logics (linear arithmetic, uninterpreted functions, arrays, …), lists

Can quantify over lists, arrays, types

Decision procedures (based on rewriting)
Implemented using quantifiers + triggers in Boogie/Z3

Slide55

Experience

Quite encouraging compared to previous efforts based on axiomatizing lists
On medium benchmarks (list insert, delete, sort, …)

Slide56

Overview

Motivation
Background
Exploiting types
Logic for lists
Case study

Slide57

Case study

Can we verify properties on reasonably large modules in the presence of lists?
With high automation

Challenge
Synthesizing the quantified invariants starting from a property is difficult for real code

Answer: sometimes
If we can identify how to split the burden between a user and annotation inference

Slide58

Intra module inference

CAV ’09

Setting
A module with a few public and a lot of private methods (e.g. device drivers)

High level idea
Let the user specify the module invariant on public APIs
An inference engine infers “exceptions” from the module invariant for private methods

Slide59

Module invariant broken

requires(TypeInvDO)
ensures(TypeInvDO)
void publicFoo() {
    PDEV_OBJ do = NewDEV_OBJ();
    privateBar(do);
}

requires(TypeInvDOExcept(do))
requires(TypeInvDO)
ensures(TypeInvDO)
void privateBar(PDEV_OBJ do) {
    do->DevExt->Self = do;
}

[Figure: a DEV_OBJ x whose DevExt field points to a DEV_EXT, whose Self field points back to x.]

TypeInvDO: ∀x ∈ MyDevObj. x->DevExt->Self = x

TypeInvDOExcept(y): ∀x ∈ MyDevObj. x = y || x->DevExt->Self = x

Slide60

Intra module inference

User specifies the module invariants
Module invariants are simple, as they are on the “steady state”
Usually 1 or 2 such invariants

Add “candidate annotations” for module invariant exceptions
Simple heuristics to infer the “exceptions”

Run Houdini [Flanagan&Leino’01] to perform monomial predicate abstraction

Slide61
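The Houdini step can be pictured as a simple fixpoint loop: start by assuming every candidate annotation, ask the verifier, drop any candidate it refutes, and repeat until nothing is refuted. The sketch below is a toy rendering of that idea only. The `Checker` callback, the bitmask encoding and `toy_check` are hypothetical stand-ins for the real verifier, and the loop terminates only if the checker always returns a candidate that is currently assumed.

```c
#include <assert.h>

/* Hypothetical verifier interface: given the candidates currently assumed
   (as a bitmask), return the index of one assumed candidate that cannot be
   proved, or -1 if every assumed candidate is proved. */
typedef int (*Checker)(unsigned assumed);

/* Houdini-style fixpoint: start with all candidates, drop refuted ones. */
unsigned houdini(unsigned candidates, Checker check) {
    unsigned assumed = candidates;
    for (;;) {
        int refuted = check(assumed);
        if (refuted < 0) return assumed;      /* remaining set is consistent */
        assumed &= ~(1u << refuted);          /* discard the refuted candidate */
    }
}

/* Toy checker: candidate 0 always holds, candidate 1 never holds, and
   candidate 2 is provable only while candidate 1 is still assumed. */
int toy_check(unsigned assumed) {
    if (assumed & (1u << 1)) return 1;
    if (assumed & (1u << 2)) return 2;
    return -1;
}
```

Starting from candidates {0, 1, 2}, the loop first drops 1, which in turn invalidates 2, and ends with {0}. Dropping one candidate can expose others that depended on it, which is why the process iterates to a fixpoint.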

Intra-module inference

Applied to 4 device drivers
Lack of double-free, lock API usage
~7KLOC, ~50 loops/procedures

Module invariants over lists and arrays

Verified the properties with ~10 manual annotations, ~1000 inferred annotations

Slide62

Conclusion

Difficulty for reasoning about C programs
Lack of type safety
Low level lists

Provided new efficient logics on top of SMT solvers

Able to apply them to verify programs several thousand lines large

Slide63

Download

New version to be released shortly
Current version at http://research.microsoft.com/en-us/projects/havoc/

Try out VCC
A functional verifier for concurrent C
http://vcc.codeplex.com/

Visit Rise4Fun webpage for other tools
http://rise4fun.com

Slide64

HAVOC references

C memory model
Basic memory model [TACAS ‘07]
Encoding types [POPL ’09]

Decision procedures for lists [POPL ‘06, ‘08]

Annotation inference
Complexity [CADE ‘09]
Intra-module inference [CAV ‘09]
Transparent inference [VMCAI ‘11]

Local reasoning
Call invariants [NFM ‘11]
Linear maps [PLPV ‘11]

Property checking in the large [VSTTE ‘10]

Slide65

Related work

Logics for data structures
Lists [Nelson POPL‘83]
PALE [Moller et al. PLDI’01]
….
Strand [Madhusudan et al. POPL’11]
Trees [Wies et al. CADE’11]

C memory model + types
VCC [Cohen et al. SSV‘09]

Separation logic
[O’Hearn, Reynolds, Yang CSL‘01]

Slide66

Questions