Slide1
Spring 2013
Program Analysis and Verification
Lecture 1: Introduction
Roman Manevich
Ben-Gurion University
Slide2
December 31, 2008
30GB Zunes all over the world fail en masse
Slide3
Zune bug
    while (days > 365) {
        if (IsLeapYear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
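Slide4
Zune bug
    while (366 > 365) {
        if (IsLeapYear(2008)) {
            if (366 > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
On December 31, 2008 (day 366 of a leap year) the test 366 > 366 is false, so neither days nor year changes and the loop never terminates.
Suggested solution: wait for tomorrow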
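One possible repair of the loop, as a minimal sketch in C. It assumes that `days` is a 1-based day count relative to the start of `year`, so the loop should stop as soon as `days` fits inside the current year. The names `is_leap_year` and `year_from_days` are hypothetical, and this is not the fix that was actually shipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Gregorian leap-year rule (hypothetical stand-in for IsLeapYear). */
bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Normalizes (year, *days) so that *days does not exceed the length of
 * the returned year. Every iteration either returns or subtracts a full
 * year from *days, so the loop always terminates. */
int year_from_days(int year, int *days) {
    assert(days);
    for (;;) {
        int year_length = is_leap_year(year) ? 366 : 365;
        if (*days <= year_length) {
            return year;
        }
        *days -= year_length;
        year += 1;
    }
}
```

With `days = 366` and `year = 2008` the function returns immediately with the date unchanged, which is the case that hangs in the original loop.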
Slide5
February 25, 1991
Patriot missile failure
On the night of the 25th of February, 1991, a Patriot missile system operating in Dhahran, Saudi Arabia, failed to track and intercept an incoming Scud. The Iraqi missile struck an army barracks, killing 28 U.S. soldiers and injuring another 98.
Slide6
Patriot bug: rounding error
Time measured in 1/10 seconds
Binary expansion of 1/10: 0.0001100110011001100110011001100....
24-bit register: 0.00011001100110011001100
  error of 0.0000000000000000000000011001100... binary, or ~0.000000095 decimal
After 100 hours of operation the error is 0.000000095×100×3600×10 = 0.34 seconds
A Scud travels at about 1,676 meters per second, and so travels more than half a kilometer in this time
Suggested solution: reboot every 10 hours
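The arithmetic on the Patriot slide can be reproduced with a short sketch in C. It assumes the stored value of 1/10 keeps 23 binary digits after the point (the number of digits shown on the slide) and that the remaining digits are chopped rather than rounded. The function names are hypothetical.

```c
#include <assert.h>

/* Value of 1/10 when only `fraction_bits` binary digits after the point
 * are kept and the rest are chopped off (not rounded). */
double chopped_tenth(int fraction_bits) {
    assert(fraction_bits > 0 && fraction_bits < 53);
    double scale = (double)(1LL << fraction_bits);
    return (double)(long long)(0.1 * scale) / scale;
}

/* Accumulated clock error in seconds after `hours` of uptime, with the
 * clock counting tenths of a second. */
double clock_drift_seconds(double hours, int fraction_bits) {
    double error_per_tick = 0.1 - chopped_tenth(fraction_bits);
    double ticks = hours * 3600.0 * 10.0;
    return error_per_tick * ticks;
}

/* Distance in meters covered at `speed_mps` during the drift. */
double tracking_error_meters(double drift_seconds, double speed_mps) {
    return drift_seconds * speed_mps;
}
```

Calling `clock_drift_seconds(100.0, 23)` gives roughly 0.34 seconds, and `tracking_error_meters` with a speed of 1676.0 turns that into more than 500 meters, in line with the figures on the slide.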
Slide7
August 13, 2003
"Billy Gates why do you make this possible ? Stop making money and fix your software!!"
(W32.Blaster.Worm)
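Slide8
Windows exploit(s)
Buffer Overflow
    #include <string.h>

    void foo(char *x) {
        char buf[2];
        strcpy(buf, x);   /* no bounds check: any input longer than 1 character overflows buf */
    }

    int main(int argc, char *argv[]) {
        foo(argv[1]);
    }

    ./a.out abracadabra
    Segmentation fault
[Figure: stack frame of foo, listing Previous frame, Return address, Saved FP, char* x, buf[2], with arrows labeled "Stack grows this way" and "Memory addresses"; the input is written two characters at a time (ab, ra, ca, da, br, …) starting at buf[2] and running past its end into the neighboring slots]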
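For contrast with the unchecked `strcpy`, here is a minimal sketch of a copy that refuses input that does not fit. The helper name `copy_checked` and its return convention are assumptions made for illustration; they are not part of the lecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Copies src into dst only if it fits, terminator included.
 * Returns false and leaves dst as an empty string when src is too long. */
bool copy_checked(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t needed = strlen(src) + 1;   /* +1 for the terminating '\0' */
    if (needed > dst_size) {
        dst[0] = '\0';
        return false;
    }
    memcpy(dst, src, needed);
    return true;
}
```

With a two-byte buffer, `copy_checked(buf, sizeof buf, "abracadabra")` returns false instead of writing past the end of `buf`.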
Slide9
Buffer overrun exploits
    #include <stdio.h>
    #include <string.h>

    int check_authentication(char *password) {
        int auth_flag = 0;
        char password_buffer[16];

        strcpy(password_buffer, password);   /* no bounds check: long input overflows the buffer */
        if (strcmp(password_buffer, "brillig") == 0)
            auth_flag = 1;
        if (strcmp(password_buffer, "outgrabe") == 0)
            auth_flag = 1;
        return auth_flag;
    }

    int main(int argc, char *argv[]) {
        if (check_authentication(argv[1])) {
            printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
            printf(" Access Granted.\n");
            printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        } else
            printf("\nAccess Denied.\n");
    }
(source: "Hacking: The Art of Exploitation", 2nd Ed.)
Slide10
(In)correct usage of APIs
Application trend: increasing number of libraries and APIs
  Non-trivial restrictions on permitted sequences of operations
Typestate: temporal safety properties
  What sequences of operations are permitted on an object?
  Encoded as a DFA
e.g. "Don't use a Socket unless it is connected"
[Figure: DFA with states init, connected, closed, err; edges labeled connect(), close(), getInputStream()/getOutputStream(), and * (any operation)]
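The typestate property "don't use a Socket unless it is connected" can be sketched as a transition function in C. The state and operation names come from the slide; the exact edge set is one plausible reading of the diagram, and treating every unlisted (state, operation) pair as an error is an assumption. All identifiers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ST_INIT, ST_CONNECTED, ST_CLOSED, ST_ERR } SocketState;
typedef enum { OP_CONNECT, OP_CLOSE, OP_GET_INPUT_STREAM, OP_GET_OUTPUT_STREAM } SocketOp;

/* One step of the automaton. Pairs not listed explicitly lead to the
 * absorbing error state (assumption, see lead-in). */
SocketState socket_step(SocketState state, SocketOp op) {
    switch (state) {
    case ST_INIT:
        if (op == OP_CONNECT) return ST_CONNECTED;
        if (op == OP_CLOSE) return ST_CLOSED;
        return ST_ERR;
    case ST_CONNECTED:
        if (op == OP_GET_INPUT_STREAM || op == OP_GET_OUTPUT_STREAM) return ST_CONNECTED;
        if (op == OP_CLOSE) return ST_CLOSED;
        return ST_ERR;
    default:
        return ST_ERR;   /* closed and err accept nothing further */
    }
}

/* Runs a sequence of operations starting from init and returns the final state. */
SocketState socket_run(const SocketOp *ops, size_t count) {
    assert(ops != NULL || count == 0);
    SocketState state = ST_INIT;
    for (size_t k = 0; k < count; k++) {
        state = socket_step(state, ops[k]);
    }
    return state;
}
```

A typestate checker asks whether any execution of the client program can drive some Socket object into `ST_ERR`; the automaton itself is the easy part, tracking which object each call refers to is where the difficulty lies.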
Slide11
Challenges
    class SocketHolder { Socket s; }

    Socket makeSocket() {
        return new Socket(); // A
    }
    open(Socket l) {
        l.connect();
    }
    talk(Socket s) {
        s.getOutputStream().write("hello");
    }

    main() {
        Set<SocketHolder> set = new HashSet<SocketHolder>();
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            set.add(h);
        }
        for (Iterator<SocketHolder> it = set.iterator(); …) {
            Socket g = it.next().s;
            open(g);
            talk(g);
        }
    }
Slide12
Testing is not enough
Observe some program behaviors
What can you say about other behaviors?
Concurrency makes things worse
Smart testing is useful
  requires the techniques that we will see in the course
Slide13
Static analysis definition
Reason statically (at compile time) about the possible runtime behaviors of a program
"The algorithmic discovery of properties of a program by inspection of its source text [1]" -- Manna, Pnueli
[1] Does not have to literally be the source text, just means without running it
Slide14
Is it at all doable?
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
Bad news: problem is generally undecidable
Slide15
Central idea: use approximation
[Figure: inside the universe, three nested sets: Under Approximation, inside the exact set of configurations/behaviors, inside Over Approximation]
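Slide16
Goal: exploring program states
[Figure: state space with regions labeled initial states, reachable states, and bad states]
Slide17
Technique: explore abstract states
[Figure: the same state space (initial states, reachable states, bad states), covered step by step by abstract states]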
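Slide21
Sound: cover all reachable states
[Figure: state space with initial states, reachable states, and bad states; the abstraction covers every reachable state]
Slide22
Unsound: miss some reachable states
[Figure: state space with initial states, reachable states, and bad states; the abstraction leaves some reachable states uncovered]
Slide23
Imprecise abstraction
[Figure: state space with initial states, reachable states, and bad states; the abstraction overlaps the bad states, producing false alarms]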
Slide24
A sound message
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
Assertion may be violated
Slide25
Precision
Avoid useless result
    UselessAnalysis(Program p) {
        printf("assertion may be violated\n");
    }
Low false alarm rate
Understand where precision is lost
Slide26
Runtime vs. static analysis
                 Runtime                                 Static analysis
Effectiveness    Can miss errors                         Can find rare errors
                 Finds real errors                       Can raise false alarms
Cost             Proportional to program's execution     Proportional to program's complexity
No need to efficiently handle rare cases
Can handle limited classes of programs and still be useful
Slide27
Static Driver Verifier
[Figure: Static Driver Verifier takes the driver's source code in C, rules given as precise API usage rules (SLIC), and an environment model, and reports defects with 100% path coverage]
Slide28
Bill Gates' Quote
"Things like even software verification, this has been the Holy Grail of computer science for many decades but now in some very key areas, for example, driver verification we're building tools that can do actual proof about the software and how it works in order to guarantee the reliability."
Bill Gates, April 18, 2002. Keynote address at WinHec 2002
Slide29
The Astrée Static Analyzer
Patrick Cousot
Radhia Cousot
Jérôme Feret
Laurent Mauborgne
Antoine Miné
Xavier Rival
ENS France
Slide30
Objectives of Astrée
Prove absence of errors in safety critical C code
ASTRÉE was able to prove completely automatically the absence of any RTE (run-time error) in the primary flight control software of the Airbus A340 fly-by-wire system
  a program of 132,000 lines of C analyzed
[Image credit: By Lasse Fuss (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons]
Slide32
A little about me
History
  Studied B.Sc., M.Sc., Ph.D. at Tel-Aviv University
  Research in program analysis with IBM and Microsoft
  Post-doc at UCLA and at UT Austin
  Joined Ben-Gurion University this year
Example research challenges
  What's a good algorithm for automatically discovering (with no hints) that a program generates a binary tree where all leaves are connected in a list?
  What's a good algorithm for automatically proving that a parallel program behaves "well"?
  How can we automatically synthesize parallel code that is both correct and efficient?
Slide33
Why study program analysis?
Challenging and thought provoking
  An approach for dealing with computationally hard (usually undecidable) problems
  Treat programs as mathematical objects
Understand how to systematically
  Design optimizations
  Reason about correctness / find bugs (security)
Some techniques may be applied in other domains
  Computational learning
  Analysis of biological systems
Slide34
What do you get in this course?
Learn basic principles of static analysis
  Understand jargon/papers
Learn a few advanced techniques
  Some principled way of developing analysis
  Develop one in a small-scale project
Put to practice what you learned in logic, automata, programming
Slide35
My role
Teach you theory and practice
Teach you how to think of new techniques
E-mail: romanm@cs.bgu.ac.il
Office hours: Wednesday 13:00-15:00
Course web-page
  Announcements
  Forum
  …
Slide36
Requirements
Summarize one lecture: 10% of grade
  Submit initial summary
  Get corrections/suggestions
  Submit revised summary
Theoretical assignments and programming assignments: 50%
  About 8 (some very small)
  Must submit all
  Must solve all questions
    Otherwise re-submit (and get a lower grade)
Final project: 40%
  Implement a program analyzer for a given component
Slide37
How to succeed in this course
Attend all classes
Make sure you understand material in class
  Engage by asking questions and raising ideas
Be on top of assignments
  Submit on time
  Don't get stuck or give up on exercises: get help, ask me
  Don't start working on assignments the day before
Be ethical
Joe (a day before assignment deadline): "I don't really understand what you want from me in this assignment, can you help me/extend the deadline?"
Slide38
The static analysis approach
Formalize software behavior in a mathematical model (semantics)
Prove properties of the mathematical model
  Automatically, typically with approximation of the formal semantics
Develop theory and tools for program correctness and robustness
Slide39
Kinds of static analysis
Spans a wide range: type checking … up to full functional verification
General safety specifications
Security properties (e.g., information flow)
Concurrency correctness conditions (e.g., absence of data races, absence of deadlocks, atomicity)
Correct usage of libraries (e.g., typestate)
Underapproximations useful for bug-finding, test-case generation, …
Slide40
Static analysis techniques
Abstract Interpretation
Dataflow analysis
Constraint-based analysis
Type and effect systems
Slide41
Static analysis for verification
[Figure: a program and a specification are fed to the Analyzer, which outputs either Valid or an abstract counter example]
Slide42
Relation to program verification
Static Analysis
  Fully automatic
  Applicable to a programming language
  Can be very imprecise
  May yield false alarms
Program Verification
  Requires specification and loop invariants
  Program specific
  Relatively complete
  Provides counter examples
  Provides useful documentation
  Can be mechanized using theorem provers
Slide43
Verification challenge
    main(int i) {
        int x=3, y=1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
Determine what states can arise during any execution
Challenge: set of states is unbounded
Slide44
Abstract Interpretation
    main(int i) {
        int x=3, y=1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
Determine what states can arise during any execution
Challenge: set of states is unbounded
Solution: compute a bounded representation of (a superset of) program states
Recipe
  Abstraction
  Transformers
  Exploration
Slide45
1) Abstraction
    main(int i) {
        int x=3, y=1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
concrete state σ : Var → Z
abstract state (sign) σ# : Var → {+, 0, -, ?}
[Figure: the concrete states (x=3, y=1, i=7), (x=3, y=2, i=6), … are all represented by the abstract state (x=+, y=+, i=+)]
Slide46
2) Transformers
    main(int i) {
        int x=3, y=1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
concrete transformer: (x=3, y=1, i=0) --[y = y + 1]--> (x=3, y=2, i=0)
abstract transformer: (x=+, y=+, i=0) --[y = y + 1]--> (x=+, y=+, i=0)
Other cases of the abstract transformer for y = y + 1:
  (+, -, 0) → (+, ?, 0)
  (+, 0, 0) → (+, +, 0)
  (+, ?, 0) → (+, ?, 0)
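The abstract transformer for `y = y + 1` in the sign domain can be sketched in C as follows. The encoding of the four abstract values as an enum and all function names are assumptions made for illustration; `check_against_slide` replays the four cases listed for this transformer.

```c
#include <assert.h>

/* Sign abstraction of an integer: {+, 0, -, ?}, where ? means unknown. */
typedef enum { SIGN_POS, SIGN_ZERO, SIGN_NEG, SIGN_TOP } Sign;

/* Abstraction of a single concrete value. */
Sign sign_of(int v) {
    return v > 0 ? SIGN_POS : (v < 0 ? SIGN_NEG : SIGN_ZERO);
}

/* Abstract addition: the most precise sign covering every concrete sum. */
Sign sign_add(Sign a, Sign b) {
    if (a == SIGN_ZERO) return b;
    if (b == SIGN_ZERO) return a;
    if (a == b && a != SIGN_TOP) return a;   /* (+)+(+) = +, (-)+(-) = - */
    return SIGN_TOP;                         /* mixed signs or unknown operand */
}

/* Abstract transformer for the statement y = y + c, with c a constant. */
Sign transform_add_const(Sign y, int c) {
    return sign_add(y, sign_of(c));
}

/* Replays the cases for y = y + 1. */
void check_against_slide(void) {
    assert(transform_add_const(SIGN_POS, 1) == SIGN_POS);
    assert(transform_add_const(SIGN_NEG, 1) == SIGN_TOP);
    assert(transform_add_const(SIGN_ZERO, 1) == SIGN_POS);
    assert(transform_add_const(SIGN_TOP, 1) == SIGN_TOP);
}
```

The negative case shows where precision is lost: a negative `y` plus 1 may be negative or zero, and the domain has no element for "not positive", so the answer is `?`.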
Slide47
3) Exploration
    main(int i) {
        int x=3, y=1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
[Figure: abstract states over (x, y, i) annotated on the program: (?, ?, ?) at entry and (+, +, ?) at the remaining program points]
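Exploration amounts to iterating the abstract transformers until the abstract state at the loop head stops changing. The sketch below does this for the example program in the sign domain. It is a hand-written analysis of this one loop, not a general analyzer; the struct layout and all names are hypothetical, and the sign of `i` after `--i` is conservatively taken to be unknown.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SIGN_POS, SIGN_ZERO, SIGN_NEG, SIGN_TOP } Sign;
typedef struct { Sign x, y, i; } AbsState;

/* Join: equal signs stay, different signs become unknown. */
Sign sign_join(Sign a, Sign b) { return a == b ? a : SIGN_TOP; }

AbsState state_join(AbsState a, AbsState b) {
    AbsState r = { sign_join(a.x, b.x), sign_join(a.y, b.y), sign_join(a.i, b.i) };
    return r;
}

bool state_equal(AbsState a, AbsState b) {
    return a.x == b.x && a.y == b.y && a.i == b.i;
}

/* Abstract effect of one loop iteration: y = y + 1; --i. */
AbsState loop_body(AbsState s) {
    s.y = (s.y == SIGN_POS || s.y == SIGN_ZERO) ? SIGN_POS : SIGN_TOP;
    s.i = SIGN_TOP;
    return s;
}

/* Returns the abstract state that reaches the assertion. */
AbsState analyze(void) {
    AbsState entry = { SIGN_POS, SIGN_POS, SIGN_TOP };   /* x=3, y=1, i unknown */
    AbsState head = entry;
    int rounds = 0;
    for (;;) {
        AbsState next = state_join(entry, loop_body(head));
        if (state_equal(next, head)) break;               /* fixed point reached */
        head = next;
        rounds++;
        assert(rounds <= 3);   /* each of the 3 components can change at most once */
    }
    return loop_body(head);
}

/* 0 < x + y certainly holds when both x and y are positive. */
bool assertion_proved(AbsState s) {
    return s.x == SIGN_POS && s.y == SIGN_POS;
}
```

For this program the state at the loop head is already stable on the first check, and the state reaching the assertion is (+, +, ?), which is enough to prove `0 < x + y`.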
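Slide48
Incompleteness
    main(int i) {
        int x=3, y=1;
        do {
            y = y - 2;
            y = y + 3;
        } while (--i > 0);
        assert(0 < x + y);
    }
[Figure: abstract states over (x, y, i) annotated on the program: (?, ?, ?) at entry, (+, +, ?) after the initialization, and (+, ?, ?) inside and after the loop, so the assertion cannot be proved]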
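The loss of precision can be reproduced with a small sketch in C that applies the sign-domain transformer statement by statement. The function names are hypothetical and `add_const` only handles nonzero constants, which is all this example needs.

```c
#include <assert.h>

typedef enum { SIGN_POS, SIGN_ZERO, SIGN_NEG, SIGN_TOP } Sign;

/* Abstract effect of y = y + c for a nonzero constant c. */
Sign add_const(Sign y, int c) {
    assert(c != 0);
    Sign sc = c > 0 ? SIGN_POS : SIGN_NEG;
    if (y == SIGN_ZERO) return sc;
    return y == sc ? y : SIGN_TOP;
}

/* Loop body as written: y = y - 2; y = y + 3. */
Sign body_two_steps(Sign y) {
    return add_const(add_const(y, -2), 3);
}

/* The same net effect written as a single statement: y = y + 1. */
Sign body_one_step(Sign y) {
    return add_const(y, 1);
}
```

Starting from a positive `y`, `body_one_step` keeps `+` while `body_two_steps` yields `?`: after `y = y - 2` the sign is unknown, and adding 3 to an unknown value cannot recover it. Both bodies compute the same concrete result, so the assertion still holds in every execution, but the sign analysis can no longer prove it.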
Slide49
Parity abstraction
    while (x != 1) do {
        if (x % 2 == 0) {
            x := x / 2;
        } else {
            x := x * 3 + 1;
            assert (x % 2 == 0);
        }
    }
challenge: how to find "the right" abstraction
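A sketch in C of why a parity domain fits this assertion. The three-valued encoding (even, odd, unknown) and all names are assumptions made for illustration.

```c
#include <assert.h>

typedef enum { PAR_EVEN, PAR_ODD, PAR_TOP } Parity;

/* Abstraction of a single concrete value. */
Parity parity_of(long v) {
    return v % 2 == 0 ? PAR_EVEN : PAR_ODD;
}

/* Abstract multiplication: an even factor makes the product even. */
Parity parity_mul(Parity a, Parity b) {
    if (a == PAR_EVEN || b == PAR_EVEN) return PAR_EVEN;
    if (a == PAR_ODD && b == PAR_ODD) return PAR_ODD;
    return PAR_TOP;
}

/* Abstract addition: equal parities give even, different ones give odd. */
Parity parity_add(Parity a, Parity b) {
    if (a == PAR_TOP || b == PAR_TOP) return PAR_TOP;
    return a == b ? PAR_EVEN : PAR_ODD;
}

/* Abstract transformer for x := x * 3 + 1. */
Parity transform_3x_plus_1(Parity x) {
    return parity_add(parity_mul(x, parity_of(3)), parity_of(1));
}

/* In the else branch x is odd, so the assertion x % 2 == 0 is proved. */
void check_else_branch(void) {
    assert(transform_3x_plus_1(PAR_ODD) == PAR_EVEN);
}
```

The sign domain says nothing useful here, since `x` stays positive on both branches, while the parity domain proves the assertion in one step.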
Slide50
How to find "the right" abstraction?
Pick an abstract domain suited for your property
  Numerical domains
  Domains for reasoning about the heap
  …
Combination of abstract domains
Another approach
  Abstraction refinement
Slide51
Following the recipe (in a nutshell)
1) Abstraction
[Figure: a concrete state (a linked list with pointers x and t and n-fields) next to its abstract state]
2) Transformers
[Figure: effect of the statement t->n = x on an abstract state]
Slide52
Example: shape (heap) analysis
    void stack-init(int i) {
        Node* x = null;
        do {
            Node* t = malloc(…);
            t->n = x;
            x = t;
        } while (--i > 0);
        Top = x;
    }
    assert(acyclic(Top))
[Figure: abstract heap states (shape graphs over x, t, Top and n-fields, starting from emp) annotated along the program]
Slide53
3) Exploration
    void stack-init(int i) {
        Node* x = null;
        do {
            Node* t = malloc(…);
            t->n = x;
            x = t;
        } while (--i > 0);
        Top = x;
    }
    assert(acyclic(Top))
[Figure: the abstract heap states (shape graphs over x, t, Top and n-fields, starting from emp) collected at each program point during exploration]
Slide54
Example: polyhedra (numerical) domain
    proc MC(n:int) returns (r:int)
    var t1:int, t2:int;
    begin
      if (n>100) then
        r = n-10;
      else
        t1 = n + 11;
        t2 = MC(t1);
        r = MC(t2);
      endif;
    end

    var a:int, b:int;
    begin
      b = MC(a);
    end
What is the result of this program?
Slide55
McCarthy 91 function
    proc MC (n : int) returns (r : int)
    var t1 : int, t2 : int;
    begin
      /* (L6 C5) top */
      if n > 100 then
        /* (L7 C17) [|n-101>=0|] */
        r = n - 10; /* (L8 C14) [|-n+r+10=0; n-101>=0|] */
      else
        /* (L9 C6) [|-n+100>=0|] */
        t1 = n + 11; /* (L10 C17) [|-n+t1-11=0; -n+100>=0|] */
        t2 = MC(t1); /* (L11 C17) [|-n+t1-11=0; -n+100>=0; -n+t2-1>=0; t2-91>=0|] */
        r = MC(t2); /* (L12 C16) [|-n+t1-11=0; -n+100>=0; -n+t2-1>=0; t2-91>=0; r-t2+10>=0; r-91>=0|] */
      endif; /* (L13 C8) [|-n+r+10>=0; r-91>=0|] */
    end

    var a : int, b : int;
    begin
      /* (L18 C5) top */
      b = MC(a); /* (L19 C12) [|-a+b+10>=0; b-91>=0|] */
    end
if (n>=101) then n-10 else 91
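The closed form and the inferred invariant can be spot-checked by running the function. The sketch below is a direct C transcription of MC; `mc_closed_form` and `check_range` are hypothetical helper names. Testing a finite range is of course much weaker than the static result, which covers all inputs.

```c
#include <assert.h>

/* Direct transcription of the procedure MC. */
int mc(int n) {
    if (n > 100) return n - 10;
    return mc(mc(n + 11));
}

/* Closed form: if (n >= 101) then n - 10 else 91. */
int mc_closed_form(int n) {
    return n >= 101 ? n - 10 : 91;
}

/* Checks, for every n in [lo, hi], that mc agrees with the closed form and
 * that the result r satisfies the inferred invariant
 *   -n + r + 10 >= 0  and  r - 91 >= 0. */
void check_range(int lo, int hi) {
    for (int n = lo; n <= hi; n++) {
        int r = mc(n);
        assert(r == mc_closed_form(n));
        assert(-n + r + 10 >= 0 && r - 91 >= 0);
    }
}
```

Note that the polyhedral invariant is weaker than the closed form: it bounds `r` from below by 91 and by `n - 10`, but does not say that `r` is exactly 91 for small `n`.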
Slide56
Some things that should trouble you
Does a result always exist?
Does the recipe always converge?
How "optimal" is the result?
How do I pick my abstraction?
How do I come up with abstract transformers?
Other practical issues
  Efficiency
  How does it do in practice?
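Slide57
Abstraction refinement
Change the abstraction to match the program
[Figure: a program, a specification, and an abstraction are fed to Verify, which outputs Valid, a counter example, or an abstract counter example; the abstract counter example goes to Abstraction Refinement, which produces a new abstraction]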
Slide58
Recap: program analysis
Reason statically (at compile time) about the possible runtime behaviors of a program
use sound overapproximation of program behavior
abstract interpretation
  abstract domain
  transformers
  exploration (fixed-point computation)
finding the right abstraction?
Slide59
Next lecture: semantics of programming languages
Slide60
References
Patriot bug:
  http://www.cs.usyd.edu.au/~alum/patriot_bug.html
  Patrick Cousot's NYU lecture notes
Zune bug: http://www.crunchgear.com/2008/12/31/zune-bug-explained-in-detail/
Blaster worm: http://www.sans.org/security-resources/malwarefaq/w32_blasterworm.php
Interesting CACM article: http://cacm.acm.org/magazines/2010/2/69354-a-few-billion-lines-of-code-later/fulltext
Interesting blog post: http://www.altdevblogaday.com/2011/12/24/static-code-analysis/