Slide1
Intermission
Slide2
Binary parsing
The Deconstruction of Dyninst
[Figure: a binary containing the functions lock_foo, main, and foo]
dynamic instrumentation, debugger, static binary analysis tools, malware analysis, binary editor/rewriter, …
Slide3
Familiar territory
Benjamin Schwarz, Saumya Debray, and Gregory R. Andrews. Disassembly of executable code revisited. 2002.
Cristina Cifuentes and K. John Gough. Decompilation of binary programs. 1995.
Richard L. Sites, Anton Chernoff, Matthew B. Kirk, Maurice P. Marks, and Scott G. Robinson. Binary translation. 1993.
Henrik Theiling. Extracting safe and precise control flow from binaries. 2000.
Ramkumar Chinchani and Eric van den Berg. A fast static analysis approach to detect exploit code inside network flows. 2005.
J. Troger and C. Cifuentes. Analysis of virtual method invocation for binary translation. 2002.
Laune C. Harris and Barton P. Miller. Practical analysis of stripped binary code. 2005.
Christopher Kruegel, William Robertson, Fredrik Valeur, and Giovanni Vigna. Static disassembly of obfuscated binaries. 2004.
Nathan Rosenblum, Xiaojin Zhu, Barton P. Miller, and Karen Hunt. Learning to analyze binary computer code. 2008.
Amitabh Srivastava and Alan Eustace. ATOM: a system for building customized program analysis tools. 1994.
Barton Miller, Jeffrey Hollingsworth, and Mark Callaghan. Dynamic Program Instrumentation for Scalable Performance Tools. 1994.
Slide4
We’ve been down this road…
recursive traversal parsing
“gap” parsing heuristics
probabilistic code models
non-contiguous functions
code sharing
non-returning functions
preamble scanning
handles stripped binaries
learn to recognize function entry points
very accurate gap parsing
the DYNINST binary parser
Slide5
What makes a parsing component?
[Figure: a stream of raw binary code bits feeding the Parsing API]
Parsing API
1. Binary code source abstraction
2. Simple, intuitive representation: functions, blocks, edges
3. Platform independence supported by previous Dyninst components: InstructionAPI, SymtabAPI
Slide6
Flexible code sources
a binary code object
Parser code source requirements:
code location (code, data)
access to code bytes: unsigned char * buf (41 56 49 89 fe 41 55 …)
function hints & names (main, foo, bar, baz)
a few (optional) facts: pointer width, external linkage, PLT
Slide7
Code source contract
bool isValidAddress
bool isExecutableAddress
void * getPtrToInstruction
void * getPtrToData
unsigned getAddressWidth
bool isCode
bool isData
Address codeOffset
Address codeLength
Nine mandatory methods
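As a rough illustration of how small such a contract is, the sketch below declares the nine methods as a C++ abstract class and implements them over a single in-memory buffer. Only the method names come from the slide; the parameter lists, the `Address` typedef, and the class names `CodeSourceSketch` and `BufferCodeSource` are assumptions made for illustration, not the actual ParseAPI declarations.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical names throughout: only the nine method names come from the
// slide. Signatures, the Address typedef, and both classes are assumptions.
using Address = std::uint64_t;

class CodeSourceSketch {
public:
    virtual ~CodeSourceSketch() = default;
    virtual bool isValidAddress(Address a) const = 0;
    virtual bool isExecutableAddress(Address a) const = 0;
    virtual void *getPtrToInstruction(Address a) const = 0;
    virtual void *getPtrToData(Address a) const = 0;
    virtual unsigned getAddressWidth() const = 0;
    virtual bool isCode(Address a) const = 0;
    virtual bool isData(Address a) const = 0;
    virtual Address codeOffset() const = 0;
    virtual Address codeLength() const = 0;
};

// A code source backed by one in-memory buffer that is treated as all code.
class BufferCodeSource : public CodeSourceSketch {
public:
    BufferCodeSource(std::vector<unsigned char> bytes, Address base)
        : bytes_(std::move(bytes)), base_(base) {}

    bool isValidAddress(Address a) const override {
        return a >= base_ && a - base_ < bytes_.size();
    }
    bool isExecutableAddress(Address a) const override { return isValidAddress(a); }
    void *getPtrToInstruction(Address a) const override {
        if (!isValidAddress(a)) return nullptr;
        return const_cast<unsigned char *>(bytes_.data() + (a - base_));
    }
    void *getPtrToData(Address) const override { return nullptr; }  // no data region here
    unsigned getAddressWidth() const override { return 8; }          // bytes; 64-bit assumed
    bool isCode(Address a) const override { return isValidAddress(a); }
    bool isData(Address) const override { return false; }
    Address codeOffset() const override { return base_; }
    Address codeLength() const override { return bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;
    Address base_;
};
```

A parser written against `CodeSourceSketch` would only call these methods, which is the sense in which anything that can hand out a pointer to its bytes can act as a source.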
SymtabAPI implementation in 232 lines (including optional hints, function names)
Any binary code object that can be memory mapped can be parsed
Slide8
Simple control flow interface
Functions: start addr., extents
Functions contain Blocks
Blocks: start addr., end addr., in edges, out edges
Blocks are joined by Edges
Edges: src, targ, type
Slide9
Views of control flow
Walking a control flow graph
[Figure: a control flow graph, with one block marked "starting here"]
while(!work.empty()) {
    Block *b = work.pop();
    /* do something with b */
    edgeiter eit = b->out().begin();
    while(eit != b->out().end()) {
        work.push(*eit++);
    }
}
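A runnable version of this worklist walk might look like the following sketch. `Block` and `Edge` here are hypothetical stand-ins, not the parser's classes, and the sketch fills in two details the slide code leaves implicit: it pushes each edge's target block, and it tracks visited blocks so that loops in the graph terminate.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stack>
#include <vector>

// Hypothetical stand-ins for the parser's CFG objects.
struct Block;
struct Edge {
    Block *src;
    Block *targ;
};
struct Block {
    unsigned long start;      // start address
    std::vector<Edge *> out;  // outgoing edges
};

// Visit every block reachable from `entry`; returns the number of blocks visited.
std::size_t walk(Block *entry) {
    std::stack<Block *> work;
    std::set<Block *> seen;
    work.push(entry);
    seen.insert(entry);
    while (!work.empty()) {
        Block *b = work.top();
        work.pop();
        /* do something with b */
        for (Edge *e : b->out) {
            // Follow the edge to its target; skip blocks already queued.
            if (seen.insert(e->targ).second) work.push(e->targ);
        }
    }
    return seen.size();
}
```

Because nothing filters the out edges, this walk follows every edge it finds, including any that leave the starting function.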
What if we only want intraprocedural edges?
Slide10
Edge predicates
Walking a control flow graph
while(!work.empty()) {
    Block *b = work.pop();
    /* do something with b */
    IntraProc pred;
    edgeiter eit = b->out().begin(&pred);
    while(eit != b->out().end()) {
        work.push(*eit++);
    }
}
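One way to picture predicate-filtered iteration is the sketch below, which models a predicate as a callable that accepts or rejects an edge and builds and/or combinations from two of them. The `Edge` fields, the `EdgeType` values, and the helpers `both`, `either`, and `targets` are assumptions for illustration; the real iterator interface may differ.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical types: the slide names IntraProc and and/or composition;
// the edge kinds, fields, and helper names below are assumptions.
enum class EdgeType { Direct, Indirect, Call, Fallthrough };

struct Edge {
    int src;         // source block id
    int targ;        // target block id
    EdgeType type;
    bool interproc;  // true when the edge leaves the current function
};

using EdgePredicate = std::function<bool(const Edge &)>;

// Compose two predicates with logical and / or.
EdgePredicate both(EdgePredicate a, EdgePredicate b) {
    return [a, b](const Edge &e) { return a(e) && b(e); };
}
EdgePredicate either(EdgePredicate a, EdgePredicate b) {
    return [a, b](const Edge &e) { return a(e) || b(e); };
}

// Return the targets of the edges the predicate accepts, in order.
std::vector<int> targets(const std::vector<Edge> &out, const EdgePredicate &pred) {
    std::vector<int> result;
    for (const Edge &e : out)
        if (pred(e)) result.push_back(e.targ);
    return result;
}

const EdgePredicate intraProc = [](const Edge &e) { return !e.interproc; };
const EdgePredicate directOnly = [](const Edge &e) { return e.type == EdgeType::Direct; };
```

Keeping the filter outside the walk loop means the same traversal code can serve several views of the graph by swapping the predicate.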
Edge Predicates
Tell iterator whether Edge argument should be returned
Composable (and, or)
Examples:
Intraprocedural
Single function context
Direct branches only
Slide11
Extensible CFG objects
[Figure: Dyninst image_func derived from ParseAPI Function]
ParseAPI Function: simple, only needs to represent the control flow graph
Dyninst image_func: complex, handles instrumentation, liveness, relocation, etc.
Special callback points during parsing
parse parse parse
unresBranchNotify(insn) [derived class does stuff]
parse parse parse
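The callback and factory ideas on this slide can be sketched together in a few lines of C++. `unresBranchNotify` and `mkfunc` are the names shown on the slide; every signature, the string-based toy instructions, the `ToolFunc` and `ToolFactory` classes, and the `parse` loop are assumptions, not the actual ParseAPI interface.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical sketch: unresBranchNotify and mkfunc are names from the slide;
// all signatures and the toy parse loop are assumptions.
struct Function {
    explicit Function(unsigned long a) : addr(a) {}
    virtual ~Function() = default;
    // Callback point: the parser reports a branch it could not resolve.
    virtual void unresBranchNotify(const std::string & /*insn*/) {}
    unsigned long addr;
};

struct Factory {
    virtual ~Factory() = default;
    virtual std::unique_ptr<Function> mkfunc(unsigned long addr) {
        return std::make_unique<Function>(addr);
    }
};

// Tool-side subclass carrying extra state, in the role of Dyninst's image_func.
struct ToolFunc : Function {
    using Function::Function;
    void unresBranchNotify(const std::string &insn) override { unresolved.push_back(insn); }
    std::vector<std::string> unresolved;
};

struct ToolFactory : Factory {
    std::unique_ptr<Function> mkfunc(unsigned long addr) override {
        return std::make_unique<ToolFunc>(addr);
    }
};

// Toy parser: creates the function through the factory, then fires the
// callback for each instruction that looks like an indirect jump.
std::unique_ptr<Function> parse(Factory &factory, unsigned long entry,
                                const std::vector<std::string> &insns) {
    std::unique_ptr<Function> f = factory.mkfunc(entry);
    for (const std::string &insn : insns)
        if (insn.rfind("jmp *", 0) == 0) f->unresBranchNotify(insn);
    return f;
}
```

The parser only ever sees `Function`, so a tool can attach its own state and behavior by supplying a factory, without the parser knowing about the derived type.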
Factory interface for CFG objects
parser → custom factory → mkfunc() → (Function*)image_func
Slide12
What’s in the box?
* box to be released soon
Binary Parser
    recursive descent parsing
    speculative gap parsing
    cross platform: x86, x86-64, PPC, IA64, SPARC
Control Flow Graph Representation
    graph interface
    extensible objects for easy tool integration
    exports Dyninst InstructionAPI interface
SymtabAPI-based Code Source
    cross-platform
    supports ELF, PE, XCOFF formats
Slide13
Status
conception
code refactoring
interface design
Dyninst re-integration (major test case)
other major test case: compiler provenance (come tomorrow!)