
Model-Assisted Machine-Code Synthesis - PowerPoint Presentation

Uploaded On 2018-11-04


Presentation Transcript

Slide1

Model-Assisted Machine-Code Synthesis

Venkatesh Srinivasan*, Ara Vartanian, Thomas Reps
{venk, aravart, reps}@cs.wisc.edu

* Venkatesh Srinivasan is now at Google

Slide2

Software is Pervasive

Software is everywhere
Everyday systems: phones, laptops, watches, cars, etc.
Critical systems: aircraft, space shuttles, medical devices, etc.

[Figure: a binary shown as a bit string]

Binaries: no source code, no documentation

Analyze binaries
Rewrite binaries

[Motivation]

Slide3

Why rewrite binaries?

[Figure: a binary rewriter takes a binary as input and produces a rewritten binary (Binary’)]

Binary rewriting is used for:
Superoptimization
Partial Evaluation
Program Repair
Obfuscation
Slicing

Slide4

Inside a Semantics-Based Binary Rewriter *

[Figure: Binary → Semantic Representation → Binary Analyses and Transform → Transformed Semantics (logical formula) → McSynth++ (machine-code synthesizer) → Rewritten Binary]

Clients:
Offline binary optimization
Partial evaluation
Slicing
Binary obfuscation
Binary repair

Input binary:

Disassembly of section .text:
080482e0 <.text>:
80482e0: xor ebp, ebp
80482e2: pop esi
80482e3: mov ecx, esp
80482e5: and esp, 0xfffffff0
80482e8: push eax
80482e9: push esp
80482ea: push edx
80482eb: push 0x80483d0
80482f0: push 0x80483e0
80482f5: push ecx
80482f6: push esi
80482f7: push 0x80483a5
80482fc: call 80482c4
...
8048394: push ebp
8048395: mov ebp, esp
8048397: sub esp, 0x10
804839a: mov eax, [ebp+8]
804839d: mov [ebp-4], eax
80483a0: mov eax, [ebp-4]
...
80483c0: mov eax, [ebp-4]
80483c3: leave
80483c4: ret

Rewritten Binary:

080482e0 <.text>:
80482e0: xor ebp, ebp
80482e2: pop esi
80482e3: mov ecx, esp
80482e5: and esp, 0xfffffff0
80482e8: push 1
80482e9: push 2
80482ea: push 3
80482eb: push 0x80483d0
80482f0: push 0x80483e0
80482f5: push ecx
80482f6: push esi
80482f7: push 0x80483a5
80482fc: call 80482c4
...
8048394: lea esp, [esp - 4]
8048395: mov ebp, esp
8048397: sub esp, 0x10
804839a: mov eax, 1
804839d: mov [ebp-4], eax
80483a0: mov eax, 20
...
80483c0: mov [ebp-4], eax
80483c3: leave
80483c4: ret

* V. Srinivasan, T. Sharma, and T. Reps. Speeding up Machine-Code Synthesis. In OOPSLA, 2016

Slide5

McSynth++ is not fast enough!

[Figure: a binary rewriter (Analysis, Transformation, Synthesis) turning a binary into rewritten binaries (Binary’) for superoptimization, partial evaluation, program repair, obfuscation, and slicing]

Rewriter makes many calls to McSynth++
Each call to McSynth++ takes minutes
Time taken to rewrite a binary: several hours or days!
We need to speed up McSynth++!

McSynth++ → McSynth-ML

Slide6

Outline

Motivation
Intel x86 (IA-32) Primer
McSynth++
McSynth-ML
Experiments
Conclusion

Slide7

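Intel x86 (IA-32) Primer

[Figure: registers and stack, stepped through four instructions]

Initial state: ESP = 1000, EAX = 20
push eax: ESP becomes 996 and 20 is stored at the new top of the stack
mov ebx, eax: EBX becomes 20
mov [esp], 40: 40 is stored at the top of the stack
lea esp, [esp+4]: ESP becomes 1000 again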
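
The four instructions on this slide can be traced with a few lines of Python. This is only a minimal sketch: the state layout (a dict of registers plus a dict for memory) and the function name `run_primer_example` are assumptions made for illustration, and memory is modelled one 32-bit word per address rather than byte by byte.

```python
# Toy trace of the four IA-32 instructions on this slide.
# The state representation is an illustrative assumption.

MASK32 = 0xFFFFFFFF

def run_primer_example():
    regs = {"esp": 1000, "eax": 20, "ebx": 0}
    mem = {}  # address -> 32-bit value

    # push eax: decrement ESP by 4, then store EAX at the new top of stack
    regs["esp"] = (regs["esp"] - 4) & MASK32
    mem[regs["esp"]] = regs["eax"]

    # mov ebx, eax: copy EAX into EBX
    regs["ebx"] = regs["eax"]

    # mov [esp], 40: overwrite the top-of-stack slot with the constant 40
    mem[regs["esp"]] = 40

    # lea esp, [esp+4]: pure address arithmetic, no flags are modified
    regs["esp"] = (regs["esp"] + 4) & MASK32

    return regs, mem

regs, mem = run_primer_example()
print(regs, mem)
```

Note that `lea` is used here instead of `add` to adjust ESP because it leaves the flags alone, which matters later when a specification says the flags must be unchanged.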

[Background]

Slide8

Outline

Motivation
Intel x86 (IA-32) Primer
McSynth++
McSynth-ML
Experiments
Conclusion

Slide9

Synthesis of Machine Code*

Synthesize machine-code instructions from a semantic specification
Specification is a QFBV formula
Synthesizer is parameterized by the ISA

The instruction push eax has the semantics
ESP’ = ESP - 4 ∧ Mem’ = Mem[ESP - 4 ↦ EAX]

Given this formula, McSynth++ synthesizes
lea esp, [esp - 4]; mov [esp], eax

The full formula also states that every other location is unchanged:
ESP’ = ESP - 4
Mem’ = Mem[ESP - 4 ↦ EAX]
EBP’ = EBP
EAX’ = EAX
EBX’ = EBX
...
CF’ = CF
OF’ = OF

[McSynth++]
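
To make the specification concrete, the sketch below writes the formula for push eax as a Python function on states and compares it with two candidate instruction sequences on random inputs. Everything here is an illustrative assumption: the function names, the register set, and the word-granular memory model are made up, and random testing can only refute a candidate, whereas a synthesizer would use a solver over QFBV formulas to prove equivalence.

```python
import random

MASK32 = 0xFFFFFFFF

def spec_push_eax(state):
    """phi: ESP' = ESP - 4, Mem' = Mem[ESP - 4 -> EAX], everything else unchanged."""
    regs, mem = dict(state[0]), dict(state[1])
    new_esp = (regs["esp"] - 4) & MASK32
    mem[new_esp] = regs["eax"]
    regs["esp"] = new_esp
    return regs, mem

def candidate_lea_mov(state):
    """lea esp, [esp - 4]; mov [esp], eax"""
    regs, mem = dict(state[0]), dict(state[1])
    regs["esp"] = (regs["esp"] - 4) & MASK32   # lea esp, [esp - 4]
    mem[regs["esp"]] = regs["eax"]             # mov [esp], eax
    return regs, mem

def candidate_mov_only(state):
    """mov [esp], eax (wrong: ESP is never decremented)"""
    regs, mem = dict(state[0]), dict(state[1])
    mem[regs["esp"]] = regs["eax"]
    return regs, mem

def agrees_on_random_states(candidate, spec, trials=1000, seed=0):
    """Return False as soon as one random pre-state separates candidate and spec."""
    rng = random.Random(seed)
    for _ in range(trials):
        regs = {r: rng.getrandbits(32) for r in ("esp", "eax", "ebx", "ebp")}
        state = (regs, {})
        if candidate(state) != spec(state):
            return False
    return True

print(agrees_on_random_states(candidate_lea_mov, spec_push_eax))
print(agrees_on_random_states(candidate_mov_only, spec_push_eax))
```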

* V. Srinivasan and T. Reps. Synthesis of machine code from semantics. In PLDI, 2015.
V. Srinivasan, T. Sharma, and T. Reps. Speeding up machine-code synthesis. In OOPSLA, 2016.

Slide10

Enumerative Synthesis: Challenges

43,000 unique IA-32 instruction schemas
+ Exponential cost of enumeration
= Synthesis of an instruction sequence of length 2 takes several hours or days!
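
The growth is easy to quantify. The short calculation below counts schema sequences of a given length, using only the 43,000 figure from the slide; the function name is made up for illustration, and the count ignores any pruning as well as the choice of immediate operands.

```python
NUM_SCHEMAS = 43_000  # figure quoted on the slide

def sequences_of_length(n, pool=NUM_SCHEMAS):
    """Number of schema sequences of exactly length n, before any pruning."""
    return pool ** n

for n in (1, 2, 3):
    print(n, f"{sequences_of_length(n):,}")
```

Already at length 2 there are about 1.8 billion schema sequences, which is why unguided enumeration becomes impractical so quickly.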

Slide11

McSynth++ Design

[Figure: McSynth++ design. Labels: Master; Slave; input formula φ; sub-formulas φ1,1, φ1,2, ..., φm-1,1, ..., φm-1,k, φm,1, φm,2, ..., φm,n and φ'm,1, φ'm,2; synthesized instruction sequences I1, I2, ..., In, I'1, I'2, and I; T]

Slide12

McSynth++ Design

Master
φ: EAX’ = Mem(ESP + 10) ∧ ESP’ = ESP + 18 ∧ ZF’ = (ESP + 10 = 0)

Slave for φ1: EAX’ = Mem(ESP + 10)
mov eax, [esp+10]

Slave for φ2: ESP’ = ESP + 18 ∧ ZF’ = (ESP + 10 = 0)
add esp, 10
lea esp, [esp+8]

Synthesis-time labels on the slide: few seconds; 10 minutes

Slide13

McSynth++ Slave

[Figure: slave synthesizer. Input: sub-formula φ1. Output: instruction sequence I]

Instruction Enumerator: enumerates the search space (instruction sequences)
Footprint-Based Pruner and Bits-Lost-Based Pruner: prune useless candidates
CEGIS: instantiation of the counterexample-guided inductive synthesis framework for machine code

Slide14

McSynth++ Slave

Linear search over the space of instruction sequences:
One-instruction sequences
Two-instruction sequences
Three-instruction sequences
...

Slave performs no prioritization of candidates
Not all candidates are equally likely to implement φ
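
The slave's behaviour, linear enumeration by length combined with a CEGIS-style check, can be sketched on a toy problem. The four-instruction ISA on a single 8-bit register, the names `TOY_ISA`, `verify`, and `linear_cegis`, and the exhaustive verifier are all assumptions made for illustration; the real slave enumerates IA-32 schemas and would use a solver where this sketch loops over all 256 inputs.

```python
from itertools import product

MASK8 = 0xFF

# Hypothetical toy ISA on one 8-bit register; stands in for IA-32 schemas.
TOY_ISA = {
    "inc": lambda x: (x + 1) & MASK8,
    "dec": lambda x: (x - 1) & MASK8,
    "shl1": lambda x: (x << 1) & MASK8,
    "not": lambda x: (~x) & MASK8,
}

def run(seq, x):
    for op in seq:
        x = TOY_ISA[op](x)
    return x

def verify(seq, spec):
    """Exhaustive check over all 8-bit inputs; returns a counterexample or None."""
    for x in range(256):
        if run(seq, x) != spec(x):
            return x
    return None

def linear_cegis(spec, max_len=4):
    tests = []   # counterexamples collected so far
    tried = 0
    for length in range(1, max_len + 1):           # shortest sequences first
        for seq in product(TOY_ISA, repeat=length):
            tried += 1
            if any(run(seq, x) != spec(x) for x in tests):
                continue   # rejected cheaply by an earlier counterexample
            cex = verify(seq, spec)
            if cex is None:
                return list(seq), tried
            tests.append(cex)
    return None, tried

spec = lambda x: (2 * x + 2) & MASK8
seq, tried = linear_cegis(spec)
print(seq, tried)
```

The order in which candidates are tried depends only on their length and on the fixed order of `TOY_ISA`, never on the specification, which is the lack of prioritization this slide points out.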

Slide15

Outline

Motivation
Intel x86 (IA-32) Primer
McSynth++
McSynth-ML
Experiments
Conclusion

Slide16

McSynth-ML Key Insight

Prioritize candidates
Retain completeness properties
Linear search → Best-first search assisted by models
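
A best-first search that reorders candidates without discarding any can be sketched with a priority queue. The toy 8-bit ISA, the function names, and the per-instruction costs in `op_cost` are hypothetical; in McSynth-ML the scores come from the learned models, whereas here they are simply made up to show the mechanism.

```python
import heapq
from itertools import count

MASK8 = 0xFF

# Hypothetical toy ISA on one 8-bit register.
TOY_ISA = {
    "not": lambda x: (~x) & MASK8,
    "dec": lambda x: (x - 1) & MASK8,
    "shl1": lambda x: (x << 1) & MASK8,
    "inc": lambda x: (x + 1) & MASK8,
}

def run(seq, x):
    for op in seq:
        x = TOY_ISA[op](x)
    return x

def implements(seq, spec):
    return all(run(seq, x) == spec(x) for x in range(256))

def best_first(spec, op_cost, max_len=4):
    """Expand the cheapest prefix first. Every prefix up to max_len stays in
    the frontier until it is expanded, so no candidate is ever dropped."""
    tie = count()   # tie-breaker so heap entries never compare prefixes
    frontier = [(0.0, next(tie), ())]
    expanded = 0
    while frontier:
        cost, _, prefix = heapq.heappop(frontier)
        expanded += 1
        if prefix and implements(prefix, spec):
            return list(prefix), expanded
        if len(prefix) < max_len:
            for op in TOY_ISA:
                heapq.heappush(frontier, (cost + op_cost[op], next(tie), prefix + (op,)))
    return None, expanded

# Made-up scores: lower cost means "more likely to appear in the answer".
op_cost = {"inc": 1.0, "shl1": 1.0, "dec": 3.0, "not": 3.0}
uniform = {op: 1.0 for op in TOY_ISA}
spec = lambda x: (2 * x + 2) & MASK8

guided_seq, guided_expanded = best_first(spec, op_cost)
plain_seq, plain_expanded = best_first(spec, uniform)
print(guided_seq, guided_expanded, plain_seq, plain_expanded)
```

With uniform costs the queue degenerates into a length-ordered search; with informative costs the same answer is reached after fewer expansions, and because nothing is removed from the frontier the set of reachable candidates is unchanged.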

[McSynth-ML]

Slide17

McSynth-ML Design

[Figure: McSynth++ and McSynth-ML side by side. McSynth++: Master, φ, Slaves, output sequences I1, I2, ..., Im. McSynth-ML: Master, φ, Slave-ML instances, output sequences I1, I2, ..., In]

Model-assisted slave: faster enumerative synthesis

Slide18

Training Corpus

Corpus of 〈Specification, Implementation〉 pairs

Machine-code synthesizer: QFBV → search → instruction sequence
InstrToQFBV: instruction sequence → symbolic execution → QFBV

Corpus of 〈φ, I〉 pairs

Slide19

Learning Models

Corpus of 〈φ, I〉 pairs

Language model: n-gram
“push ebp; mov ebp, esp” is more common than “push ebp; xor ebp, esp”

Regression model: k-nearest neighbors
If φ contains a “+” operator (8-, 16-, or 32-bit), I most likely will contain “add”, “sub”, or “lea” opcodes
If φ contains an 8-bit “+” operator, I most likely will contain 8-bit operands
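
The n-gram idea can be illustrated with a bigram model over whole instructions. The four-sequence corpus below is invented purely for illustration, as are the function names and the choice of add-alpha smoothing; the slide does not say which n or which smoothing the actual model uses.

```python
from collections import Counter

# Tiny made-up corpus; a real one would hold instruction sequences
# harvested from binaries.
corpus = [
    ["push ebp", "mov ebp, esp", "sub esp, 0x10"],
    ["push ebp", "mov ebp, esp", "and esp, 0xfffffff0"],
    ["push ebp", "mov ebp, esp"],
    ["push ebp", "xor ebp, esp"],
]

def train_bigrams(sequences):
    pair_counts, first_counts = Counter(), Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return pair_counts, first_counts

def bigram_prob(a, b, pair_counts, first_counts, vocab_size, alpha=1.0):
    """P(b | a) with add-alpha smoothing, so unseen pairs keep nonzero mass."""
    return (pair_counts[(a, b)] + alpha) / (first_counts[a] + alpha * vocab_size)

pair_counts, first_counts = train_bigrams(corpus)
vocab = {instr for seq in corpus for instr in seq}
p_mov = bigram_prob("push ebp", "mov ebp, esp", pair_counts, first_counts, len(vocab))
p_xor = bigram_prob("push ebp", "xor ebp, esp", pair_counts, first_counts, len(vocab))
print(p_mov, p_xor)
```

Smoothing matters for search: a pair that never occurred in training still gets a small nonzero probability, so it is ranked late rather than ruled out.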

Slide20

Model-Assisted Best-First Search

[Figure: search tree over the space of instruction sequences. Prefixes such as “mov esp, <imm>”, “mov esp, [esp]”, and “add esp, <imm>” are extended into longer sequences such as “add esp, <imm>; add esp, <imm>”. The n-gram and k-NN models produce the scores that decide the next prefix to expand]

N-gram score: How common is I?
K-NN score: How well do features of I correlate with features of φ?

Slide21

Optimization: Instruction-Pool Truncation

[Figure: the space of instruction sequences, with the n-gram and k-NN models and φ]
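
One plausible reading of instruction-pool truncation is that the k-NN model looks up training formulas similar to φ and keeps only the opcodes their implementations used. The sketch below follows that reading; the feature encoding, the training pairs, and the function `truncated_pool` are assumptions, since the slide does not spell out how features are defined or how the neighbours are used.

```python
from collections import Counter

# Hypothetical training pairs: (features of phi, opcodes used by I).
# Features here are operator counts [plus, shift, memory access].
TRAINING = [
    ([1, 0, 0], {"add"}),
    ([1, 0, 0], {"lea"}),
    ([2, 0, 0], {"add", "lea"}),
    ([0, 1, 0], {"shl"}),
    ([0, 1, 0], {"shr"}),
    ([0, 0, 1], {"mov"}),
    ([1, 0, 1], {"mov", "add"}),
]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def truncated_pool(phi_features, full_pool, k=3):
    """Keep only opcodes that appear among the k nearest training formulas."""
    nearest = sorted(TRAINING, key=lambda pair: squared_distance(pair[0], phi_features))[:k]
    votes = Counter(op for _, ops in nearest for op in ops)
    return [op for op in full_pool if op in votes]

full_pool = ["mov", "add", "sub", "lea", "shl", "shr", "xor", "push"]
pool = truncated_pool([1, 0, 0], full_pool)
print(pool)
```

For a formula with a single “+” operator, the pool shrinks from eight opcodes to two, so every level of the enumeration has far fewer branches.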

Slide22

Outline

Motivation
Intel x86 (IA-32) Primer
McSynth++
McSynth-ML
Experiments
Conclusion

Slide23

Test Suite

“Important” instruction sequences from real-world programs
SPECINT 2006 benchmarks (10 binaries)
Instruction sequences of length 5 through 10
50 instruction sequences in total
No overlaps

No restriction on source of QFBV formula
[Figure: I → Symbolic Execution → φ → McSynth++ → I’]

[Experiments]

Slide24

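Training Corpus

[Figure: construction of the test input and the training corpus]
Test input: instruction sequence It from Binary1; symbolic execution yields φt
Training corpus: instruction sequences {I1, I2, I3, …, In} from Binary2, Binary3, ..., Binary10; symbolic execution yields {φ1, φ2, φ3, …, φn}
Corpus of 〈φ, I〉 pairs

Slide25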

McSynth-ML Vs. McSynth++

No. of timeouts (out of 50): 6 for McSynth++ Vs. 0 for McSynth-ML
Avg. speedup for formulas that timed out in McSynth++: over 526X
Avg. speedup for remaining formulas: 4.55X (12.6X for formulas with baseline synthesis-time > 100s)

Slide26

Experiments Summary

Instruction-pool truncation via k-NN | Search strategy | No. of timeouts | Speedup for formulas that time out | Speedup for remaining formulas
No | Linear | 6 | -- | --
No | Model-assisted best-first | 0 | At least 38 | 1.78
Yes | Linear | 0 | At least 200 | 4
Yes | Model-assisted best-first | 0 | At least 526 | 4.55

Slide27

Conclusion

Use of machine learning to make machine-code synthesis smarter and faster
Best-first search assisted by models
n-gram language model
k-NN regression model
Prioritization of candidates
Retains completeness
Over 526X speedup for formulas that timed out in McSynth++
4.55X speedup for remaining formulas

Slide28

Questions?

[Figure: the space of instruction sequences, with the n-gram and k-NN models and φ]

Slide29

Master in McSynth++

φ: EAX’ = Mem(ESP + 4) ∧ EBX’ = ((EAX * 2) >> 2) + EAX ∧ Mem’ = Mem[ESP ↦ EAX]

Legal split (flow independence):
φ1: Mem’ = Mem[ESP ↦ EAX]
φ2: EAX’ = Mem(ESP + 4) ∧ EBX’ = ((EAX * 2) >> 2) + EAX

φ2 is split further:
φ3: EBX’ = ((EAX * 2) >> 2) + EAX
φ4: EAX’ = Mem(ESP + 4)

Mem(ESP) and Mem(ESP + 4) could never alias

Slide30

Flattening “Deep” Terms

φ3: EBX’ = ((EAX * 2) >> 2) + EAX

Equisatisfiable flattened form of φ3:
m = EAX * 2
n = m >> 2
EBX’ = n + EAX

Scratch Locations (Dead Registers) = {EBX}

φ5: m = EAX * 2
φ6: n = m >> 2
φ7: EBX’ = n + EAX
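
Flattening a nested term into one-operator steps is a small recursive walk. The sketch below is an illustration under assumptions: terms are encoded as nested `(op, lhs, rhs)` tuples, temporaries are named t1, t2, ... rather than m and n, and `>>` is treated as a logical shift on 32-bit values.

```python
from itertools import count

MASK32 = 0xFFFFFFFF

def flatten(expr, steps, fresh):
    """Turn a nested (op, lhs, rhs) tuple into three-address steps.
    Leaves are register names or integer constants."""
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    a = flatten(lhs, steps, fresh)
    b = flatten(rhs, steps, fresh)
    tmp = f"t{next(fresh)}"
    steps.append((tmp, op, a, b))
    return tmp

def evaluate(steps, env):
    """Run the flattened steps on concrete 32-bit values."""
    ops = {"*": lambda a, b: a * b, ">>": lambda a, b: a >> b, "+": lambda a, b: a + b}
    env = dict(env)
    for dest, op, a, b in steps:
        va = env[a] if isinstance(a, str) else a
        vb = env[b] if isinstance(b, str) else b
        env[dest] = ops[op](va, vb) & MASK32
    return env

# phi3: EBX' = ((EAX * 2) >> 2) + EAX
phi3 = ("+", (">>", ("*", "EAX", 2), 2), "EAX")
steps = []
result = flatten(phi3, steps, count(1))
for step in steps:
    print(step)
```

Each resulting step mentions a single operator, so each can be handed to a slave as a small formula of its own; the last temporary (`result`) plays the role of EBX’.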

With the temporaries m and n held in the scratch register EBX:
φ5: EBX’ = EAX * 2
φ6: EBX’ = EBX >> 2
φ7: EBX’ = EBX + EAX

Slide31

Master in McSynth++

φ: EAX’ = Mem(ESP + 4) ∧ EBX’ = ((EAX * 2) >> 2) + EAX ∧ Mem’ = Mem[ESP ↦ EAX]

Sub-formulas produced by the Master:
φ1: Mem’ = Mem[ESP ↦ EAX]
φ5: EBX’ = EAX * 2
φ6: EBX’ = EBX >> 2
φ7: EBX’ = EBX + EAX
φ4: EAX’ = Mem(ESP + 4)

Slide32

Footprint-Based Search-Space Pruning

φ1: Mem’ = Mem[ESP - 4 ↦ EBP]

Candidates from the Instruction Enumerator:
mov eax, <imm>
mov esp, <imm>
add esp, <imm>
push eax

Abstract semantic footprint: sound abstraction of locations that might be used or modified by a QFBV formula

SFP#use(φ1) = {ESP, EBP}
SFP#kill(φ1) = {Mem’}

Candidate: mov eax, <imm>
Ψ: EAX’ = imm
SFP#use(Ψ) = { }
SFP#kill(Ψ) = {EAX’}
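
The footprint comparison above can be sketched with plain sets. This is a simplified illustration, not the pruner itself: the `FOOTPRINTS` table is written by hand, the candidate `mov [esp - 4], ebp` is added here only so that one candidate survives, and the rule in `is_useless` (reject anything that uses or kills a location outside the footprint of φ) is an assumed simplification that ignores, for example, scratch locations.

```python
# Hand-written footprints for a few instruction schemas (illustrative only;
# a real tool would derive them from the instruction semantics).
FOOTPRINTS = {
    "mov eax, <imm>":     {"use": set(),          "kill": {"EAX"}},
    "mov esp, <imm>":     {"use": set(),          "kill": {"ESP"}},
    "add esp, <imm>":     {"use": {"ESP"},        "kill": {"ESP", "FLAGS"}},
    "push eax":           {"use": {"ESP", "EAX"}, "kill": {"ESP", "Mem"}},
    "mov [esp - 4], ebp": {"use": {"ESP", "EBP"}, "kill": {"Mem"}},
}

def footprint(seq):
    """Over-approximate the use/kill sets of a sequence by taking unions."""
    use, kill = set(), set()
    for instr in seq:
        use |= FOOTPRINTS[instr]["use"]
        kill |= FOOTPRINTS[instr]["kill"]
    return use, kill

def is_useless(seq, phi_use, phi_kill):
    """Assumed simplified rule: reject a candidate that touches locations
    outside the footprint of phi."""
    use, kill = footprint(seq)
    return not (use <= phi_use and kill <= phi_kill)

# phi1: Mem' = Mem[ESP - 4 -> EBP]
phi_use, phi_kill = {"ESP", "EBP"}, {"Mem"}
verdicts = {instr: is_useless([instr], phi_use, phi_kill) for instr in FOOTPRINTS}
for instr, useless in verdicts.items():
    print(instr, "pruned" if useless else "kept")
```

Because the check only compares small sets, it is far cheaper than running CEGIS on the candidate, and any longer sequence that starts with a rejected prefix can be skipped as well.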

Candidate: mov eax, <imm>; mov ebx, ecx
Ψ: EAX’ = imm ∧ EBX’ = ECX
SFP#use(Ψ) = {ECX}
SFP#kill(Ψ) = {EAX’, EBX’}

Useless prefix: mov eax, <imm>

Slide33

Bits-Lost-Based Pruning

φ7 requires pre-state bits in EBX
Ψ loses the pre-state bits in EBX when it transforms the state
Prunes a candidate if Ψ does not retain enough bits to implement φ7
Retains a candidate if pre-state bits required to implement φ7 are possibly latent in Ψ
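
The intuition can be checked by brute force on a narrow bit width, taking φ7 to be EBX’ = EBX + EAX and two candidate prefixes, mov ebx, eax and sub ebx, eax. The 4-bit width, the function names, and the criterion in `retains_enough_bits` are assumptions made for this sketch; the actual pruner reasons about lost bits abstractly rather than by enumerating states.

```python
from itertools import product

BITS = 4                       # small width so brute force is instant
MASK = (1 << BITS) - 1

def phi7(ebx, eax):            # EBX' = EBX + EAX
    return (ebx + eax) & MASK

def mov_ebx_eax(ebx, eax):     # Psi: EBX' = EAX
    return eax, eax

def sub_ebx_eax(ebx, eax):     # Psi: EBX' = EBX - EAX
    return (ebx - eax) & MASK, eax

def retains_enough_bits(prefix):
    """True if the value phi7 needs is still determined by the state the
    prefix leaves behind, i.e. no two pre-states that phi7 distinguishes
    are merged into the same post-state."""
    needed = {}
    for ebx, eax in product(range(MASK + 1), repeat=2):
        post = prefix(ebx, eax)
        want = phi7(ebx, eax)
        if needed.setdefault(post, want) != want:
            return False
    return True

print(retains_enough_bits(mov_ebx_eax), retains_enough_bits(sub_ebx_eax))
```

After mov ebx, eax the old value of EBX is gone, so no suffix can compute EBX + EAX; after sub ebx, eax it is still recoverable as EBX’ + EAX, so that prefix is worth extending.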

φ7: EBX’ = EBX + EAX

Candidates from the Instruction Enumerator:
mov ebx, eax (Ψ: EBX’ = EAX): the pre-state bits of EBX are lost
sub ebx, eax (Ψ: EBX’ = EBX - EAX): the pre-state bits of EBX are still recoverable

Slide34

Synthesis Vs. Compilation

Synthesis:
Finds several implementations of varying “qualities”
Can handle more general formulas (implicit form), e.g., EAX’ + EBX’ = EAX + EBX + 4

Compilation:
Finds a single implementation
Only handles state-transformation formulas (explicit form)

Slide35

Synthesis Vs. Superoptimization

Machine-code synthesis is more general than superoptimization: superopt(I) = synthOpt(InstrToQFBV(I))
Input to synthesizer is QFBV formula, not an instruction sequence
Superoptimizer cannot handle more general QFBV formulas such as EAX’ + EBX’ = EAX + EBX + 4

Slide36

Effect of McSynth++ on a Client

WiPEr: Machine-code partial evaluator *
Total time to synthesize residual code: McSynth Vs. McSynth++
Effect of McSynth-ML: no significant speedup (small input formulas)

* V. Srinivasan and T. Reps. Partial Evaluation of machine code. In OOPSLA, 2015

Application | LOC | No. of calls to synthesizer | Synthesis time using McSynth (seconds) | Synthesis time using McSynth++ (seconds) | Speedup
power | 19 | 6 | 16 | 13.5 | 1.19
interpreter | 71 | 19 | 30 | 22.8 | 1.32
sha1 | 140 | 23 | 25.4 | 21 | 1.21
filter | 107 | 212 | 241 | 177 | 1.36
dotproduct | 29 | 306 | 312 | 267 | 1.17

Average speedup: 1.25X
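
The speedup column is the ratio of the two synthesis times, and the reported average is the mean of the per-application ratios. The short script below recomputes both from the timings as read from the slide; the variable names are made up for illustration.

```python
# (application, McSynth seconds, McSynth++ seconds), as read from the table
timings = [
    ("power", 16, 13.5),
    ("interpreter", 30, 22.8),
    ("sha1", 25.4, 21),
    ("filter", 241, 177),
    ("dotproduct", 312, 267),
]

speedups = {name: old / new for name, old, new in timings}
average = sum(speedups.values()) / len(speedups)

for name, s in speedups.items():
    print(f"{name}: {s:.2f}X")
print(f"average: {average:.2f}X")
```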