Uploaded On 2017-06-25

Slide1

TERN: Stable Deterministic Multithreading through Schedule Memoization

Heming Cui, Jingyue Wu, Chia-che Tsai, Junfeng Yang
Computer Science, Columbia University, New York, NY, USA

Slide2

Nondeterministic Execution

Same input → many schedules
Problem: different runs may show different behaviors, even on the same inputs

[Figure labels: nondeterministic; 1 input, many schedules; bug]

Slide3

Deterministic Multithreading (DMT)

Same input → same schedule
    [DMP ASPLOS '09], [KENDO ASPLOS '09], [COREDET ASPLOS '10], [dOS OSDI '10]
Problem: minor input change → very different schedule
    Confirmed in experiments

[Figure: two panels labeled nondeterministic (1 → many) and existing DMT systems (1 → 1); each panel marks a bug]

Slide4

Schedule Memoization

Many inputs → one schedule
    Memoize schedules and reuse them on future inputs
Stability: repeat familiar schedules
    Big benefit: avoid possible bugs in unknown schedules
    Confirmed in experiments

[Figure: three panels labeled nondeterministic (1 → many), existing DMT systems (1 → 1), and schedule memoization (many → 1); each panel marks a bug]

Slide5

TERN: the First Stable DMT System

Runs on Linux as user-space schedulers
To memoize a new schedule
    Memoize total order of synch operations as schedule
        Race-free ones for determinism [RecPlay TOCS]
    Track input constraints required to reuse schedule
        Symbolic execution [KLEE OSDI '08]
To reuse a schedule
    Check input against memoized input constraints
    If satisfied, enforce same synchronization order

Slide6

Summary of Results

Evaluated on diverse set of 14 programs
    Apache, MySQL, PBZip2, 11 scientific programs
    Real and synthetic workloads
Easy to use: < 10 lines for 13 out of 14
Stable: e.g., 100 schedules to process over 90% of real HTTP trace with 122K requests
Reasonable overhead: < 10% for 9 out of 14

Slide7

Outline

TERN overview
An Example
Evaluation
Conclusion

Slide8

Overview of TERN

TERN components are shaded

[Figure: TERN architecture.
 Compile time: Developer, Program Source, LLVM Compiler, Instrumentor.
 Runtime: Input I is matched against the Schedule Cache, which holds <C1, S1>, ..., <Ci, Si>, ..., <Cn, Sn>.
 Hit: I and Si go to Program, Replayer, OS.
 Miss: I goes to Program, Memoizer, OS, yielding a new <C, S>.]

Slide9

Outline

TERN overview
An Example
Evaluation
Conclusion

Slide10

Simplified PBZip2 Code

    main(int argc, char *argv[]) {
        int i;
        int nthread = argv[1];               // read input
        int nblock = argv[2];
        for (i = 0; i < nthread; ++i)
            pthread_create(worker);          // create worker threads
        for (i = 0; i < nblock; ++i) {
            block = bread(i, argv[3]);       // read i'th file block
            add(worklist, block);            // add block to work list
        }
    }

    worker() {                               // worker thread code
        for (;;) {
            block = get(worklist);           // get a block from work list
            compress(block);                 // compress block
        }
    }

Slide11

Annotating Source

    main(int argc, char *argv[]) {
        int i;
        int nthread = argv[1];
        int nblock = argv[2];
        symbolic(&nthread);                  // marking inputs affecting schedule
        symbolic(&nblock);                   // marking inputs affecting schedule
        for (i = 0; i < nthread; ++i)
            pthread_create(worker);          // TERN intercepts
        for (i = 0; i < nblock; ++i) {
            block = bread(i, argv[3]);
            add(worklist, block);            // TERN intercepts
        }
    }

    worker() {
        for (;;) {
            block = get(worklist);           // TERN intercepts
            compress(block);
        }
    }

TERN tolerates inaccuracy in annotations.

Slide12

Memoizing Schedules

cmd$ pbzip2 2 2 foo.txt

    main(int argc, char *argv[]) {
        int i;
        int nthread = argv[1];               // 2
        int nblock = argv[2];                // 2
        symbolic(&nthread);
        symbolic(&nblock);
        for (i = 0; i < nthread; ++i)
            pthread_create(worker);
        for (i = 0; i < nblock; ++i) {
            block = bread(i, argv[3]);
            add(worklist, block);
        }
    }

    worker() {
        for (;;) {
            block = get(worklist);
            compress(block);
        }
    }

[Figure: synchronization order across threads T1, T2, T3, with operations labeled p…create, add, p…create, get, get, add]

Constraints
    0 < nthread ? true
    1 < nthread ? true
    2 < nthread ? false
    0 < nblock ? true
    1 < nblock ? true
    2 < nblock ? false
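The loop-test outcomes recorded above each narrow the set of values an annotated input may take. The following is a minimal sketch in C of how three such outcomes can collapse into a single equality. It is an illustration only and not TERN's implementation, which relies on symbolic execution; the names `range_t`, `range_any`, `record_less_than`, `range_is_equality`, and `range_contains` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical representation: the values of one symbolic int that satisfy
 * every recorded branch outcome, kept as a closed interval [lo, hi]. */
typedef struct {
    int lo;
    int hi;
} range_t;

/* Start with no knowledge about the symbolic variable. */
range_t range_any(void) {
    range_t r = { INT_MIN, INT_MAX };
    return r;
}

/* Record the outcome of the loop test "k < n" for a concrete k. */
void record_less_than(range_t *r, int k, bool taken) {
    assert(k < INT_MAX);                    /* keeps k + 1 from overflowing */
    if (taken) {
        if (k + 1 > r->lo) r->lo = k + 1;   /* k < n holds, so n >= k + 1 */
    } else {
        if (k < r->hi) r->hi = k;           /* k < n fails, so n <= k */
    }
}

/* True when the recorded outcomes pin n to exactly one value. */
bool range_is_equality(const range_t *r) {
    return r->lo == r->hi;
}

/* True when a new concrete input value satisfies the recorded outcomes. */
bool range_contains(const range_t *r, int n) {
    return r->lo <= n && n <= r->hi;
}
```

Recording 0 < n (true), 1 < n (true), and 2 < n (false) narrows the interval to [2, 2], which is the equality 2 == n.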

Slide13

Simplifying Constraints

cmd$ pbzip2 2 2 foo.txt

    main(int argc, char *argv[]) {
        int i;
        int nthread = argv[1];
        int nblock = argv[2];
        symbolic(&nthread);
        symbolic(&nblock);
        for (i = 0; i < nthread; ++i)
            pthread_create(worker);
        for (i = 0; i < nblock; ++i) {
            block = bread(i, argv[3]);
            add(worklist, block);
        }
    }

    worker() {
        for (;;) {
            block = get(worklist);
            compress(block);
        }
    }

[Figure: synchronization order across threads T1, T2, T3, with operations labeled p…create, add, p…create, get, get, add]

Constraints
    2 == nthread
    2 == nblock

Constraint simplification techniques in paper

Slide14

Reusing Schedules

cmd$ pbzip2 2 2 bar.txt

    main(int argc, char *argv[]) {
        int i;
        int nthread = argv[1];               // 2
        int nblock = argv[2];                // 2
        symbolic(&nthread);
        symbolic(&nblock);
        for (i = 0; i < nthread; ++i)
            pthread_create(worker);
        for (i = 0; i < nblock; ++i) {
            block = bread(i, argv[3]);
            add(worklist, block);
        }
    }

    worker() {
        for (;;) {
            block = get(worklist);
            compress(block);
        }
    }

[Figure: synchronization order across threads T1, T2, T3, with operations labeled p…create, add, p…create, get, get, add]

Constraints
    2 == nthread
    2 == nblock
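The reuse step in this example amounts to checking a new input against the memoized constraints and, on a match, picking the stored schedule. A minimal sketch of that check in C follows, assuming each cache entry holds only the two equality constraints from the example. The type `cache_entry_t`, the function `cache_lookup`, and the integer `schedule_id` standing in for a memoized synchronization order are all hypothetical and are not taken from TERN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry <C, S>: equality constraints on the two
 * annotated inputs plus an identifier for the memoized schedule. */
typedef struct {
    int nthread;      /* constraint: input nthread must equal this value */
    int nblock;       /* constraint: input nblock must equal this value */
    int schedule_id;  /* stands in for the memoized synchronization order */
} cache_entry_t;

/* Return the schedule id of the first entry whose constraints the input
 * satisfies (a hit), or -1 when no entry matches (a miss). */
int cache_lookup(const cache_entry_t *cache, size_t n, int nthread, int nblock) {
    assert(cache != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (cache[i].nthread == nthread && cache[i].nblock == nblock)
            return cache[i].schedule_id;
    }
    return -1;
}
```

Because the file name is not part of the constraints, `pbzip2 2 2 foo.txt` and `pbzip2 2 2 bar.txt` would both hit the same entry in this sketch, while a run with a different thread count would miss and need a newly memoized schedule.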

Slide15

Outline

TERN overview
An Example
Evaluation
Conclusion

Slide16

Stability Experiment Setup

Program – Workload
    Apache-CS: 4-day Columbia CS web trace, 122K requests
    MySQL-SysBench-simple: 200K random select queries
    MySQL-SysBench-tx: 200K random select, update, insert, and delete queries
    PBZip2-usr: random 10,000 files from “/usr”
Machine: typical 2.66GHz quad-core Intel
Methodology
    Memoize schedules on random 1% to 3% of workload
    Measure reuse rates on entire workload (Many → 1)
    Reuse rate: % of inputs processed with memoized schedules
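The reuse-rate metric defined above is a simple ratio. As a small sketch in C, with the hypothetical function name `reuse_rate_percent` and counts supplied by the caller:

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of inputs that were processed with a memoized schedule.
 * "reused" counts inputs that hit the schedule cache; "total" counts
 * all inputs in the workload. */
double reuse_rate_percent(size_t reused, size_t total) {
    assert(total > 0 && reused <= total);
    return 100.0 * (double)reused / (double)total;
}
```

For instance, a workload of 4 inputs in which 1 input reuses a memoized schedule has a reuse rate of 25%.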

Slide17

How Often Can TERN Reuse Schedules?

Program-Workload         Reuse Rate (%)   # Schedules
Apache-CS                90.3             100
MySQL-SysBench-simple    94.0             50
MySQL-SysBench-tx        44.2             109
PBZip2-usr               96.2             90

Over 90% reuse rate for three of the four workloads
Relatively lower reuse rate for MySQL-SysBench-tx due to random query types and parameters

Slide18

Bug Stability Experiment Setup

Bug stability: when input varies slightly, do bugs occur in one run but disappear in another?
Compared against COREDET [ASPLOS '10]
    Open-source, software-only
    Typical DMT algorithms (one used in dOS)
Buggy programs: fft, lu, and barnes (SPLASH2)
    Global variables are printed before being assigned the correct value
Methodology: vary thread count and computation amount, then record bug occurrence over 100 runs for COREDET and TERN

Slide19

Is Buggy Behavior Stable? (fft)

[Figure: bug-occurrence grids for COREDET and TERN; # of threads (2, 4, 8) against matrix size (10, 12, 14); each cell is marked "no bug" or "bug occurred"]

COREDET: 9 schedules, one for each cell.
TERN: only 3 schedules, one for each thread count.
Fewer schedules → lower chance to hit bug → more stable
Similar results for 2 to 64 threads, 2 to 20 matrix size, and the other two buggy programs lu and barnes

Slide20

Does TERN Incur High Overhead in Reuse Runs?

Smaller is better. Negative values mean speed up.

Slide21

Conclusion and Future Work

Schedule memoization: reuse schedules across different inputs (Many → 1)
TERN: easy to use, stable, deterministic, and fast
Future work
    Fast & Deterministic Replay/Replication