Presentation Transcript

Slide 1

An Analysis of the Search Space of Generate and Validate Patch Generation Systems

Fan Long and Martin Rinard, MIT EECS & CSAIL

Slide 2

[Diagram: the problem setting: an application and a test suite of inputs with correct outputs, divided into negative inputs and positive inputs. Goal: an automatic patch generation system]

Slide 3

Generate and Validate Patching

Generate a search space of candidate patches

[Diagram: the test suite (negative and positive inputs) and an example candidate patch: p->f1 = y;]

Slide 4

Generate and Validate Patching

Validate each candidate patch against the test suite

[Diagram: a candidate patch (a modified version of the statement p->f1 = y;) run against the test suite]

Slide 5

Generate and Validate Patching

Validate each candidate patch against the test suite

[Diagram: another candidate patch (f1 replaced with f2 in p->f1 = y;) run against the test suite]

Slide 6

Generate and Validate Patching

Collect all of the patches that validate

[Diagram: a validated patch: if (p != 0) return;  p->f1 = y;]
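To make the generate-and-validate workflow on the last few slides concrete, here is a minimal C sketch of the validate-and-collect step. The types and helpers (Patch, Test, generate_search_space, run_test) are hypothetical placeholders, not the actual SPR or Prophet code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types and helpers, standing in for the real system. */
typedef struct Patch Patch;
typedef struct Test  Test;

extern size_t generate_search_space(Patch **out);              /* generation step (not shown) */
extern bool   run_test(const Patch *patch, const Test *test);  /* apply patch, run one test   */

/* Collect every candidate patch that passes all negative and positive tests. */
size_t validate_patches(Patch **candidates, size_t n_candidates,
                        const Test *tests, size_t n_tests,
                        Patch **validated /* out, capacity n_candidates */) {
    size_t n_validated = 0;
    for (size_t i = 0; i < n_candidates; i++) {
        bool passes_all = true;
        for (size_t j = 0; j < n_tests; j++) {
            if (!run_test(candidates[i], &tests[j])) {
                passes_all = false;            /* fails some test: discard */
                break;
            }
        }
        if (passes_all)
            validated[n_validated++] = candidates[i];  /* patch validates */
    }
    return n_validated;
}
```

A patch that survives this loop is only validated, not necessarily correct; the following slides examine exactly that gap.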

Slide 7

Are all validated patches correct?

No!

Slide 8

A Validated but Incorrect Patch

Because the test suite is incomplete!

[Diagram: a validated but incorrect patch: exit(0);  p->f1 = y;]

Slide 9

How can we make such a system generate correct patches?

Slide 10

Possible Code Space

[Diagram: the space of possible code as a plane, with points marking the original code and the correct patch; code shown: if (p != 0) return;  p->f1 = y;]

Other points in the plane correspond to candidate patches (modified code).

Slide 11

[Diagram: the possible code space, with the original code, the correct patch, and the search space]

1. Search space needs to contain correct patches

Slide 12

[Diagram: the possible code space, with the original code, the correct patch, and the search space]

2. The search algorithm can explore the search space and find correct patches

Slide 13

[Diagram: the possible code space, now also marking a validated but incorrect patch: exit(0);  p->f1 = y;]

Some points in the plane correspond to validated but incorrect patches!

Slide 14

[Diagram: the possible code space, with the original code, the correct patch, the search space, and a validated but incorrect patch]

3. Search space does not contain too many validated but incorrect patches

Slide 15

Research Questions

How many correct patches in the search space for a defect?
How many candidate patches in the search space for a defect?
How many validated but incorrect patches in the search space for a defect?

Slide 16

An Inherent Search Space Tradeoff

Coverage: the space needs to contain enough useful patches
Tractability: the space needs to be small enough to explore effectively
  Find correct patch in time
  Rank correct patch first among validated patches

Slide 17

An empirical quantitative analysis of the search space of two patch generation systems, SPR and Prophet

Slide 18

SPR and Prophet Overview

SPR [FSE’15]:
  Work with a search space derived from a productive set of expression-level modifications
  Use staged program repair and condition synthesis techniques to efficiently explore the space

Prophet [POPL’16]:
  Work with the same SPR search space
  Learn a patch correctness model from past successful human patches to guide the search to recognize the correct patches

Slide 19

SPR Search Space: Anatomy of a Modification

Statement in Original Unpatched Program:  if (C) {…} else {…}
Statement in Patched Program:             if (C && E) {…} else {…}

Instantiate E to get a patch
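As a concrete illustration of this schema (a hypothetical sketch; the variables buf, len, and bufsize are invented, not taken from the slides or the evaluated applications), condition refinement conjoins a synthesized clause E onto the existing condition C:

```c
#include <stddef.h>

/* Statement in the original unpatched program: condition C = (len > 0). */
void terminate_buffer_original(char *buf, size_t len) {
    if (len > 0) {
        buf[len - 1] = '\0';
    }
}

/* Statement in the patched program: if (C && E), where condition
 * synthesis instantiated E = (len <= bufsize). */
void terminate_buffer_patched(char *buf, size_t len, size_t bufsize) {
    if (len > 0 && len <= bufsize) {
        buf[len - 1] = '\0';
    }
}
```

The same schema applies unchanged when the statement also has an else branch, as in the slide.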

Slide 20

SPR Search Space: Other Modifications

S  →  if (E) { S }
S  →  if (E) return c; S
Replace:         S  →  S[replace v1 with v2]
Copy & Replace:  S  →  Q[replace v1 with v2]; S
Initialize:      S  →  memset(&e, 0, sizeof(e)); S
if (C) {…} else {…}  →  if (C || E) {…} else {…}
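To make these schemas concrete, the sketch below applies several of them to one hypothetical statement S (the struct, variables, and return value are placeholders chosen for illustration; they do not come from the slides or the evaluated applications):

```c
#include <string.h>

struct entry { int f1; int f2; };

/* S: the statement targeted for modification. */
void original(struct entry *p, int v1) {
    p->f1 = v1;                              /* S */
}

/* S  ->  if (E) { S }: guard S with a synthesized condition. */
void add_guard(struct entry *p, int v1) {
    if (p != 0) { p->f1 = v1; }
}

/* S  ->  if (E) return c; S: insert a guarded return before S. */
int add_guarded_return(struct entry *p, int v1) {
    if (p == 0) return -1;
    p->f1 = v1;
    return 0;
}

/* S  ->  S[replace v1 with v2]: value replacement inside S. */
void replace_value(struct entry *p, int v1, int v2) {
    (void)v1;                                /* v1 no longer used */
    p->f1 = v2;
}

/* S  ->  memset(&e, 0, sizeof(e)); S: insert an initialization before S. */
void initialize(struct entry *p) {
    struct entry e;
    memset(&e, 0, sizeof(e));                /* inserted initialization */
    p->f1 = e.f2;                            /* S: uses e, now zero-initialized */
}
```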

Slide 21

Prophet Patch Correctness Model

Rank Patches with Probability Scores

[Diagram: candidate patches in the search space ranked by probability score, e.g. 0.2, 0.05, 0.02, 0.01]
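The ranking step itself is simple once each validated candidate carries a score from the model. A minimal sketch, assuming hypothetical patch descriptions and the example scores shown on the slide:

```c
#include <stdio.h>
#include <stdlib.h>

/* A validated candidate patch together with its model score. */
struct scored_patch {
    const char *description;
    double      score;      /* estimated probability of correctness */
};

/* qsort comparator: higher scores first. */
static int by_score_desc(const void *a, const void *b) {
    double sa = ((const struct scored_patch *)a)->score;
    double sb = ((const struct scored_patch *)b)->score;
    return (sa < sb) - (sa > sb);
}

int main(void) {
    struct scored_patch patches[] = {
        { "add guard if (p != 0)",     0.20 },
        { "replace v1 with v2",        0.05 },
        { "insert guarded return",     0.02 },
        { "insert memset initializer", 0.01 },
    };
    size_t n = sizeof(patches) / sizeof(patches[0]);

    qsort(patches, n, sizeof(patches[0]), by_score_desc);

    /* The system then tries (or reports) patches in this order. */
    for (size_t i = 0; i < n; i++)
        printf("%zu. %-28s %.2f\n", i + 1, patches[i].description, patches[i].score);
    return 0;
}
```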

Slide 22

Prophet: Key Ideas

Correct patches share universal features that hold across applications
These features capture interactions between the patch and the surrounding code
Use program analysis to extract features
Obtain corpus of patches from open source software development efforts
Learn model to prioritize correct patches

Slide 23

Setup for Model

A patch is a modification m applied to a location ℓ
(ℓ identifies a statement in program p)

Example modification:  S  →  if (E) { S }

Goal: estimate the probability that the patch (modification m at location ℓ) is correct, given p, ℓ, and m
Use the estimate to rank the patches

Slide 24

Probabilistic Model

A patch is a modification m applied to a location ℓ
(ℓ identifies a statement in program p)

The model estimates the probability that modification m applied at location ℓ in program p produces a correct patch, combining:
  a geometric distribution that encodes error localization
  a log-linear distribution based on the extracted features
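A plausible rendering of the factored model described above, written out in LaTeX. The notation is my reconstruction from the slide text (m, ℓ, p as above, r(ℓ, p) for the error-localization rank, φ for the feature vector, θ for the learned parameters), not a verbatim formula from the talk:

```latex
% Sketch (requires amsmath): probability that modification m at location \ell
% of program p yields a correct patch, up to normalization over the search space.
\[
P(m, \ell \mid p; \theta) \;\propto\;
\underbrace{(1-\epsilon)^{\,r(\ell,\, p)}\,\epsilon}_{\text{geometric: error-localization rank } r(\ell,\, p)}
\;\cdot\;
\underbrace{\exp\!\bigl(\theta^{\top}\phi(p, \ell, m)\bigr)}_{\text{log-linear: extracted features } \phi(p, \ell, m)}
\]
```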

Slide 25

Experimental Setup

Slide 26

Application   LoC        Tests   Defects
libtiff       77 K       78      8
lighttpd      62 K       295     7
php           1,046 K    8,471   31
gmp           145 K      146     2
gzip          491 K      12      4
python        407 K      35      9
wireshark     2,814 K    63      6
fbc           97 K       773     2
Total                            69

Benchmark: the GenProg benchmark set by Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, and Westley Weimer [ICSE 2012]

Slide 27

Search Space Configurations

Consider different numbers of candidate statements
  Top 100, 200, 300, or 2000 instead of just top 200 from the error localizer
Consider more operators for synthesizing condition (CExt)
Consider more complicated expressions for replacement modifications (RExt)
  e.g., replace v1 with (v2 + v3)

Slide 28

Experimental Setup

Run SPR and Prophet with 16 different search spaces, including the default one.
There are 19 defects whose correct patches are in the default search space.

There are 24 defects whose correct patches are in some extended search spaces.

Slide 29

Experimental Setup

For each of the 24 defects and each search space configuration, we record:
  The number of candidate patches in 12 hours
  The number of validated patches in 12 hours
  The number of correct patches in 12 hours
  The rank of the first correct patch among all validated patches

Slide 30

Finding 1: Validated patches are abundant in the search space

Slide 31

1. SPR and Prophet generate 700-4000 validated patches if CExt is off
2. SPR and Prophet generate 5900-12000 validated patches if CExt is on

CExt: Condition synthesis extension turned on
RExt: Replacement extension turned on
100, 200, 300, or 2000: The number of candidate statements to consider

Slide 32

Identify Correct Patches

Manually analyze the root cause of each defect
Annotate modifications that produce correct patches for each case
Run a script to match validated patches against annotations
All developer fixes in later revisions are correct

Slide 33

Finding 2: Correct patches are sparse in the search space

Slide 34

1. Across all 24 defects, SPR and Prophet generate 11 to 25 correct patches (depending on the search space configuration).
2. Prophet generates 25 correct patches with the default search space.
3. SPR and Prophet generate only 11 correct patches with the largest search space, 2000+CExt+RExt.

CExt: Condition synthesis extension turned on
RExt: Replacement extension turned on
100, 200, 300, or 2000: The number of candidate statements to consider

Slide 35

Validated but incorrect patches are much more abundant than correct patches!

Slide 36

Will a Stronger Test Suite Reduce the Number of Validated Patches?

PHP has 10x more test cases than any other application in the benchmark set (13 PHP defects vs. 11 non-PHP defects)

SPR and Prophet generate 5x-20x more validated patches on non-PHP defects

However, SPR and Prophet still generate hundreds of validated patches on PHP defects

Slide 37

Finding 3: A larger search space may produce correct patches for fewer defects

Slide 38

Slide 39

Explanation

With more candidate patches, two things can happen:
  Unable to find a correct patch within the time budget
  Relatively many more validated patches than correct patches, so an incorrect patch is found first
The second case occurs more often!

Slide 40

Conclusion

Search space tradeoff: coverage vs. tractability
Correct patches are sparse; validated patches are relatively abundant.
Challenge: How to distinguish correct patches among many validated patches?

How to move forward:
  Use additional information other than the test suite to recognize correct patches (Prophet)
  A more productive and focused search space