Slide1
Automatic Inference of Code Transforms and Search Spaces for Patch Generation
Fan Long, MIT EECS & CSAIL
1Slide2
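Generate and Validate Patching
Validate each candidate patch against the test suite
[Figure: a candidate patch at the statement "… p->f1 = y z; …" (the slide shows both y and z on the right-hand side) is run on the negative and positive inputs; each result is marked = or ≠, and one ≠ appears.]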
2Slide3
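Generate and Validate Patching
Validate each candidate patch against the test suite
[Figure: a candidate patch at the statement "… p->f1 f2 = y; …" (the slide shows both f1 and f2 as the field) is run on the negative and positive inputs; each result is marked = or ≠, and one ≠ appears.]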
3Slide4
Generate and Validate Patching
Collect all of the patches that validate
[Figure: the candidate patch "… if (p != 0) return; p->f1 = y; …" is run on the negative and positive inputs; every result is marked =.]
4Slide5
A Prophet Transform
Statement in Original Unpatched Program: if (C) {…} else {…}
Statement in Patched Program: if (C && E) {…} else {…}
E is a clause of the form “exp == c” or “exp != c”, where exp is a variable or field access and c is a constant.
5Slide6
Other Prophet Transforms
Original statement → patched statement:
S → if (E) { S }
S → if (E) return c; S
Copy & Replace: S → Q[replace v1 with v2]; S
Replace: S → S[replace v1 with v2]
if (C) {…} else {…} → if (C || E) {…} else {…}
Initialize: S → memset(&e, 0, sizeof(e)); S
6Slide7
All previous systems operate with a set of manually defined transforms.
[Figure: Manual Transforms → Search Space of Candidate Patches]
7Slide8
?: Learn how developers write patches in the first place
[Figure: Training Human Patches → Inferred Transforms → Search Space of Candidate Patches]
8Slide9
Genesis: Learn how developers write patches in the first place
[Figure: Training Human Patches → Inferred Transforms → Search Space of Candidate Patches]
9Slide10
How can we capture the code transforms used by human developers?
10Slide11
Example: Patch from Google Cloud Dataflow SDK
Original code:
if (unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
Patched code:
if (unions == null || unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
Error: Null pointer exception (NPE)
Transform: Disjoins an existing expression with an additional clause to check null
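To make the effect of this patch concrete, here is a minimal, self-contained Java sketch. The class and method names (NullGuardDemo, original, patched), the List<String> type chosen for unions, and the "non-default" return value are illustrative assumptions, not code from the Dataflow SDK; only the two conditions mirror the slide.

```java
import java.util.List;

class NullGuardDemo {
    // Mirrors the original condition: unions is dereferenced unconditionally,
    // so a null argument raises a NullPointerException.
    static String original(List<String> unions, boolean useDefault, String defaultValue) {
        if (unions.isEmpty()) {
            if (useDefault) return defaultValue;
        }
        return "non-default"; // hypothetical placeholder for the elided code
    }

    // Mirrors the patched condition: the added clause "unions == null"
    // short-circuits the || before isEmpty() is called.
    static String patched(List<String> unions, boolean useDefault, String defaultValue) {
        if (unions == null || unions.isEmpty()) {
            if (useDefault) return defaultValue;
        }
        return "non-default"; // hypothetical placeholder for the elided code
    }

    public static void main(String[] args) {
        System.out.println(patched(null, true, "default"));
        try {
            original(null, true, "default");
        } catch (NullPointerException e) {
            System.out.println("original code throws an NPE on null input");
        }
    }
}
```

The two methods differ only in the disjoined clause, which is the part the transform has to generate.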
11Slide12
Example: Patch from Google Cloud Dataflow SDK
Original code:
if (unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
Patched code:
if (unions == null || unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
[Figure: Original Code AST: a call node over "unions . isEmpty". Patched Code AST: a bop node "||" whose operands are a bop node "unions == null" and the original call node.]
12Slide13
Example: Patch from Google Cloud Dataflow SDK
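[Figure: AST of the original expression unions.isEmpty() (a call node) and AST of the patched expression unions == null || unions.isEmpty() (a bop node over a nested bop null check and the call node)]
Non-leaf nodes denote the type of the sub-tree (expression).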
13Slide14
Example: Patch from Google Cloud Dataflow SDK
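[Figure: AST of the original expression unions.isEmpty() (a call node) and AST of the patched expression unions == null || unions.isEmpty() (a bop node over a nested bop null check and the call node)]
Non-leaf nodes denote the type of the sub-tree (expression).
Leaf nodes are variable/method references, constants, and operators.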
14Slide15
Example: Patch from Google Cloud Dataflow SDK
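[Figure: AST of the original expression unions.isEmpty() (a call node) and AST of the patched expression unions == null || unions.isEmpty() (a bop node over a nested bop null check and the call node)]
The original expression is copied to the patched code.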
15Slide16
Example: Patch from Google Cloud Dataflow SDK
[Figure: patched AST (unions == null) || A, where the template variable A : Expr stands for the original expression]
The original expression is copied to the patched code.
16Slide17
Example: Patch from Google Cloud Dataflow SDK
[Figure: patched AST (unions == null) || A, where the template variable A : Expr stands for the original expression]
Syntactic details like variable names will not generalize across programs.
17Slide18
Example: Patch from Google Cloud Dataflow SDK
[Figure: patched AST (B == null) || A, where A : Expr and B : Var]
Syntactic details like variable names will not generalize across programs.
18Slide19
Example: Patch from Google Cloud Dataflow SDK
[Figure: patched AST (B == null) || A, where A : Expr and B : Var]
B is a variable that appears in A.
Syntactic details like variable names will not generalize across programs.
19Slide20
Anatomy of Code Transform
Pre-transform Template AST: A
Post-transform Template AST: (B == null) || A
Template Variables: A : Expr, B : Var, where B is a variable that appears in A
Template variables provide abstractions to enable program-independent transforms!
20Slide21
Anatomy of Code Transform
[Figure: transform A → (B == null) || A, where A : Expr and B : Var]
Generator for B: B is a variable that appears in A.
Generators specify constraints (bounds) on generating new code.
21Slide22
Anatomy of Code Transform
[Figure: transform A → (B == null) || A, where A : Expr and B : Var, with the generator for B marked]
Generators specify constraints (bounds) on generating new code.
22Slide23
Why Not This Code Transform?
[Figure: transform A → B || A, where A : Expr and B : Bop]
Enumerate all possible binary operator expressions with fewer than 4 nodes.
Original code:
if (unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
Patched code:
if (unions == null || unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
23Slide24
Why Not This Code Transform?
[Figure: transform A → B, where A : Expr and B : Expr]
Enumerate all possible expressions with fewer than 10 nodes.
Original code:
if (unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
Patched code:
if (unions == null || unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
24Slide25
How to determine the right abstraction granularity?
Solution: Generalize transforms from sets of relevant patches.
25Slide26
Apply Code Transform
[Figure: transform A → (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
Applied to the condition unions.isEmpty(), with B = unions, C = "==", and D = "||", the transform produces unions == null || unions.isEmpty().
Original code:
if (unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
Patched code:
if (unions == null || unions.isEmpty()) {
  if (useDefault) return defaultValue; … }
26Slide27
Example: Training Patches
Original code:
if (MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()).getData() != null) { … }
Patched code:
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
These patches are from three different projects: mapstruct, modelmapper, and Bukkit.
Each patch modifies a condition expression.
27Slide28
Example: Training Patches
Original code:
[Figure: ASTs of the three original expressions: a bop node for MapperPrism.getInstanceof(mapperTypeElement) == null; the call type.isAssignableFrom(subject.getClass()); and a bop node for Material.getMaterial(getTypeId()).getData() != null]
Patched code:
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
28Slide29
Example: Training Patches
Original code:
[Figure: ASTs of the three original expressions: a bop node for MapperPrism.getInstanceof(mapperTypeElement) == null; the call type.isAssignableFrom(subject.getClass()); and a bop node for Material.getMaterial(getTypeId()).getData() != null]
Patched code:
[Figure: ASTs of the three patched expressions; each root is a bop node (||, &&, &&) whose first operand is a new bop null check (mapperTypeElement == null, subject != null, Material.getMaterial(getTypeId()) != null) and whose second operand is the original expression]
29Slide30
Example: Training Patches
Original code:
[Figure: the three original expressions are abstracted to the template variable A, where A : Expr]
Patched code:
[Figure: ASTs of the three patched expressions; each root is a bop node (||, &&, &&) whose first operand is a new bop null check (mapperTypeElement == null, subject != null, Material.getMaterial(getTypeId()) != null) and whose second operand is the original expression]
30Slide31
Example: Training Patches
Original code:
[Figure: template variable A, where A : Expr]
Patched code:
[Figure: template (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
31Slide32
Template AST for Original Code
[Figure: ASTs of the three original expressions, all matched by the template A, where A : Expr]
A is a template variable in the template AST, which can represent an arbitrary expression subtree.
if (MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()).getData() != null) { … }
32Slide33
Template AST for Patched Code
[Figure: the template A (A : Expr) is transformed into the ASTs of the three patched expressions (roots ||, &&, &&); no post-transform template nodes have been built yet]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
33Slide34
Template AST for Patched Code
[Figure: the ASTs of the three patched expressions are generalized into the post-transform template. Template so far: a root bop node. A : Expr]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
34Slide35
Template AST for Patched Code
[Figure: the ASTs of the three patched expressions are generalized into the post-transform template. Template so far: a root bop node with a nested bop child. A : Expr]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
35Slide36
Template AST for Patched Code
[Figure: the ASTs of the three patched expressions are generalized into the post-transform template. Template so far: a root bop node with a nested bop child whose first operand is the template variable B. A : Expr, B : Expr]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
36Slide37
Template AST for Patched Code
[Figure: the ASTs of the three patched expressions are generalized into the post-transform template. Template so far: a root bop node with a nested bop child over B and the operator C. A : Expr, B : Expr, C : {==, !=}]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
37Slide38
Template AST for Patched Code
[Figure: the ASTs of the three patched expressions are generalized into the post-transform template. Template so far: a root bop node with the nested bop child (B C null). A : Expr, B : Expr, C : {==, !=}]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
38Slide39
Template AST for Patched Code
[Figure: the ASTs of the three patched expressions are generalized into the post-transform template. Template so far: (B C null) D, with the second operand of the root still open. A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
39Slide40
Template AST for Patched Code
[Figure: the ASTs of the three patched expressions are generalized into the complete post-transform template (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
return subject != null && type.isAssignableFrom(subject.getClass());
if (Material.getMaterial(getTypeId()) != null && Material.getMaterial(getTypeId()).getData() != null) { … }
40Slide41
Generator
[Figure: transform A → (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
What expressions does B represent?
In the training patches, B is mapperTypeElement, subject, and Material.getMaterial(getTypeId()).
Genesis infers a generator for B, which bounds the variables and functions that the resulting expression may contain.
41Slide42
Generalized Code Transform
[Figure: transform A → (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
Abstract away elements that differ across training patches.
Keep elements that are common to all training patches.
Infer generator constraints that enable the generation of the original patches.
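The "keep what is common, abstract what differs" step can be sketched as a simple tree generalization (anti-unification) over two patched ASTs. This is only an illustration under stated assumptions: the Node class, the TemplateGeneralizer name, and the "?A"-style hole names are hypothetical, the original conditions are treated as opaque leaves, and the sketch does not infer template variable types (Expr, Var) or generators as Genesis does.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TemplateGeneralizer {
    /** A tiny AST node: a label plus ordered children (hypothetical representation). */
    static final class Node {
        final String label;
        final List<Node> kids;

        Node(String label, Node... kids) {
            this.label = label;
            this.kids = Arrays.asList(kids);
        }

        @Override
        public String toString() {
            if (kids.isEmpty()) return label;
            StringBuilder sb = new StringBuilder(label).append('(');
            for (int i = 0; i < kids.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(kids.get(i));
            }
            return sb.append(')').toString();
        }
    }

    // Maps each pair of differing subtrees to one template variable, so the
    // same difference seen twice is abstracted by the same variable.
    private final Map<String, String> holes = new HashMap<>();

    Node generalize(Node a, Node b) {
        if (a.label.equals(b.label) && a.kids.size() == b.kids.size()) {
            // Common structure is kept; children are generalized recursively.
            Node[] merged = new Node[a.kids.size()];
            for (int i = 0; i < merged.length; i++) {
                merged[i] = generalize(a.kids.get(i), b.kids.get(i));
            }
            return new Node(a.label, merged);
        }
        // Differing subtrees are abstracted away into a template variable.
        String key = a + " | " + b;
        String var = holes.get(key);
        if (var == null) {
            var = "?" + (char) ('A' + holes.size());
            holes.put(key, var);
        }
        return new Node(var);
    }

    /** Generalizes two of the patched training expressions from the slides. */
    static Node demo() {
        Node p1 = new Node("bop", new Node("||"),
                new Node("bop", new Node("=="), new Node("mapperTypeElement"), new Node("null")),
                new Node("MapperPrism.getInstanceof(mapperTypeElement) == null"));
        Node p2 = new Node("bop", new Node("&&"),
                new Node("bop", new Node("!="), new Node("subject"), new Node("null")),
                new Node("type.isAssignableFrom(subject.getClass())"));
        return new TemplateGeneralizer().generalize(p1, p2);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints bop(?A, bop(?B, ?C, null), ?D)
    }
}
```

The result keeps the two bop nodes and the null constant, which are common to both patches, and replaces the operators, the checked expression, and the original condition with template variables. The variable names are assigned in traversal order and do not match the A, B, C, D lettering on the slide.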
42Slide43
Generalized Code Transform
[Figure: transform A → (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
B can be expressions with up to two calls like “v.foo().bar()”
C can be either “==” or “!=”
D can be either “||” or “&&”
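As a rough illustration of how such a transform spans a search space, the sketch below enumerates every candidate of the form (B C null) D A for a given condition A. The class name NullCheckTransform and the caller-supplied list of candidate expressions for B are assumptions made for this example; in Genesis the candidates for B would come from the inferred generator, which is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NullCheckTransform {
    static final String[] C_OPS = {"==", "!="}; // choices for C
    static final String[] D_OPS = {"||", "&&"}; // choices for D

    /** Enumerates "(B C null) D A" for every binding of B, C, and D. */
    static List<String> candidates(String a, List<String> bCandidates) {
        List<String> out = new ArrayList<>();
        for (String b : bCandidates) {
            for (String c : C_OPS) {
                for (String d : D_OPS) {
                    out.add(b + " " + c + " null " + d + " " + a);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical generator output for B: two in-scope expressions.
        List<String> bs = Arrays.asList("unions", "defaultValue");
        for (String patch : candidates("unions.isEmpty()", bs)) {
            System.out.println(patch);
        }
    }
}
```

With two candidate expressions for B, the transform yields 2 × 2 × 2 = 8 candidate conditions, one of which is the developer patch unions == null || unions.isEmpty(). The size of the space grows linearly with the number of expressions the generator allows for B.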
43Slide44
Tradeoff between Coverage and Tractability
A Useful Transform:
1. Generates correct patches for many errors
2. Generates a tractable number of candidate patches
[Figure: transform A → (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
44Slide45
Another Set of Training Patches
Original code:
if (MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
sos.close();
Patched code:
if (mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null) { … }
if (sos != null) { sos.close(); }
45Slide46
Another Set of Training Patches
Original code:
[Figure: AST of MapperPrism.getInstanceof(mapperTypeElement) == null (a bop node) and of the statement sos.close();]
Patched code:
[Figure: AST of mapperTypeElement == null || MapperPrism.getInstanceof(mapperTypeElement) == null (nested bop nodes) and of an if node that guards sos.close(); with the bop condition sos != null]
46Slide47
A Not Very Useful Transform
[Figure: transform A → B, where A : Expr/Stmt and B : Expr/Stmt]
Enumerate all possible AST subtrees with fewer than 10 nodes.
47Slide48
Tradeoff between Coverage and Tractability
A Not Very Useful Transform:
[Figure: transform A → B, where A : Expr/Stmt and B : Expr/Stmt; enumerate all possible AST subtrees with fewer than 10 nodes]
Intractable number of candidate patches:
1. Unable to find correct patches in time
2. Unable to rank correct patches ahead of incorrect ones
48Slide49
Transform Generalization
There are many possible transforms
Different high level templates
Different constraints/parameters
We can obtain one transform from each possible subset of training patches
49Slide50
How to select useful transforms?
50Slide51
Reserve Validation Patches
Validation Patches
Training Human Patches
51Slide52
Reserve Validation Patches
Validation Patches
…
…
Training Human Patches
52Slide53
Evaluate a Transform
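[Figure: transform A → (B C null) D A, where A : Expr, B : Expr, C : {==, !=}, D : { ||, && }]
Written textually: C → (E op1 null) op2 C, where C : Expr, E : Expr, op1 : {==, !=}, op2 : { ||, && }
E satisfies: [generator constraints shown as a figure on the slide]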
53Slide54
Evaluate a Transform
[Figure: a transform, C → (E op1 null) op2 C with C : Expr, E : Expr, op1 : {==, !=}, op2 : { ||, && }, is applied to the validation cases]
Tractability: How many candidate patches does the transform generate in total?
Coverage: Does the transform generate the correct patch?
54Slide55
Evaluate Transforms
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: 400, 200, 100
55Slide56
Evaluate Transforms
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: 100, 50, 25
56Slide57
Evaluate Transforms
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: 20, 5, 10
57Slide58
Evaluate Transforms
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: >10^5, >10^5, >10^5
58Slide59
Goal: Select a Set of Transforms for Validation Cases
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: 400 + 20 = 420, 200 + 5 = 205, 100 + 10 = 110
59Slide60
Another Candidate Set
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: 100 + 20 = 120, 50 + 5 = 55, 25 + 10 = 35
60Slide61
A Suboptimal Set
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: 400 + 100 = 500, 200 + 50 = 250, 100 + 25 = 125
61Slide62
Another Suboptimal Set
Transforms:
C → (E op1 null) op2 C, where op1 : {==, !=} and op2 : { ||, && }
C → E == null || C
S → if (E != null) { S; }
S → S’, where S’ is a new statement with fewer than 10 AST nodes
# of Candidate Patches for the three Validation Cases: >10^5, >10^5, >10^5
62Slide63
Navigate the Tradeoff
Maximize: The number of covered validation cases.
Coverage: At least one of the selected transforms can generate the correct patch for a validation case.
Tractability: The total number of candidate patches for each covered case is less than a threshold.
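The selection problem can be illustrated with a brute-force search over subsets of transforms. This is a sketch only, not the Genesis algorithm: the candidate counts reuse the numbers from the preceding slides on the assumption that each slide's counts belong to the four listed transforms in order, while the coverage matrix, the threshold of 200, and the value 200000 standing in for ">10^5" are hypothetical. Exhaustive enumeration is exponential in the number of transforms and is used here only because the example has four.

```java
class TransformSelection {
    /**
     * Returns a bitmask of selected transforms that maximizes the number of
     * covered validation cases. Case i counts as covered when at least one
     * selected transform generates its correct patch (g[i][j]) and the total
     * number of candidates over all selected transforms (c[i][j]) stays
     * below the threshold.
     */
    static int select(boolean[][] g, long[][] c, long threshold) {
        int cases = g.length;
        int transforms = g[0].length;
        int best = 0;
        int bestCovered = -1;
        for (int mask = 0; mask < (1 << transforms); mask++) {
            int covered = 0;
            for (int i = 0; i < cases; i++) {
                boolean hit = false;
                long total = 0;
                for (int j = 0; j < transforms; j++) {
                    if ((mask & (1 << j)) == 0) continue;
                    hit |= g[i][j];
                    total += c[i][j];
                }
                if (hit && total < threshold) covered++;
            }
            if (covered > bestCovered) {
                bestCovered = covered;
                best = mask;
            }
        }
        return best;
    }

    // Candidate counts per validation case (rows) and transform (columns).
    static final long[][] COUNTS = {
        {400, 100, 20, 200000},
        {200, 50, 5, 200000},
        {100, 25, 10, 200000},
    };

    // Hypothetical coverage: which transform can produce the correct patch.
    static final boolean[][] COVERS = {
        {true, true, false, false},
        {true, true, false, false},
        {false, false, true, true},
    };

    public static void main(String[] args) {
        int mask = select(COVERS, COUNTS, 200);
        System.out.println(Integer.toBinaryString(mask)); // prints 110
    }
}
```

Under these assumptions the search picks the second and third transforms (bits 1 and 2), whose combined counts of 120, 55, and 35 match the "Another Candidate Set" slide and cover all three cases below the threshold.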
63Slide64
ILP Formulation
G_{i,j} is 0 or 1 and denotes whether the j-th transform generates the correct patch for the i-th validation case.
C_{i,j} is the number of candidate patches the j-th transform would generate if applied to the i-th case.
Variables: x_i and y_i are 0 or 1.
x_i denotes whether the resulting search space covers the i-th validation patch.
y_i denotes whether to select the i-th transform.
Objective, Coverage Constraints, Tractability Constraints: [equations shown as images on the slide]
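Since the equations themselves did not survive in the slide text, the following is one plausible way to write an integer linear program that matches the definitions above; it should be read as a sketch, not as the formulation used in Genesis. It indexes transforms by j (so the selection variable is written y_j), and it introduces two assumed constants: a threshold T on the number of candidate patches and a big-M constant M that is at least the largest possible row sum of C, which switches the tractability constraint off for cases that are not counted as covered.

```latex
\begin{align*}
\text{maximize}\quad & \sum_{i} x_i \\
\text{subject to}\quad
  & x_i \le \sum_{j} G_{i,j}\, y_j
    && \text{for every validation case } i \quad \text{(coverage)} \\
  & \sum_{j} C_{i,j}\, y_j \le T + M\,(1 - x_i)
    && \text{for every validation case } i \quad \text{(tractability)} \\
  & x_i \in \{0,1\}, \quad y_j \in \{0,1\}
\end{align*}
```

When x_i = 1 the second constraint reduces to a bound of T on the candidates generated for case i; when x_i = 0 it is slack. Because the counts are integers, a strict "less than a threshold" requirement can be expressed by using T - 1 on the right-hand side.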
64Slide65
Greedy Sampling Algorithm
[Figure: pipeline with the components Training Human Patches, Iterative Greedy Sampling, Work List of Candidate Subsets, Generalization, Transform 1, Transform 2, …, and Evaluate with Validation Patches]
65Slide66
Greedy Sampling Algorithm
[Figure: pipeline with the components Training Human Patches, Iterative Greedy Sampling, Work List of Candidate Subsets, Generalization, Transform 1, Transform 2, …, and Evaluate with Validation Patches]
Produce a ranked list of promising subsets and transforms!
66Slide67
Select Transforms as Search Space
[Figure: Ranked List of Candidate Transforms (Transform 1, Transform 2, …, Transform k) → Integer Linear Program → Search Space as Selected Transforms]
67Slide68
Genesis: Evaluation
68Slide69
Dataset Collection
All revisions in the top 1000 GitHub Java projects & MUSE Java Corpus
Filter out revisions we cannot compile.
Scan keywords in commit logs to identify null pointer exception (NPE) bugs, out of bound (OOB) bugs, and class cast error (CCE) bugs: 503 NPE bugs, 212 OOB bugs, 303 CCE bugs.
Run the JUnit test cases in the repository and identify bugs that can be exposed with the test cases:
20 NPE bugs with test cases; the remaining 483 NPE bugs
13 OOB bugs with test cases; the remaining 199 OOB bugs
16 CCE bugs with test cases; the remaining 287 CCE bugs
69Slide70
Dataset Collection
All revisions in the top 1000 GitHub Java projects & MUSE Java Corpus: 503 NPE bugs, 212 OOB bugs, 303 CCE bugs
Testing Benchmark: 20 NPE bugs, 13 OOB bugs, and 16 CCE bugs with test cases
Training Dataset: the remaining 483 NPE bugs, 199 OOB bugs, and 287 CCE bugs
The resulting dataset comes from more than 350 different projects!
Benchmark
70Slide71
Infer Search Space
We implemented Genesis for Java programs.
Run Genesis to infer search spaces for:
Null-pointer exception (NPE)
Index out-of-bound (OOB)
Class cast exceptions (CCE)
Genesis infers NPE, OOB, and CCE spaces in 24h, 20h, and 14h, respectively.
Run Genesis on 20 NPE errors, 13 OOB errors, and 16 CCE errors in the test benchmark set.
71Slide72
Inferred NPE Transform Examples
Original → patched:
Conjoins or Disjoins a Clause: C → E != null && C; C → E == null || C
If Guarded Control Flow: S → if (E == null) return K; S; S → if (E == null) continue; S
If Guarded Exception Throw: S → if (E == null) throw new Ex(); S
Guard Existing Statements: S → if (E != null) { S; }; S1; …; Sn; → if (E != null) { S1; …; Sn; }
Add Conditional Expression: A.B → (A == null) ? c : A.B;
Switch Invocation Order: A.equals(B) → B.equals(A)
72Slide73
Results for 20 NPE Errors
Repository | Revision | Result
caelumn-stella | 2ec5459 | ✔★
jongo | f46f658 |
caelumn-stella | 2d2dd9c | ✔★
Dataflow JavaSDK | c06125d | ✔★
caelumn-stella | e73113f | ✔★
webmagic | ff2f588 |
HikariCP | ce4ff92 | ✔
javapoet | 70b38e5 |
nuts | 80e85d0 |
closure-compiler | 9828574 | ✔★
spring-data-rest | aa28aeb | ✔★
truth | 99b314e |
checkstyle | 8381754 | ✔★
error-prone | 3709338 | ✔★
checkstyle | 536bc20 | ✔★
javaslang | faf9ac2 | ✔★
checkstyle | aaf606e |
Activiti | 3d624a5 | ✔★
checkstyle | aa829d4 |
spring-hateoas | 48749e7 | ✔★
✔: Generate correct patch
★: One of the top 3 generated patches is correct
73Slide74
Inferred OOB Transform Examples
Original → patched:
Check Array Size: C → A.length == 0 || C; C → A.length != 0 || C
Check Index Off-by-one: C → A.length == idx || C; C → A.length != idx || C
Conjoins a Clause: C → E >= 0 && C
If Guarded Control Flow: S → if (E < 0) return K; S; S → if (E < 0) continue; S; S → if (E < 0) break; S
Modifications for Off-by-one: E → E + 1; A <= B → A < B; A > B → A >= B
Add Conditional Expression: E → v > 0 ? E : c
74Slide75
Results for 13 OOB Errors
Repository | Revision | Result
Bukkit | a91c4c6 | ✔★
named-regexp | 82bdfeb |
RoaringBitmap | 29c6d59 |
jgit | 929862f | ✔★
commons-lang | 52b46e7 |
jPOS | df400ac | ✔★
HdrHistogram | db18018 |
httpcore | dd00a9e | ✔★
spring-hateoas | 29b4334 |
vectorz | 2291d0d | ✔
wicket | b708e2b | ✔
maven-shared | 77937e1 |
coveralls-maven-plugin | 20490f6 |
✔: Generate correct patch
★: One of the top 3 generated patches is correct
75Slide76
Inferred CCE Transform Examples
Original → patched:
Add "instanceof" Clause: C → E instanceof T && C; C → !(E instanceof T) || C
Change Variable Type: T1 v → T2 v; for (T1 v : E) → for (T2 v : E), where T2 is a super class of T1
Change Cast Type: v = ((T1) E).func() → v = ((T2) E).func(), where T2 is a super class of T1
Add Try-catch Block: S → try { S; } catch (…) { }; S1; …; Sn; → try { S1; …; Sn; } catch (...) { }
Call Function to Convert Expression: E → E.func(); E → func(E)
76Slide77
Results for 16 CCE Errors
Repository | Revision | Result
jade4j | dd47397 | ✔★
htmlelements | bf3f275 | ✔
jade4j | 114e886 | ✔★
spring-cloud-connectors | 56c6eca |
HdrHistogram | 030aac1 | ✔
joinmo | a5ee885 |
pdfbox | 93c0b69 |
buildergenerator | d9d73b3 |
tree-root | fef0f36 |
mybatis-3 | 809c35d |
spoon | 48d3126 |
antlr4 | 9e7b131 |
pebble | 942aa6e |
hamcrest-bean | 84586d9 | ✔★
fastjson | c886874 |
raml-java-parser | 49aab8f |
✔: Generate correct patch
★: One of the top 3 generated patches is correct
77Slide78
Comparison with PAR
# of cases with correct patches
78Slide79
Conclusion
The Genesis-inferred transforms and search spaces enable the patch generation system to successfully repair three common domains of Java errors.
They consistently outperform manual transform rules.
The Genesis inference algorithm provides a systematic way to summarize useful human software engineering patterns.
79