Slide1
Predicting Bugs Using Antipatterns
Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan
Slide2
Slide3
Predict Bugs
Model
Code
Antipatterns
Past Defects, History of Churn (Zimmermann, Hassan et al.)
Topic Modeling (Chen et al.)
Slide4
Slide5
Antipatterns
Weaknesses in design
Not technically incorrect and don't prevent a system from functioning
Slide6
Indicate a deeper problem in the system
Slide7
Motivation
Antipatterns indicate weaknesses in the design that may increase the risk for bugs in the future. (Fowler 1999)
Slide8
Motivation
There are not many refactoring activities when developing a system. (Olbrich et al.)
Slide9
Approach
CVS Repository
Mining Source Code Repositories
Detecting Antipatterns
Mining Bug Repositories
Bugzilla
Calculating Metrics
Analyzing: RQ1, RQ2, RQ3
Slide10
Studied Systems
Mining Source Code Repositories
Systems    Release (#)          Churn      LOCs
Eclipse    2.0 - 3.3.1 (12)     148,454    26,209,669
ArgoUML    0.12 - 0.26.2 (9)    21,427     2,025,730
Slide11
Approach
Slide12
Detecting Antipatterns
13 different antipatterns
DECOR (Moha et al.)
[Figure: # of Antipatterns, # Files]
Systems    # Antipatterns
Eclipse    273,766
ArgoUML    15,100
Slide13
Approach
Systems    # Post Bugs    # Pre Bugs
Eclipse    27,406         23,554
ArgoUML    2,549          2,569
Slide14
Research Questions
RQ1: Do antipatterns affect the density of bugs in files?
RQ2: Do the proposed antipattern based metrics provide additional explanatory power over traditional metrics?
RQ3: Can we improve traditional bug prediction models with antipatterns information?
Slide15
RQ1: Do antipatterns affect the density of bugs in files?
Null Hypothesis
Density of bugs in the files with antipatterns and the other files without antipatterns is the same.
Wilcoxon rank sum test
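The comparison behind RQ1 can be sketched as below. This is a minimal illustration, not the authors' implementation: it uses the normal approximation without continuity correction, the function name `rank_sum_test` is hypothetical, and the two density lists are invented values, not data from the study.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected).

    Returns (W, p), where W is the sum of the ranks of sample x in the
    pooled data and p is the two-sided p-value.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2

    w = 0.0          # rank sum of sample x
    tie_term = 0.0   # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2   # average of the ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        w += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    mean = n1 * (n + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return w, 1.0   # all values identical: no evidence of a difference
    z = (w - mean) / sqrt(var)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, p


# Hypothetical bug densities, for illustration only.
density_with_antipatterns = [0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7, 3.0]
density_without_antipatterns = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

w, p = rank_sum_test(density_with_antipatterns, density_without_antipatterns)
print("reject null hypothesis at the 5% level:", p < 0.05)
```

For real analyses a statistics library is the safer choice, for example `scipy.stats.ranksums` in SciPy. The hand-written version is only meant to show what the test computes.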
Slide16
RQ1: Do antipatterns affect the density of bugs in files?
[Figure: Density of Bugs, Files with Antipatterns vs. Files without Antipatterns]
Systems    Releases (#)    DA – DNA > 0    p-value < 0.05
Eclipse    12              8               8
ArgoUML    9               6               6
Slide17
Research Questions
Slide18
RQ2: Metrics
Average Number of Antipatterns (ANA)
Antipattern Cumulative Pairwise Differences (ACPD)
Antipattern Recurrence Length (ARL)
Antipattern Complexity Metric (ACM)
Slide19
RQ2: Example
          1.0  2.0  3.0  4.0  5.0  6.0
a.java    3    4    0    2    1    3
b.java    4    5    1    0    0    3
c.java    0    6    5    4    5    4
ANA(a.java) = 2.16, ARL(a.java) = 18.76, ACPD(a.java) = 0
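Two of the example values can be reproduced with a short sketch. The definitions here are assumptions inferred from the metric names and checked only against the a.java values on this slide: ANA as the mean antipattern count per release, and ACPD as the sum of signed differences between consecutive releases. The per-file counts assume the example grid is read row by row (one row per file, one column per release). ARL is left out because the slides do not state its formula.

```python
# Antipattern counts per release (1.0 .. 6.0), read row by row from the
# example grid. The row-by-row reading is an assumption.
counts = {
    "a.java": [3, 4, 0, 2, 1, 3],
    "b.java": [4, 5, 1, 0, 0, 3],
    "c.java": [0, 6, 5, 4, 5, 4],
}


def ana(per_release):
    """Average Number of Antipatterns: mean count over all releases (assumed)."""
    return sum(per_release) / len(per_release)


def acpd(per_release):
    """Antipattern Cumulative Pairwise Differences: sum of signed differences
    between consecutive releases (assumed definition)."""
    return sum(b - a for a, b in zip(per_release, per_release[1:]))


a = counts["a.java"]
print("ANA(a.java)  =", ana(a))    # 13 / 6, shown truncated as 2.16 on the slide
print("ACPD(a.java) =", acpd(a))   # 0, matching the slide
```

With signed differences the sum telescopes to the last count minus the first, which is why a.java (3 in release 1.0 and 3 in release 6.0) yields 0. Using absolute differences would give 10 and would not match the slide.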
Slide20
Slide21
Provide additional explanatory power over traditional metrics
ARL shows the biggest improvement
Slide22
Research Questions
Slide23
RQ3: Can we improve traditional bug prediction models with antipatterns information?
Intra System Models
Step-wise analysis
Removing Independent Variables
Collinearity Analysis
Metric name    Description
LOC            Source lines of code
MLOC           Executable lines of code
PAR            Number of parameters
NOF            Number of attributes
NOM            Number of methods
NOC            Number of children
VG             Cyclomatic complexity
DIT            Depth of inheritance tree
LCOM           Lack of cohesion of methods
NOT            Number of classes
WMC            Number of weighted methods per class
PRE            Number of pre-release bugs
Churn          Number of lines of code added, modified, or deleted
Slide24
ARL remained statistically significant and had a low collinearity with other metrics
[Figure: axis label # Versions]
Slide25
RQ3: Can we improve traditional bug prediction models with antipatterns information?
F-measure
ARL can improve cross-system bug prediction on the two studied systems
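For reference, the F-measure used to score the prediction models can be computed as in the sketch below. The slide only names the measure, so this assumes the balanced form (the harmonic mean of precision and recall). The function name and the counts in the usage line are hypothetical.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure: harmonic mean of precision and recall.

    true_pos:  buggy files predicted as buggy
    false_pos: clean files predicted as buggy
    false_neg: buggy files predicted as clean
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts, for illustration only.
print(f_measure(true_pos=8, false_pos=2, false_neg=4))
```

A model improves under this measure only if it raises precision, recall, or both without giving up too much of the other.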
Slide26
Slide27
Backup Slides
Slide28
RQ2) Example
Slide29
Anti Singleton
Blob
Class Data Should be Private
Complex Class
Large Class
Lazy Class
LPL (Long Parameter List)
Long Method
Message Chain
RPB (Refused Parent Bequest)
Spaghetti Code
SG (Speculative Generality)
Swiss Army Knife
Slide30
Slide31
RQ1) Do antipatterns affect the density of bugs in files?
Hypothesis
There is no difference between the density of future bugs of the files with antipatterns and the other files without antipatterns.
Wilcoxon rank sum test
We perform a Wilcoxon rank sum test to accept or reject the hypothesis, using the 5% level (i.e., p-value < 0.05).
Findings
In general, the density of bugs in a file with antipatterns is higher than the density of bugs in a file without antipatterns.