Multi-Abstraction Concern Localization

Tien-Duy B. Le, Shaowei Wang, and David Lo
School of Information Systems, Singapore Management University



Presentation Transcript

Slide1

Multi-Abstraction Concern Localization

Tien-Duy B. Le, Shaowei Wang, and David LoSchool of Information SystemsSingapore Management University

1

Slide2

Motivation

Concern localization: locating code units that match a textual description

Textual descriptions: bug reports or feature requests
Code units: the source code of classes or methods
Documents are compared based on the words (IR) or topics (topic modeling) they contain

Comparison happens at one level of abstraction, i.e. the word/topic level

2

Slide3

Motivation

A word can be abstracted at multiple levels of abstraction.

Example: Eindhoven (Level 1) → North Brabant (Level 2) → Netherlands (Level 3) → Western Europe → European Continent (Level N)

3

Slide4

Multi-Abstraction Concern Localization

4

Source Code: Level 1, Level 2, Level 3, …, Level N
Bug Report or Feature Request: Level 1, Level 2, Level 3, …, Level N

The two hierarchies are compared.

Slide5

Multi-Abstraction Concern Localization

Locating code units that match a textual description:

By comparing documents at multiple abstraction levels
By leveraging multiple topic models

3 main components:

Text preprocessing

Hierarchy creation

Multi-abstraction retrieval technique

5

Slide6

Overall framework

Method corpus and concerns → Preprocessing → Hierarchy Creation → Abstraction Hierarchy (Level 1, Level 2, …, Level N)

Abstraction Hierarchy + Standard Retrieval Technique → Multi-Abstraction Retrieval → Ranked Methods per Concern

6

Slide7

Hierarchy Creation

We apply Latent Dirichlet Allocation (LDA) a number of times.

LDA (with default settings) accepts:
Number of topics K
A set of documents

LDA returns:
K topics, each a distribution over words
The probability of topic t appearing in document d

7

Slide8

Hierarchy Creation

Each application of LDA creates a topic model with K topics:
Assigned to a document
Corresponds to an abstraction level

An abstraction hierarchy of height L:
Height = number of topic models
Created by L applications of LDA

8
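The loop above can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: `fit_lda` is a hypothetical stand-in for a real LDA library (e.g. gensim), and the uniform topic distributions it returns are placeholders rather than real inference.

```python
# Sketch of hierarchy creation: run LDA L times, each time with a
# different number of topics K; each run yields one abstraction level.
def fit_lda(documents, num_topics):
    # Hypothetical stand-in: a real LDA would infer per-document topic
    # distributions; here each document gets a uniform distribution.
    return {doc_id: [1.0 / num_topics] * num_topics
            for doc_id in range(len(documents))}

def build_hierarchy(documents, topic_counts):
    # One topic model per level; topic_counts = [50, 100, 150]
    # gives a hierarchy of height L = 3.
    return [fit_lda(documents, k) for k in topic_counts]

docs = ["fix null pointer in parser", "add logging to compiler"]
hierarchy = build_hierarchy(docs, [50, 100, 150])
print(len(hierarchy))        # height L = 3
print(len(hierarchy[0][0]))  # 50 topics at level 1
```

Each element of `hierarchy` corresponds to one abstraction level, so the height of the hierarchy equals the number of LDA applications.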

Slide9

Multi-Abstraction Vector Space Model

Multi-Abstraction Vector Space Model (VSM) = standard VSM + abstraction hierarchy

In the standard Vector Space Model:
A document is represented as a vector of weights
Each element corresponds to a word
Its value is the weight of the word: term frequency-inverse document frequency (tf-idf)

9
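The standard VSM weighting can be illustrated as follows. This is a minimal sketch with an invented toy corpus, using a plain tf-idf formulation (raw term frequency times log inverse document frequency); the paper does not specify its exact tf-idf variant.

```python
import math
from collections import Counter

def tfidf_vector(doc, corpus, vocab):
    """Represent `doc` as a vector of tf-idf weights, one per vocabulary word."""
    n = len(corpus)
    counts = Counter(doc)
    vec = []
    for word in vocab:
        tf = counts[word]                          # term frequency in this doc
        df = sum(1 for d in corpus if word in d)   # document frequency
        idf = math.log(n / df) if df else 0.0      # inverse document frequency
        vec.append(tf * idf)
    return vec

corpus = [["parser", "crash", "null"],
          ["parser", "logging"],
          ["compiler", "logging"]]
vocab = sorted({w for d in corpus for w in d})
v = tfidf_vector(corpus[0], corpus, vocab)
print(len(v) == len(vocab))  # True: one weight per vocabulary word
```

Words absent from the document get weight 0, and words that occur in many documents are down-weighted by the idf factor.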

Slide10

Multi-Abstraction Vector Space Model

We extend document vectors. Added elements:
Topics of the topic models in the abstraction hierarchy
Their values are the probabilities of the topics appearing in the document

Example:
The document vector has length 10
The abstraction hierarchy has 3 topic models of sizes 50, 100, 150
The extended document vector has size: 10 + (50 + 100 + 150) = 310

10
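The vector extension in this example can be sketched directly. This is a toy sketch matching the slide's numbers; the uniform topic probabilities are placeholders for what a real topic model would produce.

```python
# Extend a tf-idf document vector with the document's topic
# probabilities from each topic model in the abstraction hierarchy.
def extend_vector(word_weights, topic_distributions):
    extended = list(word_weights)
    for dist in topic_distributions:  # one distribution per level
        extended.extend(dist)
    return extended

word_weights = [0.0] * 10                            # length-10 word vector
hierarchy = [[1.0 / k] * k for k in (50, 100, 150)]  # placeholder topic models
v = extend_vector(word_weights, hierarchy)
print(len(v))  # 10 + (50 + 100 + 150) = 310
```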

Slide11

Experiments

Dataset: 285 AspectJ faulty versions extracted from iBugs
Evaluation metric: Mean Average Precision (MAP)

11

Hierarchies    Number of Topics
H1             50
H2             50, 100
H3             50, 100, 150
H4             50, 100, 150, 200
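Mean Average Precision can be computed as follows. This is a minimal sketch of the standard MAP definition with an invented toy ranking, not the paper's evaluation harness.

```python
def average_precision(ranked_ids, relevant):
    """AP: mean of precision@k over each rank k where a relevant item appears."""
    hits, precisions = 0, []
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """MAP: mean of AP over all queries (here, concerns)."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

# Toy example: two concerns with known relevant methods.
queries = [(["m1", "m2", "m3"], {"m1", "m3"}),  # AP = (1/1 + 2/3) / 2
           (["m4", "m5"], {"m5"})]              # AP = 1/2
print(round(mean_average_precision(queries), 4))  # 0.6667
```

Each concern's ranked list of methods contributes one average-precision score, and MAP averages those scores across all concerns.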

Slide12

Empirical Result

                MAP      Improvement over Baseline
Baseline (VSM)  0.0669   N/A
H1              0.0715   6.82%
H2              0.0777   16.11%
H3              0.0787   17.65%
H4              0.0799   19.36%

The MAP improvement of H4 is 19.36%

The MAP is improved when the height of the abstraction hierarchy is increased

12

Slide13

Empirical Result

Improvement (p)  H1    H2    H3    H4
…                21    27    30    30
…                25    22    25    22
…                18    14    12    11
…                113   64    42    41
…                108   158   176   181

Number of concerns with various Improvements:

 The improvements are positive for most of the concerns

13

Slide14

Conclusion

We propose a multi-abstraction concern localization framework
We also propose a multi-abstraction vector space model
Our experiments on 285 AspectJ bugs show a MAP improvement of up to 19.36%

14

Slide15

Future work

Extend experiments by investigating:
Different numbers of topics in each level of the hierarchy
Different hierarchy heights
Different topic models

Topic Model                  Word Ordering      Word Correlation
Latent Dirichlet Allocation  Bag of Words       No
Pachinko Allocation Model    Bag of Words       Yes
Syntactic Topic Model        Sequence of Words  No

15

Slide16

Future work

Analyze the effects of document lengths:
For different numbers of topics
For different hierarchy heights

Experiment with Panichella et al.'s method [1] to infer good LDA configurations for our approach

[1] A. Panichella, B. Dit, R. Oliveto, M. Di Penta, D. Poshyvanyk, and A. De Lucia. How to effectively use topic models for software engineering tasks? An approach based on genetic algorithms. (ICSE 2013)

16

Slide17

17

Thank you!

Questions? Comments? Advice?

{btdle.2012, shaoweiwang.201, davidlo}@smu.edu.sg