Automatic Model Generation from Documentation for Java API Functions

Uploaded On 2023-06-25

Presentation Transcript

1. Automatic Model Generation from Documentation for Java API Functions
Juan Zhai, Jianjun Huang, Shiqing Ma, Xiangyu Zhang, Lin Tan, Jianhua Zhao, Feng Qin

2. Motivation
[Figure: App, Libs]
Libraries
- Part of software behaviors
- As important as the software itself
Challenges
- Binary only, no source code
- Implemented in different languages
- Complex optimizations
- Many more...
Previous work: generate models manually
Automatically generate models for libraries

3. How to model library behaviors?
Javadoc:
    public void add(int index, E element)
    Inserts the specified element at the specified position in this list. Shifts the element currently at that position (if any), and any subsequent elements to the right.
    Throws: IndexOutOfBoundsException: if the index is out of range (index < 0 || index > size())
Model:
    public void add(int index, E element) {
        if (index < 0 || index > size())
            throw new IndexOutOfBoundsException();
        size = size + 1;
        for (int i = size - 1; i > index; i--)
            elements[i] = elements[i - 1];
        elements[index] = element;
    }
- Q: How do we know how to use libraries?
- A: By reading API documents.
- Q: Can we model libraries from API documents?
- A: Yes. This is what we do.

4. Design: Overview
[Figure: pipeline diagram]
- Components: Text Analyzer, Tree Transformer, Model Generator, Model Validator
- Data: Javadoc, Syntactic Tree, Tree Variants

5. Design: Text Analyzer
- State-of-the-art work: Stanford Parser
- Not perfect because of the nature (e.g. ambiguity) of natural languages

6. Sentence: "Returns the head of this deque, or null if this deque is empty."
[Figure: parse tree before and after the Lift Up step]
- Before: reads as "Returns the head of this deque, OR returns the head of null if this deque is empty."
- After Lift Up: reads as "Returns the head of this deque, OR returns null if this deque is empty."

7. Design: Tree Node Transformer
Ambiguities of natural languages
- K tree candidates
- Highest-scoring tree is not always the correct one
- Too much time to try all candidates
Observation: caused by phrases starting with ", or" and ", and"
Solution: lift up & push down ",", "or / and" and all their right siblings
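The lift-up step can be pictured with a small tree rewrite. The sketch below is only an illustration under assumptions: the `Node` class, the `liftUpOnce` helper and the hand-built parse of the deque sentence are hypothetical and are not taken from the slides or from the Stanford Parser API. Each call moves a ", or" / ", and" group together with all of its right siblings one level up, so repeated calls enumerate tree variants.

```java
import java.util.ArrayList;
import java.util.List;

public class LiftUpSketch {
    /** Hypothetical parse-tree node: leaves carry words, inner nodes carry a phrase label. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
        boolean isLeaf() { return children.isEmpty(); }
        /** Words of the sentence in order; a lift must never change this. */
        String text() {
            if (isLeaf()) return label;
            StringBuilder sb = new StringBuilder();
            for (Node c : children) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(c.text());
            }
            return sb.toString();
        }
        /** Bracketed form that makes the attachment level visible. */
        String bracketed() {
            if (isLeaf()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c.bracketed());
            return sb.append(')').toString();
        }
    }

    /** Index of a "," leaf directly followed by an "or"/"and" leaf, or -1. */
    static int conjunctionStart(Node n) {
        for (int j = 0; j + 1 < n.children.size(); j++) {
            Node a = n.children.get(j), b = n.children.get(j + 1);
            if (a.isLeaf() && a.label.equals(",") && b.isLeaf()
                    && (b.label.equals("or") || b.label.equals("and"))) return j;
        }
        return -1;
    }

    /** Lifts the deepest ", or"/", and" group (and its right siblings) one level up. */
    static boolean liftUpOnce(Node parent) {
        for (int i = 0; i < parent.children.size(); i++) {
            Node child = parent.children.get(i);
            if (liftUpOnce(child)) return true;
            int j = conjunctionStart(child);
            if (j > 0) {
                List<Node> span = child.children.subList(j, child.children.size());
                List<Node> tail = new ArrayList<>(span);
                span.clear();                              // detach from the child
                parent.children.addAll(i + 1, tail);       // re-attach next to it
                return true;
            }
        }
        return false;
    }

    /** Hand-built mis-parse: the ", or" group hangs under "this deque". */
    static Node example() {
        Node inner = new Node("NP", new Node("this deque"), new Node(","),
                new Node("or"), new Node("null if this deque is empty"));
        Node pp = new Node("PP", new Node("of"), inner);
        Node np = new Node("NP", new Node("the head"), pp);
        return new Node("VP", new Node("Returns"), np);
    }

    public static void main(String[] args) {
        Node root = example();
        System.out.println(root.bracketed());
        while (liftUpOnce(root)) System.out.println(root.bracketed());
    }
}
```

Every printed line is one candidate tree; which variant is kept would be decided by a later stage, which this sketch does not model.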

8. Design: Generators
Javadoc -> Java code (Model)
- Formal definition -> package, class, function metadata (copy & paste)
- Natural language -> function body (IR & Model generator)
  - Variables, structures, operations

9. Design: IR Generator -- Variables
- Internal variables: any names
- Parameters: identify from documents
    add(int index, E element)
    Inserts the specified element at the specified position in this list.
    -> Inserts element at index in this list.
    Pattern: the specified $(word)
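The "the specified $(word)" pattern can be sketched as a regular-expression rewrite. This is a minimal illustration, not the authors' implementation: the class name, the `rewrite` method and especially the synonym table that maps "position" to the parameter `index` are assumptions introduced here, since the slide does not say how a word that differs from the parameter name is resolved.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParamRewriteSketch {
    // Matches the phrase "the specified <word>" and captures <word>.
    private static final Pattern SPECIFIED = Pattern.compile("the specified (\\w+)");

    /**
     * Replaces "the specified <word>" with a parameter name when <word> is a
     * parameter itself or maps to one through the (hypothetical) synonym table.
     * Phrases that cannot be resolved are left unchanged.
     */
    static String rewrite(String sentence, Set<String> params, Map<String, String> synonyms) {
        Matcher m = SPECIFIED.matcher(sentence);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String word = m.group(1);
            String name = params.contains(word) ? word : synonyms.get(word);
            String replacement = (name != null && params.contains(name)) ? name : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> params = new HashSet<>(Arrays.asList("index", "element"));
        Map<String, String> synonyms = Collections.singletonMap("position", "index"); // assumption
        System.out.println(rewrite(
                "Inserts the specified element at the specified position in this list.",
                params, synonyms));
    }
}
```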

10. Design: IR Generator -- Program structures
- Sequential: default
- Loop structure
  - Plurals
  - Singular nouns modified by "each"
  - "the first/last occurrence" indicates the loop iteration order
- Conditional structure
  - if/when
  - otherwise
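These cues can be read as a keyword classifier over a documentation sentence. The sketch below is a simplification under assumptions: the `Structure` enum and `classify` method are hypothetical names, and the plural-noun cue is left out because it needs part-of-speech information that plain string matching does not provide.

```java
import java.util.EnumSet;

public class StructureSketch {
    enum Structure { SEQUENTIAL, LOOP_FORWARD, LOOP_BACKWARD, CONDITIONAL }

    /** Returns every structure whose cue words appear; SEQUENTIAL is the default. */
    static EnumSet<Structure> classify(String sentence) {
        // Lower-case and collapse every non-letter run to one space so that
        // cue words can be matched as whole words.
        String s = " " + sentence.toLowerCase().replaceAll("[^a-z]+", " ") + " ";
        EnumSet<Structure> found = EnumSet.noneOf(Structure.class);
        if (s.contains(" last occurrence ")) {
            found.add(Structure.LOOP_BACKWARD);   // iterate from the end
        } else if (s.contains(" first occurrence ") || s.contains(" each ")) {
            found.add(Structure.LOOP_FORWARD);    // iterate from the start
        }
        if (s.contains(" if ") || s.contains(" when ") || s.contains(" otherwise ")) {
            found.add(Structure.CONDITIONAL);
        }
        if (found.isEmpty()) found.add(Structure.SEQUENTIAL);
        return found;
    }

    public static void main(String[] args) {
        System.out.println(classify(
                "Returns the index of the first occurrence of the specified element "
                + "in this list, or -1 if this list does not contain the element."));
        System.out.println(classify(
                "Inserts the specified element at the specified position in this list."));
    }
}
```

A sentence can trigger more than one structure, which is why the result is a set rather than a single value.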

11. Design: Model Generator
Tile the IR (tree)
[Figure: the IR tree of a "Returns the index of o ..., or -1 if this ... contain o" sentence divided into tiles T, L and R, next to a list of tree patterns (for example throw, contain, copy ... into, is empty, return ... or)]

12. Design: Model Generator
- Unified data structure model: one-dimensional array
- Primitive: a tree pattern corresponds to a piece of code template
    Pattern:  insert o1 at o2 (tag:this)            ->  elements[o2] = o1
    Instance: insert element at index (tag:this)    ->  elements[index] = element
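The pattern-to-template step can be illustrated by binding placeholders and substituting them into a code template. This sketch is an assumption-laden simplification: it flattens the tree pattern into a token sequence, whereas the slides describe tiling a tree, and the `instantiate` helper is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateSketch {
    // Placeholders are written o1, o2, ... as on the slide.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\bo\\d+\\b");

    /**
     * Matches a flattened IR fragment against a flattened pattern and, on
     * success, returns the code template with placeholders replaced by the
     * matched words. Returns null when the fragment does not fit the pattern.
     */
    static String instantiate(String pattern, String template, String ir) {
        String[] p = pattern.split(" ");
        String[] t = ir.split(" ");
        if (p.length != t.length) return null;
        Map<String, String> bindings = new HashMap<>();
        for (int i = 0; i < p.length; i++) {
            if (PLACEHOLDER.matcher(p[i]).matches()) {
                bindings.put(p[i], t[i]);          // bind placeholder to IR word
            } else if (!p[i].equals(t[i])) {
                return null;                       // fixed word mismatch
            }
        }
        // Single pass over the template so bound values are never re-substituted.
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer code = new StringBuffer();
        while (m.find()) {
            String value = bindings.get(m.group());
            m.appendReplacement(code, Matcher.quoteReplacement(value != null ? value : m.group()));
        }
        m.appendTail(code);
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println(instantiate(
                "insert o1 at o2", "elements[o2] = o1", "insert element at index"));
    }
}
```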

13. Design: Model Generator
Tile the IR (tree): code generated for tiles L, R and T
L:
    int index1 = -1;
    if (o == null) {
        for (int i = 0; i < size; i++) {
            if (element[i] == null) { index1 = i; break; }
        }
    } else {
        for (int i = 0; i < size; i++) {
            if (o.equals(element[i])) { index1 = i; break; }
        }
    }
R:
    int index2 = -1;
    if (o == null) {
        for (int i = 0; i < size; i++) {
            if (element[i] == null) { index2 = i; break; }
        }
    } else {
        for (int i = 0; i < size; i++) {
            if (o.equals(element[i])) { index2 = i; break; }
        }
    }
T:
    if (index2 == -1) return -1;
    else return index1;
[Figure: the IR tree from slide 11 with tiles T, L and R]

14. Design: Model Validator
- Randoop
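The slide names only Randoop, a random test generator for Java, and does not show how it is driven. As a rough illustration of the underlying idea, the sketch below checks a candidate model against the real class by replaying random calls on both and comparing outcomes. It does not use Randoop; it uses `java.util.Random`, and the `ListModel` class (the add model from slide 3 with array growth added so it can run) and the `agrees` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class ValidatorSketch {
    /** Candidate model of add(int, E) over a one-dimensional array. */
    static final class ListModel {
        Object[] elements = new Object[4];
        int size = 0;

        void add(int index, Object element) {
            if (index < 0 || index > size) throw new IndexOutOfBoundsException();
            // Growth is an addition of this sketch so the array never overflows.
            if (size == elements.length) elements = Arrays.copyOf(elements, size * 2);
            size = size + 1;
            for (int i = size - 1; i > index; i--) elements[i] = elements[i - 1];
            elements[index] = element;
        }

        List<Object> snapshot() {
            return new ArrayList<>(Arrays.asList(elements).subList(0, size));
        }
    }

    /** Replays random add calls on the model and on ArrayList; true if they never diverge. */
    static boolean agrees(long seed, int calls) {
        Random rnd = new Random(seed);
        ListModel model = new ListModel();
        List<Object> real = new ArrayList<>();
        for (int k = 0; k < calls; k++) {
            int index = rnd.nextInt(real.size() + 3) - 1;   // -1 .. size+1, includes bad indexes
            Integer value = rnd.nextInt(100);
            boolean modelThrew = false, realThrew = false;
            try { model.add(index, value); } catch (IndexOutOfBoundsException e) { modelThrew = true; }
            try { real.add(index, value); } catch (IndexOutOfBoundsException e) { realThrew = true; }
            if (modelThrew != realThrew || !model.snapshot().equals(real)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agrees(42L, 200));
    }
}
```

A model candidate that diverged from the library on any replayed call would be rejected; agreement on random calls is evidence, not proof, of equivalence.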

15. Evaluation Setup
Hardware
- CPU: Intel® i7-3770
- RAM: 8GB
Operating system
- Ubuntu 12.04

16. Evaluation: Overall Result
Class                 # Total Methods   # Modeled Methods   %
ArrayList             34                29                  85.29%
Vector                54                46                  85.19%
Stack                 6                 5                   83.33%
ArrayDeque            36                35                  97.22%
LinkedList            42                41                  97.62%
HashMap               28                23                  82.14%
LinkedHashMap         15                14                  93.33%
HashSet               13                12                  92.31%
LinkedHashSet         5                 4                   80.00%
AttributeList         15                11                  73.33%
RoleList              14                9                   64.29%
RoleUnresolvedList    14                9                   64.29%
StringBuffer          54                40                  74.07%
StringBuilder         54                40                  74.07%
Summary               397               326                 82.12%

17. Evaluation: Cases that Cannot be Handled
Incompleteness of API documents
- add(int index, Object element) in AttributeList: the documentation lacks a description of the IndexOutOfBoundsException
One primitive behavior described with several sentences
- insert(int index, char[] str, int offset, int len) in StringBuffer: "Inserts the string representation of a subarray of the str array argument into this sequence. The subarray begins at the specified offset and extends len chars. The characters of the subarray are inserted into this sequence at the position indicated by index."

18. Evaluation: Static Taint Analysis
- Undesirable information flow
- Compare paths found by using our model vs. the official JDK
Setup
- Android: 96 apps
- Sources: user input
- Sinks: Internet, Log

19. Evaluation: Static Taint Analysis
Results
- The same set of information leak warnings for both versions for almost all apps, except app com.yes123.mobile
Case: com.yes123.mobile
- Our model: 16 paths vs. JDK: 14 paths
- Java Native Interface (JNI) function call: toArray(Object[]) -> System.arraycopy()

20. Evaluation: Static Taint Analysis
Efficiency improvement distribution
- Maximum: ~50%
- Average: ~16%

21. Evaluation: Dynamic Slicing
Our Models vs. Naïve Models
Columns: Program; Naïve Models (Slice Size, Time); Our Models (Slice Size, Time)
SPECJBB        5641.763930.73
FunkyJFilter   6292.1852.16
ListAppend     705012795045.9
Batik          357211973.655516295.77
Unit Test      320.3930.36
- ~32 times smaller dynamic slice size
- ~17% performance improvement

22. Related Work
Documentation Analysis
- Sarah [ICSE ’16], Zhong [OOPSLA ’13, ASE ’09], Tan [ICST ’12, ICSE ’11, SOSP ’07], Pandita [ICSE ’12], Sun [ICSE ’10], Sinha [ICST ’10], Runeson [ICSE ’07], Henkel [TSE ’07]
Environment Modeling
- Jeon [ICSE ’16], Merwe [SEN ’15], Ceccarello [SEN ’14], Palepu [ASE ’13], Qi [WCRE ’12], Cadar [OSDI ’08], Tkachuk [ASE ’03]

23. Conclusion
- Idea: modeling Java libraries from Java API documents
- A combination of NLP and auto-testing
Advantages
- Expected behaviors with simpler code (no JNI code, no other languages, etc.)
- Helps many program analysis techniques
