On Abstraction Refinement for Program Analyses in Datalog (Xin Zhang)

Uploaded On 2019-11-03



Presentation Transcript

Slide 1: On Abstraction Refinement for Program Analyses in Datalog
Xin Zhang, Ravi Mangal, Mayur Naik (Georgia Tech)
Radu Grigore, Hongseok Yang (Oxford University)

Slide 2: Datalog for program analysis (6/10/2014; Programming Language Design and Implementation, 2014)

Slide 3: What is Datalog?

Slide 4: What is Datalog?
Input relations: edge(i, j).
Output relations: path(i, j).
Rules:
  (1) path(i, i).
  (2) path(i, k) :- path(i, j), edge(j, k).
Input: edge(0, 1), edge(1, 2).
Least fixpoint computation:
  path(0, 0).
  path(1, 1).
  path(2, 2).
  path(0, 1) :- path(0, 0), edge(0, 1).
  path(0, 2) :- path(0, 1), edge(1, 2).
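To make the least fixpoint computation concrete, here is a minimal Python sketch of naive bottom-up evaluation for the two path rules. It is not how a Datalog engine such as bddbddb evaluates rules. The function name `least_fixpoint` is hypothetical, and the sketch assumes that rule (1) ranges over the nodes that occur in some edge tuple.

```python
def least_fixpoint(edges):
    """Naive bottom-up evaluation of the two path rules."""
    # Assumption: the domain of rule (1) is the set of nodes seen in edge tuples.
    nodes = {n for edge in edges for n in edge}
    # Rule (1): path(i, i).
    path = {(i, i) for i in nodes}
    changed = True
    while changed:
        # Rule (2): path(i, k) :- path(i, j), edge(j, k).
        derived = {(i, k) for (i, j) in path for (j2, k) in edges if j == j2}
        changed = not derived <= path
        path |= derived
    return path


paths = least_fixpoint({(0, 1), (1, 2)})
print(sorted(paths))
# [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
```

The result also contains path(1, 2), which follows from path(1, 1) and edge(1, 2) by rule (2), although the slide does not list that derivation.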

Slide 5: Why Datalog?
If there exists a path from a to b, and there is an edge from b to c, then there exists a path from a to c:
  path(a, c) :- path(a, b), edge(b, c).

Slide 6: Why Datalog?
k-object-sensitivity, k = 2, ~100KLOC

Slide 7: Limitation
k-object-sensitivity, k = 2, ~100KLOC
k-object-sensitivity, k = 10, ~500KLOC

Slide 8: Program abstraction
[Diagram labels: Precision, Scalability, Abstraction]

Slide 9: Parametric program abstraction
[Diagram labels: Precision, Scalability, Abstraction; abstraction parameter: 11111]

Slide 10: Parametric program abstraction
[Diagram labels: Precision, Scalability, Abstraction; abstraction parameter: 10110]

Slide 11: Parametric program abstraction: Example 1
Abstraction parameter: 10110
Pointer analysis: cloning depth K for each call site and allocation site

Slide 12: Parametric program abstraction: Example 2
Abstraction parameter: 10110
Shape analysis: predicates to use as abstraction predicates

Slide 13: Program abstraction
[Diagram: abstractions 10110 and 01100, each given to a Datalog program; queries alias(p, q)? and alias(m, n)?]

Slide 14: Program abstraction
[Diagram: abstractions 10110 and 01100, each given to a Datalog program; queries alias(p, q)? and alias(m, n)?]
Counterexample guided refinement (CEGAR) via MAXSAT

Slide 15: Pointer analysis example
  f() {
    v1 = new ...;
    v2 = id1(v1);
    v3 = id2(v2);
    q2: assert(v3 != v1);
  }
  id1(v) { return v; }

  g() {
    v4 = new ...;
    v5 = id1(v4);
    v6 = id2(v5);
    q1: assert(v6 != v1);
  }
  id2(v) { return v; }

Slide 16: Pointer analysis as graph reachability
[Figure: graph over nodes 0 to 5, 6, 6’, 6’’, 7, 7’, 7’’, with edges labeled a0, a1, b0, b1, c0, c1, d0, d1]

Slide 17: Graph reachability in Datalog
[Figure: graph over nodes 0 to 5, 6, 6’, 6’’, 7, 7’, 7’’, with edges labeled a0, a1, b0, b1, c0, c1, d0, d1]
Input relations: edge(i, j, n), abs(n)
Output relations: path(i, j)
Rules:
  (1) path(i, i).
  (2) path(i, j) :- path(i, k), edge(k, j, n), abs(n).
Input tuples:
  edge(0, 6, a0), edge(0, 6’, a1), edge(3, 6, b0), …
  abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1), abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
Query tuple       Original query
q1: path(0, 5)    assert(v6 != v1)
q2: path(0, 2)    assert(v3 != v1)
16 possible abstractions in total

Slide 18: Desired result
[Figure: the reachability graph, with edges labeled a0, a1, b0, b1, c0, c1, d0, d1]
Input relations: edge(i, j, n), abs(n)
Output relations: path(i, j)
Rules:
  (1) path(i, i).
  (2) path(i, j) :- path(i, k), edge(k, j, n), abs(n).
Input tuples:
  edge(0, 6, a0), edge(0, 6’, a1), edge(3, 6, b0), …
  abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1), abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
Query             Answer
q1: path(0, 5)    a1b0c1d0
q2: path(0, 2)    Impossibility

Slide 19: Iteration 1
[Figure: the reachability graph, with edges labeled a0, a1, b0, b1, c0, c1, d0, d1]
abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1), abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
Query             Eliminated abstractions
q1: path(0, 5)    (empty)
q2: path(0, 2)    (empty)
Derivation:
  path(0, 0).
  path(0, 6) :- path(0, 0), edge(0, 6, a0), abs(a0).
  path(0, 1) :- path(0, 6), edge(6, 1, a0), abs(a0).
  path(0, 7) :- path(0, 1), edge(1, 7, c0), abs(c0).
  path(0, 2) :- path(0, 7), edge(7, 2, c0), abs(c0).
  path(0, 4) :- path(0, 6), edge(6, 4, b0), abs(b0).
  path(0, 7) :- path(0, 4), edge(4, 7, d0), abs(d0).
  path(0, 5) :- path(0, 7), edge(7, 5, d0), abs(d0).
  …
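The iteration 1 derivation can be reproduced with a small worklist evaluation of rules (1) and (2). This is a sketch only: `EDGES` holds just the seven edge tuples that appear in the derivation listed on this slide (the rest of the graph is elided there), and the names `reachable` and `cheapest` are hypothetical.

```python
# Only the edges that appear in the iteration 1 derivation on the slide;
# the remaining edges of the graph are elided in the transcript.
EDGES = [
    (0, 6, "a0"), (6, 1, "a0"), (1, 7, "c0"), (7, 2, "c0"),
    (6, 4, "b0"), (4, 7, "d0"), (7, 5, "d0"),
]


def reachable(source, edges, abstraction):
    """All j with path(source, j), using only edges whose label n satisfies abs(n)."""
    reached = {source}  # rule (1): path(i, i)
    worklist = [source]
    while worklist:
        k = worklist.pop()
        for src, dst, label in edges:
            # rule (2): path(i, j) :- path(i, k), edge(k, j, n), abs(n)
            if src == k and label in abstraction and dst not in reached:
                reached.add(dst)
                worklist.append(dst)
    return reached


cheapest = {"a0", "b0", "c0", "d0"}
from_0 = reachable(0, EDGES, cheapest)
print(5 in from_0, 2 in from_0)  # True True
```

Both query tuples path(0, 5) and path(0, 2) are derived under this abstraction, so neither query is proven in iteration 1.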

Slide 20: Iteration 1 - derivation graph
[Figure: the reachability graph, with edges labeled a0, a1, b0, b1, c0, c1, d0, d1]
abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1), abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
Query             Eliminated abstractions
q1: path(0, 5)    (empty)
q2: path(0, 2)    (empty)

Slide 21: Iteration 1 - derivation graph
[Figure: derivation graph over the tuples path(0,0), edge(0,6,a0), abs(a0), path(0,6), edge(6,1,a0), edge(6,4,b0), abs(b0), path(0,1), path(0,4), edge(1,7,c0), edge(4,7,d0), abs(c0), abs(d0), path(0,7), edge(7,2,c0), edge(7,5,d0), path(0,2), path(0,5)]

Slides 22 and 23: Iteration 1 - derivation graph
[Figure: the same derivation graph, with a0 c0 highlighted]

Slide 24: Iteration 1 - derivation graph
[Figure: the same derivation graph, with a0 b0 d0 highlighted]

Slide 25: Iteration 1 - derivation graph
[Figure: the reachability graph, with edges labeled a0, a1, b0, b1, c0, c1, d0, d1]
abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1), abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
Query             Eliminated abstractions
q1: path(0, 5)    a0c0d0, a0b0d0 (3/16)
q2: path(0, 2)    a0c0 (4/16)
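The fractions 3/16 and 4/16 can be checked by enumerating all 16 abstractions and counting those that contain every abs tuple of at least one counterexample. The sketch below does this by brute force; the names `all_abstractions` and `eliminated` are hypothetical, and each counterexample is written as the set of abs tuples it uses.

```python
from itertools import product

PARAMS = "abcd"


def all_abstractions():
    # Exactly one of n0 / n1 for each of a, b, c, d: 2**4 = 16 abstractions.
    return [frozenset(p + bit for p, bit in zip(PARAMS, bits))
            for bits in product("01", repeat=len(PARAMS))]


def eliminated(counterexamples):
    """Abstractions that contain all abs tuples of at least one counterexample."""
    return [abstraction for abstraction in all_abstractions()
            if any(cex <= abstraction for cex in counterexamples)]


q1_cex = [{"a0", "c0", "d0"}, {"a0", "b0", "d0"}]
q2_cex = [{"a0", "c0"}]
print(len(eliminated(q1_cex)), len(eliminated(q2_cex)))  # 3 4
```

The two counterexamples for q1 overlap on a0b0c0d0, which is why they eliminate three abstractions rather than four.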

Slide 26: Encoded as MAXSAT
[Formula lost in extraction: a MAXSAT problem stated as "find ... that maximize ... subject to ...", over hard constraints and soft constraints]

Slide 27: Encoded as MAXSAT
abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1), abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
Hard constraints: …
Soft constraints: [formula lost in extraction]
Annotations: avoid all the counterexamples; minimize the abstraction cost

Slide 28: Encoded as MAXSAT
Hard constraints: …
Soft constraints: [formula lost in extraction]
Query             Eliminated abstractions
q1: path(0, 5)    a0c0d0, a0b0d0 (3/16)
q2: path(0, 2)    a0c0 (4/16)
Solution: a1c0d0
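The choice of the next abstraction can be illustrated with a brute-force stand-in for the MAXSAT solver. This is a sketch under stated assumptions, not the encoding used in the talk, whose formulas did not survive extraction. It works directly on the eliminated counterexamples rather than on derivation clauses. It treats "one of n0 / n1 per parameter" as the only hard constraint, rewards each query whose known counterexamples are all avoided with an assumed weight `QUERY_WEIGHT`, and gives one unit for every parameter kept at its cheap value. The names `score` and `solve` are hypothetical.

```python
from itertools import product

PARAMS = "abcd"
# Counterexamples found in iteration 1, as sets of abs tuples.
COUNTEREXAMPLES = {
    "q1": [{"a0", "c0", "d0"}, {"a0", "b0", "d0"}],
    "q2": [{"a0", "c0"}],
}
# Assumption: avoiding a query's counterexamples outweighs any abstraction cost.
QUERY_WEIGHT = len(PARAMS) + 1


def score(abstraction):
    # Soft constraint: one unit per parameter kept at its cheap value n0.
    cheap = sum(1 for p in PARAMS if p + "0" in abstraction)
    # Soft constraint: a larger reward per query with all counterexamples avoided.
    avoided = sum(QUERY_WEIGHT for cexs in COUNTEREXAMPLES.values()
                  if not any(cex <= abstraction for cex in cexs))
    return cheap + avoided


def solve():
    # Hard constraint (by construction): exactly one of abs(n0), abs(n1) per parameter.
    candidates = [frozenset(p + bit for p, bit in zip(PARAMS, bits))
                  for bits in product("01", repeat=len(PARAMS))]
    return max(candidates, key=score)


best = solve()
print(sorted(best))  # ['a1', 'b0', 'c0', 'd0']
```

The result agrees with the solution a1c0d0 shown on the slide, with b left at its cheap value b0. A real MAXSAT solver would search the clause set instead of enumerating all abstractions, which is only feasible here because there are 16 of them.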

Slides 29 to 31: Iteration 2 and beyond
Iteration 1 yields the abstraction a1c0d0, which is used for the iteration 2 derivation.
Query             Eliminated abstractions
q1: path(0, 5)    a0c0d0, a0b0d0 (3/16); a1c0d0 is being added
q2: path(0, 2)    a0c0 (4/16); a1c0 is being added

Slide 32: Iteration 2 and beyond
Iteration 2 derivation; next abstraction: a1c1d0
Query             Eliminated abstractions
q1: path(0, 5)    a0c0d0, a0b0d0, a1c0d0 (5/16)
q2: path(0, 2)    a0c0, a1c0 (8/16)

Slide 33: Iteration 2 and beyond
Iteration 3 derivation with abstraction a1c1d0: q1 is proven.
Query             Answer      Eliminated abstractions
q1: path(0, 5)    a1c1d0      a0c0d0, a0b0d0, a1c0d0 (5/16)
q2: path(0, 2)                a0c0, a1c0 (8/16)

Slide 34: Iteration 2 and beyond
Iteration 3 derivation with abstraction a1c1d0: q1 is proven; q2 is impossible to prove.
Query             Answer           Eliminated abstractions
q1: path(0, 5)    a1c1d0           a0c0d0, a0b0d0, a1c0d0 (5/16)
q2: path(0, 2)    Impossibility    a0c0, a1c0, a1c1, a0c1 (16/16)

Slide 35: Mixing counterexamples
Eliminated abstractions:
  Iteration 1: a0c0
  Iteration 3: a1c1

Slide 36: Mixing counterexamples
Eliminated abstractions:
  Iteration 1: a0c0
  Iteration 3: a1c1
  Mixed: a0c1
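One way to see how a mixed counterexample can arise is to keep the derivation clauses from earlier iterations and chain them together. The sketch below is an illustration under loud assumptions: the clauses are collapsed so that intermediate nodes are folded away, and the shape of the iteration 3 clauses is assumed, because the transcript does not list that derivation. The names `derives` and `counterexamples` are hypothetical.

```python
from itertools import product

# Collapsed Horn clauses (head, body) for q2: path(0, 2).
# Iteration 1 is condensed from the derivation listed for that iteration.
ITERATION_1 = [("path(0,1)", {"a0"}), ("path(0,2)", {"path(0,1)", "c0"})]
# Assumed shape: the iteration 3 derivation is not listed in the transcript.
ITERATION_3 = [("path(0,1)", {"a1"}), ("path(0,2)", {"path(0,1)", "c1"})]


def derives(clauses, facts, goal):
    """Forward chaining over Horn clauses until no new tuple appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in known and body <= known:
                known.add(head)
                changed = True
    return goal in known


def counterexamples(clauses):
    # Every choice of a and c under which the clauses derive the query tuple.
    return sorted(a + c for a, c in product(["a0", "a1"], ["c0", "c1"])
                  if derives(clauses, {a, c}, "path(0,2)"))


print(counterexamples(ITERATION_1))                # ['a0c0']
print(counterexamples(ITERATION_3))                # ['a1c1']
print(counterexamples(ITERATION_1 + ITERATION_3))  # ['a0c0', 'a0c1', 'a1c0', 'a1c1']
```

With the clauses of both iterations combined, a0c1 becomes a counterexample even though no single iteration ran with it: path(0,1) comes from an iteration 1 clause and path(0,2) from an iteration 3 clause. In this collapsed model the combination also yields a1c0, which the slides attribute to iteration 2.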

Slide 37: Experimental setup
Implemented in JChord using off-the-shelf solvers:
  Datalog: bddbddb
  MAXSAT: MiFuMaX
Applied to two analyses that are challenging to scale:
  k-object-sensitivity pointer analysis: flow-insensitive, weak updates, cloning-based
  typestate analysis: flow-sensitive, strong updates, summary-based
Evaluated on 8 Java programs from DaCapo and Ashes.

Slide 38: Benchmark characteristics
benchmark      classes   methods   bytecode (KB)   KLOC
toba-s         1K        6K        423             258
javasrc-p      1K        6.5K      434             265
weblech        1.2K      8K        504             326
hedc           1K        7K        442             283
antlr          1.1K      7.7K      532             303
luindex        1.3K      7.9K      508             295
lusearch       1.2K      8K        511             314
schroeder-m    1.9K      12K       708             460

Slide 39: Results: pointer analysis
               queries                                     abstraction size
benchmark      total   resolved (current)   resolved (baseline)   final   max    iterations
toba-s         7       7                    0                     170     18K    10
javasrc-p      46      46                   0                     470     18K    13
weblech        5       5                    2                     140     31K    10
hedc           47      47                   6                     730     29K    18
antlr          143     143                  5                     970     29K    15
luindex        138     138                  67                    1K      40K    26
lusearch       322     322                  29                    1K      39K    17
schroeder-m    51      51                   25                    450     58K    15
Annotations: 4-object-sensitivity; < 50%; < 3% of max

Slide 40: Performance of Datalog: pointer analysis
[Figure: plot for lusearch. Labels: Baseline; k = 1, 153s; k = 2, 214s; k = 3, 590s; k = 4, 3h28m]

Slide 41: Performance of MAXSAT: pointer analysis
[Figure: plot for lusearch]

Slide 42: Statistics of MAXSAT formulae
pointer analysis   variables   clauses
toba-s             0.7M        1.5M
javasrc-p          0.5M        0.9M
weblech            1.6M        3.3M
hedc               1.2M        2.7M
antlr              3.6M        6.9M
luindex            2.4M        5.6M
lusearch           2.1M        5M
schroeder-m        6.7M        23.7M

Slide 43: Conclusion
[Diagram labels: Abstraction, Datalog, MAXSAT]

Slide 44: Conclusion
[Diagram labels: Datalog, with the rule A(x, y) :- B(x, z), C(z, y); abstraction 10110; MAXSAT; Hard Constraints; Soft Constraints; Soundness; Tradeoffs: Scalability vs. Precision, Sound vs. Complete, …]

Slide 45: Related work
Our approach:
  Any parametric analysis written in Datalog.
  Optimum abstraction.
  Impossibility.
  Cannot disprove queries.
  All counterexamples within and across iterations.
  Multiple queries simultaneously.
SLAM/BLAST/Yogi:
  Predicate abstraction-based analysis.
  Cheap enough abstraction.
  May diverge.
  Concrete counterexamples.
  One counterexample per iteration.
  Single query at a time.

Slide 46: Future work-1
Hard constraints: ……
Soft constraints: [formula lost in extraction]
Labels: Cost; Optimum Abstraction; Early Convergence

Slide 47: Future work-2
Hard constraints: ……
Soft constraints: [formula lost in extraction]
Labels: Precision vs. Scalability; Soundness vs. Completeness

Slide 48: Future work-2
Soft constraints: ……
Labels: Precision vs. Scalability; Soundness vs. Completeness