# Optimal Query Processing Meets Information Theory PowerPoint Presentation

2018-10-06

##### Description

Dan Suciu (University of Washington), Hung Ngo and Mahmoud Abo-Khamis (RelationalAI Inc.). [PODS’2016], [PODS’2017]. Basic question: what is the optimal runtime to compute a query Q on a database D?

### Presentation text content: Optimal Query Processing Meets Information Theory

Slide1

Optimal Query Processing Meets Information Theory

Dan Suciu – University of Washington

Hung Ngo, Mahmoud Abo-Khamis – RelationalAI Inc.

[PODS’2016], [PODS’2017]

Slide2

Basic Question

What is the optimal runtime to compute a query Q on a database D?

Q, D are labeled hypergraphs

Problem 1: list all occurrences of Q in D

Problem 2: check if there exists an occurrence of Q in D

Data complexity: Q is fixed, runtime = f(D)

Slide3

Example Queries

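Enumerate all labeled triangles:

$R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$

Check if there exists a labeled 4-cycle:

$\exists x \exists y \exists z \exists u \; R(x,y) \wedge S(y,z) \wedge T(z,u) \wedge K(u,x)$

[Figures: a triangle on vertices X, Y, Z with edges R, S, T; a 4-cycle on vertices X, Y, Z, U with edges R, S, T, K]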
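The two example queries can be made concrete with a brute-force sketch. This is only a baseline for fixing the semantics, not the algorithm the talk develops; the function names `triangles` and `has_four_cycle` and the representation of relations as Python collections of pairs are assumptions made for illustration.

```python
def triangles(R, S, T):
    """Enumerate all (x, y, z) with R(x,y), S(y,z) and T(z,x) (brute force)."""
    S, T = set(S), set(T)
    zs = {z for (_, z) in S}          # candidate values for Z
    return sorted((x, y, z)
                  for (x, y) in set(R)
                  for z in zs
                  if (y, z) in S and (z, x) in T)


def has_four_cycle(R, S, T, K):
    """Decide whether some x, y, z, u satisfy R(x,y), S(y,z), T(z,u), K(u,x)."""
    K = set(K)
    for (x, y) in R:
        for (y2, z) in S:
            if y2 != y:
                continue
            for (z2, u) in T:
                if z2 == z and (u, x) in K:
                    return True
    return False
```

The first function solves the enumeration problem (Problem 1) and the second the decision problem (Problem 2); both take time polynomial in the relation sizes but make no attempt to be optimal.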

Slide4

Main Results

Fix the query Q

Fix statistics for D (cardinalities, functional dependencies, max degrees)

Problem 1: enumeration problem

Thm ∀D, (1) $|Q(D)| \le \text{Entropic-bound} \le \text{Polymatroid-bound}$

Entropic bound: tight, but open if computable

Polymatroid bound: computable, but not tight

(2) Q(D) computable in time $\tilde{O}(\text{Polymatroid-bound})$

Problem 2: decision problem

Thm ∀D, Q(D) is computable in time $\tilde{O}(2^{\text{submodular-width}})$

Optimal?

Slide5

Main Principle

Find an information-theoretic proof of the upper bound, or the submodular width

Convert proof to algorithm

Slide6

Outline

Enumeration problem

Decision problem

Conclusions

Slide7

Maximum Output Size

$\max_{D \text{ satisfies stats}} |Q(D)|$

E.g. $R(X,Y) \wedge S(Y,Z)$, with $|R|, |S| \le N$

No other info: $|Q(D)| \le N^2$

S.Y is a key: $|Q(D)| \le N$

S.Y has degree ≤ d: $|Q(D)| \le d \times N$

E.g. $R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$

No other info: $|Q(D)| \le N^{3/2}$
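Each of these bounds is met by a simple instance. The sketch below builds one such instance per case and counts the output; the specific instances, the helper names `join_size` and `triangle_count`, and the values N = 16, d = 4 are illustrative assumptions, not taken from the slides.

```python
from collections import Counter


def join_size(R, S):
    """|R(X,Y) join S(Y,Z)|, counted per Y value without materializing the join."""
    deg_R = Counter(y for _, y in R)
    deg_S = Counter(y for y, _ in S)
    return sum(deg_R[y] * deg_S[y] for y in deg_R)


def triangle_count(R, S, T):
    """Number of (x, y, z) with R(x,y), S(y,z), T(z,x), by brute force."""
    S, T = set(S), set(T)
    zs = {z for _, z in S}
    return sum(1 for x, y in R for z in zs if (y, z) in S and (z, x) in T)


N, d, k = 16, 4, 4   # hypothetical sizes; k * k == N

# No other info: every R tuple joins with every S tuple.
no_info = join_size([(i, 0) for i in range(N)], [(0, j) for j in range(N)])

# S.Y is a key: each Y value appears once in S.
key_case = join_size([(i, i) for i in range(N)], [(y, 0) for y in range(N)])

# S.Y has degree <= d: each Y value appears exactly d times in S.
degree_case = join_size([(i, i % (N // d)) for i in range(N)],
                        [(y, j) for y in range(N // d) for j in range(d)])

# Triangle: R = S = T = [k] x [k], so N = k^2 and the output has k^3 tuples.
grid = [(i, j) for i in range(k) for j in range(k)]
triangle_case = triangle_count(grid, grid, grid)
```

With these sizes the four counts come out to N², N, d×N and N^{3/2} respectively, so none of the four bounds can be lowered without more information about D.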

Slide15

Background: Entropy, Polymatroid

Fix a set $X = \{X_1, \ldots, X_k\}$ and a function $H : 2^X \to \mathbb{R}_+$

Def H is called entropic if there exist random variables X s.t. H(U) = entropy of U, for U ⊆ X

Def H is a polymatroid if it satisfies the Shannon inequalities:

$H(\emptyset) = 0$

$H(V) \ge H(U)$ for $U \subseteq V$

$H(U) + H(V) \ge H(U \cap V) + H(U \cup V)$

Every entropic function is a polymatroid

Converse fails for k ≥ 4 [Zhang&Yeung’98]
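The two definitions can be checked numerically on small examples. The sketch below builds the entropy function of the uniform distribution over a set of tuples and tests the three polymatroid conditions; the names `entropy_function` and `is_polymatroid`, the dictionary representation keyed by `frozenset`, and base-2 logarithms are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy_function(tuples, variables):
    """Map each subset of `variables` to the entropy (in bits) of its marginal
    under the uniform distribution over the distinct `tuples`."""
    n = len(tuples)
    h = {}
    for r in range(len(variables) + 1):
        for subset in combinations(range(len(variables)), r):
            counts = Counter(tuple(t[i] for i in subset) for t in tuples)
            key = frozenset(variables[i] for i in subset)
            h[key] = -sum(c / n * log2(c / n) for c in counts.values())
    return h


def is_polymatroid(h, eps=1e-9):
    """Check H(empty) = 0, monotonicity and submodularity for a set function
    given as a dict over all subsets of the ground set."""
    if abs(h[frozenset()]) > eps:
        return False
    for U in h:
        for V in h:
            if U <= V and h[V] < h[U] - eps:
                return False          # monotonicity violated
            if h[U] + h[V] < h[U & V] + h[U | V] - eps:
                return False          # submodularity violated
    return True
```

For instance, the entropy function of the four parity tuples (0,0,0), (0,1,1), (1,0,1), (1,1,0) passes the check, as every entropic function must. The check only goes one way: passing it does not show that a function is entropic.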

Slide16

Enumeration Problem

Fix a query Q with variables $X = \{X_1, \ldots, X_k\}$

Fix a set of statistics for D (cardinalities, FDs, degrees)

Theorem ∀D that satisfies the statistics:

$$\log |Q(D)| \;\le\; \max_{H \text{ entropic, satisfying stats}} H(X) \;\le\; \max_{H \text{ polymatroid, satisfying stats}} H(X)$$

Entropic bound: asymptotically tight, but open if computable

Polymatroid bound: computable in EXPTIME, but not tight

Thm ∀D, Q(D) computable in time $\tilde{O}(\text{Polymatroid-bound})$

Slide17

Proof of Upper Bound

$Q(X,Y,Z) = R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$

Database D → entropic function H

[Figure: triangle on vertices X, Y, Z]

Slide22

Proof of Upper Bound

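$Q(X,Y,Z) = R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$

Database D → entropic function H

Output Q(D):

| X | Y | Z | Probability |
|---|---|---|---|
| a | 3 | m | 1/5 |
| a | 2 | q | 1/5 |
| b | 2 | q | 1/5 |
| d | 3 | m | 1/5 |
| a | 3 | q | 1/5 |

$H(XYZ) = \log |Q(D)|$

Database D, with the marginal probability of each tuple:

R(X,Y)

| X | Y | Probability |
|---|---|---|
| a | 3 | 2/5 |
| a | 2 | 1/5 |
| b | 2 | 1/5 |
| d | 3 | 1/5 |

S(Y,Z)

| Y | Z | Probability |
|---|---|---|
| 3 | m | 2/5 |
| 2 | q | 2/5 |
| 3 | q | 1/5 |
| 2 | m | 0 |

T(Z,X)

| Z | X | Probability |
|---|---|---|
| m | a | 1/5 |
| q | a | 2/5 |
| q | b | 1/5 |
| m | d | 1/5 |

Cardinalities, functional dependencies, max degrees:

$H(XY) \le \log N_R$

$H(YZ) \le \log N_S$

$H(XZ) \le \log N_T$

$H(Z|Y) \le \log \deg_S(z|y)$

[Figure: triangle on vertices X, Y, Z]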
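The construction can be replayed numerically. The sketch below takes the five output tuples listed on the slide as given, puts the uniform distribution on them, and evaluates the entropies that appear in the constraints; the helper name `H`, the column indices and base-2 logarithms are assumptions made for illustration.

```python
from collections import Counter
from math import log2

# The five output tuples (X, Y, Z) as listed on the slide.
output = [("a", 3, "m"), ("a", 2, "q"), ("b", 2, "q"), ("d", 3, "m"), ("a", 3, "q")]


def H(columns):
    """Entropy (bits) of the marginal on `columns` of the uniform
    distribution over `output`."""
    n = len(output)
    counts = Counter(tuple(t[i] for i in columns) for t in output)
    return -sum(c / n * log2(c / n) for c in counts.values())


X, Y, Z = 0, 1, 2
h_xyz = H([X, Y, Z])                       # equals log2(5)
h_xy, h_yz, h_xz = H([X, Y]), H([Y, Z]), H([X, Z])
h_z_given_y = h_yz - H([Y])                # H(Z|Y) = H(YZ) - H(Y)

N_R = N_S = N_T = 4                        # each relation on the slide has 4 tuples
max_deg_S = 2                              # each Y value in S has 2 Z values
```

Because the joint distribution is uniform, H(XYZ) is exactly log |Q(D)|, while the marginals are generally not uniform, which is why the constraints on H(XY), H(YZ), H(XZ) and H(Z|Y) are inequalities.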

Slide23

Proof of Upper Bound

$Q(X,Y,Z) = R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$

$|R|, |S|, |T| \le N \;\Rightarrow\; |Q(D)| \le N^{3/2}$

[Figure: triangle on vertices X, Y, Z]

Slide30

Proof of Upper Bound

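$Q(X,Y,Z) = R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$

$|R|, |S|, |T| \le N \;\Rightarrow\; |Q(D)| \le N^{3/2}$

Shearer’s inequality: $h(XY) + h(YZ) + h(XZ) \ge 2\,h(XYZ)$

$$
\begin{aligned}
3 \log N &\ge h(XY) + h(YZ) + h(XZ) \\
&\ge h(XYZ) + h(Y) + h(XZ) && \text{(submodularity)} \\
&\ge h(XYZ) + h(XYZ) + h(\emptyset) && \text{(submodularity)} \\
&= 2\,h(XYZ) = 2 \log |Q(D)|
\end{aligned}
$$

[Figure: triangle on vertices X, Y, Z]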
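The same two submodularity steps go through when the three relations have different sizes. As a worked variant, assuming separate cardinalities $N_R$, $N_S$, $N_T$ in place of the single bound $N$ (the cardinality constraints $h(XY) \le \log N_R$, $h(YZ) \le \log N_S$, $h(XZ) \le \log N_T$), the chain becomes:

```latex
\begin{align*}
\log N_R + \log N_S + \log N_T
  &\ge h(XY) + h(YZ) + h(XZ) \\
  &\ge 2\,h(XYZ) = 2 \log |Q(D)| \\
\Longrightarrow\quad |Q(D)| &\le \sqrt{N_R \, N_S \, N_T}
\end{align*}
```

Setting $N_R = N_S = N_T = N$ recovers the $N^{3/2}$ bound.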

Slide35

Proof to Algorithm

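$Q(X,Y,Z) = R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$

$h(XY) + h(YZ) + h(XZ) \ge 2\,h(XYZ)$

| Proof | Algorithm |
|---|---|
| $h(XY) + h(YZ) + h(XZ)$ | $R(X,Y) \wedge S(Y,Z) \wedge T(Z,X)$ |
| $h(XYZ) + \big(h(Y) + h(XZ)\big)$ | $R_{light}(X,Y) \wedge S(Y,Z)$, size $N^{3/2}$ |
| $h(Y) + h(XZ) \to h(XYZ)$ | $R_{heavy}(Y) \wedge T(X,Z)$, size $N^{3/2}$ |

$R_{light}$ or $R_{heavy}$: degree(Y) ≤ or > $N^{1/2}$

Runtime $\tilde{O}(N^{3/2})$

[Figure: triangle on vertices X, Y, Z]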
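The heavy/light split can be sketched in a few lines. This is a hand-written illustration of the partitioning idea under stated assumptions, not the authors' general algorithm: the function name `triangles_heavy_light`, the use of in-memory sets, and the threshold ⌊√N⌋ with N taken as the largest relation size are all choices made here.

```python
from collections import defaultdict
from math import isqrt


def triangles_heavy_light(R, S, T):
    """Enumerate (x, y, z) with R(x,y), S(y,z), T(z,x) by splitting the
    Y values of R into light (few X partners) and heavy (many)."""
    R, S, T = set(R), set(S), set(T)
    n = max(len(R), len(S), len(T))
    threshold = isqrt(n)                      # floor(sqrt(N))

    xs_of = defaultdict(set)                  # y -> {x : R(x, y)}
    for x, y in R:
        xs_of[y].add(x)

    out = set()

    # Light part: R_light(X,Y) join S(Y,Z), then filter with T.
    # Each S tuple is extended by at most sqrt(N) values of x.
    for y, z in S:
        xs = xs_of.get(y, ())
        if len(xs) <= threshold:
            out.update((x, y, z) for x in xs if (z, x) in T)

    # Heavy part: R_heavy(Y) x T(Z,X), then filter with R and S.
    # Fewer than sqrt(N) values of y are heavy, since |R| <= N.
    heavy = [y for y, xs in xs_of.items() if len(xs) > threshold]
    for y in heavy:
        out.update((x, y, z) for z, x in T if (x, y) in R and (y, z) in S)

    return out
```

Both loops do at most about N × √N set lookups, which matches the N^{3/2} size annotated on each branch of the slide; the Õ in the stated runtime absorbs the cost of the lookups in a comparison-based implementation.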

Slide36

Enumeration Problem: Discussion

Cardinalities: [Atserias,Grohe,Marx’08, Ngo,Re,Rudra’13]

Entropic bound = polymatroid bound

Algorithm for Q(D) has single log factor

Cardinalities + FDs + max degrees:

Entropic bound ≨ polymatroid bound

Algorithm for Q(D) has polylog factor

Slide37

Outline

Enumeration problem

Decision problem

Conclusions

Slide38

Decision Problem

Fix Q, fix statistics on D

Problem: does Q occur in D?

Theorem One can check if Q is in D in time $\tilde{O}(2^{\mathrm{subw}(Q)})$, where subw is the “submodular width”

Optimal? (fine-grained lower bound is open!)

Slide39

Background: Tree Decomposition

Informally: TD = a tree where each node t represents an enumeration problem

Fractional hypertree width [Grohe,Marx’14]: $\min_{\text{tree}} \max_{\text{node } t} \max_{D}$

Submodular width [Marx’2013]: $\max_{D} \min_{\text{tree}} \max_{\text{node } t}$

Slide40

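$Q() = \exists x \exists y \exists z \exists u \; R(x,y) \wedge S(y,z) \wedge T(z,u) \wedge K(u,x)$

$|R|, |S|, |T|, |K| \le N$

$O(N^{3/2})$ algorithm [Alon,Yuster,Zwick’97]

$\min_{\text{tree}} \max_{\text{node } t} \max_{D}$

[Figure: 4-cycle on vertices X, Y, Z, U]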

Slide42

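$Q() = \exists x \exists y \exists z \exists u \; R(x,y) \wedge S(y,z) \wedge T(z,u) \wedge K(u,x)$

$|R|, |S|, |T|, |K| \le N$

$O(N^{3/2})$ algorithm [Alon,Yuster,Zwick’97]

Tree decompositions:

Bags {R(x,y), S(y,z)} and {T(z,u), K(u,x)}

Bags {S(y,z), T(z,u)} and {K(u,x), R(x,y)}

$\min_{\text{tree}} \max_{\text{node } t} \max_{D}$

Runtime $\tilde{O}(N^2)$ (suboptimal)

[Figure: 4-cycle on vertices X, Y, Z, U]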
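To see where the N² comes from, the sketch below evaluates the 4-cycle query with the first decomposition only, materializing each bag as a join and then matching the bags on their shared variables. The helper names `join` and `four_cycle_via_td` and the test instance are assumptions made for illustration.

```python
from collections import defaultdict


def join(A, B):
    """Join binary relations A(p, q) and B(q, r) into triples (p, q, r)."""
    by_q = defaultdict(list)
    for q, r in B:
        by_q[q].append(r)
    return {(p, q, r) for p, q in A for r in by_q.get(q, ())}


def four_cycle_via_td(R, S, T, K):
    """Decide the 4-cycle query using the bags {R, S} and {T, K}.
    Returns (answer, size of first bag, size of second bag)."""
    bag1 = join(R, S)                        # triples (x, y, z)
    bag2 = join(T, K)                        # triples (z, u, x)
    ends = {(z, x) for z, _, x in bag2}      # project bag2 on shared z, x
    found = any((z, x) in ends for x, _, z in bag1)
    return found, len(bag1), len(bag2)


# A skewed instance with N = 8 tuples per relation: all tuples of R and S
# share the same y value, and all tuples of T and K share the same u value.
N = 8
result = four_cycle_via_td([(i, 0) for i in range(N)], [(0, j) for j in range(N)],
                           [(j, 0) for j in range(N)], [(0, i) for i in range(N)])
```

On this instance both bags hold N² = 64 triples, so any plan that commits to this one tree before looking at the data pays the quadratic cost.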

Slide53

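$Q() = \exists x \exists y \exists z \exists u \; R(x,y) \wedge S(y,z) \wedge T(z,u) \wedge K(u,x)$

$|R|, |S|, |T|, |K| \le N$

$O(N^{3/2})$ algorithm [Alon,Yuster,Zwick’97]

Tree decompositions:

T1: bags {R(x,y), S(y,z)} and {T(z,u), K(u,x)}

T2: bags {S(y,z), T(z,u)} and {K(u,x), R(x,y)}

$\max_{D} \min_{\text{tree}} \max_{\text{node } t}$:

$$
\begin{aligned}
&\min\big(\underbrace{\max(h(xyz), h(zux))}_{T_1},\; \underbrace{\max(h(yzu), h(uxy))}_{T_2}\big) \\
&\quad= \max\big(\min(h(xyz), h(yzu)),\; \min(h(xyz), h(uxy)),\; \min(h(zux), h(yzu)),\; \min(h(zux), h(uxy))\big) \\
&\quad\le \tfrac{3}{2} \log N
\end{aligned}
$$

$$
\begin{aligned}
3 \log N &\ge h(xy) + h(yz) + h(zu) \\
&\ge h(xyz) + h(y) + h(zu) \\
&\ge h(xyz) + h(yzu) + h(\emptyset) \\
&\ge 2 \min(h(xyz), h(yzu))
\end{aligned}
$$

$\mathrm{subw}(Q) = \tfrac{3}{2} \log N$

[Figure: 4-cycle on vertices X, Y, Z, U]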

Slide54

Proof to Algorithm

Use the proof of:

$h(xyz) + h(yzu) \le h(xy) + h(yz) + h(zu)$

to compute the disjunctive datalog rule:

$A(x,y,z) \vee B(y,z,u) \leftarrow R(x,y) \wedge S(y,z) \wedge T(z,u)$

(details omitted)

Runtime $\tilde{O}(N^{3/2})$
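Since the slide omits the details, the following is only a hand-written sketch of one way to satisfy such a rule within the stated size, using a heavy/light split on y; it is not claimed to be the authors' procedure. The function name `disjunctive_rule` and the threshold ⌊√N⌋ are assumptions made for illustration.

```python
from collections import defaultdict
from math import isqrt


def disjunctive_rule(R, S, T):
    """Return (A, B) such that every (x, y, z, u) with R(x,y), S(y,z), T(z,u)
    has (x, y, z) in A or (y, z, u) in B, with both A and B of size
    at most about N^(3/2)."""
    n = max(len(R), len(S), len(T))
    threshold = isqrt(n)                      # floor(sqrt(N))

    xs_of, us_of = defaultdict(set), defaultdict(set)
    for x, y in R:
        xs_of[y].add(x)                       # y -> {x : R(x, y)}
    for z, u in T:
        us_of[z].add(u)                       # z -> {u : T(z, u)}

    A, B = set(), set()
    for y, z in set(S):
        xs = xs_of.get(y, ())
        if len(xs) <= threshold:
            # Light y: at most sqrt(N) extensions per S tuple go into A.
            A.update((x, y, z) for x in xs)
        else:
            # Heavy y: fewer than sqrt(N) such values, each paired with
            # tuples of T, go into B.
            B.update((y, z, u) for u in us_of.get(z, ()))
    return A, B
```

The point of the disjunction is that neither A nor B alone has to cover the whole body: on an instance where the full join R ⋈ S ⋈ T has N² tuples, each body tuple is still accounted for by one of two much smaller tables.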

Slide55

Outline

Enumeration problem

Decision problem

Conclusions

Slide56

Conclusions

Query evaluation summary:

Information theory → Proof

Proof → Algorithm

Open problems:

Better “Proof → Algorithm”

Fine-grained lower bounds

Slide57

Thank You!

Questions?
