Optimal Query Processing Meets Information Theory
Dan Suciu, University of Washington
Hung Ngo, Mahmoud Abo-Khamis, RelationalAI Inc.
[PODS’2016] [PODS’2017]

Basic Question
What is the optimal runtime to compute a query Q on a database D?
Q, D are labeled hypergraphs
Problem 1: list all occurrences of Q in D
Problem 2: check if there exists an occurrence of Q in D
Data complexity: Q is fixed, runtime = f(D)

Example Queries
Enumerate all labeled triangles:
R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
Check if there exists a labeled 4-cycle:
∃x∃y∃z∃u R(x,y) ∧ S(y,z) ∧ T(z,u) ∧ K(u,x)
[Figures: a triangle on X, Y, Z with edges labeled R, S, T; a 4-cycle on X, Y, Z, U with edges labeled R, S, T, K]

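The two example queries can be sketched in a few lines of Python. This is only a naive baseline under the assumption that each relation is a list of pairs; the function names `list_triangles` and `has_4cycle` are hypothetical and are not taken from the talk.

```python
def list_triangles(R, S, T):
    """Problem 1: list all (x, y, z) with R(x,y), S(y,z), T(z,x)."""
    zs_by_y = {}
    for y, z in S:
        zs_by_y.setdefault(y, []).append(z)
    T_set = set(T)
    return sorted(
        (x, y, z)
        for x, y in R
        for z in zs_by_y.get(y, [])
        if (z, x) in T_set
    )


def has_4cycle(R, S, T, K):
    """Problem 2: is there (x, y, z, u) with R(x,y), S(y,z), T(z,u), K(u,x)?"""
    zs_by_y, us_by_z = {}, {}
    for y, z in S:
        zs_by_y.setdefault(y, []).append(z)
    for z, u in T:
        us_by_z.setdefault(z, []).append(u)
    K_set = set(K)
    for x, y in R:
        for z in zs_by_y.get(y, []):
            for u in us_by_z.get(z, []):
                if (u, x) in K_set:
                    return True  # stop at the first witness
    return False
```

The enumeration routine must produce every answer, while the decision routine may stop at the first witness; that difference is why the two problems get different bounds later in the talk.
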
Main Results
Fix the query Q
Fix statistics for D (cardinalities, functional dependencies, max degrees)
Problem 1: enumeration problem
Thm ∀D,
(1) |Q(D)| ≤ Entropic-bound ≤ Polymatroid-bound
(2) Q(D) computable in time Õ(Polymatroid-bound)
Entropic bound: tight, but open if computable
Polymatroid bound: computable, but not tight
Problem 2: decision problem
Thm ∀D, Q(D) is computable in time Õ(2^{submodular-width})
Optimal?

Main Principle
Find an information-theoretic proof of the upper bound, or of the submodular width
Convert the proof to an algorithm

Outline
Enumeration problem
Decision problem
Conclusions

Maximum Output Size
max_{D satisfies stats} |Q(D)|
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N
No other info: |Q(D)| ≤ N^2
S.Y is a key: |Q(D)| ≤ N
S.Y has degree ≤ d: |Q(D)| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X), |R|, |S|, |T| ≤ N
No other info: |Q(D)| ≤ N^{3/2}

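Each of these bounds is attained by a small instance. The sketch below builds one such instance per case and counts the output; the instances and the names `join_size`, `triangle_count` and `worst_case_instances` are illustrative assumptions, not constructions from the talk.

```python
def join_size(R, S):
    """|R(X,Y) join S(Y,Z)|, computed with a hash join on Y."""
    count_by_y = {}
    for y, _z in S:
        count_by_y[y] = count_by_y.get(y, 0) + 1
    return sum(count_by_y.get(y, 0) for _x, y in R)


def triangle_count(R, S, T):
    """Number of (x, y, z) with R(x,y), S(y,z), T(z,x)."""
    zs_by_y = {}
    for y, z in S:
        zs_by_y.setdefault(y, []).append(z)
    T_set = set(T)
    return sum(1 for x, y in R for z in zs_by_y.get(y, []) if (z, x) in T_set)


def worst_case_instances(k=4, d=4):
    """Build instances with N = k*k tuples per relation that meet each bound."""
    N = k * k
    R = [(x, 0) for x in range(N)]           # every R tuple has Y = 0
    S_any = [(0, z) for z in range(N)]       # no constraint: all of S joins with all of R
    S_key = [(y, y) for y in range(N)]       # Y is a key of S
    S_deg = [(0, z) for z in range(d)]       # Y = 0 has exactly d partners in S
    grid = [(a, b) for a in range(k) for b in range(k)]  # all k*k pairs
    return {
        "N": N,
        "no_info": join_size(R, S_any),      # N^2
        "key": join_size(R, S_key),          # N
        "degree": join_size(R, S_deg),       # d * N
        "triangle": triangle_count(grid, grid, grid),  # k^3 = N^(3/2)
    }
```

With the defaults, N = 16, so the triangle instance has k^3 = 64 = N^{3/2} outputs, which shows that the exponent 3/2 cannot be lowered when only cardinalities are known.
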
Background: Entropy, Polymatroid
Fix a set X = {X_1,…,X_k} and a function H: 2^X → R_+
Def H is called entropic if there exist random variables X s.t. H(U) = entropy of U, for U ⊆ X
Def H is a polymatroid if (Shannon inequalities):
H(∅) = 0
H(V) ≥ H(U) for U ⊆ V
H(U) + H(V) ≥ H(U ∩ V) + H(U ∪ V)
Every entropic function is a polymatroid
Converse fails for k ≥ 4 [Zhang&Yeung’98]

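The three polymatroid conditions are easy to check mechanically for small k. The following sketch, with hypothetical helper names, builds the entropic function of a uniform distribution over a list of tuples and tests the conditions by brute force over all pairs of subsets; it is meant as an illustration of the definitions, not as an efficient procedure.

```python
from collections import Counter
from itertools import combinations
from math import log2


def subsets(variables):
    """All subsets of `variables`, as frozensets."""
    return [
        frozenset(c)
        for r in range(len(variables) + 1)
        for c in combinations(variables, r)
    ]


def entropy_function(tuples, variables):
    """H(U) for every U, under the uniform distribution on `tuples`."""
    n = len(tuples)
    H = {}
    for U in subsets(variables):
        idx = [i for i, v in enumerate(variables) if v in U]
        counts = Counter(tuple(t[i] for i in idx) for t in tuples)
        H[U] = -sum(c / n * log2(c / n) for c in counts.values())
    return H


def is_polymatroid(H, variables, eps=1e-9):
    """Check H(empty) = 0, monotonicity and submodularity."""
    all_sets = subsets(variables)
    if abs(H[frozenset()]) > eps:
        return False
    for U in all_sets:
        for V in all_sets:
            if U <= V and H[V] < H[U] - eps:
                return False
            if H[U] + H[V] < H[U & V] + H[U | V] - eps:
                return False
    return True
```

For instance, the uniform distribution on the four tuples with X + Y + Z even gives H(XY) = H(XYZ) = 2 bits and passes the check, while a hand-made function with H(X) = H(Y) = 1 and H(XY) = 3 fails submodularity.
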
Enumeration Problem
Fix a set of statistics for D (cardinalities, FDs, degrees)
Fix a query Q with variables X = {X_1,…,X_k}
Theorem ∀D that satisfies the statistics:
log |Q(D)| ≤ max_{H entropic satisfying stats} H(X) ≤ max_{H polymatroid satisfying stats} H(X)
Entropic bound: asymptotically tight, but open if computable
Polymatroid bound: computable in EXPTIME, but not tight
Thm ∀D, Q(D) computable in time Õ(Polymatroid-bound)

Proof of Upper Bound
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
Database D → entropic function H
[Figure: triangle on X, Y, Z]
Take the uniform distribution on the output Q(D); each tuple of R, S, T gets its marginal probability.

Database D:
R(X,Y)          S(Y,Z)          T(Z,X)
X  Y  prob      Y  Z  prob      Z  X  prob
a  3  2/5       3  m  2/5       m  a  1/5
a  2  1/5       2  q  2/5       q  a  2/5
b  2  1/5       3  q  1/5       q  b  1/5
d  3  1/5       2  m  0         m  d  1/5

Output Q(D):
X  Y  Z  prob
a  3  m  1/5
a  2  q  1/5
b  2  q  1/5
d  3  m  1/5
a  3  q  1/5

H(XYZ) = log |Q(D)|
Cardinalities, functional dependencies, max degrees:
H(XY) ≤ log N_R
H(YZ) ≤ log N_S
H(XZ) ≤ log N_T
H(Z|Y) ≤ log deg_S(z|y)

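As a rough numeric check of this step, the sketch below takes the five output tuples listed on the slide as given, puts the uniform distribution on them, and computes marginals and entropies in bits. The names `OUTPUT`, `POS`, `marginal` and `entropy` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from math import log2

# The five output tuples (X, Y, Z) as listed on the slide.
OUTPUT = [("a", 3, "m"), ("a", 2, "q"), ("b", 2, "q"), ("d", 3, "m"), ("a", 3, "q")]
POS = {"X": 0, "Y": 1, "Z": 2}


def marginal(tuples, variables):
    """Marginal of the uniform distribution on `tuples` over `variables`."""
    n = len(tuples)
    counts = Counter(tuple(t[POS[v]] for v in variables) for t in tuples)
    return {key: c / n for key, c in counts.items()}


def entropy(tuples, variables):
    """Shannon entropy (in bits) of the marginal over `variables`."""
    return -sum(p * log2(p) for p in marginal(tuples, variables).values())


h_xyz = entropy(OUTPUT, "XYZ")                                # equals log2(5)
h_z_given_y = entropy(OUTPUT, "YZ") - entropy(OUTPUT, "Y")    # H(Z|Y)
```

Here `h_xyz` equals log2 of the output size, each two-variable entropy stays below log2(4) because every relation has four tuples, and `h_z_given_y` stays below log2(2) = 1 because every Y value has at most two Z partners in S.
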
Proof of Upper Bound
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
|R|, |S|, |T| ≤ N ⇒ |Q(D)| ≤ N^{3/2}
[Figure: triangle on X, Y, Z]
3 log N ≥ h(XY) + h(YZ) + h(XZ)
        ≥ h(XYZ) + h(Y) + h(XZ)                  (submodularity)
        ≥ h(XYZ) + h(XYZ) + h(∅) = 2 h(XYZ)      (submodularity)
        = 2 log |Q(D)|
Shearer’s inequality: h(XY) + h(YZ) + h(XZ) ≥ 2 h(XYZ)

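Exponentiating the chain above gives the cardinality form |Q(D)|^2 ≤ |R|·|S|·|T|. The sketch below checks this on random instances; the generator parameters (`domain`, `size`) and the function names are arbitrary choices made for illustration.

```python
import random


def triangle_join(R, S, T):
    """All (x, y, z) with R(x,y), S(y,z), T(z,x)."""
    zs_by_y = {}
    for y, z in S:
        zs_by_y.setdefault(y, []).append(z)
    T_set = set(T)
    return [(x, y, z) for x, y in R for z in zs_by_y.get(y, []) if (z, x) in T_set]


def random_relation(rng, domain, size):
    """`size` distinct pairs drawn from domain x domain."""
    pairs = [(a, b) for a in range(domain) for b in range(domain)]
    return rng.sample(pairs, size)


def check_bound(seed, domain=6, size=20):
    """True if |Q(D)|^2 <= |R| * |S| * |T| on a random instance."""
    rng = random.Random(seed)
    R, S, T = (random_relation(rng, domain, size) for _ in range(3))
    out = triangle_join(R, S, T)
    return len(out) ** 2 <= len(R) * len(S) * len(T)
```

A passing run on random data is of course no proof; it only illustrates what the inequality says about concrete relations.
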
Proof to Algorithm
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
h(XY) + h(YZ) + h(XZ) ≥ 2 h(XYZ)
[Figure: triangle on X, Y, Z]

Proof:
h(XY) + h(YZ) + h(XZ)
→ h(XYZ) + h(Y) + h(XZ)
→ h(XYZ) + h(XYZ)

Algorithm:
R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
→ R_light(X,Y) ∧ S(Y,Z)      (size ≤ N^{3/2})
∪ R_heavy(Y) ∧ T(X,Z)        (size ≤ N^{3/2})
R_light or R_heavy: degree(Y) ≤ or > N^{1/2}

Runtime Õ(N^{3/2})

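The light/heavy split can be written out as follows. This is a minimal sketch, assuming relations are given as collections of pairs and that degree(Y) means the number of X values paired with a Y value in R; the function name and the use of Python sets are illustrative choices, not the authors' implementation.

```python
from collections import defaultdict
from math import isqrt


def triangles_heavy_light(R, S, T):
    """Enumerate {(x, y, z) : R(x,y), S(y,z), T(z,x)} with a split on Y in R."""
    R, S, T = set(R), set(S), set(T)
    N = max(len(R), len(S), len(T), 1)
    threshold = isqrt(N)  # floor(sqrt(N)), playing the role of N^(1/2)

    xs_by_y = defaultdict(list)
    for x, y in R:
        xs_by_y[y].append(x)

    out = set()

    # Light branch: R_light(X,Y) joined with S(Y,Z), then filtered by T.
    # Each (y, z) in S meets at most sqrt(N) values of x.
    for y, z in S:
        xs = xs_by_y.get(y, ())
        if len(xs) <= threshold:
            for x in xs:
                if (z, x) in T:
                    out.add((x, y, z))

    # Heavy branch: R_heavy(Y) crossed with T(Z,X), then filtered by R and S.
    # There are at most sqrt(N) heavy values of y.
    heavy_ys = [y for y, xs in xs_by_y.items() if len(xs) > threshold]
    for y in heavy_ys:
        for z, x in T:
            if (x, y) in R and (y, z) in S:
                out.add((x, y, z))

    return out
```

Both branches touch at most about N·√N candidate tuples, which mirrors the two submodularity steps of the proof: the light branch corresponds to h(XY) + h(YZ) ≥ h(XYZ) + h(Y), and the heavy branch to h(Y) + h(XZ) ≥ h(XYZ).
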
Enumeration Problem: Discussion
Cardinalities: [Atserias,Grohe,Marx’08, Ngo,Re,Rudra’13]
Entropic bound = polymatroid bound
Algorithm for Q(D) has single log factor
Cardinalities + FDs + max degrees:
Entropic bound ≨ polymatroid bound
Algorithm for Q(D) has polylog factor

Outline
Enumeration problem
Decision problem
Conclusions

Decision Problem
Fix Q, fix statistics on D
Problem: does Q occur in D?
Theorem One can check if Q is in D in time Õ(2^{subw(Q)}), where subw is the “submodular width”
Optimal? (fine-grained lower bound is open!)

Background: Tree Decomposition
Informally: TD = a tree where each node t represents an enumeration problem
Fractional hypertree width [Grohe,Marx’14]: min_tree max_{node t} max_D
Submodular width [Marx’2013]: max_D min_tree max_{node t}

Q() = ∃x∃y∃z∃u R(x,y) ∧ S(y,z) ∧ T(z,u) ∧ K(u,x)
|R|, |S|, |T|, |K| ≤ N
O(N^{3/2}) algorithm [Alon,Yuster,Zwick’97]
[Figure: 4-cycle on X, Y, Z, U]

Tree decompositions:
T1: bags {R(x,y), S(y,z)} and {T(z,u), K(u,x)}
T2: bags {S(y,z), T(z,u)} and {K(u,x), R(x,y)}

min_tree max_{node t} max_D (fractional hypertree width):
Runtime Õ(N^2) (suboptimal)

max_D min_tree max_{node t} (submodular width):
min( max(h(xyz), h(zux)), max(h(yzu), h(uxy)) )        [first max: T1, second max: T2]
= max( min(h(xyz), h(yzu)), min(h(xyz), h(uxy)), min(h(zux), h(yzu)), min(h(zux), h(uxy)) )
≤ 3/2 log N

Bound for the first term (the other three are symmetric):
3 log N ≥ h(xy) + h(yz) + h(zu)
        ≥ h(xyz) + h(y) + h(zu)
        ≥ h(xyz) + h(yzu) + h(∅)
        ≥ 2 min(h(xyz), h(yzu))

subw(Q) = 3/2 log N

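The rewriting of the min of max terms into a max of min terms is the usual distributivity of min over max, with one bag chosen from each tree decomposition. A quick numeric sanity check, using hypothetical function names and arbitrary real values in place of h(xyz), h(zux), h(yzu), h(uxy):

```python
import random


def min_of_max(a, b, c, d):
    """min over the two trees of the max over each tree's bags."""
    return min(max(a, b), max(c, d))


def max_of_min(a, b, c, d):
    """max over all ways of picking one bag per tree of the min of the picks."""
    return max(min(a, c), min(a, d), min(b, c), min(b, d))


def identity_holds(trials=1000, seed=0):
    """Compare both sides on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c, d = (rng.random() for _ in range(4))
        if min_of_max(a, b, c, d) != max_of_min(a, b, c, d):
            return False
    return True
```

Here a, b are the bags of T1 and c, d the bags of T2. After the rewriting it suffices to bound each min term separately, which is what the chain of inequalities does.
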
Proof to Algorithm
Use the proof of: h(xyz) + h(yzu) ≤ h(xy) + h(yz) + h(zu)
to compute the disjunctive datalog rule: A(x,y,z) ∨ B(y,z,u) ← R(x,y) ∧ S(y,z) ∧ T(z,u)
(details omitted)
Runtime Õ(N^{3/2})

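Since the slide omits the details, the following is only one plausible way to evaluate such a rule, assuming the same light/heavy split on Y in R that was used for the triangle query. The function name and the choice of threshold are assumptions, and the sketch should not be read as the algorithm of the cited papers.

```python
from collections import defaultdict
from math import isqrt


def eval_disjunctive_rule(R, S, T):
    """Return (A, B) such that every path R(x,y), S(y,z), T(z,u)
    has (x, y, z) in A or (y, z, u) in B."""
    R, S, T = set(R), set(S), set(T)
    N = max(len(R), len(S), len(T), 1)
    threshold = isqrt(N)

    xs_by_y = defaultdict(list)
    for x, y in R:
        xs_by_y[y].append(x)
    us_by_z = defaultdict(list)
    for z, u in T:
        us_by_z[z].append(u)

    A, B = set(), set()
    for y, z in S:
        xs = xs_by_y.get(y, ())
        if len(xs) <= threshold:
            # Light y: few x per y, so materialize (x, y, z) in A.
            for x in xs:
                A.add((x, y, z))
        else:
            # Heavy y: few such y overall, so materialize (y, z, u) in B.
            for u in us_by_z.get(z, ()):
                B.add((y, z, u))
    return A, B
```

Under these assumptions A has at most |S|·√N tuples and B at most √N·|T| tuples, so both stay within N^{3/2}, matching the runtime stated on the slide.
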
Outline
Enumeration problem
Decision problem
Conclusions

Conclusions
Query evaluation summary:
Information theory → Proof
Proof → Algorithm
Open problems:
Better “Proof → Algorithm”
Fine-grained lower bounds

Thank You!
Questions?
Hung Ngo, Mahmoud Abo-Khamis, RelationalAI Inc.
[PODS’2016] [PODS’2017]