Slide1
A Dichotomy on The Complexity of Consistent Query Answering for Atoms With Simple Keys
Paris Koutris, Dan Suciu
University of Washington

Slide2
Repairs
An uncertain instance I for a schema with key constraints.
A repair r of I is a subinstance of I that satisfies the key constraints and is maximal.

R(x, y) (key: x): (a1, b1), (a1, b2), (a2, b2), (a3, b3), (a3, b4), (a4, b4)

The 4 possible repairs:
(a1, b1), (a2, b2), (a3, b4), (a4, b4)
(a1, b1), (a2, b2), (a3, b3), (a4, b4)
(a1, b2), (a2, b2), (a3, b4), (a4, b4)
(a1, b2), (a2, b2), (a3, b3), (a4, b4)
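
The repair definition can be sketched in a few lines of Python. This is only an illustration, assuming the key of R is its first attribute; the function name `repairs` is a hypothetical helper, not something from the slides. A repair keeps exactly one tuple per key value, which makes it both consistent and maximal.

```python
from itertools import product

def repairs(tuples, key=0):
    """Enumerate all repairs of a relation whose key is the attribute at index `key`."""
    groups = {}
    for t in tuples:
        groups.setdefault(t[key], []).append(t)
    # One tuple per key value: consistent (no key violation) and maximal.
    return [set(choice) for choice in product(*groups.values())]

# The uncertain instance from the slide (key assumed to be x).
R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]
all_repairs = repairs(R)
```

The number of repairs is the product of the group sizes, so it grows exponentially with the number of violated keys; enumeration is only practical for toy instances.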

Slide3
Consistent Query Answering
If Q is boolean, we say that I is certain for Q, written I |= Q, if for every repair r of I, Q(r) is true.

R(x, y): (a1, b1), (a1, b2), (a2, b2), (a3, b3), (a3, b4), (a4, b4)
S(y, z): (b1, c1), (b2, c1), (b2, c2), (b3, c3)

Q() = R(x, y), S(y, z)
I |= Q
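
A brute-force check of certainty for this example might look as follows. It is a sketch under the assumption that each relation's key is its first attribute; `repairs` and `is_certain` are hypothetical helper names. It tests Q() = R(x, y), S(y, z) in every combination of a repair of R and a repair of S.

```python
from itertools import product

def repairs(tuples, key=0):
    """All repairs of a relation: one tuple per key value (key index is an assumption)."""
    groups = {}
    for t in tuples:
        groups.setdefault(t[key], []).append(t)
    return [set(choice) for choice in product(*groups.values())]

def is_certain(R, S):
    """True iff Q() = R(x, y), S(y, z) is true in every repair of the instance."""
    for r in repairs(R):
        for s in repairs(S):
            # Q(r) holds if some R-tuple and some S-tuple join on y.
            if not any(y == y2 for (_, y) in r for (y2, _) in s):
                return False
    return True

R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "c1"), ("b2", "c1"), ("b2", "c2"), ("b3", "c3")]
certain = is_certain(R, S)
```

Here every repair keeps R(a2, b2) and some S-tuple with key b2, so the query is certain. The loop is exponential in general, which is why the complexity of CERTAINTY(Q) is the interesting question.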

Slide4
Problem Statement
CERTAINTY(Q): given as input an instance I, does I |= Q, where Q is a boolean CQ?
In general, CERTAINTY(Q) is in coNP.

Q1 = R(x, y), S(y, z): expressible as a first-order query
Q2 = R(x, y), S(z, y): coNP-complete
Q3 = R(x, y), S(y, x): PTIME but not first-order expressible

Conjecture
For every boolean conjunctive query Q, CERTAINTY(Q) is either in PTIME or coNP-complete.

Slide5
Progress so Far
[Wijsen, 2010]
  Syntactic characterization of FO-expressible acyclic CQs w/o self-joins
[Kolaitis and Pema, 2012]
  A trichotomy for CQs with 2 atoms and no self-joins
[Wijsen, 2010 & 2013]
  PTIME algorithm for cyclic queries: Ck = R1(x1, x2), …, Rk(xk, x1)
  Further classification of acyclic CQs w/o self-joins

Slide6
Our Contribution
A dichotomy for CQs w/o self-joins where atoms have either:
  Simple keys: R(x, y, z)
  Keys that consist of all attributes: S(x, y, z)

Theorem
For every boolean CQ Q w/o self-joins where, for each atom, the key consists of either one attribute or all attributes, CERTAINTY(Q) is either in PTIME or coNP-complete.

Slide7
Outline
The Dichotomy Condition
Frugal Repairs & Representable Answers
Strongly Connected Graphs

Slide8
The Query Graph
We equivalently study boolean CQs consisting only of binary relations where one attribute is the key: R(x, y)
Relations can be consistent (R^c) or inconsistent (R^i)
Query Graph: a directed edge (u, v) for each atom R(u, v)

Q = R^i(x, y), S^i(z, w), T^c(y, w)
G[Q]: edges R: x -> y, S: z -> w, T: y -> w
For an edge R: source node u_R, end node v_R

Slide9
Definitions
x^{+,R}: the set of nodes reachable from node x (through a directed path) once we remove the edge R
R ~ S [source-equivalent]: the source nodes u_R, u_S are in the same SCC
[R]: the equivalence class of R w.r.t. ~

[Figure: example graph with nodes x, y, z, u, v, w and edges R, S, T, U, V]
x^{+,R} = {x, v, w}
R ~ T and [R] = {R, T}
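
These two definitions can be sketched directly on an edge list. The graph below is not the one in the figure (its edges are not recoverable from the transcript); it is the query graph of R(x, y), S(y, z), U^c(z, y), assuming the first attribute of each atom is its key. The names `reachable`, `plus`, and `source_equivalent` are hypothetical.

```python
def reachable(start, edges, removed=None):
    """Nodes reachable from `start` by directed paths, ignoring the edge named `removed`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for name, (u, v) in edges.items():
            if name != removed and u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def plus(name, edges):
    """u_R^{+,R}: nodes reachable from the source of edge `name` once that edge is removed."""
    return reachable(edges[name][0], edges, removed=name)

def source_equivalent(r, s, edges):
    """R ~ S: the source nodes of R and S reach each other, i.e. share an SCC."""
    u_r, u_s = edges[r][0], edges[s][0]
    return u_s in reachable(u_r, edges) and u_r in reachable(u_s, edges)

# Edge name -> (source node, end node).
edges = {"R": ("x", "y"), "S": ("y", "z"), "U": ("z", "y")}
```

In this graph y and z lie on a cycle, so S ~ U, while x has no incoming edge and R is alone in its class.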

Slide10
Coupled Edges
coupled+(R) = the edges in [R], plus any inconsistent edge S such that the source node u_S is connected to the end node v_R through an (undirected) path that does not intersect u_R^{+,R}

[Figure: the example graph, with y = v_R, x = u_R, u = u_V, and the set u_R^{+,R} highlighted]
coupled+(R):
  contains R, T: [R] = {R, T}
  contains V: path from y (= v_R) to u (= u_V)
  does not contain U

Slide11
Splittable Graphs
Two inconsistent edges R, S are coupled if S is in coupled+(R) and R is in coupled+(S)
A graph G[Q] is:
  unsplittable if it contains a pair of coupled edges that are not source-equivalent
  splittable otherwise

[Figure: example graph with nodes x, y, z, u, v, w and edges R, S, T, U, V]
coupled+(R) = {R, T, V}
coupled+(T) = {R, T, V}
coupled+(V) = {V}
coupled+(U) = {U, V, R, T}
Only R, T are coupled, and they are source-equivalent (R ~ T): SPLITTABLE!

Slide12
The Dichotomy Condition
Dichotomy Theorem
If G[Q] is splittable, CERTAINTY(Q) is in PTIME
If G[Q] is unsplittable, CERTAINTY(Q) is coNP-complete

[Figure: example graph with nodes x, y, z, u, v, w and edges R, S, T, U, V]
Splittable, so in PTIME

Slide13
Examples
PTIME: R(x, y), S(y, z)
coNP-complete: R(x, y), S(y, z), T^c(x, z)
PTIME: R(x, y), S(y, z), U^c(z, y)
coNP-complete: R(x, y), S(z, y), U^c(y, z)
[Figures: the query graph of each example over the nodes x, y, z]
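
The splittability test can be sketched as a literal reading of the definitions of coupled+ and source-equivalence. The assumptions here are: the first attribute of each atom is its key, R and S are inconsistent, and the atoms marked with c are consistent. All function names are hypothetical, and this is an illustration of the condition, not the authors' implementation.

```python
from itertools import combinations

def reachable(start, edges, removed=None):
    """Nodes reachable from `start` by directed paths, ignoring the edge `removed`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for name, (u, v) in edges.items():
            if name != removed and u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def source_equivalent(r, s, edges):
    """R ~ S: the source nodes of R and S lie in the same SCC."""
    u_r, u_s = edges[r][0], edges[s][0]
    return u_s in reachable(u_r, edges) and u_r in reachable(u_s, edges)

def connected_avoiding(a, b, edges, forbidden):
    """True iff a and b are joined by an undirected path that avoids `forbidden`."""
    if a in forbidden or b in forbidden:
        return False
    seen, stack = {a}, [a]
    while stack:
        node = stack.pop()
        for u, v in edges.values():
            for nxt in ((v,) if u == node else (u,) if v == node else ()):
                if nxt not in forbidden and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return b in seen

def coupled_plus(r, edges, inconsistent):
    """coupled+(R): the class [R] plus inconsistent edges linked to v_R outside u_R^{+,R}."""
    u_r, v_r = edges[r]
    forbidden = reachable(u_r, edges, removed=r)  # the set u_R^{+,R}
    result = {s for s in edges if source_equivalent(r, s, edges)}
    result |= {s for s in inconsistent
               if connected_avoiding(edges[s][0], v_r, edges, forbidden)}
    return result

def is_splittable(edges, inconsistent):
    """False iff some coupled pair of inconsistent edges is not source-equivalent."""
    cp = {r: coupled_plus(r, edges, inconsistent) for r in inconsistent}
    for r, s in combinations(sorted(inconsistent), 2):
        if s in cp[r] and r in cp[s] and not source_equivalent(r, s, edges):
            return False
    return True

# The four example queries: edge name -> (source, end); R and S assumed inconsistent.
q1 = {"R": ("x", "y"), "S": ("y", "z")}
q2 = {"R": ("x", "y"), "S": ("y", "z"), "T": ("x", "z")}
q3 = {"R": ("x", "y"), "S": ("y", "z"), "U": ("z", "y")}
q4 = {"R": ("x", "y"), "S": ("z", "y"), "U": ("y", "z")}
verdicts = [is_splittable(q, {"R", "S"}) for q in (q1, q2, q3, q4)]
```

Under these assumptions the sketch classifies the first and third queries as splittable and the second and fourth as unsplittable, matching the PTIME and coNP-complete labels on the slide.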

Slide14
Outline
The Dichotomy Condition
Frugal Repairs & Representable Answers
Strongly Connected Graphs

Slide15
Frugal Repairs (1)
Definition
A repair r of an instance I is frugal for a boolean query Q if for any other repair r' of I, Q^f(r') is not strictly contained in Q^f(r)
Q^f = all body variables to the head (full query)

R(x, y): (a1, b1), (a1, b2), (a2, b3), (a3, b4), (a4, b4)
S(y, x): (b1, a1), (b3, a2), (b4, a3), (b4, a4)

repair r1 = { R(a1, b1), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
Q^f(r1) = { (a1, b1), (a2, b3), (a3, b4) }: not frugal
repair r2 = { R(a1, b2), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
Q^f(r2) = { (a2, b3), (a3, b4) }: frugal

Slide16
Frugal Repairs (2)
R(x, y): (a1, b1), (a1, b2), (a2, b3), (a3, b4), (a4, b4)
S(y, x): (b1, a1), (b3, a2), (b4, a3), (b4, a4)

I |= Q if and only if every frugal repair satisfies Q
We lose no generality if we study only frugal repairs!
Only two frugal repairs:
Q^f(r2) = {(a2, b3), (a3, b4)}
Q^f(r3) = {(a2, b3), (a4, b4)}
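
The frugality definition can be checked by brute force on this instance. The sketch assumes the query is Q = R(x, y), S(y, x) with keys x for R and y for S (inferred from the relation headers and the listed repairs, not stated explicitly); `repairs` and `full_answers` are hypothetical helpers. A repair is frugal when no other repair has a strictly smaller full-query answer set.

```python
from itertools import product

def repairs(tuples, key=0):
    """All repairs of a relation: one tuple per key value (key index is an assumption)."""
    groups = {}
    for t in tuples:
        groups.setdefault(t[key], []).append(t)
    return [set(choice) for choice in product(*groups.values())]

def full_answers(r, s):
    """Q^f for Q = R(x, y), S(y, x): all pairs (x, y) with R(x, y) and S(y, x)."""
    return frozenset((x, y) for (x, y) in r if (y, x) in s)

R = [("a1", "b1"), ("a1", "b2"), ("a2", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "a1"), ("b3", "a2"), ("b4", "a3"), ("b4", "a4")]

# Answer sets of the full query over every repair of the instance.
answers = {full_answers(r, s) for r in repairs(R) for s in repairs(S)}
# Keep the answer sets that have no strict subset among the others.
frugal = {a for a in answers if not any(b < a for b in answers)}
```

The two surviving answer sets are the ones listed for r2 and r3; the repairs that keep R(a1, b1) produce strict supersets and are therefore not frugal.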

Slide17
Or-Sets
Efficiently represent all answer sets of frugal repairs
We use or-sets: <1, 2, 3> means 1 or 2 or 3
A = < {1, 3}, {1, 4}, {2, 3}, {2, 4} >
We can "compress" A as B = {<1, 2>, <3, 4>}
[Libkin and Wong, '93] "decompression" α operator: α(B) = A

The or-set of answer sets for frugal repairs of I for Q:
M_Q(I) = < {(a2, b3), (a3, b4)}, {(a2, b3), (a4, b4)} >
Compressed form (set of or-sets):
A_Q(I) = { < (a2, b3) >, < (a3, b4), (a4, b4) > }
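
The decompression operator amounts to picking one element from each or-set in every possible way. A minimal sketch, where or-sets are modelled as Python tuples and the function name `alpha` is our own choice:

```python
from itertools import product

def alpha(set_of_or_sets):
    """Decompress a set of or-sets into an or-set of sets (one choice per or-set)."""
    return {frozenset(choice) for choice in product(*set_of_or_sets)}

# B = {<1, 2>, <3, 4>} from the slide.
B = [(1, 2), (3, 4)]
A = alpha(B)

# The compressed form A_Q(I) from the slide.
AQ = [(("a2", "b3"),), (("a3", "b4"), ("a4", "b4"))]
MQ = alpha(AQ)
```

The compressed form has one entry per or-set, while the decompressed form can have as many sets as the product of the or-set sizes, which is where the space saving comes from.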

Slide18
Representability (1)
An or-set-of-sets S is representable if there exists a set-of-or-sets S0 (compression) such that:
  α(S0) = S
  For any distinct or-sets A, B in S0, the tuples in A and B use distinct constants in all coordinates
The compression of a representable set with active domain of size n has size polynomial in n

< {(a2, b3), (a3, b4)}, {(a2, b3), (a4, b4)} > has compression {< (a2, b3) >, < (a3, b4), (a4, b4) >}
< {(a2, b3), (a3, b4)}, {(a2, b2), (a4, b4)} > is not representable
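
The two conditions on a compression can be verified mechanically for a given candidate. The sketch below only checks a candidate S0 against an or-set-of-sets S; it does not search for a compression, so a failed check on one candidate is not by itself a proof of non-representability. The helper names are hypothetical.

```python
from itertools import combinations, product

def alpha(set_of_or_sets):
    """Decompress a set of or-sets into an or-set of sets (one choice per or-set)."""
    return {frozenset(choice) for choice in product(*set_of_or_sets)}

def coordinates_disjoint(compression):
    """Distinct or-sets must share no constant in any coordinate."""
    for a, b in combinations(compression, 2):
        for i in range(len(a[0])):
            if {t[i] for t in a} & {t[i] for t in b}:
                return False
    return True

def is_compression_of(compression, or_set_of_sets):
    """Check both conditions of the definition for one candidate compression."""
    return alpha(compression) == or_set_of_sets and coordinates_disjoint(compression)

S_good = {frozenset({("a2", "b3"), ("a3", "b4")}),
          frozenset({("a2", "b3"), ("a4", "b4")})}
S0 = [(("a2", "b3"),), (("a3", "b4"), ("a4", "b4"))]

S_bad = {frozenset({("a2", "b3"), ("a3", "b4")}),
         frozenset({("a2", "b2"), ("a4", "b4")})}
# An obvious candidate for S_bad: it decompresses to four sets, not two.
S0_bad = [(("a2", "b3"), ("a2", "b2")), (("a3", "b4"), ("a4", "b4"))]
```

For the second example the natural candidate over-generates: its decompression also contains {(a2, b3), (a4, b4)} and {(a2, b2), (a3, b4)}, which are not in the original or-set-of-sets.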

Slide19
Representability (2)
I |= Q iff the compression A_Q(I) is not empty
If we can compute A_Q(I) in polynomial time, deciding whether I |= Q is in PTIME

Theorem
If G[Q] is a strongly connected graph, M_Q(I) is representable and its compression can be computed in polynomial time in the size of I

Slide20
Outline
The Dichotomy Condition
Frugal Repairs & Representable Answers
Strongly Connected Graphs

Slide21
Cycles
Ck = R1(x1, x2), R2(x2, x3), …, Rk(xk, x1)
The purified instance contains a collection of disjoint SCCs

ALGORITHM FrugalC
1. Find the SCCs that contain no directed cycle of length > k
2. For each such SCC i, create an or-set Ai that contains all cycles of length k
3. Output A_Ck(I) = {A1, A2, …}

R(x, y): (a1, b1), (a2, b2), (a2, b3)
S(y, z): (b1, c1), (b2, c2), (b3, c2)
T(z, x): (c1, a1), (c2, a2)
[Figure: the instance graph, with one component over a1, b1, c1 and one over a2, b2, b3, c2]
A_C3(I) = {<(a1, b1, c1)>, <(a2, b2, c2), (a2, b3, c2)>}
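
The steps of FrugalC can be sketched for k = 3 on this instance. This is a rough illustration under several assumptions: the instance is already purified, constants of different attributes are distinct (so they can serve as graph nodes), and SCCs that contain no 3-cycle are skipped. The cycle-length check below is a brute-force enumeration of simple cycles, which is not polynomial in general and stands in for whatever the actual algorithm does. All function names are hypothetical.

```python
def reachable(start, adj):
    """Nodes reachable from `start` by directed paths."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def sccs(adj):
    """Strongly connected components via mutual reachability (fine for toy inputs)."""
    nodes = set(adj) | {v for vs in adj.values() for v in vs}
    reach = {v: reachable(v, adj) for v in nodes}
    comps, done = [], set()
    for v in sorted(nodes):
        if v not in done:
            comp = {u for u in reach[v] if v in reach[u]}
            comps.append(comp)
            done |= comp
    return comps

def longest_cycle(comp, adj):
    """Length of the longest simple directed cycle inside `comp` (0 if none)."""
    best = 0
    for start in comp:
        stack = [(start, (start,))]
        while stack:
            node, path = stack.pop()
            for nxt in adj.get(node, ()):
                if nxt == start:
                    best = max(best, len(path))
                elif nxt in comp and nxt not in path:
                    stack.append((nxt, path + (nxt,)))
    return best

def frugal_c3(R, S, T):
    """One or-set of (x, y, z) cycles per SCC whose simple cycles all have length 3."""
    adj = {}
    for u, v in R + S + T:
        adj.setdefault(u, set()).add(v)
    result = []
    for comp in sccs(adj):
        if 0 < longest_cycle(comp, adj) <= 3:
            cycles = {(a, b, c) for (a, b) in R for (b2, c) in S
                      if b2 == b and (c, a) in T and {a, b, c} <= comp}
            result.append(cycles)
    return result

R = [("a1", "b1"), ("a2", "b2"), ("a2", "b3")]
S = [("b1", "c1"), ("b2", "c2"), ("b3", "c2")]
T = [("c1", "a1"), ("c2", "a2")]
A_C3 = frugal_c3(R, S, T)
```

On the slide's instance the two SCCs yield one or-set each, matching A_C3(I) above.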

Slide22
General Case: SCCs (1)
Recursively split an SCC G into an SCC G' and a directed path P that intersects G' only at its start and end node
The set A_G'(I) can be recursively computed

[Figure: graph G with nodes x, y, z, t and edges R, S, T, U, V; the subgraph G' is marked]
The path P = y --> t --> z
A_G'(I) = {<(a1, b1, c1)>, <(a2, b2, c2), (a2, b3, c2)>}, with the two or-sets labeled A1 and A2

Slide23
General Case: SCCs (2)
A_G'(I) = {<(a1, b1, c1)>, <(a2, b2, c2), (a2, b3, c2)>}, with the two or-sets labeled A1 and A2

B(a, b): (A1, [a1b1c1]), (A2, [a2b2c2]), (A2, [a2b3c2])
B1^c(b, y): ([a1b1c1], b1), ([a2b2c2], b2), ([a2b3c2], b3)
B2^c(b, z): ([a1b1c1], c1), ([a2b2c2], c2), ([a2b3c2], c2)
B0^c(z, a): (c1, A1), (c2, A2)
Any value belongs to a unique or-set

[Figure: graph with nodes a, b, y, t, z and edges B, B1^c, U, V, B2^c, B0^c]
Replacement of G':
A cycle C = a -> b -> y -> t -> z -> a
+ a chord B2^c that is a consistent relation

Slide24
Rest Of the Proof
PTIME algorithm for splittable graphs
  Find a separator in G[Q] (always exists if a graph is splittable)
  The separator splits G[Q] into cases with fewer inconsistent edges, which are solved recursively
  Base case: all edges are consistent (check whether Q(I) is true)
coNP-hardness
  Reduction from the Monotone-3SAT problem

Slide25
Conclusions
Significant progress towards proving the dichotomy for the complexity of Certain Query Answering for Conjunctive Queries
Settle the dichotomy (or trichotomy) even for queries with self-joins!

Slide26
Thank you!