A Dichotomy on The Complexity of Consistent Query Answering

Uploaded On 2016-04-11




Presentation Transcript

Slide1

A Dichotomy on The Complexity of Consistent Query Answering for Atoms With Simple Keys

Paris Koutris, Dan Suciu

University of Washington

Slide2

Repairs

An uncertain instance I for a schema with key constraints

A repair r of I is a subinstance of I that satisfies the key constraints and is maximal
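The definition of a repair can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it assumes a relation is a list of tuples whose key is the first `key_len` attributes, and the helper name `repairs` is hypothetical. The data is the relation R(x, y) used in this slide's example.

```python
from itertools import product

def repairs(relation, key_len=1):
    """Yield every repair of a relation whose key is its first key_len attributes."""
    blocks = {}
    for t in relation:
        # Group the tuples that agree on the key (they conflict with each other).
        blocks.setdefault(t[:key_len], []).append(t)
    # A repair keeps exactly one tuple per key value: consistent and maximal.
    for choice in product(*blocks.values()):
        yield frozenset(choice)

R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]
all_repairs = list(repairs(R))
for r in all_repairs:
    print(sorted(r))
```

The number of repairs is the product of the sizes of the key groups, so it can grow exponentially with the size of the instance.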

R(x, y) (key: x)
(a1, b1)
(a1, b2)
(a2, b2)
(a3, b3)
(a3, b4)
(a4, b4)

The 4 possible repairs:
(a1, b1) (a2, b2) (a3, b4) (a4, b4)
(a1, b1) (a2, b2) (a3, b3) (a4, b4)
(a1, b2) (a2, b2) (a3, b4) (a4, b4)
(a1, b2) (a2, b2) (a3, b3) (a4, b4)

Slide3

Consistent Query Answering

If Q is boolean, we say that I is certain for Q, written I |= Q, if Q(r) is true for every repair r of I
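Read literally, the definition gives a brute-force test: enumerate every repair and evaluate Q on each. The sketch below does this for the query Q() = R(x, y), S(y, z) of this slide's example. It assumes that the key of each relation is its first attribute, and the helper names are hypothetical. It runs in exponential time in general, which is exactly what the rest of the talk tries to avoid.

```python
from itertools import product

def repairs(relation):
    """Yield all repairs of a relation whose key is its first attribute."""
    blocks = {}
    for t in relation:
        blocks.setdefault(t[0], []).append(t)
    for choice in product(*blocks.values()):
        yield set(choice)

def q(r_rep, s_rep):
    """Boolean query Q() = R(x, y), S(y, z) evaluated on one repair."""
    return any(y == y2 for (_, y) in r_rep for (y2, _) in s_rep)

def is_certain(R, S):
    """I |= Q iff Q holds in every repair (a repair of I picks one repair per relation)."""
    return all(q(r, s) for r in repairs(R) for s in repairs(S))

R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "c1"), ("b2", "c1"), ("b2", "c2"), ("b3", "c3")]
print(is_certain(R, S))
```

On the slide's instance the check succeeds: whichever tuple is kept for a1, its y-value (b1 or b2) always has a matching tuple in S.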

R(x, y) (key: x)
(a1, b1)
(a1, b2)
(a2, b2)
(a3, b3)
(a3, b4)
(a4, b4)

S(y, z) (key: y)
(b1, c1)
(b2, c1)
(b2, c2)
(b3, c3)

Q() = R(x, y), S(y, z)

I |= Q

Slide4

Problem Statement

CERTAINTY(Q): Given as input an instance I, does I |= Q, where Q is a boolean CQ?

In general, CERTAINTY(Q) is in coNP

Q1 = R(x, y), S(y, z): expressible as a first-order query
Q2 = R(x, y), S(z, y): coNP-complete
Q3 = R(x, y), S(y, x): PTIME but not first-order expressible

Conjecture
For every boolean conjunctive query Q, CERTAINTY(Q) is either in PTIME or coNP-complete

Slide5

Progress so Far

[Wijsen, 2010]
Syntactic characterization of FO-expressible acyclic CQs w/o self-joins

[Kolaitis and Pema, 2012]
A trichotomy for CQs with 2 atoms and no self-joins

[Wijsen, 2010 & 2013]
PTIME algorithm for cyclic queries: Ck = R1(x1, x2), …, Rk(xk, x1)
Further classification of acyclic CQs w/o self-joins

Slide6

Our Contribution

A dichotomy for CQs w/o self-joins where atoms have either
Simple keys: R(x, y, z)
Keys that consist of all attributes: S(x, y, z)

Theorem
For every boolean CQ Q w/o self-joins where for each atom the key consists of either one attribute or all attributes, CERTAINTY(Q) is either in PTIME or coNP-complete

Slide7

Outline

The Dichotomy Condition
Frugal Repairs & Representable Answers
Strongly Connected Graphs

Slide8

The Query Graph

We equivalently study boolean CQs consisting only of binary relations where one attribute is the key: R(x, y)

Relations can be consistent (Rc) or inconsistent (Ri)

Query Graph: a directed edge (u, v) for each atom R(u, v)

Q = Ri(x, y), Si(z, w), Tc(y, w)

[Figure: the query graph G[Q] with nodes x, y, z, w and edges R: x -> y, S: z -> w, T: y -> w. For an edge R, uR denotes its source node and vR its end node.]

Slide9

Definitions

x+,R: the set of nodes reachable from node x (through a directed path) once we remove the edge R

R ~ S [source-equivalent]: the source nodes uR, uS are in the same SCC

[R]: the equivalence class of R w.r.t. ~

[Figure: example graph with nodes x, y, z, u, v, w and edges R, S, T, U, V]

x+,R = {x, v, w}
R ~ T and [R] = {R, T}

Slide10

Coupled Edges

coupled+(R) = the edges in [R] + any inconsistent edge S s.t. the source node uS is connected to the end node vR through an (undirected) path that does not intersect uR+,R

[Figure: the example graph with x = uR, y = vR, u = uV, and the set uR+,R highlighted]

coupled+(R):
contains R, T: [R] = {R, T}
contains V: path from y (= vR) to u (= uV)
does not contain U

Slide11

Splittable Graphs

Two inconsistent edges R, S are coupled if S in coupled+(R) & R in coupled+(S)

A graph G[Q] is:
unsplittable if it contains a pair of coupled edges that are not source-equivalent
splittable otherwise

[Figure: the example graph with edges R, S, T, U, V]

coupled+(R) = {R, T, V}
coupled+(T) = {R, T, V}
coupled+(V) = {V}
coupled+(U) = {U, V, R, T}

Only R, T are coupled

SPLITTABLE!

Slide12

The Dichotomy Condition

Dichotomy Theorem
If G[Q] is splittable, CERTAINTY(Q) is in PTIME
If G[Q] is unsplittable, CERTAINTY(Q) is coNP-complete

[Figure: the example graph with nodes x, y, z, u, v, w and edges R, S, T, U, V]

Splittable, so in PTIME

Slide13

Examples

PTIME: R(x, y), S(y, z)

coNP-complete: R(x, y), S(y, z), Tc(x, z)

PTIME: R(x, y), S(y, z), Uc(z, y)

coNP-complete: R(x, y), S(z, y), Uc(y, z)

[Figures: the query graph of each example over the nodes x, y, z]
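The four examples can be checked mechanically against the definitions of x+,R, source-equivalence, coupled+ and splittability from the preceding slides. The sketch below is a direct transcription of those definitions as worded on the slides, not the authors' implementation. It rests on several assumptions: an atom is encoded as a tuple `(name, source, end, consistent)`, the endpoints of the undirected path count as part of the path, and all function names are hypothetical.

```python
from itertools import combinations

# An atom is (name, source, end, consistent); ("T", "x", "z", True) stands for Tc(x, z).

def reach(edges, start, removed=None):
    """Nodes reachable from start by directed paths, ignoring the edge `removed`."""
    seen, stack = {start}, [start]
    while stack:
        n = stack.pop()
        for e in edges:
            if e != removed and e[1] == n and e[2] not in seen:
                seen.add(e[2])
                stack.append(e[2])
    return seen

def same_scc(edges, a, b):
    """True if nodes a and b lie in the same strongly connected component."""
    return b in reach(edges, a) and a in reach(edges, b)

def connected_avoiding(edges, a, b, forbidden):
    """Is there an undirected path from a to b whose nodes all avoid `forbidden`?"""
    if a in forbidden or b in forbidden:
        return False
    seen, stack = {a}, [a]
    while stack:
        n = stack.pop()
        for _, u, v, _ in edges:
            for p, q in ((u, v), (v, u)):
                if p == n and q not in seen and q not in forbidden:
                    seen.add(q)
                    stack.append(q)
    return b in seen

def coupled_plus(edges, r):
    """coupled+(R): [R] plus inconsistent edges S whose source reaches vR outside uR+,R."""
    blocked = reach(edges, r[1], removed=r)                      # the set uR+,R
    out = {e[0] for e in edges if same_scc(edges, r[1], e[1])}   # the class [R]
    out |= {s[0] for s in edges
            if not s[3] and connected_avoiding(edges, s[1], r[2], blocked)}
    return out

def is_splittable(edges):
    """Unsplittable iff some coupled pair of inconsistent edges is not source-equivalent."""
    inconsistent = [e for e in edges if not e[3]]
    for r, s in combinations(inconsistent, 2):
        coupled = s[0] in coupled_plus(edges, r) and r[0] in coupled_plus(edges, s)
        if coupled and not same_scc(edges, r[1], s[1]):
            return False
    return True

examples = {
    "R(x,y), S(y,z)": [("R", "x", "y", False), ("S", "y", "z", False)],
    "R(x,y), S(y,z), Tc(x,z)": [("R", "x", "y", False), ("S", "y", "z", False),
                                ("T", "x", "z", True)],
    "R(x,y), S(y,z), Uc(z,y)": [("R", "x", "y", False), ("S", "y", "z", False),
                                ("U", "z", "y", True)],
    "R(x,y), S(z,y), Uc(y,z)": [("R", "x", "y", False), ("S", "z", "y", False),
                                ("U", "y", "z", True)],
}
for query, edges in examples.items():
    print(query, "->", "PTIME" if is_splittable(edges) else "coNP-complete")
```

Under these assumptions the sketch reproduces the classification shown on the slide: the first and third queries come out splittable, the second and fourth unsplittable.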

Slide14

Outline

The Dichotomy Condition
Frugal Repairs & Representable Answers
Strongly Connected Graphs

Slide15

Frugal Repairs (1)

Definition
A repair r of an instance I is frugal for a boolean query Q if for any other repair r’ of I, Qf(r’) is not strictly contained in Qf(r)

R(x, y) (key: x)
(a1, b1)
(a1, b2)
(a2, b3)
(a3, b4)
(a4, b4)

S(y, x) (key: y)
(b1, a1)
(b3, a2)
(b4, a3)
(b4, a4)

repair r1 = { R(a1, b1), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
Qf(r1) = { (a1, b1), (a2, b3), (a3, b4) }
not frugal

repair r2 = { R(a1, b2), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
Qf(r2) = { (a2, b3), (a3, b4) }
frugal
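The frugality test can be sketched by brute force: compute the full-query answers of every repair and keep the answer sets that have no strict subset among the others. The query is not written out on this slide, so the sketch assumes Q = R(x, y), S(y, x), which is consistent with the answer sets shown, and assumes that the key of each relation is its first attribute. The function names are hypothetical.

```python
from itertools import product

def repairs(relation):
    """Yield all repairs of a relation whose key is its first attribute."""
    blocks = {}
    for t in relation:
        blocks.setdefault(t[0], []).append(t)
    for choice in product(*blocks.values()):
        yield frozenset(choice)

def full_answers(r_rep, s_rep):
    """Qf(r) for Q = R(x, y), S(y, x): all (x, y) bindings that satisfy the body."""
    return frozenset((x, y) for (x, y) in r_rep if (y, x) in s_rep)

def frugal_answer_sets(R, S):
    """Answer sets of the frugal repairs, i.e. the subset-minimal answer sets."""
    answer_sets = {full_answers(r, s) for r in repairs(R) for s in repairs(S)}
    # `b < a` is Python's strict-subset test on frozensets.
    return {a for a in answer_sets if not any(b < a for b in answer_sets)}

R = [("a1", "b1"), ("a1", "b2"), ("a2", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "a1"), ("b3", "a2"), ("b4", "a3"), ("b4", "a4")]
for answers in frugal_answer_sets(R, S):
    print(sorted(answers))
```

On the slide's instance, four repairs exist but only two distinct minimal answer sets survive, both without (a1, b1): choosing R(a1, b2) breaks the only join that a1 could take part in.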

Qf = Q with all body variables moved to the head (full query)

Slide16

Frugal Repairs (2)

R(x, y) (key: x)
(a1, b1)
(a1, b2)
(a2, b3)
(a3, b4)
(a4, b4)

S(y, x) (key: y)
(b1, a1)
(b3, a2)
(b4, a3)
(b4, a4)

I |= Q if and only if every frugal repair satisfies Q

We lose no generality if we study only frugal repairs!

Only two frugal repairs:
Qf(r2) = {(a2, b3), (a3, b4)}
Qf(r3) = {(a2, b3), (a4, b4)}

Slide17

Or-Sets

Efficiently represent all answer sets of frugal repairs

We use or-sets: <1, 2, 3> means 1 or 2 or 3

A = < {1, 3}, {1, 4}, {2, 3}, {2, 4} >

We can “compress” A as B = {<1, 2>, <3, 4>}

[Libkin and Wong, ‘93] “decompression” α operator: α(B) = A

The or-set of answer sets for frugal repairs of I for Q:
MQ(I) = < {(a2, b3), (a3, b4)}, {(a2, b3), (a4, b4)} >

Compressed form (set of or-sets):
AQ(I) = { < (a2, b3) >, < (a3, b4), (a4, b4) > }

Slide18

Representability (1)

An or-set-of-sets S is representable if there exists a set-of-or-sets S0 (compression) such that:
α(S0) = S
For any distinct or-sets A, B in S0, the tuples in A and B use distinct constants in all coordinates

The compression of a representable set with active domain of size n has size polynomial in n

< {(a2, b3), (a3, b4)}, {(a2, b3), (a4, b4)} > has the compression {< (a2, b3) >, < (a3, b4), (a4, b4) >}

< {(a2, b3), (a3, b4)}, {(a2, b2), (a4, b4)} > is not representable

Slide19

Representability (2)

I |= Q iff the compression AQ(I) is not empty

If we can compute AQ(I) in polynomial time, deciding whether I |= Q is in PTIME
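The relation between the compressed form and the or-set it represents can be sketched with a Cartesian product: α picks one alternative from every or-set. In this illustration an or-set is modelled as a tuple of alternatives and the compression as a list of such tuples. These encodings and the function names are assumptions, not notation from the slides.

```python
from itertools import product

def alpha(compressed):
    """Decompress a set of or-sets into the or-set of sets it represents.

    Picking one alternative from every or-set yields one member of the result.
    """
    return {frozenset(choice) for choice in product(*compressed)}

def is_certain(compressed):
    """Per the slide: I |= Q iff the compression AQ(I) is not empty."""
    return len(compressed) > 0

# B = {<1, 2>, <3, 4>} from the Or-Sets slide.
B = [(1, 2), (3, 4)]
# AQ(I) = { <(a2, b3)>, <(a3, b4), (a4, b4)> } from the frugal-repairs example.
A_Q = [(("a2", "b3"),), (("a3", "b4"), ("a4", "b4"))]

print(alpha(B))
print(alpha(A_Q), is_certain(A_Q))
```

With this encoding, an empty compression decompresses to the single empty answer set (`alpha([])` is `{frozenset()}`): some frugal repair has no answers, so Q is not certain, which matches the emptiness test above.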

Theorem
If G[Q] is a strongly connected graph, MQ(I) is representable and its compression can be computed in time polynomial in the size of I

Slide20

Outline

The Dichotomy Condition
Frugal Repairs & Representable Answers
Strongly Connected Graphs

Slide21

Cycles

Ck = R1(x1, x2), R2(x2, x3), …, Rk(xk, x1)

The purified instance contains a collection of disjoint SCCs

ALGORITHM FrugalC
Find the SCCs that contain no directed cycle of length > k
For each such SCC i, create an or-set Ai that contains all cycles of length k
Output ACk(I) = {A1, A2, …}

R(x, y)
(a1, b1)
(a2, b2)
(a2, b3)

S(y, z)
(b1, c1)
(b2, c2)
(b3, c2)

T(z, x)
(c1, a1)
(c2, a2)

[Figure: the instance drawn as a graph with two SCCs, {a1, b1, c1} and {a2, b2, b3, c2}]
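The three steps of FrugalC can be illustrated on this instance for k = 3. The sketch below is a naive illustration only: it assumes the instance has already been purified, and it finds long cycles by exhaustive depth-first search, which is not polynomial in general and is not claimed to be the procedure used in the paper. The function name `frugal_c3` is hypothetical.

```python
def frugal_c3(R, S, T):
    """Brute-force sketch of FrugalC for C3 = R(x, y), S(y, z), T(z, x)."""
    k = 3
    succ = {}
    for a, b in list(R) + list(S) + list(T):
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set())

    def reach(n):
        seen, stack = {n}, [n]
        while stack:
            for m in succ[stack.pop()]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return seen

    reachable = {n: reach(n) for n in succ}
    # Two nodes share an SCC when each one reaches the other.
    sccs = {frozenset(m for m in reachable[n] if n in reachable[m]) for n in succ}

    def longest_cycle(start, node, visited):
        """Length of the longest simple directed cycle through start (naive DFS)."""
        best = 0
        for m in succ[node]:
            if m == start:
                best = max(best, len(visited))
            elif m not in visited:
                best = max(best, longest_cycle(start, m, visited | {m}))
        return best

    result = set()
    for scc in sccs:
        # Step 1: skip SCCs that contain a directed cycle of length > k.
        if any(longest_cycle(n, n, {n}) > k for n in scc):
            continue
        # Step 2: collect the cycles of length k in this SCC as (x, y, z) tuples.
        cycles = {(a, b, c) for (a, b) in R for (b2, c) in S if b2 == b
                  for (c2, a2) in T if c2 == c and a2 == a and a in scc}
        if cycles:
            result.add(frozenset(cycles))
    # Step 3: the output is the set of or-sets {A1, A2, ...}.
    return result

R = [("a1", "b1"), ("a2", "b2"), ("a2", "b3")]
S = [("b1", "c1"), ("b2", "c2"), ("b3", "c2")]
T = [("c1", "a1"), ("c2", "a2")]
print(frugal_c3(R, S, T))
```

Each inner frozenset plays the role of one or-set Ai, so the two SCCs of the example yield two or-sets.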

AC3(I) = {<(a1, b1, c1)>, <(a2, b2, c2), (a2, b3, c2)>}

Slide22

General Case: SCCs (1)

Recursively split a SCC G into a SCC G’ and a directed path P that intersects G’ only at its start and end node

The set AG’(I) can be recursively computed

[Figure: graph G with nodes x, y, z, t and edges R, S, T, U, V; the subgraph G’ and the path P = y --> t --> z are marked]

AG’(I) = {<(a1, b1, c1)>, <(a2, b2, c2), (a2, b3, c2)>}
The two or-sets are labelled A1 and A2

Slide23

General Case: SCCs (2)

AG’(I) = {<(a1, b1, c1)>, <(a2, b2, c2), (a2, b3, c2)>}, with the two or-sets labelled A1 and A2

B(a, b)
(A1, [a1b1c1])
(A2, [a2b2c2])
(A2, [a2b3c2])

B1c(b, y)
([a1b1c1], b1)
([a2b2c2], b2)
([a2b3c2], b3)

B2c(b, z)
([a1b1c1], c1)
([a2b2c2], c2)
([a2b3c2], c2)

B0c(z, a)
(c1, A1)
(c2, A2)

Any value belongs to a unique or-set

[Figure: G’ is replaced by the new nodes a, b and the edges B, B1c, B2c, B0c; the path y -> t -> z with edges U, V is kept]

Replacement of G’: a cycle C = a -> b -> y -> t -> z -> a + a chord B2c that is a consistent relation

Slide24

Rest Of the Proof

PTIME algorithm for splittable graphs
Find a separator in G[Q] (always exists if a graph is splittable)
The separator splits G[Q] into cases with fewer inconsistent edges, which are solved recursively
Base case: all edges are consistent (check whether Q(I) is true)

coNP-hardness
Reduction from the Monotone-3SAT problem

Slide25

Conclusions

Significant progress towards proving the dichotomy for the complexity of Certain Query Answering for Conjunctive Queries

Settle the dichotomy (or trichotomy) even for queries with self-joins!

Slide26

Thank you!