Yin Yang Dimitris Papadias Stavros Papadopoulos H KUST Hong Kong Panos Kalnis KAUST Saudi Arabia Providence USA 2009 Database Outsourcing Advantages The data owner does not need the hardware software personnel to run a DBMS ID: 218076
Download Presentation The PPT/PDF document "Authenticated Join Processing in Outsour..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Slide1
Authenticated Join Processing in Outsourced Databases
Yin Yang, Dimitris Papadias, Stavros PapadopoulosHKUST, Hong KongPanos KalnisKAUST, Saudi Arabia
Providence, USA, 2009Slide2
Database Outsourcing
AdvantagesThe data owner does not need the hardware / software / personnel to run a DBMSThe service provider achieves economy of scaleThe client enjoys better quality of serviceA main challenge
The service provider is not trusted, and may return incorrect query results
2Slide3
Query Authentication
The owner
signs its data with a digital signature scheme
Given a query, the
service provider attaches a VO
(
Verification Object
) to the
results
The client verifies query results with the
VO
and the owner’s signature
soundness
completeness
3Slide4
Example Queries
Range:
σ
quantity>100Purchase
Join:
Purchase
cid
Customer
Range & Join :(
σ
quantity
>100
Purchase)
cid(σ
city=“New York”Customer
)
4Slide5
State of the ArtRange authentication: many solutionsJoin authentication: few proposals
Materializing join results into viewsAINL (presented in detail later)Joins are inherently more complex than rangesA join combines information from multiple tablesOnly individual tables are signed5Slide6
Previous WorkMulti-dimensional range authentication Y. Yang, S. Papadopoulos, D. Papadias, G. Kollios (BU)
ICDE’08, VLDB J.Continuous range authenticationS. Papadopoulos, Y. Yang, D. PapadiasVLDB’07, VLDB J.Novel authentication frameworkS. Papadopoulos, D. Saccharidis, D. PapadiasICDE’096Slide7
Background
Concepts in CryptographyAuthenticated Data Structure (ADS)Merkle Hash TreeMB-TreeAINL7Slide8
Concepts in Cryptography
One-way, collision-resistant hash functionsh = H(
m)
Computationally infeasible to infer m from
h, or to find two
m
1
,
m
2
with the same hash value
h
Example: SHA1, SHA2, …
Public-key encryptionTwo keys: private key
sk, public key pk
Public key to encrypt, private key to decryptExample: RSADigital SignatureHard to forge without the secret key
Signing:
s
=
encrypt
(
H
(
m), sk)
Verifying: check if H(m) = decrypt
(s, pk)
8Slide9
Merkle Hash Tree (Merkle, Crypto’
89)A binary tree with hash values satisfying hn
= H(
hn.lc
| h
n.rc
)
Authenticates 1D range queries
Example: a query
Q
retrieves
d
4
, d
5
VO(Q) = {s
root
,
h
1-2
,
d
3
, d4, d
5, d6, h
7-8} The client re-constructs hRoot bottom-up, and verifies the signature
9Slide10
Merkle B-Tree (Li et al. SIGMOD
’06)Merkle Hash Tree + B-TreeConceptually, a Merkle Hash Tree with a large fanout (>2)10Slide11
AINL
For binary joinsRequires ADS on the join attribute of the inner relationReduces a join query into multiple rangesAlgorithmFor every tuple in the outer relation Perform an authenticated range on the inner relation11Slide12
Example of AINL
12
r
1
r
1
,
h
F
, h
10
, s
11
, s
12
,
h
E
r
2
,
h
1
, s
2
, s
3
, s
4
, h
5
, h
6
,
h
C
,
h
G
…
r
2Slide13
Drawbacks of AINL
Large VO size|R| records from R (outer relation)2|R|+|RS| records from
|S| (inner relation)Numerous hash values
Often larger than the combined size of R and
SHigh computation overhead at the server and the client
13Slide14
NAI: A Naïve solution
The server transmits all the data to the clientThe client performs the join locallyNAI often outperforms AINL14Slide15
Proposed MethodsBinary join authenticationAISM: requires ADS on one relation
AIM: requires ADSs on both relationsASM: requires no ADSComplex join query authenticationMulti-way joinSelect-project-join15Slide16
AISM: Query Processing
Sort the outer relation R on the join attributeTransmit all tuples in R to the client in their verifiable orderTransmit the sort order
R of
R tuples on the join attributeIncrementally traverse the ADS on
S once with the R
records
16Slide17
Example of AI
SM
17
R
[2
]=4
VO
: signature
of
R
, root signature of
T
S
,
r
1
-r
6
in their verifiable order
R
[1],
h1, s
2, s3, s4;
R
[2],
h
5
, h
6
,
h
C
, s
10
, s
11
, s
12
;
R
[3];
R
[4];
R
[5],
h
13
, h
14
, s
15
;
R
[6];
R
[1
]=2
R
[3
]=6
R
[4
]=1
R
[5
]=3
R
[6
]=5
r
2
r
1
r
3
r
4
r
6
r
5Slide18
AISM: Verification
The client checksR recordscorrectness of the sort order R of R
boundary recordswhether the re-constructed root hash of
TS
matches its signature
18Slide19
AIM
Query processingRequire ADSs on both relationsStart with one relation R, traverse its ADS T
R down to the first tuple r
1
Traverse T
S
until reaching the right boundary record
s
of
r
1
Traverse
TR until reaching the right boundary record r
of s
Alternatively traverse TS and TR similarly to the above
Verification: similar to AISM
19Slide20
Example of AIM
20
VO
: root signature of
T
S
, root signature of
T
R
,
r
1
h
s
1
,
s
2
,
s
3
,
s
4
;
r
2
;
h
s
5
,
h
s
6
,
h
C
,
s
10
,
s
11
,
s
12
;
r
3
,
r
4
;
r
5
;
h
s
13
,
h
s
14
,
s
15
;
h
r
6
;Slide21
ASM
IdeaSort-Merge-Join, sort at the server, merge at the clientQuery processingRequire no ADS
Transmit both R and
S in their verifiable order
Sort R and
S
respectively on the join attribute
Transmit the sort orders of
R
and
S
to the client
Transmit bitmaps
BR and B
S to the client, indicating the tuples with join partners
Verificationcorrectness of the base relations / sort-orders / the bitmaps21Slide22
Complex Query Authentication
Multi-way joinsSelection-Projection-Join queries22Slide23
Build a tree of binary join operators
m-ASM / m-AISM / m-AIM optimized for multi-way joinsExample:
A specialized algorithm AST applies when all relations are joined on the same attribute
One single VO
Multi-Way Join
23Slide24
Example of m-AIM and
m-AISM
VO
(
RS
):{root signature of
T
R
and
T
S
,
s
1
,
s2; hA
,
r
4
,
r
5
,
r6; s
3; s4;
s5; hC
}
VO
(
RST
):{root signature of
T
T
,
[1],
t
1
,
t
2
;
[2];
[3];
[4];
h
t3
}
24Slide25
Example of AST
25Slide26
Selection-Projection-Join Query
26
Selection
Use the m-
algorithms for joinsProjectionBuild a Merkle Hash Tree for each recordQuery optimizationSlide27
Experiments
27Three synthetic relations R(
a1
, a
2)
S
(
a
1
,
a
2
,
b
1,
b2)T(
b
1
,
b
2
)
Queries
R a1 S
R a2 S
(R a1 S)
b
1
T
(
R
a
2
S
)
b
2
T
Foreign keys
S.a
1
references
R.a
1
S.b
1
references
T.b
1
Parameters
Tuple size
Cardinality of
|S|Slide28
Repeatability and WorkabilityWe participated in the ACM SIGMOD 2009 Repeatability & Workability Evaluation (cf.,
http://homepages.cwi.nl/~manegold/SIGMOD-2009-RWE/). The reviewers were able to repeat all the experiments presented in our paper,yielding results that match the ones published in our paper, except from insignificant and to be expected variation due to randomness and/or hardware/software differences. The detailed reports will shortly be made publicly available by ACM SIGMOD.
28Slide29
Evaluations of AINL
29Tuple size (bytes)
32
64
128
256
512
C
VO
(
Gbytes
)
8.9
9.0
9.2
9.6
10.3
C
Client
(seconds)
205
207
210
214
219
C
DSP
(seconds)
262
271
429
1728
4603
|R| / |S|
0.1
0.5
1
2
5
C
VO
(
Gbytes
)
7.8
8.9
9.2
9.5
9.7
C
Client
(seconds)
196
205
210
218
223
C
DSP
(seconds)
296
311
429
540
647Slide30
Binary Join: Effect of Tuple Size
30Slide31
Binary Join: Effect of |R| / |S|
31Slide32
Multi-way Join: Effect of Tuple Size
32Slide33
Multi-way Join: Effect of |S| / |R|
33Slide34
Conclusion
Binary join authenticationAISM: authenticated structure on one relationAIM: authenticated structures on both relationsASM: no authenticated structureComplex query authenticationMulti-way join: eliminate unnecessary intermediate VO elementsSelection-projection-join queryFuture WorkAuthenticated Structures specialized to joins
Hash join instead of SMJ34Slide35
Thank you!
Questions?35