Slide1
Lower bounds for approximate membership dynamic data structures
Shachar Lovett, IAS
Ely Porat, Bar-Ilan University
Synergies in lower bounds, June 2011
Slide2
Information theoretic lower bounds
Information theory is a powerful tool for proving lower bounds, e.g. in data structures
Study the size of the data structure (unlimited access)
Static d.s.: pure information theory
Dynamic d.s.: communication game
Slide3
Talk overview
Approximate set membership problem
Bloom filters (simple near-optimal solution)
Lower bounds – static case
New dynamic lower bounds
Slide4
Talk overview
Approximate set membership problem
Bloom filters (simple near-optimal solution)
Lower bounds – static case
New dynamic lower bounds
Slide5
Approximate set membership
Large universe U
Represent subset S ⊆ U
Query: is x ∈ S?
Data structure representing S approximately:
If x ∈ S: answer YES always
If x ∉ S: answer NO with high probability
Why approximately? To save space
[Figure: universe U, set S, approximating set ~S]
Slide6
Applications
Storage (or communication) is costly, but a small false positive error can be tolerated
Original applications (70’s): dictionaries, databases – Bloom filters
Nowadays: mainly network applications
Slide7
Talk overview
Approximate set membership problem
Bloom filters (simple near-optimal solution)
Lower bounds – static case
New dynamic lower bounds
Slide8
Bloom filters
S = {x1, x2, …, xn}
Bit array of length m: 0 0 0 0 0 0 0 0 0 0
Hash function h: U → {1,…,m}
Slide9
Bloom filters
S = {x1, x2, …, xn}
Bit array of length m: 0 0 0 1 0 0 0 0 0 0
Hash function h: U → {1,…,m}
h(x1) = 4
Slide10
Bloom filters
S = {x1, x2, …, xn}
Bit array of length m: 1 0 0 1 0 0 0 0 0 0
Hash function h: U → {1,…,m}
h(x2) = 1
Slide11
Bloom filters
S = {x1, x2, …, xn}
Bit array of length m: 1 0 0 1 0 0 0 0 0 0
Hash function h: U → {1,…,m}
h(x3) = 4
Slide12
Bloom filters
S = {x1, x2, …, xn}
Bit array of length m: 1 0 0 1 0 0 0 0 0 0
Hash function h: U → {1,…,m}
Query: y ∈ S?
Slide13
Bloom filters
S = {x1, x2, …, xn}
Bit array of length m: 1 0 0 1 0 0 0 0 0 0
Hash function h: U → {1,…,m}
Query: y ∈ S?
h(y) = 3
Slide14
Bloom filters
S = {x1, x2, …, xn}
Bit array of length m: 1 0 0 1 0 0 0 0 0 0
Hash function h: U → {1,…,m}
Query: y ∈ S?
h(y) = 3, bit 3 is 0, so the answer is NO
Slide15
Bloom filters: analysis
S = {x1, x2, …, xn}
Query: y ∈ S?
If y ∈ S: returns YES always
If y ∉ S: returns NO with probability (1 − 1/m)^n ≈ e^(−n/m)
Error ½: m ≈ n / ln 2 ≈ 1.44 n
Error ε: m ≈ 1.44 n log2(1/ε) (repetition)
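The single-hash construction and the repetition step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the class name `RepeatedBloomFilter`, the SHA-256 based hash, and the parameters `n` and `eps` (the false positive rate ε) are all introduced here. Each bit array has length about n / ln 2, which gives error about ½, and the array is repeated ⌈log2(1/ε)⌉ times with independent hashes.

```python
import hashlib
import math


class RepeatedBloomFilter:
    """Single-hash bit arrays, repeated r times to push the error from ~1/2 to ~eps."""

    def __init__(self, n: int, eps: float):
        self.m = math.ceil(n / math.log(2))      # per-array length, error about 1/2
        self.r = math.ceil(math.log2(1 / eps))   # number of repetitions
        # One byte per logical bit, for readability only.
        self.arrays = [bytearray(self.m) for _ in range(self.r)]

    def _h(self, i: int, x: str) -> int:
        """Hypothetical hash for repetition i, mapping into {0, ..., m-1}."""
        digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.m

    def insert(self, x: str) -> None:
        for i, bits in enumerate(self.arrays):
            bits[self._h(i, x)] = 1

    def query(self, y: str) -> bool:
        # YES only if every repetition has the bit set: no false negatives.
        return all(bits[self._h(i, y)] for i, bits in enumerate(self.arrays))

    def total_bits(self) -> int:
        return self.m * self.r


bf = RepeatedBloomFilter(n=1000, eps=0.01)
for i in range(1000):
    bf.insert(f"item-{i}")

# Empirical false positive rate on elements that were never inserted.
false_positives = sum(bf.query(f"other-{i}") for i in range(10000))
fp_rate = false_positives / 10000
```

With these toy parameters the structure uses `bf.total_bits()` logical bits, roughly 1.44 · n · log2(1/ε) up to rounding, and `fp_rate` should come out near or below the target ε.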
[Figure: bit array of length m, 1 0 0 1 0 0 0 0 0 0, addressed by the hash]
Slide16
Known bounds
Upper bounds (e.g. algorithms)
Bloom filter: m ≈ 1.44 n log2(1/ε)
Improvements: [Porat-Matthias’03, Arbitman-Naor-Segev’10]
Lower bounds:
Information theoretic: m ≥ n log2(1/ε), where ε is the false positive error
Can be matched by static data structures [Charles-Chellapilla’08, Dietzfelbinger-Pagh’08, Porat’08]
This work: dynamic d.s.
Slide17
Talk overview
Approximate set membership problem
Bloom filters (simple near-optimal solution)
Lower bounds – static case
New dynamic lower bounds
Slide18
Static lower bounds
Static settings: insert + query
Yao’s min-max principle: prove lower bound for deterministic data structure, randomized inputs
[Diagram: Insert: x1,…,xn → m bits → Query: y]
Slide19
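Static lower bounds
Deterministic data structure: compression maps all sets S ⊆ U of size n to a small family of sets
Input: random set S
Accept set: A(S) = {y : the query on y answers YES}
Properties:
Small memory: at most 2^m distinct accept sets
No false negatives: S ⊆ A(S)
Few false positives: A(S) \ S is a small fraction of U
Optimal setting:
[Diagram: Insert: x1,…,xn → m bits → Query: y]
Slide20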
Static lower bounds
[Diagram: Insert: x1,…,xn → m bits → Query: y]
[Figure: universe U, set S, accept set A(S)]
Set S, represented by A(S)
Goal: show #A(S) large
Slide21
Static lower bounds
Properties:
Assume that […]
If […] then […]
General case: convexity
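A standard counting argument in the same spirit can be sketched as follows. It is an illustration under simplifying assumptions, not a transcription of the slide: write u = |U|, let ε be the false positive rate, and assume every accept set satisfies |A(S)| ≤ εu with εu an integer and εu ≥ n. The symbols u and ε and this worst-case assumption are introduced here; the slide's general case goes through convexity instead.

```latex
% Illustrative assumptions: u = |U|, |A(S)| <= eps*u for every S, eps*u >= n.
\begin{align*}
\#\{A(S)\} &\le 2^{m}
  && \text{(the memory has $m$ bits)}\\
\#\{S : |S| = n,\ S \subseteq A\} &\le \binom{\varepsilon u}{n}
  && \text{(no false negatives, $|A| \le \varepsilon u$)}\\
\binom{u}{n} &\le 2^{m} \binom{\varepsilon u}{n}
  && \text{(every $S$ is covered by some accept set)}\\
m &\ge \log_2 \frac{\binom{u}{n}}{\binom{\varepsilon u}{n}}
  = \sum_{i=0}^{n-1} \log_2 \frac{u - i}{\varepsilon u - i}
  \ge n \log_2 \frac{1}{\varepsilon}
\end{align*}
```

The last step uses (u − i)/(εu − i) ≥ 1/ε for every 0 ≤ i < n, which holds whenever ε ≤ 1.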
[Diagram: Insert: x1,…,xn → m bits → Query: y]
Slide22
Talk overview
Approximate set membership problem
Bloom filters (simple near-optimal solution)
Lower bounds – static case
New dynamic lower bounds
Slide23
Dynamic lower bounds
Basic dynamic settings: two inserts + query
Break inputs into chunks of size k and n−k
[Diagram: Insert: x1,…,xk → m bits → Insert: xk+1,…,xn → m bits → Query: y]
Slide24
Dynamic lower bounds
Accepting sets:
Properties:
General approach: analyze size of accepting sets
Sets A(x1,…,xk) can’t be too small (covering)
Sets A(A(x1,…,xk),xk+1,…,xn) can’t be too large (error)
These yield the trivial lower bound again…
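To make the two accepting sets concrete, here is a toy Python sketch of the two-insert setting. Everything specific in it is an assumption for illustration: the memory size `M`, the toy universe, the SHA-256 based hash `h`, and the single-hash Bloom-style update. The point it shows is structural: the second insert phase sees only the m-bit memory left by the first phase, not x1,…,xk themselves.

```python
import hashlib

M = 16                   # bits of memory (hypothetical toy value)
UNIVERSE = range(200)    # toy universe U


def h(x: int) -> int:
    """Toy hash U -> {0, ..., M-1}; stands in for the hash on the slides."""
    return hashlib.sha256(str(x).encode()).digest()[0] % M


def insert(state: int, chunk) -> int:
    """One insert phase: takes the m-bit memory and a chunk, returns the new memory."""
    for x in chunk:
        state |= 1 << h(x)
    return state


def accepting_set(state: int) -> set:
    """All y in U on which the query answers YES for this memory state."""
    return {y for y in UNIVERSE if (state >> h(y)) & 1}


xs = [3, 17, 42, 99, 150, 7]
k = 3
state1 = insert(0, xs[:k])        # memory after x1,...,xk
state2 = insert(state1, xs[k:])   # memory after xk+1,...,xn, built from state1 only

A1 = accepting_set(state1)        # plays the role of A(x1,...,xk)
A2 = accepting_set(state2)        # plays the role of A(A(x1,...,xk), xk+1,...,xn)
```

Covering shows up as `set(xs[:k]) <= A1` and `set(xs) <= A2`, while the error requirement is a bound on how large `A2` may be relative to the universe.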
[Diagram: Insert: x1,…,xk → m bits → Insert: xk+1,…,xn → m bits → Query: y]
Slide25
Dynamic lower bounds
Method of typical inputs
On a typical input:
A(x1,…,xk) not too small
A(A(x1,…,xk),xk+1,…,xn) not too large
Inputs uncorrelated with data structure:
Yields an improved lower bound
(note: “typical” can be 1% of inputs)
[Diagram: Insert: x1,…,xk → m bits → Insert: xk+1,…,xn → m bits → Query: y]
Slide26
Dynamic lower bounds
[Diagram: Insert: x1,…,xk → m bits → Insert: xk+1,…,xn → m bits → Query: y]
Functional inequality:
Free parameter: k – how to break input
Optimal choice:
Extension: break input into more parts
Doesn’t seem to help much
Slide27
Summary
Approximate membership problem
Static algorithms match the static information theoretic lower bound
This work: new dynamic information theoretic lower bound
THANK YOU!