Approximating the Depth via Sampling and Emptiness
Shirly Yakubov
Motivation

Given a set S of n objects, we want to store them in a data-structure that can answer range queries. For a range r we distinguish:
- Range-counting queries: the number of objects of S intersected by r, denoted depth(r, S).
- Range-emptiness queries: does r intersect any object of S at all?
Counting queries are harder than emptiness queries, so we want to approximate depth(r, S) using emptiness queries only.
Outline

- Constructing the data-structure.
- Applications: halfplanes/halfspaces, disks, pseudo-disks.
- Relative approximation via sampling.

Concretely, we first construct a data-structure that, for a prespecified threshold k and parameter ε, decides whether depth(r, S) < k or depth(r, S) > k, where a mistake is allowed if depth(r, S) lies in [(1-ε)k, (1+ε)k]. Using a polylogarithmic number of emptiness queries, we then output a number α such that (1-ε) depth(r, S) ≤ α ≤ (1+ε) depth(r, S). (The answers are correct with high probability.)
The data-structure

Let R_1, ..., R_M be M independent random samples of S, formed by picking every element with probability p = 1/k, where M = O(ε^{-2} log n). Build M separate emptiness-query data-structures D_1, ..., D_M for R_1, ..., R_M, respectively, and let D = (D_1, ..., D_M).
Answering a query for a range r:
Let X_i = 1 if r intersects some object of R_i, and X_i = 0 otherwise. Compute X_1, ..., X_M using one emptiness query in each D_i, and compute Y = X_1 + ... + X_M.
If Y is at least a fixed threshold t (the expected value of Y when depth(r, S) is exactly k), return R; else return L.
Intuition: since each object is picked with probability 1/k, if depth(r, S) ≥ (1+ε)k we expect more of the samples to catch objects of r than when depth(r, S) ≤ (1-ε)k.
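To make the scheme concrete, here is a minimal sketch for the simplest possible model: S is a set of points on the real line and a range is an interval, so a single emptiness query is a binary search in a sorted array. The class names, the constant c = 8 in M, and the exact threshold (the expected count at depth exactly k) are illustrative assumptions, not taken from the slides:

```python
import bisect
import math
import random

class EmptinessDS:
    """Emptiness structure for points on a line: a sorted array; one query
    is a binary search, i.e. Q(m) = O(log m)."""
    def __init__(self, pts):
        self.pts = sorted(pts)

    def intersects(self, lo, hi):
        """Does the interval [lo, hi] contain any stored point?"""
        i = bisect.bisect_left(self.pts, lo)
        return i < len(self.pts) and self.pts[i] <= hi

class DepthDecider:
    """Decides whether depth(r, S) is below (L) or above (R) the threshold k,
    with a mistake allowed inside [(1 - eps) * k, (1 + eps) * k]."""
    def __init__(self, points, k, eps, c=8):
        n = max(len(points), 2)
        self.M = max(1, int(c * math.log(n) / eps ** 2))
        p = 1.0 / k
        self.samples = [
            EmptinessDS([x for x in points if random.random() < p])
            for _ in range(self.M)
        ]
        # Threshold: expected number of non-empty samples at depth exactly k.
        self.threshold = self.M * (1.0 - (1.0 - p) ** k)

    def query(self, lo, hi):
        Y = sum(ds.intersects(lo, hi) for ds in self.samples)
        return 'R' if Y >= self.threshold else 'L'

random.seed(0)
points = [float(i) for i in range(1000)]
decider = DepthDecider(points, k=100, eps=0.25)
deep = decider.query(0.0, 499.0)    # depth 500, well above (1 + eps) * k
shallow = decider.query(0.0, 9.0)   # depth 10, well below (1 - eps) * k
```

Each sample has expected size n/k, and one decision costs M = O(ε^{-2} log n) emptiness queries, matching the scheme above.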
Correctness: we will show that, with high probability, the data-structure returns the correct answer whenever depth(r, S) lies outside the range [(1-ε)k, (1+ε)k]; that is, both error probabilities Pr[answer L | depth(r, S) ≥ (1+ε)k] and Pr[answer R | depth(r, S) ≤ (1-ε)k] are small.
The idea: a single sample R_i misses r entirely with probability (1-p)^{depth(r, S)}, so by repeating the experiment M times we get an estimation of this probability by 1 - Y/M. The bigger the gap between (1-p)^{(1-ε)k} and (1-p)^{(1+ε)k}, the fewer experiments are required for a reliable estimation. Since p = 1/k, we can see that the gap is Θ(ε).
We want to bound the probability that the difference between E[Y]/M and our estimation Y/M exceeds this gap. Y is a sum of M independent Bernoulli random variables, so using Chernoff's inequality we can bound this probability.
Observation: For 0 ≤ x ≤ 1/2, we have e^{-2x} ≤ 1 - x ≤ e^{-x}.
Lemma: The data-structure errs with probability at most n^{-c}, where c can be made arbitrarily large by a choice of a large enough constant in M.
Proof: We will from now on assume depth(r, S) ≥ (1+ε)k; the case depth(r, S) ≤ (1-ε)k is symmetric.
Let μ = E[Y] = M(1 - (1-p)^{depth(r, S)}). Since p = 1/k, by the observation the success probability 1 - (1-p)^{d} at depth d = (1+ε)k is separated by Θ(ε) from its value at depth k, and symmetrically at depth (1-ε)k it falls Θ(ε) below it.
Deploying Chernoff's inequality we have: the probability that Y deviates from its expectation by the Θ(ε)M needed to cross the threshold is at most 2 exp(-Θ(ε²)M) = n^{-Ω(c)} for M = cε^{-2} log n, which proves the lemma.
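The elided calculation plausibly runs as follows (a hedged reconstruction using the additive Chernoff–Hoeffding bound for sums of independent Bernoulli variables; the constants are mine, not the slides'):

```latex
\Pr\bigl[\,|Y - \mathbb{E}[Y]| \ge tM\,\bigr] \le 2e^{-2t^2M},
\qquad t = \Theta(\varepsilon),\; M = c\,\varepsilon^{-2}\ln n
\;\Longrightarrow\;
\Pr[\text{error}] \le 2e^{-\Theta(\varepsilon^2)\,c\,\varepsilon^{-2}\ln n}
= 2\,n^{-\Theta(c)}.
```

Here t is half the Θ(ε) gap between the success probabilities at depths (1-ε)k and (1+ε)k.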
Corollary: Given a set S of n objects, a parameter k, and ε > 0, one can construct a data-structure which, given a range r, returns either L or R. If it returns L then depth(r, S) < (1+ε)k, and if it returns R then depth(r, S) > (1-ε)k. It might return either answer if depth(r, S) ∈ [(1-ε)k, (1+ε)k]. (All results are correct with high probability.)
Corollary: D consists of M = O(ε^{-2} log n) emptiness data-structures, over samples of expected size n/k each. The space and preprocessing time needed to build them are O(ε^{-2} S(n/k) log n), where S(m) is the space (and time) needed for a single emptiness data-structure storing m objects. The query time is O(ε^{-2} Q(n/k) log n), where Q(m) is the query time of one emptiness data-structure. (All bounds hold with high probability.)
So far we have: a data-structure that uses a logarithmic number of emptiness data-structures (for fixed ε) as a black box, and decides whether depth(r, S) < k (answer L) or depth(r, S) > k (answer R); the answer is correct with high probability whenever depth(r, S) falls outside [(1-ε)k, (1+ε)k].
Approximating the depth

The goal: given a set S of n objects and a parameter ε, build a data-structure that, for a range r, outputs a number α with (1-ε) depth(r, S) ≤ α ≤ (1+ε) depth(r, S).
The idea: build several decision data-structures as described before, and perform a binary search on them to find depth(r, S) approximately.
The data-structure: for small values of the depth (thresholds k = 1, 2, ..., up to roughly ε^{-1}) we construct decision data-structures D_1, D_2, ....
Next, consider the values u_j = (1 + ε/4)^j for j = 0, ..., m, where m = O(ε^{-1} log n) is chosen so that u_m ≥ n. We build a decision data-structure D_{u_j}, with threshold u_j and parameter ε/4, for each j.
Answering a query range r: first, we determine whether depth(r, S) is at least roughly ε^{-1} or smaller than that (using the corresponding decision structure). If smaller, we can perform a binary search on D_1, D_2, ... to find the exact (±1) value of depth(r, S).
If bigger, then with high probability, if we were to query all of D_{u_0}, ..., D_{u_m}, we would get a sequence of Rs followed by a sequence of Ls. We can perform a binary search to find this changeover. Notice that for every j: u_{j+1} = (1 + ε/4) u_j.
Let D_{u_j} be the last data-structure returning R. We have (with high probability):
depth(r, S) ≥ (1 - ε/4) u_j and depth(r, S) ≤ (1 + ε/4) u_{j+1} = (1 + ε/4)² u_j.
Now, α = u_j yields the required approximation: (1-ε) depth(r, S) ≤ α ≤ (1+ε) depth(r, S).
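The binary search over the geometric thresholds can be sketched as follows. The decision oracle is stubbed out here with an exact comparator standing in for the sampling-based decider (which is only correct with high probability outside the window [(1-ε)k, (1+ε)k]); all names and constants are illustrative:

```python
import math

def approximate_depth(depth_oracle, n, eps):
    """Binary search over the thresholds u_j = (1 + eps/4)^j for the
    changeover from R (depth above the threshold) to L (depth below it)."""
    base = 1.0 + eps / 4.0
    m = int(math.ceil(math.log(n) / math.log(base))) + 1  # ensures u_m > n
    if depth_oracle(base ** 0) == 'L':
        return 1.0                # the depth is below the first threshold
    lo, hi = 0, m                 # invariant: u_lo answers R, u_hi answers L
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if depth_oracle(base ** mid) == 'R':
            lo = mid
        else:
            hi = mid
    return base ** lo             # alpha = u_j for the last R

def make_oracle(d):
    """Stub oracle for a known depth d: answers exactly."""
    return lambda k: 'R' if d >= k else 'L'

alpha = approximate_depth(make_oracle(700), n=1000, eps=0.2)
```

With ε = 0.2 the search uses O(log(ε^{-1} log n)) oracle calls and returns a value within a (1 ± ε) factor of the true depth 700.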
Complexity: we have O(ε^{-1} log n) decision data-structures, and each of them consists of O(ε^{-2} log n) emptiness data-structures. A query performs O(log(ε^{-1} log n)) binary-search steps, each querying one decision structure. Therefore, the overall query time is O(ε^{-2} Q(n) log n · log(ε^{-1} log n)), where Q(n) is the emptiness-query time of the emptiness data-structures we use. We get the following theorem:
Theorem: Given a set S of n objects and a parameter ε, assume that one can construct, using S(n) space, in T(n) time, a data-structure that answers emptiness queries in Q(n) time. Then one can construct, using O(ε^{-3} S(n) log² n) space, in O(ε^{-3} T(n) log² n) time, a data-structure that, given a range r, outputs a number α with (1-ε) depth(r, S) ≤ α ≤ (1+ε) depth(r, S), in O(ε^{-2} Q(n) log n · log(ε^{-1} log n)) time. (All results and bounds hold with high probability.)
Applications

We will show several examples of emptiness data-structures that answer emptiness queries in logarithmic time using linear space. Plugging these data-structures into our theorem gives efficient data-structures for approximating the depth.
Halfplanes and halfspaces (S = points in R²/R³, r = halfplane/halfspace): the Dobkin-Kirkpatrick hierarchy.

The data-structure: given n points in the plane, compute their convex hull P (this takes O(n log n) time and O(n) space). Build the DK hierarchy for P (construction time is O(n), and the space required is O(n)).
Answering a query: a halfplane (or halfspace) contains a point of S iff it contains the hull vertex extreme in the direction of its outer normal, and the DK hierarchy finds this extreme vertex in O(log n) time. We therefore have Q(n) = O(log n), S(n) = O(n), and T(n) = O(n log n).
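A self-contained sketch of the 2D case. As a hedge, it scans the hull vertices for the extreme one instead of descending a DK hierarchy, so the query here costs O(h) in the hull size rather than O(log n); the function names are illustrative:

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(pts):
    """Andrew's monotone chain: O(n log n) time, O(n) space."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def chain(points):
        h = []
        for p in points:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h

    # Lower and upper chains, each dropping its last (shared) vertex.
    return chain(pts)[:-1] + chain(pts[::-1])[:-1]

def halfplane_nonempty(hull, a, b, c):
    """Does the halfplane {(x, y): a*x + b*y >= c} contain a point of the set?
    It suffices to test the vertex extreme in direction (a, b); here we scan
    all hull vertices, while the DK hierarchy finds it in O(log n)."""
    return max(a * x + b * y for (x, y) in hull) >= c

hull = convex_hull([(0, 0), (4, 0), (0, 4), (4, 4), (2, 2), (1, 3)])
```

Only hull vertices can be extreme, which is why the preprocessing reduces S to its convex hull.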
Corollary: Given a set P of n points in two or three dimensions and a parameter ε, one can construct, in O(ε^{-3} n log³ n) time, a data-structure of size O(ε^{-3} n log² n), such that given a halfplane/halfspace (resp.) r, it outputs a number α with (1-ε) depth(r, P) ≤ α ≤ (1+ε) depth(r, P), in O(ε^{-2} log² n · log(ε^{-1} log n)) time.
* For exact counting queries, the best known results that use near-linear space have query time roughly n^{1/2} for two dimensions, and n^{2/3} for three dimensions.
(Corollary results and bounds hold with high probability.)
Disks (S = points in R², r = a disk): lift each point (x, y) to (x, y, x² + y²) on the paraboloid. The disk of center (a, b) and radius ρ contains (x, y) iff the lifted point lies below the plane z = 2ax + 2by + (ρ² - a² - b²), so disk-emptiness queries in the plane reduce to halfspace-emptiness queries in R³, answered as above with the Dobkin-Kirkpatrick hierarchy.
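The lifting is a one-line algebraic identity; the following sketch checks the equivalence directly (function names are illustrative):

```python
def lift(p):
    """Lift a planar point onto the paraboloid z = x^2 + y^2."""
    x, y = p
    return (x, y, x * x + y * y)

def disk_contains(center, rho, p):
    """Direct in-disk test in the plane."""
    (a, b), (x, y) = center, p
    return (x - a) ** 2 + (y - b) ** 2 <= rho ** 2

def halfspace_contains_lifted(center, rho, p):
    """The equivalent test after lifting: is the lifted point below the
    plane z = 2ax + 2by + (rho^2 - a^2 - b^2)?"""
    a, b = center
    x, y, z = lift(p)
    return z <= 2 * a * x + 2 * b * y + (rho ** 2 - a ** 2 - b ** 2)
```

Expanding (x-a)² + (y-b)² ≤ ρ² and substituting z = x² + y² gives exactly the halfspace condition, which is why the two tests always agree.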
Corollary: Given a set P of n points in the plane and a parameter ε, one can construct, in O(ε^{-3} n log³ n) time, a data-structure of size O(ε^{-3} n log² n), such that given a disk r, it outputs a number α with (1-ε) depth(r, P) ≤ α ≤ (1+ε) depth(r, P), in O(ε^{-2} log² n · log(ε^{-1} log n)) time. (All results and bounds hold with high probability.)
Pseudo-disks (S = pseudo-disks, r = a point): a set S of n simply connected regions in the plane is called a set of pseudo-disks if the boundaries of any two distinct regions cross at most twice. Here an emptiness query asks whether the query point lies in the union of a sample; since the union of n pseudo-disks has linear complexity, point location in the union gives Q(n) = O(log n), S(n) = O(n), and T(n) = O(n log n).
Corollary: Given a set S of n pseudo-disks in the plane and a parameter ε, one can construct, in O(ε^{-3} n log³ n) time, a data-structure of size O(ε^{-3} n log² n), such that given a point q, it outputs a number α with (1-ε) depth(q, S) ≤ α ≤ (1+ε) depth(q, S), in O(ε^{-2} log² n · log(ε^{-1} log n)) time. (All results and bounds hold with high probability.)
Relative approximation via sampling

What can be done if we want to use a single sample instead of many? We sample each object into a random sample R with probability p. If the depth of a range r is sufficiently large (roughly ε^{-2} p^{-1} log n), then it can be estimated reliably by depth(r, R)/p. The interesting fact is that the deeper r is, the better this estimate is.
Lemma: Let S be a set of n objects, 0 < ε < 1, and r a range of depth u = depth(r, S) in S. Let R be a random sample of S in which every element is picked independently with probability p, and let X = depth(r, R). Then (1-ε) p u ≤ X ≤ (1+ε) p u with probability at least 1 - 2 exp(-ε² p u / 4).
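A tiny simulation of the lemma; the concrete numbers u, p, ε are illustrative choices, not from the slides:

```python
import random

def sampled_depth_estimate(u, p, rng):
    """Sample each of the u objects intersecting r independently with
    probability p; return the rescaled count X / p as the depth estimate."""
    X = sum(rng.random() < p for _ in range(u))
    return X / p

rng = random.Random(0)
u, p, eps = 20000, 0.05, 0.2   # true depth u, sampling probability p
est = sampled_depth_estimate(u, p, rng)
# By the lemma, |est - u| <= eps * u except with probability at most
# 2 * exp(-eps**2 * p * u / 4) = 2 * exp(-10), below 1e-4 here.
```

Note how the failure probability shrinks exponentially in p·u: the deeper the range, the sharper the estimate.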
Reminder (relative (p, ε)-approximation): in previous lectures we saw how to find a sample R that, with high probability, is a relative (p, ε)-approximation, that is, for every range r: | |r ∩ R|/|R| - depth(r, S)/n | ≤ ε · max(depth(r, S)/n, p). In our case, the lemma gives us a guarantee of this relative form for every sufficiently deep range.
Proof: We have that μ = E[X] = p u. By (*) we have: Pr[|X - p u| ≥ ε p u] ≤ 2 exp(-ε² p u / 4).
(*) Chernoff: for a sum X of independent indicator variables with mean μ, and 0 < δ ≤ 1, Pr[|X - μ| ≥ δ μ] ≤ 2 exp(-δ² μ / 4).
Conclusion

Note that if depth(r, S) is large (say, of order n / log n), then picking p of order ε^{-2} log² n / n makes depth(r, R) = O(ε^{-2} log n), which is (relatively) a small number, while the lemma still applies. Via sampling, we turned the task of estimating the depth of heavy ranges into the task of estimating the depth of a shallow range.