Slide1
Sublinear Algorithmic Tools 2
Alex Andoni
Slide2
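Plan
Dimension reduction
Application: Numerical Linear Algebra
Sketching
Application: Streaming
Application: Nearest Neighbor Search
and more…
Dimension reduction: a linear map $\varphi: \mathbb{R}^d \to \mathbb{R}^k$ s.t. for any points $x, y$: $\|\varphi(x) - \varphi(y)\| = (1 \pm \epsilon)\,\|x - y\|$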
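A minimal sketch of the dimension reduction guarantee, assuming the standard Gaussian construction (a random $k \times d$ matrix with i.i.d. $N(0, 1/k)$ entries). The function names, the dimensions, and the seeds are illustrative choices, not taken from the slides.

```python
import math
import random

def gaussian_map(d, k, seed=0):
    """Random k x d matrix with i.i.d. N(0, 1/k) entries (assumed construction)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def apply_map(matrix, x):
    """Linear map: matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

d, k = 1000, 400  # reduce from 1000 to 400 dimensions
phi = gaussian_map(d, k, seed=1)

data_rng = random.Random(2)
x = [data_rng.uniform(-1, 1) for _ in range(d)]
y = [data_rng.uniform(-1, 1) for _ in range(d)]

# Ratio of the distance after the map to the distance before it.
ratio = dist(apply_map(phi, x), apply_map(phi, y)) / dist(x, y)
print(f"distance ratio after projection: {ratio:.3f}")
```

The ratio should be close to 1, and the map is drawn without looking at `x` or `y`, which is the dataset-independence property the slides rely on later.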
Slide3
Dimension reduction in other norms/distances?
E.g., $\ell_1$? Essentially no [CS’02, BC’03, LN’04, JN’10…]
For $n$ points, approximation $D$: dimension between $n^{\Omega(1/D^2)}$ and $O(n/D)$ [BC03, NR10, ANN10…]
even if the map depends on the dataset!
In contrast to $\ell_2$: [JL] gives $O(\epsilon^{-2} \log n)$, and doesn’t depend on the dataset
Generalize the notion of dimension reduction!
Slide4
Computational view
Arbitrary computation
Cons:
Less geometric structure (e.g., not a metric)
Pros:
More expressiveness: better trade-off between approximation and “dimension”
Sketch: a “functional compression scheme” for estimating distances
almost all are lossy ($1+\epsilon$ distortion or more) and randomized
Slide5
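Sketching for $\ell_1$
Analog of Euclidean projections?
For $\ell_2$, we used: the Gaussian distribution
has a stability property: $g_1 z_1 + g_2 z_2 + \dots + g_d z_d$ is distributed as $g \cdot \|z\|_2$ (for i.i.d. standard Gaussians $g, g_1, \dots, g_d$)
Is there something similar for the 1-norm?
Yes: the Cauchy distribution!
1-stable: $c_1 z_1 + c_2 z_2 + \dots + c_d z_d$ is distributed as $c \cdot \|z\|_1$ (for i.i.d. Cauchy variables $c, c_1, \dots, c_d$)
What’s wrong then?
Cauchy variables are heavy-tailed…
$|c|$ doesn’t even have a finite expectation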
Slide6
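Sketching for $\ell_1$ [Indyk’00]
Still, can consider a similar random map
Consider $\varphi(x) = Cx$, where $C$ is a $k \times d$ matrix with each coordinate distributed as Cauchy
Take the 1-norm $\|\varphi(x)\|_1$?
It does not have finite expectation, but…
Can estimate $\|x\|_1$ by: $\mathrm{median}(|\varphi_1(x)|, \dots, |\varphi_k(x)|)$
Correctness claim: for each $i$,
$\Pr[|\varphi_i(x)| > (1-\epsilon)\|x\|_1] > 1/2 + \Omega(\epsilon)$
$\Pr[|\varphi_i(x)| < (1+\epsilon)\|x\|_1] > 1/2 + \Omega(\epsilon)$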
Slide7
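Estimator for $\ell_1$
Estimator: $\mathrm{median}(|\varphi_1(x)|, \dots, |\varphi_k(x)|)$, where $\varphi(x) = Cx$ is the Cauchy sketch
Correctness claim: for each $i$,
$\Pr[|\varphi_i(x)| > (1-\epsilon)\|x\|_1] > 1/2 + \Omega(\epsilon)$
$\Pr[|\varphi_i(x)| < (1+\epsilon)\|x\|_1] > 1/2 + \Omega(\epsilon)$
Proof:
$\varphi_i(x) = \sum_j C_{ij} x_j$ is distributed as $c \cdot \|x\|_1$, where $c$ is a Cauchy variable
Hence the claim is equivalent to
$\Pr[|c| > 1-\epsilon] > 1/2 + \Omega(\epsilon)$
$\Pr[|c| < 1+\epsilon] > 1/2 + \Omega(\epsilon)$
Matter of checking the pdf of the Cauchy variables…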
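The pdf check can be worked out as follows. This is a sketch under the assumption that $c$ is a standard Cauchy variable and $0 < \epsilon \le 1$; the explicit constants are one possible choice, not necessarily the ones intended on the slide.

```latex
% Standard Cauchy density and the distribution of |c|:
f(t) = \frac{1}{\pi (1 + t^2)}, \qquad
\Pr[|c| \le t] = \frac{2}{\pi} \arctan t, \qquad
\Pr[|c| \le 1] = \frac{1}{2}.

% Lower tail: on [1-\epsilon, 1] the derivative of arctan is 1/(1+t^2) \ge 1/2.
\Pr[|c| > 1 - \epsilon]
  = \frac{1}{2} + \frac{2}{\pi} \bigl( \arctan 1 - \arctan(1 - \epsilon) \bigr)
  \ge \frac{1}{2} + \frac{\epsilon}{\pi}.

% Upper tail: on [1, 1+\epsilon] the derivative is at least 1/(1+(1+\epsilon)^2) \ge 1/5.
\Pr[|c| < 1 + \epsilon]
  = \frac{1}{2} + \frac{2}{\pi} \bigl( \arctan(1 + \epsilon) - \arctan 1 \bigr)
  \ge \frac{1}{2} + \frac{2\epsilon}{5\pi}.
```

Both bounds are of the form $1/2 + \Omega(\epsilon)$, which is all the median argument needs.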
Slide8
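Estimator for $\ell_1$: high probability bound
Estimator: $\mathrm{median}(|\varphi_1(x)|, \dots, |\varphi_k(x)|)$
Claim: for each $i$,
(1) $\Pr[|\varphi_i(x)| > (1-\epsilon)\|x\|_1] > 1/2 + \Omega(\epsilon)$
(2) $\Pr[|\varphi_i(x)| < (1+\epsilon)\|x\|_1] > 1/2 + \Omega(\epsilon)$
Let $E_i = 1$ if (1) holds for coordinate $i$, and $E_i = 0$ otherwise
Take $k = O(\epsilon^{-2} \log(1/\delta))$
Hence $\Pr[\sum_i E_i \le k/2] \le \delta/2$ (Chernoff bound)
Similarly with $E'_i = 1$ if (2) holds for coordinate $i$
The above means that the median falls outside $(1 \pm \epsilon)\|x\|_1$ with probability at most $\delta$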
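A minimal sketch of the median estimator for the 1-norm, assuming a fully random Cauchy matrix generated on the fly from a seeded generator. The function names `cauchy_sketch` and `estimate_l1`, the vector `x`, and the choice `k=2001` are hypothetical, for illustration only.

```python
import math
import random
import statistics

def cauchy_sketch(x, k, seed=0):
    """Return Cx for a k x d matrix C of i.i.d. standard Cauchy entries."""
    rng = random.Random(seed)
    sketch = []
    for _ in range(k):
        total = 0.0
        for xj in x:
            # Inverse-CDF sampling: tan(pi * (u - 1/2)) is standard Cauchy.
            c = math.tan(math.pi * (rng.random() - 0.5))
            total += c * xj
        sketch.append(total)
    return sketch

def estimate_l1(sketch):
    """Median of absolute values; the median of |Cauchy| is exactly 1."""
    return statistics.median(abs(v) for v in sketch)

x = [1.0, -2.0, 3.0, 0.0, 4.0]  # 1-norm is 10
est = estimate_l1(cauchy_sketch(x, k=2001, seed=7))
print(f"estimated 1-norm: {est:.2f} (true value 10)")
```

Taking the mean of the absolute values instead of the median would not work here, since $|c|$ has no finite expectation.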
Slide9
Yesterday’s Application: regression
Problem:
+structured, +preconditioner:
More: other norms ($\ell_p$, M-estimator, Orlicz norms), low-rank approximation & optimization, matrix multiplication, see [Woodruff, FnTTCS’14,…]
Weak DR: linear map s.t. for any :
Weak(er) OSE: linear map s.t. for any linear subspace of dimension :
Cauchy distribution [I’00]
[SW’11, MM’13, WZ’13, WW’18]
Slide10
Today’s Application: Streaming 1
131.107.65.14
131.107.65.14
131.107.65.14
18.0.1.12
18.0.1.12
80.97.56.20
80.97.56.20
IP              Frequency
131.107.65.14   3
18.0.1.12       2
80.97.56.20     2
IP              Frequency
127.0.0.1       9
192.168.0.1     8
257.2.5.7       0
16.09.20.11     1
Challenge: log statistics of the data, using small space
Slide11
Streaming statistics
Let $x_i$ = frequency of IP $i$
1st moment (sum): $\sum_i x_i$
Trivial: keep a total counter
2nd moment (variance): $\sum_i x_i^2 = \|x\|_2^2$
Trivially: $n$ counters, too much space
Can’t do better if exact
Small space via (approximate) dimension reduction in $\ell_2$
IP              Frequency
131.107.65.14   3
18.0.1.12       2
80.97.56.20     2
Slide12
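2nd frequency moment via DR
$x_i$ = frequency of IP $i$
2nd moment: $\sum_i x_i^2 = \|x\|_2^2$
Store the sketch $\varphi(x)$
Estimator: $\|\varphi(x)\|_2^2$
Updating the sketch:
Use linearity of the sketching function: when IP $i$ arrives, $\varphi(x + e_i) = \varphi(x) + \varphi(e_i)$
Correctness from the dimension reduction guarantee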
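The linear update can be sketched as follows, assuming a Gaussian sketching matrix whose column for each IP is regenerated from a seeded pseudo-random generator. The class name `SecondMomentSketch` and its parameters are hypothetical; a space-efficient implementation would use limited-independence hashing rather than Python's general-purpose generator.

```python
import math
import random

class SecondMomentSketch:
    """Keeps k counters holding phi(x) for the frequency vector x."""

    def __init__(self, k, seed=0):
        self.k = k
        self.seed = seed
        self.counters = [0.0] * k

    def _column(self, item):
        # phi(e_item): the matrix column for this item, rebuilt on demand
        # so the full matrix is never stored.
        rng = random.Random(f"{self.seed}|{item}")
        scale = 1.0 / math.sqrt(self.k)
        return [rng.gauss(0.0, 1.0) * scale for _ in range(self.k)]

    def update(self, item, count=1):
        # Linearity: phi(x + count * e_item) = phi(x) + count * phi(e_item).
        for j, g in enumerate(self._column(item)):
            self.counters[j] += count * g

    def estimate(self):
        # Squared Euclidean norm of the sketch approximates sum_i x_i^2.
        return sum(v * v for v in self.counters)

stream = ["131.107.65.14"] * 3 + ["18.0.1.12"] * 2 + ["80.97.56.20"] * 2
sk = SecondMomentSketch(k=4000, seed=3)
for ip in stream:
    sk.update(ip)
print(f"estimated 2nd moment: {sk.estimate():.2f} (true value 3^2 + 2^2 + 2^2 = 17)")
```

The sketch never stores per-IP counts; only the `k` counters persist between updates.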
Slide13
Streaming Scenario 2
131.107.65.14
18.0.1.12
18.0.1.12
80.97.56.20
IP              Frequency
131.107.65.14   1
18.0.1.12       1
80.97.56.20     1
IP              Frequency
131.107.65.14   1
18.0.1.12       2
Question: difference in traffic
Similar Qs: average delay/variance in a network, differential statistics between logs at different servers, etc.
Slide14
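Sketching for Difference
Use sketching!
Using a random map $\varphi$ (common for the 2 routers)
Estimator: computed from $\varphi(x) - \varphi(y) = \varphi(x - y)$
Already proved: can get a $1+\epsilon$ approximation with a small sketch
010110 010101
Estimate the difference from the two sketches
IP              Frequency
131.107.65.14   1
18.0.1.12       1
80.97.56.20     1
IP              Frequency
131.107.65.14   1
18.0.1.12       2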
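A minimal sketch of the two-router setting, assuming the routers share a random seed and that the difference is measured in squared Euclidean norm (the slide's norm did not survive extraction, so this choice is an assumption). The helper names `sketch` and `estimate_sq_difference` are hypothetical.

```python
import math
import random

def sketch(freqs, k, seed):
    """Gaussian linear sketch of a frequency table (dict: item -> count)."""
    counters = [0.0] * k
    scale = 1.0 / math.sqrt(k)
    for item, count in freqs.items():
        # The same seed on both routers gives the same column for each item.
        rng = random.Random(f"{seed}|{item}")
        for j in range(k):
            counters[j] += count * rng.gauss(0.0, 1.0) * scale
    return counters

def estimate_sq_difference(s1, s2):
    """By linearity, s1 - s2 is the sketch of x - y."""
    return sum((a - b) ** 2 for a, b in zip(s1, s2))

router1 = {"131.107.65.14": 1, "18.0.1.12": 1, "80.97.56.20": 1}
router2 = {"131.107.65.14": 1, "18.0.1.12": 2}

shared_seed = 11
s1 = sketch(router1, k=4000, seed=shared_seed)
s2 = sketch(router2, k=4000, seed=shared_seed)
est = estimate_sq_difference(s1, s2)
print(f"estimated squared difference: {est:.3f} (true value 0 + 1 + 1 = 2)")
```

Each router sends only its `k` counters, and neither needs to see the other's log.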
Slide15
Sketching for $\ell_p$ norms
$p$-moment: $\sum_i |x_i|^p$
$p \le 2$: about $O(\epsilon^{-2})$ counters enough
works via $p$-stable distributions [Indyk’00]
$p > 2$: can do (and need) $\tilde{O}(n^{1-2/p})$ counters
[AMS’96, SS’02, BYJKS’02, CKS’03, IW’05, BGKS’06, BO10, AKO’11, G’11, BKSV’14,…]
Slide16
Streaming 3: # distinct elements
Problem: compute the number of distinct elements in the stream
Trivial solution: $O(m)$ space for $m$ distinct elements
Will see: $O(\log n)$ space (approximate)
IP              Frequency
131.107.65.14   1
18.0.1.12       2
Slide17
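Distinct Elements
Algorithm:
Hash function $h$ mapping elements into $[0,1]$
Compute $\mathrm{minHash} = \min_i h(i)$ over the elements $i$ in the stream
Output is $1/\mathrm{minHash} - 1$
Main claim: $E[\mathrm{minHash}] = \frac{1}{m+1}$, for $m$ distinct elements
Proof:
repeats of the same element $i$ don’t matter
$\mathrm{minHash}$ = minimum of $m$ random numbers in $[0,1]$
Pick another random number $a \in [0,1]$
What’s the probability $\Pr[a < \mathrm{minHash}]$?
1) exactly $\mathrm{minHash}$ for a fixed hash function, so $E[\mathrm{minHash}]$ overall
2) probability it is smallest among $m+1$ reals: $\frac{1}{m+1}$
Initialize:
  minHash = 1
  hash function h into [0,1]
Process(int i):
  if (h(i) < minHash) minHash = h(i);
Output: 1/minHash - 1
2 7 5
[Flajolet-Martin’85, Alon-Matias-Szegedy’96]
Take majority of repetitions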
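A minimal sketch of the minHash estimator, with two labeled assumptions: SHA-256 with a salt stands in for a random hash function into $[0,1)$, and the repetitions are combined by averaging the minima before inverting, which is a common variance-reduction variant rather than the majority rule mentioned above. The function names and parameters are hypothetical.

```python
import hashlib

def hash01(item, salt):
    """Pseudo-random hash of item into [0, 1); salt selects the hash function."""
    digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def estimate_distinct(stream, repetitions=400):
    """Track one minimum hash per repetition, then invert the average minimum."""
    min_hashes = [1.0] * repetitions
    for item in stream:
        for r in range(repetitions):
            h = hash01(item, r)
            if h < min_hashes[r]:
                min_hashes[r] = h
    # E[minHash] = 1/(m+1), so 1/average - 1 estimates m.
    avg = sum(min_hashes) / repetitions
    return 1.0 / avg - 1.0

# 600 stream items, but only 200 distinct values.
stream = [i % 200 for i in range(600)]
est = estimate_distinct(stream)
print(f"estimated distinct elements: {est:.1f} (true value 200)")
```

A single repetition is unreliable because $1/\mathrm{minHash}$ fluctuates wildly, which is why some form of repetition is needed.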