Gautam "G" Kamath. FOCS 2017 Workshop: Frontiers in Distribution Testing, October 14, 2017. Based on joint works with Jayadev Acharya (Cornell), Constantinos Daskalakis (MIT), and John Wright (MIT).
Slide1
Testing with Alternative Distances
Gautam “G” Kamath
FOCS 2017 Workshop: Frontiers in Distribution Testing
October 14, 2017
Based on joint works with
Jayadev Acharya (Cornell)
Constantinos Daskalakis (MIT)
John Wright (MIT)
Slide2
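The story so far…
Test whether $p = q$ versus $d_{TV}(p, q) \ge \varepsilon$
Domain of $[n]$
Success probability $\ge 2/3$
Goal: Strongly sublinear sample complexity
  $O(n^{1-\gamma})$ for some $\gamma > 0$
Identity testing (samples from $p$, known $q$)
  $\Theta(\sqrt{n}/\varepsilon^2)$ samples [BFFKR’01, P’08, VV’14]
Closeness testing (samples from $p$, $q$)
  $\Theta(\max\{n^{2/3}/\varepsilon^{4/3}, \sqrt{n}/\varepsilon^2\})$ samples [BFRSW’00, V’11, CDVV’14]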
Slide3
Generalize: Different Distances
$p = q$ or $d_{TV}(p, q) \ge \varepsilon$?
$d_1(p, q) \le \varepsilon_1$ or $d_2(p, q) \ge \varepsilon_2$?
Slide4
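Generalize: Different Distances
$d_1(p, q) \le \varepsilon_1$ or $d_2(p, q) \ge \varepsilon_2$?
Are $p$ and $q$ $\varepsilon_1$-close in $d_1(\cdot, \cdot)$, or $\varepsilon_2$-far in $d_2(\cdot, \cdot)$?
Distances of interest: total variation, $\ell_2$, $\chi^2$, KL, Hellinger
Classic identity testing: $\varepsilon_1 = 0$, $d_2 = d_{TV}$
Can we characterize sample complexity for each pair of distances?
  Which distribution distances are sublinearly testable? [DKW’18]
Wait, but… why?
Slide5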
Wait, but… why?
Tolerance for model misspecification
Useful as a proxy in classical testing problems
  $d_1$ as $\chi^2$ distance is useful for composite hypothesis testing
    Monotonicity, independence, etc. [ADK’15]
Other distances are natural in certain testing settings
  $d_2$ as Hellinger distance is sometimes natural in multivariate settings
    Bayes networks, Markov chains [DP’17, DDG’17]
    Costis’ talk
Slide7
Tolerance
Is $p$ equal to $q$, or are they far from each other?
But why do we know $q$ exactly?
  Models are inexact
  Measurement errors
  Imprecisions in nature
$p$, $q$ may be “philosophically” equal, but not literally equal
When can we test $d_1(p, q) \le \varepsilon_1$ versus $d_{TV}(p, q) \ge \varepsilon_2$?
[Figure: ideal model vs. observed model; annotations: “Read data point wrong…”, “CLT approximations…”]
Slide8
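Tolerance
$d_1(p, q) \le \varepsilon_1$ vs. $d_{TV}(p, q) \ge \varepsilon_2$?
What $d_1$? How about $d_{TV}$?
  $d_{TV}(p, q) \le \varepsilon_1$ vs. $d_{TV}(p, q) \ge \varepsilon_2$?
  No! $\Omega(n/\log n)$ samples [VV’10]
Chill out, relax…
$\chi^2$-distance: $\chi^2(p, q) = \sum_i (p_i - q_i)^2 / q_i$
Cauchy-Schwarz: $\chi^2(p, q) \ge 4\, d_{TV}^2(p, q)$
  $\chi^2(p, q) \le O(\varepsilon^2)$ vs. $d_{TV}(p, q) \ge \varepsilon$?
  Yes! $O(\sqrt{n}/\varepsilon^2)$ samples [ADK’15]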
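As a sanity check on the Cauchy-Schwarz relation between the $\chi^2$-distance and total variation, the following minimal Python sketch computes both for two small distributions. The function names and the example distributions are illustrative assumptions, and total variation is taken to be half the $\ell_1$ distance.

```python
def tv_distance(p, q):
    """Total variation distance: half the l1 distance between p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def chi2_distance(p, q):
    """Chi-squared distance sum_i (p_i - q_i)^2 / q_i; requires q_i > 0."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


# Illustrative distributions on a domain of size 4 (not from the talk).
p = [0.4, 0.3, 0.2, 0.1]
q = [0.25, 0.25, 0.25, 0.25]

tv = tv_distance(p, q)
chi2 = chi2_distance(p, q)

# Cauchy-Schwarz: chi2(p, q) >= (l1 distance)^2 = 4 * tv^2.
print(f"d_TV = {tv:.4f}, chi^2 = {chi2:.4f}, 4 * d_TV^2 = {4 * tv * tv:.4f}")
```

Small $\chi^2$-distance therefore forces small total variation distance, so closeness in $\chi^2$ is the stronger requirement of the two.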
Slide9
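Details for a $\chi^2$-Tolerant Tester
Goal: Distinguish (i) $\chi^2(p, q) \le O(\varepsilon^2)$ versus (ii) $d_{TV}(p, q) \ge \varepsilon$
Draw $\mathrm{Poisson}(m)$ samples from $p$ (“Poissonization”)
  $N_i \sim \mathrm{Poisson}(m p_i)$: the numbers of appearances of the symbols $i$ are now independent!
Statistic: $Z = \sum_i \frac{(N_i - m q_i)^2 - N_i}{m q_i}$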
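The Poissonized sampling step and the statistic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the paper: the helper names, the exponential-arrival Poisson sampler, the example distributions, and the value of `m` are all choices made here, and no acceptance threshold is included.

```python
import random


def poisson(lam, rng):
    """Sample Poisson(lam) by counting rate-1 exponential arrivals in [0, lam)."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def poissonized_counts(p, m, rng):
    """Independent counts N_i ~ Poisson(m * p_i), one per domain element."""
    return [poisson(m * pi, rng) for pi in p]


def chi2_statistic(counts, q, m):
    """Z = sum_i ((N_i - m q_i)^2 - N_i) / (m q_i); requires q_i > 0."""
    return sum(((n_i - m * qi) ** 2 - n_i) / (m * qi)
               for n_i, qi in zip(counts, q))


rng = random.Random(0)  # fixed seed, only to make the demo repeatable
q = [0.25, 0.25, 0.25, 0.25]       # known reference distribution (illustrative)
p_far = [0.55, 0.15, 0.15, 0.15]   # an example distribution far from q
m = 2000

z_same = chi2_statistic(poissonized_counts(q, m, rng), q, m)
z_far = chi2_statistic(poissonized_counts(p_far, m, rng), q, m)
print(f"Z when p = q: {z_same:.1f}   Z when p is far: {z_far:.1f}")
```

In expectation $Z$ equals $m \cdot \chi^2(p, q)$, so in this example it concentrates near $0$ when $p = q$ and near $2000 \cdot 0.48 = 960$ for `p_far`.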
Acharya, Daskalakis, K. Optimal Testing for Properties of Distributions. NIPS 2015.
Slide10
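Details for a $\chi^2$-Tolerant Tester
Goal: Distinguish (i) $\chi^2(p, q) \le O(\varepsilon^2)$ versus (ii) $d_{TV}(p, q) \ge \varepsilon$
Statistic: $Z = \sum_i \frac{(N_i - m q_i)^2 - N_i}{m q_i}$
  $N_i$: # of appearances of $i$; $m$: # of samples
  $\mathbb{E}[Z] = m \cdot \chi^2(p, q)$
  (i): $\mathbb{E}[Z]$ is at most a small constant times $m \varepsilon^2$, (ii): $\mathbb{E}[Z]$ is at least a larger constant times $m \varepsilon^2$
Can bound variance of $Z$ with some work
Need to avoid low prob. elements of $q$
  Either ignore $i$ such that $q_i \le O(\varepsilon/n)$; or
  Mix lightly ($O(\varepsilon)$) with uniform distribution (also in [G’16])
Apply Chebyshev’s inequality
Side-Note: Pearson’s $\chi^2$-test uses statistic $\sum_i \frac{(N_i - m q_i)^2}{m q_i}$
  Subtracting $N_i$ in the numerator gives an unbiased estimator and importantly may hugely decrease variance
  [Zelterman’87]
  [VV’14, CDVV’14, DKN’15]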
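The unbiasedness claim can be checked directly from the Poisson moments. The short derivation below assumes the Poissonized model $N_i \sim \mathrm{Poisson}(m p_i)$ with independent counts; the names $Z_{\text{Pearson}}$ (Pearson's statistic) and $Z$ (the version with $N_i$ subtracted) are used here only for illustration.

```latex
\begin{align*}
\mathbb{E}\big[(N_i - m q_i)^2\big]
  &= \operatorname{Var}(N_i) + \big(\mathbb{E}[N_i] - m q_i\big)^2
   = m p_i + m^2 (p_i - q_i)^2, \\
\mathbb{E}[Z_{\text{Pearson}}]
  &= \sum_i \frac{m p_i + m^2 (p_i - q_i)^2}{m q_i}
   = \sum_i \frac{p_i}{q_i} + m \, \chi^2(p, q), \\
\mathbb{E}[Z]
  &= \sum_i \frac{m^2 (p_i - q_i)^2}{m q_i}
   = m \, \chi^2(p, q).
\end{align*}
```

When $p = q$, Pearson's statistic therefore has expectation $n$ (the term $\sum_i p_i / q_i$), while $Z$ has expectation exactly $0$.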
Acharya, Daskalakis, K. Optimal Testing for Properties of Distributions. NIPS 2015.
Slide11
Tolerant Identity Testing
[Table residue: sample complexities for pairs of distances; axes labeled “Harder”; note “(Implicit in [DK’16])”]
Daskalakis, K., Wright. Which Distribution Distances are Sublinearly Testable? SODA 2018.
Slide12
Tolerant Testing Takeaways
Can handle $\ell_2$ or $\chi^2$ tolerance at no additional cost
  $\Theta(\sqrt{n}/\varepsilon^2)$ samples
KL, Hellinger, or $d_{TV}$ tolerance are expensive
  $\Omega(n/\log n)$ samples
  KL result based off hardness of entropy estimation
Closeness testing ($q$ unknown): Even $\chi^2$ tolerance is costly!
  $\Omega(n/\log n)$ samples
  Only $\ell_2$ tolerance is free
  Proven via hardness of $d_{TV}$-tolerant identity testing
  Since $q$ is unknown, $\chi^2$ is no longer a polynomial
Slide13
Application: Testing for Structure
Composite hypothesis testing
  Test against a class of distributions!
  $p \in \mathcal{C}$ versus $d_{TV}(p, \mathcal{C}) \ge \varepsilon$
Example: $\mathcal{C}$ = all monotone distributions
  $p_i$’s are monotone non-increasing
Others: unimodality, log-concavity, monotone hazard rate, independence
All can be tested in $O(\sqrt{n}/\varepsilon^2)$ samples [ADK’15]
  Same complexity as vanilla uniformity testing!
Slide14
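Testing by Learning
Goal: Distinguish $p \in \mathcal{C}$ from $d_{TV}(p, \mathcal{C}) \ge \varepsilon$
Learn-then-Test:
  1. Learn hypothesis $q \in \mathcal{C}$ such that
     $p \in \mathcal{C} \Rightarrow \chi^2(p, q) \le O(\varepsilon^2)$ (needs cheap “proper learner” in $\chi^2$)
     $d_{TV}(p, \mathcal{C}) \ge \varepsilon \Rightarrow d_{TV}(p, q) \ge \varepsilon$ (automatic since $q \in \mathcal{C}$)
  2. Perform “tolerant testing”
     Given sample access to $p$ and description of $q$, distinguish $\chi^2(p, q) \le O(\varepsilon^2)$ from $d_{TV}(p, q) \ge \varepsilon$
Tolerant testing (step 2) is $O(\sqrt{n}/\varepsilon^2)$
  Naïve approach (using $d_{TV}$ instead of $\chi^2$) would require $\Omega(n/\log n)$
Proper learners in $\chi^2$ (step 1)?
  Claim: This is cheap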
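The two-step structure can be sketched as a small Python skeleton. Both `proper_learner` and `tolerant_tester` are hypothetical callables supplied by the caller, standing in for a class-specific proper learner and a tolerant identity tester; neither is an API from the cited papers, and the toy stand-ins below carry no guarantees and exist only to make the sketch runnable.

```python
from typing import Callable, List, Sequence

Distribution = List[float]


def learn_then_test(
    samples_for_learning: Sequence[int],
    samples_for_testing: Sequence[int],
    proper_learner: Callable[[Sequence[int]], Distribution],
    tolerant_tester: Callable[[Sequence[int], Distribution], bool],
) -> bool:
    """Return True to accept 'p is in the class', False to reject.

    Step 1: learn a hypothesis q inside the class from one batch of samples.
    Step 2: run a tolerant identity test of p against q on a second batch.
    """
    q = proper_learner(samples_for_learning)        # hypothetical proper learner
    return tolerant_tester(samples_for_testing, q)  # hypothetical tolerant tester


# Toy stand-ins (assumptions for illustration only):
def toy_learner(samples: Sequence[int]) -> Distribution:
    """Ignore the samples and propose the uniform distribution on {0, 1, 2, 3}."""
    return [0.25] * 4


def toy_tester(samples: Sequence[int], q: Distribution) -> bool:
    """Accept when every empirical frequency is within 0.1 of q."""
    m = len(samples)
    freqs = [sum(1 for s in samples if s == i) / m for i in range(len(q))]
    return all(abs(f - qi) <= 0.1 for f, qi in zip(freqs, q))


balanced = [0, 1, 2, 3] * 25          # empirical frequencies exactly uniform
skewed = [0] * 70 + [1, 2, 3] * 10    # element 0 has empirical frequency 0.7
print(learn_then_test(balanced, balanced, toy_learner, toy_tester))
print(learn_then_test(skewed, skewed, toy_learner, toy_tester))
```

Passing the learner and tester as arguments keeps the skeleton independent of the class being tested: only step 1 changes from class to class, while step 2 is the same tolerant identity test each time.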
Slide15
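Hellinger Testing
Change $d_2$ instead of $d_1$
Hellinger distance: $H^2(p, q) = \frac{1}{2} \sum_i (\sqrt{p_i} - \sqrt{q_i})^2$
  Between linear and quadratic relationship with $d_{TV}$: $H^2(p, q) \le d_{TV}(p, q) \le \sqrt{2}\, H(p, q)$
  Natural distance when considering a collection of iid samples
  Comes up in some multivariate testing problems (Costis @ 2:55)
Testing $p = q$ vs. $H(p, q) \ge \varepsilon$?
Trivial results via $d_{TV}$ testing
  Identity: $O(\sqrt{n}/\varepsilon^4)$ samples
  Closeness: $O(\max\{n^{2/3}/\varepsilon^{8/3}, \sqrt{n}/\varepsilon^4\})$ samples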
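The relationship between Hellinger and total variation distance can be checked numerically. The sketch below assumes the normalization $H^2(p, q) = \frac{1}{2} \sum_i (\sqrt{p_i} - \sqrt{q_i})^2$ and total variation as half the $\ell_1$ distance; the function names and the two example distributions are illustrative.

```python
import math


def tv_distance(p, q):
    """Total variation distance: half the l1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def hellinger_distance(p, q):
    """H(p, q) = sqrt( (1/2) * sum_i (sqrt(p_i) - sqrt(q_i))^2 )."""
    return math.sqrt(0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2
                               for pi, qi in zip(p, q)))


p = [0.5, 0.5]   # illustrative distributions, not from the talk
q = [0.9, 0.1]

tv = tv_distance(p, q)
h = hellinger_distance(p, q)

# Standard sandwich: H^2 <= d_TV <= sqrt(2) * H.
print(f"H^2 = {h * h:.4f} <= d_TV = {tv:.4f} <= sqrt(2) H = {math.sqrt(2) * h:.4f}")
```

Because $H(p, q) \ge \varepsilon$ only guarantees $d_{TV}(p, q) \ge \varepsilon^2$, running a total variation identity tester that uses $\Theta(\sqrt{n}/\varepsilon^2)$ samples with parameter $\varepsilon^2$ costs on the order of $\sqrt{n}/\varepsilon^4$ samples.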
Slide16
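Hellinger Testing
Testing $p = q$ vs. $H(p, q) \ge \varepsilon$?
Trivial results via $d_{TV}$ testing
  Identity: $O(\sqrt{n}/\varepsilon^4)$ samples
  Closeness: $O(\max\{n^{2/3}/\varepsilon^{8/3}, \sqrt{n}/\varepsilon^4\})$ samples
But you can do better!
  Identity: $\Theta(\sqrt{n}/\varepsilon^2)$ samples
    No extra cost for $\ell_2$ or $\chi^2$ tolerance either!
  Closeness: $\Theta(\min\{n^{2/3}/\varepsilon^{8/3}, n^{3/4}/\varepsilon^2\})$ samples
    LB and previous UB in [DK’16]
Similar chi-squared statistics as [ADK’15] and [CDVV’14]
  Some tweaks and more careful analysis to handle Hellinger
Daskalakis, K., Wright. Which Distribution Distances are Sublinearly Testable? SODA 2018.
Slide17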
Miscellanea
$p = q$ vs. $d_{KL}(p, q) \ge \varepsilon$?
  Trivially impossible, due to ratio between $p_i$ and $q_i$
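One way to see the obstacle, assuming the far condition is measured in KL divergence $d_{KL}(p, q) = \sum_i p_i \log(p_i / q_i)$: when $p$ puts a tiny mass $\delta$ on an element where $q$ is vanishingly small, the divergence grows without bound as $q_i \to 0$, although roughly $1/\delta$ samples are needed before that element is even observed. The distributions and parameter names below are illustrative assumptions, not the example from the slide.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); terms with p_i = 0 contribute 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # p puts mass where q has none
            total += pi * math.log(pi / qi)
    return total


def tv_distance(p, q):
    """Total variation distance: half the l1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


delta = 1e-3                       # tiny mass that p places on the second element
for tiny in (1e-50, 1e-200, 0.0):  # mass that q assigns to the second element
    q = [1.0 - tiny, tiny]
    p = [1.0 - delta, delta]
    print(f"q_2 = {tiny:g}: d_TV = {tv_distance(p, q):.4g}, "
          f"KL = {kl_divergence(p, q):.4g}")
```

The total variation distance stays near $\delta$ in every case, while the KL divergence keeps increasing as the second entry of $q$ shrinks and is infinite when it reaches zero.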
Upper bounds for $\Omega(n/\log n)$ testing problems?
  i.e., $d_{TV}(p, q) \le \varepsilon_1$ vs. $d_{TV}(p, q) \ge \varepsilon_2$?
  Use estimators mentioned in Jiantao’s talk
Slide18
Thanks!