Slide1
Learning and Testing Junta Distributions
Maryam Aliakbarpour (MIT)
Joint work with: Eric Blais (U Waterloo) and Ronitt Rubinfeld (MIT and TAU)
Slide2
The Problem
Slide3
Relevant features
Smokes
Does not regularly exercise
Gender: male
Correlates with heart attack
Irrelevant to heart attack
Binary features list:
distributed the same
Distribution over heart attack patients.
Junta coordinates
Slide4
Relevant features (heart attack correlates): Non-smoker / Smoker; Exercises / Does not exercise
Irrelevant to heart attack
Assumption: Irrelevant features are uniformly distributed.
Slide5
Problem Definition
We call $P$ a $k$-junta distribution on the set $J \subseteq [n]$, where $|J| = k$, if for any two vectors $x$ and $y$ in $\{0,1\}^n$ such that $x_J = y_J$, we have $P(x) = P(y)$.
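The definition can be checked by brute force on a small domain. The following is a minimal sketch, assuming the distribution is given explicitly as a dictionary from bit tuples to probabilities; the helper names `is_junta_on` and `make_junta` are hypothetical and do not come from the slides.

```python
from itertools import product

def is_junta_on(pmf, n, J, tol=1e-12):
    """True if pmf (dict: n-bit tuple -> probability) depends only on the coordinates in J."""
    seen = {}
    for x in product((0, 1), repeat=n):
        key = tuple(x[i] for i in J)          # restriction x_J
        p = pmf.get(x, 0.0)
        if key in seen and abs(seen[key] - p) > tol:
            return False                      # two vectors agree on J but differ in probability
        seen.setdefault(key, p)
    return True

def make_junta(n, J, weights):
    """Build a junta pmf: `weights` maps each setting of J to its total probability."""
    scale = 2 ** (n - len(J))                 # irrelevant coordinates are uniform
    return {x: weights[tuple(x[i] for i in J)] / scale
            for x in product((0, 1), repeat=n)}

# Toy example (assumed values): n = 3, junta coordinates J = (0, 2).
weights = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
P = make_junta(3, (0, 2), weights)
```

Here `is_junta_on(P, 3, (0, 2))` holds, while the same check on the set `(0, 1)` fails because coordinate 2 is relevant.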
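Observe that a $k$-junta distribution $P$ on $J \subseteq [n]$ satisfies $P(x) = \Pr_{z \sim P}[z_J = x_J] / 2^{n-k}$, where the numerator is the weight of the setting $x_J$.
Slide6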
Relevant features: heart attack correlates.
Irrelevant to heart attack and uniformly distributed.
Testing problem: Is there a small such set?
Learning problem: Which set is it?
Slide7
Related work
Feature selection: Guyon-Elisseeff’03, Liu-Motoda’12, and Chandrashekar-Sahin’14.
Junta functions: A. Blum’94 and A. Blum-Langley’97, …, Blais’09, G. Valiant’12
Property testing of distributions: GR00, BFR+00, BFF+01, Bat01, BDKR02, BKR04, Val08, Pan08, Val11, DDS+13, ADJ+11, LRR11, ILR12, CDVV14, VV14, DKN15b, DKN15a, ADK15, and CDGR16
Testing properties of collections of distributions: Levi-Ron-Rubinfeld’13, and Diakonikolas-Kane’16
Slide8
Our results
Learning
[Table: sample complexity (lower bound, upper bound) and running time (upper bound) for the cover method and for our algorithm]
Slide9
Our results
Testing
[Table: lower bound and upper bound on the sample complexity]
Slide10
Learning Algorithm
Slide11
PAC learning
Learning $k$-junta distributions:
Given that $P$ is a $k$-junta distribution, the learner outputs a distribution that is a $k$-junta distribution and is $\epsilon$-close to $P$ in total variation distance.
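For reference, total variation distance between two distributions on a finite domain is half the $\ell_1$ distance between their probability mass functions. A minimal sketch, assuming both distributions are given as dictionaries from outcomes to probabilities (the function name `total_variation` is illustrative):

```python
def total_variation(p, q):
    """Total variation distance: 0.5 * sum over x of |p(x) - q(x)|."""
    support = set(p) | set(q)                 # outcomes missing from a dict have probability 0
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# Toy example: a fair coin versus a coin that always lands on "a".
d = total_variation({"a": 0.5, "b": 0.5}, {"a": 1.0})
```

An output is $\epsilon$-close to the target when this quantity is at most $\epsilon$.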
Slide12
Theorem
There exists an $\epsilon$-learner for $k$-junta distributions using [...] samples.
Slide13
Overview of the Fourier analysis
PMF of $P$
For any $S \subseteq [n]$: [...]
Parity function: $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$
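The Fourier quantities are expectations of parity functions under the distribution, which is why they can be estimated from samples. The sketch below assumes the common normalization in which the coefficient of $S$ is proportional to $\mathbb{E}_{x \sim P}[\chi_S(x)]$; the slide's exact formula is not preserved, and the names `parity` and `estimate_bias` are hypothetical.

```python
import random

def parity(x, S):
    """chi_S(x): +1 if x has an even number of ones on S, else -1."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def estimate_bias(samples, S):
    """Empirical mean of chi_S over the samples, an estimate of E_{x~P}[chi_S(x)]."""
    return sum(parity(x, S) for x in samples) / len(samples)

# Toy data (assumed): coordinate 0 is always 0, coordinates 1 and 2 are uniform.
rng = random.Random(0)
samples = [(0, rng.randint(0, 1), rng.randint(0, 1)) for _ in range(10000)]
b0 = estimate_bias(samples, (0,))   # fixed coordinate: parity is always +1
b1 = estimate_bias(samples, (1,))   # uniform coordinate: close to 0
```

By a standard concentration argument, the empirical mean is within $\pm\tau$ of the true expectation with high probability once the number of samples is on the order of $1/\tau^2$.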
We can estimate!
Slide14
Lemma 1
$P$ is a $k$-junta distribution on the set $J$.
For any subset $S$ s.t. $S \not\subseteq J$, $\hat{P}(S)$ is zero.
For any [...] of size [...]: [...].
Corollary: [...]
Slide15
Lemma 2
If $P$ is a $k$-junta distribution on the set $J$ but it is $\epsilon$-far from being a $k$-junta distribution on the set $J'$, [...]
Slide16
Estimating is enough!
Accurate Estimation:
Slide17
Proof sketch of Lemma 2
For any [...] define [...]
Recall: [...]
Closest junta to $P$ on the set [...]
Slide18
Learning Algorithm
For every subset of size $k$: Estimate [...].
Output the subset that maximizes [...].
Output the estimate of the biases of every setting on the coordinates of that subset.
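The steps above can be sketched as a brute-force search over candidate sets. The statistic that the slide estimates is not preserved, so this sketch assumes it is the sum of squared empirical parities over the nonempty subsets of the candidate set; that choice, and the names `fourier_weight` and `learn_junta`, are assumptions rather than the authors' implementation.

```python
import random
from collections import Counter
from itertools import combinations

def parity(x, S):
    """chi_S(x): +1 if x has an even number of ones on S, else -1."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def fourier_weight(samples, J):
    """Sum of squared empirical parities over nonempty subsets of J (assumed statistic)."""
    total = 0.0
    for r in range(1, len(J) + 1):
        for S in combinations(J, r):
            bias = sum(parity(x, S) for x in samples) / len(samples)
            total += bias * bias
    return total

def learn_junta(samples, n, k):
    """Return the best-scoring size-k subset and the empirical weight of each of its settings."""
    best = max(combinations(range(n), k), key=lambda J: fourier_weight(samples, J))
    counts = Counter(tuple(x[i] for i in best) for x in samples)
    return best, {setting: c / len(samples) for setting, c in counts.items()}

# Toy data (assumed): n = 4, junta coordinates (0, 2), other coordinates uniform.
rng = random.Random(0)
settings = [(0, 0), (0, 1), (1, 0), (1, 1)]
samples = []
for _ in range(5000):
    a, b = rng.choices(settings, weights=[0.7, 0.1, 0.1, 0.1])[0]
    samples.append((a, rng.randint(0, 1), b, rng.randint(0, 1)))

J_hat, biases = learn_junta(samples, n=4, k=2)
```

The search visits every size-$k$ subset of the $n$ coordinates, so the running time of this sketch grows roughly like $n^k$ times the cost of one estimate.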
Slide19
Testing Algorithm
Slide20
What does it mean to test?
Testing $k$-junta distributions:
If $P$ is a $k$-junta distribution, accept with probability at least 2/3.
If $P$ is $\epsilon$-far from being a $k$-junta distribution, reject with probability at least 2/3.
Slide21
Theorem
There exists an $\epsilon$-tester for $k$-junta distributions using [...] samples.
Slide22
View $P$ as a collection
Slide23
Reduction
$P$ is a junta distribution on $J$ if and only if the collection of distributions induced by $J$ is a collection of uniform distributions.
Slide24
Testing Algorithm
For every subset $J$ of size $k$:
Partition the domain based on $J$ and view $P$ as the collection of distributions.
If it is a collection of uniform distributions, Accept.
If no subset is accepted, Reject.
How?
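The partition step can be illustrated on samples. This is a minimal sketch, assuming samples are bit tuples and that each distribution in the collection is represented by the samples that share one setting of the coordinates in $J$; the name `split_by_junta` is hypothetical.

```python
from collections import defaultdict

def split_by_junta(samples, J):
    """Group samples by their setting on J; each group keeps only the coordinates outside J."""
    n = len(samples[0])
    rest = [i for i in range(n) if i not in J]
    collection = defaultdict(list)
    for x in samples:
        setting = tuple(x[i] for i in J)               # which distribution of the collection
        collection[setting].append(tuple(x[i] for i in rest))
    return dict(collection)

# Toy example: three samples over n = 3 coordinates, candidate set J = (0,).
groups = split_by_junta([(0, 1, 0), (0, 0, 1), (1, 1, 1)], (0,))
```

If the candidate set is the true junta set, every group should look like a sample from the uniform distribution on the remaining coordinates.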
Slide25
Testing collection of uniform distributions
Uniform distribution
Slide26
Testing collection of uniform distributions
Paninski’08 uniformity test:
Draw [...] samples.
Count the number of unique elements in the sample set.
If [...]: Reject.
Else: Accept.
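A test of this shape can be sketched as follows. The slide's sample size and rejection threshold are not preserved, so this sketch makes two assumptions: "unique elements" means elements that appear exactly once, and the threshold is the expected count under the uniform distribution minus a slack of $\epsilon^2 m^2 / (2N)$ for $m$ samples over a domain of size $N$. Both choices, and the function names, are illustrative.

```python
from collections import Counter

def count_unique(samples):
    """Number of elements that appear exactly once in the sample."""
    return sum(1 for c in Counter(samples).values() if c == 1)

def uniformity_test(samples, domain_size, eps):
    """Accept (True) if the unique-element count is not far below its uniform expectation."""
    m = len(samples)
    # Expected number of elements seen exactly once under the uniform distribution.
    expected_uniform = m * (1 - 1 / domain_size) ** (m - 1)
    # Assumed slack: far-from-uniform distributions cause more collisions.
    threshold = expected_uniform - (eps ** 2) * (m ** 2) / (2 * domain_size)
    return count_unique(samples) >= threshold

# Toy inputs: all-distinct samples versus a sample concentrated on one element.
accepts_distinct = uniformity_test(list(range(50)), domain_size=1000, eps=0.5)
accepts_constant = uniformity_test([0] * 50, domain_size=1000, eps=0.5)
```

The intuition is that the uniform distribution minimizes collisions, so a low count of unique elements is evidence against uniformity.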
Slide27
Testing collection of uniform distributions
Our Algorithm:
Draw [...]’s.
Construct [...]’s from [...]’s.
[...] = number of unique elements among [...]’s.
[...] = number of unique elements among [...]’s.
If [...]: Accept.
Otherwise: Reject.
Slide28
Analysis
Paninski’08 uniformity test:
Gap between YES cases and NO cases: [...]
The number of unique elements is close to its expected value!
Bound the variance!
Slide29
Analysis
Ours:
Gap between YES cases and NO cases: [...]
The statistic is close to its expected value!
It only works when [...] are within a constant factor of each other.
Slide30
Reduction to the special case
Partition distributions into [...] buckets.
If the collection is $\epsilon$-far from being uniform, the sub-collection in at least one of the buckets is [...]-far from being uniform.
Slide31
Testing uniformity of a collection of distributions (general case)
Estimate [...]’s.
Partition distributions such that [...]’s are within a constant factor of each other.
For each bucket: test that the sub-collection in the bucket is a set of uniform distributions. If the test rejects, Reject.
Accept.
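The bucketing step can be sketched with geometric buckets. The quantity being estimated is not preserved on the slide; this sketch assumes it is a positive weight per distribution in the collection and uses a constant factor of 2. The name `bucket_by_weight` and the `weights` input are hypothetical.

```python
import math
from collections import defaultdict

def bucket_by_weight(weights):
    """Group indices so that weights in one bucket are within a factor of 2 of each other."""
    buckets = defaultdict(list)
    for i, w in enumerate(weights):
        if w > 0:                                   # zero-weight distributions are never sampled
            # Bucket j holds weights in the range [2^j, 2^(j+1)).
            buckets[math.floor(math.log2(w))].append(i)
    return dict(buckets)

# Toy example with assumed estimated weights.
buckets = bucket_by_weight([0.5, 0.3, 0.26, 0.1, 0.0])
```

With geometric buckets, the number of non-empty buckets grows only logarithmically in the ratio between the largest and smallest weight, and each bucket can then be handed to the special-case test.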
Slide32
Conclusion
Summary:
Introduced junta distributions
How to learn junta distributions
How to test junta distributions
Future directions:
Tighter results
Removing uniformity assumption
Slide33
Reference
Isabelle Guyon and André Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157–1182, 2003.
Huan Liu and Hiroshi Motoda. Feature selection for knowledge discovery and data mining, volume 454. Springer Science & Business Media, 2012.
Girish Chandrashekar and Ferat Sahin. A survey on feature selection methods. Computers & Electrical Engineering, 40(1):16–28, 2014.
Avrim Blum. Relevant examples and relevant features: Thoughts from computational learning theory. In AAAI Fall Symposium on ‘Relevance’, volume 5, 1994.
Avrim Blum and Pat Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2):245–271, December 1997.
Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9(8):295–347, 2013.
Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. CoRR, abs/1601.05557, 2016. URL http://arxiv.org/abs/1601.05557.
Slide34
Reference
Gregory Valiant. Finding correlations in subquadratic time, with applications to learning parities and juntas. FOCS, pages 11–20, 2012.
Eric Blais. Testing juntas nearly optimally. In Proc. 41st Symposium on Theory of Computing, pages 151–158, 2009.