Slide1
Learning and Testing Junta Distributions
Maryam Aliakbarpour (MIT)
Joint work with: Eric Blais (U Waterloo) and Ronitt Rubinfeld (MIT and TAU)
Slide2
The Problem
Slide3
Relevant features in distributions
Distribution over heart attack patients.
Binary features list:
Smokes
Does not regularly exercise
Gender: male
Correlates with heart attack: junta coordinates
Irrelevant to heart attack: distributed the same
Slide4
Relevant features in distributions
Non-smoker / smoker
Exercises / Does not exercise
Heart attack correlates
Irrelevant to heart attack
Assumption: Irrelevant features are uniformly distributed.
Slide5
Problem Definition
We call $P$ a $k$-junta distribution on the set $J \subseteq [n]$, where $|J| = k$, if for any two vectors $x$ and $y$ such that $x_J = y_J$, $P(x) = P(y)$.
Observe that … .
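The definition can be checked directly on a small, explicitly given distribution. The following is a minimal sketch, assuming the PMF is stored as a Python dict over n-bit tuples and coordinates are indexed from 0; the names `is_junta_on` and `junta_pmf` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

def is_junta_on(pmf, n, J, tol=1e-12):
    """Return True if pmf (dict: n-bit tuple -> probability) is a junta distribution on J.

    Checks the definition directly: any two vectors that agree on the
    coordinates in J must have the same probability.
    """
    J = sorted(J)
    seen = {}  # restriction x_J -> probability of the first vector seen with it
    for x in product((0, 1), repeat=n):
        key = tuple(x[i] for i in J)
        p = pmf.get(x, 0.0)
        if key in seen and abs(seen[key] - p) > tol:
            return False
        seen.setdefault(key, p)
    return True

def junta_pmf(n, J, marginal):
    """Build a junta distribution on J from a pmf `marginal` over settings of x_J.

    The remaining n - |J| coordinates are uniform, so each full vector gets
    the marginal mass divided by 2^(n - |J|).
    """
    J = sorted(J)
    scale = 2 ** (n - len(J))
    return {x: marginal[tuple(x[i] for i in J)] / scale
            for x in product((0, 1), repeat=n)}

# Example: n = 3, only coordinate 0 is relevant and it is biased towards 1.
P = junta_pmf(3, [0], {(0,): 0.2, (1,): 0.8})
print(is_junta_on(P, 3, [0]))  # True
print(is_junta_on(P, 3, [1]))  # False
```

This brute-force check enumerates all 2^n vectors, so it only illustrates the definition; it is not a sampling-based algorithm.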
Slide6
Relevant features in distributions
Heart attack correlates
Irrelevant to heart attack and uniformly distributed.
Testing problem: Is there a small such set?
Learning problem: Which set is it?
Slide7
Lots of related work
Feature selection: Guyon-Elisseeff’03, Liu-Motoda’12, and Chandrashekar-Sahin’14.
Junta functions: A. Blum’94 and A. Blum-Langley’97, …, Blais’09, G. Valiant’12
Property testing of distributions: GR00, BFR+00, BFF+01, Bat01, BDKR02, BKR04, Val08, Pan08, Val11, DDS+13, ADJ+11, LRR11, ILR12, CDVV14, VV14, DKN15b, DKN15a, ADK15, and CDGR16
Testing properties of collections of distributions: Levi-Ron-Rubinfeld’13, and Diakonikolas-Kane’16
Slide8
Learning Algorithm
Slide9
PAC learning
Learning $k$-junta distributions:
Given that $P$ is a $k$-junta distribution, the learner outputs $P'$, which is a $k$-junta distribution and $\epsilon$-close to $P$ in total variation distance.
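For reference, total variation distance between two distributions on a finite domain is half the sum of the absolute differences of their probabilities. A minimal sketch, assuming both PMFs are given explicitly as dicts (the function name `total_variation` is a hypothetical helper):

```python
def total_variation(p, q):
    """Total variation distance between two pmfs given as dicts.

    Outcomes missing from one dict are treated as having probability 0.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# Example: a biased coin versus a fair coin.
p = {(0,): 0.2, (1,): 0.8}
q = {(0,): 0.5, (1,): 0.5}
d = total_variation(p, q)  # half of (0.3 + 0.3)
```

Being $\epsilon$-close then means this quantity is at most $\epsilon$.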
Slide10
Our results on Learning
[Table: rows "Cover method" and "Our work"; columns "Sample complexity" (lower bound, upper bound) and "Running time" (upper bound)]
Slide11
Our results on Learning
[Same table as on the previous slide]
Slide12
Theorem
There exists an …-learner for $k$-junta distributions using … samples.
Slide13
Overview of the Fourier analysis on Boolean cube
PMF of $P$, and for any $S \subseteq [n]$ its Fourier coefficient on $S$, expressed through the parity function of $S$.
We can estimate!
Slide14
Lemma 1
$P$ is a $k$-junta distribution on the set $J$.
For any subset $S$ s.t. $S \not\subseteq J$, $\hat{P}(S)$ is zero.
For any … of size …: … .
Corollary: …
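The vanishing of coefficients outside the junta set can be seen empirically. The sketch below assumes one common convention, which may differ from the slides by a normalization factor: the parity function is $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$ and the coefficient on $S$ is the expectation of $\chi_S(x)$ for $x$ drawn from $P$, so it can be estimated by a sample average. The sampler `sample_junta` and the other names are hypothetical.

```python
import random

def parity(S, x):
    """chi_S(x) = (-1)^(sum of x_i for i in S), for x in {0,1}^n."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def estimate_coefficient(samples, S):
    """Empirical average of chi_S over the samples."""
    return sum(parity(S, x) for x in samples) / len(samples)

def sample_junta(n, bias, rng):
    """Coordinate 0 is 1 with probability `bias`; all other coordinates are uniform."""
    x0 = 1 if rng.random() < bias else 0
    return (x0,) + tuple(rng.randint(0, 1) for _ in range(n - 1))

rng = random.Random(0)
samples = [sample_junta(4, 0.8, rng) for _ in range(20000)]

# S = {0} lies inside the junta set {0}: the true value is 0.2 - 0.8 = -0.6.
inside = estimate_coefficient(samples, (0,))
# S = {0, 2} is not contained in the junta set: the true value is 0.
outside = estimate_coefficient(samples, (0, 2))
```

With 20000 samples both estimates should land within a few hundredths of their true values, which is the sense in which these quantities can be estimated from samples.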
Slide15
Lemma 2
If $P$ is a $k$-junta distribution on the set $J$ but it is $\epsilon$-far from being a $k$-junta distribution on the set $J'$, …
Slide16
Learning Algorithm
For every subset $J$ of size $k$:
Estimate … .
Output $J$ that maximizes … .
Output the estimate of the biases of every setting on coordinates $J$.
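The loop structure of the algorithm can be sketched as follows. The quantity estimated per subset is not recoverable from the slide, so this sketch assumes a stand-in score: the sum of squared estimated parity averages over the non-empty subsets of $J$. That choice, the 0-based coordinate indexing, and the names `score` and `learn_junta` are assumptions for illustration, not necessarily the authors' method.

```python
import random
from collections import Counter
from itertools import combinations

def parity(S, x):
    """chi_S(x) = (-1)^(sum of x_i for i in S), for x in {0,1}^n."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def score(samples, J):
    """Assumed score: sum of squared estimated coefficients over non-empty S inside J."""
    total = 0.0
    for r in range(1, len(J) + 1):
        for S in combinations(J, r):
            est = sum(parity(S, x) for x in samples) / len(samples)
            total += est * est
    return total

def learn_junta(samples, n, k):
    """Pick the size-k subset with the largest score, then estimate its biases."""
    best_J = max(combinations(range(n), k), key=lambda J: score(samples, J))
    counts = Counter(tuple(x[i] for i in best_J) for x in samples)
    biases = {setting: c / len(samples) for setting, c in counts.items()}
    return best_J, biases

# Example: n = 4, relevant coordinates 0 and 2 (independent, biased); 1 and 3 uniform.
rng = random.Random(1)
def draw():
    return (int(rng.random() < 0.8), rng.randint(0, 1),
            int(rng.random() < 0.3), rng.randint(0, 1))

samples = [draw() for _ in range(5000)]
J_hat, biases = learn_junta(samples, n=4, k=2)
```

The search visits every size-$k$ subset, so its running time grows with the number of such subsets; the returned `biases` dict is the empirical distribution of the selected coordinates.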
Slide17
Testing Algorithm
Slide18
What does it mean to test?
Testing $k$-junta distributions:
If $P$ is a $k$-junta distribution, accept with probability at least 2/3.
If $P$ is $\epsilon$-far from being a $k$-junta distribution, reject with probability at least 2/3.
Slide19
Our results on Testing
[Table: row "Our work"; columns "Sample Complexity" (lower bound, upper bound) and "Time complexity" (upper bound)]
Slide20
Our results on Testing
[Same table as on the previous slide]
Slide21
Reduction
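$P$ is a junta distribution on $J$ if and only if the collection of distributions obtained by partitioning the domain based on $J$ is a collection of uniform distributions.
Slide22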
Testing Algorithm
For every subset $J$ of size $k$:
Partition the domain based on $J$ and view $P$ as the collection of distributions.
If it is a collection of uniform distributions, Accept.
Reject.
Better than just $\binom{n}{k}$ different tests.
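The reduction can be illustrated with a naive version of this loop. The sketch below swaps in a simple collision-rate check as the uniformity test for each part of the partition; this is a stand-in, not the collection tester used in the talk, and the `slack` threshold, the 0-based indexing, and the names `looks_uniform` and `junta_tester` are hypothetical.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations

def looks_uniform(values, domain_size, slack):
    """Collision check: a uniform distribution has collision probability 1/domain_size."""
    m = len(values)
    if m < 2:
        return True  # too few samples to find evidence against uniformity
    collisions = sum(c * (c - 1) // 2 for c in Counter(values).values())
    rate = collisions / (m * (m - 1) // 2)
    return rate <= (1 + slack) / domain_size

def junta_tester(samples, n, k, slack=0.15):
    """Accept if, for some size-k subset J, every part of the partition looks uniform."""
    for J in combinations(range(n), k):
        rest = [i for i in range(n) if i not in J]
        parts = defaultdict(list)  # setting of x_J -> samples restricted to the rest
        for x in samples:
            parts[tuple(x[i] for i in J)].append(tuple(x[i] for i in rest))
        if all(looks_uniform(v, 2 ** len(rest), slack) for v in parts.values()):
            return True
    return False

rng = random.Random(2)
# A 1-junta on coordinate 0 (n = 4): coordinate 0 is biased, the rest are uniform.
junta = [(int(rng.random() < 0.8),) + tuple(rng.randint(0, 1) for _ in range(3))
         for _ in range(4000)]
# Not a 1-junta: coordinates 0 and 1 are both biased.
far = [(int(rng.random() < 0.8), int(rng.random() < 0.8),
        rng.randint(0, 1), rng.randint(0, 1)) for _ in range(4000)]

accepts_junta = junta_tester(junta, n=4, k=1)
accepts_far = junta_tester(far, n=4, k=1)
```

This version runs one independent check per subset, which is exactly the baseline the slide says the actual algorithm improves on.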
Slide23
Conclusion
Summary:
Introduced junta distributions
How to learn junta distributions
How to test junta distributions
Future directions:
Tighter results
Removing uniformity assumption
Slide24
Reference
Isabelle Guyon and André Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157–1182, 2003.
Huan Liu and Hiroshi Motoda. Feature selection for knowledge discovery and data mining, volume 454. Springer Science & Business Media, 2012.
Girish Chandrashekar and Ferat Sahin. A survey on feature selection methods. Computers & Electrical Engineering, 40(1):16–28, 2014.
Avrim Blum. Relevant examples and relevant features: Thoughts from computational learning theory. In AAAI Fall Symposium on ‘Relevance’, volume 5, 1994.
Avrim Blum and Pat Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2):245–271, December 1997.
Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9(8):295–347, 2013.
Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. CoRR, abs/1601.05557, 2016. URL http://arxiv.org/abs/1601.05557.
Slide25
Reference
Gregory Valiant. Finding correlations in subquadratic time, with applications to learning parities and juntas. FOCS, pages 11–20, 2012.
Eric Blais. Testing juntas nearly optimally. In Proc. 41st Symposium on Theory of Computing, pages 151–158, 2009.