Learning and Testing Junta Distributions

Uploaded On 2017-05-10

Presentation Transcript

Slide1

Learning and Testing Junta Distributions

Maryam Aliakbarpour (MIT)

Joint work with: Eric Blais (U Waterloo) and Ronitt Rubinfeld (MIT and TAU)

Slide2

The Problem

Slide3

Relevant features in distributions

Distribution over heart attack patients, described by a list of binary features:

Smokes

Does not regularly exercise

Gender: male

Features that correlate with heart attack are the junta coordinates. Features that are irrelevant to heart attack are distributed the same.

Slide4

Relevant features in distributions

Heart attack correlates: smoker or non-smoker, exercises or does not exercise.

The remaining features are irrelevant to heart attack.

Assumption: Irrelevant features are uniformly distributed.

Slide5

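Problem Definition

We call a distribution $P$ over $\{0,1\}^n$ a $k$-junta distribution on the set $J \subseteq [n]$, where $|J| = k$, if for any two vectors $x$ and $y$ such that $x_J = y_J$, we have $P(x) = P(y)$. Here $x_J$ denotes the restriction of $x$ to the coordinates in $J$.

Observe that the coordinates outside $J$ are then uniformly distributed: $P(x) = \Pr_{z \sim P}[z_J = x_J] / 2^{n-k}$.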
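As a small illustration of the definition, the following Python sketch builds the PMF of a junta distribution from a table of probabilities on the relevant coordinates and checks the defining property by brute force. The helper names `junta_pmf` and `is_junta_on`, the 0-indexed coordinates, and the example numbers are assumptions made for this illustration and do not come from the talk.

```python
from itertools import product

def junta_pmf(n, J, q):
    """PMF of a junta distribution on {0,1}^n with relevant coordinates J.

    J is a tuple of 0-indexed coordinates and q maps each setting of those
    coordinates (a tuple of bits) to its probability. Coordinates outside J
    are uniform, so the 2^(n-k) points that share a setting of J split that
    setting's mass equally.
    """
    k = len(J)
    pmf = {}
    for x in product((0, 1), repeat=n):
        x_J = tuple(x[i] for i in J)
        pmf[x] = q[x_J] / 2 ** (n - k)
    return pmf

def is_junta_on(pmf, J, tol=1e-12):
    """Check the definition: P(x) = P(y) whenever x and y agree on J."""
    mass_by_setting = {}
    for x, p in pmf.items():
        x_J = tuple(x[i] for i in J)
        first = mass_by_setting.setdefault(x_J, p)
        if abs(first - p) > tol:
            return False
    return True

# Hypothetical example with n = 4 and J = (0, 2).
q = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
P = junta_pmf(4, (0, 2), q)
```

In this example `is_junta_on(P, (0, 2))` holds, while the check fails for `(1, 3)` because points that agree on coordinates 1 and 3 can still differ on the relevant ones.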

Slide6

Relevant features in distributions

Some features correlate with heart attack. The rest are irrelevant to heart attack and uniformly distributed.

Testing problem: Is there a small such set?

Learning problem: Which set is it?

Slide7

Lots of related work

Feature selection: Guyon-Elisseeff’03, Liu-Motoda’12, and Chandrashekar-Sahin’14.

Junta functions: A. Blum’94 and A. Blum-Langley’97, ..., Blais’09, G. Valiant’12.

Property testing of distributions: GR00, BFR+00, BFF+01, Bat01, BDKR02, BKR04, Val08, Pan08, Val11, DDS+13, ADJ+11, LRR11, ILR12, CDVV14, VV14, DKN15b, DKN15a, ADK15, and CDGR16.

Testing properties of collections of distributions: Levi-Ron-Rubinfeld’13, and Diakonikolas-Kane’16.

Slide8

Learning Algorithm

Slide9

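PAC learning

Learning $k$-junta distributions: Given samples from $P$, which is a $k$-junta distribution, the learner outputs $P'$, which is a $k$-junta distribution and $\epsilon$-close to $P$ in total variation distance:

$d_{TV}(P, P') = \frac{1}{2} \sum_{x \in \{0,1\}^n} |P(x) - P'(x)| \le \epsilon$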
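The closeness requirement uses total variation distance, which is half the $\ell_1$ distance between the two PMFs. A minimal Python sketch of that computation follows. Representing a PMF as a dictionary, the function names `total_variation` and `is_eps_close`, and the two example distributions are assumptions for illustration only.

```python
def total_variation(p, q):
    """d_TV(p, q) = (1/2) * sum over x of |p(x) - q(x)|, with PMFs as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def is_eps_close(p, q, eps):
    """True when q is eps-close to p in total variation distance."""
    return total_variation(p, q) <= eps

# Hypothetical example on {0,1}^2: uniform versus a skewed distribution.
p = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
q = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.40, (1, 1): 0.10}
distance = total_variation(p, q)
```

Each of the four points differs by 0.15 in probability, so the distance here is 0.3 up to floating-point rounding.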

 

Slide10

Our results on Learning

[Table comparing the cover method with our work. Columns: sample complexity (lower bound, upper bound) and running time (upper bound). The bound expressions are not preserved in this transcript.]

Slide11

Our results on Learning

[Same table as Slide10.]

Slide12

Theorem

There exists an $\epsilon$-learner for $k$-junta distributions using … samples.

Slide13

 

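Overview of the Fourier analysis on Boolean cube

PMF of $P$: $P : \{0,1\}^n \to [0,1]$.

Parity function: for any $S \subseteq [n]$, $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$.

For any $S \subseteq [n]$: $\hat{P}(S) = \frac{1}{2^n} \sum_{x \in \{0,1\}^n} P(x)\,\chi_S(x) = \frac{1}{2^n}\,\mathbb{E}_{x \sim P}[\chi_S(x)]$.

We can estimate $\mathbb{E}_{x \sim P}[\chi_S(x)]$ from samples!

Slide14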

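Lemma 1

Suppose $P$ is a $k$-junta distribution on the set $J$.

For any subset $S$ s.t. $S \not\subseteq J$, $\hat{P}(S)$ is zero.

For any $J'$ of size $k$: $\sum_{S \subseteq J'} \hat{P}(S)^2 \le \sum_{S \subseteq J} \hat{P}(S)^2$.

Corollary: $J \in \arg\max_{J' : |J'| = k} \sum_{S \subseteq J'} \hat{P}(S)^2$.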
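The claim that the Fourier coefficient vanishes on every set $S$ that is not contained in $J$ can be seen by a pairing argument. The derivation below is a sketch supplied here rather than taken from the slides. It assumes the parity function $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$, picks a coordinate $i \in S \setminus J$, and writes $x^{\oplus i}$ (notation introduced for this sketch) for $x$ with bit $i$ flipped. Since $x \mapsto x^{\oplus i}$ is a bijection of $\{0,1\}^n$, $P(x^{\oplus i}) = P(x)$ because $i \notin J$, and $\chi_S(x^{\oplus i}) = -\chi_S(x)$ because $i \in S$:

```latex
\begin{align*}
\sum_{x \in \{0,1\}^n} P(x)\,\chi_S(x)
  &= \frac{1}{2} \sum_{x \in \{0,1\}^n}
     \bigl( P(x)\,\chi_S(x) + P(x^{\oplus i})\,\chi_S(x^{\oplus i}) \bigr) \\
  &= \frac{1}{2} \sum_{x \in \{0,1\}^n}
     P(x)\,\bigl( \chi_S(x) - \chi_S(x) \bigr) \\
  &= 0 .
\end{align*}
```

The Fourier coefficient of $P$ at $S$ is a fixed multiple of this sum under any normalization, so it is zero as well.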

Slide15

Lemma 2

If $P$ is a $k$-junta distribution on the set $J$ but it is $\epsilon$-far from being a $k$-junta distribution on the set $J'$, then …

Slide16

Learning Algorithm

For every subset $J'$ of size $k$: estimate $\sum_{S \subseteq J'} \hat{P}(S)^2$.

Output the $J'$ that maximizes the estimate.

Output the estimate of the biases of every setting on coordinates $J'$.
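The learning steps on this slide can be sketched in Python as follows. This is a minimal brute-force sketch under stated assumptions, not the authors' implementation: it assumes the score of a candidate set is the sum, over its subsets $S$, of the squared empirical mean of the parity $\chi_S$ on the samples (any fixed normalization of the Fourier coefficients ranks candidate sets the same way). The names `sample_junta`, `estimate_bias`, `learn_junta`, and the example distribution are hypothetical.

```python
import random
from itertools import combinations, product

def chi(S, x):
    """Parity function chi_S(x) = (-1)^(sum of x_i for i in S)."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def estimate_bias(S, samples):
    """Empirical mean of chi_S, an estimate of E_{x~P}[chi_S(x)]."""
    return sum(chi(S, x) for x in samples) / len(samples)

def subsets(J):
    """All subsets of the tuple J, including the empty set."""
    for r in range(len(J) + 1):
        yield from combinations(J, r)

def learn_junta(samples, n, k):
    """Return (J, q): the best-scoring set and the empirical PMF on it."""
    cache = {}  # candidate sets share subsets, so estimate each S once

    def weight(S):
        if S not in cache:
            cache[S] = estimate_bias(S, samples) ** 2
        return cache[S]

    best_J = max(combinations(range(n), k),
                 key=lambda J: sum(weight(S) for S in subsets(J)))
    counts = {a: 0 for a in product((0, 1), repeat=k)}
    for x in samples:
        counts[tuple(x[i] for i in best_J)] += 1
    q = {a: c / len(samples) for a, c in counts.items()}
    return best_J, q

def sample_junta(n, J, q, m, rng):
    """Draw m points: coordinates in J follow q, the rest are uniform."""
    settings = list(q)
    weights = [q[a] for a in settings]
    out = []
    for _ in range(m):
        x = [rng.randint(0, 1) for _ in range(n)]
        a = rng.choices(settings, weights=weights)[0]
        for i, bit in zip(J, a):
            x[i] = bit
        out.append(tuple(x))
    return out

# Hypothetical example: n = 6, relevant coordinates (1, 4).
rng = random.Random(0)
true_q = {(0, 0): 0.55, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.35}
samples = sample_junta(n=6, J=(1, 4), q=true_q, m=4000, rng=rng)
J_hat, q_hat = learn_junta(samples, n=6, k=2)
```

The search enumerates all $\binom{n}{k}$ candidate sets, so the sketch is only practical for small $n$ and $k$.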

 

 

 

Slide17

Testing Algorithm

Slide18

What does it mean to test?

Testing $k$-junta distributions:

If $P$ is a $k$-junta distribution, accept with probability at least 2/3.

If $P$ is $\epsilon$-far from being a $k$-junta distribution, reject with probability at least 2/3.

Slide19

Our results on Testing

[Table of our results. Columns: sample complexity (lower bound, upper bound) and time complexity (upper bound). The bound expressions are not preserved in this transcript.]

Slide20

Our results on Testing

[Same table as Slide19.]

Slide21

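Reduction

$P$ is a junta distribution on $J$

if and only if

$\{P_a\}$ is a collection of uniform distributions, where $P_a$ is the distribution of the coordinates outside $J$ conditioned on $x_J = a$.

Slide22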

Testing Algorithm

For every subset $J$ of size $k$:

Partition the domain based on $J$ and view $P$ as the collection of distributions $\{P_a\}$, one for each setting $a$ of the coordinates in $J$.

If $\{P_a\}$ is a collection of uniform distributions, Accept.

Reject.

Better than just $\binom{n}{k}$ different tests.
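The structure of this tester can be sketched in Python as follows. This sketch swaps in a naive plug-in check for the uniformity tester on collections of distributions: for each candidate set it groups the samples by their setting on that set and averages the empirical total variation distance to uniform, weighted by group frequency. The plug-in check is not sample-efficient and is used only to show the control flow. The names `distance_from_uniform_collection` and `junta_tester`, the threshold value, and both example distributions are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations

def distance_from_uniform_collection(samples, n, J):
    """Weighted average, over observed settings a of J, of the empirical TV
    distance between the coordinates outside J (given x_J = a) and uniform."""
    rest = [i for i in range(n) if i not in J]
    groups = defaultdict(Counter)
    for x in samples:
        groups[tuple(x[i] for i in J)][tuple(x[i] for i in rest)] += 1
    domain = 2 ** len(rest)
    total = 0.0
    for counts in groups.values():
        m_a = sum(counts.values())
        seen = sum(abs(c / m_a - 1 / domain) for c in counts.values())
        unseen = (domain - len(counts)) / domain  # points never sampled
        total += (m_a / len(samples)) * 0.5 * (seen + unseen)
    return total

def junta_tester(samples, n, k, threshold):
    """Accept iff some J of size k makes the collection look uniform."""
    return any(distance_from_uniform_collection(samples, n, J) <= threshold
               for J in combinations(range(n), k))

rng = random.Random(0)

def junta_point():
    # Hypothetical 2-junta on coordinates (0, 3): x_3 equals x_0 with
    # probability 0.8, and every other coordinate is uniform.
    x = [rng.randint(0, 1) for _ in range(5)]
    x[3] = x[0] if rng.random() < 0.8 else 1 - x[0]
    return tuple(x)

def far_point():
    # Hypothetical distribution with x_0 = x_1 = x_2, so no pair of
    # coordinates leaves the remaining ones uniform.
    b = rng.randint(0, 1)
    return (b, b, b, rng.randint(0, 1), rng.randint(0, 1))

junta_samples = [junta_point() for _ in range(8000)]
far_samples = [far_point() for _ in range(8000)]
accepted = junta_tester(junta_samples, n=5, k=2, threshold=0.1)
rejected = not junta_tester(far_samples, n=5, k=2, threshold=0.1)
```

The same samples are reused for every candidate set, which mirrors the slide's remark that the tester should do better than running a separate test per subset.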

Slide23

Conclusion

Summary:

Introduced junta distributions

How to learn junta distributions

How to test junta distributions

Future directions:

Tighter results

Removing uniformity assumption

Slide24

Reference

Isabelle Guyon and André Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157–1182, 2003.

Huan Liu and Hiroshi Motoda. Feature selection for knowledge discovery and data mining, volume 454. Springer Science & Business Media, 2012.

Girish Chandrashekar and Ferat Sahin. A survey on feature selection methods. Computers & Electrical Engineering, 40(1):16–28, 2014.

Avrim Blum. Relevant examples and relevant features: Thoughts from computational learning theory. In AAAI Fall Symposium on ‘Relevance’, volume 5, 1994.

Avrim Blum and Pat Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2):245–271, December 1997.

Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9(8):295–347, 2013.

Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. CoRR, abs/1601.05557, 2016. URL http://arxiv.org/abs/1601.05557.

Slide25

Reference

Gregory Valiant. Finding correlations in subquadratic time, with applications to learning parities and juntas. FOCS, pages 11–20, 2012.

Eric Blais. Testing juntas nearly optimally. In Proc. 41st Symposium on Theory of Computing, pages 151–158, 2009.