
Learning Representations of Data - PowerPoint Presentation

Uploaded On 2017-11-10

J Saketha Nath IIT Bombay Collaborators Pratik Jawanpuria Arun Iyer Sunita Sarawagi Ganesh Ramakrishnan Outline Introduction to Representation Learning Summary of Research ID: 604368

Presentation Transcript

Slide1

Learning Representations of Data

J. Saketha Nath, IIT Bombay

Collaborators: Pratik Jawanpuria, Arun Iyer, Sunita Sarawagi, Ganesh Ramakrishnan.

Slide2

Outline

Introduction to Representation Learning

Summary of Research

Case Study: Class-ratio estimation

Concluding remarks

Slide3

Introduction to Representation Learning

Slide4

Representation Learning: Illustration

Training

Inference

Slide6

Representation Learning: Examples

Training

Inference

Principal Component Analysis

Deep Learning

(long list :)

Slide8

Kernel Learning: Illustration

Training

Inference

Slide9

Kernel Learning: Broad set-ups

Multi-modal Data

[NIPS’09, JMLR’11]

Multi-task Learning

[SDM’11, ICML’12]

Interpretable Rule Learning

[ICML’11, JMLR’15]

Slide10

Case Study: Class Ratio Estimation

Kernel Learning

Slide11

Class Ratio Estimation

Labeled

Unlabeled

What fraction from each class?

Slide13

Class Ratio Estimation

Assumption:

 

 Slide15

Class Ratio Estimation

Representation of data distribution using kernel
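To make the idea of representing a distribution with a kernel concrete, here is a minimal sketch. It assumes an RBF kernel and an estimator that matches kernel mean embeddings: each labeled class and the unlabeled set are summarized by their kernel means, and the class ratio is the mixture weight whose combination of class means lies closest to the unlabeled mean. The function names, the bandwidth `gamma`, the synthetic data and the two-class closed-form solve are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def rbf_gram(X, Y, gamma=0.5):
    """RBF kernel matrix with entries exp(-gamma * ||x - y||^2)."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def estimate_class_ratio(X_pos, X_neg, X_unl, gamma=0.5):
    """Estimate the positive-class fraction in X_unl (two-class case).

    Minimizes || t*mu_pos + (1-t)*mu_neg - mu_unl ||^2 over t in [0, 1].
    Inner products between mean embeddings are averages of kernel values.
    """
    pp = rbf_gram(X_pos, X_pos, gamma).mean()
    nn = rbf_gram(X_neg, X_neg, gamma).mean()
    pn = rbf_gram(X_pos, X_neg, gamma).mean()
    pu = rbf_gram(X_pos, X_unl, gamma).mean()
    nu = rbf_gram(X_neg, X_unl, gamma).mean()
    a = pp - 2.0 * pn + nn   # ||mu_pos - mu_neg||^2
    b = pu - nu - pn + nn    # <mu_pos - mu_neg, mu_unl - mu_neg>
    return float(np.clip(b / a, 0.0, 1.0))

# Hypothetical data: balanced labeled set, unlabeled set with 30% positives.
rng = np.random.default_rng(0)
X_pos = rng.normal(+2.0, 1.0, size=(300, 2))
X_neg = rng.normal(-2.0, 1.0, size=(300, 2))
X_unl = np.vstack([rng.normal(+2.0, 1.0, size=(150, 2)),
                   rng.normal(-2.0, 1.0, size=(350, 2))])
theta = estimate_class_ratio(X_pos, X_neg, X_unl)
print(f"estimated positive fraction: {theta:.3f}")
```

The estimate depends on the kernel only through averages of kernel values, which is why the choice of kernel matters for the quality of the estimate.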

 

 

 Slide17

Class Ratio Estimation

Kernel Learning: Which kernel is best?

Slide19

Statistical Consistency

Theorem: Let […] be the estimated and true class ratios, let […] be a matrix with […] column as […], and let […]; then with probability […], we have: […]

Please refer to ICML’14, KDD’16 for details.

Slide20

Kernel Learning

Given: […].

Goal: Find […] such that…

min. bound on […]

mineig […] (CONVEX!)

mineig […] (CONVEX!)

min. empirical average of […]

Posed as an SDP, solved using a cutting-plane algorithm
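A short numerical check may help explain why a min-eigenvalue objective can be labeled convex. The smallest eigenvalue of a symmetric matrix is a concave function of the matrix, so maximizing it over a weighted combination of fixed matrices is a convex problem. The sketch below tests that concavity on random positive semidefinite matrices; the matrices, weights and helper names are illustrative assumptions and do not reproduce the formulation on the slide.

```python
import numpy as np

def min_eig(M):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(M)[0])

def combine(weights, mats):
    """Weighted sum of matrices, standing in for a combination of base kernels."""
    return sum(w * M for w, M in zip(weights, mats))

rng = np.random.default_rng(0)

# Three random PSD matrices playing the role of base Gram matrices.
mats = []
for _ in range(3):
    B = rng.normal(size=(5, 5))
    mats.append(B @ B.T)

def random_simplex_point(n):
    w = rng.random(n)
    return w / w.sum()

# Concavity: f(lam*u + (1-lam)*v) >= lam*f(u) + (1-lam)*f(v).
violations = 0
for _ in range(1000):
    u, v = random_simplex_point(3), random_simplex_point(3)
    lam = rng.random()
    lhs = min_eig(combine(lam * u + (1 - lam) * v, mats))
    rhs = lam * min_eig(combine(u, mats)) + (1 - lam) * min_eig(combine(v, mats))
    if lhs < rhs - 1e-9:
        violations += 1

print("concavity violations:", violations)
```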

 

Please refer to ICML’14 for details.

Slide29

Simulation results

[Figure: estimation error vs. varying negative-class proportions in U (proportion in L is set to [0.5, 0.5])]

Slide30

Concluding remarks…

Slide31

Summary of Research

Slide32

Kernel learning

DATA, INFERENCE

Multi-modal Data [NIPS’09, JMLR’11]

Multi-task learning [SDM’11, ICML’12]

Rule Ensemble Learning [ICML’11, JMLR’15]

Slide36

Kernel learning – Multi-modal Data

Training, Inference

For details refer to NIPS’09, JMLR’11

Sparse combination of kernels in vogue

Key idea

Hierarchy in kernels

Non-sparse over modes

Sparse within each mode
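One common way to encode "non-sparse over modes, sparse within each mode" is a mixed norm on the kernel weights: an l1 norm inside each mode, which encourages few active kernels per mode, combined across modes by an l2 norm, which discourages dropping whole modes. The sketch below compares that penalty with the plain l1 norm used for fully sparse combinations. The grouping, the weights and the choice of l2 across modes are hypothetical; this illustrates the idea rather than the exact formulation in the cited papers.

```python
import numpy as np

def mixed_norm(weights, groups, across=2.0):
    """l1 norm within each group, l_across norm over the group totals."""
    within = np.array([np.abs(weights[idx]).sum() for idx in groups])
    return float((within ** across).sum() ** (1.0 / across))

# Three modes with two candidate kernels each (hypothetical layout).
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]

# Same l1 mass, different placement.
w_one_mode = np.array([3.0, 0.0, 0.0, 0.0, 0.0, 0.0])   # uses a single mode
w_all_modes = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])  # one kernel per mode

for name, w in [("one mode", w_one_mode), ("all modes", w_all_modes)]:
    print(name, "l1 =", np.abs(w).sum(), "mixed =", mixed_norm(w, groups))
```

Both weight vectors have the same l1 norm, but the mixed norm is smaller for the vector that keeps every mode active, so a mixed-norm penalty favors spreading weight over modes while staying sparse inside each one.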

Mirror-descent algorithm

Iterations involve sparse case solution

Slide40

Kernel learning – Multi-task Case

Paradigm shift

Multi-task feature learning to multi-task kernel learning

Generalized to case of unknown task relationships

Few kernels shared by few tasks

Convex formulation and active-set based algorithm

Analysed convergence

For details refer to SDM’11, ICML’12

DATA, INFERENCE

Slide43

Rule Ensemble Learning

Posed as kernel learning problem

Convex formulation

Provable bounds on convergence

Search for compact rules

If a rule is long, its descendants are even longer
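The pruning argument can be sketched as a depth-first search over conjunctive rules: extending a rule only ever adds conditions, so once a rule reaches a length cap none of its descendants need to be visited. The condition names, the cap and the function below are hypothetical; they illustrate the reduction of the search space, not the algorithm from the cited papers.

```python
def enumerate_rules(conditions, max_len):
    """List all conjunctions of at most max_len conditions.

    A rule is a tuple of conditions; its descendants are the rules
    obtained by adding further conditions to it.
    """
    rules = []

    def extend(rule, start):
        if rule:
            rules.append(tuple(rule))
        if len(rule) == max_len:
            return  # every descendant would be longer, so prune this branch
        for i in range(start, len(conditions)):
            extend(rule + [conditions[i]], i + 1)

    extend([], 0)
    return rules

conditions = ["age>30", "income>50k", "owns_home", "has_degree"]
compact = enumerate_rules(conditions, max_len=2)
print(len(compact), "compact rules out of", 2 ** len(conditions) - 1)
```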

 

For details refer to ICML’11, JMLR’15

Slide46

Overwhelming choice in representations

Data-dependent representation learning

DATA

Training, Inference

Representation Learning

Slide49

Explicit feature learning

Dictionary Learning

Deep Learning

DATA

Training, Inference

Representation Learning - Paradigms

Slide50

Explicit feature learning

Dictionary Learning

Deep Learning

Implicit feature learning

Kernel learning

DATA

 

 

Kernel Methods

Representation Learning - Paradigms
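The distinction between explicit and implicit features can be illustrated with a degree-2 polynomial kernel: the kernel value computed directly from the inputs equals the inner product of explicitly constructed features, so a kernel method never has to build those features. This is a minimal sketch; the kernel, the feature map and the sample vectors are chosen for illustration and are not taken from the slides.

```python
import numpy as np

def explicit_features(x):
    """Explicit degree-2 feature map: all pairwise products x_i * x_j."""
    return np.outer(x, x).ravel()

def poly2_kernel(x, y):
    """Implicit version: k(x, y) = (x . y)^2, no feature vector is built."""
    return float(np.dot(x, y) ** 2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

explicit = float(explicit_features(x) @ explicit_features(y))
implicit = poly2_kernel(x, y)
print("explicit inner product:", explicit)
print("kernel value:          ", implicit)
```

The explicit map has d^2 coordinates for d-dimensional inputs, while the kernel needs only one d-dimensional dot product. Kernel learning chooses the kernel, and hence the implicit features, from data.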