Uploaded On 2016-05-05



Presentation Transcript

Slide1

Hierarchical Classification

Rongcheng Lin

Computer Science Department

Slide2

Contents

Motivation, Definition & Problem

Review of SVM

Hierarchical Classification

Path-based Approaches

Regularization-based Approaches

Slide3

Motivation

Real-world classes are structured, and they are often hierarchically related.

Gene function prediction

Document categorization

Image search

Hierarchies or taxonomies offer a clear advantage in supporting tasks like browsing, searching, or visualization.

International Patent Classification scheme

Yahoo! Web catalogs

Prior knowledge about class relationships can improve classification performance, especially for tasks with a large number of classes.

Slide6

Definition and Problem

Automatically categorize data into pre-defined topic hierarchies or taxonomies

Supervised Learning

Structured Output

Slide7

DAG and Tree Structure

Slide8

Definition and Problem

Problem and solution?

Slide9

Definition and Problem

Incorporate the inter-class relationships (the hierarchy) into classification

Redefine the problem

Lower-level categories are more detailed, while upper-level categories are more general

Redefine the margin

Different classification mistakes are of different severity

Redefine the loss function

Slide12

Review: Binary SVM

Binary classification

Margin

Loss Function

$w^T x + b = 0$

$w^T x + b < 0$

$w^T x + b > 0$

$f(x) = \operatorname{sign}(w^T x + b)$

 Slide13

Review: Binary SVM

General Form:

Slide14
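For reference, a standard soft-margin form of the binary SVM is sketched below. It is an assumption about what the general form on the preceding slide refers to; the slack variables $\xi_i$, the trade-off constant $C$, and the sample count $n$ are notation introduced here, with labels $y_i \in \{-1, +1\}$.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\,(w^T x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1,\dots,n
\end{aligned}
```

Eliminating the slack variables gives the equivalent unconstrained objective $\frac{1}{2}\lVert w\rVert^2 + C\sum_i \max\bigl(0,\ 1 - y_i (w^T x_i + b)\bigr)$, which makes the role of the loss function explicit.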

Review: Multiclass SVM

1) One-vs-the-rest

2) Crammer & Singer (pairwise)

Slide15

Review: Multiclass SVM

Dedicated Loss Function
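One common dedicated multiclass loss is the Crammer & Singer formulation, sketched here as an assumption about what this slide shows. The per-class weight vectors $w_k$, the slack variables $\xi_i$, the constant $C$, and the class count $K$ are notation introduced for illustration.

```latex
\begin{aligned}
\min_{w_1,\dots,w_K,\ \xi}\quad & \frac{1}{2}\sum_{k=1}^{K}\lVert w_k\rVert^2 + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & w_{y_i}^T x_i - w_k^T x_i \ge 1 - \xi_i \quad \forall i,\ \forall k \ne y_i, \qquad \xi_i \ge 0
\end{aligned}
```

The corresponding per-example loss is $\max\bigl(0,\ 1 + \max_{k \ne y} w_k^T x - w_y^T x\bigr)$: the true class must outscore every other class by a margin of at least 1.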

 Slide17

Review: Hinge Loss Function

The more you violate the margin, the higher the penalty is.

Slide18
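The hinge loss from the preceding slide can be sketched in a few lines. This is a minimal illustration, assuming labels in {-1, +1} and a raw score s = w^T x + b; the function name `hinge_loss` is chosen here for illustration.

```python
def hinge_loss(score: float, label: int) -> float:
    """Hinge loss max(0, 1 - y * s) for a label y in {-1, +1} and a raw score s."""
    if label not in (-1, 1):
        raise ValueError("label must be -1 or +1")
    return max(0.0, 1.0 - label * score)


# For a positive example, the penalty is zero once the score clears the margin
# (score >= 1) and grows linearly as the score falls further below it.
for score in (2.0, 1.0, 0.5, 0.0, -1.0):
    print(score, hinge_loss(score, +1))
```

Unlike the zero-one loss, this penalty is also nonzero for correctly classified points that fall inside the margin (for example, a score of 0.5 on a positive example).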

Loss Function

 Slide19

Hierarchical Classifiers

Path-based Approaches

Large Margin Hierarchical Classification

Hierarchical Document Categorization with Support Vector Machine

On Large Margin Hierarchical Classification with multiple paths

Regularization-based Approaches

Tree-Guided Group Lasso for Multi-task Regression

Hierarchical Multitask Structured Output Learning for Large-Scale Segmentation

Slide20

Tree Distance

A given hierarchy induces a metric over the set of classes.

The tree distance, or tree-induced error, between classes $y$ and $\hat{y}$ is defined to be the number of edges along the (unique) path from $y$ to $\hat{y}$.

[Figure: example class tree with numbered nodes, with $y$ and $\hat{y}$ marked]
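The definition above can be sketched as a short function. This is a minimal illustration, assuming the hierarchy is a rooted tree stored as a child-to-parent map; the toy tree `PARENT` and the function names are hypothetical and do not reproduce the figure.

```python
def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root (inclusive)."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(u, v, parent):
    """Number of edges on the unique path between u and v in a rooted tree."""
    steps_from_u = {n: d for d, n in enumerate(path_to_root(u, parent))}
    for steps_from_v, n in enumerate(path_to_root(v, parent)):
        if n in steps_from_u:  # first common ancestor of u and v
            return steps_from_u[n] + steps_from_v
    raise ValueError("nodes are not in the same tree")


# Hypothetical toy hierarchy (child -> parent); the root 0 has no entry.
PARENT = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}

siblings = tree_distance(3, 4, PARENT)   # path 3 -> 1 -> 4
cousins = tree_distance(3, 5, PARENT)    # path 3 -> 1 -> 0 -> 2 -> 5
```

Confusing two siblings costs less than confusing classes in different subtrees, which is the sense in which mistakes differ in severity.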

 Slide23

Loss Functions

[Figure: plots of the three losses listed below]

Zero-One Loss

Hinge Loss

Hierarchical Hinge Loss

 

 

 Slide24

Path-based Approaches

Path-based approaches try to find the most likely path from the root.

Only the parameters of misclassified nodes in the tree need to be updated.
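A minimal sketch of this idea, under stated assumptions: each non-root node carries its own weight vector, a class is scored by summing the node scores along its path from the root, and a perceptron-style step (a stand-in, not the large-margin update from the cited papers) changes only the nodes where the true and predicted paths differ. The toy tree, `PARENT`, `LEAVES`, and all function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def path_nodes(label, parent):
    """Nodes on the path from the root to `label`, root excluded."""
    nodes = set()
    while label in parent:
        nodes.add(label)
        label = parent[label]
    return nodes


def predict(x, weights, parent, leaves):
    """Pick the leaf whose root-to-leaf path has the highest summed score."""
    def score(leaf):
        return sum(dot(weights[n], x) for n in path_nodes(leaf, parent))
    return max(leaves, key=score)


def update(x, y_true, weights, parent, leaves, lr=1.0):
    """Perceptron-style step that touches only nodes where the two paths disagree."""
    y_pred = predict(x, weights, parent, leaves)
    true_path = path_nodes(y_true, parent)
    pred_path = path_nodes(y_pred, parent)
    for n in true_path - pred_path:   # promote nodes missed by the prediction
        weights[n] = [w + lr * xi for w, xi in zip(weights[n], x)]
    for n in pred_path - true_path:   # demote nodes wrongly chosen
        weights[n] = [w - lr * xi for w, xi in zip(weights[n], x)]
    return y_pred


# Hypothetical toy hierarchy (child -> parent); the root 0 has no entry.
PARENT = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
LEAVES = [3, 4, 5]

weights = {n: [0.0, 0.0] for n in PARENT}
x = [1.0, 0.5]
first_guess = update(x, 4, weights, PARENT, LEAVES)   # wrong guess triggers an update
second_guess = predict(x, weights, PARENT, LEAVES)
```

In this toy run the shared ancestor (node 1) lies on both the true and the predicted path, so its weights are left unchanged; only the two disagreeing leaves are updated.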

Slide25

Large margin hierarchical classifier

 

 

 

 Slide26

Training Algorithm

Slide27

HSVM

 Slide30

Regularization-based Approaches

K individual classification tasks

Use an additional regularization term to penalize the disagreement between the individual models
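One way to write such an objective is sketched below. The notation is assumed rather than taken from the slides: $w_k$ is the model for task $k$, $L_k$ is its loss, $\lambda_1$ and $\lambda_2$ are regularization strengths, and $\bar{w}$ is the average model. The last term is the one that penalizes disagreement.

```latex
\min_{w_1,\dots,w_K}\ \sum_{k=1}^{K} L_k(w_k)
  + \lambda_1 \sum_{k=1}^{K} \lVert w_k \rVert_2^2
  + \lambda_2 \sum_{k=1}^{K} \lVert w_k - \bar{w} \rVert_2^2,
\qquad \bar{w} = \frac{1}{K}\sum_{k=1}^{K} w_k
```

Setting $\lambda_2 = 0$ recovers $K$ independently trained models; hierarchical variants replace the single average $\bar{w}$ with a reference that depends on each task's position in the tree.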

 Slide31

Multitask Learning

Multiple tasks are learned simultaneously to capture their intrinsic relatedness.

Slide33

L1-Norm, L2-Norm

Penalize model complexity to avoid overfitting

The L1 norm gives sparser estimates than the L2 norm

Slide34
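The sparsity claim for the L1 norm on the preceding slide can be checked on a one-dimensional problem, where both penalized estimates have closed forms. This is a minimal sketch; the function names, the penalty strength `lam`, and the example coefficients are hypothetical.

```python
def l1_shrink(z, lam):
    """argmin over b of 0.5*(b - z)**2 + lam*|b|  (soft-thresholding)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def l2_shrink(z, lam):
    """argmin over b of 0.5*(b - z)**2 + 0.5*lam*b**2  (uniform shrinkage)."""
    return z / (1.0 + lam)


coefs = [3.0, -0.4, 0.2, -2.0]   # hypothetical unpenalized estimates
lam = 0.5
l1 = [l1_shrink(z, lam) for z in coefs]
l2 = [l2_shrink(z, lam) for z in coefs]
print("L1:", l1)
print("L2:", l2)
```

The L1 penalty sets every coefficient with magnitude at most `lam` exactly to zero, while the L2 penalty only scales coefficients toward zero without eliminating any nonzero one.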

Group Lasso and Sparse Group Lasso

Slide35
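For reference, the lasso, group lasso, and sparse group lasso penalties named on the preceding slide are commonly written as follows. The notation is assumed here: $\beta$ is the coefficient vector, $\mathcal{G}$ is a partition of the coefficients into groups, $\beta_g$ is the sub-vector for group $g$, and $p_g$ is its size.

```latex
\begin{aligned}
\text{Lasso:}\quad & \lambda\,\lVert \beta \rVert_1 \\
\text{Group lasso:}\quad & \lambda \sum_{g \in \mathcal{G}} \sqrt{p_g}\,\lVert \beta_g \rVert_2 \\
\text{Sparse group lasso:}\quad & \lambda_1 \sum_{g \in \mathcal{G}} \sqrt{p_g}\,\lVert \beta_g \rVert_2 + \lambda_2\,\lVert \beta \rVert_1
\end{aligned}
```

The unsquared $\ell_2$ norm zeroes out whole groups at once; adding the $\ell_1$ term also allows individual zeros inside the groups that remain.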

HMTL: Hierarchical Multitask Learning

A trade-off parameter determines the contribution of regularization from the origin vs. the parent node's parameters (i.e., the strength of coupling between the node and its parent)
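One plausible form of such a regularizer is sketched below. It is an assumption for illustration, not necessarily the exact term used in HMTL: a parameter B in [0, 1], called `coupling` here, interpolates between shrinking a node's weights toward the origin and shrinking them toward its parent's weights.

```python
def hmtl_regularizer(w_node, w_parent, coupling):
    """Assumed form: (1 - B)/2 * ||w||^2 + B/2 * ||w - w_parent||^2, with B = coupling."""
    if not 0.0 <= coupling <= 1.0:
        raise ValueError("coupling must lie in [0, 1]")
    to_origin = sum(w * w for w in w_node)
    to_parent = sum((w - p) ** 2 for w, p in zip(w_node, w_parent))
    return 0.5 * (1.0 - coupling) * to_origin + 0.5 * coupling * to_parent


w_parent = [1.0, 2.0]   # hypothetical parent-node parameters
w_child = [1.0, 1.0]    # hypothetical child-node parameters

independent = hmtl_regularizer(w_child, w_parent, coupling=0.0)  # origin only
tied = hmtl_regularizer(w_child, w_parent, coupling=1.0)         # parent only
mixed = hmtl_regularizer(w_child, w_parent, coupling=0.5)
```

With `coupling=0.0` the node is regularized like an independent model; with `coupling=1.0` it is penalized only for deviating from its parent.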

 Slide36

HMTL

Slide37

Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity

Original Approach:

New Approach:

Note:

 Slide38

Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity

Each leaf node is a class

Each inner node is a group of classes
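The grouping above can be sketched as a penalty computation. This is a minimal illustration under stated assumptions: every tree node v defines a group G_v containing the classes (leaves) in its subtree, and the penalty sums a weighted L2 norm of each input feature's coefficients over each group. The toy tree `CHILDREN`, the unit default weights, and the function names are hypothetical; the original method derives its node weights from the tree, which is not reproduced here.

```python
import math

# Hypothetical tree: inner nodes map to their children; leaves are class indices 0..2.
CHILDREN = {"root": ["a", 2], "a": [0, 1]}


def leaves_under(node, children):
    """Class indices (leaves) in the subtree rooted at `node`."""
    if node not in children:
        return [node]
    out = []
    for child in children[node]:
        out.extend(leaves_under(child, children))
    return out


def tree_group_penalty(beta, children, node_weight):
    """Sum over features j and tree nodes v of weight_v * ||beta[j][G_v]||_2."""
    nodes = set(children) | {c for cs in children.values() for c in cs}
    total = 0.0
    for row in beta:                       # one row of coefficients per input feature
        for v in nodes:
            group = leaves_under(v, children)
            norm = math.sqrt(sum(row[k] ** 2 for k in group))
            total += node_weight.get(v, 1.0) * norm   # unit weight unless specified
    return total


beta = [[3.0, 4.0, 0.0],   # feature used by classes 0 and 1 (siblings under "a")
        [0.0, 0.0, 0.0]]   # feature used by no class
penalty = tree_group_penalty(beta, CHILDREN, node_weight={})
```

Because the groups overlap along the tree, a feature that is dropped for an inner node's group is dropped for all classes below that node at once.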
Slide40

Tree-Guided Group Lasso

Slide44

Advantages and Drawbacks

Assume the children are good: Tree-Guided Group Lasso

Assume the parent is good: HMTL

Assume neither is good: Path-based

It depends!