Slide1
Hierarchical Classification
Rongcheng Lin
Computer Science Department
Slide2
Contents
Motivation, Definition & Problem
Review of SVM
Hierarchical Classification
Path-based Approaches
Regularization-based Approaches
Slide3
Motivation
Real-world classes are structured, and are often hierarchically related.
Gene function prediction
Document categorization
Image search
…
Hierarchies or taxonomies offer clear advantages in supporting tasks like browsing, searching, or visualization.
International Patent Classification scheme
Yahoo! Web catalogs
…
Prior knowledge about class relationships can improve classification performance, especially for tasks with a large number of classes.
Slide4
Definition and Problem
Automatically categorize data into pre-defined topic hierarchies or taxonomies
Supervised Learning
Structured Output
Slide7
DAG and Tree Structure
Slide8
Definition and Problem
Problem and solution?
Slide9
Definition and Problem
Incorporate the inter-class relationships (the hierarchy) into classification
Redefine the problem
Lower-level categories are more detailed, while upper-level categories are more general
Redefine the margin
Different classification mistakes have different severity
Redefine the loss function
Slide10
Review: Binary SVM
Binary classification
Margin
Loss Function
$w^T x + b = 0$
$w^T x + b < 0$
$w^T x + b > 0$
$f(x) = \mathrm{sign}(w^T x + b)$
Slide13
Review: Binary SVM
General Form:
Slide14
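For the binary SVM general form, a sketch of the standard soft-margin formulation is given here, on the assumption that this is the form meant. It reuses $w$ and $b$ from the binary SVM review; the slack variables $\xi_i$, the trade-off constant $C$, the sample count $n$, and the labels $y_i \in \{-1, +1\}$ are notation introduced for this sketch.

```latex
\min_{w,\,b,\,\xi} \;\; \frac{1}{2}\,\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i\,(w^T x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n
```

At the optimum each slack equals $\xi_i = \max(0,\, 1 - y_i (w^T x_i + b))$, which is the hinge loss of sample $i$.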
Review: Multiclass SVM
1) One-vs-the-rest
2) Crammer & Singer (pairwise)
Slide15
Review: Multiclass SVM
Dedicated Loss Function
Slide16
Review: Hinge Loss Function
The more you violate the margin, the higher the penalty.
Slide18
Loss Function
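The hinge loss described on the previous slide can be sketched in a few lines. This is a minimal illustration, assuming labels in {-1, +1} and a real-valued score such as $w^T x + b$; the function name `hinge_loss` is hypothetical.

```python
def hinge_loss(score, label):
    """Hinge loss max(0, 1 - label * score) for a label in {-1, +1}."""
    return max(0.0, 1.0 - label * score)


# Correct side, outside the margin: no penalty.
print(hinge_loss(2.0, +1))   # 0.0
# Correct side, but inside the margin: small penalty.
print(hinge_loss(0.5, +1))   # 0.5
# Wrong side: the penalty grows linearly with the violation.
print(hinge_loss(-1.0, +1))  # 2.0
```

The three calls show the behaviour stated above: the further a sample falls on the wrong side of the margin, the larger the loss.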
Slide19
Hierarchical Classifiers
Path-based Approaches
Large Margin Hierarchical Classification
Hierarchical Document Categorization with Support Vector Machines
On Large Margin Hierarchical Classification with Multiple Paths
Regularization-based Approaches
Tree-Guided Group Lasso for Multi-task Regression
Hierarchical Multitask Structured Output Learning for Large-Scale Segmentation
Slide20
Tree Distance
A given hierarchy induces a metric over the set of classes: the tree distance, or tree-induced error.
The tree distance between $y$ and $\hat{y}$ is defined to be the number of edges along the (unique) path from $y$ to $\hat{y}$.
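The definition above can be computed directly from a child-to-parent map. The sketch below is illustrative only: the toy taxonomy and the names `path_to_root` and `tree_distance` are assumptions, not part of the original slides.

```python
def path_to_root(node, parent):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(u, v, parent):
    """Number of edges on the unique path between u and v in a tree."""
    # Distance from u to each of its ancestors (u itself has distance 0).
    steps_from_u = {n: d for d, n in enumerate(path_to_root(u, parent))}
    # Walk up from v until the first node that is also an ancestor of u.
    for steps_from_v, n in enumerate(path_to_root(v, parent)):
        if n in steps_from_u:
            return steps_from_u[n] + steps_from_v
    raise ValueError("nodes are not in the same tree")


# Hypothetical toy taxonomy, stored as child -> parent.
parent = {
    "science": "root", "sports": "root",
    "physics": "science", "biology": "science", "tennis": "sports",
}

print(tree_distance("physics", "biology", parent))  # 2
print(tree_distance("physics", "tennis", parent))   # 4
```

Confusing two siblings costs 2, while confusing classes from different top-level branches costs 4, which is how the metric encodes the severity of a mistake.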
Slide21
Tree Distance
[Figure: example class tree with numbered nodes, one of them marked y, illustrating the tree distance]
Slide23
Loss Functions
Zero-One Loss
Hinge Loss
Hierarchical Hinge Loss
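The three losses named above can be compared on a single true-versus-competing class pair. The slide does not state the exact form of the hierarchical hinge loss, so the sketch below assumes one common variant, margin rescaling, in which the required margin equals the tree distance between the two classes. All function names are hypothetical, and `margin` denotes the score of the true class minus the score of the competing class.

```python
def zero_one_loss(y_true, y_pred):
    """1 for any mistake, 0 otherwise; ignores how severe the mistake is."""
    return 0.0 if y_true == y_pred else 1.0


def hinge(margin):
    """Standard hinge: the required margin is always 1."""
    return max(0.0, 1.0 - margin)


def hierarchical_hinge(margin, tree_dist):
    """Assumed variant: the required margin is the tree distance."""
    return max(0.0, tree_dist - margin)


margin = 0.5  # true class scores 0.5 higher than the competing class
print(hinge(margin))                  # 0.5
print(hierarchical_hinge(margin, 2))  # 1.5 (competing class is a sibling)
print(hierarchical_hinge(margin, 4))  # 3.5 (competing class is far away)
```

Under this assumption the same score gap is penalized more heavily when the competing class lies far from the true class in the hierarchy, whereas the plain hinge loss treats both cases alike.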
Slide24
Path-based Approaches
Path-based approaches try to find the most likely path from the root.
Only the parameters of misclassified nodes in the tree need to be updated.
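One simple way to search for a path from the root is a greedy top-down walk that follows the best-scoring child at each level. The sketch below assumes this greedy strategy and uses made-up per-node scores; in practice the scores would come from the per-node classifiers, and a greedy walk is not guaranteed to find the globally best path. The name `predict_path` is hypothetical.

```python
def predict_path(root, children, score):
    """Follow the highest-scoring child from the root down to a leaf."""
    path = [root]
    node = root
    while children.get(node):  # stop when the node has no children
        node = max(children[node], key=lambda child: score[child])
        path.append(node)
    return path


# Hypothetical taxonomy (parent -> children) and per-node scores.
children = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["tennis"],
}
score = {"science": 0.8, "sports": 0.3,
         "physics": 0.1, "biology": 0.6, "tennis": 0.9}

print(predict_path("root", children, score))  # ['root', 'science', 'biology']
```

Note that "tennis" has the highest individual score but is never reached, because the walk commits to "science" at the first level.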
Slide25
Large margin hierarchical classifier
Slide26
Training Algorithm
Slide27
HSVM
Slide28
Regularization-based Approaches
K individual classification tasks
Use an additional regularization term to penalize the disagreement between the individual models
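As an illustration of such a term, the sketch below penalizes the squared distance of each task's weight vector from the mean over all K tasks. This is only one simple form a disagreement penalty could take, chosen here as an assumption; it is not the specific regularizer of any method named in these slides, and `disagreement_penalty` and `lam` are hypothetical names.

```python
def disagreement_penalty(weights, lam):
    """lam * sum_k ||w_k - w_mean||^2 over K task weight vectors."""
    k = len(weights)
    dim = len(weights[0])
    # Coordinate-wise mean of the K weight vectors.
    mean = [sum(w[j] for w in weights) / k for j in range(dim)]
    return lam * sum((w[j] - mean[j]) ** 2
                     for w in weights for j in range(dim))


# Identical models: no disagreement, no penalty.
print(disagreement_penalty([[1.0, 0.0], [1.0, 0.0]], lam=0.5))  # 0.0
# Models that differ are pulled toward each other.
print(disagreement_penalty([[1.0, 0.0], [3.0, 0.0]], lam=0.5))  # 1.0
```

The penalty would be added to the sum of the K individual task losses, so that `lam` controls how strongly the models are forced to agree.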
Slide31
Multitask Learning
Multiple tasks are learned simultaneously to capture their intrinsic relatedness
Slide32
Slide33
L1-Norm, L2-Norm
Penalize model complexity to avoid overfitting
The L1 norm gives sparser estimates than the L2 norm
Slide34
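The sparsity difference between the L1 and L2 norms can be seen on a one-dimensional problem with a closed-form solution. The sketch below assumes the objectives written in the docstrings; `l1_shrink` and `l2_shrink` are hypothetical names for the resulting minimizers.

```python
def l1_shrink(z, lam):
    """Minimizer of 0.5*(w - z)**2 + lam*abs(w): soft-thresholding."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small inputs are set exactly to zero


def l2_shrink(z, lam):
    """Minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2: uniform scaling."""
    return z / (1.0 + lam)


z = [3.0, 0.4, -0.2, -2.0]
l1 = [l1_shrink(v, 1.0) for v in z]
l2 = [l2_shrink(v, 1.0) for v in z]

print(l1)  # [2.0, 0.0, 0.0, -1.0]
print(l2)  # [1.5, 0.2, -0.1, -1.0]
```

The L1 penalty zeroes out the two small coefficients, while the L2 penalty only scales every coefficient down and leaves none exactly at zero.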
Group Lasso and Sparse Group Lasso
Slide35
HMTL: Hierarchical Multitask Learning
A trade-off parameter determines the contribution of regularization toward the origin versus toward the parent node's parameters (i.e., the strength of coupling between the node and its parent)
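The trade-off between the origin and the parent can be illustrated with a convex combination of two squared distances. The exact HMTL regularizer is not given on the slide, so the form below, the function name `hmtl_regularizer`, and the parameter name `gamma` are assumptions made for this sketch.

```python
def hmtl_regularizer(w, w_parent, gamma):
    """(1 - gamma) * ||w||^2 + gamma * ||w - w_parent||^2 for one node."""
    to_origin = sum(x * x for x in w)
    to_parent = sum((x - p) ** 2 for x, p in zip(w, w_parent))
    return (1.0 - gamma) * to_origin + gamma * to_parent


w = [1.0, 1.0]
w_parent = [1.0, 1.0]

# gamma = 0: ordinary shrinkage toward the origin, the parent is ignored.
print(hmtl_regularizer(w, w_parent, 0.0))  # 2.0
# gamma = 1: full coupling; matching the parent costs nothing.
print(hmtl_regularizer(w, w_parent, 1.0))  # 0.0
# gamma = 0.5: an even mix of the two.
print(hmtl_regularizer(w, w_parent, 0.5))  # 1.0
```

Larger values of `gamma` tie a node more tightly to its parent, while smaller values let it behave like an independently regularized model.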
Slide36
HMTL
Slide37
Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity
Original Approach:
New Approach:
Note:
Slide38
Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity
Each leaf node is a class
Each inner node is a group of classes
Slide39
Slide40
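With leaves as classes and inner nodes as groups of classes, a tree-guided group lasso penalty sums one L2 norm per tree node and per feature. The sketch below assumes that structure; the node weights are set to 1.0 purely for illustration (how they are chosen is not shown here), and the names `tree_group_lasso_penalty`, `groups`, and `weights` are hypothetical.

```python
import math


def tree_group_lasso_penalty(beta, groups, weights):
    """Sum over tree nodes v and features j of weights[v] * ||beta[j][G_v]||_2.

    beta[j][k] is the coefficient of feature j for class k;
    groups[v] lists the class indices (leaves) below tree node v.
    """
    total = 0.0
    for v, members in groups.items():
        for row in beta:
            total += weights[v] * math.sqrt(sum(row[k] ** 2 for k in members))
    return total


# Hypothetical tree over 3 classes: inner node "A" groups classes 0 and 1.
groups = {
    "leaf0": [0], "leaf1": [1], "leaf2": [2],
    "A": [0, 1],
    "root": [0, 1, 2],
}
weights = {v: 1.0 for v in groups}  # illustrative choice only

beta = [[3.0, 4.0, 0.0]]  # one feature, three classes
print(tree_group_lasso_penalty(beta, groups, weights))  # 17.0
```

Here the leaves contribute 3 + 4 + 0, and the groups "A" and "root" contribute 5 each. Because classes under the same inner node share a norm, a feature tends to be selected or dropped jointly for related classes.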
Tree-Guided Group Lasso
Slide41
Slide42
Slide43
Slide44
Advantages and Drawbacks
Assume the children are good: Tree-Guided Group Lasso
Assume the parent is good: HMTL
Assume neither is good: Path-based
It depends!