CNN-RNN: A Unified Framework for Multi-label Image Classification

Uploaded On 2018-09-19


Presentation Transcript

Slide1

CNN-RNN: A Unified Framework for Multi-label Image Classification

Xueying Bai, Jiankun Xu

Slide2

Multi-label Image Classification

Co-occurrence dependency

Higher-order correlation: one label can be predicted using the previous label

Semantic redundancy: labels have overlapping meanings (cat and kitten)

Slide3

Previous Models

Multiple single-label classification

Fails to model the dependencies between labels

Graphical model

Large number of parameters

Cannot model higher-order correlations

Slide4

CNN-RNN Model

Learns the semantic redundancy and the co-occurrence dependencies

Has an end-to-end training process

Predicts more objects that need context (higher-order correlation)

Slide5

CNN-RNN Framework

Slide6

Joint Embedding Model

Label embedding: a vector in a low-dimensional Euclidean space in which the embeddings of semantically similar labels are close to each other

Image embedding: a vector in the same space that lies close to the embeddings of the image's associated labels

Exploits semantic redundancy: semantically similar labels share classification parameters

Slide7
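The joint embedding idea from the preceding slide can be illustrated with a minimal sketch. The three-dimensional vectors, the label set, and the use of cosine similarity as the closeness measure are all assumptions made for illustration; the slides only state that similar labels and matching images are close in a shared space.

```python
import math

# Hypothetical 3-d label embeddings (assumed values; a real model learns them).
LABEL_EMBEDDINGS = {
    "cat":    [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car":    [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def rank_labels(image_embedding, label_embeddings):
    """Return labels ordered from most to least similar to the image embedding."""
    scores = {name: cosine(image_embedding, vec) for name, vec in label_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical image embedding that lies near the "cat"/"kitten" region.
image_embedding = [0.85, 0.15, 0.05]
ranking = rank_labels(image_embedding, LABEL_EMBEDDINGS)
print(ranking)
```

Because "cat" and "kitten" sit close together, an image embedded near one is automatically near the other, which is one way to read the "shared classification parameters" point.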

Model Diagram

Output of CNN: Image embedding

Output of RNN, o(t): a new embedding that includes information from the previous label (to model higher-order correlations)

Slide8
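The interplay of the image embedding and the recurrent output o(t) described on the Model Diagram slide might be sketched as follows. This is a simplification under stated assumptions: a plain tanh recurrence stands in for the LSTM, the two embeddings are combined by simple addition, and all weights and vectors are made-up toy values. None of these details come from the slides.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def recurrent_step(state, prev_label_embedding, w_state, w_input):
    """One tanh recurrence step; a stand-in for the LSTM update, not the LSTM itself."""
    mixed = [a + b for a, b in zip(matvec(w_state, state),
                                   matvec(w_input, prev_label_embedding))]
    return [math.tanh(v) for v in mixed]

def score_labels(rnn_output, image_embedding, label_embeddings):
    """Add o(t) to the image embedding (assumed combination), then score by dot product."""
    combined = [o + i for o, i in zip(rnn_output, image_embedding)]
    return {name: sum(c * e for c, e in zip(combined, emb))
            for name, emb in label_embeddings.items()}

# Toy 2-d setup (all values hypothetical).
label_embeddings = {"sky": [1.0, 0.0], "cloud": [0.0, 1.0]}
w_state = [[0.5, 0.0], [0.0, 0.5]]
w_input = [[0.0, 1.0], [1.0, 0.0]]   # previous "sky" pushes the output towards "cloud"
image_embedding = [0.2, 0.2]

o_t = recurrent_step([0.0, 0.0], label_embeddings["sky"], w_state, w_input)
scores_after_sky = score_labels(o_t, image_embedding, label_embeddings)
print(scores_after_sky)
```

In this toy setup the image alone does not separate the two labels; the score for "cloud" rises only because "sky" was predicted first, which is the label-dependency effect the slide describes.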

LSTM

Slide9

Recurrent Neural Network

Slide10

Inference

Prediction Path

Beam Search

Find the top N labels at each time step as candidates

Find the top N prediction paths for the next time step (t+1)

Slide11

Beam Search

When a path reaches the 'End' label: add it to the candidate path set

Termination condition: the probability of every current intermediate path is smaller than that of all candidate paths.

Slide12
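The beam search inference from the two preceding slides can be sketched roughly as below. The function `next_label_probs`, the `END` marker string, the toy model, and the rule that a label is never repeated within a path are assumptions for illustration; in the real system the probabilities would come from the CNN-RNN model.

```python
import heapq
import math

END = "END"  # hypothetical name for the end-of-sequence label

def beam_search(next_label_probs, beam_size=2, max_steps=10):
    """Return completed label paths as (log-probability, labels), best first."""
    beams = [(0.0, ())]   # intermediate paths
    candidates = []       # paths that have reached END
    for _ in range(max_steps):
        expansions = []
        for logp, path in beams:
            for label, p in next_label_probs(path).items():
                if p <= 0.0 or label in path:   # assumption: no repeated labels
                    continue
                expansions.append((logp + math.log(p), path + (label,)))
        # Keep the top-N prediction paths for time step t+1.
        top = heapq.nlargest(beam_size, expansions, key=lambda e: e[0])
        beams = []
        for logp, path in top:
            if path[-1] == END:
                candidates.append((logp, path[:-1]))
            else:
                beams.append((logp, path))
        if not beams:
            break
        # Terminate once every intermediate path is less probable than all candidates.
        if candidates and max(b[0] for b in beams) < min(c[0] for c in candidates):
            break
    return sorted(candidates, key=lambda c: c[0], reverse=True)

def toy_model(path):
    """Hypothetical next-label distribution given the labels predicted so far."""
    if path == ():
        return {"sky": 0.6, "cloud": 0.3, END: 0.1}
    if path == ("sky",):
        return {"cloud": 0.7, END: 0.3}
    return {END: 1.0}

results = beam_search(toy_model, beam_size=2)
best_labels = results[0][1]
print(best_labels)
```

Working in log-probabilities avoids underflow on long paths; the comparison in the termination test is unaffected because the logarithm is monotonic.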

Experiments

CNN module uses the 16-layer VGG network

Dimension of label embedding is 64

Dimension of LSTM RNN layer is 512

Tested on datasets:

NUS-WIDE, MS COCO and PASCAL VOC 2007

Slide13

Evaluation

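Metrics

Precision: correctly annotated labels / generated labels

Recall: correctly annotated labels / ground-truth labels

C-P, O-P: per-class and overall precision; C-R, O-R: per-class and overall recall

C-F1, O-F1: per-class and overall F1 scores combining precision and recall (given on the slide as their geometrical average; the standard F1 is the harmonic mean)

MAP: mean average precision

Slide14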
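The precision, recall, and F1 metrics listed on the Evaluation slide could be computed along these lines. The function names and toy data are hypothetical. Two choices here are assumptions rather than statements from the slides: F1 is taken as the harmonic mean (the usual definition), and C-F1 is derived from the averaged C-P and C-R.

```python
def precision_recall_f1(correct, predicted, truth):
    """Precision, recall and harmonic-mean F1 from raw counts."""
    precision = correct / predicted if predicted else 0.0
    recall = correct / truth if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def evaluate(predictions, ground_truth, labels):
    """predictions and ground_truth are lists of label sets, one set per image."""
    class_p, class_r = [], []
    sum_correct = sum_predicted = sum_truth = 0
    for label in labels:
        correct = sum(1 for p, g in zip(predictions, ground_truth)
                      if label in p and label in g)
        predicted = sum(1 for p in predictions if label in p)
        truth = sum(1 for g in ground_truth if label in g)
        p, r, _ = precision_recall_f1(correct, predicted, truth)
        class_p.append(p)
        class_r.append(r)
        sum_correct += correct
        sum_predicted += predicted
        sum_truth += truth
    c_p = sum(class_p) / len(labels)          # per-class: average over labels
    c_r = sum(class_r) / len(labels)
    c_f1 = 2 * c_p * c_r / (c_p + c_r) if c_p + c_r else 0.0
    o_p, o_r, o_f1 = precision_recall_f1(sum_correct, sum_predicted, sum_truth)  # overall: pooled counts
    return {"C-P": c_p, "C-R": c_r, "C-F1": c_f1, "O-P": o_p, "O-R": o_r, "O-F1": o_f1}

# Toy data: four images, two labels.
predictions = [{"cat", "dog"}, {"cat"}, {"dog"}, {"dog"}]
ground_truth = [{"cat"}, {"cat", "dog"}, {"dog"}, {"cat"}]
metrics = evaluate(predictions, ground_truth, ["cat", "dog"])
print(metrics)
```

Per-class scores weight every label equally, so rare labels matter as much as frequent ones, while overall scores are dominated by frequent labels.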

NUS-WIDE

A web image dataset containing 269,648 images and 5,018 tags.

Tested on the versions of the dataset with 1000 tags and with 81 tags.

Slide15
Slide16
Slide17

MS COCO

Contains about 123 thousand images covering 80 object types.

Training data has 82,783 images and testing data has 40,504 images.

Most images contain multiple objects.

Slide18
Slide19
Slide20

PASCAL VOC 2007

Training data has 5,011 images and testing data has 4,952 images.

Uses AP and mAP for evaluation.

Slide21
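For the AP and mAP evaluation mentioned for PASCAL VOC 2007, a minimal sketch follows. It computes the plain, non-interpolated average precision; the official VOC 2007 protocol uses an 11-point interpolated variant, so this is a simplified stand-in, and the scores and relevance flags are made-up toy values.

```python
def average_precision(scores, relevant):
    """Non-interpolated AP for one label.

    scores: model confidence per image; relevant: True where the label is present.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, index in enumerate(order, start=1):
        if relevant[index]:
            hits += 1
            precision_sum += hits / rank   # precision at this recall point
    return precision_sum / hits if hits else 0.0

def mean_average_precision(scores_by_label, relevant_by_label):
    """Mean of the per-label AP values."""
    aps = [average_precision(scores_by_label[name], relevant_by_label[name])
           for name in scores_by_label]
    return sum(aps) / len(aps)

# Toy example: two labels scored on four images (hypothetical values).
scores_by_label = {
    "cat": [0.9, 0.8, 0.3, 0.1],
    "dog": [0.2, 0.7, 0.6, 0.1],
}
relevant_by_label = {
    "cat": [True, False, True, False],
    "dog": [False, True, True, False],
}
ap_cat = average_precision(scores_by_label["cat"], relevant_by_label["cat"])
map_score = mean_average_precision(scores_by_label, relevant_by_label)
print(ap_cat, map_score)
```

Unlike precision and recall at a fixed number of predicted labels, AP depends only on how the images are ranked for each label, so it needs no decision threshold.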

Label embedding

The model effectively learns a joint label embedding

Slide22
Slide23

Attention Visualization

Slide24

Conclusion and Future Work

Combines the advantages of the joint image/label embedding and label co-occurrence models by employing CNN and RNN

Experimental results on several datasets show good performance

Predicting small objects is still a challenge.

Slide25

Reference: Jiang Wang, Yi Yang, Junhua Mao, Zhiheng Huang, Chang Huang, Wei Xu. CNN-RNN: A Unified Framework for Multi-label Image Classification.

Questions?

Slide26

Thank you all