Uploaded On 2016-05-13



Presentation Transcript

Slide1

A Framework of Extracting Multi-scale Features Using Multiple Convolutional Neural Networks

Kuan-Chuan Peng, Tsuhan Chen

Slide2

Introduction

Breakthrough progress in object classification.

O. Russakovsky et al. ImageNet large scale visual recognition challenge. arXiv:1409.0575, 2014.

N. Murray et al. AVA: A Large-Scale Database for Aesthetic Visual Analysis. CVPR12.

[Figure: example images labeled cat, dog, lion, and tiger.]

Slide3

Introduction

Humans are interested in more than objects.

For example, aesthetic quality.

N. Murray et al. AVA: A Large-Scale Database for Aesthetic Visual Analysis. CVPR12.

Slide4

How do machines describe images?

Examples by a state-of-the-art algorithm:

A. Karpathy and F.-F. Li. Deep visual-semantic alignments for generating image descriptions. CVPR15.

http://cs.stanford.edu/people/karpathy/deepimagesent/

“man in black shirt is playing guitar.”

“woman is holding bunch of bananas.”

Slide8

How do experts describe images?

Examples by Pulitzer Prize winners:

http://www.pulitzer.org/archives/8417

http://www.pulitzer.org/archives/6451

“At bath times, Danielle appears serene. But no one knows what lies beyond those eyes.” (by Lane DeGregory)

“The surgery has dragged on for hours with little progress, and Mulliken, taking a breather next to an array of Sam's CAT scans, is feeling the frustration and exhaustion.” (by Tom Hallman Jr.)

Slide9

How do experts describe images?

Images convey more than objects.

http://www.pulitzer.org/archives/8417

http://www.pulitzer.org/archives/6451

“At bath times, Danielle appears serene. But no one knows what lies beyond those eyes.” (by Lane DeGregory)

“The surgery has dragged on for hours with little progress, and Mulliken, taking a breather next to an array of Sam's CAT scans, is feeling the frustration and exhaustion.” (by Tom Hallman Jr.)

Slide10

Beyond Objects

Abstract attributes matter.

Attributes relating to or involving general ideas or qualities rather than specific people, objects, or actions. [Merriam-Webster dictionary]

Bridge the gap between machines and humans:

Teach machines to solve abstract tasks (tasks involving abstract attributes).

http://www.merriam-webster.com/dictionary/abstract

Slide11

Goal

A general framework to achieve better performance in abstract tasks.

Multi-scale features by using convolutional neural networks (CNN).

Slide12

Why CNN?

Object classification: O. Russakovsky et al. ImageNet large scale visual recognition challenge. arXiv:1409.0575, 2014.

Speech recognition: L. Deng et al. A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion. ICASSP13.

Video classification: A. Karpathy et al. Large-scale video classification with convolutional neural networks. CVPR14.

Slide13

Existing Abstract Tasks

More and more abstract tasks are proposed.

Slide14

Artistic Style & Artist Style Classification

[F. S. Khan et al. MVA14.]

Architectural Style Classification

[Z. Xu et al. ECCV14.]

Slide15

Emotion Classification [J. Machajdik et al. ACMMM10.]

amusement, anger, awe, contentment, disgust, excitement, fear, sad

Aesthetic Classification [N. Murray et al. CVPR12.]

high aesthetic quality, low aesthetic quality

Slide16

Fashion Style Classification [M. H. Kiapour et al. ECCV14.]

Bohemian, Hipster

Memorability Prediction [P. Isola et al. CVPR11.]

Interestingness Prediction [M. Gygli et al. ICCV13.]

Slide17

Inspiration

It is tricky to describe abstract attributes as objects.

Not easy to “locate” abstract attributes.

What if abstract attributes prevail everywhere?

Label-inheritable (LI) property.

[Figure: an image labeled “contentment”, with a “?” annotation. J. Machajdik et al. ACMMM10.]
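One reading of the LI property, assumed here because the slides do not spell out a definition, is that any crop of an image carries the same label as the whole image. Under that assumption, crops can be used as extra labeled samples. The sketch below illustrates the idea only; the helper `label_inheriting_crops`, the toy 4x4 image, and the crop and stride values are hypothetical and not taken from the presentation.

```python
from typing import List, Tuple

Image = List[List[int]]  # toy grayscale image stored as rows of pixel values


def label_inheriting_crops(
    image: Image, label: str, crop: int, stride: int
) -> List[Tuple[Image, str]]:
    """Cut square crops from `image` and give each crop the parent's label."""
    height, width = len(image), len(image[0])
    samples = []
    for top in range(0, height - crop + 1, stride):
        for left in range(0, width - crop + 1, stride):
            patch = [row[left:left + crop] for row in image[top:top + crop]]
            # Assumed LI property: the crop inherits the label of the full image.
            samples.append((patch, label))
    return samples


# Toy 4x4 image with pixel values 0..15 (illustrative data only).
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
samples = label_inheriting_crops(toy_image, "contentment", crop=2, stride=2)
```

For a dataset where the property holds only partially or mostly not at all, some crops would receive a wrong label, so this kind of augmentation would be expected to help less.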

Slide18

Label-Inheritable (LI) Property

Dataset | Painting-91 [1] | arcDataset [2] | Caltech-101 [3]
Task | Artist style classification | Architectural style classification | Object classification
Label | Picasso | Baroque Architecture | Faces
Label-inheritable | Yes | Partial | Mostly No

[1] F. S. Khan et al. Painting-91: a large scale database for computational painting categorization. Machine Vision & Applications 14.

[2] Z. Xu et al. Architectural style classification using multinomial latent logistic regression. ECCV14.

[3] F.-F. Li et al. Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. CVPRW04.

Slide21

Multi-Scale CNN

Assume the LI property holds for each image and the associated label.
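The transcript does not preserve the architecture figure, so the following is only a rough sketch of what extracting multi-scale features with multiple networks could look like. It assumes that scale n splits the image into an n x n grid of crops, that each scale has its own feature extractor, and that per-scale features are averaged over crops and then concatenated. All of these choices, and the names `grid_crops`, `toy_extractor`, and `multi_scale_features`, are assumptions for illustration. `toy_extractor` is a stand-in for a trained CNN, not a real network.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]
Extractor = Callable[[Image], List[float]]


def grid_crops(image: Image, n: int) -> List[Image]:
    """Split `image` into an n x n grid of equal crops (n = 1 is the whole image)."""
    h, w = len(image) // n, len(image[0]) // n
    return [
        [row[c * w:(c + 1) * w] for row in image[r * h:(r + 1) * h]]
        for r in range(n)
        for c in range(n)
    ]


def toy_extractor(image: Image) -> List[float]:
    """Hypothetical stand-in for a CNN: returns [mean, max] of the pixels."""
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def multi_scale_features(image: Image, extractors: Sequence[Extractor]) -> List[float]:
    """Use extractors[i] on the (i+1) x (i+1) grid, average over crops, concatenate."""
    features: List[float] = []
    for scale, extract in enumerate(extractors, start=1):
        per_crop = [extract(crop) for crop in grid_crops(image, scale)]
        averaged = [sum(vals) / len(per_crop) for vals in zip(*per_crop)]
        features.extend(averaged)
    return features


# Toy 4x4 image with pixel values 0.0..15.0 and two scales (illustrative only).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
features = multi_scale_features(image, [toy_extractor, toy_extractor])
```

In a real setting each entry of `extractors` would be a separately trained network, and the concatenated vector would feed a classifier.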

A. Krizhevsky et al. ImageNet classification with deep convolutional neural networks. NIPS12.

Slide22

AlexNet

The number of nodes in the output layer is changed to the number of classes in each task.
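As a small illustration of that change, the sketch below describes the fully connected layers of AlexNet by their widths and swaps the last one. The layer names `fc6`, `fc7`, and `fc8` follow a common naming convention and are an assumption here, as is the helper `adapt_output_layer`. The class counts 10 and 25 are the two architectural style settings listed on the results slide.

```python
from typing import Dict

# Widths of AlexNet's fully connected layers when trained on ImageNet (1000 classes).
ALEXNET_FC: Dict[str, int] = {"fc6": 4096, "fc7": 4096, "fc8": 1000}


def adapt_output_layer(fc_sizes: Dict[str, int], num_classes: int) -> Dict[str, int]:
    """Return a copy of the layer widths with the output layer resized to `num_classes`."""
    if num_classes < 2:
        raise ValueError("a classification task needs at least two classes")
    adapted = dict(fc_sizes)       # leave the original description untouched
    adapted["fc8"] = num_classes   # only the output layer changes
    return adapted


# Architectural style classification with 10 and with 25 classes.
arch_10 = adapt_output_layer(ALEXNET_FC, 10)
arch_25 = adapt_output_layer(ALEXNET_FC, 25)
```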

A. Krizhevsky et al. ImageNet classification with deep convolutional neural networks. NIPS12.

Slide23

Experimental Results

Classification accuracy (%)

Method \ Task | Artist style classification | Artistic style classification | Caltech-101 object classification (15 / 30 training examples per class) | Architectural style classification (10 / 25 classes)
Previous work (baseline) | 53.10 [1] | 62.20 [1] | 83.80 / 86.50 [2] | 69.17 / 46.21 [3]
Single-scale CNN (baseline) | 55.15 | 67.37 | 83.45 / 88.19 | 70.64 / 54.84
2-scale CNN (ours) | 58.11 | 69.67 | 80.19 / 87.58 | 74.82 / 58.89
3-scale CNN (ours) | 57.91 | 70.96 | N/A | 75.32 / 59.13
Label-inheritable | Yes | Yes | Mostly No | Partial

[1] F. S. Khan et al. Painting-91: a large scale database for computational painting categorization. Machine Vision & Applications 14.

[2] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV14.

[3] Z. Xu et al. Architectural style classification using multinomial latent logistic regression. ECCV14.

Slide24

Is it because of more training data?

What if we train one CNN with images in different scales?

A. Krizhevsky et al. ImageNet classification with deep convolutional neural networks. NIPS12.

Slide25

Additional Results

Classification accuracy (%)

Method \ Task | Artist style classification | Artistic style classification | Caltech-101 object classification (15 / 30 training examples per class) | Architectural style classification (10 / 25 classes)
Previous work (baseline) | 53.10 [1] | 62.20 [1] | 83.80 / 86.50 [2] | 69.17 / 46.21 [3]
Single-scale CNN (baseline) | 55.15 | 67.37 | 83.45 / 88.19 | 70.64 / 54.84
2-scale CNN (ours) | 58.11 | 69.67 | 80.19 / 87.58 | 74.82 / 58.89
1 CNN + 2-scale images | 46.86 | 61.95 | N/A | 67.93 / 49.06
Label-inheritable | Yes | Yes | Mostly No | Partial

[1] F. S. Khan et al. Painting-91: a large scale database for computational painting categorization. Machine Vision & Applications 14.

[2] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV14.

[3] Z. Xu et al. Architectural style classification using multinomial latent logistic regression. ECCV14.

Slide26

Conclusion

We proposed Multi-Scale Convolutional Neural Networks (MSCNN) based on the Label-Inheritable (LI) property.

Multi-scale features.

MSCNN can outperform the state of the art on datasets where the LI property holds or even partially holds.
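This claim can be checked against the accuracies reported on the Experimental Results slide. The sketch below copies the 2-scale CNN numbers and both baselines from that slide and computes the gain over the stronger baseline for each task. The dictionary layout, the key names, and the helper `gain_over_best_baseline` are illustrative choices, not part of the presentation.

```python
# Accuracies (%) copied from the Experimental Results slide.
# "li" is the Label-inheritable row of the same table.
RESULTS = {
    "Artist style": {"li": "Yes", "previous": 53.10, "single": 55.15, "two_scale": 58.11},
    "Artistic style": {"li": "Yes", "previous": 62.20, "single": 67.37, "two_scale": 69.67},
    "Caltech-101 (15 examples)": {"li": "Mostly No", "previous": 83.80, "single": 83.45, "two_scale": 80.19},
    "Caltech-101 (30 examples)": {"li": "Mostly No", "previous": 86.50, "single": 88.19, "two_scale": 87.58},
    "Architectural style (10 classes)": {"li": "Partial", "previous": 69.17, "single": 70.64, "two_scale": 74.82},
    "Architectural style (25 classes)": {"li": "Partial", "previous": 46.21, "single": 54.84, "two_scale": 58.89},
}


def gain_over_best_baseline(row: dict) -> float:
    """Accuracy gain (percentage points) of the 2-scale CNN over the stronger baseline."""
    return round(row["two_scale"] - max(row["previous"], row["single"]), 2)


gains = {task: gain_over_best_baseline(row) for task, row in RESULTS.items()}

for task, gain in gains.items():
    print(f"{task}: {gain:+.2f} ({RESULTS[task]['li']})")
```

With these numbers the gain is positive for every task marked Yes or Partial and negative for both Caltech-101 settings, which are marked Mostly No.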

Slide27

Towards Solving Abstract Tasks

More CNN features to achieve better performance in abstract tasks:

Multi-scale features (ICME15).

Multi-depth features (ICIP15).

Multi-task features (submitted to ICCV15).

K.-C. Peng and T. Chen. A framework of extracting multi-scale features using multiple convolutional neural networks. ICME15.

K.-C. Peng and T. Chen. Cross-layer features in convolutional neural networks for generic classification tasks. ICIP15.

K.-C. Peng and T. Chen. Toward correlating and solving abstract tasks using convolutional neural networks. Submitted to ICCV15.

Slide28

Q & A
