Scalable Image Annotation with ConceptRank - PowerPoint Presentation

liane-varnes
Uploaded On 2018-01-16

Presentation Transcript

Slide1

Scalable Image Annotation with ConceptRank

Petra Budíková, Michal Batko, Pavel Zezula

Slide2

Outline

Search-based annotation

Motivation

Problem formalization

Challenges

ConceptRank

Idea

Semantic network construction

PageRank and ConceptRank

Image annotation with ConceptRank

MUFIN Image Annotation

Framework description

Current implementation and parameters

Examples

Experimental evaluation

Future work

Slide3

What and why?

Slide4

Motivation

What is in the image?

Why do I care?

Keyword-based image retrieval

Impaired users

Data summarization

Scientific data classification…

Yellow flower

Flower, yellow, dandelion, detail, close-up, nature, plant, beautiful

Taraxacum officinale

The first dandelion that bloomed this year in front of the White House.

nature

dandelion

Slide5

Problem formalization

The annotation task is defined by a query image I and a vocabulary V of target concepts

The annotation function fA assigns to each concept c ∈ V a value from [0, 1] that expresses the probability of the concept c being relevant for I

Depending on the application, only a subset of V can be returned to the user:

a fixed number of the most probable concepts

concepts with probability higher than a given threshold

some advanced selection of interesting concepts
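The selection strategies listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `select_annotation`, its parameters, and the example probabilities are assumptions made for this sketch.

```python
def select_annotation(concept_probs, top_k=None, threshold=None):
    """Pick the concepts returned to the user from the output of f_A.

    concept_probs maps each concept c in V to a probability in [0, 1].
    """
    # Rank concepts from most to least probable.
    ranked = sorted(concept_probs.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        # Keep only concepts whose probability exceeds the threshold.
        ranked = [(c, p) for c, p in ranked if p > threshold]
    if top_k is not None:
        # Keep only a fixed number of the most probable concepts.
        ranked = ranked[:top_k]
    return [c for c, _ in ranked]


# Hypothetical output of the annotation function for a four-concept vocabulary.
probs = {"flower": 0.8, "animal": 0.05, "person": 0.1, "building": 0.05}
print(select_annotation(probs, top_k=1))
print(select_annotation(probs, threshold=0.08))
```

Both criteria can be combined, for example a threshold to drop unlikely concepts followed by a cap on the number of returned concepts.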

V = {flower, animal, person, building}

Slide6

How can we describe the image?

Option 1: Classifiers

Principles

Learning phase: use reliable training data to create classifiers for selected concepts

Annotation phase: run classifiers

Main advantages

mature technologies available (e.g. neural networks)

fast

high precision and recall

Use cases

Annotations with fixed vocabulary and reliable training data

identification of people

classification of cancer cells

…

Option 2: Search-based approach

Principles

Learning phase: none

Annotation phase: similarity search over annotated data + postprocessing

Main advantages

reducing the reliance on cleanly labeled data, utilization of web data

no costly learning phase, annotation phase can be easily adjusted to user’s preferences

scalability w.r.t. vocabulary size

Use cases

Annotations with open/adaptable vocabulary

proposing keyword annotations for web image databases – need to be rich, adapt to the changing vocabulary of users

Slide7

Search-based approach: basic scheme

V = {flower, animal, person, building}

Annotated image collection

Content-based image retrieval

Similar annotated images

Yellow, bloom, pretty

Meadow, outdoors, dandelion

Mary’s garden, summer

Text processing

Semantic resources

Selection of the final annotation

flower

Candidate keywords with probabilities/scores: Plant 0.3, Flower 0.3, Garden 0.15, Animal 0.05, Human 0.1, Park 0.1

Slide8

Search-based approach: challenges

Selection and preprocessing of underlying database of annotated images

Size vs. quality

Effective and efficient image search

Descriptors, indexing technique

Image search results processing

Baseline: word cloud

Advanced: semantic analysis, annotation with hierarchic structure

Selection of output

(user-?)selected level of the hierarchic structure

Slide9

ConceptRank

Slide10

Idea

[Diagram: content-based image retrieval returns similar annotated images ("Yellow, bloom, pretty"; "Meadow, outdoors, dandelion"; "Mary’s garden, summer"); an unknown step "?" has to map their keywords to V = {flower, animal, person, building}]

Baseline word cloud solution

???

What would a person do?

Search for semantic connections between candidate keywords

Flowers bloom; dandelion is a flower; there are usually flowers in a garden; …

Based on the connections, estimate probabilities of vocabulary terms

“Flower” is rather likely

Slide11

Idea (cont.)

What can the computer do?

Search for semantic connections between candidate keywords?

Yes! Ontologies, WordNet, image dataset statistics, web, …

Based on the connections, estimate probabilities of vocabulary terms?

Yes! Based on the connections, add new candidates and/or adjust the score of existing candidates

So, let's try it!

Tasks:

find a suitable source of semantic information

propose an algorithm that uses the selected resource to discover semantic connections between candidate concepts and performs score recomputation

We want a generic and theoretically sound solution

ConceptRank

Slide12

ConceptRank overview

Let us assume we have some semantic resource S that contains

Semantic objects

Relationships between semantic objects

Mapping from English words to semantic objects

For ConceptRank, we need to

Transform the input keywords into semantic objects from S

Let's call the result “initial candidate objects”

Retrieve relationships between candidate objects and, if suitable, add new candidate objects

We need a suitable representation for this: semantic networks

Compute the probability of candidate objects

The actual ConceptRank algorithm

Slide13

Graph representation of semantic relationships

Nodes: candidate objects

Node probability: current probability of the respective candidate concept

Edges: relationships between candidate objects

Edge weight: “relevance transfer” capacity

the weight of edge from A to B expresses the ratio of probability which node A contributes to node B
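The graph representation described above can be sketched as a small Python data structure. This is an illustrative sketch only: the class name `SemanticNetwork`, its methods, and the node probabilities and edge weights in the example are assumptions, not values taken from the slides.

```python
class SemanticNetwork:
    """Directed graph with node probabilities and weighted edges."""

    def __init__(self):
        self.node_prob = {}   # node -> current probability
        self.out_edges = {}   # node -> {target node: edge weight}

    def add_node(self, name, prob=0.0):
        self.node_prob[name] = prob
        self.out_edges.setdefault(name, {})

    def add_edge(self, src, dst, weight):
        # The weight is the share of src's probability passed on to dst.
        self.out_edges[src][dst] = weight

    def contributions(self, src):
        """Probability mass that src contributes to each of its neighbours."""
        return {dst: self.node_prob[src] * w
                for dst, w in self.out_edges[src].items()}


# Hypothetical three-node network.
net = SemanticNetwork()
for name, prob in [("dog", 0.2), ("cat", 0.2), ("animal", 0.1)]:
    net.add_node(name, prob)
net.add_edge("dog", "animal", 1.0)
net.add_edge("cat", "animal", 1.0)
net.add_edge("animal", "dog", 0.5)
net.add_edge("animal", "cat", 0.5)
print(net.contributions("animal"))
```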

Semantic network for annotations

[Figure: example semantic network with nodes dog, cat, animal, mouse, computer, keyboard; edge weights 1, 0.5 and 0.33; node probabilities 0.2, 0.1, 0.2, 0.1, 0.2, 0.2]

Slide14

Building the semantic network

Input: initObjectsWithProb – set of initial objects with probabilities, S – semantic resource, rels – set of interesting relationships

Output: semanticNet – the semantic network
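A rough Python sketch of the network construction specified on this slide (the original pseudocode follows it). The function and parameter names are assumptions, the semantic resource is mocked as a plain dictionary, and all edges get a constant weight. Unlike the pseudocode, the sketch registers all initial objects as nodes up front, so an edge between two initial objects is found regardless of processing order.

```python
from collections import deque


def build_semantic_network(init_objects, resource, rels, expanding, weight_of):
    """Breadth-first construction of the semantic network.

    resource:  dict mapping (object, relationship) -> list of connected objects
    expanding: relationships that may add new nodes to the network
    weight_of: function relationship -> edge weight
    """
    nodes = set(init_objects)
    edges = {}  # (source, target, relationship) -> weight
    queue = deque(init_objects)
    while queue:
        obj = queue.popleft()
        for rel in rels:
            for other in resource.get((obj, rel), []):
                if other not in nodes:
                    if rel not in expanding:
                        continue  # non-expanding relationships only link existing nodes
                    nodes.add(other)
                    queue.append(other)
                edges[(obj, other, rel)] = weight_of(rel)
    return nodes, edges


# Hypothetical toy resource: hypernyms expand the network, hyponyms do not.
toy_resource = {
    ("dog", "hypernym"): ["animal"],
    ("cat", "hypernym"): ["animal"],
    ("animal", "hyponym"): ["dog", "cat", "horse"],
}
nodes, edges = build_semantic_network(
    ["dog", "cat"], toy_resource, ["hypernym", "hyponym"],
    expanding={"hypernym"}, weight_of=lambda rel: 1.0)
print(sorted(nodes))
```

In the toy run, "animal" is added through the expanding hypernym relationship, while "horse" is never added because hyponyms are only allowed to connect nodes that already exist.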

begin
  queue <- initObjectsWithProb.getObjects();
  for (o : queue) do
    semanticNet.addNode(o);
    queue.remove(o);
    for (r : rels) do
      for (o2 : S.getConnectedObjects(o,r)) do
        if (semanticNet.contains(o2)) then
          semanticNet.addEdge(o,o2,r,computeWeight(r,…));
        else
          if (r.isExpandingRel) then
            queue.add(o2);
            semanticNet.addNode(o2);
            semanticNet.addEdge(o,o2,r,computeWeight(r,…));
          fi
        fi
      done
    done
  done
end

Slide15

ConceptRank algorithm

Task: Using the probabilities of initial concepts (which were obtained from previous annotation phases) and the semantic network, compute the probability of each node in the network

Observations:

The nodes in the network mutually influence each other’s probability

The computation of node probabilities needs to be an iterative process

Goal: theoretically sound algorithm that finds a balanced state of the iterative process

Inspiration: Google PageRank algorithm

[Figure: the example semantic network (dog, cat, animal, mouse, computer, keyboard; edge weights 1, 0.5, 0.33) shown twice: with the initial node probabilities 0.2, 0.1, 0.2, 0.1, 0.2, 0.2 and with the resulting node probabilities 0.066, 0.066, 0.35, 0.166, 0.1, 0.25]

Slide16

PageRank

Input: Web pages and links represented in a graph

Output: Importance score of pages

Algorithm idea: In its simplest form, PageRank is a solution to the recursive equation “a page is important if important pages link to it.”

The importance of any node is computed as the probability that this node is reached by a random surfer who starts in an arbitrary node of the network graph and moves for a long time.

Network graph construction: Pages are represented by nodes, hyperlinks by oriented (directed) edges.

For each node in the graph, the sum of weights of all outgoing edges is 1.
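As a small illustration of the graph construction rule above, the following sketch turns a list of hyperlinks into edge weights that sum to 1 for every page with outgoing links. Splitting the weight uniformly among a page's links is an assumption of this sketch, as are the function name and the three-page example.

```python
def uniform_link_weights(links):
    """Map each page to {target: weight}, with outgoing weights summing to 1.

    links: dict mapping a page to the list of pages it links to.
    Pages without links get an empty dict (a dead end).
    """
    weights = {}
    for page, targets in links.items():
        unique_targets = sorted(set(targets))
        if unique_targets:
            share = 1.0 / len(unique_targets)
            weights[page] = {target: share for target in unique_targets}
        else:
            weights[page] = {}
    return weights


# Hypothetical link structure with three pages.
weights = uniform_link_weights({"A": ["B", "C"], "B": ["C"], "C": []})
print(weights)
```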

[Figure: example graph with nodes A, B, C and edge weights 0.5, 1, 0.5]

Slide17

PageRank (cont.)

Some math behind:

Since the probability of reaching a node depends solely on the probabilities of referencing nodes, the random surfer model is a Markov process.

For Markov processes, it is known that the distribution of the surfer approaches a limiting distribution, provided two conditions are met:

the graph is strongly connected (it is possible to get from any node to any other node)

there are no dead ends (nodes that have no outgoing edges)

To meet these conditions, the random surfer can perform random restarts – with a probability Prestart, he can restart at any moment in any node

Computation of scores: eigenvector computation over the matrix representation of the adjusted graph
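The random-surfer model with restarts can be sketched in plain Python. Instead of an exact eigenvector computation, the sketch repeatedly applies the adjusted transition probabilities (power iteration). The graph, the function name, and the choice to let a dead-end node jump uniformly to any node are assumptions made for this illustration.

```python
def pagerank(out_edges, restart_prob=0.3, iterations=100):
    """Approximate PageRank scores by power iteration.

    out_edges: dict node -> {target: weight}; the weights of each non-empty
    entry sum to 1. Every node must appear as a key.
    """
    nodes = sorted(out_edges)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: 0.0 for v in nodes}
        for src in nodes:
            targets = out_edges[src]
            if not targets:
                # Dead end: the surfer restarts in a uniformly random node.
                for v in nodes:
                    nxt[v] += rank[src] / n
                continue
            # With probability restart_prob, restart in a uniformly random node.
            for v in nodes:
                nxt[v] += rank[src] * restart_prob / n
            # Otherwise follow an outgoing edge according to its weight.
            for dst, w in targets.items():
                nxt[dst] += rank[src] * (1.0 - restart_prob) * w
        rank = nxt
    return rank


# Hypothetical graph: A links to B and C, B links to C, C is a dead end.
ranks = pagerank({"A": {"B": 0.5, "C": 0.5}, "B": {"C": 1.0}, "C": {}})
print(ranks)
```

With a restart probability of 0.3, an original edge of weight 0.5 ends up with an effective weight of 0.7 · 0.5 = 0.35, and each restart edge from a node with outgoing links gets 0.3 / 3 = 0.1.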

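Prestart=0.3

[Figure: graph with nodes A, B, C. Original edge weights: 0.5, 1, 0.5. After the adjustment for restarts: edge weights 0.35, 0.7, 0.35, plus restart edges with weights 0.1 and 0.33]

Slide18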

ConceptRank vs. PageRank

Input:

PageRank: web pages and hyperlinks

ConceptRank: candidate concepts and semantic links

Output:

PageRank: importance score of pages

ConceptRank: importance score of candidate concepts

Similarities:

We have nodes and links that can be used to form a graph/network

The network can be modelled as a Markov process

The random walk intuition makes sense for both problems

Random walk with internet: simulates randomly surfing user

Random walk with keywords: simulates user’s thinking while looking for relevant concepts

Differences:

For ConceptRank, we want to consider initial probabilities associated with nodes

Slide19

Adaptation of initial probabilities into the model

Random restarts will not be uniformly random

Instead, the probability that the walk will restart in a given node will correspond to the initial probability of that node

The initial probability is determined by previous steps of the annotation process

For concepts found among the keywords of similar images, the initial probability corresponds to the frequency of the concept

For concepts that were added during the semantic network building, the initial probability is 0
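The rules above for the restart distribution can be sketched as follows. The function name, the input format (one keyword list per similar image), and the example data are assumptions; normalizing the counts over the nodes present in the network is a design choice of this sketch.

```python
from collections import Counter


def restart_distribution(neighbor_keywords, network_nodes):
    """Initial (restart) probability of each network node.

    neighbor_keywords: one list of keywords per similar image.
    Nodes added during network construction never occur among the
    keywords and therefore get probability 0.
    """
    counts = Counter(kw for keywords in neighbor_keywords for kw in keywords)
    total = sum(counts[node] for node in network_nodes)
    if total == 0:
        raise ValueError("no initial keyword is present in the network")
    return {node: counts[node] / total for node in network_nodes}


# Hypothetical keywords of three similar images; "animal" and "computer"
# stand for nodes that were added while building the network.
keywords = [["dog", "cat"], ["dog", "mouse"], ["dog", "cat", "keyboard"]]
nodes = ["dog", "cat", "mouse", "keyboard", "animal", "computer"]
dist = restart_distribution(keywords, nodes)
print(dist)
```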

[Figure: the example semantic network (dog, cat, animal, mouse, computer, keyboard; edge weights 1, 0.5, 0.33) with initial node probabilities 0.4, 0.35, 0.2, 0.05 and 0 used as restart probabilities]

Slide20

ConceptRank algorithm

Input: initObjectsWithProb – initial concepts and their probabilities, semanticNet – the semantic network, rels – selected relationships and their weights, restartProb – probability of random surfer restart

Output: nodeProbs – probabilities of network nodes
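A compact Python sketch of the computation specified on this slide (the original pseudocode follows it). It is a simplification under stated assumptions: the edges are passed in as one already combined weight dictionary rather than one matrix per relationship type, the eigenvector is approximated by a fixed number of iterations, and all names and example values are made up for illustration.

```python
def concept_rank(nodes, edges, init_probs, restart_prob=0.3, iterations=50):
    """Approximate ConceptRank: random walk with non-uniform restarts.

    edges: dict node -> {target: weight}, targets must be in nodes.
    init_probs: dict node -> initial probability (missing nodes count as 0).
    """
    total_init = sum(init_probs.get(n, 0.0) for n in nodes)
    if total_init <= 0:
        raise ValueError("at least one node needs a positive initial probability")
    # Restart vector: initial probabilities, normalized to sum to 1.
    restart = {n: init_probs.get(n, 0.0) / total_init for n in nodes}
    # Normalize outgoing weights so that they sum to 1 per node.
    transition = {}
    for src in nodes:
        out = edges.get(src, {})
        total = sum(out.values())
        transition[src] = {d: w / total for d, w in out.items()} if total > 0 else {}
    probs = dict(restart)
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for src in nodes:
            mass = probs[src]
            # Dead ends restart with certainty, other nodes with restart_prob.
            jump = restart_prob if transition[src] else 1.0
            for n in nodes:
                nxt[n] += mass * jump * restart[n]
            for dst, w in transition[src].items():
                nxt[dst] += mass * (1.0 - jump) * w
        probs = nxt
    return probs


# Hypothetical network: "animal" was added during construction (initial prob. 0).
ranks = concept_rank(
    ["dog", "cat", "animal"],
    {"dog": {"animal": 1.0}, "cat": {"animal": 1.0},
     "animal": {"dog": 0.5, "cat": 0.5}},
    {"dog": 0.6, "cat": 0.4})
print(ranks)
```

In this toy run, "animal" ends up with the highest probability although it starts at 0, because both initial concepts pass their probability on to it.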

begin
  // construct the restart vector and matrix
  restartVector <- constant vector of 0 values;
  for (n : semanticNet.getNodes()) do
    if (initObjectsWithProb.contains(n)) then
      restartVector[semanticNet.indexOf(n)] <- initObjectsWithProb.get(n);
    fi
  done
  restartM <- unityVector*restartVector;

  // construct the transition matrix, normalize, solve dead ends
  transitionM <- new Matrix;
  for (r : rels.getRelationshipTypes()) do
    relM = constructTypeMatrix(semanticNet.getNodes,semanticNet.getEdges(r));
    transitionM.add(relM*rels.getWeight(r));
  done
  transitionM.normalize();
  for (i=0; i<transitionM.getColumnDimension(); i++) do
    if (transitionM.getColumn(i).getSum() == 0) then
      transitionM.replaceColumn(i, restartVector);
    fi
  done

  // compute the eigenvector
  completeMatrix <- (1-restartProb)*transitionM + restartProb*restartM;
  nodeProbs <- completeMatrix.getPrincipalEigenvector();
end

Slide21

Efficiency issues

For larger sets of similar images, the number of initial keywords and subsequently the number of nodes in the network may get high (1000+)

Costly construction of the semantic network

Costly computation of the ConceptRank

Therefore, approximations can be used

For semantic network construction: limiting the number of initial nodes

For ConceptRank computation: limited number of multiplications by the transfer matrix instead of the exact mathematical computation of the eigenvector

Approximation used by Google, known to work very well

Slide22

Putting theory to use

Slide23

The basic annotation scheme again

V = {flower, animal, person, building}

Annotated image collection

Content-based image retrieval

Similar annotated images

Yellow, bloom, pretty

Meadow, outdoors, dandelion

Mary’s garden, summer

Text processing

Semantic resources

Selection of the final annotation

flower

Candidate keywords with probabilities/scores: Plant 0.3, Flower 0.3, Garden 0.15, Animal 0.05, Human 0.1, Park 0.1

ConceptRank

Slide24

MUFIN Image Annotation Framework

Modular architecture for image annotation

There is an extensible set of modules that implement the same interface

Can be arbitrarily combined into an “annotation pipeline”

There is an “annotation record” object that is passed from one module to another

Carries information about query and candidate keywords, current estimate of probabilities, and any other knowledge deemed relevant by individual modules

Clear structure, easy adaptability

Upgrade from MPEG7 to DeCAF descriptors = replacing one module without disturbing others
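The modular design described above might look roughly like this in Python. This is only a sketch of the idea, not the MUFIN API: the class names, the `process` interface, and the two example modules are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationRecord:
    """Object handed from one module to the next."""
    query: str
    candidates: dict = field(default_factory=dict)  # keyword -> probability
    notes: dict = field(default_factory=dict)       # anything else modules share


class KeywordFrequency:
    """Turns keywords of similar images into normalized candidate scores."""

    def process(self, record):
        counts = {}
        for keywords in record.notes.get("neighbor_keywords", []):
            for kw in keywords:
                counts[kw] = counts.get(kw, 0) + 1
        total = sum(counts.values())
        record.candidates = {kw: c / total for kw, c in counts.items()} if total else {}
        return record


class TopK:
    """Keeps only the k most probable candidates."""

    def __init__(self, k):
        self.k = k

    def process(self, record):
        ranked = sorted(record.candidates.items(), key=lambda kv: kv[1], reverse=True)
        record.candidates = dict(ranked[: self.k])
        return record


def run_pipeline(modules, record):
    # Every module implements the same interface, so modules can be swapped freely.
    for module in modules:
        record = module.process(record)
    return record


record = AnnotationRecord(
    query="query.jpg",
    notes={"neighbor_keywords": [["yellow", "bloom"], ["meadow", "dandelion", "bloom"]]})
result = run_pipeline([KeywordFrequency(), TopK(2)], record)
print(result.candidates)
```

Because each module only reads and updates the shared record, replacing one step (for example the descriptor used for similarity search) does not require changes in the others.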

MUFIN Image Annotation application

Slide25

MUFIN Image Annotation – current version

Objective:

Annotation with semantic relationships evaluated by ConceptRank

Basic decisions:

Reference dataset: 20M Profiset

20M high-quality images with rich and systematic annotation

20 keywords per image on average

Obtained from a commercial web-site selling stock images

Evaluation of visual similarity: DeCAF descriptors

State-of-the-art for image content description

Indexing: PPP-codes

Source of semantic information: WordNet

Lexical database of English

Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms – synsets

Synsets are interlinked by conceptual-semantic and lexical relations

Hypernyms, hyponyms, …

Slide26

WordNet ConceptRank details

Basic objects for semantic analysis: synsets

Step 1: Transformation of keywords to synsets

For keywords with multiple meanings, there exist more synsets (e.g. mouse). How do we decide which synset(s) to pick?

There is an additional resource that, for most English words, lists the possible synsets together with a score that corresponds to the frequency of use of the keyword in the meaning described by the given synset

We take a fixed number of the most probable synsets for each keyword

There may be many synsets retrieved by the previous step, which could lead to costly processing of the semantic network

Therefore, only a fixed number of the most probable synsets are used to build the network

Slide27

WordNet ConceptRank details (cont.)

Step 2: Construction of WordNet-based semantic network

Which relationships are interesting?

For now: Hypernyms, hyponyms, holonyms, meronyms

Which relationships should be used to extend the network and which should be used only to add edges between existing nodes?

Extending mode: bottom-up relationships (hypernyms, maybe holonyms)

How shall we compute the weights of semantic network edges for each relationship?

Bottom-up relationships: edge weight 1

Top-down relationships: edge weight 1/(number of child nodes)
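The weighting rule above can be written down as a short function. The sketch assumes that hypernyms and holonyms count as bottom-up and hyponyms and meronyms as top-down; the slide only names hypernyms (and "maybe holonyms") explicitly, so this grouping and the function name are assumptions.

```python
BOTTOM_UP = {"hypernym", "holonym"}   # child -> parent (assumed grouping)
TOP_DOWN = {"hyponym", "meronym"}     # parent -> child (assumed grouping)


def outgoing_weights(relationship, targets):
    """Edge weights from one node to its targets for a given relationship."""
    if not targets:
        return {}
    if relationship in BOTTOM_UP:
        # Each bottom-up edge gets weight 1.
        return {target: 1.0 for target in targets}
    if relationship in TOP_DOWN:
        # A parent's weight is split evenly among its child nodes.
        share = 1.0 / len(targets)
        return {target: share for target in targets}
    raise ValueError(f"unknown relationship: {relationship}")


print(outgoing_weights("hypernym", ["animal"]))
print(outgoing_weights("hyponym", ["dog", "cat", "mouse"]))
```

For a parent with three children in the network, each top-down edge gets weight 1/3, which corresponds to the 0.33 weights in the example network.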

[Figure: the example semantic network (dog, cat, animal, mouse, computer, keyboard) with edge weights 1, 0.33 and 0.5]

Slide28

The complete annotation pipeline

Similarity search

Extraction of the DeCAF descriptor from the query image

Retrieval of k visual nearest neighbors

Semantic analysis

Frequency analysis of keywords + normalization

Transformation of keywords to synsets

Construction of WordNet-based semantic network

Computation of ConceptRank

Selection of the final annotation

Mapping synsets with probabilities to vocabulary concepts

Slide29

Overview of annotation parameters

Similarity search

# of similar images

Transformation of keywords to synsets

# of most probable synsets per keyword

# of most probable synsets that enter the network construction

Construction of WordNet-based semantic network

types of relationships (for extending network, for adding edges)

weights of edges for individual relationships

Computation of ConceptRank

restart probability

weights of individual relationship matrices

Slide30

Annotation query example

Input:

?

Vocabulary: all English words

Slide31

Example: kNN search and initial synsets

kNN search: k=5

Keywords to synsets: at most 3 most probable synsets per keyword

Merge synsets: 20 synsets with the highest probability
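The two limits above (synsets per keyword, then a global cap) can be sketched as follows. The sense-frequency table, the synset identifiers, and the rule that a keyword's probability is split among its selected synsets in proportion to their frequency scores are assumptions of this sketch; the slides do not specify how the probabilities are distributed.

```python
def initial_synsets(keyword_probs, sense_scores, per_keyword=3, total=20):
    """Map keywords to synsets and keep the most probable ones.

    keyword_probs: keyword -> probability from the frequency analysis
    sense_scores:  keyword -> {synset id: frequency-of-use score}
    """
    scored = {}
    for keyword, prob in keyword_probs.items():
        senses = sorted(sense_scores.get(keyword, {}).items(),
                        key=lambda kv: kv[1], reverse=True)[:per_keyword]
        norm = sum(score for _, score in senses)
        if norm == 0:
            continue  # keyword unknown to the resource
        for synset, score in senses:
            # Assumed rule: share the keyword's probability by sense frequency.
            scored[synset] = scored.get(synset, 0.0) + prob * score / norm
    best = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:total]
    return dict(best)


# Hypothetical sense-frequency table with made-up synset identifiers.
senses = {
    "mouse": {"mouse.animal": 3, "mouse.device": 1},
    "keyboard": {"keyboard.device": 1},
}
synsets = initial_synsets({"mouse": 0.5, "keyboard": 0.5}, senses,
                          per_keyword=2, total=2)
print(synsets)
```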

Keywords of the 5 nearest neighbors:

beak, cotswolds, flamingoes (2), head, janes (2), pink, site, slimbridge (2), trust, water, wetlands, wildfowl

beak, cotswolds, flamingoes (2), head, janes (2), pink, preening, site, slimbridge (2), trust, water, wetlands, wildfowl

american, birds, darwin, flamingo (2), flap, flapping (2), galapagos, greater (2), islands, markings, phoenicopterus, race, ruber, south, wing, wings (2)

aythya, drake, duck, sv, swimming

aythya, drake, duck, sv, swimming

Initial synsets:

flamingo 0.185
greater 0.062
wildfowl 0.062
Cotswolds 0.062
Aythya 0.062
wetland 0.062
site 0.058
head 0.049
pink 0.047
water 0.046
trust 0.037
wings 0.037
duck 0.034
Drake 0.031
drake 0.031
swimming 0.031
Galapagos_Islands 0.031
beak 0.025
beak 0.025
American 0.023

Slide32

Example: semantic network – hypernyms

Slide33

Example: annotation results

Top 5 keywords – demonstration settings

Flamingoes (4.15)

Duck (2.44)

Wildfowl (1.74)

Birds (1.48)

Wetlands (1.41)

Top 5 keywords – 70 images, 7 synsets/kw, 100 init. synsets, all relationships

Animal (2.68)

Bird (2.42)

Travel (2.30)

Vertebrates (2.04)

Swimming (1.42)

Slide34

Experimental evaluation

ImageCLEF 2014: Scalable Concept Image Annotation

Focus on concept-wise scalability

No reasonable training data

Provided development queries, GT and evaluation scripts

Vocabulary: aerial airplane baby beach bicycle bird boat bridge building car cartoon castle cat chair child church cityscape closeup cloud cloudless coast countryside daytime desert diagram dog drink drum elder embroidery fire firework fish flower fog food footwear furniture garden grass guitar harbor hat helicopter highway horse indoor instrument lake lightning logo monument moon motorcycle mountain nighttime overcast painting park person plant portrait protest rain rainbow reflection river road sand sculpture sea shadow sign silhouette smoke snow soil space spectacles sport sun sunrise/sunset table teenager toy traffic train tricycle truck underwater unpaved wagon water

GT: countryside daytime grass horse plant

Slide35

Experimental evaluation (cont.)

Development data results

Method                                             MP-c   MR-c   MF-c   MP-s   MR-s   MF-s   MAP-s
Random baseline                                     2.79   1.03   1.17   3.15   1.91   2.23    8.78
DISA baseline – freq. analysis, 1 synset per kw    20.96  34.22  22.87  37.30  43.14  38.07   40.59
DISA baseline with multiple synsets per kw         31.20  36.76  27.79  44.30  51.00  45.00   50.03
DISA with hyper-hypo                               30.10  36.57  28.75  48.42  58.22  50.26   58.35
DISA with hyper-hypo-holo-mero                     30.29  36.63  28.98  49.08  59.11  51.00   59.34

Processing time: 1500 ms on average for parameters used in the table

1000 ms for descriptor extraction (can be improved)

300 ms for similarity search

Competition results: a close 2nd place

Slide36

What next?

Slide37

Summary and Future work

Already done

The ConceptRank algorithm

Working annotation system

Good results in the ImageCLEF competition

Near future

More evaluations

Influence of dataset size and quality, approximation params, …

Google ground truth

Publish or perish

More distant future

Other resources of semantic relationships

Ontologies, Word2Vec

Relevance feedback

Combined architecture: search-based approach and modern NN classifiers