
Writer - PowerPoint Presentation





Presentation Transcript

Slide1

Writer identification through information retrieval

Ralph Niels, Franc Grootjen & Louis Vuurpijl

August 21st, 2008

ICFHR, Montreal

Slide2

A search engine for forensic experts

Slide3

Overview

Forensic writer identification
Prototypical shapes in handwriting
Information retrieval (IR)
  Traditional
  Writer identification using prototypes
Experiments
  Method
  Results
Conclusions & future work

Slide4

Forensic writer identification

Slide5

Forensic information retrieval

Web search: query of words to search in documents containing words
Forensic search: query of characters to search in documents containing characters
Previous work*: sub-character level, binary features
Based on characters: improves justification possibilities

* A. Bensefia, T. Paquet, and L. Heutte. A writer identification and verification system. Pattern Recogn. Letters, 26(13):2080–2092, 2005.

Slide6

Forensic information retrieval

Dictionary of character shapes: prototypes
Experts use prototypes
Describe query & documents by prototype usage

[Figure: prototypes and instances of a prototype]

Slide7

Character to prototype matcher

Find most similar prototype for each character

Example: W48 h16 a9 t1 y2 o1 u23 d16 i25 d12 i6 s12 (…)
Prototype list (excerpt): a5, a9, a16, a52 (…)
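
As a minimal sketch of how the matcher output in the example above could be turned into a description of prototype usage, the labels can simply be counted. The helper name `prototype_usage` and the label format (character plus prototype number, such as `a9`) are assumptions made for illustration and are not taken from the slides.

```python
from collections import Counter

def prototype_usage(labels):
    """Count how often each prototype label occurs (hypothetical helper)."""
    return Counter(labels)

# Matcher output from the slide example: one prototype label per character.
labels = ["W48", "h16", "a9", "t1", "y2", "o1", "u23", "d16", "i25", "d12", "i6", "s12"]
usage = prototype_usage(labels)
```

A `Counter` returns 0 for prototypes that were never used, which is convenient when vectors of different writers are compared later.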

Slide8

Prototypes

Averaged shapes of real handwritten characters
Dynamic Time Warping-distance to find most similar prototype

R. Niels, L. Vuurpijl & L. Schomaker. Automatic allograph matching in forensic writer identification. International Journal of Pattern Recognition and Artificial Intelligence, Vol. 21, No. 1, pages 61-81, February 2007.
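
The prototype lookup can be sketched with a textbook Dynamic Time Warping distance. This is not necessarily the DTW variant used by the authors (see the cited paper for theirs); the trajectory format (lists of `(x, y)` points), the function names and the toy prototypes are assumptions for illustration.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW between two trajectories given as lists of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between points
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def most_similar_prototype(character, prototypes):
    """Return the label of the prototype with the smallest DTW distance."""
    return min(prototypes, key=lambda label: dtw_distance(character, prototypes[label]))

# Toy data (hypothetical): two prototypes and one written character.
prototypes = {
    "a5": [(0, 0), (1, 1), (2, 0)],
    "a9": [(0, 0), (1, -1), (2, 0)],
}
character = [(0, 0), (0.9, 1.1), (2.1, 0)]
best = most_similar_prototype(character, prototypes)
```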

Slide9

The IR model for writer identification

[Diagram: Writer input -> Character to prototype matcher -> af(w) -> Indexing -> aw(w); Query input -> Character to prototype matcher -> af(q); af(q) and aw(w) -> Matching -> Ranked list and Justification. The matchers use the Prototype list.]

Slide10

Indexing: create weighted vectors

Vector of prototype usage for each writer: af(w)
Adjust weight of prototypes in that vector:
  Protos used by many writers: not distinctive -> lower weight
  wf(p) = number of writers using proto p
Weighted vector of prototype use for each writer: aw(w)
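
A minimal sketch of the indexing step follows. The slide only states that prototypes used by many writers get a lower weight; the concrete weight used here, log(N / wf(p)) with N the number of writers, is an assumption borrowed from the inverse document frequency of traditional IR, and the function and variable names are hypothetical.

```python
import math
from collections import Counter

def index_writers(af):
    """af maps writer -> Counter of prototype usage. Returns (iwf, aw)."""
    n_writers = len(af)
    wf = Counter()  # wf[p] = number of writers using prototype p
    for usage in af.values():
        wf.update(p for p, count in usage.items() if count > 0)
    # Assumed weighting: log(N / wf(p)), analogous to IDF; not given on the slide.
    iwf = {p: math.log(n_writers / wf[p]) for p in wf}
    aw = {
        w: {p: count * iwf[p] for p, count in usage.items() if count > 0}
        for w, usage in af.items()
    }
    return iwf, aw

# Toy data (hypothetical): t1 is used by both writers, a5 and a9 by one each.
af = {
    "w1": Counter({"a5": 2, "t1": 1}),
    "w2": Counter({"a9": 3, "t1": 1}),
}
iwf, aw = index_writers(af)
```

With this weighting, a prototype used by every writer gets weight 0 and so does not influence the ranking at all.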

Slide11

The IR model for writer identification

[Diagram: IR model overview, as on the earlier slide, with af(q) labelled "Prototype frequency in query"]

Slide12

The IR model for writer identification

[Diagram: IR model overview, as on the earlier slide]

Slide13

Matching

Input
  ‘Database writers’: Indexed writer vectors aw(w)
  ‘Query writer’: Vector af(q)
Match:
  Calculate cosine of angle between af(q) and each aw(w)
Output
  Ranked list of writers (similarity to query)
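
The matching step can be sketched as a cosine similarity between sparse vectors, followed by sorting. The dict-based vector representation, the function names and the toy values are assumptions for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(p, 0.0) for p, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty or all-zero vector matches nothing
    return dot / (norm_u * norm_v)

def rank_writers(af_q, aw):
    """Return (writer, similarity) pairs, most similar writer first."""
    scores = {w: cosine_similarity(af_q, vec) for w, vec in aw.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data (hypothetical): indexed writer vectors and one query vector.
aw = {"w1": {"a5": 2.0}, "w2": {"a5": 1.0, "a9": 1.0}}
af_q = {"a5": 3}
ranked = rank_writers(af_q, aw)
```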

Slide14

The IR model for writer identification

[Diagram: IR model overview, as on the earlier slide]

Slide15

Justification

Similarity value (cosine of angle)
Prototype contribution to retrieval result
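
One plausible way to obtain a per-prototype contribution is to split the cosine into its summands, so that the contributions add up to the similarity value. The slides do not say how the contribution is computed, so this decomposition, the function name and the toy values are assumptions.

```python
import math

def prototype_contributions(af_q, aw_w):
    """Split the cosine between af_q and aw_w into one term per shared prototype."""
    norm_q = math.sqrt(sum(x * x for x in af_q.values()))
    norm_w = math.sqrt(sum(x * x for x in aw_w.values()))
    if norm_q == 0 or norm_w == 0:
        return {}
    contrib = {
        p: af_q[p] * aw_w[p] / (norm_q * norm_w)
        for p in af_q
        if p in aw_w
    }
    # Largest contribution first, so the most influential prototypes lead the list.
    return dict(sorted(contrib.items(), key=lambda item: item[1], reverse=True))

# Toy data (hypothetical).
af_q = {"a5": 1, "t1": 1}
aw_w = {"a5": 3.0, "t1": 4.0}
contributions = prototype_contributions(af_q, aw_w)
```

Because the terms sum to the cosine, an expert could see which prototypes account for most of a writer's position in the ranked list.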

Slide16

Justification

Forensic expert can further inspect justification

Slide17

Experiment

43 writers from plucoll database
  Online data
  Segmented into characters
How well does our technique perform given a certain amount of data (characters)?
  Amount of characters in database (d)
  Amount of characters in query (q)

Slide18

Experiment

Pick d random letters from each database writer
Pick q random other letters from one writer, and use those as query
Find most similar writer
Vary d and q
Repeat 10 times for each writer

[Diagram labels: Prototypes; iwf(p), aw(w); Matching]
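
The procedure above can be sketched as follows, assuming each writer's letters have already been mapped to prototype labels. The log(N / wf(p)) weighting, the function names and the toy data are assumptions for illustration; each writer needs at least d + q letters.

```python
import math
import random
from collections import Counter

def cosine(u, v):
    dot = sum(x * v.get(p, 0.0) for p, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def run_trial(letters, query_writer, d, q, rng):
    """One trial: True if the query writer ends up at the top of the ranking."""
    database, query = {}, None
    for writer, labels in letters.items():
        shuffled = rng.sample(labels, len(labels))
        database[writer] = Counter(shuffled[:d])  # d random letters per writer
        if writer == query_writer:
            query = Counter(shuffled[d:d + q])  # q other letters, disjoint from the d
    n = len(database)
    wf = Counter(p for usage in database.values() for p in usage)
    # Assumed weighting log(N / wf(p)); the slides only name iwf(p).
    aw = {w: {p: c * math.log(n / wf[p]) for p, c in usage.items()}
          for w, usage in database.items()}
    best = max(aw, key=lambda w: cosine(query, aw[w]))
    return best == query_writer

def accuracy(letters, d, q, repeats=10, seed=0):
    """Fraction of correct top-1 results over all writers and repeats."""
    rng = random.Random(seed)
    trials = [run_trial(letters, w, d, q, rng) for w in letters for _ in range(repeats)]
    return sum(trials) / len(trials)

# Toy data (hypothetical): two writers with completely different prototypes.
letters = {"w1": ["a5"] * 30, "w2": ["a9"] * 30}
result = accuracy(letters, d=10, q=5)
```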

Repeat 10 times for each combination of d and q

Slide19

Results

Identification performance (%) by number of characters in database (d) and in query (q):

q \ d    100   300   500   1000
10        59    79    83     88
30        86    97    99    100
50        94    99   100    100
70        96   100   100    100
100       98   100   100    100

Slide20

Conclusions & future work

Needed for 100%: 70 chars (q), 300 chars (d)
  Average English sentence: 75-100 characters
No black box: results are justified
Online data: forensic practice?
  Extract semi-automatically with help of an expert
  Use offline matching technique
Just 43 writers
  Bigger (n writers & n techniques) experiments planned
Promising results

Slide21

Writer identification through information retrieval

A search engine for forensic experts

Ralph Niels, Franc Grootjen & Louis Vuurpijl