
Amir Rahimzadeh Ilkhechi - PowerPoint Presentation

Uploaded On 2022-08-02



Presentation Transcript

Slide1

Amir Rahimzadeh Ilkhechi, Yağız Salor, Mustafa İlker Saraç, Hakan Sözer

Distributed Information Retrieval

Jamie Callan

Slide2

Motivation

The single database model can be successful if most of the important or valuable information on a network can be copied easily. However, information that cannot be copied is not accessible under the single database model. Information that is proprietary, that costs money, or that a publisher wishes to control carefully is essentially invisible to the single database model.

Slide3

Solution

The alternative to the single database model is a multi-database model, in which the existence of multiple text databases is modeled explicitly.

[Figure: Single-DB Model versus Multi-DB Model. In the multi-DB model, a Central DB holds descriptions of the private DBs (Private DB 1, Private DB 2).]

Slide4

Multi-Database Model

Resource Description: The contents of each text database must be described.

Resource Selection: Given an information need and a set of resource descriptions, a decision must be made about which database(s) to search.

Resource Merging: Integrating the ranked lists returned by each database into a single coherent ranked list.

[Figure: example databases with term sets (aaa, bbb, ccc), (bbb, ddd, eee), and (aaa, ccc, ddd), alongside entries shown as (???, ???, ???).]

Slide5

Resource Description

Approach: A simple and robust solution is to represent each database by a description consisting of the words that occur in the database and their frequencies of occurrence, or statistics derived from those frequencies. Such a description is called a unigram language model.

[Figure: example databases with term sets (aaa, bbb, ccc), (bbb, ddd, eee), and (aaa, ccc, ddd), alongside entries shown as (???, ???, ???).]
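As a rough illustration of the description above, the sketch below builds a unigram language model for one database by counting word occurrences and deriving relative frequencies. The function name `unigram_language_model`, the simple regular-expression tokenizer, and the toy documents are assumptions made for this example; they do not come from the presentation.

```python
from collections import Counter
import re


def unigram_language_model(documents):
    """Describe one database by its words and their frequencies.

    Returns (counts, probs): raw term counts and relative frequencies.
    Tokenization here is a simplistic assumption (lowercased \\w+ runs).
    """
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"\w+", text.lower()))
    total = sum(counts.values())
    # Relative frequency is one example of a statistic derived from counts.
    probs = {term: c / total for term, c in counts.items()} if total else {}
    return counts, probs


# Toy database reusing the placeholder terms from the slides.
db = ["aaa bbb ccc", "aaa ccc", "bbb aaa"]
counts, probs = unigram_language_model(db)
print(counts.most_common())
```

A description like this is small compared with the database itself, which is part of what makes it practical to hold many of them in one central place.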

Slide6

Resource Selection

The major part of the resource selection problem is ranking resources by how likely they are to satisfy the information need.

Approach: Apply the techniques of document ranking to the problem of resource ranking, using variants of tf.idf approaches. One advantage is that the same query can be used to rank resources and to rank documents.

[Figure: example databases with term sets (aaa, bbb, ccc), (bbb, ddd, eee), and (aaa, ccc, ddd); the first two appear again as a separate group.]
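To make the idea of ranking resources with document-ranking techniques concrete, here is a minimal sketch that treats each database description as if it were one large document and scores it with a generic tf.idf formula. The specific weighting (log-scaled term frequency, idf computed over databases rather than documents), the function name `rank_resources`, and the toy counts are assumptions for illustration; the presentation only says that variants of tf.idf are used and does not give a formula.

```python
import math


def rank_resources(query, descriptions):
    """Rank databases for a query with a generic tf.idf score.

    descriptions maps a database name to a {term: count} dictionary.
    The weighting below is an assumed, textbook-style variant.
    """
    n_dbs = len(descriptions)
    terms = query.lower().split()
    scores = {}
    for name, counts in descriptions.items():
        score = 0.0
        for term in terms:
            count = counts.get(term, 0)
            if count == 0:
                continue
            # Number of databases whose description contains the term.
            n_t = sum(1 for c in descriptions.values() if c.get(term, 0) > 0)
            tf = 1.0 + math.log(count)
            idf = math.log(1.0 + n_dbs / n_t)
            score += tf * idf
        scores[name] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy descriptions reusing the placeholder terms from the slides.
descriptions = {
    "db1": {"aaa": 3, "bbb": 1, "ccc": 2},
    "db2": {"bbb": 2, "ddd": 4, "eee": 1},
    "db3": {"aaa": 1, "ccc": 1, "ddd": 1},
}
ranking = rank_resources("ddd eee", descriptions)
print(ranking)
```

Because the function takes a plain query string, the same query could then be passed unchanged to the selected databases for document ranking.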

Slide7

Resource Merging

Solutions include computing normalized scores, estimating normalized scores, and merging based on unnormalized scores.

[Figure: example databases with term sets (aaa, bbb, ccc) and (bbb, ddd, eee).]
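The sketch below shows one simple way that scores from different databases might be put on a common scale before merging. It uses per-list min-max rescaling as an illustrative stand-in; this is an assumption, and it is not necessarily the normalization the presentation has in mind. The function names and the toy result lists are likewise hypothetical.

```python
def min_max_normalize(results):
    """Rescale one database's (doc, score) list to the range [0, 1]."""
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: treat every document as equally good.
        return [(doc, 1.0) for doc, _ in results]
    return [(doc, (score - lo) / (hi - lo)) for doc, score in results]


def merge_ranked_lists(lists_by_db):
    """Merge per-database ranked lists into one list of (db, doc, score)."""
    merged = []
    for db, results in lists_by_db.items():
        if not results:
            continue
        for doc, score in min_max_normalize(results):
            merged.append((db, doc, score))
    merged.sort(key=lambda row: row[2], reverse=True)
    return merged


# Toy result lists whose raw scores are on very different scales.
results = {
    "db1": [("d1", 12.0), ("d2", 9.0), ("d3", 6.0)],
    "db2": [("d7", 0.75), ("d8", 0.5), ("d9", 0.25)],
}
merged = merge_ranked_lists(results)
for db, doc, score in merged:
    print(db, doc, round(score, 2))
```

Without some rescaling step, every document from db1 would outrank every document from db2 simply because db1 reports larger raw numbers, which is the problem that merging strategies have to address.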

Slide8

Results

Accuracy of unigram language models, of resource rankings, and of document rankings.

Testbeds: Summary statistics for three distributed IR testbeds.

Slide9

Conclusion & Summary

[Figure: recap of the example databases (aaa, bbb, ccc), (bbb, ddd, eee), and (aaa, ccc, ddd), with entries shown as (???, ???, ???), labeled with the techniques covered: unigram language model, tf.idf, and computing normalized scores.]