Introduction to Information Retrieval - PowerPoint Presentation

dayspiracy (@dayspiracy)
Uploaded On 2020-06-16



Presentation Transcript

Slide1

Introduction to Information Retrieval

Slide2

What is IR?

"Sit down before fact as a little child, be prepared to give up every preconceived notion, follow humbly wherever and to whatever abysses nature leads, or you shall learn nothing." -- Thomas Huxley

Search Engines


Queries: "What is IR?" and "What is information retrieval?"

Search engines:

- Google
- Ask.com
- Yahoo!
- Google Korea
- Naver
- Daum

Slide3

IR: Key Questions

- What are we looking for?
- How do we find it?
- Why is it difficult?

“A prudent question is one-half of wisdom” -- Francis Bacon

Slide4

IR: What are we looking for?

We are

- Looking for X.
  - Q&A: population of China
  - Known-item search: “Catcher in the Rye”
- Looking for something like/about X.
  - General/background info: Taliban
  - Collection development: IR literature
  - Similar to (known) X: like “Catcher in the Rye”
  - WhatyoumacallX: “the rye-boy story”
- Looking for something
  - Problem resolution: how can we fight terrorism?
  - Knowledge development: what is IR?
- Looking
  - Need something, but don’t know what: what’s it all about?
  - Serendipity: Web surfing

Slide5

IR: How do we find it?

- Brute force search
  - Easy to build, maintain, and use
  - Searcher does all the work; hard to get satisfaction
- Organize/structure the data
  - Intuitive to use
  - Hard to build and maintain
  - Knowledge of builder’s language & organization structure is crucial
- Use a search tool
  - Easier to build and maintain: less manipulation of data
  - Sometimes works, sometimes not (helps to know the language of the data)
- Ask the experts
  - Easy and satisfying to use (by definition)
  - “Expert” knowledge is transitory, hard to encapsulate
- Go with the crowd
  - Relatively easy to build and maintain
  - Limited utility: doesn’t work with “unpopular” X
- Zen-Fusion search.
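As a rough illustration of the brute force option on this slide, the sketch below scans every document in full for each query. The function name `brute_force_search` and the three-document collection are assumptions made for this example (the texts are borrowed from the indexing slide later in the deck); nothing here is prescribed by the slides.

```python
def brute_force_search(documents, query):
    """Return IDs of documents that contain every query word, by scanning all text."""
    query_words = query.lower().split()
    hits = []
    for doc_id, text in documents.items():
        # No index is built: each document is re-read for every query.
        words = set(text.lower().split())
        if all(word in words for word in query_words):
            hits.append(doc_id)
    return hits


# Hypothetical collection used only for illustration.
documents = {
    "D1": "information retrieval seminars",
    "D2": "retrieval models and information retrieval",
    "D3": "information model",
}

print(brute_force_search(documents, "information retrieval"))  # ['D1', 'D2']
```

The code is easy to build and maintain, but matching is exact and unranked, so the searcher still has to pick the right words ("model" will not find "models") and sort through the hits by hand.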

Slide6

Information Seeking Process: Dynamic, Interactive, Iterative

User, Intermediary, Information

What am I looking for? - Identification of info. need

What question do I ask? - Query formulation

What is the searcher looking for? - Discovery of user’s info. need

How should the question be posed? - Query representation

Where is the relevant information? - Query-document matching

What data to collect? - Collection development

What information to index? - Indexing/Representation

How to represent it? - Data structure

Slide7

Information Seeking Models

Berry-picking Model

- Interesting information is scattered like berries among bushes.
- Information seeking is a dynamic, non-linear process, where information needs and queries continually shift.
- Information needs are not satisfied by a single, final retrieved set of documents, but rather by a series of selections and bits of information found along the way.

Traditional Model

- Linear process:
  - Problem identification
  - Identification of information need
  - Query formulation
  - Result evaluation
- Static information need
- The goal is to retrieve a perfect match of the information need

Bates, 1989

Broder, 2002

Slide8

IR Research: Overview

Information Organization - Add structure & annotation

Information Retrieval - Create a searchable index

Information Access - Retrieve information

Data Mining - Discover knowledge

Slide9

IR Research: Information Retrieval

Raw Data

- D1: information retrieval seminars (wd1 wd3 wd4)
- D2: retrieval models and information retrieval (wd3 wd2 wd1 wd3)
- D3: information model (wd1 wd2)

Representation - indexing, term weighting

Searchable Index

Index Term           D1   D2   D3
wd1 (information)     1    1    1
wd2 (model)           0    1    1
wd3 (retrieval)       1    2    0
wd4 (seminar)         1    0    0

Query Formulation - “What is information retrieval?”

Search Results - (ranked) document list

Rank   docID   score
1      D2      3
2      D1      2
3      D3      1
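The index and ranking on this slide can be reproduced with a few lines of code. The sketch below is one possible reading, not the deck's stated method: it assumes the score is the sum of the term frequencies of the query terms, which is consistent with the scores shown (D2 = 3, D1 = 2, D3 = 1). The stop list, the crude plural-stripping `normalize` function, and all other names are assumptions standing in for real stopword removal and stemming.

```python
from collections import Counter

# Assumption: a tiny stop list chosen to fit this example only.
STOPWORDS = {"and", "what", "is"}


def normalize(word):
    """Lowercase, strip punctuation, and drop a trailing 's' (toy stemmer)."""
    word = word.lower().strip("?.,!\"'")
    if word.endswith("s") and len(word) > 3:
        word = word[:-1]
    return word


def tokenize(text):
    """Split text into normalized index terms, skipping stopwords."""
    tokens = (normalize(w) for w in text.split())
    return [t for t in tokens if t and t not in STOPWORDS]


def build_index(documents):
    """Map each index term to {doc_id: term frequency}."""
    index = {}
    for doc_id, text in documents.items():
        for term, tf in Counter(tokenize(text)).items():
            index.setdefault(term, {})[doc_id] = tf
    return index


def search(index, query):
    """Rank documents by the summed term frequency of the query terms."""
    scores = Counter()
    for term in set(tokenize(query)):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    # Highest score first; ties broken by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


documents = {
    "D1": "information retrieval seminars",
    "D2": "retrieval models and information retrieval",
    "D3": "information model",
}

index = build_index(documents)
print(index["retrieval"])  # {'D1': 1, 'D2': 2}
print(search(index, "What is information retrieval?"))
# [('D2', 3), ('D1', 2), ('D3', 1)]
```

Raw term-frequency sums are the simplest weighting that matches the slide; the "term weighting" step it mentions would normally replace them with a scheme that discounts terms appearing in every document, such as "information" here.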

Slide10

IR Research: Information Organization

Raw Data

Representation - NLP & Machine Learning

Organized Data

Query Formulation - “What is IR?”

Search Results - document groups

Slide11

IR Research: Natural Language Processing

- Goal
  - Understanding/effective processing of natural language
  - Not just pattern matching
- Lexical analysis using
  - Part-of-Speech (POS) tagging
  - Sentence parsing
- Research area, technique, tool for
  - Data Mining, Knowledge Discovery

Slide12

IR Research: Machine Learning

- Research area, technique, tool for
  - Information Organization, Data Mining, Knowledge Discovery
- Information Organization via
  - Supervised Learning (Automatic Classification)
  - Unsupervised Learning (Clustering)

[Figure: Classification and Clustering panels, each labeled Class 1 and Class 2]
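To make the supervised/unsupervised contrast concrete, the sketch below places a nearest-centroid classifier next to a plain k-means clusterer. Both techniques, the 2-D points, and every function name are assumptions chosen for illustration; the slides do not name a specific algorithm. The classifier is given class labels up front, while the clusterer sees only the points and has to find the groups itself.

```python
import math


def centroid(points):
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def train_classifier(labeled_points):
    """Supervised: compute one centroid per known class label."""
    by_label = {}
    for point, label in labeled_points:
        by_label.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_label.items()}


def classify(model, point):
    """Assign the label whose centroid is nearest to point."""
    labels = list(model)
    return labels[nearest(point, [model[label] for label in labels])]


def cluster(points, k, iterations=10):
    """Unsupervised: plain k-means, seeded with the first k points."""
    centers = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [nearest(p, centers) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old center if a cluster empties out
                centers[i] = centroid(members)
    return assignment


# Hypothetical data: two well-separated groups, interleaved.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.9, 1.1), (4.8, 5.1)]

# Classification: labels are supplied with the training data.
training = list(zip(points, ["Class 1", "Class 2"] * 3))
model = train_classifier(training)
print(classify(model, (1.1, 0.9)))  # Class 1
print(classify(model, (5.1, 5.0)))  # Class 2

# Clustering: no labels; groups are numbered 0 and 1 by the algorithm.
print(cluster(points, 2))  # [0, 1, 0, 1, 0, 1]
```

The cluster numbers carry no meaning of their own: clustering recovers the grouping, but naming the groups "Class 1" and "Class 2" still requires outside knowledge.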