IBE312: - PowerPoint Presentation

min-jolicoeur
Uploaded On 2016-04-21


Presentation Transcript

Slide1

IBE312: Information Architecture 2013

Ch. 9 – Metadata

Many of the slides in this slideset are reproduced and/or modified content from publicly available slidesets by Paul Jacobs (2012), The iSchool, University of Maryland, http://terpconnect.umd.edu/~psjacobs/s12/INFM700s12.htm.

These materials were made available and licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States license. See http://creativecommons.org/licenses/by-nc-sa/3.0/us/ for details.

Slide2

Metadata

“Data about data” - Definitional and descriptive documentation/information about data…

From Free On-line Dictionary of Computing:

Data about data. In data processing, meta-data is definitional data that provides information about or documentation of other data managed within an application or environment.

For example, meta-data would document data about data elements or attributes (name, size, data type, etc.) and data about records or data structures (length, fields, columns, etc.) and data about data (where it is located, how it is associated, ownership, etc.). Meta-data may include descriptive information about the context, quality and condition, or characteristics of the data.

(Some other definitions.)

Slide3

Metadata

Why do we need this?

Types of metadata
  Descriptive/subjective/content (e.g. author, subject, keywords, …)
  Administrative (e.g. owner, rights, cost, creation date, version, …)
  Technical (e.g. format, size, dependencies, programs)
  …

In practical terms:
  Metadata helps users locate, navigate, interpret content
  Metadata helps organizations manage content
  Metadata helps systems manipulate content

Slide4
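As a rough illustration of the three metadata types listed on the preceding slide, the sketch below groups them into a single record. The class names, field names and sample values are all assumptions made for illustration; none of them come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptive:
    """Describes the content itself (author, subject, keywords)."""
    author: str
    subject: str
    keywords: list = field(default_factory=list)


@dataclass
class Administrative:
    """Supports management of the content (owner, rights, version)."""
    owner: str
    rights: str
    version: str


@dataclass
class Technical:
    """Tells systems how to handle the content (format, size)."""
    file_format: str
    size_bytes: int


@dataclass
class MetadataRecord:
    descriptive: Descriptive
    administrative: Administrative
    technical: Technical


# Hypothetical record; every value here is illustrative.
record = MetadataRecord(
    Descriptive("A. Author", "Metadata", ["thesauri", "taxonomies"]),
    Administrative("Example Org", "personal use only", "1.0"),
    Technical("ppt", 1_000_000),
)

# Users locate content through descriptive metadata ...
found = "thesauri" in record.descriptive.keywords
# ... while systems branch on technical metadata.
needs_viewer = record.technical.file_format in {"ppt", "pdf"}
```

Keeping the three groups separate mirrors the "In practical terms" points: users mostly touch the descriptive part, organizations the administrative part, and systems the technical part.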

Data without Metadata…

Who: authored it? to contact about data?
What: are contents of database?
When: was it collected? processed? finalized?
Where: was the study done?
Why: was the data collected?
How: were data collected? processed? verified?

… can be pretty useless!

Slide5

Early Example of Metadata

Slide6

Menagerie of Terms

Classification
Hierarchies
Epistemology
Directories
Controlled vocabularies
Knowledge representation

Let’s focus on significant differences.
Let’s focus on advantages/disadvantages.
Let’s focus on how each is useful.

Slide7

Controlled Vocabulary

Any defined subset of natural language

List of equivalent terms (synonym rings)
  Use search logs.
List of preferred terms (authority files)
  Commonly also include variant terms
  Educating users, enabling browsing
  Term rotation (pointers in index), p. 201
Classification scheme / taxonomy
  Hierarchical relationships (narrower/broader)

Slide8

Controlled Vocabulary

Queries can be “exploded” to increase recall

Slide9
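The "exploded" query from the preceding slide can be sketched as follows: a search term is first mapped to its preferred term, then expanded with every narrower term beneath it, so that documents indexed at more specific levels are also retrieved. This is a minimal sketch; the vocabulary, the document tags and the function names are hypothetical.

```python
# Hypothetical controlled vocabulary: preferred term -> narrower terms.
NARROWER = {
    "computer": ["notebook", "desktop"],
    "notebook": ["ultraportable", "tablet pc"],
}

# Synonym ring entry: variant term -> preferred term.
PREFERRED = {"laptop": "notebook"}


def explode(term):
    """Return the preferred term plus all narrower terms below it."""
    start = PREFERRED.get(term, term)
    seen, stack = [], [start]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(NARROWER.get(current, []))
    return seen


def search(term, documents):
    """Match documents indexed with any term in the exploded query."""
    wanted = set(explode(term))
    return [doc_id for doc_id, tags in documents.items() if wanted & set(tags)]


# Hypothetical index: document id -> controlled-vocabulary tags.
documents = {
    "d1": ["notebook"],
    "d2": ["tablet pc"],
    "d3": ["desktop"],
}

hits = search("laptop", documents)
```

Without explosion, a search for "laptop" would miss the document tagged only "tablet pc"; with it, recall goes up at some cost in precision.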

Controlled Vocabulary

Authority file – inclusive: the preferred term can serve as the unique identifier for a collection of terms and can educate users

Slide10

Related Terms & Techniques

Taxonomies
  Anything organized in some sort of hierarchical structure
Tagging
  Adding almost any kind of metadata to content, but now often descriptive and user-provided
Thesauri
  Focus on relations between terms
  Focus on “concepts”
Ontologies
  Usually model a specific domain or part of the world
  Generally machine-readable

Increasing complexity and richness

Metadata
Taxonomies & Thesauri
Practical Uses

Slide11

How are taxonomies, tagging, controlled vocabularies and thesauri used?

The semantic gap: What’s the problem?

Synonymy – roughly, different words or phrases can be used to express similar ideas (e.g. “notebook”, “laptop”)
Polysemy – roughly, the same word can have different meanings (e.g. “line”: fishing, code, queue, …)

Taxonomies try to group similar concepts
“Tags” often assign words to concepts, making it easier to find related concepts
Controlled vocabularies avoid ambiguity (like a specific tag set)
Thesauri represent attempts to better organize mappings between words and concepts

Do these present precision or recall problems?

Slide12

Taxonomies

Organization of objects according to some principle

Familiar examples:
  Linnaean taxonomy (for living organisms)
  Web directories (e.g., Yahoo or ODP)
  Corporate directories
  Organization charts
  Organizational structures previously discussed

Slide13

Tagging, e.g. Flickr – popular tags

Slide14

Flickr – related tags

Slide15

Del.icio.us – related tags

Slide16

Thesauri: Motivation

“Semantic gap” between concepts and words

Online thesauri help map many synonyms or word variants onto one preferred term – improve precision in retrieval (p. 203)

Words are used to evoke concepts
  Concrete objects: MacBook Pro, iPhone
  Abstract ideas: freedom, peace

(Diagram labels: Concepts, Words, Ideas, Meaning)

Slide17

Thesauri

Book of synonyms, often including related and contrasting words and antonyms.

In this class: A controlled vocabulary in which equivalence, hierarchical, and associative relationships are identified for purposes of improved retrieval.

Technical lingo …

Thesauri standards: ISO 2788, …

Slide18

Thesauri Types

Slide19

IA Uses of Thesauri

For organization
For navigation
For indexing content
For searching

Slide20

Applying IA Principles

Focus on users and user needs – users are different, and have different models
Focus on content – concepts are different, too – different levels, words, complexity, vagueness

Examples:
  What’s the difference between laptop, PDA, phone, and convergence device?
  When is “cancer research” “oncology”?
  When a user browses a furniture catalog for chairs, do you show them ottomans and footstools?

Slide21

Standard Thesaurus Structure

(Diagram)
Broader term: Computer
Preferred term: Notebook (IS-A Computer)
Synonym (variant): Laptop (AKA Notebook)
Narrower terms: Desktop Replacement, Ultraportable, Tablet PC (IS-A Notebook)

Slide22

Semantic relationships in a thesaurus (pp. 204-205)

Abbreviations:
  PT – preferred term
  VT – variant term
  BT – broader term
  NT – narrower term
  RT – related term
  Use (U) – VT use PT
  Use For (UF) – full list of VT on the PT record
  Scope Note (SN) – meaning of the term, to rule out ambiguity

Slide23
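The abbreviations on the preceding slide map naturally onto one record per preferred term. The sketch below is one possible layout, not a standard schema: the `TermRecord` class, its field names and the scope-note text are assumptions, while the terms themselves come from the "Standard Thesaurus Structure" slide.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One preferred-term (PT) record in a thesaurus."""
    preferred: str                                 # PT
    use_for: list = field(default_factory=list)    # UF: variant terms (VT)
    broader: list = field(default_factory=list)    # BT
    narrower: list = field(default_factory=list)   # NT
    related: list = field(default_factory=list)    # RT
    scope_note: str = ""                           # SN


records = {
    "Notebook": TermRecord(
        preferred="Notebook",
        use_for=["Laptop"],
        broader=["Computer"],
        narrower=["Desktop Replacement", "Ultraportable", "Tablet PC"],
        # Illustrative scope note, written for this sketch.
        scope_note="Portable personal computer, not a paper notebook.",
    ),
}

# Build the "Use" (U) index: each variant term points at its preferred term.
use_index = {
    variant: rec.preferred
    for rec in records.values()
    for variant in rec.use_for
}


def resolve(term):
    """Map a variant term to its preferred term; other terms map to themselves."""
    return use_index.get(term, term)
```

Storing UF on the preferred-term record and deriving U from it keeps the two directions of the equivalence relationship consistent.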

Semantic relationships of a wine thesaurus, p. 206

Slide24

Some Real Examples

Content tagging and social media (e.g. Flickr, del.icio.us)
Special-purpose classification schemes and thesauri (e.g. Art & Architecture Thesaurus – AAT, UMLS)
General semantic tools and classification schemes (e.g., Princeton WordNet, Roget’s Thesaurus)

Slide25

Art & Architecture Thesaurus

http://www.getty.edu/research/conducting_research/vocabularies/aat/

Slide26

UMLS (Unified Medical Language System)

Source: National Library of Medicine (NIH)

3 Knowledge Sources, used separately or together:
  Metathesaurus – 1 million+ biomedical concepts from over 100 sources
  Semantic Network – 135 broad categories and 54 relationships between them
  SPECIALIST Lexicon + Tools – lexical information and programs for language processing

Slide27

E.g. UMLS (Unified Medical Language System)

Source: National Library of Medicine (NIH)

Began in 1986 as long-term R&D project
Designed for systems developers
Develop multi-purpose tools to enhance understanding of medical meaning across systems
Overcome barriers to effective retrieval of machine-readable information
Overcome variety of ways the same concepts are expressed in machine-readable and human language

Slide28

UMLS Uses

Source: National Library of Medicine (NIH)

Information retrieval
Thesaurus construction
Natural language processing
Automated indexing
Electronic health records (EHR)
Distribution mechanism for:
  HIPAA, CHI, PHIN regulatory standards
  SNOMED CT

Slide29

UMLS Metathesaurus

http://www.nlm.nih.gov/research/umls/

Slide30

UMLS Metathesaurus

http://www.nlm.nih.gov/research/umls/

Slide31

UMLS Thesaurus Browser

http://www.nlm.nih.gov/research/umls/

Slide32

Semantic Relationships

Equivalence (PT = VT)
Hierarchical: generic (Bird NT Magpie), whole-part (Foot NT Big toe) or instance (Seas NT Mediterranean Sea)
  Faceted / multiple hierarchies
Associative: related terms (Hammer RT Nail)
Preferred terms: form, selection, definition and specificity
Polyhierarchy (Medline cross-lists viral pneumonia under both ... Fig 9-25, p. 220)
Faceted classification – multiple taxonomies that focus on different dimensions of the content (e.g. wine.com, pp. 223-224)

Slide33

Associative Term

Slide34

Poly-Hierarchies

Concepts can have multiple parents

Example (diagram):
  Parents: Cracow (Poland : Voivodship); German death camps
  Concept: Auschwitz II-Birkenau (Poland : Death Camp)
  Children: Block 25 (Auschwitz II-Birkenau); Kanada (Auschwitz II-Birkenau)

From Shoah Foundation’s thesaurus of Holocaust terms

What are the advantages and disadvantages?
What’s the relationship to polysemy?

Slide35
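A polyhierarchy like the one on the preceding slide can be stored as a child-to-parents mapping in which a term may list more than one parent. The sketch below is a minimal illustration; the exact edges are an assumed reading of the slide's diagram (one concept filed under both a geographic parent and a categorical parent), and the function names are hypothetical.

```python
# child -> list of parents; a polyhierarchy allows more than one parent.
# The edges below are an assumed reading of the slide's diagram.
PARENTS = {
    "Auschwitz II-Birkenau": ["Cracow", "German death camps"],
    "Block 25": ["Auschwitz II-Birkenau"],
    "Kanada": ["Auschwitz II-Birkenau"],
}


def ancestors(term):
    """Collect every broader term reachable through any parent."""
    found = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS.get(parent, []))
    return found


def paths_to_roots(term):
    """List every browse path from a root term down to the given term."""
    parents = PARENTS.get(term, [])
    if not parents:
        return [[term]]
    return [path + [term] for p in parents for path in paths_to_roots(p)]


block_paths = paths_to_roots("Block 25")
```

One visible trade-off: a term with two parents is reachable by two browse paths, which helps findability but means a breadcrumb is no longer unique.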

Faceted Hierarchies

Alternative to single and poly-hierarchies

Basic idea:
  Describe objects along multiple facets
  Each facet has its associated hierarchy

Issues:
  What’s a facet?
  How do you navigate faceted hierarchies?

Slide36

Faceted Browsing Example

Slide37

Faceted Browsing Example

Demo: http://flamenco.berkeley.edu/demos.html

Slide38

Advantages of Facets

Integrates searching and browsing
Easy to build complex queries
Easy to narrow, broaden, shift focus
Helps users avoid getting lost
Helps to prevent “categorization wars”

Slide39

Relationship to IA?

(Diagram: Database, Application Server, Web Server, Network)

Ontologies are implicitly “hidden” here!!!

(Diagram: example ontology)
Concepts: Flight, Trip, Airplane, Equipment
Relationship: Part-of
Attributes: From, To, Departure Time, Arrival Time, Origin, Destination, Type, Capacity
Rule: Arrival Time is always after Departure Time
Rule: Distance from Origin to Destination is typically > 100 miles

Slide40
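The ontology on the preceding slide is usually "hidden" in application code as classes plus validation rules. The sketch below makes that explicit under several assumptions: which attributes belong to which concept is a guess (the transcript does not preserve the diagram layout), and the class, field and method names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Airplane:
    """Equipment used by a flight (assumed to carry Type and Capacity)."""
    plane_type: str
    capacity: int


@dataclass
class Flight:
    """A flight, assumed to own the time and place attributes."""
    origin: str
    destination: str
    departure_time: datetime
    arrival_time: datetime
    equipment: Airplane
    distance_miles: float

    def violations(self):
        """Return the slide's rules that this flight breaks."""
        problems = []
        # Hard rule: Arrival Time is always after Departure Time.
        if self.arrival_time <= self.departure_time:
            problems.append("arrival must be after departure")
        # Soft rule: distance is typically > 100 miles, so only warn.
        if self.distance_miles <= 100:
            problems.append("warning: unusually short flight")
        return problems


# Illustrative values only; airport codes and aircraft are placeholders.
plane = Airplane("ExampleJet", 150)
ok = Flight("AAA", "BBB", datetime(2013, 1, 1, 9, 0),
            datetime(2013, 1, 1, 10, 0), plane, 250.0)
bad = Flight("AAA", "BBB", datetime(2013, 1, 1, 9, 0),
             datetime(2013, 1, 1, 8, 0), plane, 50.0)
```

The two rules are treated differently on purpose: "always" becomes a hard constraint, while "typically" becomes a warning, since a short flight is unusual rather than invalid.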

Putting it all together…

Two-Layer Architecture: Database, Web Server, Network
Three-Layer Architecture: Database, Application Server, Web Server, Network

Apache
MySQL
PHP

Slide41

Popular Implementation

(Diagram labels: Content, Metadata, Presentation, SQL Database, PHP/HTML)

Slide42

Content
Presentation

(Diagram: hierarchy of nodes A, B, C, D, E, F, G, H)

You are here: A > C > D
Contents at D
Related
- D
- E

Hierarchy(child, parent)
Content(id, attribute1, attribute2, attribute3, …)

Slide43
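The `Hierarchy(child, parent)` and `Content(id, …)` tables from the preceding slide are enough to render the "You are here" trail, the contents of a node and a "Related" list. The sketch below uses Python's built-in `sqlite3` module in place of the MySQL/PHP stack the slides mention; the edges beyond A > C > D, the `node` and `title` columns, and all function names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Hierarchy (child TEXT PRIMARY KEY, parent TEXT)")
conn.execute(
    "CREATE TABLE Content (id INTEGER PRIMARY KEY, node TEXT, title TEXT)"
)

# Hypothetical edges; the slide only fixes the path A > C > D.
conn.executemany(
    "INSERT INTO Hierarchy VALUES (?, ?)",
    [("A", None), ("B", "A"), ("C", "A"), ("D", "C"), ("E", "C")],
)
conn.executemany(
    "INSERT INTO Content (node, title) VALUES (?, ?)",
    [("D", "First page at D"), ("D", "Second page at D"), ("E", "Page at E")],
)


def breadcrumb(node):
    """Walk parent links up to the root and render e.g. 'A > C > D'."""
    trail = []
    while node is not None:
        trail.append(node)
        row = conn.execute(
            "SELECT parent FROM Hierarchy WHERE child = ?", (node,)
        ).fetchone()
        node = row[0] if row else None
    return " > ".join(reversed(trail))


def contents_at(node):
    """Titles of the content items filed under a node."""
    rows = conn.execute(
        "SELECT title FROM Content WHERE node = ? ORDER BY id", (node,)
    )
    return [title for (title,) in rows]


def related(node):
    """Nodes sharing the same parent, as one possible source for 'Related'."""
    rows = conn.execute(
        "SELECT child FROM Hierarchy "
        "WHERE parent = (SELECT parent FROM Hierarchy WHERE child = ?) "
        "ORDER BY child",
        (node,),
    )
    return [child for (child,) in rows]
```

Because the presentation layer only queries these tables, the hierarchy can be reorganized without touching the content rows themselves.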

Faceted Browsing

Matching Results

Filter by
- Facet1 (possible values)
- Facet2 (possible values)

Hierarchy(child, parent)
Content(id, attribute1, attribute2, attribute3, …)

Slide44
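The faceted browsing layout on the preceding slide ("Matching Results" plus "Filter by" lists) comes down to two operations over the content attributes: filtering by the selected facet values, and counting the values still available in the remaining results. The sketch below is a minimal in-memory version; the wine records, facet names and function names are invented for illustration.

```python
from collections import Counter

# Hypothetical content records; each key besides "id" is a facet.
ITEMS = [
    {"id": 1, "type": "red", "region": "France", "price": "under $20"},
    {"id": 2, "type": "red", "region": "Italy", "price": "$20-$50"},
    {"id": 3, "type": "white", "region": "France", "price": "under $20"},
    {"id": 4, "type": "white", "region": "Italy", "price": "under $20"},
]


def filter_items(items, selections):
    """Keep items that match every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selections.items())
    ]


def facet_counts(items, facet):
    """Count items per facet value, as shown next to each filter link."""
    return Counter(item[facet] for item in items)


# The user picks one facet value, then sees what can still be narrowed.
matches = filter_items(ITEMS, {"type": "red"})
remaining = facet_counts(matches, "region")

# Adding a second selection narrows further; removing one broadens again.
cheap_whites = filter_items(ITEMS, {"type": "white", "price": "under $20"})
```

Recomputing the counts from the current result set is what lets users narrow, broaden or shift focus without running into empty result pages.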

Summary

Meta-data
  General function
  Types of meta-data
Taxonomies and Thesauri
  Role in organizing, navigating and searching content
  General-purpose taxonomies
  Special-purpose taxonomies
Practical use & implementation
