Slide1
Thesauri, interoperability and the role of ISO 25964
Stella G Dextre Clarke
Project Leader, ISO NP 25964
Chair, ISKO UK
stella@lukehouse.org
Slide2
Summary
- Brief thesaurus chronology
- What role does the thesaurus have now?
- The demand for interoperability
- Highlights from ISO 25964
Slide3
Thesauri – a brief chronology
- Once upon a time, thesauri were at the cutting edge of Information Retrieval (IR) technology
- Heyday in the 1960s and 1970s; popularity declined after the mid-1980s
- ISO 2788 and ISO 5964 (for monolingual and multilingual thesauri respectively) came out between 1974 and 1986
- Internet/intranets in the 1990s brought resurgence and diversification (into other forms of controlled vocabulary, such as “taxonomies”)
- TREC (1992 onwards) has shown the dominance of statistical methods in IR. But stats alone are not enough!
- At the turn of the century, thesauri were back in fashion and work began on refurbishing the British and International standards
- Semantic Web and SKOS developments provide more incentive
- Today, even Google employs some “taxonomists”
Slide4
Slide unearthed from TR’01 (2001): The thesaurus coming back into fashion!
Slide5
Slide6
Slide7
Slide8
The role of controlled vocabularies today
- Needed where full text is not available, e.g. image libraries and audio resources
- Invaluable for crossing language barriers
- Especially useful in-house, where the page rank algorithms are less effective
- Essential to access vast databases and catalogues of bibliographic data from decades past
- Provide added value in combination with other methods, often hidden behind the scenes
- In all these contexts, interoperability is key.
Slide9
Introducing ISO 25964
ISO 25964: Thesauri and interoperability with other vocabularies
- Part 1: Thesauri for information retrieval
- Part 2: Interoperability with other vocabularies
- It updates ISO 2788 and ISO 5964
  - based on BS 8723, with much reworking
- Part 1, published in August 2011, covers monolingual and multilingual thesauri
- Part 2, to be published in January 2013, covers mapping between thesauri and other types of vocabulary
- Information retrieval seen as main application, including indexing as well as searching
Slide10
What does “interoperability” mean?
- Definition: ability of two or more systems or components to exchange information and to use the information that has been exchanged.
- In the case of thesauri and other knowledge organization systems (KOS), broadly speaking interoperability applies at more than one level:
  - presenting data in a standard way to enable import and use in other systems (ISO 25964 Part 1)
  - providing mappings between the terms/concepts of one KOS and those of another (ISO 25964 Part 2)
  - plus any other type of exchange between one KOS and another (ISO 25964 Part 2)
Slide11
Linked Data Cloud in 2011
- Richard Cyganiak and Anja Jentzsch; see http://lod-cloud.net/
Slide12
A simplified view of interoperability
My thesaurus
Slide13
Interoperability between vocabularies (see ISO 25964-2)
My thesaurus
Your thesaurus
GEMET
AGROVOC
LCSH
Dewey
WordNet
Slide14
Interoperability between applications (see ISO 25964-1)
Vocabulary management software
indexing/tagging software
search/browsing software
Slide15
Content of ISO 25964-1, supporting interoperability between applications
- thesaurus content and construction, mono- or multi-lingual (i.e. a complete update of ISO 2788 and ISO 5964)
- guidance on applying facet analysis to thesauri
- guidance on managing thesaurus development and maintenance
- functional requirements for software to manage thesauri
- a data model and derived XML schema
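
To make the last point concrete, here is a minimal sketch of how a shared data model with a derived XML serialization lets one application export thesaurus data and another import it. It is not the ISO 25964-1 data model or its XML schema: the class `Concept`, its fields, and the element names (`thesaurus`, `concept`, `preferredTerm`, `nonPreferredTerm`, `broader`, `related`) are all hypothetical, invented for illustration only.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus concept (hypothetical, much simpler than ISO 25964-1)."""
    identifier: str
    preferred_term: str
    non_preferred_terms: list[str] = field(default_factory=list)
    broader: list[str] = field(default_factory=list)  # ids of broader concepts
    related: list[str] = field(default_factory=list)  # ids of related concepts


def to_xml(concepts: list[Concept]) -> str:
    """Export concepts to a toy XML format (NOT the ISO 25964 schema)."""
    root = ET.Element("thesaurus")
    for concept in concepts:
        node = ET.SubElement(root, "concept", id=concept.identifier)
        ET.SubElement(node, "preferredTerm").text = concept.preferred_term
        for term in concept.non_preferred_terms:
            ET.SubElement(node, "nonPreferredTerm").text = term
        for target in concept.broader:
            ET.SubElement(node, "broader", ref=target)
        for target in concept.related:
            ET.SubElement(node, "related", ref=target)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> list[Concept]:
    """Import concepts from the same toy XML format."""
    root = ET.fromstring(text)
    return [
        Concept(
            identifier=node.get("id"),
            preferred_term=node.findtext("preferredTerm"),
            non_preferred_terms=[n.text for n in node.findall("nonPreferredTerm")],
            broader=[n.get("ref") for n in node.findall("broader")],
            related=[n.get("ref") for n in node.findall("related")],
        )
        for node in root.findall("concept")
    ]


# Illustrative data only: two concepts, the second narrower than the first.
sample = [
    Concept("c1", "Roads"),
    Concept("c2", "Streets", broader=["c1"]),
]
exported = to_xml(sample)       # what a vocabulary management tool might write
imported = from_xml(exported)   # what an indexing or search tool might read
print(imported == sample)
```

For real exchange of thesaurus data, the published schema at http://www.niso.org/schemas/iso25964/ is the thing to implement; the sketch only shows why an agreed structure matters.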
Slide16
Slide17
Content of ISO 25964-2, supporting interoperability between vocabularies
- Models for mapping
- Guidelines for mapping
- Recommendations on mapping types
- How to handle pre-coordination
- Mapping to vocabularies other than thesauri:
  - classification schemes
  - file plans (classification schemes used for records management)
  - taxonomies
  - subject heading schemes
  - ontologies
  - terminologies
  - name authority lists
  - synonym rings
- Brief guidance on handling mappings data
Slide18
Recommended “Models for mapping”
[Diagram: mapping models, with vocabularies labelled A to H and P to S]
Slide19
What does “mapping” mean?
- Definition: process of establishing relationships between the concepts of one vocabulary and those of another
- Recommended types of mapping are based on the standard internal relationship types, basically: equivalence, hierarchical and associative
- Greater differentiation of mapping types is allowed, but is optional, to avoid complexity in simple applications
Slide20
Full range of ISO 25964-2 mapping types
Basic mapping types:
- Equivalence
  - Simple
  - Compound
    - Intersecting compound equivalence
    - Cumulative compound equivalence
- Hierarchical
  - Broader
  - Narrower
- Associative
Simple equivalence can be marked as “Exact” or “Inexact”
Slide21
Full range of ISO 25964-2 mapping types with examples
Basic mapping types:
- Equivalence
  - Simple: Laptop computers EQ Notebook computers
  - Compound
    - Intersecting compound equivalence: Women executives EQ Women + Executives
    - Cumulative compound equivalence: Inland waterways EQ Rivers | Canals
- Hierarchical
  - Broader: Streets BM Roads
  - Narrower: Roads NM Streets
- Associative: e-Learning RM Distance education
Exact equivalence: Aubergines =EQ Egg-plants
Inexact equivalence: Horticulture ~EQ Gardening
Slide22
The joys of pre-coordination
- Examples:
  - 599.742.71(084.12) photographs of lions (from UDC)
  - Automobiles--Air conditioning--Maintenance and repair (from LCSH)
- Occurs characteristically in subject heading schemes, classification schemes, taxonomies and file plans
- Mapping pre-coordinated concepts obliges use of the more complicated mapping types, especially compound equivalence
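
A rough sketch of why pre-coordination pushes mappings towards compound equivalence: the LCSH-style heading above is split into its components, and an intersecting compound equivalence is proposed only if every component exists in the target thesaurus. The functions `split_heading` and `propose_mapping`, and the contents of the target thesaurus, are hypothetical; string matching like this can at best suggest candidates for a human mapper to review.

```python
from __future__ import annotations


def split_heading(heading: str, separator: str = "--") -> list[str]:
    """Split a pre-coordinated heading into its component parts."""
    return [part.strip() for part in heading.split(separator) if part.strip()]


def propose_mapping(heading: str, target_terms: set[str]) -> str | None:
    """Propose a mapping statement, or return None if any component is missing.

    Uses the notation from the slides: EQ for equivalence and + to join the
    components of an intersecting compound equivalence.
    """
    lookup = {term.lower(): term for term in target_terms}
    matched = []
    for part in split_heading(heading):
        term = lookup.get(part.lower())
        if term is None:
            return None  # no candidate: leave this heading to a human mapper
        matched.append(term)
    return f"{heading} EQ " + " + ".join(matched)


# Hypothetical target thesaurus, assumed to hold the three component concepts.
target = {"Automobiles", "Air conditioning", "Maintenance and repair"}
print(propose_mapping("Automobiles--Air conditioning--Maintenance and repair", target))
```

A heading with a single component would come out as a simple equivalence; anything with two or more components needs the compound form.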
Slide23
Vocabularies other than thesauri
- ISO 25964 is a standard for thesauri; it does not attempt to standardize other types of KOS. It guides only on interoperability between thesauri and other types of KOS.
- The clause on each KOS type presents:
  - Key characteristics of the KOS (non-normative)
  - Semantic components/relationships (non-normative)
  - Recommendations for interoperability between the KOS and a thesaurus, especially mapping (normative)
Slide24
Vocabularies other than thesauri
The following are dealt with in ISO 25964:
- classification schemes
- file plans (classification schemes used for records management)
- taxonomies
- subject heading schemes
- name authority lists
- synonym rings
- terminologies
- ontologies
Slide25
General prospects for mapping
- thesauri: mapping relatively straightforward
- classification schemes, file plans, taxonomies, subject heading schemes: concept mapping useful in IR, pre-coordination common
- name authority lists: mapping usually straightforward but common concepts few
- synonym rings, terminologies, ontologies: concept mapping rarely useful; complementary uses are a more likely prospect
Slide26
Ontologies are special…
- Definition of ontology excludes “lightweight” examples such as thesauri and classification schemes
- The Gruber/Studer definition is adopted, and interpreted broadly enough to admit OWL-based examples such as ORE and FOAF.
- Mapping between ontologies and thesauri is not recommended.
- Interoperability recommendations focus on use cases such as reengineering a thesaurus as an ontology, and complementary use of thesaurus with ontology.
Slide27
Simple ontology illustration
(credit: Jutta Lindenthal; see http://www.jlindenthal.de/IID/2012/Kurs_2012.htm)
Slide28
Structural comparison
- The illustration is used in ISO 25964 to draw out key similarities and differences between ontologies and thesauri.
- The aim is to encourage emerging applications in which thesauri and ontologies can usefully interoperate.
Slide29
Interoperability at the level of standards
SKOS
ISO 25964
OWL
RDF
XML
SRU
Z39.19
MARC 21
REST
HTTP
BS 8723
ZThes
SPARQL
JSON
ISO 2709
Z39.50
Slide30
Dextre Clarke and Zeng, 2012. http://www.niso.org/publications/isq/2012/v24no1/clarke/
Slide31
The thesaurus coming back into fashion…
Slide32
…although often hidden behind the scenes
Slide33
And interoperability makes new tricks easier…
Slide34
Want a copy of the standards?
- Download Part 1 from ISO at http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53657
- Part 2 will be in the ISO catalogue next year
- Order from your national standards body (e.g. BSI, DIN, ANSI, AFNOR)
- Some public/academic reference libraries stock them
- ISO standards are not cheap to purchase
- However, the data model and XML schema for exchange of thesaurus data are available online without charge or password control. Go to http://www.niso.org/schemas/iso25964/
Slide35
Some extra slides with more detail
APPENDIX
Slide36
Who is involved in developing the standard?
A Working Group (WG8), under the ISO subcommittee known as ISO TC46/SC9, has drafted the standard.
WG8 has members from 15 countries.
The WG8 Secretariat is provided by NISO in the USA.
Currently active members of WG8 include:
- Johan De Smedt
- Marianne Lykke
- Stella Dextre Clarke (Leader)
- Esther Scheven
- Michèle Hudon
- Douglas Tudhope
- Daniel Kless
- Leonard Will
- Jutta Lindenthal
- Marcia Lei Zeng
Slide37
Intersecting versus cumulative equivalence
Slide38
Mapping example from a pre-coordinated concept: inland waterway transport
Inland waterway transport EQ transport + (rivers | canals)
Image: The Rialto Bridge, Venice, by Michele Marieschi. © Bridgeman Education
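
The right-hand side of the mapping above, `transport + (rivers | canals)`, can be turned into a search filter. The sketch below assumes that `+` (intersecting) is read as Boolean AND and `|` (cumulative) as Boolean OR, with `+` binding more tightly than `|` when no parentheses are given; that precedence rule, and the function names `tokenize`, `parse` and `matches`, are assumptions made for this example.

```python
import re


def tokenize(expression):
    """Split a mapping expression into operators, parentheses and term names."""
    tokens = []
    for match in re.finditer(r"[()+|]|[^()+|]+", expression):
        token = match.group().strip()
        if token:
            tokens.append(token)
    return tokens


def parse(tokens):
    """Build a tree: a term string, ('and', left, right) or ('or', left, right)."""
    pos = 0

    def factor():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = union()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        if token in {")", "+", "|"}:
            raise ValueError(f"unexpected token {token!r}")
        return token

    def intersection():
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("and", node, factor())
        return node

    def union():
        nonlocal pos
        node = intersection()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            node = ("or", node, intersection())
        return node

    tree = union()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


def matches(tree, index_terms):
    """Test a document's index terms (a set of lowercase strings) against a tree."""
    if isinstance(tree, str):
        return tree.lower() in index_terms
    operator, left, right = tree
    if operator == "and":
        return matches(left, index_terms) and matches(right, index_terms)
    return matches(left, index_terms) or matches(right, index_terms)


tree = parse(tokenize("transport + (rivers | canals)"))
print(tree)
print(matches(tree, {"transport", "canals"}))
```

Under these assumptions, a search for "Inland waterway transport" in the source vocabulary retrieves items indexed in the target vocabulary with transport together with either rivers or canals.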