Slide1
Persistent identifiers – an Overview
Juha Hakala
The National Library of Finland
2011-02-01
Slide2
Traditional identifiers
Traditional (bibliographic) identifiers are systems like ISBN (International Standard Book Number) which provide unique and persistent identification for certain types of resources (books, serials, etc.)
They were designed for printed resources before the Internet was invented; thus the match with the digital resources and the Web may be a forced one
These identifiers are well established international standards with relatively clear roles
Not always clear how to apply them to the e-resources, except that identified resources themselves should be persistent
Slide3
Persistent identifiers (PIDs)
A new category of identifiers which are actionable on the Internet, that is, they enable persistent linking (resolution) to the resource or a surrogate such as a bibliographic description of the resource
Most PIDs are also “traditional” identifiers
When using a DOI, one can identify a book with a DOI & an embedded ISBN, or a DOI with a local ID string
URN is the only exception to this; URNs must include a traditional identifier
URN namespaces inherit the rules of the traditional identifier used; there is no need to discuss the scope of the URN itself
Slide4
Traditional versus persistent identifiers
Assigning a traditional identifier such as ISBN is (should be?) a controlled process with precise rules
What is identified, by whom
Assigning a PID such as ARK may or may not be a controlled process and the rules of application may be vague
Sometimes the rules are different:
A book must have just one ISBN, but it may have two PIDs (for instance, ARK and DOI)
The National Library of Finland uses Handles in its DSpace system, but URN is the “official” identifier of these resources
Slide5
Recommendations
Conflicts between the two identifier groups should be avoided at all costs
If a traditional identifier can be assigned to the resource, use that identifier as a part of the PID
It follows that PIDs that cannot (easily) incorporate traditional identifiers may cause problems
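As a minimal sketch of using a traditional identifier as part of a PID, the Python below checks an ISBN-13 and embeds it in a URN. The `urn:isbn:` prefix is the registered URN namespace for ISBNs; the function names and the choice to normalise the ISBN to bare digits are assumptions made for this illustration, not part of any standard.

```python
def isbn13_check_digit(first12: str) -> int:
    """Return the ISBN-13 check digit for the first 12 digits."""
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check length, digits and check digit; hyphens and spaces are ignored."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


def isbn_to_urn(isbn: str) -> str:
    """Embed a valid ISBN-13 in a URN (digits-only form is an assumption)."""
    if not is_valid_isbn13(isbn):
        raise ValueError(f"not a valid ISBN-13: {isbn!r}")
    digits = isbn.replace("-", "").replace(" ", "")
    return "urn:isbn:" + digits


example_urn = isbn_to_urn("978-0-306-40615-7")
```

Because the PID is derived from the ISBN, the rules governing the ISBN (what is identified, by whom) carry over to the PID without a separate policy.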
Any identifier (traditional / PID) should have explicit implementation guidelines
If no general guidelines exist, rules must be developed locally; such rules should eventually be aligned at the level of the PID community
Slide6
Persistent identifiers and the Web: Cool URIs
From the library point of view, cool URIs (URLs) are not proper identifiers at all
The same resource may be available from many URLs
Over time, different resources or variant versions of the same resource may be available at the same URI
There is absolutely no control over cool URI assignment
A user cannot know if a URI is cool or not (most of them aren’t)
Instead, cool URIs are just shelf marks
What is a realistic time frame for cool URI persistence?
Cool URIs can support only resolution; persistent identifiers can be more versatile in this respect
Match with the current / future long-term preservation systems
Slide7
Services provided by PIDs
Basic question: what services do we need?
Some examples:
Find all locations (URLs) related to the PID
Find bibliographic metadata related to the PID
Retrieve the preservation commitment of the owning organization (concerning the resource at hand)
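The three example services can be sketched as a small in-memory resolver. Everything here is hypothetical: the class and method names, the record fields and the sample data are assumptions for illustration, and no existing PID system is claimed to expose this interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PidRecord:
    """What a hypothetical resolver knows about one identified resource."""
    locations: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)
    preservation_commitment: Optional[str] = None


class Resolver:
    """In-memory stand-in for a PID resolution service (illustrative only)."""

    def __init__(self) -> None:
        self._records: Dict[str, PidRecord] = {}

    def register(self, pid: str, record: PidRecord) -> None:
        self._records[pid] = record

    def _lookup(self, pid: str) -> PidRecord:
        if pid not in self._records:
            raise LookupError(f"unknown PID: {pid}")
        return self._records[pid]

    def find_locations(self, pid: str) -> List[str]:
        """Service 1: all locations (URLs) related to the PID."""
        return list(self._lookup(pid).locations)

    def find_metadata(self, pid: str) -> Dict[str, str]:
        """Service 2: bibliographic metadata related to the PID."""
        return dict(self._lookup(pid).metadata)

    def preservation_commitment(self, pid: str) -> Optional[str]:
        """Service 3: the owning organization's commitment, if stated."""
        return self._lookup(pid).preservation_commitment


# Sample data; the URLs and the commitment text are made up.
resolver = Resolver()
resolver.register(
    "urn:isbn:9780306406157",
    PidRecord(
        locations=["https://example.org/book.pdf", "https://mirror.example.org/book.pdf"],
        metadata={"title": "Example title"},
        preservation_commitment="permanent (hypothetical policy text)",
    ),
)
```

One identifier maps to several locations here, which is the versatility a single URL cannot offer: the resolver can answer different questions about the same PID.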
There is no overall framework / context within which to design the resolution services
Each PID provides a slightly different set
Slide8
PID-based services in the future
Theoretical basis could be twofold:
Functional Requirements for Bibliographic Records (FRBR) model: work, expression, manifestation
Current theory and practice of long-term preservation based on the migration strategy (and a long tail of manifestations for each work)
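A rough sketch of how the FRBR levels and the long tail of manifestations might be modelled, with one PID per entity. The class names, fields, `urn:example:` identifiers and dates are all placeholders invented for this illustration; FRBR itself does not prescribe any such data structure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Manifestation:
    pid: str
    file_format: str
    created: date  # when this manifestation was produced, e.g. by migration


@dataclass
class Expression:
    pid: str
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    pid: str
    title: str
    expressions: List[Expression] = field(default_factory=list)


def all_manifestations(work: Work) -> List[Manifestation]:
    """Collect the manifestations of every expression of the work."""
    return [m for e in work.expressions for m in e.manifestations]


def oldest_manifestation(work: Work) -> Manifestation:
    """The earliest preserved manifestation (closest to the original)."""
    return min(all_manifestations(work), key=lambda m: m.created)


def latest_manifestation(work: Work) -> Manifestation:
    """The most recent manifestation, e.g. the newest migration result."""
    return max(all_manifestations(work), key=lambda m: m.created)


# Placeholder data: one work, two expressions, three manifestations.
work = Work(
    pid="urn:example:work-1",
    title="Interview data set",
    expressions=[
        Expression(
            pid="urn:example:expr-fi",
            language="fi",
            manifestations=[
                Manifestation("urn:example:man-fi-sql", "SQL dump", date(2005, 1, 10)),
                Manifestation("urn:example:man-fi-xlsx", "Excel table", date(2010, 6, 1)),
            ],
        ),
        Expression(
            pid="urn:example:expr-en",
            language="en",
            manifestations=[
                Manifestation("urn:example:man-en-csv", "CSV", date(2011, 1, 15)),
            ],
        ),
    ],
)
```

Each migration adds a manifestation rather than replacing one, so the list under an expression only grows; queries such as `oldest_manifestation(work)` then become simple traversals from the work-level PID.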
This means it must be possible, for instance, to:
Find all works related to the work at hand
Find all expressions related to the work at hand
Find all manifestations of the work at hand
Find out differences between these manifestations
Slide9
PID-based services in the future (2)
It should also be possible to
Find out who is preserving the resource
Retrieve the rights metadata related to the resource
Retrieve the preservation metadata related to the resource
Retrieve the most original version (the oldest preserved manifestation) of the resource
Retrieve the latest (and supposedly the easiest to use) manifestation of the resource
…
Slide10
Example: qualitative social scientific data set
The work itself should be described; one metadata element should be the PID
Expressions (translations to other languages) should have their own PIDs, linked to the work level record
There may be multiple manifestations (relational database, Excel table, etc.) of each expression; each one should have its own PID, and there should be links to the work / expressions
In this environment, it would make sense to provide links to the work, and let the users choose the most appropriate manifestation
Choice of the language, file format, etc.
Slide11
Recommendations (2)
Services supported by PID systems need a face lift
Many systems were designed 10+ years ago, when digital object management systems were still in their infancy
Upgrades must be done in a non-destructive manner (existing implementations must be compliant with the new version)
All aspects of PID systems should be standardized
Some PIDs (e.g. ARK and PURL) have never reached a standard status, and at best only one part of the system (identifier syntax) has been published as a standard
More (and better) open source implementations are needed
Slide12
Conclusion
There will be multiple PIDs in existence in the future (just like there are now)
Once a system has been chosen, you cannot give it up
PID supporters and cool URI proponents will most likely continue talking past one another for quite some time, but:
Given the time frame the national libraries & archives must preserve resources (centuries) and the technical complexity of this task, cool URIs fall short of the requirements in several ways; instead, PIDs must be used
PID systems are to some extent “work in progress”