
Search to Discovery: Finding Global Scholarly Resources with Primo - PowerPoint Presentation

Uploaded On 2015-11-03




Presentation Transcript

Slide1

Search to Discovery: Finding Global Scholarly Resources with Primo

Pascal Calarco & Alison Hitchens, Library

December 6, 2011

Slide2

Agenda

The state of search in libraries (Pascal)

Expanding Primo beyond the local catalogue (Alison)

Questions

2011

Slide3

Library Information Systems: Milestones

Discovery

Metasearch

Citation Linking

ILS 3rd gen (Client-server; 1990s)

ILS 2nd gen (Mainframe; 1980s)

OCLC (library network; 1972)

Early systems MARC

[Timeline axis: 1960, 1980, 1990, 2000, 2010]

Slide4

In the beginning, there was the card catalog (1901+)

Indexes:

Subject

Author

Title

Interfiled cards, call number access

Slide5

Library of Congress National Union Catalog (pre-1956)

Slide6

Henriette Avram, Developer of MARC

Programmer/analyst at Library of Congress

Developed system for printing card catalog information (MARC)

ISO certification 1973

Slide7

Later, there was the Online Public Access Catalog (OPAC)

Machine Readable Cataloging (MARC)

Inventory of the print/physical holdings of a library

Better than the card catalog; keyword searching & Boolean functionality

Non-intuitive; required training or intermediation (information professional)

Limited generally to single library

Slide8

Library networks & resource sharing

Slide9

Print to Electronic

Slide10

Now: Electronic Almost Ubiquitous

85%+ of journal literature digital

Hundreds of specialized scholarly databases

Mass print book digitization efforts

Electronic books going mainstream

Aggregated meta-indexes: 750 million metadata records for journal/newspaper articles

Slide11

Goal: improve user experience

Users want to FIND not search

Source required information to user regardless of format or location

Leverage our knowledge of academic community @ uWaterloo

Integrate into key services: LMS, CMS, other library services

Slide12

Database Content Silos

Content Silos

System Silos

Catalog

ILL

Meta-search

eReserve

Website

Science-Direct

Web of Science

ETDs

EEBO

JSTOR

Slide13

Metasearch: an interim step

aka Federated Search; emerged 2003

Distributed search from one interface via web services, SOAP/XML gateways

Idiosyncratic and slow; vendors implemented variously

Relevancy of merged results problematic
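The trade-offs listed for metasearch can be illustrated with a minimal Python sketch. The target names (`catalogue`, `journal_db`), delays and scores below are hypothetical stand-ins, not real vendor gateways. Each target is queried in parallel, so total latency follows the slowest target, and the merged list is hard to rank because each target scores on its own scale.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical remote targets: (delay in seconds, results as (title, native score)).
# Each target scores on its own scale, so scores are not comparable across targets.
TARGETS = {
    "catalogue": (0.01, [("Book A", 0.92), ("Book B", 0.40)]),
    "journal_db": (0.05, [("Article X", 870.0), ("Article Y", 12.0)]),
}

def search_target(name, query):
    """Simulate one remote search; a real gateway would issue a network request."""
    delay, hits = TARGETS[name]
    time.sleep(delay)  # stands in for network and remote processing time
    return [(name, title, score) for title, score in hits]

def federated_search(query):
    """Send the query to every target in parallel and concatenate the results."""
    with ThreadPoolExecutor(max_workers=len(TARGETS)) as pool:
        futures = [pool.submit(search_target, name, query) for name in TARGETS]
        merged = []
        for future in futures:
            merged.extend(future.result())  # blocks until that target answers
    return merged

def naive_merge(results):
    """Sort by raw score: targets with larger score scales dominate the list."""
    return sorted(results, key=lambda r: r[2], reverse=True)

results = naive_merge(federated_search("discovery"))
for source, title, score in results:
    print(source, title, score)
```

In this sketch every `journal_db` hit outranks every `catalogue` hit purely because of its score scale, which is one way merged relevancy can go wrong.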

Slide14

Problems with catalog searching & evolution to discovery

UCLA & Berkeley: information retrieval & user behavior (1986-1996)

Google Books: “digitize the world’s knowledge” (2002)

Karen Schneider, Andrew Pace, Roy Tennant: “The OPAC ‘Sucks’” (2002)

Next generation catalogs -> Discovery (2008+)

Slide15

Catalogs: Information Science Research

Christine L. Borgman (1986) “Why are online catalogs hard to use? Lessons learned from information retrieval studies” Journal of the American Society for Information Science

Ray R. Larson (1991) “The decline of subject searching: Long-term trends and patterns of index use in an online catalog” Journal of the American Society for Information Science

Ray R. Larson (1992) “Evaluation of advanced retrieval techniques in an experimental online catalog” Journal of the American Society for Information Science

Ray R. Larson (1996) “Cheshire II: designing a next-generation online catalog” Journal of the American Society for Information Science

Christine L. Borgman (1996) “Why are online catalogs still hard to use?” Journal of the American Society for Information Science

Slide16

How Users Search: What We’ve Learned

Most people make typos at least some of the time

Most searches are 2, 3, 4 words with no Boolean operators

Most searches use keyword

Search is hesitant, iterative, often random process of discovery

Most people start elsewhere

Few read help screens

Few use advanced search – this is true even in Google

Slide17

The Google Effect

Expectations for web search tools now:

Radically simplified UI, fast results

Aggregated content

Relevant results on first page

Natural Language queries

Spelling correction/adaptation
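As a rough illustration of the spelling-correction expectation, here is a minimal "did you mean?" sketch using Python's standard `difflib` module. The vocabulary and the `did_you_mean` helper are hypothetical; a production search engine would draw candidate terms from its own index and would likely use a different matching technique.

```python
import difflib

# Hypothetical vocabulary; a real system would draw terms from its own index.
INDEX_TERMS = ["catalogue", "discovery", "library", "metadata", "primo"]

def did_you_mean(query, vocabulary=INDEX_TERMS, cutoff=0.7):
    """Return a corrected query, or None if no word needed or found a correction."""
    corrected = []
    changed = False
    for word in query.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # Closest vocabulary term by similarity ratio, if any clears the cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        if matches:
            corrected.append(matches[0])
            changed = True
        else:
            corrected.append(word)
    return " ".join(corrected) if changed else None

suggestion = did_you_mean("libary discovry")
print(suggestion)
```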

Slide18

The OPAC “Sucks”

The OPAC lacks common features of most search engines

Relevance ranking vs. last in, first out

Spell checking (related - did you mean?)

Popular query operators like + and –

Refine search

Sort flexibility

Faceting

Citation indexing vs full text

Developed for print materials, limitations with electronic materials or atomized items (like articles)

Difficult for certain known item search

Slide19

Industry Trends

Decouple the front end (search and discovery) from the back end (inventory and cataloguing)

Service Oriented Architecture – many programs loosely coupled

Cloud services -- SaaS

The 5th generation of library business systems emerging now – hosted, cloud solutions

Slide20

Discovery Characteristics

Enhanced Search Functionality

Faceted browse

Relevance ranking

“Did you mean?” / Spell checking: auto-correction, resubmit search

Content aggregation

Integrating search for books, articles, etc.

Single, Simple Search Box

FRBR – Functional Requirements for Bibliographic Records, grouping editions

Slide21

Discovery Characteristics, cont.

Enhanced Experience

Sometimes fun and engaging

Interactive/Collaborative

User centered design

Enhanced Services

Find it / Get it for me

Book Covers / Synopsis

Full text

Availability on same page as results

Slide22

Discovery Characteristics, cont.

Enhanced Content

Article Searching

Commercial Data

Merging Special Collections

Harvesting Online Collections

Grey Literature

Free Content

Enhanced Access

Syndication - Getting into users' tools

Course Management Systems

Browser and Desktop Tool Bars

Portals

Slide23

Discovery Components

Next Generation Catalog

Next Generation “Unified Search” Aid

Normalization &

Apache SOLR/Lucene

User Interface

ILS

OPAC

MARC

Vendor

Data

MetaSearch

OAI

Vendor

Data

Circ Data

Full Text

Slide24

Primo Central

Content Components

Primo

RACER

TUG

Archives

OCUL

Geospatial

HathiTrust

Others

Phase I

Phase II

Future

Slide25

Evolution of Discovery

Slide26

Options for Expanding Primo

Local ingestion of resources using FTP or OAI harvesting

Searching remote resources in Primo using the Primo DeepSearch API*

Subscribing to a large centralized index, such as Primo Central

*Application Programming Interface
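For the first option, a minimal Python sketch of OAI harvesting is given here, assuming the OAI-PMH protocol and its `ListRecords` verb. The endpoint URL and the sample response are invented for illustration, and nothing is fetched over the network. The sketch shows how a harvester reads record identifiers from one page of results and uses the resumption token to request the next page.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical endpoint, used only to show how request URLs are formed.
BASE_URL = "https://repository.example.org/oai"

def list_records_url(metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces the other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return BASE_URL + "?" + urlencode(params)

def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from one response page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token

# Invented sample page standing in for a server response.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:1</identifier></header></record>
    <record><header><identifier>oai:example:2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_list_records(SAMPLE_PAGE)
next_url = list_records_url(resumption_token=token)
print(ids, next_url)
```

A real harvester would repeat the request until a page arrives without a resumption token, then pass the collected records on to normalization and indexing.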

Slide27

Local ingestion of records

Example: Hathi Trust Digital Library

Harvest the public domain records from Hathi Trust Digital Library

Normalize the records

Index the records in our local Primo database

Schedule updates from Hathi Trust into Primo

Slide28

Normalization: creating local sort field (Date – Oldest)
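The idea of a normalized sort field can be sketched in Python as follows. This is not Primo's normalization rule syntax; the `sort_date` field name, the sample dates and the "sort last" fallback value are assumptions made for illustration. The sketch pulls the first four-digit year out of a free-text date so that records can be ordered oldest first.

```python
import re

def sort_date(raw_date):
    """Extract the first four-digit year from a free-text date as a sortable string."""
    match = re.search(r"\d{4}", raw_date or "")
    # Records without a usable year sort last when ordering oldest first.
    return match.group(0) if match else "9999"

# Invented records with the kinds of date strings found in catalogue data.
records = [
    {"title": "C", "date": "c1998."},
    {"title": "A", "date": "[1901?]"},
    {"title": "B", "date": "1850-1855"},
    {"title": "D", "date": "n.d."},
]

for record in records:
    record["sort_date"] = sort_date(record["date"])

oldest_first = sorted(records, key=lambda r: r["sort_date"])
print([r["title"] for r in oldest_first])
```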

Slide29

Primo Normalized XML (PNX)

Slide30

Open source & Open platform

Primo uses Lucene for its indexing

SOLR exposes Lucene as a web service and allows for faceting

APIs and web services allow flexibility and customization
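To make "indexing" and "faceting" concrete, here is a toy Python sketch of the two ideas. It does not use Lucene or SOLR and is far simpler than either; the records and field names are invented. An inverted index maps each word to the records containing it, and facet counts summarize the matching records by a chosen field.

```python
from collections import Counter, defaultdict

# Invented sample records; field names are illustrative only.
RECORDS = [
    {"id": 1, "title": "Library discovery systems", "type": "book"},
    {"id": 2, "title": "Discovery in academic libraries", "type": "article"},
    {"id": 3, "title": "Metadata for discovery", "type": "article"},
]

def build_index(records):
    """Map each lowercased title word to the set of record ids containing it."""
    index = defaultdict(set)
    for record in records:
        for word in record["title"].lower().split():
            index[word].add(record["id"])
    return index

def search(index, records, term):
    """Return matching records plus facet counts over their 'type' field."""
    ids = index.get(term.lower(), set())
    hits = [r for r in records if r["id"] in ids]
    facets = Counter(r["type"] for r in hits)
    return hits, facets

index = build_index(RECORDS)
hits, facets = search(index, RECORDS, "discovery")
print(len(hits), dict(facets))
```

The facet counts are what a discovery interface shows beside the result list so that users can narrow a search by resource type.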

Slide31

We can’t index everything!

Trying out a subscription to Primo Central, a centralized index of scholarly journal articles, newspapers, conference proceedings etc.

User sees one interface; user is searching 2 indexes
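One way a single interface might combine results from two indexes is sketched below in Python. This is not Primo's actual ranking algorithm: the `blend` function, the boost factor and the scores are hypothetical, and the scores are assumed to be on a comparable scale in both indexes. The sketch merges both result lists into one ranking and multiplies local scores by a boost so that local items are not buried by the much larger central index.

```python
# Hypothetical boost factor; the right value is a local policy decision.
LOCAL_BOOST = 1.5

def blend(local_hits, central_hits, local_boost=LOCAL_BOOST):
    """Merge two result lists into one ranking, multiplying local scores by a boost."""
    combined = [
        {"title": title, "source": "local", "score": score * local_boost}
        for title, score in local_hits
    ]
    combined += [
        {"title": title, "source": "central", "score": score}
        for title, score in central_hits
    ]
    return sorted(combined, key=lambda hit: hit["score"], reverse=True)

# Invented scores, assumed to be on a comparable scale in both indexes.
local_hits = [("Local thesis", 0.60), ("Local map", 0.30)]
central_hits = [("Journal article", 0.80), ("Newspaper item", 0.50)]

ranked = blend(local_hits, central_hits)
print([hit["title"] for hit in ranked])
```

With the boost set to 1.0 the journal article ranks first; with the boost applied, the strongest local result moves ahead of it while weaker local results stay lower in the list.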

Slide32

What is Primo Central Index?

A centralized index

of free and restricted resources

primarily articles & e-books

based on metadata & full-text provided by publishers/aggregators

based on the collections selected by the library in the Primo Administration module

created & maintained by our vendor, Ex Libris

Slide33

What is Primo Central Index?

A centralized index

of records harvested using the same process as our local Primo database

created using the same PNX record structure as our local Primo database

indexed using the same indexing tools as our local Primo database

Slide34

Blending local and remote resources

Both local and remote results are represented in the facets

Blended relevance ranking

Primo can be configured to boost high-ranking local results so that, when it ranks our 4 million records alongside hundreds of millions of Primo Central records, users don’t miss local results

Slide35

Search = local resources & Primo Central

Slide36

How does it work?

Ex Libris has created & indexed records for millions of items based on information from the publishers

Primo searches Primo Central the same way it searches the local database

Full text availability is determined in advance by our URL resolver SFX, i.e.

Delivery of the resource uses menu for

Slide37

New features: snippets give context

If your search term is found in the full-text, Primo supplies a snippet highlighting the term

Slide38

New features: expanding the search

Defaults to our library’s electronic subscriptions but users can expand the search to all of Primo Central

Slide39

New Facets & Facet Values

Slide40

Added value: bX Recommender

Slide41

Trouble-shooting remote resources

We can view the PNX records using web services but we have no control over the content or the normalization rules

Records have the same structure as our local records but are missing local fields and don’t reflect local policies

Slide42

Assessing Primo Central

Over 65 hours of one-on-one usability testing and focus groups with undergraduate students, graduate students, faculty, staff and alumni

Library staff survey

Feedback form

Statistics from Cognos

Slide43

Looking to the future

What other content should be added to Primo?

How can we improve/enhance the interface?

What is the right balance for boosting local physical resources?

How do we point users to resources that can’t be searched using Primo?

Slide44

Questions?

Pascal Calarco

Associate University Librarian, Digital & Discovery Services

pvcalarco@uwaterloo.ca

Alison Hitchens

Cataloguing & Metadata Librarian

ahitchen@uwaterloo.ca
