
STK5800 - PowerPoint Presentation

marina-yarberry
Uploaded On 2016-05-06



Presentation Transcript

Slide1

STK5800 and EPrints

Services for Object Storage and Preservation

March 2008

All content in these slides is considered work in progress. In no way does it represent an absolute view of any final end product, and at this stage it should purely be considered a set of realistic ideas.

Slide2

Outline

StorageTek 5800 (the Honeycomb) provides high-resilience data storage with a built-in metadata layer.

EPrints is a piece of repository software for managing large collections of digital objects and their related metadata.

Slide3

EPrints

Open Source repository software to provide open access to institutional output.

Provides a powerful plugin-based package which can easily be extended at any layer to suit a user's requirements.

Two types of archive:

Those used to manage publications and small objects.

Those used to deposit large objects. These tend to contain heavier customisation.

Slide4

Preserv2

Preserv2 is the 2nd iteration of a project looking at preservation services for repositories.

Beyond simple backup: format renderers, format translation, risk assessment, interoperability and long-term storage.

Slide5

Why use a Honeycomb?

A Honeycomb is not just a “Big Disk”

A Service Based Architecture:

Big objects, big storage, more powerful plugins/services.

Smaller repositories can jointly use a single Honeycomb as a “Preservation Service”.

Preservation service providers can combine several servers into a “Honeycomb Cloud”.

Slide6

EPrints Architecture

EPrints (Repository) Layer

Object Storage

Metadata Storage

Slide7

EPrints and Honeycomb

EPrints (Repository) Layer

STK5800 Honeycomb

Slide8

Services for Repositories

EPrints (Repository) Layer

Metadata Services

Storage Beans

Automated Wide Area Backup

Slide9

Metadata Services

Same resilience as data.

Averts the need to store a file id/URL somewhere in order to find an object.

Enables collections to be constructed by independent parties.

Objects can be exported into many formats accurately.

Slide10
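As an illustration of the Metadata Services points (finding an object through its metadata instead of a stored file id/URL), here is a minimal in-memory sketch. The `MetadataObjectStore` class, its `store` and `query` methods, and the metadata field names are hypothetical; this is not the STK5800 or EPrints API.

```python
import hashlib


class MetadataObjectStore:
    """Toy in-memory stand-in for an object store with a built-in metadata layer.

    Hypothetical illustration only; not the STK5800 client API.
    """

    def __init__(self):
        self._objects = {}   # object id -> bytes
        self._metadata = {}  # object id -> metadata dict

    def store(self, data, metadata):
        # The store derives its own identifier; the caller need not keep it.
        oid = hashlib.sha256(data).hexdigest()
        self._objects[oid] = data
        self._metadata[oid] = dict(metadata)
        return oid

    def query(self, **criteria):
        # Return the data of every object whose metadata matches all criteria.
        return [
            self._objects[oid]
            for oid, meta in self._metadata.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


store = MetadataObjectStore()
store.store(b"thesis pdf bytes", {"title": "A Thesis", "collection": "theses"})
store.store(b"dataset bytes", {"title": "Raw Data", "collection": "datasets"})

# Find an object by what it is, not by a saved id/URL.
matches = store.query(collection="theses")
```

Because the query runs against metadata held alongside the data, an independent party could build its own collection with a different set of criteria without the depositing repository handing over identifiers.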

Storage Beans

Can perform operations upon the objects in the system without reliance upon the repository to manage these processes (e.g. object translation).

Preservation services can provide feedback to repository administrators on potential risks to their objects (e.g. object classification, age).

Can be used to extend the metadata layer to provide more powerful access to objects and their parts/pages (e.g. "Retrieve me page 10 of volume 6 of X").

Slide11
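The risk-feedback idea from the Storage Beans slide can be sketched as a function that runs over stored object metadata, independently of the repository. Everything here is an assumption made for illustration: the `risk_report` name, the record fields, the at-risk format list and the ten-year age limit do not come from the slides.

```python
from datetime import date

# Hypothetical set of formats treated as at-risk, for illustration only.
AT_RISK_FORMATS = {"wordperfect", "lotus123"}


def risk_report(objects, today, max_age_years=10):
    """Flag objects whose format is on the at-risk list or which exceed an age limit."""
    report = []
    for obj in objects:
        reasons = []
        if obj["format"] in AT_RISK_FORMATS:
            reasons.append("at-risk format")
        age_years = (today - obj["deposited"]).days / 365.25
        if age_years > max_age_years:
            reasons.append("older than %d years" % max_age_years)
        if reasons:
            report.append((obj["id"], reasons))
    return report


objects = [
    {"id": "obj-1", "format": "pdf", "deposited": date(2007, 5, 1)},
    {"id": "obj-2", "format": "wordperfect", "deposited": date(1995, 1, 1)},
]
report = risk_report(objects, today=date(2008, 3, 1))
```

In this sketch only `obj-2` is reported, for both its format and its age; a repository administrator would receive the report rather than run the checks.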

Wide Area Replication (Backup)

The possibility to link two or more Honeycombs together over a wide area to provide mirrored backup.

This can be implemented by the archive, which can store its objects in a “Honeycomb Cloud”.

Slide12
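A minimal sketch of the mirrored-backup idea, assuming each Honeycomb site can be modelled as a plain dictionary. The `MirroredStore` class is hypothetical and ignores everything a real wide-area link has to handle (latency, partial failure, conflict resolution).

```python
class MirroredStore:
    """Writes every object to all member sites; reads from the first that has it."""

    def __init__(self, *sites):
        self.sites = list(sites)  # each site is a plain dict: object id -> bytes

    def put(self, oid, data):
        # Mirror the write to every site.
        for site in self.sites:
            site[oid] = data

    def get(self, oid):
        # Fall through the sites in order until one holds the object.
        for site in self.sites:
            if oid in site:
                return site[oid]
        raise KeyError(oid)


site_a, site_b = {}, {}
cloud = MirroredStore(site_a, site_b)
cloud.put("obj-1", b"payload")

# Simulate losing one site; the object is still retrievable from the mirror.
site_a.clear()
recovered = cloud.get("obj-1")
```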

Possible Architectures (2)

[Diagram: three repositories]

Slide13

Possible Architectures (3)

[Diagram: three repositories]

Slide14

Possible Architectures (4)

[Diagram: three repositories]

Slide15

Preservation Services

A “Honeycomb Cloud” provides the basis for a preservation service which can be provided to many small-scale (<200 GB) repositories.

Options for object storage:

Locally with Honeycomb acting purely as a preservation service.

Hand all object storage and retrieval to Honeycomb Cloud.

A half-and-half solution:

Small objects served locally, large objects from Honeycomb.

Recent and popular objects served locally, older objects considered preserved.

Slide16
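The half-and-half options on the Preservation Services slide amount to a placement policy. The sketch below combines the size rule and the recency rule in one function; the slides present them as separate variants and give no thresholds, so the 100 MB and 180-day values, like the `serve_from` name, are assumptions.

```python
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
LARGE_OBJECT_BYTES = 100 * 1024 * 1024  # objects above this are served from Honeycomb
RECENT_DAYS = 180                       # "recent" means accessed within this window


def serve_from(size_bytes, last_accessed, today):
    """Return 'local' or 'honeycomb' for one object under the half-and-half policy."""
    if size_bytes > LARGE_OBJECT_BYTES:
        return "honeycomb"
    if (today - last_accessed).days > RECENT_DAYS:
        return "honeycomb"
    return "local"


today = date(2008, 3, 1)
small_recent = serve_from(5_000_000, date(2008, 2, 1), today)
large_recent = serve_from(2_000_000_000, date(2008, 2, 1), today)
small_old = serve_from(5_000_000, date(2007, 1, 1), today)
```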

EPrints with the STK 4500

The out-of-the-box repository solution for large repositories.

Slide17

Thumpers “Big Disk”

The Thumper system (STK 4500) is essentially a “Big Disk” server.

“Out of the Box” solution.

Expansions:

Services to enable replication between two Thumpers.

Preservation services using a Honeycomb.

Aimed at repositories where tape backup is not ideal.

Slide18

ECrystals (Possible Use Case)

Large chemistry repository which currently stores only processed result objects (small).

These result files are generated from raw datasets of more than 1 GB each.

8+ datasets generated a day.

After 6 months, result sets are of less worth.

This represents about 1 TB of raw data in a 6-month period.

Slide19
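The ECrystals volume figure can be checked with a short calculation. The slides give only lower bounds (8+ datasets a day, each over 1 GB) and do not say how many days are counted, so the day counts below are assumptions: about 130 working days or about 182 calendar days in six months.

```python
DATASETS_PER_DAY = 8   # slide says "8+", so this is a lower bound
GB_PER_DATASET = 1     # slide says more than 1 GB, also a lower bound


def raw_volume_tb(days):
    """Lower-bound raw data volume in TB (taking 1 TB = 1000 GB) over `days` days."""
    return DATASETS_PER_DAY * GB_PER_DATASET * days / 1000


working_days = raw_volume_tb(130)   # assumed ~130 working days in 6 months
calendar_days = raw_volume_tb(182)  # assumed ~182 calendar days in 6 months
```

Under these assumptions the lower bound comes out at roughly 1.0 TB for working days and roughly 1.5 TB for calendar days, which is in line with the slide's figure of about 1 TB.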

ECrystals – Single Honeycomb Architecture

Current repository remains.

All result sets stored on Honeycomb.

Pros

Simple architecture

Sole use of Honeycomb

Year of “on-site” storage

Cons

Cost

Backup procedure?

[Diagram: EPrints (Repository) Layer]

Slide20

ECrystals – Thumper with “Honeycomb Cloud”

Pros

Single local machine

6+ months locally accessible

Automated preservation

Preservation services managed by Honeycomb Cloud

Storage Beans on Honeycomb Cloud compress older/less popular objects

Cons

?

[Diagram: EPrints (Repository) Layer; Thumper System]

Slide21

Summary

Honeycomb provides:

Better separation of repository layer from storage layer.

Repository interoperability.

A new approach to storing and preserving data from institutional repositories based on EPrints and other software.
