Slide1
Give Your Data the Edge
A Scalable Data Delivery Platform
Larry Peterson
In collaboration with Arizona, Akamai, Internet2, NSF, North Carolina, Open Networking Lab, Princeton (and several pilot sites)
Slide2
Data Management Challenge
[Diagram labels: S3, DropBox, GenBank, iPlant; Distributed Set of Collaborators; Private Data Repositories; Commodity Cloud Storage; Data Management Experts; operations: Pre-Stage, Write-Back, Share]
Slide3
Our Goal
To enable a scalable number of collaborators (and their applications) to share access to data independent of where it is stored, where the storage platform:
– Minimizes the operational burden imposed on users
– Maximizes the use of commodity infrastructure
– Maximizes aggregate I/O performance
Slide4
Syndicate Solution
[Diagram labels: S3, DropBox, Metadata Service, SG (several instances), GenBank, Shared Volume, CDN, iPlant]
Slide5
Syndicate Solution
[Diagram labels: S3, DropBox, Metadata Service, SG (several instances), GenBank, Shared Volume, CDN, iPlant]
Bridges application workflow and HTTP transport; e.g.,
– iRODS
– Hadoop
Acquires data from existing repositories; e.g.,
– iPlant (iRODS)
– GenBank
Treats cloud storage as a block device
Manages data consistency and key distribution
Slide6
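Note on the Slide 5 bullet "Treats cloud storage as a block device": one way to picture this is a file whose bytes are split into fixed-size blocks, each stored as a separate object. The sketch below is only an illustration under that assumption; the block size, the `path/block-N` key naming, and the function names are hypothetical and are not taken from the slides.

```python
from typing import Dict, List, NamedTuple

BLOCK_SIZE = 4  # hypothetical block size in bytes (tiny, for illustration only)


class BlockSlice(NamedTuple):
    key: str    # hypothetical object-store key holding one block
    start: int  # first wanted byte within the block
    end: int    # one past the last wanted byte within the block


def blocks_for_range(path: str, offset: int, length: int,
                     block_size: int = BLOCK_SIZE) -> List[BlockSlice]:
    """Map a byte range of a file onto the fixed-size blocks that hold it."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    slices = []
    pos, stop = offset, offset + length
    while pos < stop:
        index = pos // block_size
        block_start = index * block_size
        # Clip the wanted range to the end of this block.
        end_in_block = min(stop, block_start + block_size) - block_start
        slices.append(BlockSlice(f"{path}/block-{index}",
                                 pos - block_start, end_in_block))
        pos = block_start + block_size
    return slices


def read_range(store: Dict[str, bytes], path: str, offset: int, length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Assemble a byte range by fetching only the blocks that overlap it."""
    return b"".join(store[s.key][s.start:s.end]
                    for s in blocks_for_range(path, offset, length, block_size))


# A dict stands in for the cloud object store.
store = {
    "/vol/a.txt/block-0": b"abcd",
    "/vol/a.txt/block-1": b"efgh",
    "/vol/a.txt/block-2": b"ij",
}
data = read_range(store, "/vol/a.txt", 2, 5)
```

Because each block is an independent object, a reader fetches only the blocks that overlap the requested range rather than the whole file.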
Syndicate Solution
[Diagram labels: S3, DropBox, Metadata Service, SG (several instances), GenBank, Shared Volume, CDN, iPlant]
As easy as mounting Dropbox
Auto-mount in Cloud VMs
Slide7
Service Composition
Syndicate = CDN + Object Store + NoSQL DB
– CDN: Scalable Read Bandwidth (Akamai HyperCache & RequestRouter)
– Object Store: Data Durability (S3, Glacier, DropBox, Box, Swift)
– NoSQL DB: Data Consistency (Google App Engine)
Value-Add Storage Service
Slide8
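Note on the Slide 7 composition (CDN for read bandwidth, object store for durability, NoSQL DB for consistency): the sketch below shows one way such a read path could fit together, assuming the consistency layer records a version number per block and the cache is keyed by (block, version). The slides do not describe this mechanism; the class and function names are hypothetical, and in-memory dicts stand in for the real services.

```python
from typing import Dict, Tuple

BlockKey = Tuple[str, int]  # (block name, version)


class MetadataService:
    """Stand-in for the consistency layer: current version of each block."""

    def __init__(self) -> None:
        self.versions: Dict[str, int] = {}


class Cache:
    """Stand-in for a CDN cache in front of an origin object store."""

    def __init__(self, origin: Dict[BlockKey, bytes]) -> None:
        self.origin = origin
        self.entries: Dict[BlockKey, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: BlockKey) -> bytes:
        if key in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = self.origin[key]  # fill from the origin
        return self.entries[key]


def write_block(ms: MetadataService, store: Dict[BlockKey, bytes],
                block: str, data: bytes) -> None:
    version = ms.versions.get(block, 0) + 1
    store[(block, version)] = data  # durable copy first
    ms.versions[block] = version    # then make the new version visible


def read_block(ms: MetadataService, cache: Cache, block: str) -> bytes:
    # Look up the current version, then fetch that exact version via the cache.
    return cache.get((block, ms.versions[block]))


store: Dict[BlockKey, bytes] = {}
ms = MetadataService()
cdn = Cache(store)

write_block(ms, store, "genome/block-0", b"old")
first = read_block(ms, cdn, "genome/block-0")   # cache miss, filled from store
second = read_block(ms, cdn, "genome/block-0")  # cache hit
write_block(ms, store, "genome/block-0", b"new")
third = read_block(ms, cdn, "genome/block-0")   # new version, so a new cache key
```

Under this assumption a stale cached copy is never served after a write, because the new version is requested under a different key; the old entry simply goes unused.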
Value-Add Storage Service
[Diagram labels: Commodity Clouds (Amazon AWS, Google Cloud Platform, …); Private Clouds; Internet2 Backbone; Regional & Campus; End Users; HPC (many instances); S3, RR, and MS (several instances each)]
Slide9
Value Proposition
Cloud-Ready – Allows users to mount shared volumes into cloud-hosted virtual machines (VMs) with minimal operational overhead.
Scalable Read Bandwidth – Provides scalable read bandwidth (i.e., supports a scalable number of users) with minimal operational overhead.
Provider Independence – Allows users to take advantage of cost/performance tradeoffs among multiple storage providers (as well as spread risk across those providers) with minimal operational overhead.
Slide10
Value Proposition
Secure-by-Default – Allows users to securely share files across organizational boundaries, at scale, with minimal operational overhead.
Adapt to Existing Workflows – Makes it easy to integrate existing user workflows, datasets, and toolkits, as well as extend and customize to meet specific community requirements (e.g., privacy).
Sustainable Design – Provides a general-purpose storage platform that leverages commodity storage and network caches at every opportunity.
Slide11
More Information
opencloud.us
Tomorrow, 10:50am, Advanced Networking/Joint Techs
Slide12
Next Talk by John Hartman
[Diagram labels: S3, DropBox, Metadata Service, SG (several instances), GenBank, Shared Volume, CDN, iPlant, iRODS, Hadoop]