Using GENI for computational science
Ilya Baldin
RENCI, UNC – Chapel Hill
Networked Clouds
Cloud and Network Providers
Observatory
Wind tunnel
Science Workflows
ExoGENI Testbed
Computational/Data Science Projects on ExoGENI
ADAMANT – Building tools for enabling workflow-based scientific applications on dynamic infrastructure (RENCI, Duke, USC/ISI)
RADII – Building tools for supporting collaborative data-driven science (RENCI)
GENI ScienceShakedown – ADCIRC storm surge modeling on GENI
Goal of this presentation: to demonstrate some of the things that are possible with GENI today
ADAMANT
Scientific Workflows – Dynamic Use Case
CC-NIE ADAMANT – Pegasus/ExoGENI
Network Infrastructure-as-a-Service (NIaaS) for workflow-driven applications
Tools for workflows integrated with adaptive infrastructure
Workflows triggering adaptive infrastructure
Pegasus workflows using ExoGENI (a minimal workflow sketch follows this list)
Adapt to application demands (compute, network, storage)
Integrate data movement into NIaaS (on-ramps)
Target applications:
Montage Galactic Plane ensemble: astronomy mosaics
Genomics: high-throughput sequencing
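To make the Pegasus/ExoGENI pairing concrete, below is a minimal sketch of a workflow description written with the Pegasus DAX3 Python API. The workflow name, executable, and file names are hypothetical stand-ins for the Montage and genomics pipelines; mapping the abstract workflow onto ExoGENI-provisioned resources happens later at planning time.

```python
# Minimal sketch of a Pegasus workflow description (DAX) that could be
# planned onto ExoGENI-provisioned resources. All names are illustrative.
from Pegasus.DAX3 import ADAG, Job, File, Link

dax = ADAG("montage-ensemble-sketch")          # hypothetical workflow name

raw = File("tile_001.fits")                    # hypothetical input tile
mosaic = File("mosaic_001.fits")               # hypothetical output mosaic

job = Job(name="mProject")                     # one Montage projection step
job.addArguments(raw, mosaic)
job.uses(raw, link=Link.INPUT)
job.uses(mosaic, link=Link.OUTPUT, transfer=True)
dax.addJob(job)

# Write the abstract workflow; pegasus-plan would later map it onto whatever
# compute, network and storage the slice currently provides.
with open("montage-sketch.dax", "w") as f:
    dax.writeXML(f)
```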
ExoGENI: Enabling Features for Workflows
On-Ramps / Stitchports: connect ExoGENI to existing static infrastructure to import/export data
Storage slivering
Networked storage: iSCSI target on the dataplane
NEuca tools attach the LUN, format it and mount the filesystem (an illustrative sketch of these steps follows this list)
Inter-domain links, multipoint broadcast networks
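As a rough illustration of the storage-slivering steps the NEuca guest tools automate, here is a hedged Python sketch of attaching an iSCSI LUN from the dataplane, formatting it, and mounting it. The portal address, target IQN, device name, and mount point are placeholders, not values ExoGENI actually hands out.

```python
# Rough sketch of the steps the NEuca guest tools automate for a storage
# sliver: log in to the iSCSI target on the dataplane, make a filesystem,
# and mount it. All values below are hypothetical placeholders.
import subprocess

PORTAL = "10.100.0.10:3260"                       # hypothetical iSCSI portal on the dataplane
TARGET = "iqn.2014-01.net.exogeni:storage.vol0"   # hypothetical target IQN
DEVICE = "/dev/sdb"                               # device the LUN appears as (assumption)
MOUNTPOINT = "/mnt/sliver-storage"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.check_call(cmd)

run(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL])
run(["iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--login"])
run(["mkfs.ext4", "-F", DEVICE])                  # format the newly attached LUN
run(["mkdir", "-p", MOUNTPOINT])
run(["mount", DEVICE, MOUNTPOINT])                # filesystem is now usable by the workflow
```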
Computational workflows in Genomics
Several versions as we scaled:
Single machine
Cluster-based
MapSeq: specialized code & Condor
Pegasus & Condor
RNA-Seq
WGS
Goal: learning to use NIaaS for biomedical research
Figure: cloud providers (compute, data) and network providers hosting user- or workflow-provisioned, isolated slices of VMs (Slice 1, Slice 2)
Goal: Management of data flows in NIaaS
Figure: iRODS Data Grid (iCAT and rule engines at RENCI and UNC) connected over a layer 2 connection to VMs within a slice
Metadata control:
Lab X can compute on Project Y data in the cloud
User X can move data from Study A to the cloud
Data from Study W cannot remain on cloud resources
Ease of access
Control over access
Auditing
Provenance
Example: auto-generated ExoGENI requests
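As a rough illustration of what such an auto-generated request could contain, here is a minimal sketch that builds a two-VM, single-link request in the GENI v3 RSpec format (ExoGENI also accepts requests in its native NDL-OWL form, not shown). The sliver type name and client identifiers are assumptions.

```python
# Hedged sketch of a small, auto-generatable GENI v3 request RSpec for two
# linked VMs, roughly the kind of request a tool could emit toward ExoGENI's
# GENI AM API. Sliver type and client_ids below are assumptions.
import xml.etree.ElementTree as ET

NS = "http://www.geni.net/resources/rspec/3"
ET.register_namespace("", NS)

rspec = ET.Element(f"{{{NS}}}rspec", {"type": "request"})

for i in range(2):                                   # two VMs on one broadcast link
    node = ET.SubElement(rspec, f"{{{NS}}}node",
                         {"client_id": f"vm{i}", "exclusive": "false"})
    ET.SubElement(node, f"{{{NS}}}sliver_type", {"name": "xo.small"})   # assumed size name
    ET.SubElement(node, f"{{{NS}}}interface", {"client_id": f"vm{i}:if0"})

link = ET.SubElement(rspec, f"{{{NS}}}link", {"client_id": "link0"})
for i in range(2):
    ET.SubElement(link, f"{{{NS}}}interface_ref", {"client_id": f"vm{i}:if0"})

print(ET.tostring(rspec, encoding="unicode"))
```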
Application to NIaaS – Architecture
RADII
RADII
RADII: Resource Aware Data-centric Collaboration Infrastructure
Middleware to facilitate data-driven collaborations for domain researchers, offered as a commodity to the science community
Reducing the large gap between procuring the required infrastructure and managing data transfers efficiently
Integration of data-grid (iRODS) and NIaaS (ORCA) technologies on the ExoGENI infrastructure
Novel tools to map data processes, computations, storage and organizational entities onto infrastructure via an intuitive GUI-based application
Novel data-centric resource management mechanisms for provisioning and de-provisioning resources dynamically throughout the lifecycle of collaborations
Why iRODS in RADII?
RADII policies map to the iRODS Rule Language
Easy to map policies to iRODS dynamic PEPs; reduced complexity for RADII
Distributed and elastic data grid
Resource monitoring framework
Geo-aware resource hierarchy creation via composable iRODS
Metadata tagging (a minimal tagging sketch follows this list)
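For concreteness, here is a minimal sketch of iRODS metadata tagging using the python-irodsclient library; the host, zone, credentials, paths, and attribute/value pairs are placeholders rather than RADII's actual deployment.

```python
# Minimal sketch of tagging an iRODS data object with metadata, the kind of
# hook collaboration policies can key off. Connection details, paths and
# attribute names below are hypothetical placeholders.
from irods.session import iRODSSession

with iRODSSession(host="irods.example.org", port=1247,
                  user="rods", password="secret", zone="radiiZone") as session:
    # Register a local result file into the data grid...
    logical_path = "/radiiZone/home/rods/study_A/results.dat"
    session.data_objects.put("results.dat", logical_path)

    # ...and tag it so rules can enforce collaboration policies
    # (e.g. "data from Study W cannot remain on cloud resources").
    obj = session.data_objects.get(logical_path)
    obj.metadata.add("project", "Y")
    obj.metadata.add("study", "A")
    obj.metadata.add("allowed_location", "cloud")
```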
Resource Awareness
iRODS RMS provides node-specific resource utilization
End-to-end parameters such as throughput and current network flow are important for judicious placement, replication and retrieval decisions
Created end-to-end throughput, latency and instantaneous transfer RX/TX per-second monitoring
The best server is selected based on an end-to-end utility value (a hedged selection sketch follows)
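The exact utility formula is not given here, so the weighted combination below is only an assumption for illustration: a minimal sketch in which utility rises with measured throughput and falls with latency and current transfer load, and the server with the highest utility wins.

```python
# Hedged sketch of best-server selection from end-to-end measurements.
# The weighting (favoring throughput, penalizing latency and load) is an
# assumption, not the project's actual utility function.
from dataclasses import dataclass

@dataclass
class ServerStats:
    name: str
    throughput_mbps: float   # measured end-to-end throughput
    latency_ms: float        # measured end-to-end latency
    active_flows: int        # instantaneous transfer activity

def utility(s: ServerStats, w_tp=1.0, w_lat=5.0, w_flow=2.0) -> float:
    # Higher throughput raises utility; latency and current load lower it.
    return w_tp * s.throughput_mbps - w_lat * s.latency_ms - w_flow * s.active_flows

def best_server(servers):
    return max(servers, key=utility)

servers = [
    ServerStats("UCD", throughput_mbps=850, latency_ms=40, active_flows=3),
    ServerStats("FIU", throughput_mbps=600, latency_ms=15, active_flows=1),
]
print(best_server(servers).name)
```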
Experiment Topology
Figure: Experimental Setup Topology
Experimental Setup
The sites were UCD, SL, UH and FIU
Parallel and multithreaded file ingestion from each of the clients
Total of 400 GB of file ingestion from each client
One copy kept at the edge node and another replica placed based on the utility value
Edge Put and Remote Replication Time
Figure: Edge Node Put Time
Figure: Remote Replication Time
ScienceShakedown
Motivation
Hurricane Sandy (2012)
Motivation
Real-time, on-demand computations of storm surge impacts
Hazards to coastal areas are a major concern
Hazard/threat information is needed ASAP (urgently)
Critical need for:
Detailed, high spatial resolution
Large compute resources
Federal forecast cycle runs every 6 hours; results must arrive well within the cycle to be relevant/useful
I.e., new information at 5:59 is already old!
Computing Storm Surge
ADCIRC Storm Surge Model
FEMA-approved for Coastal Flood Insurance Studies
Very high spatial resolution (millions of triangles)
Typically uses 256-1024 cores for real-time (one simulation!)
Figure: ADCIRC grid for coastal North Carolina
Tackling Uncertainty
Research Ensemble
NSF Hazards SEES project
22 members, Hurricane Floyd (1999)
One simulation is NOT enough!
Probabilistic assessment of hurricanes
A "few" likely hurricanes
Fully dynamic atmosphere (WRF)
Why GENI?
Current limitations: real-time demands for compute resources
Large demands for real-time compute resources during storms
Not enough demand to dedicate a cluster year-round
GENI enables
Federation of resources
Cloud bursting, urgent, on-demand
High-speed data transfers to/from/between remote resources
Replicate data/compute across geographic areas
Resiliency, performance
Storm Surge Workflow
Each ensemble member is a high-performance parallel task (32-core MPI) that calculates one storm (a hedged launch sketch follows below)
Figure: ensemble of parallel tasks, each spanning many compute cores
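As a hedged sketch of what one such launch amounts to, the loop below runs each ensemble member as its own 32-rank MPI job. The padcirc invocation flags and input/output layout are assumptions, and in the actual deployment the members are dispatched across GENI sites from the ensemble manager rather than looped on one host.

```python
# Hedged sketch of launching storm surge ensemble members: each member is a
# 32-rank MPI run of the parallel ADCIRC solver over that storm's inputs.
# The -I/-O flags and directory layout are assumptions for illustration.
import subprocess

MEMBERS = ["N01", "N11", "N14", "N16", "N17", "N20"]   # member IDs from the results slide
CORES_PER_MEMBER = 32

for member in MEMBERS:
    cmd = [
        "mpirun", "-np", str(CORES_PER_MEMBER),
        "./padcirc",                      # parallel ADCIRC binary (assumed name/path)
        "-I", f"inputs/{member}",         # hypothetical per-member input directory flag
        "-O", f"outputs/{member}",        # hypothetical per-member output directory flag
    ]
    print("launching", member, ":", " ".join(cmd))
    subprocess.check_call(cmd)
```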
Slice Topology
11 GENI sites (1 ensemble manager, 10 compute sites)
Topology: 92 VMs (368 cores), 10 inter-domain VLANs, 1 TB iSCSI storage
HPC compute nodes: 80 compute nodes (320 cores) from 10 sites
ADCIRC Results from GENI
Storm surge for 6 simulations
Figure: storm surge results for ensemble members N11, N17, N01, N14, N16 and N20, ranging from small threat to big threat
Conclusions
The GENI testbed represents a kind of shared infrastructure suitable for prototyping solutions for some computational science domains
GENI technologies represent a collection of enabling mechanisms that can provide a foundation for a future federated science cyberinfrastructure
Different members of GENI federations offer different capabilities to their users, suitable for a variety of problems
Thank you!
Funders
Partners