Using GENI for computational science - PowerPoint Presentation

Uploaded On 2016-09-02

Presentation Transcript

Slide1

Using GENI for computational science

Ilya Baldin

RENCI, UNC – Chapel Hill

Slide2

Networked Clouds

Cloud and Network Providers

Observatory

Wind tunnel

Science Workflows

Slide3

ExoGENI Testbed

Slide4

Computational/Data Science Projects on ExoGENI

ADAMANT – Building tools for enabling workflow-based scientific applications on dynamic infrastructure (RENCI, Duke, USC/ISI)

RADII – Building tools for supporting collaborative data-driven science (RENCI)

GENI ScienceShakedown – ADCIRC storm surge modeling on GENI

Goal of presentation: to demonstrate some of the things that are possible with GENI today

Slide5

ADAMANT

Slide6

Scientific Workflows – Dynamic Use Case

Slide7

CC-NIE ADAMANT – Pegasus/ExoGENI

Network Infrastructure-as-a-Service (NIaaS) for workflow-driven applications

Tools for workflows integrated with adaptive infrastructure

Workflows triggering adaptive infrastructure

Pegasus workflows using ExoGENI

Adapt to application demands (compute, network, storage)

Integrate data movement into NIaaS (on-ramps)

Target applications

Montage Galactic plane ensemble: Astronomy mosaics

Genomics: High-Throughput Sequencing

Slide8

ExoGENI: Enabling Features for Workflows

On-Ramps / Stitchports

Connect ExoGENI to existing static infrastructure to import/export

Storage slivering

Networked storage: iSCSI target on dataplane

Neuca tools attach LUN, format and mount filesystem
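
The attach, format and mount sequence named above can be sketched as a plan of standard Linux commands. This is only an illustration using the generic open-iscsi `iscsiadm` client, `mkfs` and `mount`; it is not the Neuca tooling itself, and the portal address, target name, device node and mount point are hypothetical placeholders. The function only builds the command list and executes nothing.

```python
from typing import List


def iscsi_attach_plan(portal: str, target_iqn: str, device: str,
                      mount_point: str, fs_type: str = "ext4") -> List[List[str]]:
    """Return the commands for attach -> format -> mount, in order.

    Nothing is executed here; the caller decides whether to run the plan.
    """
    return [
        # 1. Discover the targets exported by the storage portal.
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        # 2. Log in to the target, which attaches the LUN as a block device.
        ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"],
        # 3. Create a filesystem on the block device (destroys existing data).
        [f"mkfs.{fs_type}", device],
        # 4. Create the mount point and mount the new filesystem.
        ["mkdir", "-p", mount_point],
        ["mount", device, mount_point],
    ]


plan = iscsi_attach_plan(
    portal="10.0.0.10:3260",                    # hypothetical dataplane address
    target_iqn="iqn.2012-02.org.example:lun0",  # hypothetical target name
    device="/dev/sdb",                          # hypothetical device node
    mount_point="/mnt/slice-storage",           # hypothetical mount point
)
for cmd in plan:
    print(" ".join(cmd))
```

Returning the plan as data rather than running it keeps the sketch safe to try on any machine; formatting a real device is destructive and should only be done against storage provisioned for the slice.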

Inter-domain links, multipoint broadcast networks

Slide9

Computational workflows in Genomics

Several versions as we scaled:

Single machine

Cluster-based

MapSeq: specialized code & Condor

Pegasus & Condor

RNA-Seq

WGS

Slide10

Goal: learning to use NIaaS for biomedical research

Figure labels: Cloud providers (compute, data); Network providers; Slice 1; Slice 2; User or workflow provisioned & isolated slices

Slide11

Goal: Management of data flows in NIaaS

Figure labels: RENCI; UNC; iRODS Data Grid; iCAT; RE; Slice 2; Layer 2 connection within the slice

Metadata control

Lab X can compute on Project Y data in the cloud

User X can move data from Study A to the cloud

Data from Study W cannot remain on cloud resources
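
The three example policies above can be read as rules over (subject, action, dataset). The following is a minimal sketch of that idea only: the rule schema, the action names and the default-deny behavior are assumptions made for illustration, and it does not use the iRODS rule language or any iRODS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    subject: str   # who the rule applies to; "*" means anyone
    action: str    # hypothetical names: "compute", "move_to_cloud", "persist_on_cloud"
    dataset: str   # project or study the data belongs to
    allow: bool


# The slide's three example policies, encoded with the hypothetical schema above.
RULES: List[Rule] = [
    Rule("Lab X", "compute", "Project Y", True),
    Rule("User X", "move_to_cloud", "Study A", True),
    Rule("*", "persist_on_cloud", "Study W", False),
]


def is_allowed(subject: str, action: str, dataset: str) -> bool:
    """First matching rule decides; with no match the request is refused (assumed default)."""
    for rule in RULES:
        if (rule.subject in ("*", subject)
                and rule.action == action
                and rule.dataset == dataset):
            return rule.allow
    return False


print(is_allowed("Lab X", "compute", "Project Y"))
print(is_allowed("User X", "persist_on_cloud", "Study W"))
```

Keeping the policies as data attached to datasets, rather than hard-coding them in the workflow, is what lets the same check be applied wherever the data travels.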

Ease of access

Control over access

Auditing

Provenance

Slide12

Example ExoGENI requests auto-generated

Slide13

Application to NIaaS - Architecture

Slide14

RADII

Slide15

RADII

RADII: Resource Aware Data-centric Collaboration Infrastructure

Middleware to facilitate data-driven collaborations for domain researchers and a commodity to the science community

Reducing the large gap between procuring the required infrastructure and managing data transfers efficiently

Integration of data-grid (iRODS) and NIaaS (ORCA) technologies on ExoGENI infrastructure

Novel tools to map data processes, computations, storage and organization entities onto infrastructure with an intuitive GUI-based application

Novel data-centric resource management mechanisms for provisioning and de-provisioning resources dynamically throughout the lifecycle of collaborations

Slide16

Why iRODS in RADII?

RADII Policies to iRODS Rule Language

Easy to map policies to iRODS Dynamic PEP

Reduced complexity for RADII

Distributed and Elastic Data Grid Resource Monitoring Framework

Geo-aware Resource hierarchy creation via composable iRODS

Metadata tagging

Slide17

Resource Awareness

iRODS RMS provides node-specific resource utilization

End-to-end parameters such as throughput and current network flow are important for judicious placement, replication and retrieval decisions

Created end-to-end throughput, latency and instantaneous transfer (RX/TX per second) monitoring

The best server selection is based on end-to-end utility value:

Slide18

Experiment Topology

Figure: Experimental Setup Topology

Slide19

Experimental Setup

The sites were: UCD, SL, UH, FIU

Parallel and multithreaded file ingestion from each of the clients

Total 400 GB file ingestion from each client

One copy at the edge node and another replica placed based on utility value

Slide20

Edge Put and Remote Replication Time

Figure: Edge Node Put Time

Figure: Remote Replication Time

Slide21

ScienceShakedown

Slide22

Motivation

Hurricane Sandy (2012)

Slide23

Motivation

Real-time, on-demand computations of storm surge impacts

Hazards to coastal areas a major concern

Hazard/threat information needed ASAP (urgently)

Critical need for:

detailed, high spatial resolution

large compute resources

Federal forecast cycle every 6 hrs

Must be well within the cycle to be relevant/useful

I.e., new information at 5:59 is already old!!!

Slide24

Computing Storm Surge

ADCIRC Storm Surge Model

FEMA-approved for Coastal Flood Insurance Studies

Very high spatial resolution (millions of triangles)

Typically use 256-1024 cores for real-time (one simulation!)

ADCIRC grid for coastal North Carolina

Slide25

Tackling Uncertainty

Research Ensemble

NSF Hazards SEES project

22 members, H. Floyd (1999)

One simulation is NOT enough!

Probabilistic Assessment of Hurricanes

A “few” likely hurricanes

Fully dynamic atmosphere (WRF)

Slide26

Why GENI?

Current limitations: Real-time demands for compute resource

Large demands for real-time compute resources during storms

Not enough demand to dedicate a cluster year-round

Slide27

Why GENI?

GENI enables

Federation of resources

Cloud bursting, urgent, on-demand

High-speed data transfers to/from/between remote resources

Replicate data/compute across geographic areas

Resiliency, performance

Slide28

Storm Surge Workflow

Parallel task (32 Core MPI)

Each ensemble member is a high-performance parallel task that calculates one storm

Figure: 32 boxes, each labeled "Compute Core"

Slide29

Slice Topology

11 GENI sites (1 ensemble manager, 10 compute sites)

Topology: 92 VMs (368 cores), 10 inter-domain VLANs, 1 TB iSCSI storage

HPC compute nodes: 80 compute nodes (320 cores) from 10 sites

Slide30
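
Looking back at the Slice Topology figures, a short calculation shows how they fit the 32-core MPI tasks of the storm surge workflow. This is a sketch under stated assumptions: compute nodes are taken to be spread evenly over the 10 compute sites, and the ensemble size of 22 is borrowed from the research ensemble described earlier, which may not be the ensemble actually run on GENI. The function and key names are illustrative.

```python
import math


def slice_capacity(total_vms=92, total_cores=368, compute_nodes=80,
                   compute_cores=320, compute_sites=10,
                   cores_per_task=32, ensemble_members=22):
    """Derive per-VM and per-site figures from the slide's totals."""
    cores_per_vm = total_cores // total_vms
    # Sanity check: both totals must agree on the same VM size.
    if (cores_per_vm * total_vms != total_cores
            or cores_per_vm * compute_nodes != compute_cores):
        raise ValueError("core counts are not consistent with a single VM size")
    nodes_per_site = compute_nodes // compute_sites    # assumes an even spread
    concurrent_tasks = compute_cores // cores_per_task
    return {
        "cores_per_vm": cores_per_vm,
        "nodes_per_site": nodes_per_site,
        "cores_per_site": nodes_per_site * cores_per_vm,
        "concurrent_tasks": concurrent_tasks,
        # Rounds of execution needed if every ensemble member is one task.
        "waves": math.ceil(ensemble_members / concurrent_tasks),
        "non_compute_vms": total_vms - compute_nodes,
    }


result = slice_capacity()
for key, value in result.items():
    print(f"{key}: {value}")
```

Under these assumptions each VM has 4 cores, each compute site holds 8 nodes (32 cores, exactly one MPI task), the slice runs 10 ensemble members at a time, and a 22-member ensemble would need 3 rounds.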

ADCIRC Results from GENI

Storm Surge for 6 simulations

Figure labels: N11, N17, N01, N14, N16, N20; Small Threat; Big Threat

Slide31

Conclusions

The GENI testbed represents a kind of shared infrastructure suitable for prototyping solutions for some computational science domains

GENI technologies represent a collection of enabling mechanisms that can provide a foundation for the future federated science cyberinfrastructure

Different members of GENI federations offer different capabilities for their users, suitable for a variety of problems

Slide32

Thank you!

Funders

Partners
