Using GENI for computational science

Uploaded On 2020-08-04

Presentation Transcript

Slide1

Using GENI for computational science

Ilya Baldin

RENCI, UNC – Chapel Hill

Slide2

Networked Clouds

Cloud and Network Providers

Observatory

Wind tunnel

Science Workflows

Slide3

ExoGENI Testbed

Slide4

Computational/Data Science Projects on ExoGENI

ADAMANT – Building tools for enabling workflow-based scientific applications on dynamic infrastructure (RENCI, Duke, USC/ISI)

RADII – Building tools for supporting collaborative data-driven science (RENCI)

GENI ScienceShakedown – ADCIRC storm surge modeling on GENI

Goal of presentation: to demonstrate some of the things that are possible with GENI today

Slide5

ADAMANT

Slide6

Scientific Workflows – Dynamic Use Case

Slide7

CC-NIE ADAMANT – Pegasus/ExoGENI

Network Infrastructure-as-a-Service (NIaaS) for workflow-driven applications

Tools for workflows integrated with adaptive infrastructure

Workflows triggering adaptive infrastructure

Pegasus workflows using ExoGENI

Adapt to application demands (compute, network, storage)

Integrate data movement into NIaaS (on-ramps)

Target applications

Montage Galactic plane ensemble: Astronomy mosaics

Genomics: High-Throughput Sequencing

Slide8

ExoGENI: Enabling Features for Workflows

On-Ramps / Stitchports

Connect ExoGENI to existing static infrastructure to import/export

Storage slivering

Networked storage: iSCSI target on dataplane

Neuca tools attach LUN, format and mount filesystem

Inter-domain links, multipoint broadcast networks
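The storage slivering step (attach the LUN, format, mount the filesystem) is automated by the Neuca tools on ExoGENI. As a rough illustration of what those three actions amount to on a Linux VM, the sketch below only builds the equivalent manual commands and prints them; it runs nothing. The portal address, target IQN, device name and mount point are made-up placeholders, and this is not how Neuca itself is implemented.

```python
import shlex


def storage_attach_commands(portal, target_iqn, device, mount_point, fs_type="ext4"):
    """Return the commands (as argument lists) to attach, format and mount an iSCSI LUN.

    Nothing is executed here; the caller decides whether and how to run them.
    All argument values are supplied by the caller and are hypothetical in this sketch.
    """
    return [
        # Discover the targets exported by the storage portal on the dataplane.
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        # Log in to the target so the LUN shows up as a local block device.
        ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"],
        # Create a filesystem on the new block device.
        [f"mkfs.{fs_type}", device],
        # Create the mount point and mount the filesystem.
        ["mkdir", "-p", mount_point],
        ["mount", device, mount_point],
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    commands = storage_attach_commands(
        portal="10.0.0.10:3260",
        target_iqn="iqn.2014-01.example:slice-storage",
        device="/dev/sdb",
        mount_point="/mnt/slice-data",
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Returning argument lists rather than running the commands keeps the sketch safe to try on any machine.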

Slide9

Computational workflows in Genomics

Several versions as we scaled:

Single machine

Cluster based

MapSeq: specialized code & Condor

Pegasus & Condor

RNA-Seq

WGS

Slide10

Goal: learning to use NIaaS for biomedical research

Figure: cloud providers (compute, data) and network providers hosting user- or workflow-provisioned, isolated slices of VMs (Slice 1, Slice 2)

Slide11

Goal: Management of data flows in NIaaS

Figure: iRODS Data Grid (RENCI, UNC; iCAT, RE) with a Layer 2 connection to the VMs within Slice 2

Metadata control

Lab X can compute on Project Y data in the cloud

User X can move data from Study A to the cloud

Data from Study W cannot remain on cloud resources

Ease of access

Control over access

Auditing

Provenance
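The three example policies on this slide can be read as rules keyed on who is acting, what they want to do, and which dataset the metadata says is involved. The sketch below is a plain-Python illustration of that idea, not iRODS rule language. The rule fields, the action names, the wildcard subject and the default-deny behaviour are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    subject: str   # who the rule applies to, e.g. "Lab X"; "*" means anyone
    action: str    # hypothetical action name
    dataset: str   # dataset tag taken from the metadata
    allow: bool


# The three policies from the slide, encoded with assumed action names.
RULES = [
    Rule("Lab X", "compute_in_cloud", "Project Y", True),
    Rule("User X", "move_to_cloud", "Study A", True),
    Rule("*", "store_on_cloud", "Study W", False),
]

# Every decision is recorded so that access can be audited later.
audit_log = []


def is_allowed(subject, action, dataset, rules=RULES):
    """Default-deny check in which an explicit deny overrides any allow."""
    matches = [
        r for r in rules
        if r.action == action and r.dataset == dataset and r.subject in (subject, "*")
    ]
    if any(not r.allow for r in matches):
        return False
    return any(r.allow for r in matches)


def check(subject, action, dataset):
    """Evaluate a request and append the outcome to the audit log."""
    decision = is_allowed(subject, action, dataset)
    audit_log.append((subject, action, dataset, decision))
    return decision


if __name__ == "__main__":
    print(check("Lab X", "compute_in_cloud", "Project Y"))
    print(check("Lab Z", "compute_in_cloud", "Project Y"))
    print(check("User X", "store_on_cloud", "Study W"))
```

Keeping the decision and the audit record in one place mirrors the slide's pairing of control over access with auditing.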

Slide12

Example ExoGENI requests auto-generated

Slide13

Application to NIaaS - Architecture

Slide14

RADII

Slide15

RADII

RADII: Resource Aware Data-centric Collaboration Infrastructure

Middleware to facilitate data-driven collaborations for domain researchers and a commodity to the science community

Reducing the large gap between procuring the required infrastructure and managing data transfers efficiently

Integration of data-grid (iRODS) and NIaaS (ORCA) technologies on ExoGENI infrastructure

Novel tools to map data processes, computations, storage and organization entities onto infrastructure with an intuitive GUI-based application

Novel data-centric resource management mechanisms for provisioning and de-provisioning resources dynamically throughout the lifecycle of collaborations

Slide16

Why iRODS in RADII?

RADII Policies to iRODS Rule Language

Easy to map policies to iRODS Dynamic PEP

Reduced complexity for RADII

Distributed and Elastic Data Grid

Resource Monitoring Framework

Geo-aware Resource hierarchy creation via composable iRODS

Metadata tagging

Slide17

Resource Awareness

iRODS RMS provides node-specific resource utilization

End-to-end parameters such as throughput and current network flow are important for judicious placement, replication and retrieval decisions

Created end-to-end throughput, latency and instantaneous transfer (RX/TX per second) monitoring

The best server selection based on end-to-end utility value:

Slide18

Experiment Topology

Figure: Experimental Setup Topology

Slide19

Experimental Setup

The sites were: UCD, SL, UH, FIU

Parallel and multithreaded file ingestion from each of the clients

Total of 400 GB file ingestion from each client

One copy at the edge node and another replica placed based on utility value.
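The placement scheme in the setup (first copy at the client's edge node, second copy at the remote server with the best utility value) can be sketched as follows. The slides do not spell out how the utility value is computed, so the scoring function here is a made-up placeholder that rewards throughput and free capacity and penalises latency. The weights, field names and measurements are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    throughput_mbps: float   # measured end-to-end throughput to this server
    latency_ms: float        # measured end-to-end latency to this server
    free_fraction: float     # share of storage still free, between 0 and 1


def utility(s, w_tp=1.0, w_lat=1.0, w_free=1.0):
    """Hypothetical utility score; NOT the formula used in the experiments."""
    return (
        w_tp * s.throughput_mbps / 1000.0
        - w_lat * s.latency_ms / 100.0
        + w_free * s.free_fraction
    )


def place_replicas(edge_name, servers):
    """Return (edge copy site, remote replica site) for one ingested file."""
    remote = [s for s in servers if s.name != edge_name]
    if not remote:
        raise ValueError("no remote server available for the second replica")
    best = max(remote, key=utility)
    return edge_name, best.name


if __name__ == "__main__":
    # Invented measurements as seen from a client whose edge node is UCD.
    servers = [
        ServerStats("UCD", 950.0, 1.0, 0.4),
        ServerStats("SL", 800.0, 40.0, 0.5),
        ServerStats("UH", 900.0, 20.0, 0.3),
        ServerStats("FIU", 500.0, 60.0, 0.9),
    ]
    print(place_replicas("UCD", servers))
```

Any real utility function would be driven by the monitored throughput, latency and RX/TX figures rather than by fixed numbers like these.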

Slide20

Edge Put and Remote Replication Time

Figure: Edge Node Put Time

Figure: Remote Replication Time

Slide21

ScienceShakedown

Slide22

Motivation

Hurricane Sandy (2012)

Slide23

Motivation

Real-time, on-demand computations of storm surge impacts

Hazards to coastal areas a major concern

Hazard/Threat Information needed ASAP (Urgently)

Critical need for:

detailed → high spatial resolution → large compute resources

Federal Forecast cycle every 6 hrs

Must be well within Cycle to be relevant/useful

I.e., New information at 5:59 is already old!!!

Slide24

Computing Storm Surge

ADCIRC Storm Surge Model

FEMA-approved for Coastal Flood Insurance Studies

Very high spatial resolution (millions of triangles)

Typically use 256-1024 cores for real-time (one simulation!)

Figure: ADCIRC grid for coastal North Carolina

Slide25

Tackling Uncertainty

Research Ensemble

NSF Hazards SEES project

22 members, Hurricane Floyd (1999)

One simulation is NOT enough!

Probabilistic Assessment of Hurricanes

A “few” likely hurricanes

Fully dynamic atmosphere (WRF)

Slide27

Why GENI?

Current limitations: Real-time demands for compute resource

Large demands for real-time compute resources during storms

Not enough demand to dedicate a cluster year-round

GENI enables

Federation of resources

Cloud bursting, urgent, on-demand

High-speed data transfers to/from/between remote resources

Replicate data/compute across geographic areas

Resiliency, performance

Slide28

Storm Surge Workflow

Parallel task (32 Core MPI)

Each ensemble member is a high-performance parallel task that calculates one storm

Figure: 32 compute cores making up one parallel task

Slide29

Slice Topology

11 GENI sites (1 ensemble manager, 10 compute sites)

Topology: 92 VMs (368 cores), 10 inter-domain VLANs, 1 TB iSCSI storage

HPC compute nodes: 80 compute nodes (320 cores) from 10 sites
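The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that cores are spread evenly over VMs and that compute nodes are spread evenly over sites, neither of which the slide states. The 32-core task size comes from the Storm Surge Workflow slide, and the ensemble sizes (22 and 6) come from the Tackling Uncertainty and results slides.

```python
import math

# Numbers quoted on the slides.
TOTAL_VMS = 92
TOTAL_CORES = 368
COMPUTE_NODES = 80
COMPUTE_CORES = 320
COMPUTE_SITES = 10
CORES_PER_TASK = 32  # one ensemble member is a 32-core MPI task

# Derived values, assuming an even distribution (an assumption, not a slide fact).
cores_per_vm = TOTAL_CORES // TOTAL_VMS                  # 4
cores_per_compute_node = COMPUTE_CORES // COMPUTE_NODES  # 4
nodes_per_site = COMPUTE_NODES // COMPUTE_SITES          # 8
cores_per_site = nodes_per_site * cores_per_compute_node # 32
concurrent_tasks = COMPUTE_CORES // CORES_PER_TASK       # 10


def waves_needed(ensemble_size):
    """Number of back-to-back rounds needed to run the whole ensemble."""
    return math.ceil(ensemble_size / concurrent_tasks)


if __name__ == "__main__":
    print("cores per VM:", cores_per_vm)
    print("cores per compute site:", cores_per_site)
    print("ensemble members running at once:", concurrent_tasks)
    print("rounds for a 22-member ensemble:", waves_needed(22))
    print("rounds for a 6-member ensemble:", waves_needed(6))
```

Under these assumptions each compute site holds exactly the 32 cores that one ensemble member needs, so the slice could run ten members side by side.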

Slide30

ADCIRC Results from GENI

Figure: Storm surge for 6 simulations (N11, N17, N01, N14, N16, N20), ranging from small threat to big threat

Slide31

Conclusions

The GENI testbed represents a kind of shared infrastructure suitable for prototyping solutions for some computational science domains

GENI technologies represent a collection of enabling mechanisms that can provide a foundation for the future federated science cyberinfrastructure

Different members of GENI federations offer different capabilities for their users, suitable for a variety of problems

Slide32

Thank you!

Funders

Partners
