
Jialin Liu, Bradly Crysler - PowerPoint Presentation

Uploaded On 2020-11-06




Presentation Transcript

Slide1

Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen

Oct. 07, 2013

Data-Intensive Scalable Computing Laboratory (DISCL)

Locality-driven High-level I/O Aggregation for Processing Scientific Datasets


Slide2

Outline

Introduction

Motivation

Hila: High Level I/O Aggregation

Evaluation

Conclusion and Future Work


Slide3

Introduction

Scientific simulations nowadays generate a few terabytes (TB) of data in a single run, and the data sizes are expected to reach petabytes (PB) in the near future.

GCRM, 100 million columns, 128 levels per column, 50 km

Accessing and analyzing the data reveals poor I/O performance due to the mismatch between the logical access pattern and the physical layout.

Slide4

Introduction

Scientific Datasets and Scientific I/O Libraries

PnetCDF, HDF5, ADIOS

[Figure: software stack, top to bottom: PnetCDF, MPI-IO, Parallel File Systems]

Scientific I/O libraries allow users to specify array-based logical input

Logical-physical mismatching
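The mismatch can be made concrete with a small sketch. It assumes a dense row-major array stored contiguously in one file with a fixed element size; the function name `subarray_extents` and its parameters are hypothetical and are not taken from PnetCDF, HDF5, or ADIOS. It shows how one array-based request (a start and a length per dimension) turns into several separate byte ranges.

```python
from itertools import product

def subarray_extents(shape, start, length, elem_size=8):
    """Return (byte_offset, byte_count) runs for a sub-array of a row-major array.

    Assumption: the array is stored densely in row-major order with no header.
    Runs are listed one per row of the last dimension and are not coalesced.
    """
    # Row-major strides in elements: the last dimension varies fastest.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]

    extents = []
    # One contiguous run per index combination in all but the last dimension.
    outer = (range(s, s + n) for s, n in zip(start[:-1], length[:-1]))
    for idx in product(*outer):
        first = sum(i * st for i, st in zip(idx, strides[:-1])) + start[-1]
        extents.append((first * elem_size, length[-1] * elem_size))
    return extents

# A 2 x 3 block taken from a 4 x 6 array of 8-byte elements.
runs = subarray_extents(shape=(4, 6), start=(1, 2), length=(2, 3))
print(runs)  # [(64, 24), (112, 24)]
```

A single logical request here becomes two separate 24-byte reads; for the multi-dimensional requests discussed in the slides the number of separate runs grows with the product of the outer lengths.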

Slide5

Motivation

I/O methods in scientific I/O libraries (PnetCDF, ADIOS, HDF5):

Independent I/O
Processes collaboration: No
Calls collaboration: No

Collective I/O
Processes collaboration: Yes
Calls collaboration: No

Nonblocking I/O
Processes collaboration: Yes
Calls collaboration: Yes

Slide6

Motivation

Contention on Storage Server without Awareness of Locality

[Figure: Call 0, Call 1, ..., Call i each go through Two Phase Collective I/O; aggregators ag00 to ag03 serve Call 0, ag10 to ag13 serve Call 1, and agi0 to agi3 serve Call i]

Slide7

Performance with Overlapping Calls

Conclusion: Overlapping Should be Removed

Slide8

Idea: High level I/O Aggregation

[Figure: logical input, decomposition, and physical layout]

Logical Input

Call 0: start{0,0,0}, length{100,200,200}

Call 1: start{10,20,100}, length{10,300,400}

Decomposition

start{0,0,0}, length{100,200,100}

start{0,0,100}, length{100,200,100}

start{10,20,100}, length{10,150,400}

start{10,170,100}, length{10,150,400}

Physical Layout: sub 0, sub 1, sub 2, sub 3
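As an illustration of why the two calls in this figure interact, the sketch below intersects them in logical index space. It treats each request as an axis-aligned box given by a start and a length per dimension; the helper name `overlap` is hypothetical, and this geometric check is only one simple reading of "overlapping" and does not model the physical layout.

```python
def overlap(a, b):
    """Intersect two (start, length) requests; return None when they are disjoint."""
    start, length = [], []
    # zip(*a) yields one (start, length) pair per dimension of request a.
    for (sa, la), (sb, lb) in zip(zip(*a), zip(*b)):
        lo = max(sa, sb)
        hi = min(sa + la, sb + lb)
        if hi <= lo:
            return None  # no shared index range in this dimension
        start.append(lo)
        length.append(hi - lo)
    return tuple(start), tuple(length)

call0 = ((0, 0, 0), (100, 200, 200))
call1 = ((10, 20, 100), (10, 300, 400))
print(overlap(call0, call1))  # ((10, 20, 100), (10, 180, 100))
```

Under this reading, Call 0 and Call 1 share a 10 x 180 x 100 region, which both calls would otherwise read.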

Slide9

Idea: High level I/O Aggregation

Basic Idea

Figure out the overlapping among requests

Eliminate the overlapping before doing I/O

Challenges

How to decompose the requests

How to aggregate the sub-arrays at a high level
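The "eliminate the overlapping" step can be sketched as box subtraction in logical index space. This is a generic technique rather than the Hila algorithm itself, which the slides tie to the physical layout; the function name `subtract` is hypothetical, and requests are assumed to be (start, length) tuples as in the earlier example figure.

```python
def subtract(req, cut):
    """Split req into disjoint (start, length) pieces that lie outside cut."""
    start, length = list(req[0]), list(req[1])

    # If the two boxes are disjoint in any dimension, nothing is removed.
    for s, n, cs, cn in zip(start, length, *cut):
        if max(s, cs) >= min(s + n, cs + cn):
            return [req]

    pieces = []
    for d, (cs, cn) in enumerate(zip(*cut)):
        lo, hi = start[d], start[d] + length[d]
        if cs > lo:  # slab of req below the cut in dimension d
            s, n = start.copy(), length.copy()
            n[d] = cs - lo
            pieces.append((tuple(s), tuple(n)))
        if cs + cn < hi:  # slab of req above the cut in dimension d
            s, n = start.copy(), length.copy()
            s[d], n[d] = cs + cn, hi - (cs + cn)
            pieces.append((tuple(s), tuple(n)))
        # Shrink the working box to the cut's range in dimension d.
        new_lo, new_hi = max(lo, cs), min(hi, cs + cn)
        start[d], length[d] = new_lo, new_hi - new_lo
    return pieces

call0 = ((0, 0, 0), (100, 200, 200))
call1 = ((10, 20, 100), (10, 300, 400))
# Keep only the part of call1 that call0 does not already cover.
print(subtract(call1, call0))
# [((10, 200, 100), (10, 120, 400)), ((10, 20, 200), (10, 180, 300))]
```

The two remaining pieces together hold 1,020,000 elements, which is the 1,200,000 elements of the second request minus the 180,000 it shares with the first.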

Slide10

Hila: High Level I/O Aggregation

Way to figure out the physical layout

Sub-correlation Function

Sub-correlation Set

Lustre Striping: stripe size: s; stripe count: l

Dataset: Dimension: d; subsets size: m
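The transcript does not preserve the sub-correlation function, so the sketch below is not a reconstruction of it. It only illustrates the underlying striping arithmetic with the slide's symbols s (stripe size) and l (stripe count), assuming plain round-robin striping that starts at target index 0. The function name `stripe_targets` is hypothetical, and a real Lustre file may start on a different storage target.

```python
def stripe_targets(offset, nbytes, s, l):
    """Return the stripe-target indices (0..l-1) touched by a byte range.

    Assumption: round-robin striping, stripe k of the file lives on target k % l.
    """
    if nbytes <= 0:
        return []
    first = offset // s                   # index of the first stripe touched
    last = (offset + nbytes - 1) // s     # index of the last stripe touched
    # l consecutive stripes already cover every target, so stop there.
    last = min(last, first + l - 1)
    return [stripe % l for stripe in range(first, last + 1)]

print(stripe_targets(0, 50, s=100, l=4))     # [0]
print(stripe_targets(350, 100, s=100, l=4))  # [3, 0]
print(stripe_targets(250, 300, s=100, l=4))  # [2, 3, 0, 1]
```

Two requests whose target lists intersect would contend for the same storage server under this model, which is the kind of locality information the slides build on.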

Slide11

Hila Algorithm: Prior Step

Prior Step: calculate sub-correlation set, one time analysis

Slide12

Hila Algorithm: Decomposition

Main Steps: Request Decomposition and Aggregation

Slide13

Improvement with Hila

Performance Improved with Hila

Slide14

Improvement with Hila

FASM Improved with Hila

Slide15

Conclusion and Future Work

Conclusion

The mismatching between logical access and physical layout can lead to poor performance.

We propose the locality-driven high-level aggregation approach (HiLa) to facilitate the existing I/O methods by eliminating the overlapping among sub-array requests.

Future Work

Apply to write operations

Integrate with file systems.

Slide16

Locality-driven High-level I/O Aggregation for Processing Scientific Datasets

Thanks

Q&A

http://discl.cs.ttu.edu
