



Presentation Transcript

Slide1

A study of delta sync and other optimisations in HTTP/WebDAV synchronisation protocols

Do we need changes in the ownCloud protocol?

Wojciech Jarosz

AGH University of Science and Technology / CERN

Slide2

Introduction

- ownCloud protocol, CERNBox service
- Enhancing the current protocol
- Investigation of the following enhancements:
  - Bundling
  - Delta-syncing
  - Compression
  - Chunk size adjustment
- Context: scientific environment at CERN

CS3 Zurich, January 2016

Slide3

Introduction

Data from CERNBox FS and network logs

Slide4

CERNBox

Distinguishing features:
- Integrated with 80 PB of physics data
- Future: easy and effective sharing of experiment results
- Future: focus on scientific usage
- Currently: a mix of scientific and personal use

Slide5

CERNBox as of Oct 15

- ~31 TB of data
- ~3700 users
- ~24 million files in ~3 million directories
- Average file size: ~1.3 MB; median file size: < 100 kB
- 200k file uploads/downloads per day
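The average file size quoted above can be cross-checked against the totals on the same slide. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes) and treats the rounded slide figures as exact.

```python
# Figures quoted on the slide (approximate values, treated as exact here).
total_bytes = 31e12      # ~31 TB, assuming decimal terabytes
file_count = 24e6        # ~24 million files

average_file_size = total_bytes / file_count
print(f"average file size: {average_file_size / 1e6:.2f} MB")
```

The result is consistent with the ~1.3 MB on the slide. A median below 100 kB alongside this average suggests a distribution dominated by many small files with a few large ones.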

Slide6

File sizes

Slide7

File count and size

[Figure label: No extension]

Slide8

Where are the transfers coming from?

Slide9

Downloads vs Uploads

Slide10

Protocol - chunking

Could be used for:
- partial upload
- delta-sync
- deduplication

Is the chunk size chosen correctly?
- Most of the files are small
- Modern protocols should use network-aware chunking
- Currently only ~0.15% of all PUTs are chunked

Is dynamic chunking a viable option?
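One way to picture network-aware (dynamic) chunking is to size each chunk so that it takes roughly a fixed time to upload at the currently measured throughput. The sketch below is purely illustrative and is not how the ownCloud client works: the function name, the 5-second target and the 1 MiB / 100 MiB limits are all assumptions made for the example.

```python
def next_chunk_size(throughput_bps, target_seconds=5.0,
                    min_chunk=1 * 1024 * 1024, max_chunk=100 * 1024 * 1024):
    """Pick a chunk size (bytes) so one chunk takes about target_seconds.

    throughput_bps is the measured upload throughput in bits per second.
    All defaults are hypothetical values chosen for illustration.
    """
    ideal = int(throughput_bps / 8 * target_seconds)
    # Clamp so very slow or very fast links still get a sane chunk size.
    return max(min_chunk, min(max_chunk, ideal))


# A slow link falls back to the minimum, a fast link hits the maximum.
print(next_chunk_size(100_000))        # 100 kbit/s
print(next_chunk_size(3_600_000))      # 3.6 Mbit/s
print(next_chunk_size(400_000_000))    # 400 Mbit/s
```

With a fixed chunk size, a small file is sent as one plain PUT regardless of the link, which would fit the observation that very few PUTs are chunked at all.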

Slide11

Enhancements to the current ownCloud protocol

Focus on bundling, delta-sync and compression

Slide12

Bundling

Typically users are active only a few days a month

Slide13

Bundling

Even power users work in cycles

Slide14

Bundling

- Typically users are active only a few days a month
- Often over 2000 requests in 10 minutes
- Small file size

Implementation?
- Simple bundling: tarball?
- Choose the right bundle size
- Send chunks in parallel
- Error reporting
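The "simple bundling" idea above can be sketched with Python's standard `tarfile` module: small files are grouped greedily up to a bundle size, packed into one tar archive on the client, and unpacked on the server. This is a minimal sketch under stated assumptions, not a proposed protocol: the helper names (`plan_bundles`, `make_tarball`, `unpack_tarball`) and the bundle-size limit are hypothetical, and parallel sending and error reporting are left out.

```python
import io
import tarfile


def plan_bundles(files, max_bundle_bytes):
    """Greedily group (name, data) pairs into bundles of at most max_bundle_bytes.

    A single file larger than the limit ends up in a bundle of its own.
    """
    bundles, current, current_size = [], [], 0
    for name, data in files:
        if current and current_size + len(data) > max_bundle_bytes:
            bundles.append(current)
            current, current_size = [], 0
        current.append((name, data))
        current_size += len(data)
    if current:
        bundles.append(current)
    return bundles


def make_tarball(bundle):
    """Client side: pack one bundle into an in-memory, uncompressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, data in bundle:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack_tarball(payload):
    """Server side: return {name: data} for every member of the archive."""
    with tarfile.open(fileobj=io.BytesIO(payload), mode="r") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}


files = [("a.txt", b"x" * 40), ("b.txt", b"y" * 40), ("c.txt", b"z" * 40)]
bundles = plan_bundles(files, max_bundle_bytes=100)
payloads = [make_tarball(b) for b in bundles]   # one request body per bundle
```

Note that tar adds a 512-byte header per member and pads data to 512-byte blocks, so for very small files the archive overhead is not negligible and would have to be weighed against the saved requests.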

[Figure: tar / untar]

Slide15

Bundling

Reduce TCP slow-start effect

DROPBOX [1]
                    Before bundling    After bundling
Median flow size    16.2 kB            42.4 kB
Throughput PUT      358 kbit/s         552.92 kbit/s
Throughput GET      783 kbit/s         1294 kbit/s

CERNBOX *
                    Before bundling    After bundling
Throughput PUT      ~3600 kbit/s       Up to 400 Mbit/s ?
Throughput GET      ~7653 kbit/s       Up to 500 Mbit/s ?
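Why bundling reduces the TCP slow-start effect can be illustrated with a deliberately idealised model: the congestion window starts at a fixed number of segments and doubles every round trip, with no loss and no receiver limit. The sketch below counts round trips only. The initial window of 10 segments, the 1460-byte segment size, the file size, and the assumption that every separate transfer starts again from the initial window are all illustrative choices, not measurements from CERNBox or from [1].

```python
import math


def slow_start_rounds(size_bytes, mss=1460, initial_window=10):
    """Round trips needed to send size_bytes under idealised slow start.

    Assumes no loss, no receiver-window limit, and a congestion window
    that starts at initial_window segments and doubles every round trip.
    """
    segments = math.ceil(size_bytes / mss)
    window, sent, rounds = initial_window, 0, 0
    while sent < segments:
        sent += window
        window *= 2
        rounds += 1
    return rounds


small_file = 14_600                              # hypothetical file: 10 segments
separate = 100 * slow_start_rounds(small_file)   # 100 transfers, each from a cold window
bundled = slow_start_rounds(100 * small_file)    # one transfer carrying all 100 files
print(separate, bundled)                         # prints "100 7"
```

In this toy model the bundled transfer needs far fewer round trips because the window keeps growing across file boundaries. Real connections that are kept alive between requests would sit somewhere between the two extremes.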

[1] I. Drago, M. Mellia, M. M. Munafò, A. Sperotto, R. Sadre, and A. Pras. Inside Dropbox: Understanding Personal Cloud Storage Services. In Proceedings of the 12th ACM Internet Measurement Conference, IMC'12, pages 481–494, 2012.

* Based on users inside CERN and affiliated institutions

Slide16

Extensions and file sizes

Slide17

Delta-sync

- About 7.8% of the files are versions
- Typically files are modified the same day
- Usually small files

Slide18

ROOT files

- Scientific software framework
- Complex file structure
- Already compressed
- Small changes scattered throughout the file

Slide19

Delta-sync

Possible implementations:
- Chunk-based
- Byte-range request

More data and simulation needed
It might not be worth implementing
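A chunk-based delta-sync can be sketched as follows: both versions of a file are cut into fixed-size chunks, each chunk is hashed, and only chunks whose hash differs need to be transferred. This is a minimal illustration with hypothetical helper names and an arbitrary chunk size. It compares chunks at the same offsets only, so inserting bytes near the start of a file would shift every later chunk and make them all look changed.

```python
import hashlib


def chunk_digests(data, chunk_size):
    """SHA-256 digest of every fixed-size chunk of data."""
    return [hashlib.sha256(data[i:i + chunk_size]).digest()
            for i in range(0, len(data), chunk_size)]


def changed_chunks(old, new, chunk_size):
    """Indices of chunks in `new` that differ from the same-position chunk in `old`."""
    old_digests = chunk_digests(old, chunk_size)
    new_digests = chunk_digests(new, chunk_size)
    return [i for i, digest in enumerate(new_digests)
            if i >= len(old_digests) or digest != old_digests[i]]


old = bytes(1000)

# One localised edit touches a single chunk.
local_edit = bytearray(old)
local_edit[5] = 1
print(changed_chunks(old, bytes(local_edit), chunk_size=100))   # [0]

# Five one-byte edits scattered through the file touch five chunks.
scattered = bytearray(old)
for offset in (5, 205, 405, 605, 805):
    scattered[offset] = 1
print(changed_chunks(old, bytes(scattered), chunk_size=100))    # [0, 2, 4, 6, 8]
```

The second case shows why files with small changes scattered throughout, as described for ROOT files, may benefit little: five changed bytes already force half of the chunks to be re-sent in this example.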

Slide20

Compression

- Of the top 20 extensions (by size), only .txt will compress well
- Compression can be slow, but almost all requests are executed from desktop clients
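The point that only text-like files compress well can be demonstrated with `zlib` from the Python standard library. The sketch below uses synthetic data, not CERNBox files: a repetitive text buffer stands in for a .txt file, and random bytes stand in for an already-compressed file, since both are essentially incompressible.

```python
import os
import zlib


def compression_ratio(data, level=6):
    """Compressed size divided by original size (lower is better)."""
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, level)) / len(data)


# Synthetic stand-ins, chosen only for illustration.
text_like = b"event 42: pt=13.7 eta=0.4 phi=1.2\n" * 3000
already_compressed = os.urandom(len(text_like))

print(f"text-like:          {compression_ratio(text_like):.3f}")
print(f"already compressed: {compression_ratio(already_compressed):.3f}")
```

The text-like buffer shrinks to a small fraction of its size, while the incompressible buffer stays at roughly its original size, so compressing it would only cost CPU time.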

Slide21

Future

Slide22

Future - service

- CERNBox fully exposed to a very large scientific repository (ATLAS, LHCb, CMS…)
- FUSE mount to the underlying CERNBox storage available everywhere at CERN
- Will users use CERNBox in new ways?

Slide23

Conclusion

- The ownCloud protocol is simple, but is it enough?
- Understand before implementation
- Work in progress! MSc at AGH

Slide24

Conclusion

- Bundling looks like the most viable enhancement
- Further research is needed for delta-sync and dynamic chunking
- Compression is less likely to enhance the current protocol

Slide25

Contact details

Wojciech Jarosz
Wojciech.Jarosz@cern.ch
+41 22 76 75970

Opinions / questions most welcome!
- How does the usage compare to your system?
- How to implement the new features?
- Feedback, ideas, comments…