Presentation Transcript

Slide1

Storage Optimization Strategies

Techniques for configuring your Progress OpenEdge Database in order to minimize IO operations

Tom Bascom, White Star Software

tom@wss.com

Slide2

A Few Words about the Speaker

Tom Bascom; Progress user & roaming DBA since 1987
President, DBAppraise, LLC
Remote database management service for OpenEdge.
Simplifying the job of managing and monitoring the world’s best business applications.

tom@dbappraise.com

VP, White Star Software, LLC

Expert consulting services related to all aspects of Progress and OpenEdge.
tom@wss.com

Slide3

We Will NOT be Talking about:

SANs
Servers
Operating systems
RAID levels
… and so forth.

Slide4

What Do We Mean by “Storage Optimization”?

The trade press thinks it means BIG DISKS.
Your CFO thinks it means BIG SAVINGS.
Programmers think it means BIG DATABASES.
SAN vendors think it means BIG COMMISSIONS.

DBAs seek the best possible reliability and performance at a reasonable cost.

Slide5

The Foundation of OpenEdge Storage Optimization

Slide6

Type 2 Storage Areas

Type 2 storage areas are the foundation for all advanced features of the OpenEdge database.

Type 2 areas have cluster sizes of 8, 64 or 512.

Cluster sizes of 0 or 1 denote Type 1 areas.
Data blocks in Type 2 areas contain data from just one table.

# misc32 storage area

d "misc32_dat":11,32;8 .
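For context, the fields in that line are the area name, the area number, rows per block, and blocks per cluster, followed by the extent specification. A minimal sketch of a structure file fragment that defines separate Type 2 data and index areas; the names, numbers, sizes, and paths here are illustrative assumptions, not from this deck:

# hypothetical add.st: "name":area-number,rows-per-block;blocks-per-cluster extent [f size-in-KB]
d "order_dat":12,64;512 /db/prod/order_dat.d1 f 1024000
d "order_dat":12,64;512 /db/prod/order_dat.d2
d "order_idx":13,64;8 /db/prod/order_idx.d1

A file like this can be applied to an existing database with prostrct add (or, on recent releases, prostrct addonline).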

Slide7

Only Read What You Need

Because data blocks in Type 2 storage areas are “asocial”:

Locality of reference is leveraged more strongly.
Table-oriented utilities such as index rebuild, binary dump and so forth know exactly which blocks they need to read and which blocks they do not need to read.
DB features, such as the SQL-92 fast table scan and fast table drop, can operate much more effectively.

Slide8

MYTH

Storage optimization is just for large tables.
Type 2 storage areas are just for large tables.

Slide9

Truth

Very small, yet active tables often dominate an application’s IO profile.
And Type 2 areas are a very powerful tool for addressing this.

Slide10

Case Study

A system with 30,000 record reads/sec:
The bulk of the reads were from one 10,000 record table.
Coincidentally, Big B (–B) was set to 10,000.
That table was in a Type 1 area, and its records were widely scattered.
Moving the table to a Type 2 area fixed the problem: only 2% of –B was now needed for this table!
Performance improved dramatically.
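A hedged sketch of the kind of move involved, using proutil’s tablemove qualifier (the database, table, and area names here are illustrative assumptions):

# hypothetical: relocate the hot table and its indexes into Type 2 areas
$ proutil proddb -C tablemove flags flags_dat flags_idx

tablemove takes heavy locks on the table, so it is normally run offline or during a quiet window.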

Slide11

Type 2 Storage Area Usage

Always use Type 2 areas for areas that contain data, indexes or LOBs.
The schema area is a Type 1 area.

Slide12

How to Define Your Storage Areas

Slide13

Use the Largest DB Block Size

Large blocks reduce IO; fewer operations are needed to move the same amount of data.
More data can be packed into the same space because there is proportionally less overhead.
Because a large block can contain more data, it has improved odds of being a cache “hit.”
Large blocks enable HW features to be leveraged: especially SAN HW.

Slide14

What about Windows?

There are those who would say “except for Windows.” (Because Windows is a 4K-oriented OS.)
I have had good success with Windows & 8k blocks.
NTFS can be changed to use an 8k block…
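As a hedged illustration of that last point (the drive letter is hypothetical, and formatting erases the volume, so this applies to a new or rebuilt volume): the NTFS allocation unit size is chosen at format time.

C:\> format E: /FS:NTFS /A:8192

After that, each 8k database block maps onto a single 8k NTFS cluster.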

Slide15

Use Many (Type 2) Storage Areas

Do NOT assign tables to areas based on “function.”
Instead group objects by common “technical attributes.”
Create distinct storage areas for:
Each very large table
Tables with common Rows Per Block settings
Indexes versus data

Slide16

Record Fragmentation

Slide17

Fragmentation and Scatter

“Fragmentation” is splitting records into multiple pieces.
“Scatter” is the distance between (logically) adjacent records.

Slide18

Fragmentation and Scatter

“Fragmentation” is splitting records into multiple pieces.
“Scatter” is the distance between (logically) adjacent records.

$ proutil dbname -C dbanalys > dbname.dba

RECORD BLOCK SUMMARY FOR AREA "APP_FLAGS_Dat" : 95
-------------------------------------------------------
                                -Record Size (B)-   -Fragments-  Scatter
Table           Records   Size   Min  Max  Mean     Count Factor  Factor
PUB.APP_FLAGS   1676180   47.9M   28   58    29   1676190    1.0     1.9

(1,676,190 fragments for 1,676,180 records is a fragment factor of 1.0: essentially no records are split.)

Slide19

Fragmentation and Scatter

(Repeats the definitions and dbanalys output from the previous slide.)

Slide20

Create Limit

The minimum free space in a block.
Provides room for routine record expansion.
OE 10.2B default is 150 (4k & 8k blocks).
Must be smaller than the Toss Limit.
Only rarely worth adjusting.

Slide21

Toss Limit

The minimum free space required to be on the “RM Chain.”
Avoids looking for space in blocks that don’t have much.
Must be set higher than the Create Limit.
Default is 300 (4k & 8k blocks).
Ideally should be less than the average row size.
Only rarely worth adjusting.

Slide22

Fragmentation, Create & Toss Summary

Slide23

Create and Toss Limit Usage

Symptom: Fragmentation occurs on updates to existing records; you anticipated one fragment, but two were created.
Action:  Increase Create Limit - or - Decrease Rows Per Block

Symptom: There is limited (or no) fragmentation, but database block space is being used inefficiently, and records are not expected to grow beyond their original size.
Action:  Decrease Create Limit - or - Increase Rows Per Block

Symptom: You have many (thousands, not hundreds) of blocks on the RM chain with insufficient space to create new records.
Action:  Increase Toss Limit

* Create and Toss limits are per area for Type 1 areas and per table for Type 2 areas.
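The limits themselves can be adjusted with proutil; a hedged sketch under the assumption of a Type 2 area (the database, table name, and values are illustrative):

# hypothetical: per-table limits (Type 2 areas only)
$ proutil proddb -C settablecreatelimit APP_FLAGS 300
$ proutil proddb -C settabletosslimit APP_FLAGS 500

Corresponding setareacreatelimit and setareatosslimit qualifiers apply at the area level.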

Slide24

Rows Per Block

Slide25

Why not “One Size Fits All”?

A universal setting such as 128 rows per block seems simple.
And for many situations it is adequate.
But…
Too large a value may lead to fragmentation; too small, to wasted space.
It also makes advanced data analysis more difficult.
And it really isn’t that hard to pick good values for RPB.

Slide26

Set Rows Per Block Optimally

Use the largest Rows Per Block that:
Fills the block
But does not unnecessarily fragment it

Rough Guideline: next power of 2 after BlockSize / (AvgRecSize + 20)
Example: 8192 / (220 + 20) = 34, next power of 2 = 64

Caveat: there are far more complex rules that can be used, and a great deal depends on the application’s record creation & update behavior.

# misc32 storage area
d "misc32_dat":11,32;8 .
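A hedged sketch of the same guideline as a shell one-liner (it assumes the mean record size of 220 has already been read from dbanalys; the 20 bytes approximate per-record overhead):

$ echo 8192 220 | awk '{ t = int($1 / ($2 + 20)); r = 1; while (r < t) r *= 2; print t, r }'
34 64

Valid RPB settings are powers of 2 up to 256, so read the doubling as capped there.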

Slide27

RPB Example

Slide28

Set Rows Per Block Optimally

BlkSz  RPB  Blocks  Disk (KB)  Waste/Blk  %Used    Actual RPB  IO/1,000 Recs
  1      4   3,015      3,015        124  86%               3            333
  4      4   2,500     10,000      2,965  23%               4            250
  4      8   1,250      5,000      2,075  46%               8            125
  4     16     627      2,508        295  92%              16             62
  4     32     596      2,384        112  97%              17             59
  8      4   2,500     20,000      7,060  11%               4            250
  8     16     625      5,000      4,383  44.76%           16             62
  8     32     313      2,504        806  90%              32             31
  8     64     286      2,288        114  98%              35             29
  8    128     285      2,280        109  98%              35             29

(The slide flags one row as “Original.”)

Slide29

Set Rows Per Block Optimally

(Repeats the table from the previous slide, now with an “Oops!” flag added.)

Slide30

Set Rows Per Block Optimally

(Repeats the table again, now flagging both the “Original” and the “Suggested” rows.)

Slide31

Rows Per Block Caveats

Blocks have overhead, which varies by storage area type, block size, Progress version and by tweaking the create and toss limits.
Not all data behaves the same:
Records that are created small and that grow frequently may tend to fragment if RPB is too high.
Record size distribution is not always Gaussian.
If you’re unsure – round up!

Slide32

Cluster Size

Slide33

Blocks Per Cluster

When a Type 2 area expands it will do so a cluster at a time.
Larger clusters are more efficient:
Expansion occurs less frequently.
Disk space is more likely to be contiguously arranged.

# misc32 storage area
d "misc32_dat":11,32;8 .

Slide34

Why not “One Size Fits All”?

A universal setting such as 512 blocks per cluster seems simple…

Slide35

Set Cluster Size Optimally

There is no advantage to having a cluster more than twice the size of the table.
Except that you need a cluster size of at least 8 to be Type 2.
Indexes are usually much smaller than data.
There may be dramatic differences in the size of indexes even on the same table.
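A hedged sketch of what that can mean in a structure file (the area names, numbers, and paths are illustrative assumptions): the one large index gets a large-cluster area, while the small indexes share a minimum Type 2 cluster size of 8.

d "app_flags_idx_big":21,64;64 /db/prod/flags_idx_big.d1
d "app_flags_idx_sml":22,64;8 /db/prod/flags_idx_sml.d1

The dbanalys index summary on the next slide is the kind of evidence that justifies such a split.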

Slide36

Different Index Sizes

$ proutil dbname -C dbanalys > dbname.dba
…
RECORD BLOCK SUMMARY FOR AREA "APP_FLAGS_Dat" : 95
-------------------------------------------------------
                                -Record Size (B)-   -Fragments-  Scatter
Table           Records   Size   Min  Max  Mean     Count Factor  Factor
PUB.APP_FLAGS   1676180   47.9M   28   58    29   1676190    1.0     1.9

INDEX BLOCK SUMMARY FOR AREA "APP_FLAGS_Idx" : 96
-------------------------------------------------------
Table           Index                 Flds  Lvls  Blks  Size    %Util  Factor
PUB.APP_FLAGS   AppNo            183     1     3  4764  37.1M    99.9     1.0
                FaxDateTime      184     2     2    45  259.8K   72.4     1.6
                FaxUserNotified  185     2     2    86  450.1K   65.6     1.7

Slide37

Logical Scatter

Slide38

Logical Scatter Case Study

A process reading approximately 1,000,000 records.
An initial run time of 2 hours.
139 records/sec.
Un-optimized database.

Slide39

Perform IO in the Optimal Order

4k DB Block:
Table    Index      %Sequential  %Idx Used  Density
Table1   t1_idx1*            0%       100%     0.09
         t1_idx2             0%         0%     0.09
Table2   t2_idx1            69%        99%     0.51
         t2_idx2*           98%         1%     0.51
         t2_idx3            74%         0%     0.51

Slide40

Perform IO in the Optimal Order

4k DB Block:
Table    Index      %Sequential  %Idx Used  Density
Table1   t1_idx1*            0%       100%     0.09
         t1_idx2             0%         0%     0.09
Table2   t2_idx1            69%        99%     0.51
         t2_idx2*           98%         1%     0.51
         t2_idx3            74%         0%     0.51

8k DB Block:
Table    Index      %Sequential  %Idx Used  Density
Table1   t1_idx1*           71%       100%     0.10
         t1_idx2            63%         0%     0.10
Table2   t2_idx1            85%        99%     1.00
         t2_idx2*          100%         1%     1.00
         t2_idx3            83%         0%     0.99

Slide41

Perform IO in the Optimal Order

(Repeats the 4k vs. 8k comparison from the previous slide, now with an “Oops!” flag added.)

Slide42

Logical Scatter Case Study

Block Size  Hit Ratio  %Sequential  Block References  IO Ops  Time (min)
4k                 95           69           319,719  19,208         120
4k                 98           69           320,149   9,816          60
4k                 99           69           320,350   6,416          40
8k                 95           85           160,026   9,417          55
8k                 98           85           159,805   4,746          30
8k                 99           85           160,008   3,192          20

The process was improved from an initial runtime of roughly 2 hours (top line) to approximately 20 minutes (bottom line) by moving from 4k blocks and 69% sequential access at a hit ratio of approximately 95% to 8k blocks, 85% sequential access and a hit ratio of 99%.

Slide43

Avoid IO, But If You Must…

Slide44

… in Big B You Should Trust!

Layer              Time (s)  # of Recs  # of Ops  Cost per Op  Relative
Progress to –B         0.96    100,000   203,473     0.000005         1
–B to FS Cache        10.24    100,000    26,711     0.000383        75
FS Cache to SAN        5.93    100,000    26,711     0.000222        45
–B to SAN Cache*      11.17    100,000    26,711     0.000605       120
SAN Cache to Disk    200.35    100,000    26,711     0.007500      1500
–B to Disk           211.52    100,000    26,711     0.007919      1585

* Used concurrent IO to eliminate FS cache
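The practical takeaway is to make the primary buffer pool as large as memory sensibly allows; a hedged sketch of broker startup parameters (the database name and values are illustrative assumptions):

# hypothetical: ~4 GB of buffers at an 8k block size
$ proserve proddb -B 500000 -spin 10000

Each hit in –B avoids the layers below it in the table, which are 75 to 1585 times more expensive.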

Slide45

New Feature!

10.2B supports a new feature called “Alternate Buffer Pool.”
This can be used to isolate specified database objects (tables and/or indexes).
The alternate buffer pool has its own distinct –B (the –B2 startup parameter).
If the database objects are smaller than –B2, there is no need for the LRU algorithm.
This can result in major performance improvements for small, but very active, tables.

proutil dbname -C enableB2 areaname

Table and Index level selection is for Type 2 only!
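A hedged end-to-end sketch (the database, area name, and sizes are illustrative assumptions): enable the alternate buffer pool for the hot area, then start the broker with a –B2 sized to hold that area entirely.

$ proutil proddb -C enableB2 flags_dat
$ proserve proddb -B 500000 -B2 25000

Because the objects then fit entirely within –B2, no LRU replacement occurs in the alternate pool.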

Slide46

Conclusion

Always use Type 2 storage areas.
Define your storage areas based on technical attributes of the data.
Static analysis isn’t enough – you need to also monitor runtime behaviors.
White Star Software has a great deal of experience in optimizing storage. We would be happy to engage with any customer that would like our help!

Slide47

Thank You!

Slide48

Questions?
