Variations on Balanced Trees

Uploaded On 2015-09-25



Presentation Transcript

Slide1

Variations on Balanced Trees
Lazy Red-Black Trees

Stefan Kahrs

Slide2

Overview
some general introduction on BSTs
some specific observations on red-black trees
how we can make them lazy, and why we may want to
conclusions

Slide3

Binary Search Trees
commonly used data structure to implement sets or finite maps (only keys shown):
[tree diagram; node keys as extracted: 56, 33, 227]

Slide4

A problem with ordinary BSTs
on random data, searching, inserting or deleting an entry takes O(log(n)) time, where n is the number of entries, but...
if the data is biased then this can deteriorate to O(n)
...and thus building a tree can deteriorate to O(n^2)

Slide5
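The degradation described on slide 4 can be reproduced with a plain, unbalanced BST. The following is a minimal sketch, not code from the talk: the class and method names (`BstDepthDemo`, `insert`, `depth`, `build`) are hypothetical, and "biased data" is assumed here to mean keys arriving in sorted order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class BstDepthDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain, unbalanced BST insertion in a loop; duplicate keys are ignored.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        Node n = root;
        while (true) {
            if (key < n.key) {
                if (n.left == null) { n.left = new Node(key); return root; }
                n = n.left;
            } else if (key > n.key) {
                if (n.right == null) { n.right = new Node(key); return root; }
                n = n.right;
            } else {
                return root;
            }
        }
    }

    // Number of nodes on the longest path from this node down to a leaf.
    static int depth(Node n) {
        return n == null ? 0 : 1 + Math.max(depth(n.left), depth(n.right));
    }

    static Node build(List<Integer> keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) keys.add(i);
        // Sorted input: every key goes to the right, so the tree is a chain.
        System.out.println("sorted input:   depth " + depth(build(keys)));
        // Shuffled input (assumed stand-in for "random data").
        Collections.shuffle(keys, new Random(42));
        System.out.println("shuffled input: depth " + depth(build(keys)));
    }
}
```

With sorted input the tree is a chain of 1000 nodes, so each insertion walks the whole chain, which is where the quadratic build time comes from. The shuffled build is far shallower.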

Therefore...
people have come up with various schemes that make trees self-balance
the idea is always that insertion/deletion pay an O(log(n)) tax to maintain an invariant
the invariant guarantees that search, insert and delete all perform in logarithmic time

Slide6

Well-known invariants for trees
Braun trees: sizes of left/right subtrees differ by at most 1 (too strong for search trees, O(n^0.58))
AVL trees: depths of left/right subtrees differ by at most 1
2-3-4 trees: a node has 1 to 3 keys and 2 to 4 subtrees (special case of B-tree)
Red-Black trees: an indirect realisation of 2-3-4 trees

Slide7

Red-Black Tree
BST with an additional colour field which can be RED or BLACK
invariant 1: red nodes have only black children, root/nil are black
    thus, a non-empty black node has between 2 and 4 black children (counting those reached through a red child)
invariant 2: all paths to leaves go through the same number of black nodes

Slide8
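The two invariants from slide 7 can be checked mechanically. The sketch below is an illustration only: the `RbInvariants` class, its `Node` layout and the helper names are assumptions, and `null` is taken to stand for the black nil leaf.

```java
final class RbInvariants {
    static final class Node {
        final int key;
        final boolean red;
        final Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key; this.red = red; this.left = left; this.right = right;
        }
    }

    static Node red(int key, Node l, Node r)   { return new Node(key, true, l, r); }
    static Node black(int key, Node l, Node r) { return new Node(key, false, l, r); }

    // null plays the role of nil, which counts as black.
    static boolean isRed(Node n) { return n != null && n.red; }

    // Invariant 1: the root is black and no red node has a red child.
    static boolean colourOk(Node root) {
        return !isRed(root) && noRedRed(root);
    }

    static boolean noRedRed(Node n) {
        if (n == null) return true;
        if (n.red && (isRed(n.left) || isRed(n.right))) return false;
        return noRedRed(n.left) && noRedRed(n.right);
    }

    // Invariant 2: returns the number of black nodes on every path from n
    // down to nil (nil included), or -1 if two paths disagree.
    static int blackHeight(Node n) {
        if (n == null) return 1;
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l < 0 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }
}
```

Keeping the two checks separate is deliberate: the lazy variants discussed later give up the colour invariant while still maintaining black height, so a tree can fail `colourOk` and still pass `blackHeight`.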

Example
[tree diagram; node keys as extracted: 68, 12, 83, 7, 43, 75, 96, 98, 94, 70, 7, 6]

Slide9

Perceived Wisdom
Red-Black trees are cheaper to maintain than AVL trees, though they may not be quite as balanced
pretty balanced though: the average path length of a Red-Black tree is in the worst case only 5% longer than that of a Braun tree

Slide10

Aside: a problem with balanced trees
on random data, an ordinary BST has an average path length of 2*ln(n)
this is only 38% longer than the average path length of a Braun tree
thus: most balanced tree schemes lose against ordinary BSTs on random data, because they fail to pay their tax from those 38%
red-black trees succeed though

Slide11
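The 38% figure from slide 10 can be checked with a short calculation. This assumes that the average path length of a Braun tree (a perfectly balanced tree) is approximately log_2(n), and it ignores lower-order terms on both sides:

```latex
\[
\frac{2\ln n}{\log_2 n}
  \;=\; \frac{2\ln n}{\ln n / \ln 2}
  \;=\; 2\ln 2
  \;\approx\; 1.386
\]
```

So the random BST's paths are longer by a factor of about 1.39, which is the 38% budget a balancing scheme has to stay within to win on random data.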

Algorithms on RB trees
search: unchanged, ignores colour
insert:
    insert as in BST (a fresh red node)
    rotate subtrees until colour violation goes away
    colour root black
delete (more complex than insert):
    delete as in BST
    if underflow, rotate from siblings until underflow goes away

Slide12

Example
[tree diagram, as on slide 8 with key 69 added; node keys as extracted: 68, 12, 83, 7, 43, 75, 96, 98, 94, 70, 7, 6, 69]

Slide13

Example
[tree diagram, next step; node keys as extracted: 68, 12, 83, 7, 43, 75, 96, 98, 94, 70, 7, 6, 69]

Slide14

Example
[tree diagram, next step; node keys as extracted: 68, 12, 83, 7, 43, 75, 96, 98, 94, 70, 7, 6, 69]

Slide15

Standard Imperative Algorithm
find the place of insertion in a loop
check with your parent whether you’re a naughty child, and correct behaviour if necessary, by going up the tree

Slide16

Problem with this
Question: how do you go up the tree?
Answer: children should know their parent.
Which means: trees in imperative implementations are often not proper trees; every link consists of two pointers

Slide17

Functional Implementations
in a pure FP language such as Haskell you don’t have pointer comparison, and so parent pointers won’t work
instead we do something like this:
    insert x tree = repair (simplInsert x tree)
simplInsert inserts data in a subtree and produces a tree with a potential invariant violation at the top; repair fixes that
the ancestors sit on the recursion stack

Slide18
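The scheme from slide 17, `insert x tree = repair (simplInsert x tree)`, can be sketched with immutable nodes. The slides do not spell out `repair`; the version below swaps in the well-known Okasaki-style balancing step as an assumption, and the names `RecursiveRb`, `ins`, `height` and `contains` are hypothetical.

```java
final class RecursiveRb {
    static final class Node {
        final boolean red;
        final int key;
        final Node left, right;
        Node(boolean red, int key, Node left, Node right) {
            this.red = red; this.key = key; this.left = left; this.right = right;
        }
    }

    static boolean isRed(Node n) { return n != null && n.red; }
    static Node red(int key, Node l, Node r)   { return new Node(true, key, l, r); }
    static Node black(int key, Node l, Node r) { return new Node(false, key, l, r); }

    // Insert, then colour the root black.
    static Node insert(int x, Node t) {
        Node r = ins(x, t);
        return black(r.key, r.left, r.right);
    }

    // Each call repairs the subtree it has just inserted into, so the
    // ancestors sit on the call stack instead of behind parent pointers.
    static Node ins(int x, Node t) {
        if (t == null) return red(x, null, null);   // a fresh red node
        if (x < t.key) return repair(new Node(t.red, t.key, ins(x, t.left), t.right));
        if (x > t.key) return repair(new Node(t.red, t.key, t.left, ins(x, t.right)));
        return t;                                   // key already present
    }

    // Assumed repair step (Okasaki-style): a black node with a red child that
    // has a red child becomes a red node with two black children.
    static Node repair(Node t) {
        if (t.red) return t;
        Node l = t.left, r = t.right;
        if (isRed(l) && isRed(l.left))
            return red(l.key, black(l.left.key, l.left.left, l.left.right),
                              black(t.key, l.right, r));
        if (isRed(l) && isRed(l.right))
            return red(l.right.key, black(l.key, l.left, l.right.left),
                                    black(t.key, l.right.right, r));
        if (isRed(r) && isRed(r.left))
            return red(r.left.key, black(t.key, l, r.left.left),
                                   black(r.key, r.left.right, r.right));
        if (isRed(r) && isRed(r.right))
            return red(r.key, black(t.key, l, r.left),
                              black(r.right.key, r.right.left, r.right.right));
        return t;
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    static boolean contains(Node n, int x) {
        while (n != null && n.key != x) n = x < n.key ? n.left : n.right;
        return n != null;
    }
}
```

Note that `repair` runs after the recursive call returns, which is exactly why this formulation is not tail-recursive.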

Recursion
actually, nothing stops us from doing likewise in an imperative language, using recursive insertion (or deletion)
cost: recursive calls rather than loops
benefit: no parent pointers, which saves memory and makes all rotations cheaper
is still more expensive though...

Slide19

Can we do better?
problem is that the recursive insertion algorithm is not tail-recursive and thus not directly loopifiable: we repair after we insert
what if we turn this around?
    newinsert x tree = simplInsert x (repair tree)
this is the fundamental idea behind lazy red-black trees

Slide20

What does that mean?
we allow colour violations to happen in the first place
these violations remain in the tree
we repair them when we are about to revisit a node
this is all nicely loopifiable and requires no parent pointers

Slide21

In the imperative code
where we used to have...
    n = n.left;
...to continue in the left branch
we now have:
    n = n.left = n.left.repair();

Slide22
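The one-line change from slide 21 can be placed in a complete insertion loop. The following is only a sketch under stated assumptions: the slides do not define `repair()`, so a local rotation is assumed that acts only on a black node with a red child that itself has a red child. The class name `LazyRb` and the helpers `blackHeight` and `height` are hypothetical, and the path-length bounds quoted in the talk are not claimed for this particular `repair`.

```java
final class LazyRb {
    static final class Node {
        final int key;
        boolean red = true;              // fresh nodes start out red
        Node left, right;
        Node(int key) { this.key = key; }

        // Assumed local repair (not taken from the slides): a black node with
        // a red child that itself has a red child is rotated so that a red
        // node with two black children takes its place. Black height is
        // unchanged. Returns the (possibly new) root of this subtree.
        Node repair() {
            if (red) return this;
            Node l = left, r = right;
            if (isRed(l) && isRed(l.left)) {
                left = l.right; l.right = this; l.left.red = false;
                return l;
            }
            if (isRed(l) && isRed(l.right)) {
                Node m = l.right;
                l.right = m.left; left = m.right; m.left = l; m.right = this;
                l.red = false;
                return m;
            }
            if (isRed(r) && isRed(r.left)) {
                Node m = r.left;
                r.left = m.right; right = m.left; m.left = this; m.right = r;
                r.red = false;
                return m;
            }
            if (isRed(r) && isRed(r.right)) {
                right = r.left; r.left = this; r.right.red = false;
                return r;
            }
            return this;
        }
    }

    Node root;

    static boolean isRed(Node n) { return n != null && n.red; }

    // Loop-only insertion: each node is repaired just before we step onto it,
    // so neither parent pointers nor recursion are needed.
    void insert(int key) {
        if (root == null) { root = new Node(key); root.red = false; return; }
        root = root.repair();
        root.red = false;                       // the root is kept black
        Node n = root;
        while (n.key != key) {
            if (key < n.key) {
                if (n.left == null) { n.left = new Node(key); return; }
                n = n.left = n.left.repair();   // the line from slide 21
            } else {
                if (n.right == null) { n.right = new Node(key); return; }
                n = n.right = n.right.repair();
            }
        }
    }

    boolean contains(int key) {
        Node n = root;
        while (n != null && n.key != key) n = key < n.key ? n.left : n.right;
        return n != null;
    }

    // Black height of the subtree (nil included), or -1 if two paths disagree.
    static int blackHeight(Node n) {
        if (n == null) return 1;
        int l = blackHeight(n.left), r = blackHeight(n.right);
        if (l < 0 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

A freshly inserted red node may sit under a red parent when `insert` returns; that violation stays in the tree until a later insertion passes through the black ancestor and repairs it. Black height is preserved throughout, because new nodes are red and the assumed rotations do not change the number of black nodes on any path.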

Invariants?
the standard red-black tree invariant is broken with this (affects search)
in principle, we can have B-R-R-B-R-R-B-R-R paths, though these are rare
but this is as bad as it gets, so we do have an invariant that guarantees O(log(n))
average path lengths are similar to RB trees

Slide23

Performance?
I implemented this in Java, and the performance data were initially inconclusive (JIT compiler, garbage collection)
after forcing gc between tests, standard RB remains faster (40% faster on random inputs), though this may still be tweakable
so what is the extra cost, and can we do anything about it?

Slide24

Checks!
most nodes we visit and check are fine
especially high up in the tree, as these are constantly repaired
...and the ones low down do not matter that much anyway
so we could move from regular maintenance to student-flat maintenance, i.e. repair trees only once in a blue moon

Slide25

What?
yes, the colour invariant goes to pot with that
we do maintain black height though...
...and trust the healing powers of occasional repair: suppose we have a biased insertion sequence and don’t repair for a while...

Slide26

Example
[tree diagram; node keys as extracted: 12, 83, 7, 43, 96, 70]
suppose the tree has this shape, and now we insert a 5 in repair-mode

Slide27

Result
[tree diagram; node keys as extracted: 12, 83, 7, 43, 96, 70, 5]

Slide28

Findings
on random data, performance of lazy red-black trees is virtually unaffected, even if we perform safe-insert only 1/100 of the time
on biased data it works a bit better under student-flat, but still loses to RB (15% slower for this bias)
average tree depth: 1.5 longer than RB
    on random inputs
    also on biased inputs (where BST falls off the cliff)

Slide29

Conclusions
Ultimately: failure!
Lazy RB trees are not faster than normal ones.
On random inputs, Lazy RB trees perform very similarly to plain BSTs
Some small room for improvement, though I doubt the gap to plain RB can be closed
Perhaps other algorithms would benefit more from lazy invariant maintenance?