
## Presentation on theme: "Union Find ADT"— Presentation transcript:

Slide1

Data type for disjoint sets:

makeSet(x): Given an element x, create a singleton set that contains only this element. Return a locator/handle for x in the data structure.

find(x): Given a handle for an element x, find the set that contains x. Return a handle/identifier/pointer/label for this set.

union(A,B): Given two set identifiers, create the union of the two sets.


Slide2

Applications:

keep track of the connected components of a dynamic graph that changes due to insertion of nodes and edges

Kruskal's Minimum Spanning Tree Algorithm
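To illustrate the second application, here is a minimal Python sketch of Kruskal's algorithm on top of a deliberately naive union-find (no balancing, no path compression). The function name `kruskal`, the `(weight, u, v)` edge format and the 0-based node numbering are assumptions made for this example, not part of the slides.

```python
def kruskal(num_nodes, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning forest.

    edges is a list of (weight, u, v) with nodes numbered 0..num_nodes-1.
    """
    parent = list(range(num_nodes))  # makeSet for every node

    def find(x):
        # Walk up to the representative of the set that contains x.
        while parent[x] != x:
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # u and v lie in different components
            parent[ru] = rv            # union of the two components
            total += w
            chosen.append((u, v, w))
    return total, chosen
```

An edge is kept exactly when its endpoints are still in different sets, which is the connected-components bookkeeping from the first application.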


Slide3

List Implementation:

the elements of a set are stored in a list; each node has a backward pointer to the tail

the tail of the list contains the label for the set

makeSet(x) needs constant time

find(x) also needs constant time


Slide4

Union:

take the smaller of the two sets

change all its backward pointers to the label of the larger set

insert the smaller list at the head of the larger

time O(min(|A|,|B|))
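The list implementation and its union rule could be sketched in Python as follows. This is a simplified illustration under stated assumptions: a dictionary from element to label stands in for the backward pointers, Python lists stand in for the linked lists (the smaller list is appended rather than inserted at the head), and the class name `ListUnionFind` is made up for this example.

```python
class ListUnionFind:
    def __init__(self):
        self.label = {}    # element -> label of its set (the "backward pointer")
        self.members = {}  # label -> list of the elements in that set

    def make_set(self, x):
        # Constant time: a singleton set labelled by x itself.
        self.label[x] = x
        self.members[x] = [x]
        return x

    def find(self, x):
        # Constant time: follow the single backward pointer.
        return self.label[x]

    def union(self, a, b):
        """Merge the sets with labels a and b; return the surviving label."""
        if a == b:
            return a
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                  # make a the larger set
        small = self.members.pop(b)
        for x in small:                  # relabel only the smaller set
            self.label[x] = a
        self.members[a].extend(small)
        return a
```

Only the elements of the smaller set are touched, which matches the O(min(|A|,|B|)) bound.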


Slide5

Union Find

Lemma. The amortized running times are:

find: O(1)

makeSet: O(log n)

union: O(1)

Proof idea: We partially charge the cost of a union operation to the elements involved. In total we charge at most O(log n) to an element. Since each element has to be created with makeSet(x), we get an amortized time bound by inflating the cost of makeSet(x) to O(log n).

find: actual cost and amortized cost are the same.

union(A,B): if A == B, do nothing; the check takes time O(1). Otherwise add the smaller set to the larger and charge the cost for this to the elements in the smaller set; each element is charged one. Nothing beyond the O(1) check is left to be paid by the union itself.


Slide6

Union Find

How much do we charge to an element?

Observation. Whenever we charge one to an element x, the size of the subset A_x that contains x increases by at least a factor of 2. Hence the total charge to an element is at most O(log n).
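The doubling argument can be checked empirically. The following Python sketch merges n singleton sets in random order, always relabelling the smaller set, and counts how often each element is charged; the function name `max_charge` and the random merge order are assumptions of this example.

```python
import math
import random

def max_charge(n, seed=0):
    """Merge n singletons in random order; return the largest per-element charge."""
    rng = random.Random(seed)
    members = {i: [i] for i in range(n)}   # label -> elements of that set
    charge = [0] * n                        # how often each element was relabelled
    while len(members) > 1:
        a, b = rng.sample(sorted(members), 2)
        if len(members[a]) < len(members[b]):
            a, b = b, a                     # a is the larger set
        for x in members.pop(b):            # charge one to each element of the smaller set
            charge[x] += 1
            members[a].append(x)
    return max(charge)
```

Each charge at least doubles the size of the set containing the element, so the returned value can never exceed floor(log2 n), whatever the merge order.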


Slide7

Implementation via trees:

the root of the tree is the label of the set

only pointers to parents exist; we cannot list all elements of a given set

Problem: find is not constant anymore

Union Find


Slide8

Union of two sets:

the root of the tree is the label of the set

store the size of a subtree in the root

Union Find: Tree Implementation


Slide9

Union of two sets:

the root of the tree is the label of the set

store the size of a subtree in the root

make the smaller tree the child of the larger
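A minimal Python sketch of the tree implementation with union by size, without path compression. The class name `TreeUnionFind` and the dictionary-based parent pointers are assumptions of this example.

```python
class TreeUnionFind:
    def __init__(self):
        self.parent = {}  # element -> parent; a root points to itself
        self.size = {}    # subtree size, meaningful for roots only

    def make_set(self, x):
        self.parent[x] = x
        self.size[x] = 1
        return x

    def find(self, x):
        # Go upwards until the root, which serves as the label of the set.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        """a and b must be roots; return the root of the merged tree."""
        if a == b:
            return a
        if self.size[a] < self.size[b]:
            a, b = b, a              # a is the root of the larger tree
        self.parent[b] = a           # smaller tree becomes a child of the larger
        self.size[a] += self.size[b]
        return a
```

Here union only rewires one pointer and updates one counter, while find has to walk the path to the root.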


Union Find: Tree Implementation


Slide10

Find:

go upwards until you find the root

make all visited nodes into children of the root (path compression)
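The two steps of find could look like this in Python. The sketch assumes the forest is given as a dictionary `parent` in which every root points to itself; this representation is an assumption of the example.

```python
def find(parent, x):
    """Return the root of x and make every visited node a child of that root."""
    # First pass: go upwards until the root is found.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: re-attach every node on the path directly to the root.
    while parent[x] != root:
        nxt = parent[x]
        parent[x] = root
        x = nxt
    return root
```

After the call, later find operations on any node of the traversed path reach the root in one step.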


Union Find: Tree Implementation


Slide11

Find:

go upwards until you find the root

make all visited nodes into children of the root (path compression)


Union Find: Tree Implementation


Slide12

Tree Implementation: Analysis

union(A,B) can be done in time O(1)

makeSet(x) is still trivial: O(1)

the cost for find(x) may be large

Observation: The height of the trees is at most O(log n). If the distance of an element x to the root increases, the number of elements in the tree that contains x at least doubles. Hence the cost for find(x) is at most O(log n) without amortization.


Slide13

Tree Implementation: Analysis

Can we do better? Yes, with amortization!

Definitions:

n(v) := the number of nodes that were in the subtree rooted at v when v became the child of another node.

rank $r(v) := \lfloor \log_2 n(v) \rfloor$, so that $n(v) \ge 2^{r(v)}$.

Lemma: The rank of a parent p must be strictly larger than the rank of its child c.

after c is linked to p the rank of c does not change anymore, while the rank of p might still increase

directly after the linking, $r(p) > r(c)$

Slide14

Tree Implementation: Analysis

Theorem. There are at most $n/2^s$ nodes of rank s.

Proof:

a node of rank s had at least $2^s$ nodes in its subtree when it became a child of another node

nodes of the same rank have disjoint subtrees, as they cannot be ancestors of each other [observe that a node that was initially not in a subtree T cannot join this subtree via path compression]

more precisely: each node in the tree sees during its lifetime at most one ancestor of rank s; for each rank-s node there are at least $2^s$ nodes that have seen it; hence there can be at most $n/2^s$ nodes of rank s.


Slide15

Tree Implementation: Analysis

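Definitions: $t(0) := 1$ and $t(i) := 2^{t(i-1)}$ for $i \ge 1$ (a tower of i twos); $\log^*(n) := \min\{\, i \mid t(i) \ge n \,\}$.

Theorem: We can obtain the following amortized running times:

makeSet(x): $O(\log^* n)$

find(x): $O(\log^* n)$

union(A,B): O(1)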


Slide16

Tree Implementation: Analysis

group-number:

a node with rank r[v] is in rank-group $\log^*(r[v])$

this means the rank-group g contains ranks t(g-1)+1, …, t(g)

there are at most $O(\log^* n)$ different rank-groups
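The grouping of ranks can be made concrete with a few lines of Python. The sketch assumes the usual tower function t(0) = 1, t(i) = 2^t(i-1) and log*(n) = min{ i : t(i) ≥ n }, which is consistent with the rank ranges t(g-1)+1, …, t(g) stated here; the function names are made up for this example.

```python
def tower(i):
    """t(i): a tower of i twos, with t(0) = 1."""
    t = 1
    for _ in range(i):
        t = 2 ** t
    return t

def log_star(n):
    """Smallest i with tower(i) >= n."""
    i = 0
    while tower(i) < n:
        i += 1
    return i

def rank_group(rank):
    """Group-number of a node with the given rank."""
    return log_star(rank)
```

For example, under these assumptions the ranks 5, …, 16 all fall into rank-group 3, and any rank up to 65536 lies in a group numbered at most 4, which is why the number of groups is tiny for every practical n.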


Slide17

Tree Implementation: Analysis

Accounting Scheme:

create an account for every find-operation

create an account for every node

The cost of a find operation is equal to the length of the path traversed. We charge the cost for going from v to parent[v] in the following way:

if the parent of v does not change due to path compression, we charge the cost to the find-account (at most cost 1 per find)

if the group-number of rank[v] is the same as that of rank[parent[v]] (before starting path compression), we charge the cost to the node-account of v

otherwise we charge the cost to the find-account


Slide18

Tree Implementation: Analysis

Observations:

find(x) is charged at most $O(\log^* n)$ (the maximum number of rank-groups)

after a node is charged, its parent is re-assigned to a node higher up in the tree, so its parent gets a larger rank

after some time the parent is in a larger rank-group, and the node will never be charged again

the charge to a node in rank-group g is at most t(g) - t(g-1) ≤ t(g)

What is the total number of operations that is charged to nodes?

the total charge is at most $\sum_g n(g)\, t(g)$, where n(g) is the number of nodes in group g


Slide19

Tree Implementation: Analysis

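hence: using $t(g) = 2^{t(g-1)}$, for $g \ge 1$ we have $n(g) \le \sum_{s=t(g-1)+1}^{t(g)} \frac{n}{2^s} \le \frac{n}{2^{t(g-1)}} = \frac{n}{t(g)}$, so $\sum_g n(g)\, t(g) = O(n \log^* n)$, as there are only $O(\log^* n)$ groups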


Slide20

Tree Implementation: Analysis

If there are only n elements, we charge at most $O(n \log^* n)$ to these elements. Charging $O(\log^* n)$ to every makeSet() operation gives the result.

The analysis is not tight. In fact it has been shown that the amortized time for the union-find implementation with path compression is O(α(n)), where α(n) is the inverse Ackermann function, which grows a lot slower than $\log^* n$. There is also a lower bound of Ω(α(n)).
