Slide1
Union Find ADT
Data type for disjoint sets:
makeSet(x): Given an element x, create a singleton set that contains only this element. Return a locator/handle for x in the data structure.
find(x): Given a handle for an element x, find the set that contains x. Return a handle/identifier/pointer/label for this set.
union(A,B): Given two set identifiers, create the union of the two sets.
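To make the three operations concrete, here is a minimal Python sketch of the interface. It is deliberately naive (union relabels every element of one set) and is not one of the implementations discussed in these slides. The class name NaiveDisjointSets and the method names make_set, find and union are assumptions chosen for illustration.

```python
class NaiveDisjointSets:
    """Illustrative only: stores one set identifier per element."""

    def __init__(self):
        self._label = {}  # element -> identifier of the set containing it

    def make_set(self, x):
        """Create the singleton set {x}; the element doubles as its own handle."""
        if x in self._label:
            raise ValueError(f"{x!r} is already in some set")
        self._label[x] = x
        return x

    def find(self, x):
        """Return the identifier of the set that contains x."""
        return self._label[x]

    def union(self, a, b):
        """Merge the sets identified by a and b; return the surviving identifier."""
        if a != b:
            # Relabel every element of set b. Linear time, hence "naive".
            for x, label in self._label.items():
                if label == b:
                    self._label[x] = a
        return a


ds = NaiveDisjointSets()
for x in "pqrs":
    ds.make_set(x)
ds.union(ds.find("p"), ds.find("q"))
print(ds.find("p") == ds.find("q"))  # True
print(ds.find("r") == ds.find("s"))  # False
```

Note that union takes set identifiers (the results of find), not arbitrary elements, matching the signature union(A,B) above.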
Slide2
Union Find ADT
Applications:
keep track of the connected components of a dynamic graph that changes due to insertion of nodes and edges
Kruskal's Minimum Spanning Tree Algorithm
Slide3
Union Find ADT
List Implementation:
the elements of a set are stored in a list; each node has a backward pointer to the tail
the tail of the list contains the label for the set
makeSet(x) needs constant time
find(x) also needs constant time
Slide4
Union Find ADT
Union:
take the smaller of the two sets
change all its backward pointers to the label of the larger set
insert the smaller list at the head of the larger
time O(min(|A|,|B|))
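The list implementation and its union step can be sketched as follows. This is a minimal Python sketch under assumptions, not the slides' own code: the tail node itself serves as the set label, and the fields head and size (kept on the tail node) are hypothetical additions so that the smaller list can be found and spliced in.

```python
class Node:
    """List node; the tail node of a list doubles as the label of its set."""

    def __init__(self, value):
        self.value = value
        self.next = None   # forward pointer within the list
        self.tail = self   # backward pointer to the tail (the set label)
        self.head = self   # first node of the list; only meaningful on a tail
        self.size = 1      # number of elements; only meaningful on a tail


def make_set(value):
    return Node(value)     # O(1): a one-element list is its own tail


def find(node):
    return node.tail       # O(1): follow the backward pointer


def union(a, b):
    """Merge the sets labelled by tail nodes a and b; return the new label."""
    if a is b:
        return a
    small, large = (a, b) if a.size <= b.size else (b, a)
    node = small.head
    while node is not None:        # O(min(|A|, |B|)) pointer updates
        node.tail = large
        node = node.next
    small.next = large.head        # splice the smaller list in at the head
    large.head = small.head
    large.size += small.size
    return large


a, b, c = make_set("a"), make_set("b"), make_set("c")
ab = union(find(a), find(b))
abc = union(find(c), ab)
print(find(a) is find(c))  # True
print(abc.size)            # 3
```

Only the nodes of the smaller list are touched, which is where the O(min(|A|,|B|)) bound comes from.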
Slide5
Union Find
Lemma. The amortized running times are:
find: O(1)
makeSet: O(log n)
union: O(1)
Proof. Idea: We partially charge the cost for a union operation to the elements involved. In total we will charge at most O(log n) to an element. Since each element has to be created with makeSet(x), we get an amortized time bound by inflating the cost of makeSet(x) to O(log n).
find: actual cost and amortized cost are the same.
union(A,B): if A == B, do nothing; the check takes time O(1). Otherwise add the smaller set to the larger and charge the cost for this to the elements in the smaller set; each element is charged one. Your cost: zero!
Slide6
Union Find
How much do we charge to an element?
Observation. Whenever we charge one to an element x, the size of the subset A_x that contains x increases by at least a factor of 2.
⇒ The total charge to an element is at most O(log n).
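The doubling observation can be written out as a short calculation. The symbol c(x), the number of times element x is charged, is introduced here only for illustration.

```latex
% c(x): number of times x has been charged (notation assumed for this sketch).
% Each charge at least doubles the size of the set containing x,
% and no set can hold more than n elements:
\[
  2^{c(x)} \;\le\; |A_x| \;\le\; n
  \quad\Longrightarrow\quad
  c(x) \;\le\; \log_2 n .
\]
% Summing over all n elements bounds the total charged union cost:
\[
  \sum_{x} c(x) \;\le\; n \log_2 n .
\]
```

Spreading this total over the n makeSet operations gives the O(log n) amortized cost per makeSet.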
Slide7
Union Find
Implementation via trees:
the root of the tree is the label of the set
only pointers to the parent exist; we cannot list all elements of a given set
Problem: find is not constant anymore
[Figure: example trees]
Slide8
Union Find: Tree Implementation
Union of two sets:
the root of the tree is the label of the set
store the size of a subtree in the root
[Figure: example trees annotated with subtree sizes]
Slide9
Union Find: Tree Implementation
Union of two sets:
the root of the tree is the label of the set
store the size of a subtree in the root
make the smaller tree the child of the larger
[Figure: example tree annotated with subtree sizes]
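The union rule above (smaller tree becomes a child of the larger root) can be sketched in Python as follows. The class name SizedForest and the dictionary-based storage are assumptions for illustration. The find here simply walks to the root, without path compression.

```python
class SizedForest:
    """Parent-pointer trees; each root stores the size of its tree."""

    def __init__(self):
        self.parent = {}  # parent[x] == x exactly when x is a root
        self.size = {}    # size[r] is kept up to date only for roots r

    def make_set(self, x):
        self.parent[x] = x
        self.size[x] = 1
        return x

    def find(self, x):
        """Walk upwards until the root (the set label) is reached."""
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        """a and b must be roots; returns the root of the merged tree."""
        if a == b:
            return a
        if self.size[a] < self.size[b]:
            a, b = b, a               # ensure a is the larger tree
        self.parent[b] = a            # smaller root becomes a child
        self.size[a] += self.size[b]
        return a


forest = SizedForest()
for x in range(8):
    forest.make_set(x)
for x in range(1, 8):
    forest.union(forest.find(0), forest.find(x))
print(forest.size[forest.find(0)])  # 8
```

Apart from the two find calls used to obtain the roots, each union is a constant number of dictionary updates.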
Slide10
Union Find: Tree Implementation
Find:
go upwards until you find the root
make all visited nodes into children of the root (path compression)
[Figure: example tree annotated with subtree sizes]
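The two steps of find with path compression can be sketched as a two-pass function. This is a minimal Python sketch that assumes the forest is stored as a dictionary parent in which a root points to itself; that representation is an assumption for illustration.

```python
def find(parent, x):
    """Return the root of x and re-attach every visited node to that root."""
    # Pass 1: go upwards until the root is found.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Pass 2: make all visited nodes children of the root.
    while x != root:
        next_node = parent[x]
        parent[x] = root
        x = next_node
    return root


# A chain 3 -> 2 -> 1 -> 0, where 0 is the root.
parent = {0: 0, 1: 0, 2: 1, 3: 2}
print(find(parent, 3))  # 0
print(parent)           # {0: 0, 1: 0, 2: 0, 3: 0}
```

After the call, every node on the traversed path is a direct child of the root, so a later find on any of them takes a single step.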
Slide11
Union Find: Tree Implementation
[Figure: example tree annotated with subtree sizes]
Slide12
Tree Implementation: Analysis
Analysis:
union(A,B) can be done in time O(1)
makeSet(x) is still trivial: O(1)
the cost for find(x) may be large
Observation: The height of the trees is at most O(log n). If the distance of an element x to the root increases, the number of elements in the tree that contains x at least doubles.
⇒ the cost for find(x) is at most O(log n) without amortization
Slide13
Tree Implementation: Analysis
Can we do better?
Yes, with amortization!
Definitions:
n(v) := the number of nodes that were in the subtree rooted at v when v became the child of another node
rank r(v) := ⌊log n(v)⌋, hence n(v) ≥ 2^r(v)
Lemma: The rank of a parent p must be strictly larger than the rank of its child c.
after c is linked to p, the rank of c does not change anymore, while the rank of p might still increase
directly after the linking, the subtree of p contains at least 2·n(c) nodes, hence r(p) ≥ r(c) + 1 > r(c)
Slide14
Tree Implementation: Analysis
Theorem. There are at most n/2^s nodes of rank s.
Proof:
a node of rank s had at least 2^s nodes in its subtree when it became a child of another node
nodes of the same rank have disjoint subtrees, as they cannot be ancestors of each other
[observe that a node that was initially not in a subtree T cannot join this subtree via path compression]
more precisely: each node in the tree sees during its lifetime at most one ancestor of rank s; for each rank-s node there are at least 2^s nodes that have seen it; hence there can be at most n/2^s nodes of rank s
Slide15
Tree Implementation: Analysis
Definitions:
t(0) := 1, t(i) := 2^t(i-1) for i ≥ 1
log*(n) := min{ i | t(i) ≥ n }
Theorem: We can obtain the following amortized running times:
makeSet(x): O(log*(n))
find(x): O(log*(n))
union(A,B): O(1)
Slide16
Tree Implementation: Analysis
group-number:
a node with rank r[v] is in rank-group log*(r[v])
this means the rank-group g contains ranks t(g-1)+1, …, t(g)
there are at most log*(n) different rank-groups
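To get a feeling for how few rank-groups there are, the following Python sketch tabulates them. It assumes that t is the tower function with t(0) = 1 and t(g) = 2^t(g-1), and that log*(n) is the smallest i with t(i) ≥ n. Both definitions, and the function names tower and log_star, are assumptions made for this sketch.

```python
def tower(g):
    """Assumed tower function: t(0) = 1 and t(g) = 2 ** t(g - 1)."""
    value = 1
    for _ in range(g):
        value = 2 ** value
    return value


def log_star(n):
    """Assumed iterated logarithm: the smallest i with tower(i) >= n."""
    i = 0
    while tower(i) < n:
        i += 1
    return i


# Rank-group g contains the ranks tower(g - 1) + 1, ..., tower(g).
for g in range(1, 5):
    print(g, (tower(g - 1) + 1, tower(g)))
# 1 (2, 2)
# 2 (3, 4)
# 3 (5, 16)
# 4 (17, 65536)
```

Under these assumptions, ranks 0 and 1 fall into group 0, and every rank up to 65536 lies in one of only five groups.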
Slide17
Tree Implementation: Analysis
Accounting Scheme:
create an account for every find-operation
create an account for every node
The cost of a find operation is equal to the length of the path traversed. We charge the cost for going from v to parent[v] in the following way:
if the parent of v does not change due to path compression, we charge the cost to the find-account (at most cost 1 per find)
if the group-number of rank[v] is the same as that of rank[parent[v]] (before starting path compression), we charge the cost to the node-account of v
otherwise we charge the cost to the find-account
Slide18
Tree Implementation: Analysis
Observations:
find(x) is only charged O(log*(n)) (max number of rank-groups)
after a node is charged, its parent is re-assigned to a node higher up in the tree ⇒ the parent gets a larger rank
⇒ after some time the parent is in a larger rank-group ⇒ the node will never be charged again
the charge to a node in rank-group g is at most t(g) - t(g-1) ≤ t(g)
What is the total number of operations that is charged to nodes?
the total charge is at most ∑_g n(g)·t(g), where n(g) is the number of nodes in group g
Slide19
Tree Implementation: Analysis
For every rank-group g we have n(g) ≤ n/t(g), hence ∑_g n(g)·t(g) ≤ n·log*(n), as there are only log*(n) groups
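The bound on the total node charge can be derived as follows. This is a reconstruction under assumptions, not the slide's original formula: it uses the earlier theorem that there are at most n/2^s nodes of rank s, and it assumes t(g) = 2^{t(g-1)} with t(0) = 1.

```latex
% Nodes in rank-group g have ranks t(g-1)+1, ..., t(g),
% and there are at most n/2^s nodes of rank s:
\[
  n(g) \;\le\; \sum_{s=t(g-1)+1}^{t(g)} \frac{n}{2^{s}}
       \;\le\; \frac{n}{2^{t(g-1)}}
       \;=\; \frac{n}{t(g)} .
\]
% Each node in group g is charged at most t(g), so summing over the groups:
\[
  \sum_{g} n(g)\, t(g) \;\le\; \sum_{g} n \;=\; O\bigl(n \log^{*} n\bigr),
\]
% because there are only O(log* n) rank-groups.
```

Group 0 is covered as well, since n(0) ≤ n and t(0) = 1.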
Slide20
Tree Implementation: Analysis
If there are only n elements, we charge at most O(n log*(n)) to these elements. Charging O(log*(n)) to every makeSet()-operation gives the result.
The analysis is not tight. In fact it has been shown that the amortized time for the union-find implementation with path compression is O(α(n)), where α(n) is the inverse Ackermann function, which grows a lot slower than log*(n). There is also a lower bound of Ω(α(n)).