Slide1
Minimum Spanning Trees
Uri Zwick
Tel Aviv University
October 2015
Last updated: November 18, 2015
Slide2
Spanning Trees
A tree is a connected acyclic graph (contains no cycles).
A spanning tree of an undirected graph $G=(V,E)$ is a subgraph of $G$, with vertex set $V$, which is a tree. (A graph has a spanning tree iff it is connected.)
The following conditions are equivalent, for a set of edges $T \subseteq E$:
(i) $T$ is a spanning tree of $G$.
(ii) $T$ is acyclic, but the addition of any edge of $E \setminus T$ to $T$ closes a cycle.
(iii) $T$ is connected, but the removal of any edge disconnects it.
(iv) $|T| = |V| - 1$ and $T$ is acyclic.
(v) $|T| = |V| - 1$ and $T$ is connected.
A forest is an acyclic graph.
Slide3
Minimum Spanning Trees
A tree is a connected acyclic graph (contains no cycles).
A spanning tree of an undirected graph $G=(V,E)$ is a subgraph of $G$, with vertex set $V$, which is a tree. (A graph has a spanning tree iff it is connected.)
If a weight (or cost) function $w: E \to \mathbb{R}$ is defined on the edges of $G$ and $T$ is a spanning tree of $G$, then $w(T) = \sum_{e \in T} w(e)$.
The MST problem: Given an undirected graph $G=(V,E)$ with a weight function $w: E \to \mathbb{R}$, find a spanning tree $T$ such that $w(T)$ is minimized.
Slide4
A minimum spanning tree
[Figure: an edge-weighted undirected graph and a minimum spanning tree of it.]
Slide5
Comparison-based MST algorithms
Algorithm                                        Running time               Type
Borůvka (1926)                                   $O(m \log n)$              Deterministic
Kruskal (1956)                                   $O(m \log n)$              Deterministic
Jarník (1930), Prim (1957), Dijkstra (1959)      $O(m + n \log n)$          Deterministic
Yao (1975)                                       $O(m \log\log n)$          Deterministic
Cheriton-Tarjan (1976)                           $O(m \log\log n)$          Deterministic
Fredman-Tarjan (1987)                            $O(m\,\beta(m,n))$         Deterministic
Gabow-Galil-Spencer-Tarjan (1986)                $O(m \log \beta(m,n))$     Deterministic
Chazelle (2000)                                  $O(m\,\alpha(m,n))$        Deterministic
Karger-Klein-Tarjan (1995)                       $O(m)$ expected            Randomized
Slide6
Ackermann’s function
(one of many similar variants)
Apply
times on
times
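A minimal sketch of one such variant, to make the "apply repeatedly" definition concrete. It assumes $A_1(n) = 2n$ and $A_k(n) = A_{k-1}^{(n)}(1)$ (apply $A_{k-1}$ $n$ times, starting from 1); the variant on the slide may differ in details. The function name `ackermann` is hypothetical.

```python
def ackermann(k: int, n: int) -> int:
    """Assumed variant: A_1(n) = 2n, A_k(n) = A_{k-1} applied n times to 1."""
    if k < 1 or n < 1:
        raise ValueError("k and n must be positive")
    if k == 1:
        return 2 * n
    value = 1
    for _ in range(n):
        value = ackermann(k - 1, value)
    return value


# Under this assumption, row 2 is exponentiation and row 3 is a tower of n twos.
print([ackermann(2, n) for n in range(1, 6)])  # [2, 4, 8, 16, 32]
print([ackermann(3, n) for n in range(1, 5)])  # [2, 4, 16, 65536]
```

Already `ackermann(3, 5)` would be $2^{65536}$, so only tiny arguments are feasible.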
Slide7
Ackermann’s function
(Tower)
Slide8
Ackermann’s function
Grows EXTREMELY fast!
Ackermann’s function is recursive but not primitive recursive.
Slide9
Inverse Ackermann’s function
Row inverses:
Slide10
Inverse Ackermann’s function
Column inverses:
The inverse Ackermann function is an extremely slowly growing function.
Surprisingly, it appears naturally in the analysis of various computational problems and algorithms.
Examples: Union-find, Minimum Spanning Trees, Davenport-Schinzel sequences.
Why?
Slide11
Fundamental cycles
Tree + non-tree edge ⟹ unique cycle.
The removal of any tree edge from the cycle generates a new tree.
Slide12
Fundamental cuts
Removing an edge from a spanning tree generates a cut.
Adding any edge crossing the cut creates a new spanning tree.
Slide13
Cut rule
Let $(S, V \setminus S)$, where $\emptyset \ne S \subsetneq V$, be a cut.
If $e$ is the strictly lightest edge in the cut, then $e$ is contained in all MSTs of $G$.
If $e$ is a lightest edge in the cut, then $e$ is contained in some MST of $G$.
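Both halves of the cut rule can be checked by brute force on a tiny graph. The sketch below is only an illustration under stated assumptions: the graph, the `(weight, u, v)` edge encoding, and the helper names `spanning_trees`, `minimum_spanning_trees` and `cut_edges` are all hypothetical.

```python
from itertools import combinations


def spanning_trees(n, edges):
    """Yield every spanning tree (a tuple of n-1 edges) of a graph on 0..n-1."""
    for subset in combinations(edges, n - 1):
        label = list(range(n))

        def find(x):
            while label[x] != x:
                x = label[x]
            return x

        acyclic = True
        for _, u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:  # this edge would close a cycle
                acyclic = False
                break
            label[ru] = rv
        if acyclic:  # n-1 edges and no cycle: a spanning tree
            yield subset


def minimum_spanning_trees(n, edges):
    trees = list(spanning_trees(n, edges))
    best = min(sum(w for w, _, _ in t) for t in trees)
    return [t for t in trees if sum(w for w, _, _ in t) == best]


def cut_edges(S, edges):
    """Edges with exactly one endpoint in S."""
    return [e for e in edges if (e[1] in S) != (e[2] in S)]


# Triangle 0-1-2 with weights 1, 2, 2 plus a pendant edge (2, 3) of weight 3.
edges = [(1, 0, 1), (2, 1, 2), (2, 0, 2), (3, 2, 3)]
msts = minimum_spanning_trees(4, edges)

# Cut ({0}, {1, 2, 3}): (1, 0, 1) is strictly lightest, so it is in all MSTs.
print(all((1, 0, 1) in t for t in msts))  # True

# Cut ({2, 3}, {0, 1}): two lightest edges tie; each is in some MST, not in all.
crossing = cut_edges({2, 3}, edges)
print([sum(e in t for t in msts) for e in crossing], len(msts))  # [1, 1] 2
```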
Slide14
Cycle rule
Let $C$ be a cycle in $G$.
If $e$ is the strictly heaviest edge on the cycle, then $e$ is not contained in any MST of $G$.
If $e$ is a heaviest edge on the cycle, then $e$ is not contained in some MST of $G$.
Slide15
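Cut rule - proof
Let $e$ be the (strictly) lightest edge in the cut $(S, V \setminus S)$.
Let $T$ be an MST such that $e \notin T$.
Then, $T \cup \{e\}$ contains a cycle $C$.
$C$ must contain another edge $e' \ne e$ that crosses the cut.
$w(e) < w(e')$, as $e$ is lightest in the cut.
Thus $T \cup \{e\} \setminus \{e'\}$ is a spanning tree with $w(T \cup \{e\} \setminus \{e'\}) < w(T)$, a contradiction.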
Slide16
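Cycle rule - proof
Let $e$ be the strictly heaviest edge on $C$.
Let $T$ be an MST that contains $e$.
Removing $e$ from $T$ creates a cut $(S, V \setminus S)$.
The cycle $C$ must contain another edge $e'$ from the cut.
$T \setminus \{e\} \cup \{e'\}$ is also a spanning tree, and $w(T \setminus \{e\} \cup \{e'\}) < w(T)$, a contradiction.
Slide17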
Uniqueness of MST
If all edge weights are distinct, then the MST is unique.
Exercise: Use the cycle or cut rules to prove the claim.
Exercise: Strengthening the above claim, show, without assuming that all edge weights are distinct, that if $T_1$ and $T_2$ are two MSTs of the same graph, then they have the same multisets of edge weights.
For simplicity, we will usually assume that all edge weights are distinct.
Slide18
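Lexicographically Minimal Spanning Tree
If $T$ is a spanning tree containing edges $e_1, e_2, \ldots, e_{n-1}$ such that $w(e_1) \ge w(e_2) \ge \cdots \ge w(e_{n-1})$, let $\bar{w}(T) = (w(e_1), w(e_2), \ldots, w(e_{n-1}))$.
A spanning tree $T$ is a lexicographically MST if and only if for every other spanning tree $T'$ we have $\bar{w}(T) \le \bar{w}(T')$, lexicographically.
Theorem: A spanning tree is an MST if and only if it is a lexicographically MST.
Exercise: Prove the theorem.
Slide19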
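Bottleneck (min-max) paths
Lexicographically minimal paths
If $P$ is a path containing edges $e_1, e_2, \ldots, e_k$ such that $w(e_1) \ge w(e_2) \ge \cdots \ge w(e_k)$, let $\bar{w}(P) = (w(e_1), w(e_2), \ldots, w(e_k))$.
A path $P$ from $u$ to $v$ is a lexicographically minimal path from $u$ to $v$ iff for every other path $P'$ from $u$ to $v$ we have $\bar{w}(P) \le \bar{w}(P')$, lexicographically.
Theorem: Let $T$ be an MST of an undirected graph $G=(V,E)$. Then, for every $u, v \in V$, the unique path connecting $u$ and $v$ in $T$ is a lexicographically minimal path between $u$ and $v$. In particular, it is also a bottleneck path between $u$ and $v$ (a path that minimizes the weight of its heaviest edge).
Exercise: Prove the theorem. Assume that all edge weights are distinct.
Slide20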
Kruskal’s algorithm
Sort the edges in non-decreasing order of weight: $w(e_1) \le w(e_2) \le \cdots \le w(e_m)$.
Initialize $T \leftarrow \emptyset$. ($T$ is now a set of edges.)
For $i \leftarrow 1$ to $m$:
    If $T \cup \{e_i\}$ does not contain a cycle, then $T \leftarrow T \cup \{e_i\}$.
Correctness: If all edge weights are distinct, then each edge added to $T$ is the strictly lightest edge in the cut defined by one of the trees in the forest $T$.
Slide21
Kruskal’s algorithm
[Figure: Kruskal’s algorithm on an edge-weighted undirected graph.]
Slide22
Kruskal’s algorithm
When $e_i$ is examined, $T$ is a forest, and each of $e_1, \ldots, e_{i-1}$ connects vertices of the same tree.
If $e_i$ connects two different trees, it is the lightest edge in the cuts defined by these trees. The two trees containing the endpoints of $e_i$ merge.
If $e_i$ connects two vertices of the same tree, it is the heaviest edge on a cycle.
Slide23
Comparison-based algorithms
An MST is a spanning tree whose sum of edge weights is minimized.
Surprisingly, we can find such a tree by only comparing edge weights. No additions are necessary.
A comparison-based algorithm is an algorithm that does not perform any operation on edge weights other than pairwise comparisons.
In the word-RAM model, where edge weights are integers fitting into a single machine word, an MST can be found deterministically in $O(m+n)$ time.
Slide24
Breaking ties
Suppose that $A$ is a comparison-based algorithm that finds an MST whenever all edge weights are distinct.
Convert $A$ into an algorithm $A'$, by replacing every
    if $w(e) < w(e')$ …
by
    if $(w(e), id(e)) < (w(e'), id(e'))$ …
where the pairs are compared lexicographically and $id(e)$ are distinct identifiers of the edges.
Then, $A'$ finds an MST even if weights are not distinct.
Exercise: Prove the claim.
Slide25
The blue-red framework
Another convenient way of dealing with ties.
Initially all edges are uncolored.
Blue rule: Choose a cut that does not contain a blue edge. Among all uncolored edges in the cut, choose an edge of minimal weight and color it blue.
Red rule: Choose a cycle that does not contain a red edge. Among all uncolored edges on the cycle, choose an edge of maximal weight and color it red.
Theorem: After any sequence of applications of the blue and red rules, there is always an MST that contains all blue edges and none of the red edges. As long as some of the edges are uncolored, at least one of the rules may be applied.
Slide26
Disjoint Sets / Union-Find
Make-Set(x): Create a singleton set containing x.
Union(x, y): Unite the sets containing x and y.
Find(x): Return a representative of the set containing x.
Find(x) = Find(y) iff x and y are currently in the same set.
Using a simple implementation, the total cost of $m$ operations, out of which $n$ are Make-Set operations, is $O(m + n \log n)$.
Slide27
Union Find
Represent each set as a rooted tree.
The parent of a vertex x is denoted by x.p.
The representative of each set is the root of its tree.
Find(x) traces the path from x to the root.
Heuristics: union by rank, path compression.
Slide28
Union by rank
Each item is assigned a rank, initially 0.
If $x$ is not a root, then $rank(x) < rank(x.p)$.
A tree of rank $r$ contains at least $2^r$ elements.
At most $n/2^r$ nodes of rank $r$.
Union by rank on its own gives $O(\log n)$ find time.
Slide29
Path Compression
Slide30
Analysis of Union-Find
Theorem: [Tarjan (1975)] The total cost of $m$ operations, out of which $n$ are Make-Set operations, is $O(m\,\alpha(m,n))$.
The amortized cost of make-set and unite is $O(1)$, while the amortized cost of find is $O(\alpha(n))$.
While the data structure is very simple, the analysis is far from being simple.
A simpler proof given by [Seidel-Sharir (2005)].
We shall not present the proof(s) in this course.
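The structure itself fits in a few lines. The sketch below is one possible rendering of union by rank with path compression, not necessarily the pseudocode used in the course. It assumes that a root is its own parent; the class and method names are hypothetical.

```python
class UnionFind:
    """Disjoint sets as rooted trees, with union by rank and path compression."""

    def __init__(self):
        self.parent = {}  # x.p; by assumption a root is its own parent
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find(self, x):
        # First pass: trace the path from x to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): hang every vertex on the path
        # directly below the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already in the same set
        # Union by rank: the root of smaller rank becomes the child.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True


uf = UnionFind()
for v in "abcde":
    uf.make_set(v)
uf.union("a", "b")
uf.union("c", "d")
uf.union("b", "d")
print(uf.find("a") == uf.find("c"))  # True
print(uf.find("a") == uf.find("e"))  # False
```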
Slide31
Union Find - pseudocode
Slide32
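Efficient implementation of Kruskal’s algorithm
Use the union-find data structure.
Sort the edges so that $w(e_1) \le w(e_2) \le \cdots \le w(e_m)$; $T \leftarrow \emptyset$
for $v \in V$:
    Make-Set($v$)
for $i \leftarrow 1$ to $m$:
    let $e_i = (u, v)$
    if Find($u$) $\ne$ Find($v$):
        $T \leftarrow T \cup \{e_i\}$; Union($u$, $v$)
return $T$
Cost of sorting: $O(m \log n)$
Cost of all other operations: $O(m\,\alpha(m,n))$
Total cost: $O(m \log n)$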
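The loop above can be sketched in runnable form as follows. This is a minimal version under stated assumptions: vertices are numbered 0..n-1, edges are hypothetical `(weight, u, v)` triples, and a small array-based union-find (with path halving, a variant of path compression) is inlined so the block is self-contained.

```python
def kruskal(n, edges):
    """Return (total weight, tree edges) for a connected graph on 0..n-1.

    edges is a list of (weight, u, v) triples.
    """
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):  # non-decreasing order of weight
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # adding the edge would close a cycle
        if rank[ru] < rank[rv]:  # union by rank
            ru, rv = rv, ru
        parent[rv] = ru
        if rank[ru] == rank[rv]:
            rank[ru] += 1
        tree.append((w, u, v))
    return sum(w for w, _, _ in tree), tree


edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))  # (6, [(1, 0, 1), (2, 1, 3), (3, 1, 2)])
```

Sorting whole triples compares `(weight, u, v)` lexicographically, so equal weights are ordered by the endpoints, which acts as a fixed tie-breaking rule.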
Slide33
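Matroids [Whitney (1935)]
A pair $(S, \mathcal{I})$, where $S$ is a finite set and $\mathcal{I} \subseteq 2^S$, is a matroid iff the following three conditions are satisfied:
(i) $\emptyset \in \mathcal{I}$.
(ii) If $A \in \mathcal{I}$ and $B \subseteq A$ then $B \in \mathcal{I}$.
(iii) If $A, B \in \mathcal{I}$ and $|A| < |B|$ then there exists $x \in B \setminus A$ such that $A \cup \{x\} \in \mathcal{I}$.
The sets contained in $\mathcal{I}$ are called independent sets.
A maximal independent set is called a basis.
A minimal dependent set is called a circuit.
Slide34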
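Jarník-Prim-Dijkstra
Let $r \in V$ be an arbitrary vertex.
Let $S \leftarrow \{r\}$, $T \leftarrow \emptyset$.
While $S \ne V$:
    Find the lightest edge $(u, v)$ in the cut $(S, V \setminus S)$, with $u \in S$ and $v \notin S$.
    Let $S \leftarrow S \cup \{v\}$, $T \leftarrow T \cup \{(u, v)\}$.
Correctness follows immediately from the cut rule.
To implement the algorithm efficiently, use a priority-queue (heap) to hold $V \setminus S$.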
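A compact way to try this out is with Python's binary heap. The sketch below is a swapped-in simplification and not the Fibonacci-heap version: `heapq` has no Decrease-Key, so the heap holds candidate edges and stale entries are skipped when popped, which gives $O(m \log n)$ time. The adjacency-list encoding and the name `prim` are assumptions.

```python
import heapq


def prim(adj, r):
    """Grow a tree from r; adj maps each vertex to a list of (weight, neighbor).

    The graph is assumed to be connected. Returns the tree edges (w, u, v).
    """
    in_tree = {r}  # the set S
    tree = []      # the set T
    heap = [(w, r, v) for w, v in adj[r]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: the edge no longer crosses the cut
        in_tree.add(v)
        tree.append((w, u, v))
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return tree


adj = {
    0: [(1, 1), (4, 2)],
    1: [(1, 0), (3, 2), (2, 3)],
    2: [(4, 0), (3, 1), (5, 3)],
    3: [(2, 1), (5, 2)],
}
print(prim(adj, 0))  # [(1, 0, 1), (2, 1, 3), (3, 1, 2)]
```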
Slide35
Prim’s algorithm
[Figure: Prim’s algorithm on an edge-weighted undirected graph.]
Slide36
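Prim’s algorithm - pseudocode
Prim($G = (V, E)$, $w$, $r$):
    for $v \in V$:
        $key[v] \leftarrow \infty$; $p[v] \leftarrow$ null
    $key[r] \leftarrow 0$; Insert($Q$, $r$)
    while $Q \ne \emptyset$:
        $u \leftarrow$ Extract-Min($Q$); mark $u$ as done
        for $(u, v) \in E$: if $v$ is not done and $w(u, v) < key[v]$:
            $key[v] \leftarrow w(u, v)$; $p[v] \leftarrow u$
            if $v \notin Q$: Insert($Q$, $v$)
            else: Decrease-Key($Q$, $v$, $key[v]$)
    return $\{(p[v], v) : v \in V \setminus \{r\}\}$
Slide37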
Prim’s algorithm - complexity
The algorithm performs at most:
$n$ Insert operations
$n$ Extract-Min operations
$m$ Decrease-Key operations
Using, e.g., a Fibonacci-heap, the running time is: $O(m + n \log n)$
Slide38
Slide39
Borůvka’s algorithm (1926)
Start with $F \leftarrow \emptyset$. ($F$ defines a forest in which each tree is a singleton.)
(We assume that $G$ is connected and that all edge weights are distinct.)
While $F$ is not a spanning tree, each tree in $F$ chooses the lightest edge leaving it and adds it to $F$.
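One straightforward way to code an iteration is to scan all edges and record, for every tree, the lightest edge leaving it. The sketch below assumes vertices 0..n-1 and hypothetical `(weight, u, v)` triples; trees are tracked with a simple label array rather than by explicit contraction, and the name `boruvka` is hypothetical.

```python
def boruvka(n, edges):
    """Return the MST edges of a connected graph with distinct edge weights."""
    comp = list(range(n))  # comp[v] leads to the label of v's tree

    def find(x):
        while comp[x] != x:
            comp[x] = comp[comp[x]]
            x = comp[x]
        return x

    forest = []
    num_trees = n
    while num_trees > 1:
        lightest = {}  # tree label -> lightest edge leaving that tree
        for e in edges:
            _, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # both endpoints in the same tree
            for r in (ru, rv):
                if r not in lightest or e < lightest[r]:
                    lightest[r] = e
        if not lightest:
            break  # graph is not connected; stop instead of looping forever
        for w, u, v in lightest.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # the same edge may be chosen by both of its trees
                comp[ru] = rv
                forest.append((w, u, v))
                num_trees -= 1
    return forest


edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(sorted(boruvka(4, edges)))  # [(1, 0, 1), (2, 1, 3), (3, 1, 2)]
```

On this example a single iteration suffices: every vertex picks its lightest incident edge and the three distinct choices already form a spanning tree.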
Slide40
Borůvka’s algorithm
[Figure: Borůvka’s algorithm on an edge-weighted undirected graph.]
Slide41
Borůvka’s algorithm - Analysis
In each iteration, the number of trees in $F$ is reduced by a factor of at least 2.
Hence, the number of iterations is at most $\log_2 n$.
Each iteration can be easily implemented in $O(m)$ time.
Thus, the total running time is $O(m \log n)$.
Borůvka’s algorithm can be parallelized.
Borůvka’s algorithm is used as a building block in more efficient and more sophisticated algorithms.
Slide42
Contraction
Let $G = (V, E)$ be an undirected graph, and let $F \subseteq E$.
Denote by $G/F$ the graph obtained by contracting each connected component of $(V, F)$ into a single vertex.
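A small sketch of the contraction step, under the same assumed encoding as before (vertices 0..n-1, hypothetical `(weight, u, v)` triples). Dropping self-loops while keeping parallel edges is a choice made here for simplicity; the name `contract` is hypothetical.

```python
def contract(n, edges, forest):
    """Contract each connected component of (V, forest) into a single vertex.

    Returns (number of super-vertices, contracted edge list). Self-loops are
    dropped; parallel edges are kept.
    """
    label = list(range(n))

    def find(x):
        while label[x] != x:
            label[x] = label[label[x]]
            x = label[x]
        return x

    for _, u, v in forest:
        label[find(u)] = find(v)

    # Renumber the component representatives as 0, 1, 2, ...
    new_id = {}
    for v in range(n):
        new_id.setdefault(find(v), len(new_id))

    contracted = []
    for w, u, v in edges:
        cu, cv = new_id[find(u)], new_id[find(v)]
        if cu != cv:
            contracted.append((w, cu, cv))
    return len(new_id), contracted


edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(contract(4, edges, [(1, 0, 1)]))
# (3, [(4, 0, 1), (3, 0, 1), (2, 0, 2), (5, 1, 2)])
```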
Slide43
Slide44
MST using contractions
Let $G = (V, E)$ be an undirected graph, and let $F \subseteq E$ be a set of edges contained in some MST of $G$.
Let $T'$ be an MST of $G/F$.
Then $F \cup T'$ is an MST of $G$.
Slide45
Slide46
Yao’s algorithm (1975)
Partition the edges of each vertex into $k$ equally-sized buckets.
The edges in the first bucket are all lighter than the edges in the second bucket, etc.
Can be done in $O(m \log k)$ time by repeatedly finding medians. (Working on each vertex separately.)
For each vertex, keep the index of the first bucket that contains edges that are not self-loops.
Excluding the cost of scanning buckets and finding out that all their edges are self-loops, each Borůvka iteration can be implemented in $O(n + m/k)$ time.
Slide47
Yao’s algorithm (1975)
Partitioning into buckets: $O(m \log k)$
Time per iteration: $O(n + m/k)$
Scanning “useless” buckets: $O(m)$ in total
Choosing $k = \log n$, we get
Total running time: $O(m \log\log n + n \log n)$
Slide48
Yao’s algorithm (1975)
Getting rid of the $n \log n$ term:
Start with $\log\log n$ standard Borůvka iterations.
Contract the resulting trees.
Number of vertices in the contracted graph is at most: $n / \log n$
Run Yao’s algorithm on contracted graph.
Total running time of modified algorithm: $O(m \log\log n)$
Faster than Prim when $m = o(n \log n / \log\log n)$!
Slide49
Exercise: Show that an $O(m \log\log n)$-time algorithm can also be obtained by combining Borůvka’s algorithm with Prim’s algorithm.
Yao’s algorithm has the advantage that it does not use “sophisticated” data structures such as Fibonacci heaps.
Slide50
Fredman-Tarjan (1987)
The algorithm is composed of iterations.
At the beginning of the $i$-th iteration, the graph contains $n_i$ (super-)vertices and at most $m$ edges.
Choose a parameter $k_i$.
Repeatedly choose vertices, not in any tree yet, and run Prim’s algorithm from them until either the size of the heap is $k_i$, or the tree merges with an existing tree.
Each tree formed has at least $k_i$ edges touching it.
Number of trees formed is at most $2m/k_i$.
At the end of each iteration, contract the trees formed.
Time of the $i$-th iteration is $O(m + n_i \log k_i)$.
Slide51
Fredman-Tarjan
(1987)
Suppose
.
Slide52
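Fredman-Tarjan (1987)
Number of trees formed is $n_{i+1} \le 2m/k_i$.
Time of $i$-th iteration is $O(m + n_i \log k_i)$.
Choose $k_i = 2^{2m/n_i}$.
Time of $i$-th iteration is $O(m)$.
$k_{i+1} = 2^{2m/n_{i+1}} \ge 2^{k_i}$.
Last iteration: the first $i$ for which $k_i \ge n_i$.
Number of iterations is at most $\beta(m,n) = \min\{ i : \log^{(i)} n \le m/n \}$.
Total running time is $O(m\,\beta(m,n))$.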
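To get a feel for how small the number of iterations is, the sketch below evaluates $\beta(m,n)$ under the assumption that it is defined as $\min\{ i : \log^{(i)} n \le m/n \}$, the usual Fredman-Tarjan definition with base-2 logarithms. The function name `beta` is hypothetical.

```python
import math


def beta(m: int, n: int) -> int:
    """Assumed definition: the least i such that log2 iterated i times on n is <= m/n."""
    i, x = 0, float(n)
    while x > m / n:
        x = math.log2(x)
        i += 1
    return i


n = 2 ** 16
print(beta(n, n))       # 4: 65536 -> 16 -> 4 -> 2 -> 1
print(beta(16 * n, n))  # 1: a single logarithm already gives 16 <= m/n
```

For sparse graphs ($m$ close to $n$) the value is essentially $\log^* n$; it drops to 1 as soon as $m \ge n \log n$.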
Slide53
Bonus material
Not covered in class this term
“Careful. We don’t want to learn from this.” (Bill Watterson, “Calvin and Hobbes”)
Slide54
Gabow-Galil-Spencer-Tarjan (1986)
An improvement of the Fredman-Tarjan algorithm.
New ingredient: The edges of each vertex are partitioned, arbitrarily, into packets of size $p$.
Each packet is stored in a (tiny) Fibonacci heap.
Instead of examining all edges touching a vertex, we examine the lightest edge in each packet.
For each tree we maintain a list of the packets. Edges in these packets may lead to vertices in the same tree.
We cannot afford to contract trees after each iteration. Trees are maintained using a union-find data structure.
Slide55
Gabow-Galil-Spencer-Tarjan (1986)
Let $p$ be an (integer) parameter.
Initially, partition the edges of each vertex into packets of size $p$, and at most one residual packet of size less than $p$.
In the beginning of each iteration, each tree has a list of packets associated with it.
Packets: technical details
When trees merge, we concatenate their lists of packets.
If both trees have a residual packet, we merge the two packets, using a meld operation. The new packet is residual if and only if its (original) size is smaller than $p$.
Packets have edges removed from them. For simplicity, we consider their original size.
Number of packets in the
-th
iteration
All packets are of size
.
Slide56
Time of
-
th
iteration is
As before,
.
Number of iterations is
.
Total running time is $O(m \log \beta(m,n))$.
Gabow-Galil-Spencer-Tarjan (1986)
At the end of the
-
th
iteration, each tree has at least
packets associated with it.
Hence,
.
Choose
.