/
9/20/2009 CSE 5311 Fall 2007 9/20/2009 CSE 5311 Fall 2007

9/20/2009 CSE 5311 Fall 2007 - PowerPoint Presentation

marina-yarberry
marina-yarberry . @marina-yarberry
Follow
343 views
Uploaded On 2019-06-23

9/20/2009 CSE 5311 Fall 2007 - PPT Presentation

M Kumar 1 Greedy Algorithms TOPICS Greedy Strategy Activity Selection Minimum Spanning Tree Shortest Paths Huffman Codes Fractional Knapsack 9202009 CSE 5311 Fall 2007 M Kumar 2 The Greedy Principle ID: 760050

Share:

Link:

Embed:

Download Presentation from below link

Download Presentation The PPT/PDF document "9/20/2009 CSE 5311 Fall 2007" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.


Presentation Transcript

Slide1

9/20/2009

CSE 5311 Fall 2007M Kumar

1

Greedy Algorithms

TOPICS

Greedy Strategy

Activity Selection

Minimum Spanning Tree

Shortest Paths

Huffman Codes

Fractional Knapsack

Slide2

9/20/2009

CSE 5311 Fall 2007M Kumar

2

The Greedy Principle

The problem:

We are required to find a feasible solution that either maximizes or minimizes a given objective solution

.

It is easy to determine a feasible solution but not necessarily an optimal solution.

The greedy method solves this problem in stages, at each stage, a decision is made considering inputs in an order determined by the selection procedure which may be based on an optimization measure.

The greedy algorithm always makes the choice that looks best at the moment.

For each decision point in the greedy algorithm, the choice that seems best at the moment is chosen

It makes a local optimal choice that may lead to a global optimal choice.

Slide3

9/20/2009

CSE 5311 Fall 2007M Kumar

3

Activity Selection Problem

Scheduling a resource among several competing activities.

S = {1,2, 3, …,

n

} is the set of

n

proposed activities

The activities share a resource, which can be used by only one activity at a time -a Tennis Court, a Lecture Hall etc.,

Each activity

i

has a start time,

s

i

and a finish time

f

i

, where

s

i

f

i

.

When selected, the activity takes place during time (

s

i

, f

i

)

Activities

i

and

j

are compatible if

s

i

f

j

or

s

j

f

i

The activity-selection problem selects the maximum-size set of mutually compatible activities

The input activities are in order by increasing finishing times.

f

1

f

2

f

3

f

n

; Can be sorted in O (

n log n

) time

Slide4

9/20/2009

CSE 5311 Fall 2007M Kumar

4

Procedure for activity selection (from CLRS)

Procedure GREEDY_ACTIVITY_SELECTOR(

s, f

)

n

length [

S

]; in order of increasing finishing times;

A

{1}; first job to finish

j

1;

for

i

2 to

n

do if

s

i

f

j

then

A

A

{

i

};

j

i

;

Slide5

9/20/2009

CSE 5311 Fall 2007M Kumar

5

i si fi 1 1 4 2 3 53 0 64 5 75 3 86 5 97 6 108 8 119 8 1210 2 1311 12 14

Initially we choose activity 1 as it has the least finish time.

Then, activities 2 and 3 are not compatible as s

2

< f

1

and s

3

< f

1

.

We choose activity 4, s

4

> f

1

, and add activity 4 to the set A.

A = {1, 4}

Activities 5, 6, and 7 are incompatible and activity 8 is chosen

A = {1,4,8}

Finally activity 10 is incompatible and activity 11 is chosen

A {1,4,8,11}

The algorithm can schedule a set of n activities in

(n) time.

Slide6

9/20/2009

CSE 5311 Fall 2007M Kumar

6

Greedy Algorithms

Minimum Cost Spanning Tree

Kruskal’s algorithm

Prim’s Algorithm

Single Source Shortest Path

Huffman Codes

Slide7

9/20/2009

CSE 5311 Fall 2007M Kumar

7

Minimum-Cost Spanning Trees

Consider a network of computers connected through bidirectional links. Each link is associated with a positive cost: the cost of sending a message on each link.

This network can be represented by an undirected graph with positive costs on each edge.

In bidirectional networks we can assume that the cost of sending a message on a link does not depend on the

direction

.

Suppose we want to broadcast a message to all the computers from an arbitrary computer.

The cost of the broadcast is the sum of the costs of links used to forward the message.

Slide8

9/20/2009

CSE 5311 Fall 2007M Kumar

8

Minimum-Cost Spanning Trees

Find a fixed connected subgraph, containing all the vertices such that the sum of the costs of the edges in the subgraph is minimum. This subgraph is a tree as it does not contain any cycles.

Such a tree is called the spanning tree since it spans the entire graph G.

A given graph may have more than one spanning tree

The minimum-cost spanning tree (MCST) is one whose edge weights add up to the least among all the spanning trees

Slide9

9/20/2009

CSE 5311 Fall 2007M Kumar

9

MCST

Slide10

9/20/2009

CSE 5311 Fall 2007M Kumar

10

MCST

The Problem

: Given an undirected connected weighted graph G =(V,E), find a spanning tree T of G of minimum cost.

Greedy Algorithm for finding the Minimum Spanning Tree of a Graph G =(V,E)

The algorithm is also called

Kruskal's

algorithm

.

At each step of the algorithm, one of several possible choices must be made,

The greedy strategy: make the choice that is the best at the moment

Slide11

9/20/2009

CSE 5311 Fall 2007M Kumar

11

Kruskal's Algorithm

Procedure

MCST_G(V,E)

(Kruskal's Algorithm)

Input

: An undirected graph G(V,E) with a cost function c on the edges

Output

: T the minimum cost spanning tree for G

T

0;

VS

0;

for

each vertex v

V

do

VS = VS

{v};

sort the edges of E in nondecreasing order of weight

while

VS

> 1

do

choose (v,w) an edge E of lowest cost;

delete (v,w) from E;

if

v and w are in different sets W1 and W2 in VS

do

W1 = W1

W2;

VS = VS - W2;

T

T

(v,w);

return T

Slide12

9/20/2009

CSE 5311 Fall 2007M Kumar

12

Kruskals_ MCST

The algorithm maintains a collection VS of disjoint sets of vertices

Each set W in VS represents a connected set of vertices forming a spanning tree

Initially, each vertex is in a set by itself in VS

Edges are chosen from E in order of increasing cost, we consider each edge (v, w) in turn; v, w

V.

If v and w are already in the same set (say W) of VS, we discard the edge

If v and w are in distinct sets W1 and W2 (meaning v and/or w not in T) we merge W1 with W2 and add (v, w) to T.

Slide13

9/20/2009

CSE 5311 Fall 2007M Kumar

13

Kruskals_ MCST

Consider the example graph shown earlier, The edges in nondecreasing order[(A,D),1],[(C,D),1],[(C,F),2],[(E,F),2],[(A,F),3],[(A,B),3],[(B,E),4],[(D,E),5],[(B,C),6]EdgeActionSets in VSSpanning Tree, T =[{A},{B},{C},{D},{E},{F}]{0}(A,D)merge[{A,D}, {B},{C}, {E}, {F}] {(A,D)} (C,D) merge[{A,C,D}, {B}, {E}, {F}] {(A,D), (C,D)} (C,F) merge[{A,C,D,F},{B},{E}]{(A,D),(C,D), (C,F)} (E,F) merge[{A,C,D,E,F},{B}]{(A,D),(C,D), (C,F),(E,F)}(A,F) reject[{A,C,D,E,F},{B}]{(A,D),(C,D), (C,F), (E,F)}(A,B) merge[{A,B,C,D,E,F}]{(A,D),(A,B),(C,D), (C,F),(E,F)}(B,E) reject(D,E) reject(B,C) reject

F

E

D

C

B

A

1

2

1

4

5

2

6

3

3

F

E

D

C

B

A

1

2

1

2

3

The equivalent Graph and the MCST

Slide14

9/20/2009

CSE 5311 Fall 2007M Kumar

14

Kruskals_ MCST Complexity

Steps 1 thru 4 take time O (V)

Step 5 sorts the edges in nondecreasing order in O (E log E ) time

Steps 6 through 13 take O (E) time

The total time for the algorithm is therefore given by O (E log E)

The edges can be maintained in a heap data structure with the property,

E[PARENT(

i

)]

E[

i

]

remember, this property is the opposite of the one used in the heapsort algorithm earlier. This property can be used to sort data elements in nonincreasing order.

Construct a heap of the edge weights, the edge with lowest cost is at the root

During each step of edge removal, delete the root (minimum element) from the heap and rearrange the heap.

The use of heap data structure reduces the time taken because at every step we are only picking up the minimum or root element rather than sorting the edge weights.

Slide15

9/20/2009

CSE 5311 Fall 2007M Kumar

15

Prim’s Algorithm

F

E

D

C

B

A

1

2

1

4

5

2

6

3

3

F

D

B

A

2

1

3

The equivalent Graph and the MCST

C

1

D

A

1

C

1

D

A

1

E

2

F

2

C

1

D

A

1

E

2

F

2

C

1

D

A

1

Slide16

9/20/2009

CSE 5311 Fall 2007M Kumar

16

Single-Source Shortest Paths

A motorist wishes to find the shortest possible route from from Perth to Brisbane. Given the map of Australia on which the distance between each pair of cities is marked, how can we determine the shortest route?

Slide17

9/20/2009

CSE 5311 Fall 2007M Kumar

17

Single Source Shortest Path

In a shortest-paths problem, we are given a weighted, directed graph G = (V,E), with weights assigned to each edge in the graph. The weight of the path p = (v0, v1, v2, …, vk) is the sum of the weights of its constituent edges:

v0

v1

v2 . . .

vk-1

vk

The shortest-path from u to v is given by

d(u,v) = min {weight (p) : if there are one or more paths from u to v

=

otherwise

Slide18

9/20/2009

CSE 5311 Fall 2007M Kumar

18

The single-source shortest paths problem

Given G (V,E), find the shortest path from a given vertex

u

V to every vertex v

V ( u

v)

.

For each vertex v

V in the weighted directed graph, d[v] represents the distance from u to v.

Initially,

d[v] = 0 when u = v.

d[v] =

if (u,v) is not an edge

d[v] = weight of edge (u,v) if (u,v) exists.

Dijkstra's Algorithm : At every step of the algorithm, we compute,

d[y] = min {d[

y

], d[

x

] + w(

x,y

)}, where

x,y

V

.

Dijkstra's algorithm is based on the greedy principle because at every step we pick the path of least weight.

Slide19

9/20/2009

CSE 5311 Fall 2007M Kumar

19

Dijkstra's Algorithm : At every step of the algorithm, we compute,

d

[

y

] = min {d[

y

], d[

x

] +

w

(

x,y

)}, where

x,y

V

.

Dijkstra's algorithm is based on the greedy principle because at every step we pick the path of least path.

Slide20

9/20/2009

CSE 5311 Fall 2007M Kumar

20

Slide21

9/20/2009

CSE 5311 Fall 2007M Kumar

21

Example:

Step

#

Vertex tobe markedDistance to vertexUnmarked verticesuabcdefgh0u0159a,b,c,d,e,f,g,h1a01539b,c,d,e,f,g,h2c0153712b,d,e,f,g,h3b01537812d,e,f,g,h4d0153781211e,f,g,h5e01537812119f,g,h6h01537812119g,h7g01537812119h8f01537812119--

Slide22

9/20/2009

CSE 5311 Fall 2007M Kumar

22

Dijkstra's Single-source shortest path

Procedure

Dijkstra's Single-source shortest path_G(V,E,u)

Input:

G

=(

V,E

), the weighted directed graph and v the source vertex

Output: for each vertex,

v

, d[

v

] is the length of the shortest path from u to v.

mark vertex

u

;

d

[

u

]

0;

for

each unmarked vertex

v

V

do

if

edge (

u,v

) exists

d [

v

]

weight (

u,v

);

else

d

[

v

]

;

while

there exists an unmarked vertex

do

let

v

be an unmarked vertex such that

d

[

v

] is minimal;

mark vertex

v

;

for

all edges (

v,x

) such that

x

is unmarked

do

if d[x] > d[v] + weight[v,x]

then

d[x]

d[v] + weight[v,x]

Slide23

9/20/2009

CSE 5311 Fall 2007M Kumar

23

Complexity of Dijkstra's algorithm:

Steps 1 and 2 take

(1) time

Steps 3 to 5 take O(

V

) time

The vertices are arranged in a heap in order of their paths from

u

Updating the length of a path takes

O(log V)

time.

There are

V

iterations, and at most

E

updates

Therefore the algorithm takes O((

E

+

V

) log

V

) time.

Slide24

9/20/2009

CSE 5311 Fall 2007M Kumar

24

Huffman codes

Huffman codes are used to compress data. We will study Huffman's greedy algorithm for encoding compressed data.

Data Compression

A given file can be considered as a string of characters.

The work involved in compressing and uncompressing should justify the savings in terms of storage area and/or communication costs.

In ASCII all characters are represented by bit strings of size 7.

For example if we had 100000 characters in a file then we need 700000 bits to store the file using ASCII.

Slide25

9/20/2009

CSE 5311 Fall 2007M Kumar

25

Example

The file consists of only 6 characters as shown in the table below.

Using the fixed-length binary code, the whole file can be encoded in 300,000 bits.

However using the variable-length code , the file can be encoded in 224,000 bits.

a b c d e f

Frequency

45 13 12 16 9 5

(in thousands)

Fixed-length

000 001 010 011 100 101

codeword

Variable-length

0 101 100 111 1101 1100

codeword

A variable length coding scheme assigns frequent characters, short code words and infrequent characters, long code words.

In the above variable-length code, 1-bit string represents the most frequent character a, and a 4-bit string represents the most infrequent character

f

.

Slide26

9/20/2009

CSE 5311 Fall 2007M Kumar

26

Let us denote the characters by C1, C2, …, Cn and denote their frequencies by f1, f2, ,,,, fn. Suppose there is an encoding E in which a bit string Si of length si represents Ci, the length of the file compressed by using encoding E is

Slide27

9/20/2009

CSE 5311 Fall 2007M Kumar

27

Prefix Codes

The prefixes of an encoding of

one character must not be equal to a complete encoding

of another character.

1100 and 11001 are not valid codes

because 1100 is a prefix of 11001

This constraint is called the prefix constraint.

Codes in which no codeword is also a prefix of some other code word are called prefix codes.

Shortening the encoding of one character may lengthen the encodings of others.

To find an encoding

E

that satisfies the prefix constraint and minimizes L(

E,F

).

Slide28

9/20/2009

CSE 5311 Fall 2007M Kumar

28

The prefix code for file can be represented by a binary tree in which every non leaf node has two children. Consider the variable-length code of the table above, a tree corresponding to the variable-length code of the table is shown below.

1

1

1

1

0

0

0

0

1

0

100

55

30

25

14

a:45

5

c:12

b:13

e:9

f:5

d:16

Note that the length of the code for a character is equal to the depth of the character in the tree shown.

0 101 100 111 1101 1100

Slide29

9/20/2009

CSE 5311 Fall 2007M Kumar

29

Greedy Algorithm for Constructing a Huffman Code

The algorithm builds the tree corresponding to the optimal code in a bottom-up manner.

The algorithm begins with a set of

C

leaves and performs a sequence of 'merging' operations to create the tree.

C

is the set of characters in the alphabet.

Slide30

9/20/2009

CSE 5311 Fall 2007M Kumar

30

Procedure

Huffman_Encoding(

S,f

);

Input :

S

(a string of characters) and

f

(an array of frequencies).

Output :

T

(the Huffman tree for

S

)

1. insert all characters into a heap

H

according to their frequencies;

2.

while

H

is not empty

do

3.

if

H

contains only one character

x

then

4.

x

root (T);

5.

else

6.

z

ALLOCATE_NODE();

7.

x

left[

T,z

]

EXTRACT_MIN(H);

8.

y

right[

T,z

]

EXTRACT_MIN(H);

9.

f

z

f

x

+

f

y

;

10. INSERT(

H,z

);

Slide31

9/20/2009

CSE 5311 Fall 2007M Kumar

31

f:5

e:9

c:12

b:13

d:16

a:45

c:12

b:13

d:16

a:45

1

0

14

f:5

e:9

1

0

14

f:5

e:9

d:16

1

0

25

c:12

b:13

a:45

The algorithm is based on a reduction of a problem with

n

characters to a problem with

n

-1 characters.

A new character replaces two existing ones.

Slide32

9/20/2009

CSE 5311 Fall 2007M Kumar

32

1

0

25

c:12

b:13

1

0

14

f:5

e:9

1

0

30

d:16

a:45

a:45

1

0

25

c:12

b:13

1

0

14

f:5

e:9

1

0

30

d:16

1

0

55

Suppose

C

i

and

C

j

are two characters with minimal frequency, there exists a tree that minimizes

L (E,F)

in which these characters correspond to leaves with the maximal distance from the root.

Slide33

9/20/2009

CSE 5311 Fall 2007M Kumar

33

1

0

25

c:12

b:13

1

0

14

f:5

e:9

1

0

30

d:16

1

0

55

a:45

1

0

100

0 101 100 111 1101 1100

Slide34

9/20/2009

CSE 5311 Fall 2007M Kumar

34

Complexity of the algorithm

Building a heap in step 1 takes O(

n

) time

Insertions (steps 7 and 8) and

deletions (step 10) on H

take O (

log n

) time each

Therefore

Steps 2 through 10 take O(

n logn

) time

Thus the overall complexity of the algorithm is

O(

n logn

).

Slide35

9/20/2009

CSE 5311 Fall 2007M Kumar

35

The fractional knapsack problem Limited supply of each itemEach item has a size and a value per unit (e.g., Pound)greedy strategyCompute value per Pound for each itemArrange these in non-increasing orderFill sack with the item of greatest value per pound until either the item is exhausted or the sack is fullIf sack is not full, fill the remainder with the next item in the listRepeat until sack is full

How about a 0-1 Knapsack?? Can we use Greedy strategy?

Slide36

9/20/2009

CSE 5311 Fall 2007M Kumar

36

Problems

Suppose that we have a set of

k

activities to schedule among

n

number of lecture halls; activity

i

starts at time

si

and terminates at time

fi

1

i

k

. We wish to schedule all activities using as few lecture halls as possible. Give an efficient greedy algorithm to determine which activity should use which lecture hall.

You are required to purchase

n

different types of items. Currently each item costs $

D

. However, the items will become more expensive according to exponential growth curves. In particular the cost of item

j

increases by a factor

r

j

> 1 each month, where

r

j

is a given parameter. This means that if item

j

is purchased

t

months from now, it will cost

D

r

j

t

. Assume that the growth rates are distinct, that is

r

i

=

r

j

for items

i

j.

Given that you can buy only one item each month, design an algorithm that takes n rates of growth

r

1

, r

2,

…,

r

n

,

and computes an order in which to buy the items so that the total amount spent is minimized.