Slide1
Thesis Defense
Algorithms for Vertex-Weighted Matching
Mahantesh Halappanavar
Advisor: Alex Pothen
Committee: Jessica Crouch, Bruce Hendrickson, Stephan Olariu, Mohammad Zubair
23 January 2009
Slide2
A Graph
A graph G is a pair (V, E)
V is a set of vertices
E, a set of edges, represents a binary relation on V
Bipartite and nonbipartite
Weighted and unweighted
[Figure: example bipartite graph with vertex sets S and T and a weight label w]
Slide3
A Matching
A matching M is a subset of edges such that no two edges in M are incident on the same vertex
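The definition above can be checked mechanically. The following is a minimal sketch, assuming edges are given as pairs of vertex labels; the function name `is_matching` is illustrative and does not come from the thesis software.

```python
def is_matching(edges):
    """Return True if no two edges in `edges` are incident on the same vertex."""
    seen = set()
    for u, v in edges:
        # An endpoint that was already used means two edges share a vertex.
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True
```

For example, `is_matching([("s1", "t1"), ("s2", "t2")])` holds, while adding the edge `("s1", "t2")` violates the definition because `s1` would be covered twice.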
Slide4
Applications of Matchings
Sparse matrix computations
  Matrix preconditioning
  Block Triangular Form
Multilevel Graph Algorithms
  Graph partitioners
  Graph clustering
Scheduling Problem
  High speed network switching
  Facility scheduling problem
Bioinformatics
  Homology detection
  Structural alignment
Slide5
Types of Matchings
[Figure: taxonomy of matching problems. Three families (cardinality matching, edge-weighted matching, vertex-weighted matching), each divided by graph class (general, bipartite) and by algorithm type (exact, approximate)]
Slide6
Our Contributions
New ⅔-approx algorithm (bipartite MVM)
Better understanding of the MVM problem
New ½-approx algorithms (MVM)
Parallel ½-approx algorithm for weighted matching
Software implementations
Slide7
Outline
Slide8
Proposed Serial Algorithms
B = bipartite graphs; G = general graphs
n is the number of vertices and m is the number of edges
d2 is the average number of distinct alternating paths, of length at most 3 edges, starting at a vertex
Slide9
Augmentation
[Figure: an alternating path and an augmenting path, with matched edges labeled M]
Augmentation by symmetric difference: M ← M ⊕ P, where P is an augmenting path
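Augmentation by symmetric difference can be sketched in a few lines. This is an illustrative sketch, assuming a matching is stored as a set of `frozenset` edges and a path as a list of vertices; the name `augment` is hypothetical.

```python
def augment(matching, path):
    """Return the symmetric difference of `matching` and the edges of `path`.

    `matching` is a set of frozenset({u, v}) edges and `path` is a list of
    vertices. If `path` is an augmenting path, the result has one more edge.
    """
    path_edges = {frozenset(edge) for edge in zip(path, path[1:])}
    # Edges on the path swap roles: matched edges leave, unmatched edges enter.
    return matching ^ path_edges
```

With the matching {(s1, t1)} and the augmenting path s2, t1, s1, t2, the result is {(s2, t1), (s1, t2)}: the matched edge on the path is dropped and the two unmatched edges are added.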
Slide10
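Restricted Bipartite Graph
[Figure: a bipartite graph (S, T) with vertex weights 1 to 10, shown as the combination of two restricted bipartite graphs, one carrying the weights 1 to 5 and the other the weights 6 to 10]
Slide11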
First task: Sort S vertices in decreasing order of weights
GLOBALOPTIMAL: M*
Second task: Compute a maximum cardinality matching by processing S vertices in pre-computed order
GLOBALTWOTHIRD: M⅔
Second task: Compute a matching by processing S vertices in pre-computed order with augmenting paths of length at most three
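The two tasks can be sketched as follows for an S-restricted bipartite graph. This is an illustration under stated assumptions, not the thesis implementation: `adj` maps each S vertex to its T neighbors, `weight` holds the S-vertex weights, the function names are hypothetical, and the small graph is a made-up example chosen so that its weights come out to 14 and 11.

```python
def global_optimal(adj, weight):
    """Process S vertices by decreasing weight; augment along paths of any length."""
    mate_s, mate_t = {}, {}

    def try_augment(s, visited):
        # Depth-first search for an augmenting path starting at s.
        for t in adj[s]:
            if t in visited:
                continue
            visited.add(t)
            if t not in mate_t or try_augment(mate_t[t], visited):
                mate_s[s], mate_t[t] = t, s
                return True
        return False

    for s in sorted(adj, key=lambda v: -weight[v]):
        try_augment(s, set())
    return mate_s


def global_two_third(adj, weight):
    """Same processing order, but only augmenting paths of length 1 or 3."""
    mate_s, mate_t = {}, {}
    for s in sorted(adj, key=lambda v: -weight[v]):
        # Length 1: s has an unmatched neighbor.
        free = next((t for t in adj[s] if t not in mate_t), None)
        if free is not None:
            mate_s[s], mate_t[free] = free, s
            continue
        # Length 3: s - t - s2 - t_free, where (t, s2) is a matched edge.
        for t in adj[s]:
            s2 = mate_t[t]
            t_free = next((x for x in adj[s2] if x not in mate_t), None)
            if t_free is not None:
                mate_s[s2], mate_t[t_free] = t_free, s2
                mate_s[s], mate_t[t] = t, s
                break
    return mate_s


# Hypothetical example graph (not the one drawn on the slides).
ADJ = {"a": ["t1", "t2"], "b": ["t2", "t3"], "c": ["t1"], "d": ["t4"], "e": ["t4"]}
WEIGHT = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}
```

On this example, `global_optimal` matches a, b, c and d (weight 14), while `global_two_third` leaves c unmatched because c needs an augmenting path of length five, giving weight 11.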
Slide12
Execution of GLOBALTWOTHIRD
[Figure: four snapshots of the execution on a bipartite graph (S, T) whose S vertices have weights 5, 4, 3, 2, 1]
w(M*) = 5 + 4 + 3 + 2 = 14
w(M⅔) = 5 + 4 + 2 = 11
Approximation ratio = 11/14 > 2/3
Slide13
Complete Solution
[Figure: a general bipartite graph is split into an S-restricted bipartite graph and a T-restricted bipartite graph, giving matchings MS and MT, which the Mendelsohn-Dulmage technique combines into a matching M]
Slide14
Proofs of Correctness
Exact algorithm
⅔-approx algorithm
Slide15
Exact: Reachability Property
Given: G = (S, T, E), weights on S, and a matching MS
For every MS-matched vertex s’ reachable from an MS-unmatched vertex si via an MS-alternating path: w(s’) ≥ w(si)
Slide16
Exact: Reachability ⇒ Optimality
Given: G = (S, T, E), weights on S, and a maximum cardinality matching MS
If MS satisfies the reachability property, then it is also a maximum vertex-weight matching
Proof: Use contradiction
[Figure: an alternating path P of M* and MS edges, with vertices s, si and s’ marked]
If: w(s’) > w(s)
Slide17
⅔-approx: Skeleton of the Proof
Consider concurrent execution of algorithms GLOBALOPTIMAL and GLOBALTWOTHIRD
At a given step, both algorithms will process the same vertex si ∈ S
A failed vertex is a vertex that is matched in M*, but not in M⅔
Intuition: Show that for every failed vertex, there are two distinct vertices that are matched in M⅔ and are heavier than the failed vertex
Slide18
Induction: Step k
Consider execution when vertex sk fails:
[Figure: for each failed vertex s_i (i = 1, 2, ..., k), an alternating path of M* and M⅔ edges through s_i, s_{i,a,k} and s_{i,b,k}]
Slide19
Proof: M*,k M⅔,k
[Figure: four cases (a) to (d) of paths through s_i built from M* and M⅔ edges, involving the vertices s_{i,a,k} and s_{i,b,k}]
Slide20
For Failed Vertex sk
There are two distinct M⅔-matched vertices “heavier” than sk
[Figure: alternating path of M* and M⅔ edges through s_k, s_{k,a,k} and s_{k,b,k}]
Proof: Vertices are processed in a decreasing order of weights
Slide21
A Potential Problem
State of vertices failed earlier:
Possibility: w(s_i) > w(s_{i,b,k})
[Figure: for each failed vertex s_i (i = 1, 2, ..., k), an alternating path of M* and M⅔ edges through s_i, s_{i,a,k} and s_{i,b,k}]
Slide22
Counting Technique
For every failed vertex si, i ∈ I = {1, …, k}, there are two distinct M⅔-matched vertices heavier than si (not necessarily on the alternating path)
Proof: Induct on the failed steps:
Step 1:
[Figure: the path through s_1, s_{1,a,1} and s_{1,b,1} with M* and M⅔ edges, and the two counted vertices s_{1,a} and s_{1,b} for s_1]
Slide23
… Counting Technique
Step 2:
[Figure: the paths through s_1, s_{1,a,2}, s_{1,b,2} and through s_2, s_{2,a,2}, s_{2,b,2} with M* and M⅔ edges, and the counted vertices s_{1,a}, s_{1,b} for s_1 and s_{2,a}, s_{2,b} for s_2]
Slide24
Outline
Slide25
Related Work
2004: Jaap-Henk Hoepman
  Show parallel algorithm as a variant of Preis’s algorithm
  One vertex per processor (theoretical)
2007: Fredrik Manne and Rob Bisseling
  Extend Hoepman’s work
  Show parallel algorithm as a variant of Luby’s algorithm (maximal independent set problem)
  Limited experimental results (32 processors)
Slide26
Data Distribution
[Figure: a graph distributed across processors P0 and P1, showing ghost vertices and cross-edges]
Slide27
Serial Pointer-based Algorithm
For each vertex, set a pointer to the heaviest neighbor
If two vertices point to each other, then add the (locally dominating) edge to the matching
Remove all edges incident on the matched vertices, reset the pointers, and repeat
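The three steps above can be sketched serially as follows. This is a minimal illustration, assuming edges arrive as `(u, v, weight)` triples with comparable vertex labels; the function name and the tie-breaking rule (by endpoint labels) are assumptions, not details taken from the slides.

```python
def pointer_based_matching(weighted_edges):
    """Half-approximate weighted matching via locally dominant edges."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w

    def edge_key(u, v):
        # Total order on edges: weight first, endpoint labels break ties,
        # so the heaviest remaining edge is always mutually pointed to.
        return (adj[u][v], min(u, v), max(u, v))

    matching = set()
    while True:
        # Each vertex with remaining edges points to its heaviest neighbor.
        pointer = {u: max(nbrs, key=lambda v, u=u: edge_key(u, v))
                   for u, nbrs in adj.items() if nbrs}
        # Vertices that point to each other form a locally dominant edge.
        dominant = [(u, v) for u, v in pointer.items()
                    if u < v and pointer.get(v) == u]
        if not dominant:
            return matching
        for u, v in dominant:
            matching.add((u, v))
            # Remove the matched endpoints and every edge incident on them.
            for x in (u, v):
                for y in adj.pop(x):
                    if y in adj:
                        adj[y].pop(x, None)
```

On the path a-b-c-d with weights 2, 3, 2 the sketch returns only the edge (b, c) of weight 3, while the best matching has weight 4; this is consistent with a ½-approximation guarantee.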
Slide28
Execution of Pointer-based Algorithm
Parallel in nature
Slide29
Our algorithm: Parallel Pointer-based
Initialization of data structures
Phase 1: Independent Computation
  Identify locally-dominant edges
  Send messages as needed (cross-edges)
Phase 2: Shared Computation
  Receive messages
  Computation based on the messages received
  Send messages as needed
Repeat until no more edges can be matched
Slide30
Phase 1: Independent Computation
For each local vertex, set pointer to heaviest neighbor
If a pointer points to a ghost vertex, enqueue REQUEST to its owner
Repeat:
Vertices pointing to each other: Match
Remove incident edges; enqueue UNAVAILABLE messages for cross-edges
Reset pointers; enqueue messages as needed
Repeat until no more edges can be matched
Send all queued messages
Slide31
Phase 2: Shared Computation
S: set of ghost vertices; Counter[vg] = local degree of ghost vg
WHILE (S ≠ NULL) DO
Receive a message M(vl, vg, type)
Process based on type (REQUEST/UNAVAILABLE/FAILURE)
Dominating? Match; remove incident edges; send UNAVAILABLE message for cross-edges
Reset pointers; send messages as needed
Update: Counter[vg]: Decrement the counter
S: Remove vg from S when Counter[vg] = 0
Send FAILURE messages if some vertex cannot be matched
Slide32
Sample Execution
Slide33
Outline
Slide34
Performance of Serial Pointer-based Algorithm
2.4 GHz Intel Xeon (64-bit) with 32 GB RAM
Exact algorithm: O(|V|³)
Pointer-based ½-approx algorithm: O(|E|Δ), Δ = maximum degree
O(|E|): Expected with random edge-weights
Approximation algorithm is fast
Slide35
Relative Performance of Approx Algorithms
Pointer-based algorithm is relatively fast and computes matchings of the same quality as the others.
Slide36
Platform Details
Franklin: A massively parallel processing (MPP) Cray XT-4 system at NERSC with 9,660 compute nodes
38,640 processor cores
Theoretical peak: 356 Tflop/sec.
Compute Node: One 2.3 GHz quad core AMD Opteron processor; 8 GB RAM
Network: SeaStar2 router (3D torus topology)
Software:
  PGI C++ compilers (-fast*)
  Cray MPICH2
* -fast = -O2 -Munroll=c:1 -Mnoframe -Mlre -Mvect=sse -Mscalarsse -Mcache_align -Mflushz
Slide37
Test Set
Five-point Grid Graphs
Random Geometric Graphs
Scalable Synthetic Graphs (SSCA#2)
Graphs from Applications
[Legend for the plots that follow: T, P, Average Time, Maximum Time]
Slide38
Five-Point Grid Graphs
4k × 4k (|V| = 16,000,000; |E| = 31,992,000)
Slide39
Slide40
Slide41
Slide42
Weak Scaling Experiment (Five-Point Grid Graphs)
Slide43
Slide44
Slide45
Random Geometric Graphs (|V| = 640,000; |E| = 3,080,872)
SSCA#2 Graph (|V| = 524,288; |E| = 10,008,022)
Graphs from Applications
Source: University of Florida Sparse Matrix Collection
Slide46
Conclusions and Future Work
Slide47
Conclusions
New ⅔-approx Algorithm (bipartite MVM)
  Failed vertices; Counting technique
Better understanding of the MVM problem
  Reachability property
New ½-approx MVM Algorithms
  Restricted reachability property
Parallel ½-approx Algorithm for weighted matching
  Scalable implementation and analysis
Software Implementations
  Parallel: MatchBoxP (C++, MPI and STL)
  Serial: Milan (C++ and STL)
Slide48
Conclusions
Parallel Algorithms:
  Structure of the graph as well as partitioning affect performance
  Memory limitations will affect data structures (ghost vertices), and therefore, algorithm design and performance
  Hybrid implementations (MPI + OpenMP) can provide better performance
  Fewer partitions imply less communication
Slide49
Future Work
Experimental analysis of serial vertex-weighted matching algorithms
Parallel experiments on machines with thousands of cores at Purdue and the DOE leadership class facilities
Parallel graph generators
Software engineering: data structures, error handling, documentation, etc.
Parallel optimal matching algorithm
Slide50
Some Open Problems
Proof of correctness for Algorithm LOCALTWOTHIRD
⅔-approx algorithm for nonbipartite graphs
¾-approx algorithm for bipartite graphs
Slide51
Acknowledgements
Alex Pothen
Florin Dobrian
Assefaw Gebremedhin
Jessica Crouch
Bruce Hendrickson
Stephan Olariu
Mohammad Zubair
Slide52
Extra Slides
Slide53
Research Question
With reference to a (maximum or approximate) vertex-weighted matching M in graph G, what happens to the quality of M when the weight of a vertex changes after M has been computed?
Application: Network switches, where a matching needs to be computed for every cycle although a similar traffic pattern may continue over many cycles (incremental computation?)
Slide54
Answer: (1) Exact Algorithm
Let the weight of vertex vi change after M is computed.
Two possibilities:
  vi ∈ M
  vi ∉ M
Slide55
Background: Reachability Property
Given: G = (S, T, E), w: S → R+, and a matching MS
For every MS-matched vertex s’ reachable from an MS-unmatched vertex si via an MS-alternating path: w(s’) ≥ w(si)
Reachability ⇒ Optimality
Slide56
Answer: Exact Algorithm (Case 1)
From vertex vi (with new weight) find all alternating paths (no augmenting paths) to unmatched vertices vx.
Perform symmetric difference via the path that leads to the heaviest vx, such that w(vx) > w(vi)
Reachability property holds
[Figure: alternating path between vi and vx; one configuration is marked "Not possible"]
Slide57
Answer: Exact Algorithm (Case 2)
From vertex vi (with new weight) find all alternating paths (no augmenting paths) to matched vertices vx.
Perform symmetric difference via the path that leads to the lightest vx, such that w(vi) > w(vx)
Reachability property holds
[Figure: alternating path between vx and vi; two configurations are marked "Not possible"]
Slide58
Answer: ½-approx Algorithm (Case 1 & 2)
From vertex vi (with new weight) find all alternating paths of length two edges (no augmenting paths) to unmatched vertices vx.
Perform symmetric difference via the path that leads to the heaviest vx, such that w(vx) > w(vi)
Restricted Reachability property holds (in both cases)
Slide59
Answer: ⅔-approx Algorithm
Will vi become a failed vertex?
Yes: Argument similar to the proof for ⅔-approx at the induction step k+1
No: The approximation ratio will hold, with or without modifying the matching
Slide60
Additional Slides
Slide61
Mendelsohn-Dulmage Technique
Given a bipartite graph G = (S, T, E), and two matchings MS and MT,
a new matching M can be computed such that it matches:
  all S vertices matched by MS and
  all T vertices matched by MT.
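One way to realize this combination is sketched below. It is an illustration under stated assumptions rather than the thesis code: both matchings are dictionaries from S vertices to T vertices, and the function name is hypothetical. The sketch starts from MT and, for each S vertex matched only by MS, switches the alternating path that starts there over to MS edges.

```python
def mendelsohn_dulmage(ms, mt):
    """Combine matchings ms and mt (dicts mapping S vertex -> T vertex).

    The result covers every S vertex matched by ms and every T vertex
    matched by mt, using only edges taken from ms or mt.
    """
    mt_of_t = {t: s for s, t in mt.items()}  # T vertex -> its mt partner
    m = dict(mt)                             # start from mt
    for s in ms:
        if s in mt:
            continue  # already covered by an mt edge (or handled on a path)
        # s is matched only by ms: walk its alternating path and use ms edges.
        cur = s
        while cur is not None and cur in ms:
            t = ms[cur]
            nxt = mt_of_t.get(t)  # S vertex that mt pairs with t, if any
            if nxt is not None:
                del m[nxt]        # drop the mt edge (nxt, t)
            m[cur] = t            # take the ms edge (cur, t)
            cur = nxt
    return m
```

A path that starts at an S vertex covered only by MS cannot end at a T vertex covered only by MT, since such a path would have an odd number of edges and would therefore end with an MS edge. This is why switching the whole path to MS edges never uncovers a T vertex that MT matches.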
Slide62
Mendelsohn-Dulmage Technique
Compute: MS ⊕ MT
[Figure: the possible components between S and T: (a) cycle, (b) MS-augmenting path, (c) MT-augmenting path, (d) MS-alternating path, (e) MT-alternating path]
Slide63
Results: Weight of Matching
Slide64
Results: Execution Time
Slide65
Performance: Weight of matching
PG: Path Growing (Drake and Hougardy)
Slide66
Performance: Cardinality of matching
Approx algorithms compute good quality matchings.
Slide67
Slide68
Random Geometric Graphs
(|V| = 640,000; |E| = 3,080,872)
Slide69
Slide70
Slide71
Slide72
Slide73
SSCA#2 Graph
(|V| = 524,288; |E| = 10,008,022)
Slide74
Slide75
Slide76
Slide77
Graphs from Applications
Source: University of Florida Sparse Matrix Collection
Slide78
Slide79
Slide80