Slide1
Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses
Ruta Mehta
Indian Institute of Technology, Bombay
Joint work with Jugal Garg and Albert X. Jiang
Slide2
A Game: Rock-Paper-Scissors
Slide3
Rock-Paper-Scissors: A Play
Winner: $1
Slide6
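Rock-Paper-Scissors Payoffs
Each entry is (payoff of row player, payoff of column player).
              Rock    Paper   Scissors
  Rock        0,0     -1,1    1,-1
  Paper       1,-1    0,0     -1,1
  Scissors    -1,1    1,-1    0,0
Slide7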
Bimatrix Game
$S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$
A (payoffs of player 1; rows: player 1, columns: player 2):
       R    P    C
  R    0   -1    1
  P    1    0   -1
  C   -1    1    0
B (payoffs of player 2):
       R    P    C
  R    0    1   -1
  P   -1    0    1
  C    1   -1    0
Steady State: No player gains by unilateral deviation.
Slide8
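Bimatrix Game
$S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$
A (payoffs of player 1; rows: player 1, columns: player 2):
       R    P    C
  R    0   -1    1
  P    1    0   -1
  C   -1    1    0
B (payoffs of player 2):
       R    P    C
  R    0    1   -1
  P   -1    0    1
  C    1   -1    0
No Steady State: at every pair of pure strategies, some player gains by deviating.
Slide9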
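Mixed Play: Steady State
Each player plays R, P, and C with probability 1/3 each.
$S_1 = \{R, P, C\}$, $\Delta_1 = \{(r_1, p_1, c_1) : r_1, p_1, c_1 \ge 0;\ r_1 + p_1 + c_1 = 1\}$
$S_2 = \{R, P, C\}$, $\Delta_2 = \{(r_2, p_2, c_2) : r_2, p_2, c_2 \ge 0;\ r_2 + p_2 + c_2 = 1\}$
A (payoffs of player 1; rows: player 1, columns: player 2):
         R 1/3   P 1/3   C 1/3
  R 1/3    0      -1       1
  P 1/3    1       0      -1
  C 1/3   -1       1       0
B (payoffs of player 2):
         R 1/3   P 1/3   C 1/3
  R 1/3    0       1      -1
  P 1/3   -1       0       1
  C 1/3    1      -1       0
Slide10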
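The mixed steady state of the Rock-Paper-Scissors example on the preceding slide can be checked numerically. The following is a minimal sketch, assuming the standard payoffs (win = 1, loss = -1, tie = 0); the helper names `row_payoffs`, `col_payoffs`, `expected`, and `is_nash` are illustrative and not from the talk. Because payoffs are linear in each player's own mix, it is enough to compare against pure deviations.

```python
from fractions import Fraction

# Rock-Paper-Scissors payoffs (rows: player 1 plays R, P, C; columns: player 2).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]
# Zero-sum game: player 2's payoff is the negative of player 1's.
B = [[-a for a in row] for row in A]


def row_payoffs(M, y):
    """Expected payoff of each pure row strategy against column mix y."""
    return [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]


def col_payoffs(M, x):
    """Expected payoff of each pure column strategy against row mix x."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


def expected(M, x, y):
    """Bilinear payoff x^T M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def is_nash(A, B, x, y):
    """True if no pure deviation beats the current expected payoff of either player."""
    return (max(row_payoffs(A, y)) <= expected(A, x, y)
            and max(col_payoffs(B, x)) <= expected(B, x, y))


third = Fraction(1, 3)
uniform = [third, third, third]
pure_rock = [1, 0, 0]

print(is_nash(A, B, uniform, uniform))      # mixed play with 1/3 each
print(is_nash(A, B, pure_rock, pure_rock))  # both players play Rock
```

The first check succeeds and the second fails, since against Rock a player gains by switching to Paper.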
John Nash (1951)
Finite Game: Finitely many players, each with finitely many strategies.
Nash: Every finite game has a steady state in mixed strategies.
Henceforth called Nash equilibrium (NE).
Proved using the Kakutani fixed point theorem: highly nonconstructive.
Slide11
Nash Equilibrium Computation
Papadimitriou (JCSS’94): PPAD class. Problems where existence is guaranteed, such as fixed points, Sperner’s Lemma, and Nash equilibrium.
Chen and Deng (FOCS’06): It is PPAD-hard.
CDT (FOCS’06): Even approximation is PPAD-hard.
Slide12
Rank and Computation
Kannan and Theobald (SODA’07):
Define the rank of (A, B) as rank(A+B).
FPTAS for fixed rank games.
Polynomial time algorithms for exact Nash:
Dantzig (1963): Zero-sum (rank-0) is equivalent to LP.
AGMS (STOC’11): Rank-1 games.
Slide13
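To make the rank definition on the preceding slide concrete, here is a small sketch that computes rank(A+B) exactly. The function names `matrix_rank` and `game_rank` and the example matrices are assumptions for illustration; the elimination is ordinary Gaussian elimination over rationals, not anything specific to the cited papers.

```python
from fractions import Fraction


def matrix_rank(M):
    """Rank of M via Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a pivot row at or below the current rank.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate the column below the pivot.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def game_rank(A, B):
    """Rank of the game (A, B), defined as rank(A + B)."""
    total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return matrix_rank(total)


# Rock-Paper-Scissors is zero-sum, so A + B = 0.
rps_A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
rps_B = [[-v for v in row] for row in rps_A]

# Hypothetical 2x2 game with A + B = [[1, 1], [2, 2]].
g_A = [[1, 2], [3, 4]]
g_B = [[0, -1], [-1, -2]]

print(game_rank(rps_A, rps_B), game_rank(g_A, g_B))
```

Exact rationals are used so that the rank is not affected by floating-point round-off.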
Bilinear Games
Bimatrix game with polyhedral strategy sets.
Two players: 1 and 2
Polyhedral strategy sets: $X = \{x \mid Ex = e;\ x \ge 0\}$, $Y = \{y \mid Fy = f;\ y \ge 0\}$
Payoff matrices: A, B
Bilinear payoff: (x, y) fetches $x^T A y$ to player 1, and $x^T B y$ to player 2.
Motivation: Koller et al. (STOC’94) for two-player extensive form games with perfect recall.
Slide14
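The definition of a bilinear game on the preceding slide can be illustrated with a tiny instance. This is a sketch under assumptions: the matrices `E`, `F`, `A`, `B` below are made up, and `in_strategy_set` and `bilinear_payoffs` are hypothetical helper names. Here X is a product of two simplices, which is not a simplex itself, so the game is bilinear but not bimatrix.

```python
from fractions import Fraction


def in_strategy_set(E, e, x):
    """Return True if x lies in {x | Ex = e, x >= 0}."""
    if any(v < 0 for v in x):
        return False
    return all(sum(c * v for c, v in zip(row, x)) == rhs for row, rhs in zip(E, e))


def bilinear_payoffs(A, B, x, y):
    """Return (x^T A y, x^T B y), the payoffs of players 1 and 2."""
    def form(M):
        return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return form(A), form(B)


# Hypothetical example: X is a product of two simplices (two independent
# choices for player 1), Y is a single simplex.
E = [[1, 1, 0, 0],
     [0, 0, 1, 1]]
e = [1, 1]
F = [[1, 1]]
f = [1]
A = [[1, 0], [0, 1], [2, 0], [0, 2]]
B = [[0, 1], [1, 0], [0, 0], [1, 1]]

half = Fraction(1, 2)
x = [half, half, 1, 0]
y = [half, half]

print(in_strategy_set(E, e, x), in_strategy_set(F, f, y))
print(bilinear_payoffs(A, B, x, y))
```

A bimatrix game is recovered by taking E and F to be single rows of ones with e = f = 1.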
Nash Equilibrium in Bilinear Games
NE: No player gains by unilateral deviation.
Existence: Corollary of Glicksberg’s result.
Symmetric Game: $B = A^T$ and $Y = X$. (x, y) is a symmetric profile if y = x.
Existence of symmetric NE: An adaptation of Nash’s proof for symmetric bimatrix games.
Slide15
Bilinear Contains: Bimatrix, Polymatrix, Bayesian, etc.
Bimatrix: $X = \Delta_1$, $Y = \Delta_2$
Polymatrix: N players. Each pair plays a bimatrix game.
Player i: $S_i$ finite strategy set, $\Delta_i$ mixed strategy set.
Goal of i: Choose $x_i$ from $\Delta_i$ to maximize total payoff.
[Figure: players i and j joined by an edge labeled $A_{ij}$.]
Slide16
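Polymatrix to Bilinear
$M = |S_1| + \dots + |S_n|$. $X = \{(x_1, \dots, x_n) \mid x_i \in \Delta_i\}$, $Y = X$.
A is the $M \times M$ block matrix with $A_{ij}$ in block-row i, block-column j and zero blocks on the diagonal; $B = A^T$.
Symmetric NE of (A, B) maps to a NE of the polymatrix game.
Slide17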
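The polymatrix-to-bilinear reduction on the preceding slide amounts to assembling one block matrix. A minimal sketch, assuming the pairwise payoff matrices are given as a dictionary keyed by player pairs; `polymatrix_to_bilinear`, `sizes`, and `pair_payoffs` are hypothetical names and the numbers are arbitrary.

```python
def polymatrix_to_bilinear(sizes, pair_payoffs):
    """Build (A, B) with B = A^T from pairwise payoff blocks.

    sizes[i] is |S_i|; pair_payoffs[(i, j)] is A_ij, the |S_i| x |S_j|
    payoff matrix of player i against player j (i != j).
    """
    # offsets[i] is the index where player i's strategies start.
    offsets = [0]
    for s in sizes:
        offsets.append(offsets[-1] + s)
    M = offsets[-1]
    A = [[0] * M for _ in range(M)]
    for (i, j), block in pair_payoffs.items():
        if i == j:
            raise ValueError("diagonal blocks must stay zero")
        for r, row in enumerate(block):
            for c, value in enumerate(row):
                A[offsets[i] + r][offsets[j] + c] = value
    B = [list(col) for col in zip(*A)]  # B = A^T
    return A, B


# Hypothetical two-player instance with two strategies each.
sizes = [2, 2]
pair_payoffs = {
    (0, 1): [[1, 2], [3, 4]],
    (1, 0): [[5, 6], [7, 8]],
}
A, B = polymatrix_to_bilinear(sizes, pair_payoffs)
for row in A:
    print(row)
```

With y = x, the i-th block of Ay collects player i's payoffs against all other players, which is why a symmetric profile of (A, B) corresponds to a strategy profile of the polymatrix game.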
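Best Response (Koller et al.)
Fix a strategy y of player 2. Player 1 solves the LP on the left; its dual is on the right.
max: $x^T(Ay)$ s.t. $Ex = e$, $x \ge 0$        min: $e^T p$ s.t. $p^T E \ge (Ay)^T$
At optimal: p s.t. $A_i y \le p^T E_i$ and $x_i > 0 \Rightarrow A_i y = p^T E_i$.
Given $x \in X$, for player 2 we get
At optimal: q s.t. $x^T B_j \le q^T F_j$ and $y_j > 0 \Rightarrow q^T F_j = x^T B_j$.
(Here $A_i$ is row i of A, and $B_j$, $E_i$, $F_j$ are columns of B, E, F.)
Slide18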
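Best Response Polytopes (BRPs)
(x, y) is a NE iff there exist p and q such that
p: $Ay \le E^T p$; $x_i > 0 \Rightarrow A_i y = p^T E_i$
q: $x^T B \le q^T F$; $y_j > 0 \Rightarrow q^T F_j = x^T B_j$
P is the polytope of pairs (y, p) with $y \in Y$ satisfying the first inequality; Q is the polytope of pairs (x, q) with $x \in X$ satisfying the second.
On P × Q: $x^T(Ay - E^T p) \le 0$ and $(x^T B - q^T F)y \le 0$, hence $x^T(A+B)y - e^T p - f^T q \le 0$.
Slide19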
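The inequality at the end of the preceding slide can be evaluated directly in the bimatrix special case. The sketch below assumes X and Y are simplices, so E and F are single rows of ones, e = f = 1, and the tightest feasible duals are the scalars p = max_i (Ay)_i and q = max_j (x^T B)_j. The function name `nash_gap` is hypothetical. The returned quantity $x^T(A+B)y - e^T p - f^T q$ is at most 0, and equals 0 exactly at a Nash equilibrium.

```python
from fractions import Fraction


def nash_gap(A, B, x, y):
    """Return x^T(A+B)y - p - q for a bimatrix game (simplex strategy sets)."""
    m, n = len(A), len(A[0])
    Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    xB = [sum(x[i] * B[i][j] for i in range(m)) for j in range(n)]
    p = max(Ay)  # smallest p with Ay <= E^T p when E is a row of ones
    q = max(xB)  # smallest q with x^T B <= q^T F when F is a row of ones
    value = sum(x[i] * Ay[i] for i in range(m)) + sum(xB[j] * y[j] for j in range(n))
    return value - p - q


# Rock-Paper-Scissors, assuming payoffs win = 1, loss = -1, tie = 0.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B = [[-v for v in row] for row in A]

third = Fraction(1, 3)
uniform = [third, third, third]
rock = [1, 0, 0]

print(nash_gap(A, B, uniform, uniform))  # equilibrium profile
print(nash_gap(A, B, rock, rock))        # non-equilibrium profile
```

For general E and F the duals p and q are vectors and would come from solving the best-response LPs, which this sketch does not attempt.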
Nash Equilibrium in BRPs
NE iff $x^T(Ay - E^T p) = 0$ and $(x^T B - q^T F)y = 0$,
i.e., $x^T(A+B)y - e^T p - f^T q = 0$.
Assumption: P and Q are nondegenerate.
(u, v) of P × Q gives a NE $\Rightarrow$ (u, v) is a vertex.
Slide20
QP Formulation
max: $x^T(A+B)y - e^T p - f^T q$
s.t. $(y, p) \in P$, $(x, q) \in Q$
Optimal value 0.
Only vertex solutions.
Slide21
Our Results
Rank-1 games: rank(A+B) = 1. Extend the Adsul et al. algorithm for exact NE.
Fixed rank games: rank(A+B) = k. Extend the FPTAS of Kannan et al.
Rank of A or B is constant: Enumerate all NE in polynomial time.
Slide22
Rank-1 Case
Zero-sum ~ rank(A+B) = 0: LP formulation (Charnes’53)
If rank(A+B) = 1, then $A + B = a \cdot b^T$.
The QP formulation:
max: $(x^T a)(b^T y) - e^T p - f^T q$
s.t. $(y, p) \in P$, $(x, q) \in Q$
Slide23
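The factorization $A + B = a \cdot b^T$ used on the preceding slide can be recovered from any nonzero entry of the matrix. This is a sketch under assumptions: `rank_one_factors` is a hypothetical helper, and the scaling of a and b is one arbitrary choice among many, since $(ta)(b/t)^T$ gives the same product for any nonzero t.

```python
from fractions import Fraction


def rank_one_factors(M):
    """Return (a, b) with M = a b^T, or None if M does not have rank exactly 1."""
    # Locate any nonzero entry to use as a pivot.
    pivot = next(((r, c) for r, row in enumerate(M)
                  for c, v in enumerate(row) if v != 0), None)
    if pivot is None:
        return None  # zero matrix: rank 0
    r, c = pivot
    b = [Fraction(v) for v in M[r]]             # pivot row
    a = [Fraction(row[c]) / b[c] for row in M]  # pivot column, scaled so a[r] = 1
    # Verify the factorization; it fails when rank(M) > 1.
    for i, row in enumerate(M):
        for j, v in enumerate(row):
            if a[i] * b[j] != v:
                return None
    return a, b


# Hypothetical game with A + B = [[1, 1], [2, 2]].
A = [[1, 2], [3, 4]]
B = [[0, -1], [-1, -2]]
total = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(A, B)]
print(rank_one_factors(total))
```

With a and b in hand, the quadratic term $x^T(A+B)y$ becomes the product of two linear terms, $(x^T a)(b^T y)$.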
Rank-1 Case
Replace $(x^T a)$ by z. Recall $B = -A + a \cdot b^T$.
$x^T(A+B)y - e^T p - f^T q = 0$ becomes $z(b^T y) - e^T p - f^T q = 0$.
N = Points of P × Q’ with $z(b^T y) - e^T p - f^T q = 0$.
Forms paths and cycles, since z gives one degree of freedom.
NE of (A, B): Points in the intersection of N and $z - x^T a = 0$.
Slide24
Parameterized LP
LP(z) = max: $z(b^T y) - e^T p - f^T q$
s.t. $(y, p) \in P$, $(x, z, q) \in Q'$
Given any c, the optimal value of LP(c) is 0.
OPT(c) lies on N. Let N(c) = {Points of N with z = c}; then OPT(c) = N(c).
N is a single path on which z is monotonic.
Slide25
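Rank-1: The Algorithm
NE: Intersection of N and H: $z - x^T a = 0$.
$c_1 = a_{min}$, $c_2 = a_{max}$
[Figure: the path N crossing the hyperplane H, with half-spaces H− and H+; the NE lies at the crossing, between N(c1) and N(c2).]
Slide26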
Rank-1: Binary Search Algorithm
NE of (A, B): Points in the intersection of N and H.
$c = (c_1 + c_2)/2$.
[Figure: the path N crossing H, with the point N(c) between N(c1) and N(c2); half-spaces H− and H+.]
Slide27
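Rank-1: Binary Search Algorithm
NE of (A, B): Points in the intersection of N and H.
$c = (c_1 + c_2)/2$.
If N(c) is in H−, then $c_1 = c$, else $c_2 = c$.
[Figure: the path N crossing H after the update, with the NE between N(c1) and N(c2); half-spaces H− and H+.]
Slide28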
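The update rule on the preceding slide is an ordinary bisection on the parameter c. The sketch below is not the algorithm from the talk: it replaces the LP solve by a hypothetical oracle `xa_at(c)` that returns $x^T a$ at the point N(c), assumes H− is the side where $z - x^T a < 0$, and stops at a numeric tolerance instead of locating the exact edge of N that crosses H. The toy oracle is made up purely to exercise the loop.

```python
def binary_search_on_path(xa_at, c1, c2, tol=1e-9, max_iter=200):
    """Bisection for a value c with c - xa_at(c) close to 0.

    xa_at(c) is a hypothetical oracle returning x^T a at N(c); in the talk
    this point is obtained by solving LP(c).
    """
    for _ in range(max_iter):
        c = (c1 + c2) / 2
        gap = c - xa_at(c)  # sign of z - x^T a at N(c)
        if abs(gap) <= tol:
            return c
        if gap < 0:
            c1 = c          # N(c) in H-: raise the lower end
        else:
            c2 = c          # N(c) in H+: lower the upper end
    return (c1 + c2) / 2


def toy_oracle(c):
    """Made-up stand-in for solving LP(c); crosses H at c = 0.4."""
    return 0.25 * c + 0.3


c_star = binary_search_on_path(toy_oracle, 0.0, 1.0)
print(round(c_star, 6))
```

The starting interval mirrors $c_1 = a_{min}$ and $c_2 = a_{max}$: at the lower end the gap is nonpositive and at the upper end it is nonnegative, so a crossing lies in between.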
Analysis
Terminates because:
z is monotonic on N.
The increase in z on each edge is lower bounded by 1/d, where d is polynomial sized in the input.
Time complexity:
Solve LP(c) to get N(c) in each pivot.
$\log(d) \cdot \log(a_{max} - a_{min})$ pivots.
Slide29
Conclusions
Bilinear games: Bimatrix with polytopal strategy sets.
Fairly general. Contains polymatrix, Bayesian, etc.
Polynomial time algorithm for rank based subclasses.
Open problems:
Designing a Lemke-Howson type algorithm.
Degree, index, stability concepts.
Computation of approximate equilibrium.
Slide30
Thank You