Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses

myesha-ticknor
Uploaded On 2017-04-22



Presentation Transcript

Slide1

Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses

Ruta Mehta

Indian Institute of Technology, Bombay

Joint work with Jugal Garg and Albert X. Jiang

Slide2

A Game: Rock-Paper-Scissors

Slide3

Rock-Paper-Scissors: A Play

Winner: $1

Slide4

Rock-Paper-Scissors: A Play

Winner: $1

Slide5

Rock-Paper-Scissors: A Play

Winner: $1

Slide6

Rock-Paper-Scissors Payoffs

 0,0    -1,1     1,-1
 1,-1    0,0    -1,1
-1,1     1,-1    0,0

Slide7

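Bimatrix Game

$S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$

A (payoff matrix of player 1; rows are player 1's strategies, columns are player 2's):

       R     P     C
R      0    -1     1
P      1     0    -1
C     -1     1     0

B (payoff matrix of player 2):

       R     P     C
R      0     1    -1
P     -1     0     1
C      1    -1     0

Steady State: No player gains by unilateral deviation.

Slide8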

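Bimatrix Game

$S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$

A (payoff matrix of player 1):

       R     P     C
R      0    -1     1
P      1     0    -1
C     -1     1     0

B (payoff matrix of player 2):

       R     P     C
R      0     1    -1
P     -1     0     1
C      1    -1     0

No Steady State (in pure strategies): at every pair of pure choices, one of the players gains by switching.

Slide9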
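Before moving to mixed play, the claim on the preceding slide can be checked by brute force. The sketch below is illustrative only (the function and variable names are not from the talk): it enumerates all nine pure profiles of the Rock-Paper-Scissors game and keeps those where neither player gains by deviating alone.

```python
from itertools import product

# Payoff matrices from the slide; rows index player 1's choice, columns player 2's.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]   # player 1
B = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]   # player 2
NAMES = "RPC"


def pure_nash_profiles(A, B):
    """Return all pure profiles where neither player gains by switching alone."""
    rows, cols = range(len(A)), range(len(A[0]))
    found = []
    for i, j in product(rows, cols):
        best_for_1 = all(A[i][j] >= A[k][j] for k in rows)   # no better row against j
        best_for_2 = all(B[i][j] >= B[i][l] for l in cols)   # no better column against i
        if best_for_1 and best_for_2:
            found.append((NAMES[i], NAMES[j]))
    return found


pure_equilibria = pure_nash_profiles(A, B)
print(pure_equilibria)
```

The list comes back empty: whenever player 1 is best-responding (payoff 1), player 2 is receiving -1 and would rather switch.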

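Mixed Play

Steady State: each player plays R, P, and C with probability 1/3.

A (payoff matrix of player 1), with player 2's mixing probabilities on the columns:

       R 1/3   P 1/3   C 1/3
R        0      -1       1
P        1       0      -1
C       -1       1       0

B (payoff matrix of player 2), with player 1's mixing probabilities on the rows:

           R     P     C
R 1/3      0     1    -1
P 1/3     -1     0     1
C 1/3      1    -1     0

$S_1 = \{R, P, C\}$, $\Delta_1 = \{(r_1, p_1, c_1) \mid r_1, p_1, c_1 \ge 0;\ r_1 + p_1 + c_1 = 1\}$

$S_2 = \{R, P, C\}$, $\Delta_2 = \{(r_2, p_2, c_2) \mid r_2, p_2, c_2 \ge 0;\ r_2 + p_2 + c_2 = 1\}$

Slide10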
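The uniform mixed profile on the preceding slide can be verified directly. A minimal sketch, assuming the Rock-Paper-Scissors matrices A and B given earlier; `is_nash` is an illustrative helper, and it only tests pure deviations, which suffices because expected payoff is linear in a player's own mixed strategy.

```python
from fractions import Fraction

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]   # player 1's payoffs
B = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]   # player 2's payoffs
third = Fraction(1, 3)
x = [third] * 3   # player 1 mixes uniformly over R, P, C
y = [third] * 3   # player 2 mixes uniformly over R, P, C


def is_nash(A, B, x, y):
    """True if no pure deviation beats the expected payoff of (x, y)."""
    n, m = len(A), len(A[0])
    row_payoffs = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]  # (Ay)_i
    col_payoffs = [sum(x[i] * B[i][j] for i in range(n)) for j in range(m)]  # (x^T B)_j
    value_1 = sum(x[i] * row_payoffs[i] for i in range(n))                   # x^T A y
    value_2 = sum(col_payoffs[j] * y[j] for j in range(m))                   # x^T B y
    return max(row_payoffs) <= value_1 and max(col_payoffs) <= value_2


uniform_is_nash = is_nash(A, B, x, y)
rock_vs_uniform = is_nash(A, B, [1, 0, 0], y)   # player 1 always plays Rock
print(uniform_is_nash, rock_vs_uniform)
```

Against a uniform opponent every pure strategy earns 0, so the uniform profile passes. If player 1 always plays Rock, player 2 can gain by switching to Paper, so that profile fails.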

John Nash (1951)

Finite Game: Finitely many players, each with finitely many strategies.

Nash: Every finite game has a steady state in mixed strategies.

Henceforth called Nash equilibrium (NE).

Proved using Kakutani fixed point theorem: Highly non-constructive.

Slide11

Nash Equilibrium Computation

Papadimitriou (JCSS’94): The class PPAD. Problems where existence is guaranteed, like fixed point, Sperner’s Lemma, Nash equilibrium.

Chen and Deng (FOCS’06): Computing a NE of a two-player game is PPAD-hard.

CDT (FOCS’06): Even approximation is PPAD-hard.

Slide12

Rank and Computation

Kannan and Theobald (SODA’07): Define rank of (A,B) as rank(A+B). FPTAS for fixed rank games.

Polynomial time algorithms for exact Nash:

Dantzig (1963): Zero-sum (rank-0) is equivalent to LP.

AGMS (STOC’11): Rank-1 games.

Slide13
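As a small illustration of the rank definition on the preceding slide, the sketch below computes rank(A+B) with exact rational Gaussian elimination. The helper names and the second example game are assumptions made for illustration; only the definition itself comes from the talk.

```python
from fractions import Fraction


def matrix_rank(M):
    """Rank via Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue                                  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def game_rank(A, B):
    """Rank of the game (A, B), defined as rank(A + B)."""
    total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return matrix_rank(total)


# Rock-Paper-Scissors is zero-sum, so A + B is the zero matrix.
rps_A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
rps_B = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
rank_rps = game_rank(rps_A, rps_B)

# Hypothetical game with A + B = a b^T for a = (1, 2), b = (1, 3).
A1 = [[1, 0], [0, 1]]
B1 = [[0, 3], [2, 5]]
rank_one = game_rank(A1, B1)
print(rank_rps, rank_one)
```

Note that A1 itself has rank 2; the game counts as rank-1 only because the sum A1 + B1 does.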

Bilinear Games

Bimatrix game with polyhedral strategy sets.

Two players: 1 and 2.

Polyhedral strategy sets: $X = \{x \mid Ex = e;\ x \ge 0\}$, $Y = \{y \mid Fy = f;\ y \ge 0\}$.

Payoff matrices: A, B.

Bilinear payoff: (x, y) fetches $x^T A y$ to player 1, and $x^T B y$ to player 2.

Motivation: Koller et al. (STOC’94) for two-player extensive form games with perfect recall.

Slide14
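The definition on the preceding slide can be made concrete with a few lines of code. This is a minimal sketch with illustrative names (`in_polyhedron`, `bilinear_payoffs`), using the Rock-Paper-Scissors matrices as data; it treats a bimatrix game as the special case where E and F are single rows of ones and e = f = (1).

```python
from fractions import Fraction


def mat_vec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def in_polyhedron(E, e, x):
    """Membership test for {x | Ex = e, x >= 0}."""
    return all(xi >= 0 for xi in x) and mat_vec(E, x) == list(e)


def bilinear_payoffs(A, B, x, y):
    """Return (x^T A y, x^T B y), the payoffs of players 1 and 2."""
    u1 = sum(xi * v for xi, v in zip(x, mat_vec(A, y)))
    u2 = sum(xi * v for xi, v in zip(x, mat_vec(B, y)))
    return u1, u2


A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
E = [[1, 1, 1]]          # one equality: probabilities sum to 1
e = [1]

x = [Fraction(1, 2), Fraction(1, 2), 0]   # player 1 mixes Rock and Paper
y = [0, 1, 0]                             # player 2 plays Paper
feasible = in_polyhedron(E, e, x) and in_polyhedron(E, e, y)
payoffs = bilinear_payoffs(A, B, x, y)
print(feasible, payoffs)
```

Richer strategy sets only change E, e, F, and f; the payoff computation stays the same.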

Nash Equilibrium in Bilinear Games

NE: No player gains by unilateral deviation.

Existence: Corollary of Glicksberg’s result.

Symmetric Game: $B = A^T$ and $Y = X$.

(x, y) is a symmetric profile if y = x.

Existence of symmetric NE: An adaptation of Nash’s proof for symmetric bimatrix games.

Slide15

Bilinear Contains: Bimatrix, Polymatrix, Bayesian, etc.

Bimatrix: $X = \Delta_1$, $Y = \Delta_2$.

Polymatrix: N players. Each pair plays a bimatrix game.

Player i: $S_i$ finite strategy set, $\Delta_i$ mixed strategy set.

Goal of i: Choose $x_i$ from $\Delta_i$ to maximize total payoff.

[Figure: players i and j, with the matrix $A_{ij}$ of their pairwise game.]

Slide16

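Polymatrix to Bilinear

$M = |S_1| + \dots + |S_n|$.

$X = \{(x_1, \dots, x_n) \mid x_i \in \Delta_i\}$, $Y = X$.

A is the $M \times M$ block matrix with $A_{ij}$ in block (i, j) and 0 in the diagonal blocks; $B = A^T$.

Symmetric NE of (A,B) maps to a NE of the polymatrix game.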
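The block construction on this slide can be sketched as follows. The function name, the dictionary layout for the pairwise matrices, and the 3-player instance are all hypothetical, chosen only to show how the blocks are placed.

```python
def polymatrix_to_bilinear(sizes, pair_payoffs):
    """Assemble the M x M block matrix A, with M = sum(sizes).

    sizes[i] is |S_i|; pair_payoffs[(i, j)] is the |S_i| x |S_j| matrix A_ij
    of payoffs player i receives from its game against player j.
    Blocks with no entry (including the diagonal) stay zero.
    """
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]
    M = sum(sizes)
    A = [[0] * M for _ in range(M)]
    for (i, j), block in pair_payoffs.items():
        for r, row in enumerate(block):
            for c, value in enumerate(row):
                A[offsets[i] + r][offsets[j] + c] = value
    return A


def transpose(M):
    return [list(col) for col in zip(*M)]


# Hypothetical 3-player instance with 2, 1 and 2 strategies.
sizes = [2, 1, 2]
pair_payoffs = {
    (0, 1): [[1], [2]],
    (1, 0): [[3, 4]],
    (0, 2): [[5, 6], [7, 8]],
    (2, 0): [[0, 1], [1, 0]],
    (1, 2): [[9, 9]],
    (2, 1): [[2], [3]],
}
A = polymatrix_to_bilinear(sizes, pair_payoffs)
B = transpose(A)
for row in A:
    print(row)
```

With this layout, block row i of A applied to a stacked profile x gives the sum over j of $A_{ij} x_j$, which is player i's payoff vector in the polymatrix game.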

Slide17

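Best Response (Koller et al.)

Fix a strategy y of player 2. Player 1 solves the LP on the left; its dual is on the right.

max: $x^T (Ay)$              min: $e^T p$
s.t. $Ex = e$                s.t. $p^T E \ge (Ay)^T$
     $x \ge 0$

Here $A_i$ is row i of A, and $E_i$, $B_j$, $F_j$ are columns of E, B, F.

At optimal: $\exists p$ s.t. $A_i y \le p^T E_i$ for all i, and $x_i > 0 \Rightarrow A_i y = p^T E_i$.

Given $x \in X$, for player 2 we get

At optimal: $\exists q$ s.t. $x^T B_j \le q^T F_j$ for all j, and $y_j > 0 \Rightarrow q^T F_j = x^T B_j$.

Slide18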

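Best Response Polytopes (BRPs)

(x,y) is a NE iff

$\exists p$: $Ay \le E^T p$; $x_i > 0 \Rightarrow A_i y = p^T E_i$

$\exists q$: $x^T B \le q^T F$; $y_j > 0 \Rightarrow q^T F_j = x^T B_j$

For all feasible (y, p) and (x, q):

$x^T (Ay - E^T p) \le 0$ and $(x^T B - q^T F) y \le 0$

$x^T (A+B) y - e^T p - f^T q \le 0$

Slide19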
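The combined inequality that closes the preceding slide follows from the two separate ones once feasibility, $Ex = e$ and $Fy = f$, is substituted. A sketch of the algebra in the slides' notation, with q denoting player 2's dual variable:

```latex
\begin{align*}
x^T (Ay - E^T p) &= x^T A y - (Ex)^T p = x^T A y - e^T p \;\le\; 0, \\
(x^T B - q^T F)\, y &= x^T B y - q^T (Fy) = x^T B y - q^T f \;\le\; 0, \\
\text{adding:}\quad x^T (A+B)\, y - e^T p - f^T q &\;\le\; 0 .
\end{align*}
```

Each of the first two lines is non-positive because it pairs a non-negative vector ($x \ge 0$ or $y \ge 0$) with a non-positive slack vector, so the sum is zero exactly when both terms are zero.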

Nash Equilibrium in BRPs

NE iff $x^T (Ay - E^T p) = 0$ and $(x^T B - q^T F) y = 0$, or equivalently

$x^T (A+B) y - e^T p - f^T q = 0$

Assumption: P and Q are non-degenerate.

(u, v) of P x Q gives a NE => (u, v) is a vertex.

Slide20

QP Formulation

max: $x^T (A+B) y - e^T p - f^T q$

s.t. $(y, p) \in P$, $(x, q) \in Q$

Optimal value 0.

Only vertex solutions.
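To see the "optimal value 0" claim on a concrete instance, the sketch below evaluates the objective for Rock-Paper-Scissors viewed as a bilinear game with E = F = (1 1 1) and e = f = (1), so that p and q are scalars. The helper names are illustrative, and `tightest_duals` simply picks the smallest feasible p and q for a given (x, y); it is not a solver.

```python
from fractions import Fraction

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]


def qp_objective(A, B, x, y, p, q):
    """x^T (A+B) y - e^T p - f^T q, with e = f = (1) and scalar p, q."""
    n, m = len(A), len(A[0])
    bilinear = sum(x[i] * (A[i][j] + B[i][j]) * y[j]
                   for i in range(n) for j in range(m))
    return bilinear - p - q


def tightest_duals(A, B, x, y):
    """Smallest p with Ay <= p*1 and smallest q with x^T B <= q*1."""
    n, m = len(A), len(A[0])
    p = max(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    q = max(sum(x[i] * B[i][j] for i in range(n)) for j in range(m))
    return p, q


third = Fraction(1, 3)
uniform = [third] * 3
rock = [1, 0, 0]

value_at_ne = qp_objective(A, B, uniform, uniform,
                           *tightest_duals(A, B, uniform, uniform))
value_at_rock = qp_objective(A, B, rock, rock,
                             *tightest_duals(A, B, rock, rock))
print(value_at_ne, value_at_rock)
```

The uniform equilibrium attains 0, while the non-equilibrium profile (Rock, Rock) scores -2 even with the tightest feasible duals, consistent with the objective being non-positive on the feasible region.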

Slide21

Our Results

Rank-1 games: rank(A+B)=1. Extend Adsul et al. algorithm for exact NE.

Fixed rank games: rank(A+B)=k. Extend FPTAS of Kannan et al.

Rank of A or B is constant: Enumerate all NE in polynomial time.

Slide22

Rank-1 Case

Zero-sum ~ rank(A+B)=0: LP formulation (Charnes’53).

rank(A+B)=1 then $A + B = a \cdot b^T$.

The QP formulation:

max: $(x^T a)(b^T y) - e^T p - f^T q$

s.t. $(y, p) \in P$, $(x, q) \in Q$

Slide23
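For the rank-1 case introduced on the preceding slide, the vectors a and b can be read off from A+B. The sketch below is one simple way to do so, assuming exact rational entries; the function name and the 2x2 example are hypothetical, and the factorization is unique only up to scaling a by a nonzero t and b by 1/t.

```python
from fractions import Fraction


def rank_one_factors(M):
    """Return (a, b) with M[i][j] == a[i] * b[j], or None if M is not rank 1."""
    pivot = next(((i, j) for i, row in enumerate(M)
                  for j, v in enumerate(row) if v != 0), None)
    if pivot is None:
        return None                                   # zero matrix has rank 0
    i0, j0 = pivot
    b = [Fraction(v) for v in M[i0]]                  # pivot row
    a = [Fraction(row[j0]) / b[j0] for row in M]      # scaled so that a[i0] == 1
    ok = all(a[i] * b[j] == M[i][j]
             for i in range(len(M)) for j in range(len(M[0])))
    return (a, b) if ok else None


# Hypothetical game whose sum A + B has rank 1.
A = [[1, 0], [0, 1]]
B = [[0, 3], [2, 5]]
S = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(A, B)]
factors = rank_one_factors(S)
print(factors)
```

Given such a and b, the bilinear term $x^T (A+B) y$ becomes the product of the two scalars $x^T a$ and $b^T y$, which is what makes replacing one of them by a single parameter possible.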

Rank-1 Case

Replace $(x^T a)$ by z. Recall $B = -A + a \cdot b^T$.

$x^T (A+B) y - e^T p - f^T q = 0 \;\Rightarrow\; z (b^T y) - e^T p - f^T q = 0$

N = Points of $P \times Q'$ with $z (b^T y) - e^T p - f^T q = 0$.

Forms paths and cycles, since z gives one degree of freedom.

NE of (A,B): Points in intersection of N and $z - x^T a = 0$.

Slide24

Parameterized LP

LP(z) = max: $z (b^T y) - e^T p - f^T q$

s.t. $(y, p) \in P$, $(x, z, q) \in Q'$

Given any c, optimal value of LP(c) is 0.

OPT(c) lies on N. Let N(c) = {Points of N with z=c}; then OPT(c) = N(c).

N is a single path on which z is monotonic.

Slide25

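Rank-1: The Algorithm

NE: Intersection of N and H: $z - x^T a = 0$.

$c_1 = a_{\min}$, $c_2 = a_{\max}$.

[Figure: the path N and the hyperplane H with its two sides $H^-$ and $H^+$; the points N(c1), N(c2), and NE are marked.]

Slide26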

Rank-1: Binary Search Algorithm

NE of (A,B): Points in intersection of N and H.

$c = (c_1 + c_2)/2$.

[Figure: the path N and the hyperplane H with sides $H^-$ and $H^+$; the points N(c1), N(c), N(c2), and NE are marked.]

Slide27

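Rank-1: Binary Search Algorithm

NE of (A,B): Points in intersection of N and H.

$c = (c_1 + c_2)/2$.

If N(c) in $H^-$, then $c_1 = c$, else $c_2 = c$.

[Figure: the path N and the hyperplane H with sides $H^-$ and $H^+$; the updated points N(c1) and N(c2) and the NE are marked.]

Slide28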
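The update rule on the preceding slide is an ordinary bisection on the parameter c. The sketch below shows only that outer loop and is not the authors' implementation: `in_h_minus` is a hypothetical oracle standing in for "solve LP(c) to get N(c) and report whether it lies in $H^-$", the tolerance `tol` is an assumed stopping rule, and the toy oracle pretends the path crosses H at z = 5/7.

```python
from fractions import Fraction


def bisect_for_crossing(in_h_minus, c1, c2, tol):
    """Shrink [c1, c2] until it is no wider than tol.

    in_h_minus(c) is assumed to be True at c1 and False at c2, mirroring
    N(c1) in H- and N(c2) in H+ on the slide.
    """
    c1, c2 = Fraction(c1), Fraction(c2)
    steps = 0
    while c2 - c1 > tol:
        c = (c1 + c2) / 2
        if in_h_minus(c):
            c1 = c          # crossing lies to the right of c
        else:
            c2 = c          # crossing lies at or to the left of c
        steps += 1
    return c1, c2, steps


# Toy oracle: pretend the path N crosses H at z = 5/7.
crossing = Fraction(5, 7)
low, high, steps = bisect_for_crossing(lambda c: c < crossing, 0, 1, Fraction(1, 1000))
print(low, high, steps)
```

Because z is monotonic along N, each oracle answer halves the interval that must contain the crossing; in the real algorithm each oracle call costs one LP solve.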

Analysis

Terminates because:

z is monotonic on N.

Increase in z on each edge is lower bounded by 1/d, where d is polynomial sized in the input.

Time complexity:

Solve LP(c) to get N(c) in each pivot.

log(d) * log($a_{\max} - a_{\min}$) pivots.

Slide29

Conclusions

Bilinear games: Bimatrix with polytopal strategy sets.

Fairly general. Contains polymatrix, Bayesian, etc.

Polynomial time algorithm for rank based subclasses.

Open problems:

Designing a Lemke-Howson type algorithm.

Degree, index, stability concepts.

Computation of approximate equilibrium.

Slide30

Thank You