/
On-Line Decision Problems with Exponentially Many Experts On-Line Decision Problems with Exponentially Many Experts

On-Line Decision Problems with Exponentially Many Experts - PowerPoint Presentation

giovanna-bartolotta
giovanna-bartolotta . @giovanna-bartolotta
Follow
404 views
Uploaded On 2016-03-02

On-Line Decision Problems with Exponentially Many Experts - PPT Presentation

Richard M Karp Hoffman Fest September 1920 2014 OnLine Decision Problem A game between an adversary and a chooser At each of T steps the adversary secretly designates a subset of a fixed set A of n possible actions The designated actions are called winning and the others are called losing ID: 238448

winning action set actions action winning actions set win sets number step algorithm paths case ideal frequent problem chooser

Share:

Link:

Embed:

Download Presentation from below link

Download Presentation The PPT/PDF document "On-Line Decision Problems with Exponenti..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.


Presentation Transcript

Slide1

On-Line Decision Problems with Exponentially Many Experts

Richard M. Karp

Hoffman Fest

September 19-20, 2014Slide2

On-Line Decision Problem

A game between an adversary and a chooser. At each of T steps the adversary secretly designates a subset of a fixed set A of n possible actions. The designated actions are called winning and the others are called losing.

T

he chooser, knowing the winning actions at all preceding steps, chooses an action.

The

set of winning actions at the current step is then revealed to the chooser.

The chooser’s goal: Minimize regret, the number of times he chooses a losing action, minus min

e

(number of losses of action e).

Any deterministic policy for the

chooser

can be made to lose at every step.Slide3

Randomized On-line Algorithms with Proven Regret Bounds

{W(

t,e

), L(

t,e

))}: Number of {wins, losses} for action e in first t steps.

Multiplicative Update Algorithm: Choose a small positive epsilon.

A

t each step t+1, assign a weight (1-epsilon}

(L(t-1,e))

to each action e. Draw an action from a distribution in which the probability of an action is proportional to its weight.

Kalai-Vempala

Algorithm: at each step t+1, assign each action e the score W(

t,e

) - R(t+1,e) where R(t+1,e) is drawn by the chooser from an exponential distribution with mean

1/epsilon. Choose the action

argmax

e

W(

t,e

) – R(t+1,e)Slide4

Bounds on Regret

For the multiplicative weight update algorithm and

Kalai-Vempala

algorithm the expected regret after step t is at most

epsilon min

t

L(

t,e

) +O(log(n)/epsilon

where n is the number of possible actions.Slide5

Examples

The actions may be: paths through a network,

multicommodity

flows,

packings

of a knapsack, truth assignments, stock levels in an inventory system, spanning trees, minimum spanning trees, perfect bipartite

matchings

, transversals in a bipartite graph, bases in a

matroid

etc.

How to implement the

Kalai-Vempala

algorithm when the number of possible actions is exponentially large?Slide6

Coping with Exponentially Many Actions

Approach: Find a compact representation that enables counting the number of winning actions at any prior step, sampling uniformly at random from the set of winning actions at that step , and testing whether a given action is winning at that step.

Implement the

Kalai-Vempala

algorithm using these primitives.Slide7

Ideal Case

WIN(t) Set of winning actions at step t.

Ideal case: It is possible to count the elements of WIN(t), sample uniformly from WIN(t) and test membership in WIN(t).

Sets for which the ideal case holds: source-sink paths respecting a deadline, feasible solutions to a provisioning problem, feasible solutions to a knapsack problem, spanning trees, minimum spanning trees, satisfying assignments to a

boolean

formula described by a BDD.Slide8

Nearly Ideal Case

Approximate counting can be done. The nearly ideal case applies to perfect

matchings

in a bipartite graph, integer

multicommodity

flows, satisfying assignments to a DNF Boolean formula and bases of a transversal

matroid

.Slide9

Example: The On-Line Path Selection Problem

The setting: A graph G, a source a and sink z, where

a

is different from z, and a positive integer deadline d.

The actions are the paths from a to z in G.

At each step t the adversary assigns positive integer costs to the edges. A path is winning if the sum of the costs of its edges is less than or equal to d.Slide10

A Compact Representation

A pair (

c,u

), where c is a nonnegative integer

and u is a vertex, is

good

at step t if there is a path of cost c from

a

to u and a path of cost

<=d-c from u to z.

In the graph of winning paths H the vertex set is the set of good pairs plus the pair (

d,z

) and there is an edge from (

c,u

) to (

c’,v

) if c’ –c is the cost of edge (

u,v

)

and from (

c,u

) to (

d,z

) if the cost of the edge (

u,t

) is

less than or equal to d-c.

The winning paths are the paths in H from (0,a) to (

d,z

).Slide11

Constructing the Graph H

By breadth-first search forward from (0,

a

) and backward from (

d,

z

) find the set of winning

pairs.

Counting: Examining the pairs (

c,u

) in H in decreasing order of c, compute L(

c,u

), the number of paths in H from (

c,u

) to (

d,z

). L(0,a) is the number of winning paths.

Sampling: Impose an arbitrary linear ordering on the edges out of each pair in H. This induces a linear ordering on the winning paths. For any N, one can quickly find the Nth path in this ordering.Slide12

A P

rovisioning

P

roblem

Each day,

the adversary secretly specifies a vector d = (d(1),d(2),…,d(m)) of demands for m perishable goods and the chooser chooses supply levels e(1),e(2),…,e(m)

summing to C. The action (e(1), e(2),…,e(m)) is winning if e(

i

) >= d(

i

) for all

i

.

Easy exercise: count the winning

choic

es

and sample uniformly at random from the set of winning

choic

es

.Slide13

Naïve Algorithm for the Ideal Case

At step t+1 explicitly compute W(

t,e

) and W(

t,e

) - R(t+1,e) for all e, and choose

argmax

e

W(

t,e

) – R(t+1,e).

Here W(

t,e

) =

W(t-1,e) + 1 if e is a winning action at time t, and W(t-1,e) otherwise.

This is expensive of time and especially of storage.Slide14

Thinning

Choose a small positive parameter p

At each step, compute a thinned winning set

in which each winning action occurs independently with probability p

Estimate each W(

t,e

) by

V(

t,e

)/p

where V(

t,e

) is the number of thinned winning sets containing e at the first t steps. Execute the naïve algorithm using these estimates.Slide15

Intersection Trick

Winning sets: WIN(1), WIN(2),…,WIN(T)

In many cases, the intersection of k winning sets has a compact representation supporting counting, uniform sampling and membership testing.

One can estimate the fraction of winning sets containing action e up to time t by picking random k-tuples of prior winning sets and estimating the frequency with which the k-fold intersection contains action e.Slide16

A Thought Experiment

A thought experiment: given an urn filled with balls of many different colors, find all balls that occur with frequency at least 1/r, where r is

positive integer.

Solution:

Repeat as long as possible: discard a set of r balls of distinct colors.

At the end of this process, fewer than r colors remain, and every color of frequency >1/r in the original urn is

among them.Slide17

Finding Frequent Actions with Limited Storage

WIN(

i

) = set of winning actions at time

i

.

Action e is

(

t,r

)-frequent

if

W(

t,e

) > 1/r (|WIN(1)| + … + |WIN(t)|).

Problem: for t=1,2,…,T find the (

t,r

)-frequent actions. These are candidates for

a

rgmax

e

(W(

t,e

) –R(t+1,e)). Actions that are not

(

t,r

)-frequent can often be eliminated.Slide18

A Streaming Algorithm

Process the successive sets WIN(1), WIN(2), …, WIN(T) action by action.

Create sets S(1),S(2), each of cardinality at most r and containing occurrences of distinct actions.

Place each action in turn in the first set that can accept it.Slide19

Streaming Algorithm (continued)

A set S(j) is

incomplete

if it contains fewer than r distinct actions. For each such action, the algorithm keeps a count of the number of incomplete sets containing the action.

Lemma

: At any point, the number of actions occurring in incomplete sets is less than r.

Lemma

: If action e is (

t,r

)-frequent then, after the processing of WIN(1), WIN(2),…,WIN(t),

action e occurs in at least one incomplete set.Slide20

Estimating the Frequencies of Actions

Goal: for each action e, estimate the number of complete sets containing e.

Associate with each complete set S(j) a consensus set C(j) as follows:

C(1) = S(1)

for j=2,3,…

if e is in both S(j) and C(j-1) then e is in C(j)

pair the remaining elements of S(j)

one-to-one with the remaining elements

of C(j-1). Place one action from each pair into C(j), choosing the element from S(j) with probability (j-1)/j, and the element from C(j-1) with probability 1/j.Slide21

Estimating the Frequencies of Actions (continued)

Theorem: The probability that action e lies in

C(j) is exactly number of occurrences of e in the sets S(1), S(2),…,S(j), divided by j.

The construction of the consensus sets C(t) is randomized. By maintaining several consensus sets for each set S(t) one can estimate the number of occurrences of e in the sets

S

(1),S(2),…,S(t). Combining this with the construction of the (

t,r

)-frequent actions one can reliably estimate

argmax

e

W

(

t,e

) –R(t+1,e) for all t.Slide22

Acknowledgement

This is ongoing joint work with

Evdokia

Nikolova

of UT Austin.

The further work will focus on algorithms for the non-ideal case, in which counting and sampling can only be done approximately, and the non-

binary

case

, in which an action receives a score ate each step, rather than merely winning or losing.