Implementing the “Wisdom of the Crowd” - PowerPoint Presentation

Uploaded by funname on 2020-06-23



Presentation Transcript

Slide1

Implementing the “Wisdom of the Crowd”
The Internet Economy

With (i) Ilan Kremer and Yishay Mansour; (ii) Jacob Glazer and Ilan Kremer

Slide2

MOTIVATION

We study Internet (but not only) applications like crowdfunding, Tripadvisor, Netflix, Waze, Amazon, OKCupid, and many more, that attempt to implement the wisdom of the crowd. These sites (often called expert sites) collect information from customers while making recommendations to them.

To study these applications, we take a mechanism design approach to two classical economic problems:
(i) The multi-armed bandit problem (first paper). Related literature: “Optimal Design for Social Learning”, Che and Horner.
(ii) Information cascades (second paper). Related literature: “Optimal Voting Schemes with Costly Information Acquisition”, Gershkov and Szentes.

Slide3

Model

Agents arrive sequentially. Each has a prior on the possible rewards from a set of actions/arms, makes one choice, and gets a reward.

Only the planner observes (part of) the history. The planner is interested in maximizing social welfare, and chooses what information to reveal.

Agents are strategic and know the planner’s strategy.

Model I: the planner observes the whole history, both choices and rewards. When the IC constraints are ignored, this is the well-known multi-armed bandit problem.

Model II: the planner observes only the choices made by agents, but not their rewards. When the history is fully revealed, this is the model of information cascades (with costly signals).

Slide4

Research Question

By controlling the revelation of information, can the planner induce exploration and prevent an early information cascade?

What is the optimal policy of the planner?

What is the expected loss compared to the first-best outcome?

Slide5

Waze: Social Media, User-Based Navigation

Real-time navigation recommendations based on user inputs (cellular and GPS).

Recommendation dilemma: the site needs to try alternate routes to estimate travel times, but it works well only if it attracts a large number of users.

Motivation: the site’s manager is interested in maximizing social welfare.

Slide6

Motivation

Websites such as TripAdvisor.com and yelp.com (and many others) try to implement the ‘wisdom of the crowds’. They collect information from customers while making recommendations to them by providing a ranking.

How is the ranking done? How should it be done?

The site’s manager is interested in maximizing social welfare, but the site works well only if it attracts a large number of users.

Slide7

Motivation

Also crowdfunding websites (InvestingZone or CrowdCube), matching sites like OKCupid, and many others are all relevant examples. In all of these cases, the same conflict arises between the site and the agents.

From “Your Amazon.com”: “We compare your activity on our site with that of other customers, and using this comparison, are able to recommend other items that may interest you... Your recommendations change regularly based on a number of factors, including ....., as well as changes in the interests of other customers like you.”

Slide8

Motivation

In an interview with the NYT (Sep. 6, 2014), Mr. Rudder, CEO and cofounder of OkCupid, said:

“We told users something that wasn’t true.... People come to us because they want the website to work, and we want the website to work.”

“Guess what, everybody,” he added, “if you use the Internet, you’re the subject of hundreds of experiments at any given time, on every site.”

We are interested in how much “manipulation” (experimentation) can be exercised when agents are strategic.

Slide9

Multi-Armed Model (simplest possible example)

Two actions, a1 and a2, and N risk-neutral agents.

Each action has a fixed unknown reward, R1 and R2 (random variables). There is a prior over the rewards, with E[R1] > E[R2] = μ2.

The planner observes choices and rewards, and provides agent n with a message mn: some information about the past.

Slide10

Example

Action 1 has prior Uniform on [-1, +2]. Action 2 has prior Uniform on [-2, +2].

No information: all agents prefer action 1, the a-priori better action, so there is no exploration.

Full information: assume the first agent observes a value of zero or above. Then no other agent has an incentive to explore action 2.

Can we do better?
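The full-information failure in this example is easy to check with a quick Monte Carlo sketch (a hypothetical illustration, not from the slides): under full revelation, agent 2 explores action 2 only when agent 1's observed reward r1 falls below E[R2] = 0, which happens with probability 1/3 for R1 ~ U[-1, +2].

```python
import random

random.seed(0)

def full_revelation_explores(n_trials=200_000):
    """Fraction of histories in which agent 2 explores action 2
    under full revelation: agent 2 explores iff r1 < E[R2] = 0."""
    explore = 0
    for _ in range(n_trials):
        r1 = random.uniform(-1.0, 2.0)  # action 1 prior: U[-1, +2]
        if r1 < 0.0:                    # E[R2] = 0 for U[-2, +2]
            explore += 1
    return explore / n_trials

print(round(full_revelation_explores(), 2))  # ≈ 0.33: exploration is rare
```

So with probability 2/3 the crowd never learns anything about action 2, no matter how large N is.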

Slide11

Impossibility Example

Action 1 has prior Unif[3/4, 5/4]; action 2 has prior Unif[-2, +2], so E[R2] = 0 < R1 with certainty.

Agent n knows that all prior agents preferred action 1; hence he too prefers action 1. The planner has no influence.

Required assumption: Pr[R1 < μ2] > 0.

Slide12

Basic properties of the optimal mechanism

A mechanism is a sequence of functions {Mt}t∈N, where Mt : Ht-1 → M. It is sufficient to consider recommendation policies {Πt}t∈N, where Πt : Ht-1 → {1,2}, that are IC (Myerson 1986).

Two natural IC constraints:
E[R2 − R1 | recommend(2)] ≥ 0
E[R1 − R2 | recommend(1)] ≥ 0

It is sufficient to consider only action 2: a mechanism that is IC for action 2 is automatically IC for action 1.

Slide13

The optimal policy is a partition policy

Recommend action 1 to the first agent; this is the only IC recommendation. If both actions have been sampled, recommend the better one. Otherwise, the policy is a mapping from values of r1 to the agent that explores.

Conclusion: consider only partition policies.

[Figure: the r1 line partitioned into intervals assigned to the exploring agent (agent 3, agent 4, agent 5, ...), with a final ‘no exploration’ region.]

Slide14

The optimal policy is a threshold policy

Agent 1: recommend action 1; the planner observes the reward r1.
Agent 2: explores for all values of r1 below E[R2] (and somewhat above): thresholds.
Agent t > 2: if both actions have been sampled, recommend the better action; otherwise, if r1 < θt, recommend action 2, and otherwise action 1.

Intuition: there is an inherent tradeoff between the two potential reasons for being recommended action 2. The IC constraints are tight.

[Figure: the r1 line partitioned by thresholds into exploration regions for agent 2, agent 3, agent 4, agent 5, ..., with no exploration above the top threshold; μ2 is marked.]

Slide15

OPTIMALITY

Recall the basic IC constraint: the exploitation term is positive, the exploration term is negative.

What is NOT a threshold policy: a lower interval B2 of r1 values assigned to a later agent t2 (e.g., agent 10), while a higher interval B1 is assigned to an earlier agent t1 (e.g., agent 5).

Proper swap: exchange sub-intervals b2 ⊂ B2 and b1 ⊂ B1 between agents t1 and t2. Since B2 < B1, Pr[b2] > Pr[b1].

Slide16

Information Cascade Model (the planner observes only choices, not outcomes)

Agents: risk-neutral, arriving sequentially. The arrival order is known; agents do not observe the history. Each agent is asked to choose an action and then gets a reward. Before making a choice, an agent can obtain an informative signal at a cost c > 0.

There are two actions, A and B. One action is “good” and yields a payoff of one, while the other is “bad” and yields a payoff of zero.

Slide17

There exists a planner who observes (only) the chosen actions (A or B) taken by all agents. For every t, the planner decides what message to send the agent.

The planner’s objective is to maximize the discounted present value of all agents’ payoffs.

Let pt : Ht-1 → [0,1] denote the planner’s posterior after t-1 observations.
Let μt : {M} → [0,1] denote agent t’s posterior.

Slide18

Information structure and the belief’s law of motion

If A is the good action, the signal takes the value sa with probability 1. If B is the good action, the signal takes the value sb with probability q, and sa with probability 1-q. Note that sb is fully revealing.

[Figure: the path of the belief Prob(A) on [0,1] when a signal is obtained by agent t: starting from p0, each sa observation pushes the belief up toward 1, while a single sb observation sends it to 0.]
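This law of motion can be written out directly from the signal structure (a minimal sketch; the symbols p and q follow the slide's notation):

```python
def update_belief(p, signal, q):
    """Bayes update of p = Prob(A is good).

    If A is good, the signal is 'sa' with probability 1.
    If B is good, the signal is 'sb' w.p. q and 'sa' w.p. 1 - q,
    so observing 'sb' is fully revealing (p jumps to 0) and each
    'sa' pushes the belief up toward 1.
    """
    if signal == "sb":
        return 0.0
    # Pr[sa | A good] = 1, Pr[sa | B good] = 1 - q
    return p / (p + (1.0 - p) * (1.0 - q))

p = 0.5
for _ in range(3):
    p = update_belief(p, "sa", q=0.5)
print(round(p, 3))   # 0.889: belief climbs toward 1 after repeated sa signals
assert update_belief(0.9, "sb", q=0.5) == 0.0
```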

Slide19

Preliminaries

Agent t’s utility-maximizing decision is given by thresholds on the belief: choose b if μt ≤ μb, obtain a signal (e) if μt ∈ (μb, μa), and choose a if μt ≥ μa.

The planner’s first-best maximizing decision is given by analogous thresholds pb and pa on pt.

Commitment to full revelation → too little exploration. Commitment to no revelation → too much exploration.
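The “too little exploration” point can be illustrated with a small simulation (hedged: the values of μa, μb, p0, q, and the number of agents below are made up for illustration, not taken from the slides). Under committed full revelation, agents acquire signals only while the public belief stays inside (μb, μa); a run of sa signals soon pushes the belief past μa and learning stops, even when B is in fact the good action:

```python
import random

def simulate_full_revelation(p0, q, mu_a, mu_b, b_is_good, n_agents, rng):
    """Public-belief dynamics when the planner fully reveals history.

    Each agent acquires a (costly) signal only while the current
    belief p = Prob(A good) lies strictly inside (mu_b, mu_a);
    outside that band the agent herds and learning stops.
    Returns (final belief, number of agents who explored).
    """
    p, explored = p0, 0
    for _ in range(n_agents):
        if not (mu_b < p < mu_a):
            break                            # cascade: no more signals acquired
        explored += 1
        if b_is_good and rng.random() < q:
            p = 0.0                          # fully revealing signal sb
        else:
            p = p / (p + (1 - p) * (1 - q))  # signal sa: belief rises
    return p, explored

rng = random.Random(1)
p, n = simulate_full_revelation(p0=0.5, q=0.2, mu_a=0.9, mu_b=0.1,
                                b_is_good=True, n_agents=100, rng=rng)
print(p, n)  # often a cascade on A (p >= 0.9) even though B is good
```

When `b_is_good=False`, sb never arrives and the belief climbs monotonically until the cascade on A starts, after which no further agent pays for a signal.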

Slide20

Basic properties of the optimal mechanism

The optimal mechanism is: (i) a recommendation mechanism, where Mt : Ht-1 → {a, b, e}; (ii) a public mechanism; and (iii) it has three phases.

Phase one: as long as there is no conflict between the planner and agent t (i.e., pt ∈ [μb, μa]), full revelation is exercised.

[Figure: the belief line [0,1] with p0 below μa; at t = 1, 2, 3, ... agents take action e and the belief steps up toward μa, or it jumps to 0.]

Slide21

Phase two: if the first agents obtained the signal sa, then for all subsequent t ≤ t*, mt = e and μt = μa. This is achieved by committing to recommend e even after the planner has learned that the good action is B.

Phase three: for all t ≥ t*, the planner recommends B if pt = 0, and otherwise A. Note that pt* is either zero or less than pa.

Main idea of the proof: the second best is like the first best with an increasing cost, namely the extra cost of keeping μt = μa.

Slide22

Thank You !


Slide24

Your Amazon.com: “We compare your activity on our site with that of other customers, and using this comparison, are able to recommend other items that may interest you... Your recommendations change regularly based on a number of factors, including ....., as well as changes in the interests of other customers like you.”

Slide25

Example

Assume R₁ ~ U[-1, 5], R₂ ~ U[-5, 5], and N large (so it is optimal to test both alternatives).

Full transparency: agent 2 chooses the second alternative only if R₁ ≤ 0. Otherwise, all agents choose the first alternative. The outcome is suboptimal for large N.

[Figure: the two prior supports, R₁ on [-1, 5] and R₂ on [-5, 5], with 0 marked.]

Slide26

The planner recommends the 2nd alternative to agent 2 whenever R₁ ≤ 1. This is IC because E[R1 | recommend(2)] = 0.

This outcome is more efficient than the one under full transparency, but we can do even better.

[Figure: the R₁ line from -1 to 5 with the threshold 1 marked.]
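The IC claim for agent 2 can be verified numerically (a Monte Carlo sketch in place of the closed-form argument): with R₁ ~ U[-1, 5], conditioning on the recommendation event R₁ ≤ 1 gives E[R₁ | R₁ ≤ 1] = 0, exactly the prior mean E[R₂] of the second alternative, so agent 2 is (weakly) willing to follow the recommendation.

```python
import random

random.seed(42)

def conditional_mean_r1(threshold, n_trials=500_000):
    """Monte Carlo estimate of E[R1 | R1 <= threshold] for R1 ~ U[-1, 5]."""
    draws = [random.uniform(-1.0, 5.0) for _ in range(n_trials)]
    kept = [r for r in draws if r <= threshold]
    return sum(kept) / len(kept)

# Recommending alternative 2 whenever r1 <= 1 is IC for agent 2:
# E[R1 | R1 <= 1] = 0, which equals the prior mean E[R2] = 0.
print(round(conditional_mean_r1(1.0), 2))  # ≈ 0.0
```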

Slide27

The planner recommends the 2nd action to the third agent if one of two cases occurs: (i) the second agent tested the 2nd action (R₁ ≤ 1) and the planner learned that R₂ > R₁; or (ii) 1 < R₁ ≤ 1 + x, so the third agent is the first to test the 2nd action.

An agent n > 4 never explores, regardless of N. So at most 3 agents choose the wrong action.

[Figure: the R₁ line from -1 to 5 with intervals I2 up to 1, I3 = (1, 1+x] where 1 + x = 3.23, and I4 above; 0 = E[R2].]
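How such a threshold could be pinned down: under one simplified reading of agent 3's IC constraint (the expected gain from the 'exploitation' event, r1 ≤ 1 and r2 > r1, exactly offsets the expected loss from the 'exploration' event, 1 < r1 ≤ 1 + x), x solves a one-dimensional equation that bisection handles easily. This is an illustrative sketch, not the paper's derivation: the binding constraint in the actual mechanism may involve later agents as well, so the number it returns need not match the slide's 1 + x = 3.23.

```python
def ic_slack(x, n_grid=4000):
    """Unnormalized IC slack E[(R2 - R1) * 1{recommend 2 to agent 3}]
    for R1 ~ U[-1, 5], R2 ~ U[-5, 5], under the simplified policy:
    recommend 2 iff (r1 <= 1 and r2 > r1) or (1 < r1 <= 1 + x)."""
    slack = 0.0
    dr = 6.0 / n_grid                     # R1 support [-1, 5] has length 6
    for i in range(n_grid):
        r1 = -1.0 + (i + 0.5) * dr
        w1 = dr / 6.0                     # probability mass of this R1 cell
        if r1 <= 1.0:
            # exploitation: E[(R2 - r1) * 1{R2 > r1}] for R2 ~ U[-5, 5]
            slack += w1 * (5.0 - r1) ** 2 / 20.0
        elif r1 <= 1.0 + x:
            # exploration: agent 3 expects E[R2] - r1 = -r1
            slack += w1 * (0.0 - r1)
    return slack

def solve_threshold(lo=0.0, hi=4.0, tol=1e-6):
    """Bisection on x: the slack is decreasing in x, so find its zero."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ic_slack(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

x = solve_threshold()
print(round(ic_slack(x), 4))  # ≈ 0: the IC constraint binds at the threshold
```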

Slide28

IC Analysis

Agent t1: unchanged. We added b2 and subtracted b1, and a proper swap implies an equal effect.
Agents other than t1 and t2: before t1 and after t2, unchanged. Between t1 and t2: a gain of (Pr[b2] − Pr[b1]) max{r1, r2}, so IC holds.

Slide29

Multi-Armed Bandit

A simple, one-player decision model: multiple independent (costly) actions, uncertainty regarding the rewards, and a tradeoff between exploration and exploitation (Gittins index).
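For contrast with the IC-constrained mechanism, the unconstrained exploration/exploitation tradeoff can be sketched with a textbook epsilon-greedy bandit (a generic illustration with made-up arm means; the slides' benchmark is the Gittins index, which this does not compute):

```python
import random

def eps_greedy_bandit(true_means, n_rounds=5000, eps=0.1, seed=0):
    """Epsilon-greedy on a Bernoulli bandit: explore a uniformly random
    arm with probability eps, otherwise exploit the best empirical mean."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    sums = [0.0] * len(true_means)
    total = 0.0
    for _ in range(n_rounds):
        if rng.random() < eps or 0 in counts:
            arm = rng.randrange(len(true_means))         # explore
        else:
            arm = max(range(len(true_means)),
                      key=lambda i: sums[i] / counts[i])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total / n_rounds

avg = eps_greedy_bandit([0.3, 0.6])
print(round(avg, 2))  # close to the best arm's mean of 0.6
```

Here a single decision maker happily pays the exploration cost; the mechanism-design problem above arises precisely because each short-lived agent bears that cost individually.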

Slide30

Reflecting on Reality: Report-Card Systems

Health care, education, ... Public disclosure of information: patients’ health, students’ scores, ...
Pros: incentives to improve quality; information to users.
Cons: incentives to “game” the system and avoid problematic cases.

We suggest a different point of view.

Slide31

Motivation: The New Internet Economy

Websites such as TripAdvisor.com and yelp.com (and many others) try to implement the ‘wisdom of the crowds’. They collect information from customers while making recommendations to them by providing a ranking.

How is the ranking done? How should it be done?

The site’s manager is interested in maximizing social welfare, but the site works well only if it attracts a large number of users.

Slide32

Recommendation Policy

A recommendation policy gives agent n a recommendation xn ∈ {a1, a2}. The recommendation is IC if E[Rj − Ri | xn = aj] ≥ 0. Note that IC implies recommending action a1 to agent 1.

Claim: the optimal policy is a recommendation policy.

Proof (Myerson (1986)): let M(j,n) be the set of messages that cause agent n to select action aj, and H(j,n) the corresponding histories, so E[Rj − Ri | m] ≥ 0 for m ∈ M(j,n). Consider recommending aj after H(j,n): this is still IC, with identical outcomes.

Slide33

Partition Policy

The optimal policy is a partition. A partition policy is a recommendation policy with disjoint subsets In: agent 1 is recommended action a1, and the planner observes r1. If r1 ∈ In for n ≤ N, agent n is the one to explore a2, and any agent n' > n uses the better of the two actions, with payoff max{r1, r2}. If r1 ∈ IN+1, no agent explores a2.

Recommending the better action when both are known optimizes the sum of payoffs and strengthens the IC constraints.

Slide34

Only the worse action (a2) is “important”

Lemma: any policy that is IC w.r.t. a2 is IC w.r.t. a1.

Proof: let Kn denote the set of histories that cause xn = a2. Since the policy is IC, E[R2 − R1 | h ∈ Kn] ≥ 0. Originally, E[R2 − R1] < 0; therefore E[R2 − R1 | not in Kn] < 0.

Slide35

Optimality → tight IC constraints

Lemma: if agent n+1 explores (Pr[In+1] > 0), then agent n has a tight IC constraint.

Proof: move exploration from agent n+1 to agent n (r1 ∈ V ⊂ In+1). This improves the sum of payoffs: for r1 ∈ V, it replaces r1 + R2 by R2 + max{r1, r2}. It keeps the IC for agent n (since it was not tight) and for agent n+1 (whose exploration is removed).

Slide36

Information Cascades

Bikhchandani, Hirshleifer, and Welch (1992); Banerjee (1992). Agents ignore (or do not acquire) their own signals.

The same exercise is conducted, but now the planner observes only actions, and private signals are costly (Netflix).

Slide37

The Story of Coventry and Turing

In November 1940, Prime Minister Winston Churchill knew several days in advance that the Germans would attack Coventry, but deliberately held back the information. His intelligence came from the scientists at Bletchley Park, who, in utmost secrecy, had cracked the Enigma code the Germans used for their military communications. Warning the city of Coventry and its residents of the imminent threat would have alerted the Germans to the fact that their codes had been cracked.

Churchill considered it worth the sacrifice of a whole city and its people to protect his back-door route into Berlin’s secrets. (The Imitation Game)

Slide38

How good is the optimal policy?

The expected loss due to IC is bounded (independent of N), obtained by bounding the number of exploring agents.

Slide39

Proof

Consider the ‘exploitation’ term for agent n > 2. It is an increasing sequence, since for higher n the planner becomes better informed. Hence, it is bounded from below by the ‘exploitation’ term of agent 3, which in turn is bounded below by α. The sum of the ‘exploration’ terms is bounded.

Slide40

Extensions

Slide41

Introducing money transfers

Basically the same policy: the planner invests all the money in agent 2, getting more exploration as early as possible; otherwise, the same construction. When money costs money, the planner will subsidize some exploration by agent 2; other agents are as before.

Slide42

Relaxing agents’ knowledge

So far, agents knew their exact place. Relaxation: agents are divided into blocks (early users, medium, late users). Essentially the same property holds: in each block, only the first agent explores. Blocks can only increase social welfare; the bigger the blocks, the closer to first best.

Slide43

Optimal Policy: Performance

If action 1 is better: only one agent explores action 2.
If action 2 is better: only a finite number of agents explore action 1; this number is bounded, and the bound is independent of N.
Conclusion: the aggregate loss compared to first best is bounded.

Slide44

Now to some proofs …

Slide45

Basic IC constraints

Consider a recommendation policy with sets In. Agent n’s IC constraint for action 2 has a positive (exploitation) term and a negative (exploration) term.

[Figure: the r1 line with the intervals In-1, In, In+1 marked around E[R2].]

Slide46

Threshold policy

A threshold policy is a partition policy such that In = (in-1, in], with I2 = (-∞, i2) and IN+1 = (iN, ∞).

Main characterization: the optimal policy is a threshold policy.

[Figure: the r1 line partitioned into intervals for agent 2, agent 3, agent 4, agent 5, ..., with a final ‘no exploration’ region.]

Slide47

Motivation: The New Internet Economy

Also websites such as Netflix, Amazon, OKCupid, Tripadvisor, and many others. Regardless of what the planner/site observes, in both cases the same conflict arises between the site and the agents.

Slide48

Motivation: The New Internet Economy

Crowdfunding sites collect information from investors by monitoring their choices, and use this information in making recommendations to future investors.

Also websites such as Netflix, Amazon, OKCupid, Tripadvisor, and many others. Regardless of what the planner/site observes, in both cases the same conflict arises between the site and the agents.