Crowdsourcing High Quality Labels with a Tight Budget


Presentation Transcript

Slide1

Crowdsourcing High Quality Labels with a Tight Budget 

Qi Li¹, Fenglong Ma¹, Jing Gao¹, Lu Su¹, Christopher J. Quinn² (¹SUNY Buffalo; ²Purdue University)

Slide2

What is Crowdsourcing?

Terminology: Requester, Worker, HITs (Human Intelligence Tasks), Instance

Basic procedure:
- The requester posts HITs.
- Workers choose HITs to work on.
- The requester gets labels and pays.

[Figure: a requester asks workers "Are the two images of the same person?"; the workers answer "Same", "Same", "Same", "different", and so on.]

Slide3

Budget Allocation

Since crowdsourcing costs money, we need to use the budget wisely.

Slide4

Budget Allocation

Since crowdsourcing costs money, we need to use the budget wisely.

Budget allocation questions:
- Which instances should we query for labels, and how many labels each?
- Which workers should we choose? (Impossible on most current crowdsourcing platforms.)

Slide5

Challenges Under a Tight Budget

- Quantity and quality trade-off
- Different requirements of quality

[Figure: three questions Q1, Q2, Q3 and two example requesters. One requester says "I want my results to not be randomly guessed."; the other says "I will approve a result if more than 75% of the workers agree on that label." Existing work would behave the same under either requirement.]

Slide6

Inputs and Goal

Inputs:
- The requester's quality requirement
- The budget T: the maximum number of labels that can be afforded

Goal:
- Label as many instances as possible that achieve the requirement, under the budget.

Slide7

Problem Settings

- Independent binary instances; the true label of instance $i$ is $z_i \in \{+1, -1\}$.
- Instance difficulty $\theta_i$: the relative frequency of $+1$ labels as the number of workers approaches infinity; $\theta_i$ close to $0.5$ means the instance is hard.
- Workers are noiseless (for the basic model): $P(l_i^j = +1) = \theta_i$, where $l_i^j$ is worker $j$'s label for instance $i$.
- Labels for instance $i$ are i.i.d. from Bernoulli($\theta_i$).

 
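For intuition, here is a minimal simulation of this label model; the function name and defaults are my own choices, not from the paper.

```python
import random

def simulate_labels(theta, num_labels, seed=None):
    """Draw i.i.d. worker labels in {+1, -1} for one instance.

    theta: probability that a (noiseless) worker answers +1,
    i.e., the instance's difficulty parameter."""
    rng = random.Random(seed)
    return [+1 if rng.random() < theta else -1 for _ in range(num_labels)]

# A hard instance (theta near 0.5) yields mixed votes; an easy one mostly agrees.
print(simulate_labels(0.55, 10, seed=0))
print(simulate_labels(0.95, 10, seed=0))
```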

Slide8

Notations

- $z_i$: the true label of the $i$-th instance
- $\theta_i$: the difficulty level of the $i$-th instance
- $T$: the maximum number of labels given the budget
- $x_i$: the vote count of $+1$ labels for the $i$-th instance
- $y_i$: the vote count of $-1$ labels for the $i$-th instance

Slide9

Examples of Requirement

Minimum ratio:
- Approve the result on an instance if $x_i \ge r \cdot y_i$ or $y_i \ge r \cdot x_i$, i.e., the majority votes outnumber the minority by at least a factor of $r$.
- Equivalent to setting a threshold on the entropy of the observed label distribution.

Hypothesis test:
- Use the Fisher exact test to test whether the labels are randomly guessed.
- Calculate the p-value, and approve the result if it falls below the chosen significance threshold.

 
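A minimal sketch of how these two requirement checks could be implemented; the function names, default thresholds, and the use of an exact binomial test in place of the Fisher exact test mentioned above are my own assumptions.

```python
from math import comb

def meets_min_ratio(pos, neg, r=4):
    """Minimum-ratio requirement: majority votes outnumber the minority by a factor of r."""
    return pos >= r * neg or neg >= r * pos

def random_guess_p_value(pos, neg):
    """Two-sided exact test of 'the labels are random guesses' (i.e., P(+1) = 0.5)."""
    n, k = pos + neg, max(pos, neg)
    tail = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n  # one-sided tail
    return min(1.0, 2 * tail)

def meets_hypothesis_test(pos, neg, alpha=0.05):
    return random_guess_p_value(pos, neg) < alpha

print(meets_min_ratio(8, 2, r=4))    # True: 8 >= 4 * 2
print(meets_hypothesis_test(8, 2))   # p ~= 0.109, so False at alpha = 0.05
```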

Slide10

Completeness

The ratio between the observed total vote count of an instance and the minimum count of labels it needs to achieve the requirement.

Slide11

Completeness

The ratio between the observed total vote count of an instance and the minimum count of labels it needs to achieve the requirement.

Denoted as:

$\text{completeness}_i = \dfrac{\text{observed total vote count}}{\text{minimum count to achieve the requirement}}$

Slide12

Completeness

The ratio between the observed total vote count of an instance and the minimum count of labels it needs to achieve the requirement.

Example: the requirement is a minimum ratio of 4; the completeness of an instance depends on its current vote counts (a worked illustration follows below).

 
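A worked illustration under a minimum ratio of 4; the vote counts are my own, and the "minimum count" is read here as the smallest total number of labels that could satisfy the ratio given the current minority votes.

$(x_i, y_i) = (2, 1)$: minimum count $= 4 \times 1 + 1 = 5$, so completeness $= 3 / 5 = 60\%$.

$(x_i, y_i) = (4, 1)$: minimum count $= 5$, the requirement is already met, and completeness $= 5 / 5 = 100\%$.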

Slide13

Maximize Completeness

The goal is to label as many instances as possible that achieve the quality requirement.

Slide14

Maximize Completeness

The goal is to label as many instances as possible that achieve the quality requirement. Maximize the overall completeness. Formally:

$\max_{\pi} \sum_{i} \mathbb{E}_{\pi}[\text{completeness}_i] \quad \text{subject to the total number of labels not exceeding } T$

Slide15

Maximize Completeness

The goal is to label as many instances as possible that achieve the quality requirement. Formally:

$\max_{\pi} \sum_{i} \mathbb{E}_{\pi}[\text{completeness}_i] \quad \text{subject to the total number of labels not exceeding } T$

- $\pi$: policy (i.e., all the possible combinations of choosing instances for labelling).
- $\mathbb{E}_{\pi}[\text{completeness}_i]$: the expected completeness of the $i$-th instance.
- Constraint: the total number of requested labels cannot exceed the budget $T$.

 

Slide16

Expected Completeness

$\mathbb{E}[\text{completeness}_i] = P(z_i = +1)\,\text{completeness}_i(+1) + P(z_i = -1)\,\text{completeness}_i(-1)$

where $\text{completeness}_i(z)$ is the completeness given that the true label is $z$.

Slide17

Expected Completeness

$\mathbb{E}[\text{completeness}_i] = P(z_i = +1)\,\text{completeness}_i(+1) + P(z_i = -1)\,\text{completeness}_i(-1)$

where $\text{completeness}_i(z)$ is the completeness given that the true label is $z$.
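A quick numeric illustration of this expectation (the numbers are made up): if $P(z_i = +1) = 0.7$, $\text{completeness}_i(+1) = 0.8$, and $\text{completeness}_i(-1) = 0.4$, then

$\mathbb{E}[\text{completeness}_i] = 0.7 \times 0.8 + 0.3 \times 0.4 = 0.68.$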

Slide18

Markov Decision Process

Solve the optimization using a Markov decision process.
- Stage-wise reward: the expected gain in overall completeness from requesting one more label for an instance (a sketch of the greedy step follows below).
- Greedy strategy: at each step, query a label for the instance with the largest expected reward.

 
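The greedy step can be sketched as below; the completeness reading, the Beta(1, 1) posterior for the next label, and all names are my own illustrative assumptions rather than the paper's exact formulas.

```python
import random

def completeness(pos, neg, r=3):
    """Observed vote count over the minimum count needed to reach a ratio of r
    (an illustrative reading of the completeness definition on earlier slides)."""
    observed = pos + neg
    needed = max(observed, (r + 1) * max(min(pos, neg), 1))
    return observed / needed

def expected_gain(pos, neg, r=3):
    """Expected one-step completeness gain from one more label; P(next label = +1)
    is estimated with a Beta(1, 1) posterior mean (assumption)."""
    p_pos = (pos + 1) / (pos + neg + 2)
    return (p_pos * completeness(pos + 1, neg, r)
            + (1 - p_pos) * completeness(pos, neg + 1, r)
            - completeness(pos, neg, r))

def greedy_allocate(votes, budget, query, r=3):
    """votes: {instance_id: (pos, neg)}; query(i) returns a worker's label (+1 or -1).
    Spends at most `budget` labels, one at a time, on the instance whose next
    label has the largest expected completeness gain."""
    for _ in range(budget):
        open_items = [i for i, (p, n) in votes.items() if completeness(p, n, r) < 1.0]
        if not open_items:
            break  # every instance already meets the requirement
        i = max(open_items, key=lambda j: expected_gain(*votes[j], r))
        p, n = votes[i]
        votes[i] = (p + 1, n) if query(i) == +1 else (p, n + 1)
    return votes

# Toy run with simulated noiseless workers (the theta values are made up).
theta = {"Q1": 0.8, "Q2": 0.7, "Q3": 0.55}
votes = {"Q1": (3, 1), "Q2": (2, 1), "Q3": (1, 1)}
rng = random.Random(0)
print(greedy_allocate(votes, budget=5, query=lambda i: +1 if rng.random() < theta[i] else -1))
```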

Slide19

Requallo Framework

[Figure: three example questions under the requirement of a minimum ratio of 3. Q1 has collected the votes +1, -1, +1, +1; Q2 has -1, +1, +1; Q3 has -1, +1.]

Slide20

Requallo Framework

[Figure: the same three questions annotated with their completeness: Q1 = 100%, Q2 = 72%, Q3 = 50%.]

Slide21

Requallo Framework

[Figure: the same example; Q1 is already complete (100%), and a stage-wise reward is computed for Q2 (completeness 72%) and Q3 (completeness 50%).]

Slide22

Requallo Framework

[Figure: the same example; the question with the larger reward (Q2) is selected to receive the next label, while the other questions are left unselected.]

Slide23

Extension: Workers’ Reliability

- Reliability degree of a worker: the label from a worker now follows two layers of Bernoulli sampling (the instance's true-label frequency and the worker's reliability).
- Adjust the vote counts according to the workers' reliability degrees (one possible form is sketched below).

 
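One possible form of the adjustment; weighting each label by the worker's estimated reliability is my own illustrative assumption, not necessarily the paper's exact formula.

```python
def adjusted_vote_counts(labels):
    """labels: list of (label, reliability) pairs with label in {+1, -1} and
    reliability in [0, 1]. Returns reliability-weighted (pos, neg) vote counts."""
    pos = sum(rel for lab, rel in labels if lab == +1)
    neg = sum(rel for lab, rel in labels if lab == -1)
    return pos, neg

# Two reliable workers voting +1 outweigh three unreliable workers voting -1.
print(adjusted_vote_counts([(+1, 0.9), (+1, 0.9), (-1, 0.5), (-1, 0.5), (-1, 0.5)]))
# -> (1.8, 1.5)
```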

Slide24

Experiments on Real-World Crowdsourcing Tasks

Datasets:
- RTE dataset: conducted on mTurk for recognizing textual entailment.
- Game dataset: conducted using an Android app based on the TV game show "Who Wants to Be a Millionaire".

Performance measures: quantity and quality.

Slide25

Experiments on Real-World Crowdsourcing Tasks

[Charts: quantity results on the RTE dataset and the Game dataset.]

Slide26

Experiments on Real-World Crowdsourcing Tasks

[Charts: absolute count results on the RTE dataset and the Game dataset.]

Slide27

Experiments on Real-World Crowdsourcing Tasks

[Charts: accuracy rate results on the RTE dataset and the Game dataset.]

Slide28

Comparison of Different Requallo Policies (on Game dataset)

Method          Cost     #Instances   #Correct   Accuracy
Requallo-p0.2    7715        1662        1587      0.9549
Requallo-p0.1   11191        1597        1558      0.9756
Requallo-p0.05  13878        1517        1493      0.9842
Requallo-c4      8689        1567        1518      0.9687
Requallo-c5     11266        1489        1464      0.9832
Requallo-m3      5127        1709        1580      0.9245

This result confirms our intuition: if a requester wants high-quality results, they can set a strict requirement, but they should expect a lower quantity of labeled instances or a higher cost.

Slide29

Conclusions

- In this paper, we study how to allocate a tight budget for crowdsourcing tasks.
- Requesters can specify their needs on label quality.
- The goal is to maximize quantity under the budget while guaranteeing quality.
- The proposed Requallo framework uses a greedy strategy to sequentially label instances.
- Extension: incorporate workers' reliabilities.

Slide30


Thank You!