Slide1
Crowdsourcing High Quality Labels with a Tight Budget
Qi Li¹, Fenglong Ma¹, Jing Gao¹, Lu Su¹, Christopher J. Quinn²
¹SUNY Buffalo; ²Purdue University
Slide2
What is Crowdsourcing?
Terminology: Requester, Worker, HITs, Instance
Basic procedure:
Requester posts HITs
Worker chooses HITs to work on
Requester gets labels and pays
[Figure: a requester asks "Are the two images of the same person?"; the workers answer "Same", "Same", "Same", ..., "Different".]
Slide3
Budget Allocation
Since crowdsourcing costs money, we need to use the budget wisely.
Budget allocation:
Which instance should we query for labels, and how many?
Which worker should we choose? Impossible on most current crowdsourcing platforms.
Slide5
Challenges Under a Tight Budget
Quantity and quality trade-off
Different requirements of quality
[Figure: two alternative allocations over the instances Q1, Q2, Q3, joined by "or".]
"I want my results not to be randomly guessed."
"I will approve a result if more than 75% of the workers agree on that label."
Existing work would behave.
Slide6
Inputs and Goal
Inputs:
Requester's requirement
The budget T: the maximum number of labels that can be afforded
Goal:
Label as many instances as possible that achieve the requirement under the budget
Slide7
Problem Settings
Independent binary instances
True label: $Z_i \in \{+1, -1\}$
Instance difficulty $\theta_i$: the relative frequency with which +1 appears as the number of workers approaches infinity
$\theta_i$ close to 0.5 means the instance is hard
Workers are noiseless (for the basic model)
$P(y_{ij} = +1) = \theta_i$, where $y_{ij}$ is worker $j$'s label for instance $i$
Labels for instance $i$ are i.i.d. from Bernoulli($\theta_i$)
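A minimal sketch of the noiseless-worker setting, assuming each label is +1 with probability equal to the instance difficulty and -1 otherwise. The function name `sample_labels`, its parameters, and the example difficulty values are illustrative and do not come from the slides.

```python
import random


def sample_labels(theta, n_workers, seed=0):
    """Draw n_workers i.i.d. labels in {+1, -1}; each is +1 with probability theta."""
    rng = random.Random(seed)
    return [1 if rng.random() < theta else -1 for _ in range(n_workers)]


# Hypothetical difficulties: one easy instance and one hard instance.
easy = sample_labels(theta=0.95, n_workers=1000)
hard = sample_labels(theta=0.55, n_workers=1000)

# The observed share of +1 labels approaches theta as the number of workers grows.
print(sum(label == 1 for label in easy) / len(easy))
print(sum(label == 1 for label in hard) / len(hard))
```

An instance whose difficulty is near 0.5 yields votes that are split almost evenly, which is why it needs more labels before any quality requirement can be met.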
Slide8
Notations
$Z_i$: the true label of the $i$-th instance
$\theta_i$: difficulty level of the $i$-th instance
$T$: maximum number of labels given the budget
$a_i$: vote count of +1 labels for the $i$-th instance
$b_i$: vote count of -1 labels for the $i$-th instance
Slide9
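Examples of Requirement
Minimum ratio
Approve the result on an instance if the ratio of +1 votes to -1 votes, or of -1 votes to +1 votes, reaches the chosen minimum ratio
Equivalent to setting a threshold on entropy
Hypothesis test
Fisher exact test to test whether the labels are randomly guessed
Calculate the p-value, and approve the result if it is below the chosen significance level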
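The minimum-ratio requirement can be sketched as a small check on the two vote counts. This is an illustration under stated assumptions, not the authors' code: the name `approve_min_ratio` is hypothetical, and treating an empty minority as one vote (so that a ratio against zero is defined) is an assumption made here.

```python
def approve_min_ratio(pos_votes, neg_votes, min_ratio):
    """Return the approved label (+1 or -1), or None if the requirement is not met.

    Assumption: when the minority side has zero votes it is counted as one vote,
    so an instance still needs at least min_ratio agreeing votes to be approved.
    """
    if pos_votes >= min_ratio * max(neg_votes, 1):
        return 1
    if neg_votes >= min_ratio * max(pos_votes, 1):
        return -1
    return None


# Hypothetical vote counts checked against a minimum ratio of 4.
print(approve_min_ratio(4, 1, min_ratio=4))
print(approve_min_ratio(3, 1, min_ratio=4))
print(approve_min_ratio(1, 4, min_ratio=4))
```

Comparing `pos_votes` with `min_ratio * neg_votes` instead of dividing avoids a division by zero when one side has no votes.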
Slide10
Completeness
Ratio between the observed total vote count and the minimum count of labels needed to achieve the requirement.
Denoted as: (observed total vote count) / (minimum count to achieve the requirement)
Example: the requirement is a minimum ratio of 4
If …, completeness = …
If …, completeness = …
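One way to read the completeness definition for the minimum-ratio requirement is sketched below. The reading of "minimum count" is an assumption: given an assumed true label, the votes against it cannot be undone, so the agreeing side must reach `min_ratio` times the opposing count (an empty opposing side is counted as one vote). The function name and the example vote counts are hypothetical.

```python
def completeness(pos_votes, neg_votes, min_ratio, assumed_label):
    """Observed vote total divided by the smallest total that would meet the ratio.

    assumed_label is +1 or -1: the label the instance is assumed to end up with.
    The value is capped at 1.0 once the requirement is already met.
    """
    if assumed_label == 1:
        majority, minority = pos_votes, neg_votes
    else:
        majority, minority = neg_votes, pos_votes
    # Assumption: the agreeing side must reach min_ratio * max(minority, 1) votes.
    needed_total = min_ratio * max(minority, 1) + minority
    return min(1.0, (majority + minority) / needed_total)


# Hypothetical votes under a minimum ratio of 4.
print(completeness(4, 1, min_ratio=4, assumed_label=1))
print(completeness(2, 1, min_ratio=4, assumed_label=1))
print(completeness(2, 1, min_ratio=4, assumed_label=-1))
```

With two +1 votes and one -1 vote, the instance is much closer to the requirement if the true label is +1 than if it is -1, which is why completeness is defined conditionally on the label.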
Slide13
Maximize Completeness
The goal is to label as many instances as possible that achieve the requirement of quality.
Maximize the overall completeness.
Formally: choose the policy that maximizes the sum, over all instances, of the expected completeness, subject to the budget constraint.
Policy: all the possible combinations of choosing instances for labelling.
Expected completeness: computed for each instance, i.e., for the $i$-th instance.
Constraint: the total number of labels cannot exceed the budget.
Slide16
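Expected Completeness
Expected completeness = P(true label is +1) × (completeness given +1) + P(true label is -1) × (completeness given -1)
where "completeness given +1" and "completeness given -1" are the completeness given that the true label is +1 or -1, respectively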
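As a worked illustration, the expected completeness can be computed as a probability-weighted average of the two conditional completeness values. The function name, the parameter names, and the numbers below are hypothetical; how the probability of the true label is estimated is left open here.

```python
def expected_completeness(p_positive, completeness_if_positive, completeness_if_negative):
    """Weight each conditional completeness by the probability of that true label."""
    return (p_positive * completeness_if_positive
            + (1.0 - p_positive) * completeness_if_negative)


# Hypothetical values: the label is believed to be +1 with probability 0.8,
# completeness is 0.6 if the label is +1 and 0.3 if it is -1.
value = expected_completeness(0.8, 0.6, 0.3)
print(round(value, 4))
```

When the votes strongly favor one label, the expected completeness is dominated by the completeness under that label.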
Slide18
Markov Decision Process
Solve the optimization using a Markov decision process
Stage-wise reward
Greedy strategy
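A rough sketch of one greedy step: compute a stage-wise reward for every instance and request the next label for the instance with the largest reward. Several pieces are assumptions made for illustration and are not taken from the slides: the minimum-ratio completeness formula, the Laplace-smoothed vote share used as a probability estimate, and all function names.

```python
def completeness(pos, neg, min_ratio, assumed_label):
    """Observed total over the smallest total meeting the ratio (assumed reading)."""
    if assumed_label == 1:
        majority, minority = pos, neg
    else:
        majority, minority = neg, pos
    needed_total = min_ratio * max(minority, 1) + minority
    return min(1.0, (majority + minority) / needed_total)


def p_positive(pos, neg):
    # Assumption: a Laplace-smoothed vote share stands in both for the chance
    # that the true label is +1 and for the chance that the next label is +1.
    return (pos + 1) / (pos + neg + 2)


def expected_completeness(pos, neg, min_ratio):
    p = p_positive(pos, neg)
    return (p * completeness(pos, neg, min_ratio, 1)
            + (1 - p) * completeness(pos, neg, min_ratio, -1))


def stage_reward(pos, neg, min_ratio):
    """Expected gain in expected completeness from requesting one more label."""
    p = p_positive(pos, neg)
    after = (p * expected_completeness(pos + 1, neg, min_ratio)
             + (1 - p) * expected_completeness(pos, neg + 1, min_ratio))
    return after - expected_completeness(pos, neg, min_ratio)


def greedy_pick(votes, min_ratio):
    """Return the index of the instance with the largest stage-wise reward."""
    rewards = [stage_reward(pos, neg, min_ratio) for pos, neg in votes]
    return max(range(len(votes)), key=rewards.__getitem__)


# Hypothetical (pos, neg) vote counts for three instances, minimum ratio 3.
votes = [(3, 1), (2, 1), (1, 1)]
print(greedy_pick(votes, min_ratio=3))
```

Repeating this step until the budget is spent gives a sequential labeling policy; it is a one-step-lookahead heuristic and does not solve the full Markov decision process.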
Slide19
Requallo Framework
[Figure, built up over four slides: a worked example with three instances Q1, Q2, Q3 and their +1/-1 votes under the requirement "minimum ratio of 3". The build adds a completeness value for each instance (100%, 72% and 50% appear), then two rewards, and finally marks one instance "Selected" and another "Unselected".]
Slide23
Extension: Workers’ Reliability
Reliability degree of each worker
The label from a worker comes from two layers of Bernoulli sampling
Adjust the vote counts to account for worker reliability
Slide24
Experiments on Real-World Crowdsourcing Tasks
Datasets:
RTE dataset: collected on mTurk for recognizing textual entailment
Game dataset: collected using an Android app based on the TV game show “Who Wants to Be a Millionaire”
Performance measures:
Quantity
Quality
Slide25
Experiments on Real-World Crowdsourcing Tasks
[Figure: two panels, RTE Dataset and Game Dataset; y-axis: Quantity.]
Slide26
Experiments on Real-World Crowdsourcing Tasks
[Figure: two panels, RTE Dataset and Game Dataset; y-axis: Absolute count.]
Slide27
Experiments on Real-World Crowdsourcing Tasks
[Figure: two panels, RTE Dataset and Game Dataset; y-axis: Accuracy rate.]
Slide28
Comparison of Different Requallo Policies (on Game dataset)
Method          Cost   #Instances  #Correct  Accuracy
Requallo-p0.2   7715   1662        1587      0.9549
Requallo-p0.1   11191  1597        1558      0.9756
Requallo-p0.05  13878  1517        1493      0.9842
Requallo-c4     8689   1567        1518      0.9687
Requallo-c5     11266  1489        1464      0.9832
Requallo-m3     5127   1709        1580      0.9245
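The accuracy column can be checked directly from the other columns, since accuracy is the number of correct instances divided by the number of labeled instances. The short script below recomputes it from the table values and also derives the cost per labeled instance, a quantity the table does not report but that makes the quantity versus cost trade-off easier to see. Variable names are illustrative.

```python
# (method, cost, #instances, #correct, reported accuracy), copied from the table.
rows = [
    ("Requallo-p0.2", 7715, 1662, 1587, 0.9549),
    ("Requallo-p0.1", 11191, 1597, 1558, 0.9756),
    ("Requallo-p0.05", 13878, 1517, 1493, 0.9842),
    ("Requallo-c4", 8689, 1567, 1518, 0.9687),
    ("Requallo-c5", 11266, 1489, 1464, 0.9832),
    ("Requallo-m3", 5127, 1709, 1580, 0.9245),
]

for method, cost, n_instances, n_correct, reported in rows:
    accuracy = n_correct / n_instances          # recomputed accuracy
    cost_per_instance = cost / n_instances      # derived, not in the table
    print(f"{method:15s} accuracy={accuracy:.4f} "
          f"(reported {reported:.4f}) cost/instance={cost_per_instance:.2f}")
```

Within each policy family, a stricter setting (a smaller p, or c5 versus c4) costs more per labeled instance and labels fewer instances, while accuracy rises.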
This result confirms our intuition.
If a requester wants high-quality results, he can set a strict requirement, but should expect a lower quantity of labeled instances or a higher cost.
Slide29
Conclusions
In this paper, we study how to allocate a tight budget for crowdsourcing tasks
The requesters can specify their needs on label quality
The goal is to maximize quantity under the budget while guaranteeing the quality
The proposed Requallo framework uses a greedy strategy to sequentially label instances
Extension to incorporate workers’ reliabilities
Slide30
Thank You!