Presentation Transcript

Slide1

Unbiased Learning-to-Rank with Biased Feedback

Thorsten Joachims, Adith Swaminathan, Tobias Schnabel
Department of Computer Science & Department of Information Science
Cornell University

Slide2

Learning-to-Rank from Clicks

[Figure: for each incoming query from the query distribution, the deployed ranker presents a ranking (A B C D E F G) and the user clicks on some of the results; many such logged interactions form the training data.]

Query Distribution → Deployed Ranker → logged clicks → Learning Algorithm → New Ranker

The new ranker should perform better than the deployed ranker.

Slide3

Evaluating Rankings

[Figure: the deployed ranker presents the ranking A B C D E F G and receives clicks; the new ranker to evaluate would instead present F G D C E A B; manually labeled relevance judgments are shown for comparison.]

Slide4

Evaluation with Missing Judgments

Loss: Δ(y | x, r). This talk uses the rank of the relevant documents,
Δ(y | x, r) = Σ_{i : r_i = 1} rank(d_i | y),
which requires the relevance labels r.

Assume: a click implies the result was observed and relevant.
Problem: no click can mean not relevant OR not observed.
→ We need to understand the observation mechanism.

[Figure: presented ranking A B C D E F G with a click on one of the results.]
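To make the loss concrete, here is a minimal Python sketch (not from the slides) of the rank-of-relevant-documents loss defined above; the function and variable names are illustrative.

```python
def rank_of_relevant_loss(ranking, relevance):
    """Sum of the (1-based) ranks of the relevant documents in `ranking`.

    ranking:   document ids in presented order, best first
    relevance: dict mapping document id -> 0/1 relevance label
    """
    return sum(rank
               for rank, doc in enumerate(ranking, start=1)
               if relevance.get(doc, 0) == 1)

# Example: if B and D are relevant, the ranking A B C D has loss 2 + 4 = 6.
print(rank_of_relevant_loss(["A", "B", "C", "D"], {"B": 1, "D": 1}))  # 6
```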

Slide5

Inverse Propensity Score Estimator

Observation propensities: the random variable o_i indicates whether the relevance label r_i of document d_i is observed; its propensity is Q(o_i = 1 | x, ȳ, r), where ȳ is the presented ranking.

Inverse Propensity Score (IPS) estimator:
Δ_IPS(y | x, ȳ, o) = Σ_{i : o_i = 1, r_i = 1} rank(d_i | y) / Q(o_i = 1 | x, ȳ, r)

Unbiasedness: E_o[ Δ_IPS(y | x, ȳ, o) ] = Δ(y | x, r)

[Figure: presented ranking with observation propensities A 1.0, B 0.8, C 0.5, D 0.2, E 0.2, F 0.2, G 0.1, and the new ranking being evaluated.]

[Horvitz & Thompson, 1952] [Rubin, 1983] [Zadrozny et al., 2003] [Langford & Li, 2009] [Swaminathan & Joachims, 2015]

Slide6

Inverse Propensity Score Estimator (continued)

The estimator sums only over clicked (i.e., observed and relevant) documents, so we need to know the propensities only for the relevant/clicked documents.
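A minimal Python sketch of this IPS estimate, assuming (as above) that clicks identify observed-and-relevant documents and that their observation propensities are known; the propensity values below reuse the illustrative numbers from the slide.

```python
def ips_loss_estimate(new_ranking, clicked_docs, propensities):
    """Unbiased estimate of the rank-of-relevant loss of `new_ranking`.

    new_ranking:  document ids in the order the new ranker would present them
    clicked_docs: documents clicked in the logged (presented) ranking
    propensities: dict mapping document id -> probability its relevance was observed
    """
    rank = {doc: i for i, doc in enumerate(new_ranking, start=1)}
    return sum(rank[doc] / propensities[doc] for doc in clicked_docs)

# Example with the propensities from the slide (A: 1.0 ... G: 0.1):
prop = {"A": 1.0, "B": 0.8, "C": 0.5, "D": 0.2, "E": 0.2, "F": 0.2, "G": 0.1}
print(ips_loss_estimate(list("BACDEFG"), clicked_docs=["A"], propensities=prop))  # 2 / 1.0 = 2.0
```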

Slide7

Full-Info Learning-to-Rank

Loss:            Δ(y | x, r)
Risk:            R(S) = ∫ Δ(S(x) | x, r) dP(x, r)
Empirical risk:  R̂(S) = (1/n) Σ_i Δ(S(x_i) | x_i, r_i)
Training (ERM):  Ŝ = argmin_S R̂(S)

Slide8

ERM for Partial-Information LTR

Unbiased empirical risk:  R̂_IPS(S) = (1/n) Σ_i Δ_IPS(S(x_i) | x_i, ȳ_i, o_i)
ERM learning:             Ŝ = argmin_S R̂_IPS(S)

A consistent estimator of the true error gives consistent ERM learning.

Questions:
- How do we optimize this empirical risk in a practical learning algorithm?
- How do we define and estimate the propensity model Q(o_i = 1 | x, ȳ, r)?
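Tying the pieces together, here is a minimal Python sketch of the unbiased empirical risk: the per-query IPS loss estimates averaged over a click log. The `ranker` callable and the structure of `click_log` are assumptions for illustration.

```python
def unbiased_empirical_risk(ranker, click_log):
    """Average IPS loss estimate of `ranker` over a click log.

    ranker:    callable mapping a query to a list of document ids in ranked order
    click_log: list of (query, clicked_docs, propensities) tuples logged under
               the deployed ranker
    """
    total = 0.0
    for query, clicked_docs, propensities in click_log:
        rank = {doc: i for i, doc in enumerate(ranker(query), start=1)}
        total += sum(rank[doc] / propensities[doc] for doc in clicked_docs)
    return total / len(click_log)
```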

Slide9

Propensity-Weighted SVM Rank

Data: for each query, the clicked result, all other presented results, and the propensity of the click.

Training QP: an SVM-Rank quadratic program in which each pairwise constraint (clicked result ranked above every other result) is weighted by the inverse propensity of the click.

Loss bound: the objective optimizes a convex upper bound on the unbiased IPS risk estimate!

[Joachims et al., 2002]

Slide10

Propensity-Weighted SVM Rank (continued)

Risk bound: [the training QP again, highlighting the clicked result, all other results, and the propensity weighting].
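As a rough illustration (not the authors' SVM solver), the following Python/numpy sketch evaluates a propensity-weighted pairwise hinge objective of the kind the training QP optimizes; the data layout and parameter names are assumptions.

```python
import numpy as np

def propensity_weighted_hinge_objective(w, queries, C=1.0):
    """Regularizer plus inverse-propensity-weighted pairwise hinge losses.

    queries: list of dicts with keys
        'features':   {doc_id: np.ndarray feature vector}
        'clicked':    id of the clicked document
        'propensity': observation propensity of the clicked document
    """
    objective = 0.5 * np.dot(w, w)                       # margin regularizer
    for q in queries:
        x_click = q["features"][q["clicked"]]
        for doc, x_other in q["features"].items():
            if doc == q["clicked"]:
                continue
            margin = np.dot(w, x_click - x_other)        # clicked doc should score higher
            objective += (C / q["propensity"]) * max(0.0, 1.0 - margin)
    return objective
```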

Slide11

Position-Based Propensity Model

Model: the probability of examining a result depends only on the rank at which it is presented, Q(o_i = 1 | rank(d_i | ȳ)) = p_rank.

Assumptions:
- Examination depends only on the rank.
- Clicks reveal relevance if the result is examined: c_i = 1 if o_i = 1 and r_i = 1, and c_i = 0 otherwise.

Propensity of a click at rank k: p_k.

Slide12

Position-Based Propensity Model (continued)

[Figure: presented ranking A B C D E F G; results presented at higher ranks are examined with higher probability.]

[Richardson et al., 2007] [Chuklin et al., 2015] [Wang et al., 2016]
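A minimal Python sketch of this click model, reusing the illustrative examination probabilities from the earlier propensity table; values and names are for illustration only.

```python
import random

# Illustrative examination probabilities per rank (from the earlier slide's table).
examination_prob = {1: 1.0, 2: 0.8, 3: 0.5, 4: 0.2, 5: 0.2, 6: 0.2, 7: 0.1}

def simulate_clicks(presented_ranking, relevance):
    """Clicks under the position-based model: examined AND relevant."""
    clicks = []
    for rank, doc in enumerate(presented_ranking, start=1):
        examined = random.random() < examination_prob[rank]
        if examined and relevance.get(doc, 0) == 1:
            clicks.append(doc)
    return clicks
```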

Slide13

Estimating the Propensities

Experiment: measure the click rate at rank 1.
Intervention: swap the results at rank 1 and rank k, and measure the click rate at rank k.
Under the position-based model, the ratio of these click rates identifies the propensity ratio p_k / p_1.

[Langford et al., 2009] [Wang et al., 2016]
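A minimal Python sketch of this swap-intervention estimate; the logging format is an assumption, and the code simply takes the ratio of the two observed click rates.

```python
def estimate_propensity_ratio(clicks_at_rank_1, clicks_at_rank_k):
    """Estimate p_k / p_1 from click logs gathered under the swap intervention.

    clicks_at_rank_1: 0/1 click indicators observed at rank 1
    clicks_at_rank_k: 0/1 click indicators observed at rank k after swapping
                      the results at rank 1 and rank k
    """
    ctr_1 = sum(clicks_at_rank_1) / len(clicks_at_rank_1)
    ctr_k = sum(clicks_at_rank_k) / len(clicks_at_rank_k)
    return ctr_k / ctr_1
```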

Slide14

Experiments

Yahoo Web Search Dataset:
- Full-information dataset with binarized relevance labels.
- Synthetic click data generated from the position-based propensity model, with propensities decaying with rank according to a severity parameter η.
- Clicks generated on the rankings of a baseline "deployed" ranker.
- 33% noisy clicks on irrelevant documents.

[Figure: a presented ranking A B C D E F G with simulated clicks.]
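A minimal Python sketch of synthetic click generation in this spirit. The propensity curve (1/rank)^η and the way noise is injected are assumptions for illustration; the slides only state that a position-based model with severity parameter and 33% noisy clicks on irrelevant documents were used.

```python
import random

def synthetic_clicks(presented_ranking, relevance, eta=1.0, noise=0.33):
    """Sample clicks from a position-based model with propensity (1/rank)**eta.

    An examined irrelevant document is clicked with probability `noise`
    (one simple way to inject noisy clicks; the exact mechanism is assumed).
    """
    clicks = []
    for rank, doc in enumerate(presented_ranking, start=1):
        if random.random() < (1.0 / rank) ** eta:          # examined?
            if relevance.get(doc, 0) == 1 or random.random() < noise:
                clicks.append(doc)
    return clicks
```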

Slide15

Scaling with Training Set Size

[Figure: test performance as the amount of training click data grows; the deployed ranker is shown as a baseline.]

Slide16

Scaling with Training Set Size

Slide17

Severity of Presentation Bias

[Figure: performance as the severity of the presentation bias increases.]

Slide18

Increasing Click Noise

Slide19

Misspecified Propensities

[Figure: performance when the propensities used for training are misspecified. Misspecifying in one direction increases bias but reduces variance; in the other direction it increases both bias and variance.]

Slide20

Real-World Experiment

Arxiv Full-Text Search:
- Ran an intervention experiment to estimate the propensities.
- Collected training clicks using the production ranker.
- Trained naive and propensity SVM-Rank (1000 features).
- Compared the rankers in A/B tests via interleaving.

Slide21

Conclusions and Future Work

Partial-Information Learning-to-Rank Problem:
- Sits between batch learning from bandit feedback and full-info learning-to-rank.
- Positive-only feedback.
- Relevant for many ranking problems with partial labels.

Partial-Information Empirical Risk Minimization (ERM):
- Unbiased ERM objective despite biased partial feedback.
- Propensity Ranking SVM method.

Future research:
- Other loss functions? Other LTR algorithms?
- More sophisticated propensity estimation and modeling?
- How to handle the new bias-variance trade-off in the risk estimator?

Software: http://www.joachims.org/svm_light/svm_proprank.html

Slide22

Multi-Label Classification / Ranking

Full-information feedback:
- Input: x
- Labels: y (a vector of per-label judgments)
- Goal: for a new x, predict a bit vector / ranking / subset.

Examples:
- Document tagging: x = document, y = (Politics?, Europe?, ...)
- Object recognition: x = image, y = (Cow?, Plane?, ...)
- Search: x = query, y = (Doc1 relevant?, Doc2 relevant?, ...)

Problem: we almost never get reliable feedback for all labels!

[Figure: a document with a partial set of tags such as USA, Politics, Election 2016, Trump Admin, Environment, Global Warming, ...]

Slide23

Partial Feedback: Missing Labels

Labels are not missing uniformly at random → covariate shift.
Example: movie recommendation [Schnabel et al., 2016]

[Figure: an examples-by-labels matrix; with full-information feedback the complete relevance matrix r is observed, while with partial-information feedback only some entries are revealed.]

Slide24

Partial Feedback: Positive-Only

It is unclear whether a "0" means "missing" or "negative".
Example: tagging, object recognition [Jain et al., 2016]

[Figure: an examples-by-labels matrix; with full-information feedback the complete label matrix Y* is known, while with partial-information feedback only some of the positive entries (e.g., clicks) are revealed.]

Slide25

What is between BLBF and full-info LTR?

Batch Learning from Bandit Feedback (BLBF):
- No assumptions/knowledge about the loss function.
- The loss is only observed for the chosen action.
- The exploration policy provides randomized actions with full support.

Unbiased Partial-Info LTR:
- Loss function is known/assumed.
- Relevance labels are only partially revealed.
- User behavior provides randomized relevance labels with full support.

Full-Info LTR:
- Loss function is known/assumed.
- Relevance labels are known for all documents.

Slide26

[Figure: examples-by-labels matrix.]

Slide27

Propensities: Partial-Info Learning-to-Rank Setup

- ȳ is the presented ranking.
- o_i is the random variable indicating whether the relevance label r_i is observed.
- c_i is the random variable indicating whether document d_i is clicked.
- Propensity of observing r_i: Q(o_i = 1 | x, ȳ, r).