Predicting Content Change on the Web

Uploaded On 2020-08-07



Presentation Transcript

Slide1

Predicting Content Change on the Web

Kira Radinsky

Technion, Israel

Paul Bennett

Microsoft Research

Slide2

Bing Site: 2009, 2010, 2011

Slide3

Slide4

Personal Site: 2009, 2010, 2011

Slide5

Slide6

Unified Approach for Content Change Prediction

1D Setting: use observation of change only

2D Setting: use observation of change and content from the page itself only

3D Setting: use change and content from the page and related pages
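The three settings differ only in which information feeds the predictor. The sketch below is an illustration of that layering, not the authors' implementation: the function names, the bag-of-words content representation, the two change-history features, and the averaging over related pages are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple


def features_1d(change_history: Sequence[int]) -> List[float]:
    """1D setting (assumed features): use only past change observations.

    change_history holds 1 where the page changed between crawls, else 0.
    """
    n = len(change_history)
    change_rate = sum(change_history) / n if n else 0.0
    last_changed = float(change_history[-1]) if n else 0.0
    return [change_rate, last_changed]


def bag_of_words(text: str) -> Dict[str, int]:
    """Count whitespace-separated, lowercased tokens."""
    counts: Dict[str, int] = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts


def features_2d(change_history: Sequence[int], page_text: str,
                vocabulary: Sequence[str]) -> List[float]:
    """2D setting: change observations plus the page's own content."""
    counts = bag_of_words(page_text)
    content = [float(counts.get(word, 0)) for word in vocabulary]
    return features_1d(change_history) + content


def features_3d(change_history: Sequence[int], page_text: str,
                related: Sequence[Tuple[Sequence[int], str]],
                vocabulary: Sequence[str]) -> List[float]:
    """3D setting: add the averaged 2D features of related pages."""
    own = features_2d(change_history, page_text, vocabulary)
    if not related:
        return own + [0.0] * len(own)
    rel = [features_2d(h, t, vocabulary) for h, t in related]
    averaged = [sum(column) / len(rel) for column in zip(*rel)]
    return own + averaged
```

Each setting's vector extends the previous one, so any classifier trained on these vectors can be compared across settings on the same pages.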

Slide7

Results – what information to use?

Content improves over Page Change Frequency alone

Related pages improve over Content & Change frequency

Slide8

Results – how to combine the information?

Having different views of the change leads to the best results

Slide9

Results – how to choose the related pages?

Best indicators of page change are the correlations in content similarity over time.
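One possible reading of this finding, sketched under stated assumptions: describe each page by the similarity between its consecutive snapshots, then rank candidate pages by how strongly that series correlates with the target page's series. The cosine similarity over bags of words, the Pearson correlation, and all function names here are choices made for the illustration; the slides do not specify them.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_series(snapshots: Sequence[str]) -> List[float]:
    """Similarity of each snapshot to the next one, in crawl order."""
    bags = [Counter(s.lower().split()) for s in snapshots]
    return [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when it is undefined."""
    n = min(len(x), len(y))
    if n < 2:
        return 0.0
    x, y = list(x[:n]), list(y[:n])
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_related(target_snapshots: Sequence[str],
                 candidates: Dict[str, Sequence[str]],
                 k: int = 5) -> List[str]:
    """Return the k candidate names whose series best track the target."""
    target = similarity_series(target_snapshots)
    scored = [(pearson(target, similarity_series(snaps)), name)
              for name, snaps in candidates.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]
```

A page that never changes has a constant similarity series, so its correlation is undefined and is scored as 0.0 here rather than treated as evidence either way.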

Slide10

How Can it Improve Crawling?

Slide11

Conclusions

Page content is useful for identifying page change

Related pages' content also helps in deciding which pages will change

The combination of the data is important, and can be efficiently distributed

Applications

Improved incremental crawling strategy.

Prediction of a new hyper-link to a previously unknown (i.e., non-indexed) web page.

Personalized new content RSS
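As an illustration of the incremental crawling application, the sketch below spends a fixed crawl budget on the pages with the highest predicted probability of having changed. The function name, the probability dictionary, and the budget parameter are hypothetical; the slides do not describe the crawling policy in this detail.

```python
import heapq
from typing import Dict, List


def pick_pages_to_crawl(change_probability: Dict[str, float],
                        budget: int) -> List[str]:
    """Select up to `budget` URLs, most likely to have changed first.

    change_probability maps a URL to a predicted change probability
    (assumed to come from some change predictor).
    """
    if budget <= 0:
        return []
    return heapq.nlargest(budget, change_probability,
                          key=change_probability.get)
```

For example, with predictions `{"a": 0.1, "b": 0.9, "c": 0.5}` and a budget of 2, the function selects `b` and then `c`.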