
Web Cache - PowerPoint Presentation

Uploaded On 2016-12-18





Presentation Transcript

Slide1

Web Cache

Characterizing Roles of Front-end Servers in End-to-End Performance of Dynamic Content Distribution

46842197 Li ZHANG
78884704 Dakuo WANG
30165502 Xuejie SUN
37324635 Yang LIU

Slide2

1. Introduction of Web Cache
2. Related Paper Overview
   2.1 Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol
   2.2 Going Viral: Flash Crowds in an Open CDN
3. Characterizing Roles of Front-end Servers in End-to-End Performance of Dynamic Content Distribution
   3.1 Problem Definition
   3.2 Motivation
   3.3 Model
   3.4 Result
   3.5 Conclusion
   3.6 Pro & Con
4. Q & A

Slide3

Introduction to Web Caching (Proxy Server)

CONCEPT
A web cache is a mechanism for temporarily storing web documents to reduce bandwidth usage, server load, and perceived lag.

TYPES OF PROXY SERVER
Forward proxies, open proxies, reverse proxies, performance-enhancing proxies

USES OF PROXY SERVER
- To speed up access to resources.
- To control access to internal resources.
- To filter content.
- To hide the real IP address.
- To circumvent Internet filtering and access content otherwise blocked by governments.
- To get around access restrictions placed on one's own IP address.

Slide4

Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol
Li Fan, Member, IEEE, Pei Cao, Jussara Almeida, and Andrei Z. Broder

Slide5

Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol

Internet Cache Protocol (ICP)
- Simple cache sharing: fetch and store locally
- No load balancing
- Overhead: more UDP messages (by a factor of 73 to 90), more network traffic (8% to 13%), higher client HTTP request latency (8% to 12%)

Summary Cache
- Each proxy stores a summary of its directory of cached documents at every other proxy.
- On a cache miss, a proxy checks the summaries to see whether the document exists at another proxy.

Summary Cache Enhanced ICP (SC-ICP)
- Adds a new opcode to ICP version 2
- Introduces an additional header that follows the regular ICP header
- Squid 1.1.4 was modified to implement the protocol

Slide6
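The summaries that Summary Cache proxies exchange need to be much smaller than the cache directories they describe. The Summary Cache paper represents them as Bloom filters, which can report false positives but never false negatives. The following is a minimal Python sketch of that idea, not the Squid implementation: the class name `BloomSummary`, the helper `peers_to_query`, the bit-array size, the number of hash functions, and the use of SHA-256 are all illustrative assumptions.

```python
import hashlib


class BloomSummary:
    """Compact, lossy summary of the URLs held in one proxy's cache."""

    def __init__(self, num_bits=1024, num_hashes=4):
        # num_bits and num_hashes are illustrative, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, url):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))


def peers_to_query(url, peer_summaries):
    """On a local miss, return only the peers whose summary matches."""
    return [name for name, summary in peer_summaries.items()
            if summary.might_contain(url)]


# Hypothetical proxies and URLs, for illustration only.
summary_a = BloomSummary()
summary_a.add("http://example.com/index.html")
summary_b = BloomSummary()
summary_b.add("http://example.com/logo.png")
peers = {"proxy-a": summary_a, "proxy-b": summary_b}
hits = peers_to_query("http://example.com/index.html", peers)
```

A proxy that misses locally would then contact only the peers listed in `hits`, instead of sending an ICP query to every neighbor. A false positive costs one wasted query, which is the price paid for the smaller summary.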

Going Viral: Flash Crowds in an Open CDN
Patrick Wendell, Michael J. Freedman

Flash Crowds on CoralCDN
- CoralCDN: an open Content Distribution Network (CDN) running at several hundred POPs
- Flash crowd: a period over which request rates for a particular fully-qualified domain name are increasing exponentially, where the request rate is the average per-minute rate over a particular period t_i
- Data: 4 years of CDN traffic, 33 billion HTTP requests

Analysis conclusions:
- Potential benefits of cooperative vs. independent caching by CDN nodes
- The efficacy of elastic redirection and resource provisioning
- The ecosystem of portals, aggregators and social networks

Slide7
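The flash-crowd definition given for CoralCDN (exponentially increasing per-minute request rates for one domain name) can be turned into a simple detector. The sketch below is only one possible reading: it assumes a flash crowd is flagged when the rate at least doubles for several consecutive periods. The function name `find_flash_crowd`, the growth factor of 2, the minimum run length, and the sample rates are assumptions for illustration and are not taken from the paper.

```python
def find_flash_crowd(rates, growth=2.0, min_periods=3):
    """Return (start, end) indices of the first run in which each
    period's rate is at least `growth` times the previous one for
    `min_periods` consecutive steps, or None if there is no such run."""
    run_start = None
    for i in range(1, len(rates)):
        prev, cur = rates[i - 1], rates[i]
        if prev > 0 and cur >= growth * prev:
            if run_start is None:
                run_start = i - 1
            if i - run_start >= min_periods:
                # Extend the run while the growth continues.
                end = i
                while (end + 1 < len(rates)
                       and rates[end + 1] >= growth * rates[end]):
                    end += 1
                return run_start, end
        else:
            run_start = None
    return None


# Hypothetical per-minute request rates for one domain name.
rates = [10, 11, 9, 20, 45, 100, 210, 220, 200]
crowd = find_flash_crowd(rates)
```

With these sample rates the detector reports the run from index 2 to index 6, where the rate climbs from 9 to 210 requests per minute. A production detector would likely also require a minimum absolute rate so that a jump from 1 to 8 requests is not flagged.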

Going Viral: Flash Crowds in an Open CDN

Flash Crowd Cacheability
- Key factor: the degree of coordination among caches in fetching origin content.
- CoralCDN uses a distributed hash table for global content discovery.
- Commercial CDNs such as Akamai use non-cooperative caching, where each remote proxy independently fetches content from the origin site.
- Cooperative caching sends fewer requests to the origin site, at the cost of higher complexity and additional overhead.

Slide8
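The difference between cooperative and non-cooperative caching can be illustrated by counting how many requests reach the origin site. The toy model below assumes every object is cacheable, never expires, and that a cooperative proxy can always locate a copy held by a peer (for example through a distributed hash table). The function `origin_fetches` and the request trace are hypothetical.

```python
def origin_fetches(requests, cooperative):
    """Count fetches that reach the origin site.

    `requests` is a sequence of (proxy, url) pairs in arrival order.
    """
    local = {}          # proxy -> set of URLs it has cached
    anywhere = set()    # URLs cached by at least one proxy
    fetches = 0
    for proxy, url in requests:
        cached = local.setdefault(proxy, set())
        if url in cached:
            continue                      # local hit
        if cooperative and url in anywhere:
            cached.add(url)               # copied from a peer
            continue
        fetches += 1                      # miss goes to the origin
        cached.add(url)
        anywhere.add(url)
    return fetches


# Hypothetical trace: three proxies requesting two objects.
trace = [("p1", "/a"), ("p2", "/a"), ("p3", "/a"),
         ("p1", "/b"), ("p2", "/b"), ("p1", "/a")]
independent = origin_fetches(trace, cooperative=False)
shared = origin_fetches(trace, cooperative=True)
```

For this trace the independent proxies send 5 requests to the origin while the cooperating proxies send 2. The model ignores the lookup traffic between proxies, which is the complexity and overhead that cooperative caching pays for the saving.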

Going Viral: Flash Crowds in an Open CDN

Flash Crowd Cacheability

Slide9

The Motivations

Most content on the Internet is stored at data centers in the cloud and is generated dynamically in response to user requests. The scale and cost of building and operating large, powerful data centers are increasing.

One way to improve the overall response time is to deploy "proxy" servers, or front-end (FE) servers, closer to users. FE servers can improve user-perceived performance for two reasons:

1) A portion of the dynamic content may be static, so it can be cached and delivered immediately from the FE servers.
2) Via split TCP connections, an FE server can maintain a persistent TCP connection with the back-end (BE) data center, which not only eliminates the effect of TCP slow-start between the FE and BE, but also reduces the RTT between the user and the server.

Slide10

The Problem

- The authors conduct an active, measurement-based comparative study of the Google and Microsoft Bing web search services.
- They use PlanetLab nodes to perform extensive measurements of both services with a variety of search keywords, and collect the dynamically generated content and application-layer measurement data.
- They use the collected data to analyze the role of FE servers.

Slide11

How to solve it

- They develop an in-house user search query emulator, which performs exactly the same function as the web-based search box.
- They conduct extensive measurements by submitting the same search queries to both the Bing and Google search engines, and collect detailed tcpdump traces with full application-layer payloads.
- They perform two sets of experiments:
  1) In the first set, search queries are launched from all measurement nodes to their default 3 FE servers every 10 seconds.
  2) In the second set, they fix one FE server (of Bing or Google respectively) at a time, and launch queries from all measurement nodes to this server.

Slide12

Content distribution

- Content includes a static portion and a dynamic portion (i.e., the search results).
- Static portion: HTTP header, HTML header, CSS style files and the static menu bar.
- Dynamic portion: keyword-dependent menu bar, search results and ads.
- The static portion is cached and delivered directly by FE servers. The dynamic portion is generated by BE data centers and then passed on to the FE servers for delivery.
- The experiment shows that Tdynamic varies significantly with the type of search keyword used, whereas Tstatic is mostly insensitive to it.

Slide13

Several parameters:

- Tb: start of the TCP three-way handshake
- T1: HTTP GET request
- T2: packets received from the server
- T3/T4: first/last packet containing the static portion received
- T5/T6: first/last packet containing the dynamic portion received

Slide14
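The timestamps listed under "Several parameters" can be combined into the intervals Tstatic, Tdynamic and Tdelta. The slides do not give the exact formulas, so the definitions in this sketch are assumptions made for illustration: Tstatic is taken as T4 - T2, Tdynamic as T5 - T2, and Tdelta as the gap T5 - T4 between the end of the static portion and the start of the dynamic portion. The class `QueryTimestamps`, the function `intervals`, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryTimestamps:
    """Timestamps (in ms) for one search query, named as on the slide."""
    tb: float  # start of TCP three-way handshake
    t1: float  # HTTP GET request
    t2: float  # packets received from server
    t3: float  # first packet of the static portion
    t4: float  # last packet of the static portion
    t5: float  # first packet of the dynamic portion
    t6: float  # last packet of the dynamic portion


def intervals(ts):
    # These formulas are assumptions for illustration; the slides
    # do not state how the paper defines each interval.
    return {
        "t_static": ts.t4 - ts.t2,
        "t_dynamic": ts.t5 - ts.t2,
        "t_delta": max(0.0, ts.t5 - ts.t4),
        "t_total": ts.t6 - ts.tb,
    }


# Hypothetical timestamps for a single query.
sample = QueryTimestamps(tb=0.0, t1=20.0, t2=40.0, t3=42.0,
                         t4=55.0, t5=140.0, t6=180.0)
result = intervals(sample)
```

In practice the timestamps would be extracted from a packet capture by locating the packets that carry the static and dynamic portions of the response.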

Observation:

Tstatic depends mostly on the time to generate and deliver the static content portion at the FE server.

When RTT is small, Tdynamic is roughly constant while Tdelta decreases as a function of RTT.

When RTT increases beyond a certain threshold, the dynamic content portion will be received by the FE server before the static content portion is entirely delivered to the client. Hence Tdynamic increases as a function of RTT, while Tdelta becomes zero.

Slide15
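The two regimes described in the observation on Tdynamic and Tdelta can be reproduced with a toy model. It assumes, purely for illustration, that the FE forwards the query to the BE as soon as the request arrives, that the dynamic portion reaches the FE after a fixed fetch time, and that delivering the static portion to the client takes a fixed number of round trips. The function `model` and its parameters `t_fetch_ms` and `static_rtts` are hypothetical and are not the paper's model.

```python
def model(rtt_ms, t_fetch_ms=120.0, static_rtts=3):
    """Toy model of Tdynamic and Tdelta versus client-FE RTT.

    Times are measured from the arrival of the request at the FE.
    """
    static_done = static_rtts * rtt_ms          # static portion delivered
    t_dynamic = max(t_fetch_ms, static_done)    # dynamic portion can start
    t_delta = max(0.0, t_fetch_ms - static_done)
    return t_dynamic, t_delta


# With the default parameters the threshold is t_fetch_ms / static_rtts.
threshold_ms = 120.0 / 3
table = [(rtt, *model(rtt)) for rtt in (10.0, 20.0, 40.0, 60.0, 80.0)]
```

Below the 40 ms threshold Tdynamic stays at the fetch time and Tdelta shrinks as RTT grows. Above it Tdelta is zero and Tdynamic grows with RTT. Under these assumptions, moving the FE closer than the threshold no longer shortens the time until the dynamic content can be sent.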

Performance

The first cluster represents the three-way TCP handshake between the client and the FE server. The second and third clusters represent the delivery of the static and dynamic content. As the RTT increases, the gap between the end of the second cluster and the beginning of the third decreases, and eventually the two are lumped together.

Slide16

Google's FE servers are slightly farther from the clients, but Google has significantly lower Tstatic and Tdynamic. These results illustrate that placing FE servers closer to clients does not necessarily reduce Tstatic and Tdynamic.

The x-axis represents the PlanetLab nodes, and the y-axis represents the box plot of the distribution over different samples.

The results show that, compared with Google users, users of the Bing search service tend to experience slightly longer and more variable overall response times.

Slide17

Comparing Bing and Google Performance, with Discussion

- The fetch time between Google FE servers and BE data centers tends to be smaller and more stable. In contrast, the fetch time between Akamai FE servers and Bing data centers tends to be larger and shows higher variability.
- Although Bing places FE servers closer to clients, it has significantly higher Tstatic and Tdynamic than Google. This may be due to higher and more variable load at the Akamai FE servers, which Bing shares with other services.
- Once FE servers are close enough to users (client-FE RTT below the threshold), the end-to-end performance is determined by the FE-BE fetch time.
- Tfetch consists of two key components: Tproc (the query processing time at the BE) and RTTbe (the round-trip time between the FE and the BE).

Slide18

Several Results of This Paper

- FE servers do not cache any dynamically generated search results. They cache only static information, such as the HTTP header and HTML header.
- Placing FE servers closer to users can improve user-perceived performance, but there is a trade-off between the placement of FE servers and the FE-BE fetch time. There is a threshold within which placing FE servers even closer to users is no longer helpful.
- While placing FE servers closer to users can help reduce latency, other key factors, such as processing times, loads at the FE servers and BE data centers, and the quality of the connections between them, also play a critical role in determining the overall user-perceived performance.
- Improving and optimizing these factors is important for overall user-perceived performance in dynamic content distribution, such as the dynamic generation of search results in response to user requests.

Slide19

Strong points of the paper

- The paper investigates the role of FE servers in improving the user-perceived performance of dynamic content distribution, which is emerging as the next big business for CDNs.
- It develops a simple model-based inference framework to measure and quantify the frontend-to-backend fetch time, which comprises the query processing time at the BE and the delivery time between the BE and FE.
- The authors use the Bing and Google search services and perform extensive network measurement and analysis based on several sets of experiments.
- The paper also takes into account the differences between the FE servers of Bing and those of Google.

Slide20

Weaknesses of the Paper

- The paper focuses on the standard search functions of search engines. More recently, however, some search engines have introduced more advanced features such as interactive search: after each letter the user types, a separate query is sent to the FE server, and subsequent queries are highly correlated.
- Most of the nodes used for testing may introduce some unfairness between Bing and Google, because they are located closer to Bing FE servers.
- There was no significant packet loss during the measurements. In a high-loss environment, placing FE servers closer to users may significantly improve the user-perceived end-to-end performance.

Slide21

Thanks
Any Questions?