Network Applications of Bloom Filters: A Survey

Uploaded On 2022-08-02


Presentation Transcript

Slide1

Network Applications of Bloom Filters: A Survey

Andrei Broder and Michael Mitzenmacher

Presenter: Chen Qian

Slides credit: Hongkun Yang

Slide2

Outline

Bloom Filter Overview

Standard Bloom Filters

Counting Bloom Filters

Historical Applications

Network Applications

Distributed Caching

P2P/Overlay Networks

Resource Routing

Conclusion

Slide3

Overview

Introduced by Burton Bloom in 1970

Randomized data structure

Representing a set to support membership queries

Dramatic space savings

Allow false positives

Slide4

Standard Bloom Filters: Notations

S: the set of n elements {x1, x2, …, xn}

k independent hash functions h1, …, hk with range {1, …, m}

Assume: hash functions map each item in the universe to a random number uniformly over the range {1, …, m}

Example hash function: MD5

An array B of m bits, initially filled with 0s

Slide5

Standard Bloom Filters: How It Works

Hash each xi in S k times. If hj(xi) = a, set B[a] = 1.

To check whether y is in S, check B at hj(y), j = 1, 2, …, k

If all k values are set to 1, y is assumed to be in S

If not, y is clearly not in S

No False Negative

Possible False Positive
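The insert and check steps above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: the class name `BloomFilter` and the salted SHA-256 scheme used to simulate k independent hash functions are assumptions made for the example, and positions are 0-based rather than {1, …, m}.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter with m bits and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, for readability

    def _positions(self, item: str):
        # Assumed hashing scheme: salt SHA-256 with the index j to
        # simulate k independent hash functions (0-based positions).
        for j in range(self.k):
            digest = hashlib.sha256(f"{j}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for a in self._positions(item):
            self.bits[a] = 1

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[a] for a in self._positions(item))


bf = BloomFilter(m=1024, k=4)
for x in ["x1", "x2", "x3"]:
    bf.add(x)
print("x1" in bf)  # True: inserted items are never reported missing
print("y1" in bf)  # usually False, but a false positive is possible
```

One byte per bit wastes space; a real filter would pack the bits, which changes nothing in the logic.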

Slide6

Standard Bloom Filters: An Example

B: 0 0 0 0 0 0

INITIAL STATE

Slide7

Standard Bloom Filters: An Example

B: 0 0 0 0 0 0

INSERTION: x1 and x2 are hashed into B, and the bits at the hashed positions are set to 1

Slide8

Standard Bloom Filters: An Example

B: 1 0 1 0 1 0

CHECK: y1 and y2 are hashed into B, and the bits at the hashed positions are tested

Slide9

Standard Bloom Filters: False Positive Rate (1)

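Pr[a given bit in B is 0] = $(1 - 1/m)^{kn} \approx e^{-kn/m}$

The probability of a false positive is $\left(1 - (1 - 1/m)^{kn}\right)^k \approx (1 - e^{-kn/m})^k$

Let r be the proportion of 0 bits after all elements are inserted in the Bloom filter

Conditioned on r, the probability of a false positive is $(1 - r)^k$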

Slide10

Standard Bloom Filters: False Positive Rate (2)

The fraction of 0 bits is extremely concentrated around its expectation

Therefore, with high probability, the false positive probability is $(1 - r)^k \approx (1 - e^{-kn/m})^k$
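As a quick numeric check, the exact expression $(1 - (1 - 1/m)^{kn})^k$ and the approximation $(1 - e^{-kn/m})^k$ can be compared directly. The function names and the parameter values below are illustrative assumptions, not taken from the slides.

```python
import math


def zero_bit_probability(n: int, k: int, m: int) -> float:
    """Exact Pr[a given bit is 0] after inserting n items: (1 - 1/m)^(kn)."""
    return (1.0 - 1.0 / m) ** (k * n)


def false_positive_exact(n: int, k: int, m: int) -> float:
    """False positive probability using the exact zero-bit probability."""
    return (1.0 - zero_bit_probability(n, k, m)) ** k


def false_positive_approx(n: int, k: int, m: int) -> float:
    """False positive probability using the e^(-kn/m) approximation."""
    return (1.0 - math.exp(-k * n / m)) ** k


n, k, m = 1000, 5, 8000  # 8 bits per element, 5 hash functions
print(false_positive_exact(n, k, m))
print(false_positive_approx(n, k, m))
```

For these values both expressions come out near 0.0217, which illustrates why the simpler exponential form is normally used.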

Slide11

Standard Bloom Filters: Optimal Number of Hash Functions (1)

Two competing forces:

More hash functions give more chances to find a 0 bit for an element that is not a member of S

Fewer hash functions increase the fraction of 0 bits in the array

Slide12

Standard Bloom Filters: Optimal Number of Hash Functions (2)

Slide13

Standard Bloom Filters: Optimal Number of Hash Functions (3)

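Note that $f = (1 - e^{-kn/m})^k = \exp\left(k \ln(1 - e^{-kn/m})\right)$

Let $g = k \ln(1 - e^{-kn/m})$, solve $\frac{dg}{dk} = \ln(1 - e^{-kn/m}) + \frac{kn}{m} \cdot \frac{e^{-kn/m}}{1 - e^{-kn/m}} = 0$

Rewrite g as $g = -\frac{m}{n} \ln(p) \ln(1 - p)$, where $p = e^{-kn/m}$

Using symmetry, g is minimal when p = ½

Then $k = \frac{m}{n} \ln 2$ and $f = (1/2)^k \approx (0.6185)^{m/n}$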
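The optimum $k = (m/n) \ln 2$ can be checked numerically by scanning integer values of k and keeping the one that minimizes $f = (1 - e^{-kn/m})^k$. A small sketch, with n and m chosen arbitrarily for illustration:

```python
import math


def false_positive(k: int, n: int, m: int) -> float:
    """Approximate false positive probability (1 - e^(-kn/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


n, m = 1000, 10000  # 10 bits per element
k_real = (m / n) * math.log(2)  # real-valued optimum, about 6.93
k_best = min(range(1, 31), key=lambda k: false_positive(k, n, m))
print(k_real, k_best, false_positive(k_best, n, m))
```

Since k must be an integer, in practice one rounds the real-valued optimum; here the scan picks k = 7, with a false positive probability of about 0.0082.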

Slide14

Standard Bloom Filters: Space Efficiency

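A lower bound

Let $\epsilon$ be the false positive ratio, then $m \ge n \log_2(1/\epsilon)$

The optimal case

The false positive rate for the optimal Bloom filter is $f = (1/2)^k = (1/2)^{(m/n) \ln 2}$

Let $f \le \epsilon$, then $m \ge n \log_2(1/\epsilon) / \ln 2 = n \log_2 e \cdot \log_2(1/\epsilon)$, within a factor of $\log_2 e \approx 1.44$ of the lower bound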
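For sizing a filter in practice, the optimal-case relation $m \ge n \log_2(1/\epsilon) / \ln 2$ together with $k = (m/n) \ln 2$ gives m and k from n and a target false positive rate. The helper names below are hypothetical, and rounding k to the nearest integer is a choice made for this sketch.

```python
import math


def size_bloom_filter(n: int, eps: float):
    """Return (m, k) for n items and a target false positive rate eps."""
    m = math.ceil(n * math.log2(1.0 / eps) / math.log(2))
    k = max(1, round((m / n) * math.log(2)))
    return m, k


def lower_bound_bits(n: int, eps: float) -> float:
    """Information-theoretic lower bound n * log2(1/eps) on the bits needed."""
    return n * math.log2(1.0 / eps)


n_items, target = 1_000_000, 0.01
m, k = size_bloom_filter(n_items, target)
print(m / n_items, k)                       # bits per element, hash functions
print(m / lower_bound_bits(n_items, target))  # close to log2(e), about 1.44
```

For a 1% target this comes to roughly 9.6 bits per element with 7 hash functions, independent of the size of the items themselves.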

Slide15

Standard Bloom Filters: Operations (1)

Union

Build a Bloom filter representing the union of A and B by taking the OR of BF(A) and BF(B)

Shrinking a Bloom filter

Halving the size by taking the OR of the first and the second half of the Bloom filter

Increases the false positive rate

The intersection of two sets
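Union and halving are both plain bitwise ORs. The sketch below assumes both filters use the same m and the same hash functions, that m is even, and that positions are computed as a hash value modulo the filter length (so lookups in a halved filter simply reduce modulo m/2). The salted SHA-256 hashing and the function names are illustrative assumptions.

```python
import hashlib


def positions(item: str, k: int, m: int):
    """k bit positions for item (assumed salted SHA-256 hashing, 0-based)."""
    out = []
    for j in range(k):
        digest = hashlib.sha256(f"{j}:{item}".encode()).digest()
        out.append(int.from_bytes(digest[:8], "big") % m)
    return out


def build(items, k: int, m: int):
    """Build a Bloom filter (list of 0/1 bits) for the given items."""
    bits = [0] * m
    for item in items:
        for a in positions(item, k, m):
            bits[a] = 1
    return bits


def union(bits_a, bits_b):
    # Valid only if both filters share m and the same hash functions.
    return [a | b for a, b in zip(bits_a, bits_b)]


def halve(bits):
    # OR the second half onto the first; afterwards look items up with
    # positions(item, k, m // 2), i.e. the same hash reduced modulo m/2.
    half = len(bits) // 2
    return [bits[i] | bits[i + half] for i in range(half)]


def contains(bits, item: str, k: int) -> bool:
    return all(bits[a] for a in positions(item, k, len(bits)))


bf_a = build(["a1", "a2"], k=3, m=64)
bf_b = build(["b1"], k=3, m=64)
print(contains(union(bf_a, bf_b), "b1", k=3))  # True
print(contains(halve(bf_a), "a1", k=3))        # True
```

With this hashing scheme the OR of two filters equals the filter built from the union of the sets, and the halved filter equals the filter that would have been built with m/2 bits.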

Slide16

Standard Bloom Filters: Operations (2)

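The intersection of S1 and S2

The average number of 1 bits in the AND of BF(S1) and BF(S2) is $m\left(1 - (1 - 1/m)^{k|S_1|} - (1 - 1/m)^{k|S_2|} + (1 - 1/m)^{k(|S_1| + |S_2| - |S_1 \cap S_2|)}\right)$

Z1: the number of 0 bits in BF(S1); Z2: the number of 0 bits in BF(S2); Z12: the number of 0 bits in the AND of BF(S1) and BF(S2)

Then $(1 - 1/m)^{-k|S_1 \cap S_2|} \approx m (Z_1 + Z_2 - Z_{12}) / (Z_1 Z_2)$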
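The zero-bit counts can be turned into an estimate of |S1 ∩ S2|. Replacing each count by its expectation gives $(1 - 1/m)^{-k|S_1 \cap S_2|} \approx m (Z_1 + Z_2 - Z_{12}) / (Z_1 Z_2)$, which the sketch below solves for the intersection size. The hashing scheme, set contents, and parameters are assumptions chosen for illustration.

```python
import hashlib
import math


def build(items, k: int, m: int):
    """Build a Bloom filter (list of 0/1 bits) with assumed SHA-256 hashing."""
    bits = [0] * m
    for item in items:
        for j in range(k):
            digest = hashlib.sha256(f"{j}:{item}".encode()).digest()
            bits[int.from_bytes(digest[:8], "big") % m] = 1
    return bits


def estimate_intersection(bits1, bits2, k: int) -> float:
    """Estimate |S1 ∩ S2| from the 0-bit counts Z1, Z2 and Z12."""
    m = len(bits1)
    z1 = bits1.count(0)
    z2 = bits2.count(0)
    # Z12: 0 bits in the AND of the two filters.
    z12 = sum(1 for a, b in zip(bits1, bits2) if not (a and b))
    ratio = m * (z1 + z2 - z12) / (z1 * z2)
    return math.log(ratio) / (-k * math.log(1.0 - 1.0 / m))


m, k = 1 << 16, 4
s1 = [f"item{i}" for i in range(0, 2000)]
s2 = [f"item{i}" for i in range(1500, 3500)]  # true overlap: 500 items
est = estimate_intersection(build(s1, k, m), build(s2, k, m), k)
print(round(est))  # an estimate of the true overlap of 500
```

The estimate is only as good as the concentration of the 0-bit counts, so it degrades as the filters fill up.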

Slide17

Counting Bloom Filters: Motivation

Standard Bloom filters

Easy to insert elements

Cannot perform deletion operations

Counting Bloom filters

Each entry is not a single bit but a small counter

Insert an element: increment the corresponding counters

Delete an element: decrement the corresponding counters
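A counting Bloom filter differs from the standard one only in what each entry stores. The sketch below is a minimal illustration under assumed names and an assumed salted SHA-256 hashing scheme; it uses unbounded Python integers as counters, so counter overflow is not modeled.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter whose entries are small counters instead of bits."""

    def __init__(self, m: int, k: int) -> None:
        self.m, self.k = m, k
        self.counters = [0] * m

    def _positions(self, item: str):
        # Assumed hashing scheme: SHA-256 salted with the index j.
        return [
            int.from_bytes(
                hashlib.sha256(f"{j}:{item}".encode()).digest()[:8], "big"
            ) % self.m
            for j in range(self.k)
        ]

    def add(self, item: str) -> None:
        for a in self._positions(item):
            self.counters[a] += 1

    def remove(self, item: str) -> None:
        # Only call this for items that were actually inserted; removing
        # a non-member would corrupt the counters.
        for a in self._positions(item):
            self.counters[a] -= 1

    def __contains__(self, item: str) -> bool:
        return all(self.counters[a] > 0 for a in self._positions(item))


cbf = CountingBloomFilter(m=256, k=3)
cbf.add("x1")
cbf.add("x2")
cbf.remove("x1")
print("x2" in cbf)  # True: deleting x1 does not erase x2
```

Because shared positions are counted rather than overwritten, deleting x1 leaves every position that x2 needs above zero.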

Slide18

Counting Bloom Filters: An Example

B: 0 0 0 0 0 0

INITIAL STATE

Slide19

Counting Bloom Filters: An Example

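B: 0 0 0 0 0 0

INSERTION: x1 and x2 are hashed into B, and the counters at the hashed positions are incremented (a counter hit twice reaches 2)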

Slide20

Counting Bloom Filters: An Example

B: 1 0 1 0 1 0

DELETION: x1 is hashed into B, and the counters at the hashed positions are decremented

Slide21

Counting Bloom Filters: How Large Counters Do We Need? (1)

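n elements, k hash functions, m counters, and c(i) the count associated with the ith counter

$\Pr[c(i) = j] = \binom{nk}{j} \left(\frac{1}{m}\right)^j \left(1 - \frac{1}{m}\right)^{nk - j}$

The tail probability is bounded by $\Pr[c(i) \ge j] \le \binom{nk}{j} \frac{1}{m^j} \le \left(\frac{e n k}{j m}\right)^j$

Then use the union bound again: $\Pr[\max_i c(i) \ge j] \le m \left(\frac{e n k}{j m}\right)^j$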

Slide22

Counting Bloom Filters: How Large Counters Do We Need? (2)

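4 bits per counter is enough: with $k \le (m/n) \ln 2$, $\Pr[\max_i c(i) \ge 16] \le m \left(\frac{e \ln 2}{16}\right)^{16} \approx 1.37 \times 10^{-15} \cdot m$

The maximum counter value is O(log m) with high probability, and hence O(log log m) bits are sufficient

Let j = 3 ln m / ln ln m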
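The "4 bits is enough" claim can be checked by evaluating the overflow bound directly. Assuming $k \le (m/n) \ln 2$, the probability that any counter reaches j is at most $m (e \ln 2 / j)^j$; a 4-bit counter overflows at j = 16. The function name and the value of m below are illustrative.

```python
import math


def overflow_bound(m: int, j: int) -> float:
    """Upper bound on Pr[max counter >= j], assuming k <= (m/n) ln 2."""
    return m * (math.e * math.log(2) / j) ** j


print(overflow_bound(m=1_000_000, j=16))  # about 1.4e-9
print(overflow_bound(m=1_000_000, j=8))   # 3-bit counters: much weaker bound
```

Even with a million counters the bound for 4-bit counters is around 1.4e-9, while for 3-bit counters the same bound exceeds 1 and says nothing.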

Slide23

Historical Applications

Dictionaries

Hyphenation programs

UNIX spell-checkers

Dictionary of unsuitable passwords

Databases

Semi-join operations

Differential files

Slide24

Distributed Caching: Scenario

Slide25

Distributed Caching: Summary Cache

Motivation

Sharing of caches among Web proxies to reduce Web traffic and alleviate network bottlenecks

Directly sharing lists of URLs has too much overhead

Solution

Use Bloom filters to reduce network traffic

Use a counting Bloom filter to track cache contents

Broadcast the corresponding standard Bloom filter to other proxies
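The split between the two filter types can be sketched as follows: each proxy keeps counters locally so that evictions can be undone, and sends peers only the derived bit vector. The `Proxy` class, its method names, the URLs, and the hashing scheme are hypothetical, introduced for illustration rather than taken from the Summary Cache protocol itself.

```python
import hashlib


def positions(url: str, k: int, m: int):
    """k positions for a URL (assumed salted SHA-256 hashing)."""
    return [
        int.from_bytes(
            hashlib.sha256(f"{j}:{url}".encode()).digest()[:8], "big"
        ) % m
        for j in range(k)
    ]


class Proxy:
    """Hypothetical proxy tracking its cache in a counting Bloom filter."""

    def __init__(self, m: int = 4096, k: int = 4) -> None:
        self.m, self.k = m, k
        self.counters = [0] * m

    def cache_insert(self, url: str) -> None:
        for a in positions(url, self.k, self.m):
            self.counters[a] += 1

    def cache_evict(self, url: str) -> None:
        for a in positions(url, self.k, self.m):
            self.counters[a] -= 1

    def summary(self):
        # Standard Bloom filter sent to peers: bit i is 1 iff counter i > 0.
        return [1 if c > 0 else 0 for c in self.counters]


def peer_may_have(summary, url: str, k: int) -> bool:
    """Checked by another proxy before forwarding a request."""
    return all(summary[a] for a in positions(url, k, len(summary)))


proxy = Proxy()
proxy.cache_insert("http://example.com/a")
proxy.cache_insert("http://example.com/b")
proxy.cache_evict("http://example.com/b")
snapshot = proxy.summary()
print(peer_may_have(snapshot, "http://example.com/a", proxy.k))  # True
```

A false positive here costs one wasted request to a peer, which is the price paid for not exchanging full URL lists.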

Slide26

P2P/Overlay Networks: Content Delivery

Problem

Peer A has a set of items SA, peer B has SB, B wants useful items from A (SA − SB)

Solution

B sends A its Bloom filter BF(B)

A sends B its items that are not in SB according to BF(B)

Implications of false positives

Not all elements in SA − SB will be sent

Redundant items (e.g. erasure coding)

A large fraction of SA − SB is sufficient (not necessarily the entire set)
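The exchange can be sketched from peer A's side: given BF(B), A sends only the items the filter says B lacks. The helper names, the item names, and the hashing scheme are assumptions for this example.

```python
import hashlib


def positions(item: str, k: int, m: int):
    """k positions for item (assumed salted SHA-256 hashing)."""
    return [
        int.from_bytes(
            hashlib.sha256(f"{j}:{item}".encode()).digest()[:8], "big"
        ) % m
        for j in range(k)
    ]


def build(items, k: int, m: int):
    bits = [0] * m
    for item in items:
        for a in positions(item, k, m):
            bits[a] = 1
    return bits


def might_contain(bits, item: str, k: int) -> bool:
    return all(bits[a] for a in positions(item, k, len(bits)))


def items_to_send(s_a, bf_b, k: int):
    # A sends only the items that BF(B) reports as missing at B.
    return [x for x in s_a if not might_contain(bf_b, x, k)]


s_a = ["blk1", "blk2", "blk3", "blk4"]
s_b = ["blk3", "blk4", "blk5"]
bf_b = build(s_b, k=3, m=256)       # B computes this and sends it to A
sent = items_to_send(s_a, bf_b, k=3)
print(sent)  # a subset of SA - SB; a false positive would drop an item
```

Nothing B already holds is ever sent, because a Bloom filter has no false negatives; the only error is withholding an item B actually needs.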

Slide27

P2P/Overlay Networks: Efficient P2P Keyword Searching (1)

Problem

Peer A has a set of items SA, peer B has SB, A wants to determine SA ∩ SB

Solution

A sends B its Bloom filter BF(A)

B sends A its items that appear to be in SA according to BF(A)

A eliminates false positives and determines SA ∩ SB exactly

Fewer bits transmitted than A sending the entire set SA
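The two-step intersection can be sketched as below: B filters its set through BF(A), and the side that holds SA removes any false positives. Function names and the hashing scheme are assumptions; the sets are small integers chosen for illustration.

```python
import hashlib


def positions(item, k: int, m: int):
    """k positions for item (assumed salted SHA-256 hashing)."""
    return [
        int.from_bytes(
            hashlib.sha256(f"{j}:{item}".encode()).digest()[:8], "big"
        ) % m
        for j in range(k)
    ]


def build(items, k: int, m: int):
    bits = [0] * m
    for item in items:
        for a in positions(item, k, m):
            bits[a] = 1
    return bits


def candidates_from_b(s_b, bf_a, k: int):
    # Step at B: keep the items that appear to be in SA according to BF(A).
    return [x for x in s_b if all(bf_a[a] for a in positions(x, k, len(bf_a)))]


def exact_intersection(s_a, candidates):
    # Step at the holder of SA: drop false positives.
    members = set(s_a)
    return [x for x in candidates if x in members]


s_a, s_b = [1, 2, 3, 4], [3, 4, 5, 6]
bf_a = build(s_a, k=3, m=128)
cands = candidates_from_b(s_b, bf_a, k=3)
print(exact_intersection(s_a, cands))  # [3, 4]
```

The candidate list always contains the true intersection, so the final filtering step only ever removes items.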

Slide28

P2P/Overlay Networks: Efficient P2P Keyword Searching (2)

[Figure: Client, Server A holding SA = {1, 2, 3, 4}, and Server B holding SB = {3, 4, 5, 6}. Labels: (1) request; (2) BF(A); candidate items 3 4 6; result 3 4.]

Slide29

Resource Routing (1)

Network is in the form of a rooted tree

Nodes hold resources

Each node keeps Bloom filters representing

A unified list of resources that it holds or that are reachable through one of its children

Individual lists of resources for it and each child.

When receiving a request for a resource

Check the unified list to see whether the node or its descendants hold the resource

Yes: check the individual lists

No: forward the request up the tree toward the root
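The routing rule can be sketched with one small Bloom filter per list. This is an illustrative model under stated assumptions, not the scheme from any particular system: the `Bloom` and `Node` classes, the `route` function, the hop limit, and the hashing are all invented for the example, and false positives are handled only by giving up when no next hop exists.

```python
import hashlib


class Bloom:
    def __init__(self, m: int = 1024, k: int = 3) -> None:
        self.m, self.k, self.bits = m, k, [0] * m

    def _positions(self, item: str):
        # Assumed hashing scheme: SHA-256 salted with the index j.
        return [
            int.from_bytes(
                hashlib.sha256(f"{j}:{item}".encode()).digest()[:8], "big"
            ) % self.m
            for j in range(self.k)
        ]

    def add(self, item: str) -> None:
        for a in self._positions(item):
            self.bits[a] = 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[a] for a in self._positions(item))


class Node:
    def __init__(self, name: str, parent: "Node | None" = None) -> None:
        self.name, self.parent = name, parent
        self.own = Bloom()       # individual list: resources held here
        self.unified = Bloom()   # unified list: own plus all descendants
        self.child_filters = {}  # individual list per child subtree
        if parent is not None:
            parent.child_filters[self] = Bloom()

    def add_resource(self, resource: str) -> None:
        self.own.add(resource)
        self.unified.add(resource)
        node = self
        while node.parent is not None:  # propagate to every ancestor
            node.parent.child_filters[node].add(resource)
            node.parent.unified.add(resource)
            node = node.parent


def route(start: Node, resource: str, max_hops: int = 32):
    """Apply the rule from `start`; return the names of visited nodes."""
    node, path = start, []
    for _ in range(max_hops):
        path.append(node.name)
        if resource in node.unified:
            if resource in node.own:
                return path  # assumed found here (false positives possible)
            nxt = next(
                (c for c, bf in node.child_filters.items() if resource in bf),
                None,
            )
        else:
            nxt = node.parent  # not in this subtree: go up toward the root
        if nxt is None:
            return path
        node = nxt
    return path


root = Node("root")
a, b = Node("a", root), Node("b", root)
c = Node("c", a)
c.add_resource("file1")
b.add_resource("file2")
print(route(b, "file1"))
```

Starting at b, a request for file1 climbs to the root because b's unified list does not contain it, then descends through a to c by following the per-child lists.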

Slide30

Resource Routing (2)

Slide31

Conclusion

Simple space-efficient representation of a set or a list that can handle membership queries

Applications in numerous networking problems

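Bloom filter principle: wherever a list or set is used and space is at a premium, consider using a Bloom filter if the effect of false positives can be mitigated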

Slide32

THANK YOU!