Collaborative, Privacy-Preserving - PowerPoint Presentation

Uploaded On 2020-08-05


Data Aggregation at Scale. Michael J. Freedman, Princeton University. Joint work with Benny Applebaum, Haakon Ringberg, Matthew Caesar, and Jennifer Rexford. Problem: Network Anomaly Detection. ID: 798957

Presentation Transcript

Slide1

Collaborative, Privacy-Preserving Data Aggregation at Scale

Michael J. Freedman

Princeton University

Joint work with: Benny Applebaum, Haakon Ringberg, Matthew Caesar, and Jennifer Rexford

Slide2
Problem:

Network Anomaly Detection

Slide3
Collaborative anomaly detection

Some attacks look like normal traffic, e.g., SQL injection, application-level DoS [Srivatsa TWEB ’08]

Is it a DDoS attack or a flash crowd? [Jung WWW ’02]

Yahoo!, Google, Bing (each): “I’m not sure about Beasty!”

Slide4

Collaborative anomaly detection

Targets (victims) could correlate attacks/attackers

[Katti IMC ’05], [Allman Hotnets ’06], [Kannan SRUTI ’06], [Moore INFOC ’03]

Yahoo!

Google

Bing

Fool us once, shame on you. Fool us N times, shame on us.

Slide5
Problem:

Network Anomaly Detection

Solution:

Aggregate suspect IPs from many ISPs

Flag those IPs that appear > threshold τ
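The thresholding idea on this slide can be sketched in a few lines, ignoring privacy for now. This is only an illustration: the function name `flag_suspects`, the ISP labels, and the choice to count each IP at most once per ISP are assumptions, not details from the talk.

```python
from collections import Counter

def flag_suspects(reports, tau):
    """Return the IPs reported by more than `tau` distinct ISPs."""
    counts = Counter()
    for isp_suspects in reports.values():
        # Assumption: an IP counts at most once per reporting ISP.
        counts.update(set(isp_suspects))
    return sorted(ip for ip, n in counts.items() if n > tau)

# Hypothetical reports from three ISPs.
reports = {
    "isp_a": ["1.1.1.1", "2.2.2.2"],
    "isp_b": ["2.2.2.2", "3.3.3.3"],
    "isp_c": ["2.2.2.2", "1.1.1.1"],
}
print(flag_suspects(reports, tau=2))  # ['2.2.2.2']
```

Note that this plaintext version reveals every submitted IP and every submitter to the aggregator, which is the privacy problem the rest of the talk addresses.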

Slide6

Problem:

Distributed Ranking

Solution:

Collect domain statistics from many users

Aggregate data by domain

Slide7
Problem:

Solution:

Aggregate (id, data) from many sources

Analyze data grouped by id

Slide8
But what about privacy?

What inputs are submitted?

Who submitted what?

Slide9
Data Aggregation Problem

Many participants, each with (key, value) observation

Goal: Aggregate observations by key

Key    Values
k1     ( va, vb )        A
k2     ( vi, vj, vk )    A
…
kn     ( vx )            A

Slide10
Data Aggregation Problem

Many participants, each with (key, value) observation

Goal: Aggregate observations by key

Key    Values
k1     F( va, vb )
k2     F( vi, vj, vk )
…
kn     F( vx )

PDA: Only release the value column

CR-PDA: Plus keys whose values satisfy some func

Slide11
Data Aggregation Problem

Many participants, each with (key, value) observation

Goal: Aggregate observations by key

Key    Values
k1     Σ( 1, 1 )       τ ?
k2     Σ( 1, 1, 1 )    τ ?
…
kn     Σ( 1 )          τ ?

PDA: Only release the value column

CR-PDA: Plus keys whose values satisfy some func
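The difference between the two outputs can be sketched as a plaintext "ideal functionality", i.e. what a trusted party would compute if privacy were not a concern. The names `aggregate`, `pda`, and `cr_pda` are hypothetical, and the release rule (sum greater than τ) is only one example of the "some func" mentioned on the slide.

```python
from collections import defaultdict

def aggregate(observations, func):
    """Group (key, value) observations by key and apply func to each group."""
    groups = defaultdict(list)
    for key, value in observations:
        groups[key].append(value)
    return {key: func(values) for key, values in groups.items()}

def pda(observations, func):
    """PDA: release only the aggregated value column, with no keys."""
    # Sorting hides the row order so it cannot hint at the keys.
    return sorted(aggregate(observations, func).values())

def cr_pda(observations, func, release):
    """CR-PDA: the value column, plus keys whose aggregate satisfies `release`."""
    table = aggregate(observations, func)
    values = sorted(table.values())
    released_keys = sorted(k for k, agg in table.items() if release(agg))
    return values, released_keys

# Every observation has value 1, as on the slide, so the sum is a count.
obs = [("k1", 1), ("k1", 1), ("k2", 1), ("k2", 1), ("k2", 1), ("kn", 1)]
tau = 2
print(pda(obs, sum))                          # [1, 2, 3]
print(cr_pda(obs, sum, lambda s: s > tau))    # ([1, 2, 3], ['k2'])
```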

Slide12
Goals

Keyword privacy: No party learns anything about keys

Participant privacy: No party learns who submitted what

Efficiency: Scale to many participants, each with many inputs

Flexibility: Support variety of computations over values

Lack of coordination: No synchrony required, individuals cannot prevent progress

All participants need not be online at same time

Slide13
Potential solutions

Approach                        Keyword   Participant   Efficiency   Flexibility   Lack of
                                Privacy   Privacy                                  Coord
Decentralized:
  Garbled Circuit Evaluation    Yes       Yes           Very Poor    Yes           No
  Multiparty Set Intersection   Yes       Yes           Poor         No            No

Slide14

Security

Efficiency

Weaken security assumptions?

Assume honest but curious participants?

Assume no collusion among malicious participants?

In large/open setting, easy to operate multiple nodes (so-called “Sybil attack”)

Slide15
Towards Centralization?

DB

Participants

Slide16
Potential solutions

Approach                        Keyword   Participant   Efficiency   Flexibility   Lack of
                                Privacy   Privacy                                  Coord
Decentralized:
  Garbled Circuit Evaluation    Yes       Yes           Very Poor    Yes           No
  Multiparty Set Intersection   Yes       Yes           Poor         No            No
Centralized:
  Hashing Inputs                No        No            Very Good    Yes           Yes
  Network Anonymization         No        Yes           Very Good    Yes           Yes

Slide17
Towards semi-centralization

Participants

Proxy

DB

Assumption: Proxy and DB do not collude

Slide18
Potential solutions

Approach                        Keyword   Participant   Efficiency   Flexibility   Lack of
                                Privacy   Privacy                                  Coord
Decentralized:
  Garbled Circuit Evaluation    Yes       Yes           Very Poor    Yes           No
  Multiparty Set Intersection   Yes       Yes           Poor         No            No
Centralized:
  Hashing Inputs                No        No            Very Good    Yes           Yes
  Network Anonymization         No        Yes           Very Good    Yes           Yes
This Work                       Yes       Yes           Good         Yes           Yes

Slide19
Privacy Guarantees

Privacy of PDA against malicious entities and participants

Malicious participant may collude with either malicious proxy or DB, but not both

May violate correctness in almost arbitrary ways

Privacy of CR-PDA against honest-but-curious entities and malicious participants

Slide20
PDA Strawman #0

Participant

Proxy

DB

Client sends input k

Message: k

Slide21
PDA Strawman #1

Participant

Proxy

DB

Client sends encrypted input k

Proxy batches and retransmits

DB decrypts input

Messages: E_DB(k) (participant to proxy), E_DB(k) (proxy to DB)

DB table after decryption:

k          #
1.1.1.1    1
2.2.2.2    9

Violates keyword privacy

Slide22

PDA Strawman #2

Participant

Proxy

DB

Client sends hashes of k

Proxy batches and retransmits

DB decrypts input

Messages: E_DB(H(k)) (participant to proxy), E_DB(H(k)) (proxy to DB)

DB table after decryption:

H(k)          #
H(1.1.1.1)    1
H(2.2.2.2)    9

Still violates keyword privacy: IPs drawn from small domains
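Why plain hashing fails can be shown with a short dictionary attack. The sketch below assumes SHA-256 as H and a /16 network as the candidate domain; both are illustrative choices, not details from the talk. Anyone who sees H(k) can hash every candidate address and look for matches.

```python
import hashlib
import ipaddress

def h(ip: str) -> str:
    """Unkeyed hash H(k); SHA-256 is an assumption for illustration."""
    return hashlib.sha256(ip.encode()).hexdigest()

def invert_hashes(observed, network):
    """Recover IPs by hashing every address in `network` and matching."""
    targets = set(observed)
    recovered = {}
    for addr in ipaddress.ip_network(network):
        digest = h(str(addr))
        if digest in targets:
            recovered[digest] = str(addr)
    return recovered

# The DB only sees hashes, yet enumerating 65,536 candidates undoes them.
observed = [h("10.0.3.7"), h("10.0.200.9")]
recovered = invert_hashes(observed, "10.0.0.0/16")
print(sorted(recovered.values()))
```

The full IPv4 space has only 2^32 addresses, so the same enumeration is feasible there too; it just takes longer than this /16 example.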

Slide23
PDA Strawman #3

Participant

Proxy

DB

5. Proxy recovers k from E_PRX(k)

Client sends keyed hashes of k

Keyed hash function (PRF)

Key s known only by proxy

Messages: E_DB(F_s(k)) (participant to proxy), E_DB(F_s(k)) (proxy to DB)

DB table after decryption:

F_s(k)          #
F_s(1.1.1.1)    1
F_s(2.2.2.2)    9

But how do clients learn F_s(IP)?

Secret s (held by proxy)
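A keyed hash closes the dictionary attack for anyone who lacks s. The sketch below uses HMAC-SHA256 as a stand-in PRF, which is an assumption (the talk does not name HMAC), and the key values are hypothetical. It shows that equal keys still map to equal tags, so the DB can count, while a guess made without s does not match.

```python
import hashlib
import hmac
from collections import Counter

def keyed_hash(secret: bytes, ip: str) -> str:
    """Stand-in for F_s(k): HMAC-SHA256 under the proxy's secret s."""
    return hmac.new(secret, ip.encode(), hashlib.sha256).hexdigest()

s = b"proxy-secret-s"        # hypothetical; known only to the proxy
wrong = b"attacker-guess"    # the DB does not know s

submissions = ["1.1.1.1", "2.2.2.2", "2.2.2.2"]
# The DB counts occurrences per tag without seeing the IPs.
counts = Counter(keyed_hash(s, ip) for ip in submissions)

same_tag = keyed_hash(s, "2.2.2.2")
guess_tag = keyed_hash(wrong, "2.2.2.2")
print(counts[same_tag])        # 2
print(guess_tag in counts)     # False
```

The open question on the slide remains: in this sketch the client would need s to compute the tag, which is exactly what the proxy must not hand out.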

Slide24
Our Basic PDA Protocol

Participant

Proxy

DB

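Client sends keyed hashes of k

F_s(x) learned by client through Oblivious PRF (OPRF) protocol

Proxy batches and retransmits keyed hash

DB decrypts input

Messages: OPRF (between participant and proxy), E_DB(F_s(k)) (participant to proxy), E_DB(F_s(k)) (proxy to DB), F_s(k) (recovered at DB)

DB table after decryption:

F_s(k)          #
F_s(1.1.1.1)    1
F_s(2.2.2.2)    9

Secret s (held by proxy)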

Slide25

Basic CR-PDA Protocol

Participant

Proxy

DB

Client sends keyed hashes of k, and encrypted k for recovery

Proxy retransmits keyed hash

DB decrypts input

Identify rows to release and transmit E_PRX(k) to proxy

Proxy decrypts k and releases

Messages: E_DB(F_s(k)) and E_DB(E_PRX(k)) (participant to proxy, retransmitted to DB); F_s(k) and E_PRX(k) (recovered at DB)

DB table after decryption:

F_s(k)          #    Enc’d k
F_s(1.1.1.1)    1    E_PRX(1.1.1.1)
F_s(2.2.2.2)    9    E_PRX(2.2.2.2)

Secret s (held by proxy)
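The CR-PDA message flow can be mocked up to show who is able to open what. This is a sketch only: the `Sealed` class is a hypothetical stand-in for public-key encryption (it performs no cryptography), HMAC stands in for F_s, the OPRF step is skipped, and the release rule (count greater than τ) is an assumed example.

```python
import hashlib
import hmac
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Sealed:
    """Stand-in for encryption: only `recipient` may open the payload."""
    recipient: str
    payload: object

    def open(self, who: str):
        if who != self.recipient:
            raise PermissionError(f"{who} cannot open a message for {self.recipient}")
        return self.payload

SECRET_S = b"proxy-only-secret"  # hypothetical; in the protocol only the proxy holds s

def f_s(k: str) -> str:
    # Stand-in for F_s(k); a real client would learn it through the OPRF.
    return hmac.new(SECRET_S, k.encode(), hashlib.sha256).hexdigest()

def client_submit(k: str):
    # Client sends E_DB(F_s(k)) and E_DB(E_PRX(k)).
    return Sealed("DB", f_s(k)), Sealed("DB", Sealed("PRX", k))

def db_aggregate(batch, tau):
    # DB decrypts, counts per F_s(k), and returns E_PRX(k) for rows above tau.
    counts = defaultdict(int)
    enc_key = {}
    for enc_tag, enc_enc_k in batch:
        tag = enc_tag.open("DB")
        counts[tag] += 1
        enc_key[tag] = enc_enc_k.open("DB")  # still sealed for the proxy
    return [(enc_key[tag], n) for tag, n in counts.items() if n > tau]

def proxy_release(rows):
    # Proxy decrypts k only for the rows the DB chose to release.
    return {sealed_k.open("PRX"): n for sealed_k, n in rows}

submissions = ["1.1.1.1"] + ["2.2.2.2"] * 9
batch = [client_submit(k) for k in submissions]  # proxy batches and retransmits
released = proxy_release(db_aggregate(batch, tau=5))
print(released)  # {'2.2.2.2': 9}
```

In this mock the DB never holds an openable copy of k, and the proxy only sees the keys of released rows, which mirrors the split of knowledge the slide describes.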

Slide26
Privacy Properties

Participant

Proxy

DB

Any coalition of HBC (honest-but-curious) participants

HBC coalition of proxy and participants

HBC database

Messages: E_DB(F_s(k)), F_s(k), E_DB(E_PRX(k)), E_PRX(k)

Keyword privacy: Nothing learned about unreleased keys

Participant privacy: Key ↔ Participant not learned

Secret s (held by proxy)

Slide27
Privacy Properties

Participant

Proxy

DB

Any coalition of HBC participants

HBC coalition of proxy and participants

HBC database

Malicious participants

HBC coalition of DB and participants

Messages: E_DB(F_s(k)), F_s(k), E_DB(E_PRX(k)), E_PRX(k)

Keyword privacy: Nothing learned about unreleased keys

Participant privacy: Key ↔ Participant not learned

Secret s (held by proxy)

Slide28
More Robust PDA Protocol

Participant

Proxy

DB

Any coalition of HBC participants

HBC coalition of proxy and participants

HBC database

Malicious participants

HBC coalition of DB and participants

Messages: E_DB(F_s(k)), F_s(k), E_DB(E_PRX(k)), E_PRX(k)

Secret s (held by proxy)

OPRF → Encrypted OPRF Protocol

Ciphertext re-randomization by proxy

Proof by participant that submitted k’s match

Slide29
Encrypted-OPRF protocol

Problem: in basic OPRF protocol, participant learns F_s(k)

Encrypted-OPRF protocol:

Client learns blinded F_s(k)

Client encrypts to DB

Proxy can unblind F_s(k) “under the encryption”

$$\left( \mathrm{Enc}\left( F_s(k)^{r} \right) \right)^{r^{-1}} \qquad \text{(ElGamal)}$$

$$F_s(k) = g^{\prod_{k_i = 1} s_i} \bmod p$$
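The PRF on this slide has the shape of a Naor-Reingold style function: g raised to the product of the secret elements s_i selected by the 1-bits of k, modulo p. A minimal sketch follows, with toy parameters (p = 23, subgroup order q = 11, g = 2) that are assumptions chosen for readability and are far too small to be secure. Reducing the exponent modulo q is an implementation choice that is valid here because g has order q.

```python
def prf(secret, key_bits, g, p, q):
    """F_s(k) = g^(product of s_i over bits with k_i = 1) mod p."""
    exponent = 1
    for s_i, bit in zip(secret, key_bits):
        if bit:
            # g has order q, so exponents can be reduced modulo q.
            exponent = (exponent * s_i) % q
    return pow(g, exponent, p)

# Toy group: 2 has prime order 11 modulo 23 (illustration only).
P, Q, G = 23, 11, 2
secret = [3, 5, 7, 2]        # hypothetical s_1..s_4, held by the proxy
key_bits = [1, 0, 1, 1]      # a 4-bit input k

print(prf(secret, key_bits, G, P, Q))  # 6
```

Here the selected product is 3 · 7 · 2 = 42, which is 9 modulo 11, and 2^9 mod 23 = 6. A real deployment would use one secret element per input bit (for example 32 for an IPv4 address) and a cryptographically sized group.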

Slide30
Encrypted-OPRF protocol

Problem: in basic OPRF protocol, participant learns F_s(k)

Encrypted-OPRF protocol:

Client learns blinded F_s(k)

Client encrypts to DB

Proxy can unblind F_s(k) “under the encryption”

OPRF runs OT protocol for each bit of input k

OT protocols expensive, so use batch OT protocol [Ishai et al]

$$\left( \mathrm{Enc}\left( F_s(k)^{r} \right) \right)^{r^{-1}}$$
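Unblinding "under the encryption" works because ElGamal ciphertexts can be raised to a power component-wise, which raises the hidden plaintext to the same power. The sketch below assumes exponent blinding (the client learns F_s(k)^r, the proxy knows r), consistent with the r and r^-1 on the slide; the toy group and all concrete numbers are illustrative assumptions and are not secure. It needs Python 3.8+ for the modular inverse via `pow`.

```python
# Toy group: G has prime order Q modulo P (illustration only, not secure).
P, Q, G = 23, 11, 2

def encrypt(h, m, y):
    """ElGamal encryption of m under public key h with randomness y."""
    return (pow(G, y, P), (m * pow(h, y, P)) % P)

def decrypt(x, ct):
    """ElGamal decryption with private key x."""
    c1, c2 = ct
    # c1^(P-1-x) is the inverse of c1^x modulo P (Fermat's little theorem).
    return (c2 * pow(c1, P - 1 - x, P)) % P

def ct_pow(ct, e):
    """Raise both ciphertext components to e; the plaintext becomes m^e."""
    c1, c2 = ct
    return (pow(c1, e, P), pow(c2, e, P))

f = 6                      # stands in for F_s(k), an element of the subgroup
r = 7                      # proxy's blinding exponent, invertible mod Q
blinded = pow(f, r, P)     # what the client learns instead of F_s(k)

x = 5                      # DB private key (hypothetical)
h = pow(G, x, P)           # DB public key
ct = encrypt(h, blinded, y=3)        # client encrypts the blinded value to DB

r_inv = pow(r, -1, Q)                # proxy computes r^-1 mod Q
unblinded_ct = ct_pow(ct, r_inv)     # proxy unblinds without decrypting

print(decrypt(x, unblinded_ct) == f)  # True
```

The client never sees F_s(k) in the clear, the proxy never sees the plaintext inside the ciphertext, and only the DB recovers F_s(k) after decryption.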

Slide31
Scalable Protocol Architecture

Participants

Client-Facing Proxies (share secret s)

Proxy Decryption Oracles (share PRX key)

Front-End DB Tier (share DB key)

Back-End DB Storage (partition F_s keyspace)

Slide32
Evaluation

Scalable architecture implemented

Basic CR-PDA / PDA protocol and encrypted-OPRF protocol w/ Batch OT

~5000 lines of threaded C++, GnuPG for crypto

Testbed of 2 GHz Linux machines

Algorithm             Parameter   Value
RSA / ElGamal         key size    1024 bits
Oblivious Transfer    k           80
AES                   key size    256 bits

Slide33
Throughput vs. participant batch size

Single CPU core for DB and proxy each

Slide34
Maximum throughput per server

Four CPU cores for DB and proxy (each)

Slide35
Throughput scalability

Number CPU cores per DB and proxy (each)

Slide36
Summary

Privacy-Preserving Data Aggregation protects:

Participants: Do not reveal who submitted what

Keywords: Only reveal values / released keys

Novel composition of crypto primitives

Based on assumption that 2+ known parties don’t collude

Efficient implementation of architecture

Scales linearly with computing resources

Ex: Millions of suspected IPs in hours

Of independent interest…

Introduced encrypted OPRF protocol

First implementation/validation of Batch OT protocol