
SecureML : A System for Scalable - PowerPoint Presentation

Uploaded On 2018-03-15

Presentation Transcript

Slide1

SecureML: A System for Scalable Privacy-Preserving Machine Learning

Payman Mohassel and Yupeng Zhang

Slide2

Machine Learning

More data → Better Models

Image processing

Speech recognition

Ad recommendation

Playing Go

Slide3

Machine Learning

More data → Better Models

Image processing

Speech recognition

Ad recommendation

Playing Go

Data Privacy?

Slide4

Example: Fraud Detection

Card #      Time         Location    Amount
xxxxxxxx    8/8/2016     CA, USA     xx.xx     ……
xxxxxxxx    x/xx/xxxx    xx,xxx      xx.xx     ……

Name     SSN
Alice    xxxxx    ……

Name     SSN      …..    Products
Alice    xxxxx    …..    xxx         ……

Slide5

Privacy-preserving Machine Learning

Decision trees [LP00, …]

k-means clustering [JW05, BO07, …]

SVM classification [YVJ06, VYJ08, …]

Linear regression [DA01, DHC04, SKLR04, NWI+13, GSB+16, GLL+16, …]

Logistic regression [SNT07, WTK+13, AHTW16, …]

Neural networks [SS15, GDL+16, …]

……

Slide6

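Two-server Model

[Diagram: each user splits its data into two shares (data = share + share), one per server; the two servers run a two-party computation to train the model]

More efficient than general multi-party computation (MPC) and fully homomorphic encryption (FHE)

Users can be offline during training

Used in much prior work [NWI+13, NIW+13, GSB+16, …]

Slide7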

Our Contributions

New protocols for linear regression, logistic regression and neural network training

  Secret sharing and arithmetic with precomputed triplets + garbled circuits

System:

  54 – 1270× faster than prior work

  Scales to large datasets (1 million records, 5000 features for logistic regression)

Slide8

Linear Regression

Slide9

Linear Regression

Input: data-value pairs (x, y)

Output: model w

Stochastic Gradient Descent (SGD):

  Initialize w randomly

  Select a random sample (x, y)

  Update w

Slide10
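The SGD loop for linear regression can be sketched in plaintext as follows. This is only an illustration, not the authors' implementation: the transcript does not preserve the update formula, so the squared-error update w := w − lr · (x·w − y) · x, the learning rate `lr`, the step count and the name `sgd_linear_regression` are all assumptions.

```python
import random

def sgd_linear_regression(samples, lr=0.05, steps=5000, seed=0):
    """Plaintext SGD for a linear model y ~ w . x (assumed squared-error loss)."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    # Initialize w randomly.
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(steps):
        # Select a random sample (x, y).
        x, y = rng.choice(samples)
        # Prediction error on this sample.
        err = sum(wj * xj for wj, xj in zip(w, x)) - y
        # Assumed update rule: w := w - lr * err * x.
        w = [wj - lr * err * xj for wj, xj in zip(w, x)]
    return w

# Toy data generated from y = 2*x + 1; the constant feature 1.0 models the bias.
data = [([x / 10.0, 1.0], 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w = sgd_linear_regression(data)
```

On this noiseless toy data the learned weights approach the slope 2 and the bias 1.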

Secret Sharing

A value $a$ is split into two shares, one per server:

$a_0 = a - r \bmod p$

$a_1 = r \bmod p$

Slide11

Secret Sharing and Addition

One server holds $a_0, b_0$; the other holds $a_1, b_1$. Each server adds its own shares:

$c_0 = a_0 + b_0$

$c_1 = a_1 + b_1$

$c_0 + c_1 = a + b$

Slide12
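Additive sharing and share-wise addition can be sketched as follows. The modulus `P`, the uniformly random choice of r and the helper names `share` and `reconstruct` are assumptions made for illustration; the slides only give the two share equations.

```python
import secrets

P = 2**61 - 1  # hypothetical modulus p; the slides do not fix a value

def share(a, p=P):
    """Split a into (a0, a1) with a0 = a - r mod p and a1 = r mod p."""
    r = secrets.randbelow(p)  # assumption: r is drawn uniformly at random
    return (a - r) % p, r % p

def reconstruct(s0, s1, p=P):
    """Recombine two shares into the secret."""
    return (s0 + s1) % p

a0, a1 = share(20)
b0, b1 = share(22)
# Each server adds the two shares it holds; no communication is needed.
c0 = (a0 + b0) % P
c1 = (a1 + b1) % P
total = reconstruct(c0, c1)
```

Neither share alone reveals anything about a, because each is uniformly distributed modulo p.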

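Secret Sharing and Multiplication Triplets

One server holds $a_0, b_0$ and the triplet shares $u_0, v_0, z_0$; the other holds $a_1, b_1$ and $u_1, v_1, z_1$, where

$(u_0 + u_1) \times (v_0 + v_1) = (z_0 + z_1)$

The servers exchange $a_0 - u_0,\ b_0 - v_0$ and $a_1 - u_1,\ b_1 - v_1$, so both learn

$e = a - u, \quad f = b - v$

Each server then computes its share of the product:

$c_0 = -ef + a_0 f + e b_0 + z_0$

$c_1 = a_1 f + e b_1 + z_1$

$c_0 + c_1 = a \times b$

Slide13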
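The triplet-based multiplication can be checked with a short simulation in which both servers run in one process. The modulus `P` and the trusted-dealer function `make_triplet` are assumptions standing in for the offline phase; they are not the protocol the authors use to generate triplets.

```python
import secrets

P = 2**61 - 1  # hypothetical modulus

def share(a, p=P):
    """Additive sharing: a0 = a - r mod p, a1 = r."""
    r = secrets.randbelow(p)
    return (a - r) % p, r

def make_triplet(p=P):
    """Stand-in for the offline phase: a dealer shares random u, v and z = u*v."""
    u, v = secrets.randbelow(p), secrets.randbelow(p)
    return share(u, p), share(v, p), share(u * v % p, p)

def multiply_shared(a_sh, b_sh, triplet, p=P):
    (a0, a1), (b0, b1) = a_sh, b_sh
    (u0, u1), (v0, v1), (z0, z1) = triplet
    # The servers exchange a_i - u_i and b_i - v_i, then both reconstruct e and f.
    e = (a0 - u0 + a1 - u1) % p
    f = (b0 - v0 + b1 - v1) % p
    # Local computation of the output shares, as in the formulas for c0 and c1.
    c0 = (-e * f + a0 * f + e * b0 + z0) % p
    c1 = (a1 * f + e * b1 + z1) % p
    return c0, c1

c0, c1 = multiply_shared(share(6), share(7), make_triplet())
product = (c0 + c1) % P
```

Expanding $-ef + af + eb + z$ with $e = a - u$, $f = b - v$ and $z = uv$ leaves exactly $ab$, which is why the two output shares sum to the product.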

Privacy-preserving Linear Regression

SGD:

  Users secret share data and values (x, y)

  Servers initialize and secret share the model w

  Run SGD using pre-computed multiplication triplets

Decimal numbers?

Slide14

Decimal Multiplications in Integer Fields

Fixed-point multiplication:

  a (16-bit decimal part) × b (16-bit decimal part) = c (32-bit decimal part)

  Same as integer multiplication

  Decimal part grows → overflow

Truncation: c (32-bit decimal part) → c (16-bit decimal part)

Slide15
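The growth of the decimal part under fixed-point multiplication, and its repair by truncation, can be seen in a few lines of plaintext arithmetic. The helper names `encode` and `decode` are hypothetical; the 16-bit decimal part follows the slide.

```python
FRAC_BITS = 16  # 16-bit decimal (fractional) part, as on the slide

def encode(x):
    """Real number -> fixed-point integer with FRAC_BITS fractional bits."""
    return int(round(x * (1 << FRAC_BITS)))

def decode(n, frac_bits=FRAC_BITS):
    """Fixed-point integer -> real number."""
    return n / (1 << frac_bits)

a, b = encode(1.5), encode(2.25)
raw = a * b                    # plain integer multiplication: 32 fractional bits
truncated = raw >> FRAC_BITS   # drop the low 16 bits to get back to 16 fractional bits
```

Without the truncation step every further multiplication would add another 16 fractional bits, and the value would soon overflow the field.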

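Truncation on shared values

Each server holds shares $a_i, b_i$ and a share $c_i$ of the product $c = a \times b$.

Truncation: each server truncates its own share ($c_0$ or $c_1$); the reconstructed $c$ then differs from the truncated product by +1, 0 or -1 in the last bit, with high probability.

Slide16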
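Local truncation of shares can be sketched as below. The slide does not spell out the rule, so the details here are assumptions: shares live in the ring Z_{2^64}, negative values use wrap-around encoding, one party shifts its share directly and the other shifts the negation of its share. The names `to_ring`, `from_ring` and `truncate_share` are hypothetical.

```python
import secrets

L_BITS = 64      # assumption: shares live in the ring Z_{2^64}
FRAC_BITS = 16
MOD = 1 << L_BITS

def to_ring(x):
    """Signed integer -> ring element (negative values wrap around)."""
    return x % MOD

def from_ring(v):
    """Ring element -> signed integer (upper half of the ring is negative)."""
    return v - MOD if v >= MOD // 2 else v

def share(v):
    """Additive sharing in the ring: s0 = v - r, s1 = r."""
    r = secrets.randbelow(MOD)
    return (v - r) % MOD, r

def truncate_share(s, party):
    """Party 0 shifts its share; party 1 shifts the negated share and negates back."""
    if party == 0:
        return s >> FRAC_BITS
    return (MOD - ((MOD - s) >> FRAC_BITS)) % MOD

def truncated_value(s0, s1):
    """Reconstruct the value after both parties truncated locally."""
    t0, t1 = truncate_share(s0, 0), truncate_share(s1, 1)
    return from_ring((t0 + t1) % MOD)

c = 14495514624                    # 1.5 * 2.25 with 32 fractional bits
s0, s1 = share(to_ring(c))
approx = truncated_value(s0, s1)   # close to c >> FRAC_BITS, up to the last bit
```

The scheme fails only when the random mask falls in a narrow range determined by the size of the value, which is why the error bound holds with high probability rather than always.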

Privacy-preserving Linear Regression

SGD:

  Users secret share data and values (x, y)

  Servers initialize and secret share the model w

  Run SGD using pre-computed multiplication triplets

  Truncate the shares after every multiplication

Slide17

Effects of Our Technique

4-8× faster than fixed-point multiplication in a garbled circuit

Slide18

Logistic Regression

Slide19

Logistic Regression

Input: data-value pairs (x, y), with y = 0 or 1

Output: model w

Slide20

Privacy-preserving Logistic Regression

[Plot: the logistic function compared with degree-10 and degree-2 polynomial approximations]

Slide21

Privacy-preserving Logistic Regression

[Plot: the logistic function compared with our function]

Almost the same accuracy as the logistic function

Much faster than polynomial approximation

Secure-computation-friendly activation function

Slide22
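The transcript shows the secure-computation-friendly activation only as a plot. The sketch below assumes it is the piecewise-linear function that is 0 below -1/2, x + 1/2 between -1/2 and 1/2, and 1 above 1/2; that definition and the name `friendly_activation` are assumptions, not taken from the slides.

```python
import math

def logistic(u):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-u))

def friendly_activation(u):
    """Assumed piecewise-linear replacement: 0 below -1/2, u + 1/2 in between, 1 above 1/2."""
    if u < -0.5:
        return 0.0
    if u > 0.5:
        return 1.0
    return u + 0.5

points = [-4.0, -0.5, 0.0, 0.5, 4.0]
pairs = [(logistic(u), friendly_activation(u)) for u in points]
```

A function of this shape needs only comparisons and one addition, which is cheap inside a garbled circuit, whereas a polynomial approximation needs several secure multiplications.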

Privacy-preserving Logistic Regression

[Plot: the logistic function compared with our function f]

Run our protocol for linear regression

Switch to garbled circuit for f [DSZ15]

Switch back to arithmetic secret sharing

Slide23

Vectorization

Mini-batch SGD:

  Take a batch of B records and update w by their average

  Converges faster and more smoothly

  Fast matrix-vector/matrix-matrix multiplication

Slide24

Vectorization

Mini-batch SGD:

  Multiplication triplets for matrix-vector/matrix-matrix multiplications

  2× online computational overhead compared to plaintext training

  4-66× offline speedup

Slide25
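A matrix multiplication triplet works like the scalar one, with matrices in place of field elements. The sketch below simulates both servers in one process using plain Python lists; the modulus `P`, the dealer function `matrix_triplet` and all helper names are assumptions for illustration.

```python
import secrets

P = 2**61 - 1  # hypothetical modulus

def rand_matrix(rows, cols):
    return [[secrets.randbelow(P) for _ in range(cols)] for _ in range(rows)]

def add(X, Y):
    return [[(x + y) % P for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[(x - y) % P for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) % P for col in zip(*Y)] for row in X]

def share(X):
    """Additive sharing of a matrix: X0 = X - R, X1 = R."""
    R = rand_matrix(len(X), len(X[0]))
    return sub(X, R), R

def matrix_triplet(n, k, m):
    """Stand-in for the offline phase: shares of random U (n x k), V (k x m), Z = U*V."""
    U, V = rand_matrix(n, k), rand_matrix(k, m)
    return share(U), share(V), share(matmul(U, V))

def matmul_shared(A_sh, B_sh, triplet):
    (A0, A1), (B0, B1) = A_sh, B_sh
    (U0, U1), (V0, V1), (Z0, Z1) = triplet
    # One exchange reveals the masked matrices E = A - U and F = B - V.
    E = add(sub(A0, U0), sub(A1, U1))
    F = add(sub(B0, V0), sub(B1, V1))
    # Output shares: C0 = -E*F + A0*F + E*B0 + Z0 and C1 = A1*F + E*B1 + Z1.
    C0 = add(add(sub(matmul(A0, F), matmul(E, F)), matmul(E, B0)), Z0)
    C1 = add(add(matmul(A1, F), matmul(E, B1)), Z1)
    return C0, C1

X = [[1, 2, 3], [4, 5, 6]]   # a batch of 2 records with 3 features
w = [[1], [0], [2]]          # model as a 3 x 1 column
C0, C1 = matmul_shared(share(X), share(w), matrix_triplet(2, 3, 1))
result = add(C0, C1)
```

One triplet covers the whole batch product, so the masked matrices are exchanged once per multiplication instead of once per element pair.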

Neural Networks

Slide26

Neural Networks

Mini-batch SGD: coefficient matrices are updated by closed-form formulas using matrix and element-wise multiplications

Slide27

Experimental Results

Slide28

Experimental Results: Linear Regression

100,000 records, 500 features

54 - 1270× faster than systems in [NWI+13, GSB+16]

Supports arbitrary partitioning of data

[Bar chart: time (s), axis ticks from 1 to 10,000; offline and online phases, with and without client-aided triplets; bar labels as extracted: 378, 1.4, 8782, 20, 20, 4.9, 465, 141]

LAN: 1.2GB/s, delay 0.17ms

WAN: 9MB/s, delay 72ms

Slide29

Experimental Results: Logistic Regression

100,000 records, 500 features

[Bar chart: time (s), axis ticks from 1 to 10,000; offline and online phases, with and without client-aided triplets; bar labels as extracted: 378, 9.6, 8782, 20, 20, 11.5, 652, 422]

LAN: 1.2GB/s, delay 0.17ms

WAN: 9MB/s, delay 72ms

Scales to 1 million records and 5,000 features

Slide30

Experiments: Neural Networks

2 hidden layers with 128 neurons each

LAN: 25,200 sec online + offline

Plaintext training: 700 sec. 35× overhead.

WAN: 220,000* sec online + offline

Slide31

Summary

Privacy-preserving linear regression, logistic regression and neural networks

  Decimal arithmetic on integer fields

  Secure-computation-friendly activation functions

  Vectorization (mini-batch SGD)

System:

  Orders of magnitude faster than prior work

  Scales to large datasets

Slide32

Future Work

Privacy-preserving Neural Networks

  Accuracy: softmax, convolutional neural networks, etc.

  Efficiency: partitioning, parallelization, etc.

Multi-party model

Thank you!!!

Q&A

Slide33

Large Scale Logistic Regression

1,000,000 records, 5,000 features

LAN: 2,500 sec client-aided offline, 623.5 sec online

Slide34

Garbled Circuits

AND gate with inputs a, b and output c

Truth Table

a    b    c
0    0    0
0    1    0
1    0    0
1    1    1

Garbled Table

[Columns a, b, c; the four garbled rows are not preserved in the transcript]

Slide35
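A garbled AND gate can be illustrated with a toy construction. This is a classroom-style sketch, not the garbling scheme used by the authors: the hash-based encryption, the all-zero tag for recognising the right row and the names `garble_and_gate` and `evaluate` are all assumptions.

```python
import hashlib
import secrets

LABEL_LEN = 16
TAG = bytes(LABEL_LEN)  # all-zero padding used to recognise a correct decryption

def _pad(key_a, key_b):
    """Derive a 32-byte one-time pad from the two input labels."""
    return hashlib.sha256(key_a + key_b).digest()

def _xor(x, y):
    return bytes(p ^ q for p, q in zip(x, y))

def garble_and_gate():
    """Return (wire labels, garbled table) for c = a AND b."""
    # Each wire gets two random labels: index 0 encodes bit 0, index 1 encodes bit 1.
    labels = {w: (secrets.token_bytes(LABEL_LEN), secrets.token_bytes(LABEL_LEN))
              for w in "abc"}
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out = labels["c"][bit_a & bit_b]
            # Encrypt the output label under the pair of input labels.
            table.append(_xor(_pad(labels["a"][bit_a], labels["b"][bit_b]), out + TAG))
    secrets.SystemRandom().shuffle(table)  # hide which row belongs to which inputs
    return labels, table

def evaluate(table, key_a, key_b):
    """Decrypt the one row that opens under the two labels held by the evaluator."""
    pad = _pad(key_a, key_b)
    for row in table:
        plain = _xor(row, pad)
        if plain[LABEL_LEN:] == TAG:
            return plain[:LABEL_LEN]
    raise ValueError("no row decrypted correctly")

labels, table = garble_and_gate()
out_label = evaluate(table, labels["a"][1], labels["b"][1])
```

The evaluator holds one label per input wire and learns one output label, but the labels themselves do not reveal which bits they stand for.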

Garbled Circuits

[Diagram: server 0 and server 1; labels $k_b$ and $k_0, k_1$; bits $b_0$ and $b_1$ with $b_0 + b_1 = b$]

Slide36

Switching Between Secret Sharing and GC

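server 0 holds $x_0$; server 1 holds $x_1$

$C(x_0, x_1)$: modulo addition circuit, then output the most significant bit

Garbled circuit C: labels $k_b$ and $k_0, k_1$; bits $b_0$ and $b_1$ with $b_0 + b_1 = b$

Oblivious transfer with choice bit $b_1$:

$m_0 = x_0 b_0 + r$

$m_1 = x_0 (1 - b_0) + r$

$\mathrm{OT}(b_1)$: $m = x_0 b + r$

Oblivious transfer with choice bit $b_0$:

$m_0 = x_1 b_1 + r'$

$m_1 = x_1 (1 - b_1) + r'$

$\mathrm{OT}(b_0)$: $m = x_1 b + r'$

Subtracting $r$ and $r'$ leaves shares of $f(x) = x \times (x > 0)$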
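The two oblivious transfers can be simulated to check that the servers end up with additive shares of x × b. Several points are assumptions: the OT is idealised as a plain function, the bit shares combine as b = b0 XOR b1 (reading "b0 + b1 = b" modulo 2), arithmetic is in Z_{2^64}, and the assignment of the masks r and r' to the output shares is one consistent choice.

```python
import secrets

MOD = 1 << 64  # assumption: arithmetic shares live in Z_{2^64}

def ideal_ot(m0, m1, choice):
    """Idealised 1-out-of-2 OT: the receiver learns only the chosen message."""
    return m1 if choice else m0

def multiply_by_shared_bit(x0, x1, b0, b1):
    """Return additive shares of x * b, where x = x0 + x1 and b = b0 XOR b1."""
    r = secrets.randbelow(MOD)        # mask chosen by server 0
    r_prime = secrets.randbelow(MOD)  # mask chosen by server 1
    # Server 0 sends, server 1 chooses with b1 and learns x0*b + r.
    m = ideal_ot((x0 * b0 + r) % MOD, (x0 * (1 - b0) + r) % MOD, b1)
    # Server 1 sends, server 0 chooses with b0 and learns x1*b + r'.
    m_prime = ideal_ot((x1 * b1 + r_prime) % MOD, (x1 * (1 - b1) + r_prime) % MOD, b0)
    y0 = (m_prime - r) % MOD   # server 0's output share
    y1 = (m - r_prime) % MOD   # server 1's output share
    return y0, y1

x = 1234
x0 = secrets.randbelow(MOD)
x1 = (x - x0) % MOD
y0, y1 = multiply_by_shared_bit(x0, x1, b0=1, b1=0)   # b = 1, i.e. x > 0
relu_of_x = (y0 + y1) % MOD
```

When the bit shares combine to b = 0 the same call returns shares of 0, so the output matches x × (x > 0) in both cases.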