Slide1
GraphSC: Parallel Secure Computation Made Easy
Kartik Nayak
With Xiao Shaun Wang, Stratis Ioannidis, Udi Weinsberg, Nina Taft, Elaine Shi
Slide2
Data Mining on User Data
Users send their data to a data mining engine, which produces a model
Privacy concern!
Slide3
Companies Computing on Private Data
Graph representing social connections
Graph representing professional connections
Compute user’s influence in both circles
Slide4
Companies want to run machine learning algorithms
Users/Companies do NOT want to reveal data
Can we enable this in practice?
Slide5
Cryptography to the rescue: Secure Multiparty Computation
Ensures that we learn only the outcome
Slide6
Key Challenges
1. Generic Solutions
A lot of prior work improves individual algorithms
Departure from the one-at-a-time approach
Slide7
Key Challenges
2. Convert Program to Run on Secure Computation (cost of obliviousness)
Slide8
Key Challenges
3. Parallelizability
There’s a lot of data: maintain the benefits of parallelism available in the insecure setting
With cryptography, computation is expensive
Slide9
Key Contributions
Slide10
Key Contributions
Generic Framework for “Graph-parallel” Algorithms
Pregel by [logo]
PageRank
Matrix Factorization using gradient descent
Matrix Factorization using ALS
Risk Minimization using ADMM
And many more
Challenge: Generic Solutions
Slide11
Key Contributions
Efficiently Convert Graph-parallel Programs to Oblivious Programs
Total work blowup is O(log |V|)
Blowup for naïve solution: O(|V|) for sparse graphs
Challenge: Convert program to run on Secure Computation
Slide12
Key Contributions
Maintain Parallelizability
Depth of the computation is O(log |V|)
Matrix Factorization (4K ratings, 32 threads): [NIWJTB’13] 1.4 hours; GraphSC < 4 mins
Challenge: Parallelizability
Slide13
Key Contributions
1. Generic Framework for Graph-parallel Algorithms
2. Efficiently Convert to Oblivious Programs
3. Maintain Parallelizability
Slide14
Programmer’s favorite model:
function bs(val, s, t)
  mid = (s + t) / 2;
  if (val < mem[mid])
    bs(val, s, mid)
  else
    bs(val, mid+1, t)
Cryptographer’s favorite model: circuits
Slide15
Programmer’s model: Programs
Oblivious Programs
Cryptographer’s model: Circuits
Intuitively, program traces should not depend on input data
Slide16
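Binary search reads mem[mid] at positions that depend on val, so its access trace leaks information about the input. The Python sketch below is illustrative only and is not GraphSC code: the function names are hypothetical, and it models just the memory access pattern in the clear, not a secure computation protocol. It contrasts the binary search trace with a linear scan that touches every cell in the same order for every input.

```python
def binary_search_trace(mem, val):
    """Return the indices an ordinary binary search reads (input-dependent)."""
    trace = []
    lo, hi = 0, len(mem)
    while lo < hi:
        mid = (lo + hi) // 2
        trace.append(mid)          # this read position depends on val
        if val == mem[mid]:
            break
        if val < mem[mid]:
            hi = mid
        else:
            lo = mid + 1
    return trace


def oblivious_lookup(mem, val):
    """Linear scan: reads mem[0], mem[1], ... regardless of val.

    Returns (found, trace). The trace is identical for every val,
    which is the property an oblivious program needs.
    """
    trace = []
    found = 0
    for i in range(len(mem)):
        trace.append(i)                  # same read order for every input
        found |= int(mem[i] == val)      # accumulate without branching on data
    return bool(found), trace


mem = [1, 3, 5, 7, 9, 11, 13, 15]
print(binary_search_trace(mem, 1), binary_search_trace(mem, 15))
print(oblivious_lookup(mem, 1)[1] == oblivious_lookup(mem, 15)[1])
```

The scan costs O(n) reads per lookup instead of O(log n); that gap is one simple instance of the cost of obliviousness.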
Programmer’s favorite model:
function bs(val, s, t)
  mid = (s + t) / 2;
  if (val < mem[mid])
    bs(val, s, mid)
  else
    bs(val, mid+1, t)
Cryptographer’s favorite model: circuits
Slide17
Programmer’s model: Programs
Oblivious Programs
Cryptographer’s model: Circuits
Intuitively, program traces should not depend on input data
[Diagram labels: Easy, Hard]
Slide18
Achieving Parallelism
Goal: Low Depth Circuits
Oblivious Parallel RAM [BCP’14]: polylogarithmic blowup, not practical
GraphSC: O(log |V|) blowup
Slide19
“Graph-parallel” algorithms [LGKB’10, GLGBG’12, MABDHLC’10, ZCF’10]
Pregel by [logo]
Slide20
Graph-parallel Algorithms
[Figure: example graph with vertices A, B, C, D and numeric vertex and edge data]
Scatter: Send data to edges
Gather: Aggregate data from edges
Apply: Perform some computation
Slide21
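In the clear, one round of the scatter, gather, and apply operations can be sketched as follows. This is a minimal Python sketch under stated assumptions, not GraphSC's API: the function names are hypothetical, scatter is assumed to copy each source vertex's data onto its outgoing edges, gather is assumed to sum the data on each vertex's incoming edges, and the example graph values are made up rather than taken from the figure.

```python
def scatter(vertex_data, edges):
    """Send data to edges: each edge (src, dst) receives src's data."""
    return [(src, dst, vertex_data[src]) for (src, dst) in edges]


def gather(vertex_data, edge_data):
    """Aggregate data from edges: sum the values on each vertex's incoming edges."""
    agg = {v: 0 for v in vertex_data}
    for _src, dst, value in edge_data:
        agg[dst] += value
    return agg


def apply_vertices(agg, f):
    """Perform some computation f on every vertex's aggregated value."""
    return {v: f(a) for v, a in agg.items()}


# Hypothetical example graph (assumed values, for illustration only).
vertex_data = {"A": 1, "B": 2, "C": 4, "D": 7}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

edge_data = scatter(vertex_data, edges)
aggregated = gather(vertex_data, edge_data)
updated = apply_vertices(aggregated, lambda x: 2 * x)
print(updated)
```

Note that this insecure version reveals the graph structure through which entries of agg it touches; hiding that access pattern is what the oblivious versions have to pay for.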
Obliviousness of Graph-parallel Algorithms
Do not reveal edge/vertex data
Do not reveal structure of the graph
Naïve Solution: O(|V|^2)
Our Solution: O(|E| log |V|)
[Figure: example graph with vertices A, B, C, D and numeric data]
Slide22
Oblivious Gather – Key Trick
[Figure: diagram with labels 1, 2, 3, 4]
Slide23
Oblivious Gather – Key Trick
Oblivious Sort with (v, isVertex)
Single pass
Sort: O(|E| log |V|)
Single pass: O(|E|)
Oblivious Gather: O(|E| log |V|)
Gather in clear: O(|E|)
Slide24
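The sort-then-single-pass idea behind oblivious gather can be sketched in Python as below. Several things here are assumptions for illustration rather than GraphSC's implementation: the record layout, the function name, sum as the aggregation operator, and the use of Python's built-in sort as a stand-in for an oblivious sort (for example a sorting network). The pass shown is the sequential form and runs in the clear.

```python
def oblivious_gather(vertices, edges):
    """Aggregate incoming edge data per vertex via sort + one linear pass.

    vertices: list of vertex ids
    edges:    list of (src, dst, data) tuples
    Returns the processed record list: (key, is_vertex, data).
    """
    # One record per vertex and per edge, keyed by the vertex that gathers.
    records = [(v, 1, 0) for v in vertices]
    records += [(dst, 0, data) for (_src, dst, data) in edges]

    # Sort by (v, isVertex): edges into v come first, then vertex v itself.
    # ASSUMPTION: list.sort stands in for a data-oblivious sort.
    records.sort(key=lambda r: (r[0], r[1]))

    out = []
    acc = 0
    for key, is_vertex, data in records:
        # The same arithmetic runs on every record, so the pass does not
        # branch on whether the record is a vertex or an edge.
        new_data = is_vertex * acc + (1 - is_vertex) * data
        acc = (1 - is_vertex) * (acc + data)   # reset after each vertex
        out.append((key, is_vertex, new_data))
    return out


result = oblivious_gather(
    [1, 2, 3, 4],
    [(1, 2, 1), (1, 3, 1), (2, 3, 2), (3, 4, 4)],
)
gathered = {key: data for key, is_vertex, data in result if is_vertex}
print(gathered)
```

The list always has |V| + |E| records and is scanned front to back, so the access pattern depends only on the sizes of the graph, not on which vertices are connected.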
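Complexity of Our Algorithms
Scatter / Gather:
  Sequential Insecure (Total Work): O(|E|)
  Naïve Oblivious (Total Work): O(|V|^2)
  Parallel Oblivious (Total Work): O(|E| log |V|)
  Parallel Oblivious (Parallel Time): O(log |V|)
Apply:
  Sequential Insecure (Total Work): O(|V|)
  Naïve Oblivious (Total Work): O(|E|)
  Parallel Oblivious (Total Work): O(|E|)
  Parallel Oblivious (Parallel Time): O(1)
Slide25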
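The blowup figures quoted in the key contributions follow from the gather costs stated in the deck. As a worked check, treating the stated bounds as tight and taking a sparse graph to mean |E| = Θ(|V|) (an assumption about what "sparse" means here):

```latex
\begin{align*}
\text{blowup} &= \frac{\text{oblivious total work}}{\text{insecure total work}} \\[4pt]
\text{GraphSC gather:} \quad
  \frac{|E| \log |V|}{|E|} &= \log |V| \\[4pt]
\text{naive gather:} \quad
  \frac{|V|^2}{|E|} &= \frac{|V|^2}{\Theta(|V|)} = \Theta(|V|)
  \quad \text{for sparse graphs}
\end{align*}
```

So for sparse graphs the naïve approach costs a factor of about |V| over the insecure computation, while the sort-based gather costs a factor of about log |V|.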
Algorithms on GraphSC
Histogram computation
PageRank
Matrix Factorization using gradient descent
Matrix Factorization using alternating least squares
Bellman-Ford shortest path
Bipartite matching
Parallel empirical risk minimization through alternating direction method of multipliers (ADMM)
Pregel by [logo]
Slide26
Experimental Setup
Cloud 1 (Garblers)
Cloud 2 (Evaluators)
Two Scenarios:
LAN
Across Data Centers (WAN)
Slide27
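Key Evaluation Results
Parallel time is measured with 32 processors.
Histogram: input size 1K – 0.5M, parallel time 4 sec – 34 min
PageRank (1 iteration): input size 4K – 128K, parallel time 20 sec – 15.5 min
Matrix Factorization using GD (1 iteration): input size 1K – 32K, parallel time 47 sec – 34 min
Matrix Factorization using ALS (1 iteration): input size 64 – 4K, parallel time 2 min – 2.35 hours
Slide28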
Running at Scale
Matrix Factorization using gradient descent: 1M ratings, 6K users, 4K movies [KBV’09]
7-machine cluster, 128 processors, 525 GB RAM
Time taken: ~13 hours (1 iteration)
[NIWJTB’13] max: 16K ratings (64x smaller data)
4K ratings, 32 threads: [NIWJTB’13] 1.4 hours; GraphSC < 4 mins
We used only 7 machines!
13 hours -> a few mins by using more machines
Slide29
Across Data Centers
PageRank
Garblers: Oregon
Evaluators: N. Virginia
Bandwidth provisioned: 2 Gbps
Time reduces linearly with increasing processors
Slide30
Conclusion
GraphSC is a parallel secure computation framework for graph-parallel algorithms
www.oblivm.com
Thank You!
kartik@cs.umd.edu