Node Differential Privacy. Sofya Raskhodnikova, Penn State University. Joint work with Shiva Kasiviswanathan (GE Research), Kobbi Nissim (Ben-Gurion U. and Harvard U.)
Slide1
Graph Analysis with Node Differential Privacy
Sofya Raskhodnikova, Penn State University
Joint work with Shiva Kasiviswanathan (GE Research), Kobbi Nissim (Ben-Gurion U. and Harvard U.), Adam Smith (Penn State)
Slide2
Publishing information about graphs
Many datasets can be represented as graphs:
  “Friendships” in online social networks
  Financial transactions
  Email communication
  Romantic relationships (American J. Sociology; Bearman, Moody, Stovel)
Image source: http://community.expressor-software.com/blogs/mtarallo/36-extracting-data-facebook-social-graph-expressor-tutorial.html
Privacy is a big issue!
Slide3
Differential privacy for graph data
Differential privacy [Dwork McSherry Nissim Smith 06]
An algorithm A is $\epsilon$-differentially private if for all pairs of neighbors $G, G'$ and all sets of answers S: $\Pr[A(G) \in S] \le e^{\epsilon} \cdot \Pr[A(G') \in S]$
[Diagram: users give their data to a trusted curator, who holds graph G and runs algorithm A; A receives queries from and returns answers to government, researchers, businesses (or) a malicious adversary.]
Image source: http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
Slide4
Two variants of differential privacy for graphs
Edge differential privacy: two graphs are neighbors if they differ in one edge.
Node differential privacy: two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edges.
[Figure: for each variant, an example pair of neighboring graphs G and G'.]
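The two neighbor relations can be checked mechanically. The sketch below is illustrative only: the adjacency-dictionary representation (`{node: set_of_neighbors}`) and all function names are assumptions made for this example, not part of the talk.

```python
def edge_set(adj):
    """Undirected edges of a graph given as {node: set_of_neighbors}."""
    return {frozenset((u, v)) for u, nbrs in adj.items() for v in nbrs}


def remove_node(adj, v):
    """Delete node v together with all of its adjacent edges."""
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}


def are_edge_neighbors(g, h):
    """Same node set, and the edge sets differ in exactly one edge."""
    return set(g) == set(h) and len(edge_set(g) ^ edge_set(h)) == 1


def are_node_neighbors(g, h):
    """One graph is the other with one node (and its edges) deleted."""
    big, small = (g, h) if len(g) > len(h) else (h, g)
    extra = set(big) - set(small)
    if len(big) != len(small) + 1 or len(extra) != 1:
        return False
    return remove_node(big, next(iter(extra))) == small


triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
path = {1: {2}, 2: {1, 3}, 3: {2}}   # triangle minus the edge {1, 3}
single_edge = {1: {2}, 2: {1}}       # triangle minus node 3
```

Here `triangle` and `path` are edge neighbors, while `triangle` and `single_edge` are node neighbors: deleting node 3 removes two edges at once, which is why node privacy is the stronger requirement.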
Slide5
Node differentially private analysis of graphs
Two conflicting goals: utility and privacy
  Impossible to get both in the worst case
  Previously: no node differentially private algorithms that are accurate on realistic graphs
[Diagram: users, a trusted curator holding graph G and running A, and queries and answers exchanged with government, researchers, businesses (or) a malicious adversary.]
Image source: http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
Slide6
Our contributions
First node differentially private algorithms that are accurate for sparse graphs
  node differentially private for all graphs
  accurate for a subclass of graphs, which includes
    graphs with sublinear (not necessarily constant) degree bound
    graphs where the tail of the degree distribution is not too heavy
    dense graphs
Techniques for node differentially private algorithms
Methodology for analyzing the accuracy of such algorithms on realistic networks
Concurrent work on node privacy [Blocki Blum Datta Sheffet 13]
Slide7
Our contributions: algorithms
Node differentially private algorithms for releasing
  number of edges
  counts of small subgraphs (e.g., triangles, k-triangles, k-stars)
  degree distribution
Accuracy analysis of our algorithms for graphs with not-too-heavy-tailed degree distribution: with $\alpha$-decay for constant $\alpha > 1$
Notation: $\bar{d}$ = average degree; $P(d)$ = fraction of nodes in G of degree $\ge d$
A graph G satisfies $\alpha$-decay if $P(t \cdot \bar{d}) \le t^{-\alpha}$ for all $t > 1$
  Every graph satisfies 1-decay
  Natural graphs (e.g., “scale-free” graphs, Erdos-Renyi) satisfy $\alpha > 1$
[Figure: degree histogram, frequency versus degrees.]
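A small check of the decay condition on a degree sequence can make the definition concrete. This is a sketch under a stated assumption: that $\alpha$-decay means $P(t \cdot \bar{d}) \le t^{-\alpha}$ for every real $t > 1$, with $\bar{d}$ the average degree and $P(x)$ the fraction of nodes of degree at least $x$. The function name and the tolerance are hypothetical. Since $P$ is a step function and $t^{-\alpha}$ is decreasing, it is enough to test the thresholds that equal an actual degree above the average.

```python
def satisfies_alpha_decay(degrees, alpha, tol=1e-12):
    """Check P(t * avg) <= t**(-alpha) for all t > 1 (assumed definition).

    degrees: list of node degrees; alpha: decay exponent (>= 1).
    """
    n = len(degrees)
    avg = sum(degrees) / n
    if avg == 0:
        return True  # no edges: the tail fraction is 0 above degree 0
    for deg in set(degrees):
        if deg > avg:
            t = deg / avg
            # Fraction of nodes whose degree is at least this threshold.
            tail = sum(1 for x in degrees if x >= deg) / n
            if tail > t ** (-alpha) + tol:
                return False
    return True


star = [9] + [1] * 9   # one hub of degree 9 and nine leaves
regular = [3] * 8      # every node has the average degree
```

On these inputs the star passes the check for $\alpha = 1$ (as every graph does, by Markov's inequality) but fails it for $\alpha = 2$, because its single hub sits at five times the average degree. The regular sequence passes for any $\alpha$.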
Slide8
Our contributions: accuracy analysis
Node differentially private algorithms for releasing
  number of edges
  counts of small subgraphs (e.g., triangles, k-triangles, k-stars)
  degree distribution
Accuracy analysis of our algorithms for graphs with not-too-heavy-tailed degree distribution: with $\alpha$-decay for constant $\alpha > 1$
  number of edges: (1+o(1))-approximation
  counts of small subgraphs (e.g., triangles, k-triangles, k-stars): (1+o(1))-approximation
  degree distribution
A graph G satisfies $\alpha$-decay if $P(t \cdot \bar{d}) \le t^{-\alpha}$ for all $t > 1$ (with $\bar{d}$ the average degree and $P(x)$ the fraction of nodes of degree at least $x$)
Slide9
Previous work on differentially private computations on graphs
Edge differentially private algorithms
  number of triangles, MST cost [Nissim Raskhodnikova Smith 07]
  degree distribution [Hay Rastogi Miklau Suciu 09, Hay Li Miklau Jensen 09, Karwa Slavkovic 12]
  small subgraph counts [Karwa Raskhodnikova Smith Yaroslavtsev 11]
  cuts [Blocki Blum Datta Sheffet 12]
Edge private against Bayesian adversary (weaker privacy)
  small subgraph counts [Rastogi Hay Miklau Suciu 09]
Node zero-knowledge private (stronger privacy)
  average degree, distances to nearest connected, Eulerian, cycle-free graphs for dense graphs [Gehrke Lui Pass 12]
Slide10
Challenge for node privacy: high local sensitivity
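Local sensitivity [NRS’07]: $LS_f(G) = \max_{G' : \text{neighbor of } G} |f(G) - f(G')|$
Global sensitivity [DMNS’06]: $GS_f = \max_G LS_f(G)$
For many functions of the data, node $LS_f$ is high.
  Consider adding a node connected to all other nodes.
Examples:
  $f_-(G) = |E(G)|$. Edge $LS$ is 1; node $LS$ is $n$ for all G.
  $f_\triangle(G)$ = # of triangles in G. Edge $LS$ is […]; node $LS$ is $|E(G)|$.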
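The "add a node connected to all other nodes" argument can be replayed on a small graph. The sketch below is illustrative: the adjacency-dictionary representation and the helper names are assumptions made for this example. It shows that this single neighboring graph changes the edge count by $n$ and the triangle count by $|E(G)|$.

```python
from itertools import combinations


def num_edges(adj):
    """Number of undirected edges of a graph given as {node: set_of_neighbors}."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def num_triangles(adj):
    """Number of triangles, by brute force over node triples."""
    return sum(
        1
        for u, v, w in combinations(sorted(adj), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )


def add_universal_node(adj, new):
    """Node-neighbor of adj: one extra node connected to every existing node."""
    g = {u: nbrs | {new} for u, nbrs in adj.items()}
    g[new] = set(adj)
    return g


# A 5-cycle: n = 5 nodes, 5 edges, no triangles.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
bigger = add_universal_node(cycle, 5)

edge_change = num_edges(bigger) - num_edges(cycle)
triangle_change = num_triangles(bigger) - num_triangles(cycle)
```

For the 5-cycle, `edge_change` equals the number of nodes and `triangle_change` equals the number of edges, since every existing edge forms one new triangle with the added node.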
Slide11
“Projections” on graphs of small degree
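Let $\mathcal{G}$ = family of all graphs, $\mathcal{G}_d$ = family of graphs of degree $\le d$.
Notation. $\Delta f$ = node $GS_f$ over $\mathcal{G}$. $\Delta_d f$ = node $GS_f$ over $\mathcal{G}_d$.
Observation. $\Delta_d f$ is low for many useful $f$.
Examples:
  edge count $f_-$: $\Delta_d f_- = d$ (compare to $\Delta f_- = n$)
  triangle count $f_\triangle$: $\Delta_d f_\triangle = \binom{d}{2}$ (compare to $\Delta f_\triangle = \binom{n}{2}$)
Idea: “Project” on graphs in $\mathcal{G}_d$ for a carefully chosen d << n.
Goal: privacy for all graphs
Slide12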
Method 1: Lipschitz extensions
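A function $f'$ is a Lipschitz extension of $f$ from $\mathcal{G}_d$ (graphs of degree at most d) to $\mathcal{G}$ (all graphs) if $f'$ agrees with $f$ on $\mathcal{G}_d$ and the node global sensitivity of $f'$ over $\mathcal{G}$ equals that of $f$ over $\mathcal{G}_d$: $\Delta f' = \Delta_d f$
Release $f'$ via GS framework [DMNS’06]
Requires designing Lipschitz extension for each function $f$
  we base ours on maximum flow and linear and convex programs
[Figure: $\mathcal{G}_d$ drawn inside $\mathcal{G}$; sensitivity is low on $\mathcal{G}_d$ and high on $\mathcal{G}$.]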
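Releasing a value through the global sensitivity framework of [DMNS’06] means adding Laplace noise with scale (sensitivity / $\epsilon$). A minimal sketch follows. The function names are hypothetical, the extension value is taken as an input rather than computed, and plain floating-point sampling like this is for illustration only, not a hardened implementation.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_with_laplace(value, sensitivity, epsilon, rng=random):
    """Return value + Lap(sensitivity / epsilon).

    value:       the Lipschitz extension evaluated on the input graph
    sensitivity: its node global sensitivity over all graphs
    epsilon:     the privacy parameter
    """
    return value + laplace_noise(sensitivity / epsilon, rng)


# Example: an extension value of 120 with sensitivity d = 10, epsilon = 0.5.
noisy = release_with_laplace(120.0, sensitivity=10, epsilon=0.5)
```

The point of the Lipschitz extension is visible in the noise scale: it is proportional to the sensitivity on bounded-degree graphs (here 10) rather than to the sensitivity over all graphs, which for edge counts grows with the number of nodes.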
Slide13
Lipschitz extension of $f_-$ (edge count): flow graph
For a graph G=(V, E), define flow graph of G:
  Add edge $(u, v')$ iff $(u, v) \in E$.
$v_{flow}(G)$ is the value of the maximum flow in this graph.
Lemma. $v_{flow}(G)/2$ is a Lipschitz extension of $f_-$.
Proof: (1) $v_{flow}(G) = 2 f_-(G)$ for all $G \in \mathcal{G}_d$
  (2) $\Delta v_{flow} = 2 \cdot \Delta_d f_- = 2d$
[Figure: flow graph with source s, nodes 1 to 5, copies 1' to 5', and sink t; edges between a node and a copy are labeled 1; nodes 6 and 6' are added in the final build of the slide.]
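The flow-based extension can be tried out with a small max-flow routine. This is a sketch under assumptions the slide text does not spell out: edges from the source s to each node and from each copy to the sink t are given capacity d, and each edge $(u, v')$ has capacity 1. The function names, the adjacency-dictionary representation, and the choice of Edmonds-Karp are illustrative.

```python
from collections import deque


def flow_value(adj, d):
    """Max-flow value in the (assumed) flow graph of G, via Edmonds-Karp."""
    s, t = ("s",), ("t",)
    cap = {s: {}, t: {}}

    def add(u, v, c):
        cap.setdefault(u, {})[v] = cap.get(u, {}).get(v, 0) + c
        cap.setdefault(v, {}).setdefault(u, 0)  # residual (reverse) edge

    for u in adj:
        add(s, ("L", u), d)          # source -> node u, capacity d
        add(("R", u), t, d)          # copy u' -> sink, capacity d
        for v in adj[u]:
            add(("L", u), ("R", v), 1)  # u -> v' iff (u, v) is an edge

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            x = queue.popleft()
            for y, c in cap[x].items():
                if c > 0 and y not in parent:
                    parent[y] = x
                    queue.append(y)
        if t not in parent:
            return total
        # Find the bottleneck capacity, then push flow along the path.
        bottleneck, y = float("inf"), t
        while parent[y] is not None:
            bottleneck = min(bottleneck, cap[parent[y]][y])
            y = parent[y]
        y = t
        while parent[y] is not None:
            x = parent[y]
            cap[x][y] -= bottleneck
            cap[y][x] += bottleneck
            y = x
        total += bottleneck


def extended_edge_count(adj, d):
    """Flow value divided by 2: the candidate Lipschitz extension of the edge count."""
    return flow_value(adj, d) / 2


path = {1: {2}, 2: {1, 3}, 3: {2}}                   # max degree 2
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}  # center has degree 4
```

With d = 2, the path (which has maximum degree 2) gets its true edge count, while the star gets a value below its 4 edges because only d units of flow can pass through the center on each side.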
Slide16
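Lipschitz extensions via linear/convex programs
For a graph G=([n], E), define LP with variables $x_T$ for all triangles $T$:
  Maximize $\sum_{T} x_T$
  subject to $0 \le x_T \le 1$ for all triangles $T$
  and $\sum_{T \ni v} x_T \le \Delta_d f_\triangle$ for all nodes $v$
$v_{LP}(G)$ is the value of LP.
Lemma. $v_{LP}(G)$ is a Lipschitz extension of $f_\triangle$ (the triangle count).
Can be generalized to other counting queries
Other queries use convex programs
Slide17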
Method 2: Generic reduction to privacy over $\mathcal{G}_d$
Input: Algorithm B that is node-DP over $\mathcal{G}_d$
Output: Algorithm A that is node-DP over $\mathcal{G}$, has accuracy similar to B on “nice” graphs
Time(A) = Time(B) + O(m+n)
Reduction works for all functions $f$
How it works: Truncation T(G) outputs G with nodes of degree $> d$ removed.
Answer queries on T(G) instead of G
  via Smooth Sensitivity framework [NRS’07]
  via finding a DP upper bound on […] [Dwork Lei 09, KRSY’11] and running any algorithm that is […]-node-DP over $\mathcal{G}_d$
[Figure: $\mathcal{G}_d$ (low sensitivity) inside $\mathcal{G}$ (high sensitivity).]
[Diagram: G is mapped by T to T(G); query f is evaluated on T(G) and noise is added; labels S(G) and A.]
Slide18
Generic Reduction via Truncation
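Truncation T(G) removes nodes of degree $> d$.
On query $f$, answer $A(G) = f(T(G)) + \text{noise}$
How much noise?
  Global sensitivity of $T$ is too large.
  Local sensitivity of $T$ as a map {graphs} → {graphs}
Lemma. $LS_T(G) \le$ […], where […] = # nodes of degree […].
[Figure: degree histogram (frequency versus degrees) with the threshold d marked, highlighting the nodes that determine $LS_T(G)$.]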
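A short sketch of truncation shows why its global sensitivity is too large and why nodes with degree near the threshold matter. It assumes that truncation drops every node of degree strictly greater than d. The adjacency-dictionary representation and the function names are illustrative, not taken from the talk.

```python
def truncate(adj, d):
    """Remove all nodes of degree > d, together with their adjacent edges."""
    keep = {u for u, nbrs in adj.items() if len(nbrs) <= d}
    return {u: adj[u] & keep for u in keep}


def add_universal_node(adj, new):
    """Node-neighbor of adj: one extra node connected to every existing node."""
    g = {u: nbrs | {new} for u, nbrs in adj.items()}
    g[new] = set(adj)
    return g


# A 6-cycle: every node has degree exactly d = 2.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
neighbor = add_universal_node(cycle, 6)

kept_before = truncate(cycle, 2)    # nothing is removed
kept_after = truncate(neighbor, 2)  # every node now exceeds the threshold
```

Adding one node pushes all six nodes of degree exactly d over the threshold, so the truncated graphs of two node-neighbors differ in six nodes rather than one. On a graph with few nodes of degree close to d, the same change would move only a few nodes, which is what a local (and then smooth) sensitivity analysis exploits.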
Slide19
Smooth Sensitivity of Truncation
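Smooth Sensitivity Framework [NRS ‘07]
$S_f(G)$ is a smooth bound on local sensitivity of $f$ if
  $S_f(G) \ge LS_f(G)$
  $S_f(G) \le e^{\epsilon} S_f(G')$ for all neighbors $G, G'$
Lemma. […] is a smooth bound for $T$, computable in time […]
“Chain rule”: […] is a smooth bound for $f \circ T$
[Diagram: G is mapped by T to T(G); query f is evaluated on T(G) and noise is added; labels S(G) and A.]
Slide20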
Utility of the Truncation Mechanism
Utility: If G is d-bounded, expected noise magnitude is […].
Lemma. If we truncate to a random degree in […], […]
Application to releasing the degree distribution: an $\epsilon$-node differentially private algorithm […] such that […] with probability at least […] if G satisfies $\alpha$-decay for […]
[Diagram: G is mapped by T to T(G); query f is evaluated on T(G) and noise is added; labels S(G) and A.]
Slide21
Techniques used to obtain our results
Node differentially private algorithms for releasing
  number of edges: via Lipschitz extensions
  counts of small subgraphs (e.g., triangles, k-triangles, k-stars): via Lipschitz extensions
  degree distribution: via generic reduction
Slide22
Conclusions
It is possible to design node differentially private algorithms with good utility on sparse graphs
  One can first test whether the graph is sparse privately
Directions for future work
  Node-private algorithm for releasing cuts
  Node-private synthetic graphs
  What are the right notions of privacy for graph data?