Preserving Privacy in GPS Traces via Uncertainty-Aware Path

2017-08-27 82K 82 0 0

Preserving Privacy in GPS Traces via Uncertainty-Aware Path - Description

B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady. ACM CCS. Presented by Solomon Njorombe. Abstract. Motivation. Probe-vehicle automotive monitoring systems. Guaranteed anonymity in location traces datasets.




Presentations text content in Preserving Privacy in GPS Traces via Uncertainty-Aware Path

Slide1

Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking

B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady. ACM CCS

Presented by

Solomon Njorombe

Slide2

Abstract

Motivation

Probe-vehicle automotive monitoring systems

Guaranteed anonymity in location traces datasets

Maintain high accuracy

Analysis of GPS Traces from 233 vehicles

Accuracy vs. privacy guarantee (esp. low density areas)

Contribution

Time-to-confusion in location data set

Uncertainty-aware path cloaking algorithm

Slide3

Introduction

Problem definition & Solution highlight

Slide4

Introduction

Increase in sensors and wireless connectivity.

Many location-aware apps, more privacy concerns

Automotive traffic monitoring systems

Infer traffic congestion (using position, speed) from GPS-equipped vehicles

Also for road and city planning

How do we protect privacy?

Get rid of identifiers: spatio-temporal characteristics still allow re-identification

K-anonymity concept: the substantial modification it requires limits accuracy

Slide5

Introduction

How do we guarantee high data accuracy and provide strong privacy guarantee?

Contributions

Novel time-to-confusion metric to evaluate privacy: how long can individual vehicles be tracked?

Uncertainty-aware privacy algorithm: guarantees a maximum time-to-confusion

Demonstration: bounded maximum time-to-confusion with more accurate location data than the random sampling baseline algorithm

Slide6

Traffic Monitoring with Probe Vehicles

Slide7

Real world GPS Trace Collection

Case study system

Travel time estimate for each route

Use real-time GPS on vehicles

Probe vehicle

Onboard GPS receivers with cellular communication

Report position and speed

Central system

Store data in db

Slide8

Real world GPS Trace Collection

70 km x 70 km road network

Cell weights indicate busiest roads

GPS sample: [vehicle id, timestamp, longitude, latitude, heading info]

Each vehicle: 1 sample/min

Temporal gaps: parked/obstructions

Slide9

Real world GPS Trace Collection

Temporal distribution of GPS samples for 233 vehicles

Each dot represents a received data sample

Gaps > 10 min: parked

Trace between two gaps > 10 min: trip
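The trip definition above (a trace between two gaps longer than 10 minutes) can be sketched in a few lines. This is only an illustration: the function name `split_into_trips` and the list-of-timestamps representation are assumptions, not something taken from the slides.

```python
from datetime import datetime, timedelta

# Gap threshold from the slide: gaps longer than 10 minutes mean "parked".
PARKED_GAP = timedelta(minutes=10)


def split_into_trips(timestamps, gap=PARKED_GAP):
    """Split one vehicle's time-ordered sample timestamps into trips.

    A new trip starts whenever the gap to the previous sample exceeds `gap`.
    (Hypothetical helper, for illustration only.)
    """
    trips = []
    current = []
    for ts in timestamps:
        if current and ts - current[-1] > gap:
            trips.append(current)
            current = []
        current.append(ts)
    if current:
        trips.append(current)
    return trips
```

For example, samples at minutes 0, 1, 2, 3, 30, 31 and 32 would be split into one trip of four samples and one of three, because the 27-minute gap exceeds the threshold.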

Slide10

Data Quality Metrics and Requirements

Data privacy algorithms increase privacy through

Omission, perturbation or generalization of data

Trade-off between data quality (affects utility) and privacy

Road map representation

Road map: graph with set of road segments

Segments: Road stretch between 2 intersections

Slide11

Data Quality Metrics and Requirements

Congestion map generation at data analysis engine

High spatial accuracy

Spatial accuracy    Update interval    Accuracy
50 m                1 sec              99.5%
50 m                45 sec             98%
100 m               1 min

Few min delay

Data Analysis Engine

1. Mapping location samples to road segments

2. Calculate mean link travel time per segment from location samples mapped on it

3. Convert mean link travel time into congestion index
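The three steps of the data analysis engine can be sketched roughly as follows. Several parts are assumptions made for the sake of a runnable example: samples are matched to the nearest segment midpoint (a real system would use proper map matching), per-sample travel time is derived as segment length divided by reported speed, and the congestion index is taken as mean travel time divided by free-flow travel time. The slides do not define any of these details, and all names are hypothetical.

```python
import math
from collections import defaultdict


def nearest_segment(sample, segments):
    """Step 1 (crude assumption): match a sample to the nearest segment midpoint."""
    return min(
        segments,
        key=lambda sid: math.hypot(
            sample["x"] - segments[sid]["x"], sample["y"] - segments[sid]["y"]
        ),
    )


def congestion_indices(samples, segments):
    """Steps 2 and 3: mean link travel time per segment, then a congestion index."""
    travel_times = defaultdict(list)
    for s in samples:
        if s["speed"] <= 0:
            continue  # a stationary sample gives no usable travel time
        sid = nearest_segment(s, segments)
        travel_times[sid].append(segments[sid]["length"] / s["speed"])

    indices = {}
    for sid, times in travel_times.items():
        mean_time = sum(times) / len(times)
        # Assumed definition: ratio of observed to free-flow travel time.
        free_flow_time = segments[sid]["length"] / segments[sid]["free_speed"]
        indices[sid] = mean_time / free_flow_time
    return indices
```

Under this assumed definition an index of 1.0 means free-flowing traffic, and larger values mean more congestion.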

Slide12

Data Quality Metrics and Requirements

Other factors:

Road coverage (depends on penetration rate). Aim: 3% on freeways, 5% on surface areas.

Penetration rate: percentage of vehicles with traffic monitoring equipment.

Measure relative weighted coverage metric for algorithm based on the following heuristics

Coverage decreases as more samples are withheld

Information from busier roads is more important

Coverage limited by probe-vehicles on the roads

Slide13

Data Quality Metrics and Requirements

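Measure effect of removed samples on road coverage

Divide the area into 1 km x 1 km cells and count the samples $n_i$ from cell $i$ in a day (fig. 1b)

Normalize $n_i$ by the sum of all samples to allocate each location sample a weight (based on how busy its cell is):

$$w_i = \frac{n_i}{\sum_j n_j}$$

$n_i$: number of location samples from cell $i$

$w_i$: weight of all samples from cell $i$

Relative weighted road coverage for a set of released location samples $L$, relative to the original set $L_0$: divide the summed weights of the modified location samples by the summed weights of the original location samples

$$\frac{\sum_{l \in L} w_{c(l)}}{\sum_{l \in L_0} w_{c(l)}}$$

$c(l)$: function giving the cell index of the specified sample $l$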
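A minimal sketch of the relative weighted road coverage computation described on this slide, assuming samples are given as planar (x, y) coordinates in metres and that cells are 1 km x 1 km squares. The function names are illustrative only.

```python
from collections import Counter


def cell_index(sample, cell_size_m=1000.0):
    """Map an (x, y) position in metres to its 1 km x 1 km cell index."""
    x, y = sample
    return (int(x // cell_size_m), int(y // cell_size_m))


def cell_weights(original):
    """Weight of each cell: its sample count normalized by all samples."""
    counts = Counter(cell_index(s) for s in original)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def relative_weighted_coverage(released, original):
    """Summed weights of released samples divided by those of the original set."""
    if not original:
        raise ValueError("original sample set must not be empty")
    weights = cell_weights(original)
    denominator = sum(weights[cell_index(s)] for s in original)
    numerator = sum(weights.get(cell_index(s), 0.0) for s in released)
    return numerator / denominator
```

Because samples in busy cells carry larger weights, withholding a sample from a busy cell lowers coverage more than withholding one from a quiet cell, which matches the heuristic that information from busier roads is more important.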

Slide14

Data Quality Metrics and Requirements

Summary

To measure data quality for traffic monitoring system

We use relative weighted road coverage

A segment is considered covered if there exists a data sample with 100 m accuracy

Slide15

Privacy Leakage through anonymous location traces

Slide16

Existing Privacy Algorithms

K-anonymity: notion of strong anonymity (for db tables)

Generalize the data until indistinguishable from (k-1) other records

To achieve this constraint, the algorithms need knowledge of nearby vehicles.

Implemented on a trusted server with access to the positions of all vehicles

Figure shows spatial accuracy using the CliqueCloak algorithm

Best accuracy among spatial algorithms, but fails to meet accuracy requirements

Slide17

Existing Privacy Algorithms

Best effort algorithms

Subsampling. Collect data with reduced sampling frequency

Tang et al.: sampling at longer intervals (10 min). GPS chipsets offer a maximum sampling frequency of 1 Hz

Beresford and Stajano. Suppress information in high density areas

Motivation: suppressing samples in high density areas increases confusion

Path confusion algorithm

Focuses on high density areas but perturbs location samples rather than suppressing them

Suppression and path confusion don't consider low density areas. They have no clear advantage

Slide18

Existing Privacy Algorithms

Choosing subsampling as the best-effort baseline algorithm

Precision 95% except for density 2000 and 50% removal (60%)

No removal – 1 min interval

50% removal – 2 min interval

15 min and 20 min are tracking durations
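The relation between removal rate and sampling interval noted above (no removal gives 1 min, 50% removal gives 2 min) can be illustrated with a tiny subsampling sketch. It assumes removal is done by regular decimation, keeping every n-th sample of a trace. The slides elsewhere call the baseline "random sampling", so the real baseline may drop samples at random instead. The names are hypothetical.

```python
def subsample(samples, keep_every):
    """Keep every `keep_every`-th sample of one vehicle's time-ordered trace."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    return samples[::keep_every]


def removal_fraction(samples, keep_every):
    """Fraction of samples withheld by `subsample` for this trace."""
    kept = subsample(samples, keep_every)
    return 1 - len(kept) / len(samples)
```

With a trace sampled once per minute, `keep_every=2` withholds half of the samples and doubles the effective interval to 2 minutes.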

Slide19

Existing Privacy Algorithms

Figure 3: Empirical distribution of travel times per vehicle trip

Per trip travel time

The shorter the trip, the higher the frequency

By following traces for 10 min, adversary can track vehicle to home and sensitive areas

Slide20

Existing Privacy Algorithms

Deduction

Protecting all drivers using subsampling is difficult

A 1 min interval is already large; low density areas require even more

How do you choose the interval? Traffic density changes with time and space

Slide21

Privacy Metric and Adversary Model

Privacy risk depends on how long an adversary can follow a vehicle

Privacy breach: (Both increase with longer traces)

Trace contains a privacy-sensitive event

Adversary can identify the driver (e.g. known home or work location)

Adversary reconstructs the path of a vehicle from a mix of anonymous samples.

Relies on maximum likelihood detection

Takes the sample with the highest probability

If more than one sample is likely, the adversary is confused and tracing ends

Slide22

Privacy Metric and Adversary Model

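Privacy Metrics

Privacy measured by Mean Time To Confusion (MTTC)

For any point, tracking uncertainty: $H = -\sum_i p_i \log p_i$

$p_i$: probability that sample $i$ belongs to the vehicle being tracked

Lower $H$: more certainty/lower privacy

Tracking confidence: $C = 1 - H$

We compute $p_i$ by first evaluating $\hat{p}_i = e^{-d_i/\mu}$ for each candidate sample and then normalizing: $p_i = \hat{p}_i / \sum_j \hat{p}_j$

$d_i$: distance between candidate sample $i$ and the predicted position; $\mu$: significant distance difference

MTTC: mean tracking time during which $H$ stays below the confusion threshold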
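The tracking-uncertainty computation can be sketched as below. This is a hedged illustration: it assumes the exponential weighting of each candidate's distance from the predicted position, normalization over all candidates, and a base-2 logarithm (so that two equally likely candidates give H = 1). The value of µ and the threshold used in any call are placeholders. In the slides µ is fitted from empirical data.

```python
import math


def tracking_probabilities(distances, mu):
    """Turn candidate distances (metres) into normalized probabilities p_i."""
    raw = [math.exp(-d / mu) for d in distances]
    total = sum(raw)
    return [r / total for r in raw]


def tracking_uncertainty(probs):
    """Entropy H of the candidate distribution (base 2 is an assumption)."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


def is_confused(distances, mu, threshold):
    """True if the adversary's uncertainty exceeds the confusion threshold."""
    return tracking_uncertainty(tracking_probabilities(distances, mu)) > threshold
```

With one dominant candidate the entropy is close to 0 and tracking continues. With several candidates at similar distances the entropy rises and the adversary is considered confused.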

 

Slide23

Privacy Metric and Adversary Model

Figure 4: Fitting distance errors in tracking using an exponential function

µ from empirical pdf of distance deviation

Slide24

Path Privacy-preserving mechanism

Slide25

Path Privacy-Preserving Mechanism

Present a privacy-preserving scheme that works even at low density

Maximum allowable time to confusion and its uncertainty threshold

Need to be aware of other vehicles

Start off centralized, then relax the reliance on a trustworthy server

To guarantee max time to confusion, release a sample only if:

Time since last point of confusion < max specified time to confusion

OR at current time: tracking uncertainty > threshold

Slide26

Path Privacy-Preserving Mechanism

Algorithm

Process data recursively for each time interval (maintain state of each vehicle through recursions)

Input

Set of GPS samples at a time: v.current GPS samples

Max time to confusion

Associated uncertainty threshold

Output

Publishable set of GPS samples

Slide27

Path Privacy-Preserving Mechanism

Algorithm

Identify vehicles that can be safely released based on:

Time passed < confusionTimeout

Identify vehicles that can be revealed based on:

Current tracking uncertainty > specified confusion level

Update the time of the last confusion point and the last visible GPS samples
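One time step of the release logic outlined above might look like the following sketch. It is not the authors' implementation. The uncertainty computation is passed in as a caller-supplied function (`uncertainty_of`, hypothetical), and the sketch assumes that a vehicle seen for the first time starts with its last confusion point set to the current time. The names `VehicleState` and `cloak_step` are illustrative. Only `confusionTimeout` comes from the slides.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class VehicleState:
    last_confusion_time: float  # time of the last point of confusion (seconds)
    last_visible: Any = None    # last released GPS sample


def cloak_step(now, samples, states, uncertainty_of, confusion_timeout, threshold):
    """Return the samples that may be published for the interval ending at `now`.

    samples: dict of vehicle id -> current GPS sample
    states:  dict of vehicle id -> VehicleState, kept across calls
    """
    released = {}
    for vid, sample in samples.items():
        # Assumption: a newly seen vehicle starts at a point of confusion.
        state = states.setdefault(vid, VehicleState(last_confusion_time=now))
        confused = uncertainty_of(vid, sample) > threshold
        within_timeout = (now - state.last_confusion_time) < confusion_timeout
        if confused:
            state.last_confusion_time = now  # new point of confusion
        if confused or within_timeout:
            released[vid] = sample
            state.last_visible = sample
    return released
```

Samples that satisfy neither condition are withheld, so a vehicle cannot be followed for longer than `confusion_timeout` without passing through a point of confusion, under the adversary model that `uncertainty_of` encodes.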

Slide28

Path Privacy-Preserving Mechanism

Algorithm

Reduce complexity

Calculate tracking uncertainty only with the k closest samples to the prediction point

Prune the candidate vehicles until all meet the uncertainty threshold

Key property to achieve ∀

 

Slide29

Algorithm Extension for Reacquisition Tracking Model

No adequate privacy guarantee under reacquisition model.

Reacquisition possible over short time scales

No reacquisition over gaps longer than 10 min

Slide30

Algorithm Extension for Reacquisition Tracking Model

The following extensions prevent reacquisition after a time window w (w = 10 min in this case)

After the confusionTimeout expires

Confusion is calculated from the location samples released within the last w minutes. Samples are only released if these confusion values > threshold

Before the confusionTimeout expires

Released samples must maintain confusion to samples released during the last w minutes and before confusionTimeout was last reset

Slide31

Experimental Evaluation

Slide32

Experimental Setup

Data set

Targeting tracking for 24 hrs

500 probe vehicles and 2000 probe vehicles.

Density scenario: overlay several 24 hr traces from different dates

Evaluation Metric

Tracking time: time to confusion (sec. 3) used as privacy metric. Reported max TTC and median TTC

(Relative) Weighted Road Coverage.

Indicates data quality.

Provide reasonable privacy, and same coverage.

Also provide percentage of released samples. (Original traces = 100%)

Slide33

Experimental Setup: uncertainty-aware path cloaking traces

Generated at off peak time in high density scenario

Generated at peak time in high density scenario

Gray dots: released samples Black dots: removed samples

More dots are retained in busier areas

Slide34

Max value of TTC

Median value of TTC

Protection Against Target Tracking: Bounded Tracking Time without reacquisition

Uncertainty-aware path cloaking guarantees 5 min even at higher quality

Releases up to 92.5% of the original samples while achieving the bounded tracking property

Slide35

Max value of TTC

Median value of TTC

Protection Against Target Tracking: Dependence on Reacquisition and Density

Maximum allowable amount of released location samples is lowered

Median increases by 1, due to change in tracking model

Slide36

Max TTC without reacquisition

Max TTC with reacquisition

Protection Against Target Tracking: Dependence on Reacquisition and Density

While random sampling requires a longer TTC, the scheme preserves it at 5 min.

Slide37

Maximum TTC against Weighted Road Coverage (Uncertainty-aware privacy algorithm)

Maximum TTC against Weighted Road Coverage (Uncertainty-aware privacy algorithm, with reacquisition)

Protection Against Target Tracking: Quality of Service Analysis

Slide38

Number of Released Location Samples in Peak Time

Weighted Road Coverage in Peak Time

Protection Against Target Tracking: Quality of Service Analysis

Slide39

Conclusion

Slide40

Conclusion

Proposed a novel time to confusion metric for quantification of privacy in an anonymized set of location traces

Developed an uncertainty-aware privacy algorithm that guarantees a defined maximum time-to-confusion

Demonstrated through experiments on real-world GPS traces

Slide41

Questions

