Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking






Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking

B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady. ACM CCS

Presented by

Solomon Njorombe




Probe-vehicle automotive monitoring systems

Guaranteed anonymity in location traces datasets

Maintain high accuracy

Analysis of GPS Traces from 233 vehicles

Accuracy vs. privacy guarantee (esp. low density areas)


Time-to-confusion in location data set

Uncertainty-aware path cloaking algorithm



Problem definition & Solution highlight



Increase in sensors and wireless connection.

Many location aware apps, more privacy concerns

Automotive traffic monitoring systems

Infer traffic congestion (using position and speed) from GPS-equipped vehicles

Also for road and city planning

How do we protect privacy?

Remove identifiers? Spatio-temporal characteristics still allow re-identification

K-anonymity? The substantial data modification it requires limits accuracy



How do we guarantee high data accuracy and provide strong privacy guarantee?


Novel time-to-confusion metric to evaluate privacy: how long can an individual vehicle be tracked?

Uncertainty-aware privacy algorithm that guarantees a maximum time-to-confusion

Demonstration: the algorithm bounds the maximum time-to-confusion and yields more accurate location data than a random sampling baseline algorithm


Traffic Monitoring with Probe Vehicles


Real world GPS Trace Collection

Case study system

Travel time estimate for each route, using real-time GPS on vehicles

Probe vehicle: onboard GPS receiver with cellular communication; reports position and speed

Central system: stores the data in a database


Real world GPS Trace Collection

70 km x 70 km road network

Cell weights indicate the busiest roads

GPS sample: [vehicle id, timestamp, longitude, latitude, heading info]

Each vehicle reports 1 sample/min

Temporal gaps: vehicle parked or signal obstructed


Real world GPS Trace Collection

Temporal distribution of GPS samples for 233 vehicles

Each dot represents a received data sample

Gaps > 10 min: vehicle parked

Trace between two gaps > 10 min: a trip
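The trip definition above (a trace between two gaps longer than 10 minutes) can be sketched in a few lines. This is only an illustration: the function name `split_into_trips` and the use of plain timestamps in seconds are assumptions, not part of the original study.

```python
from typing import List

GAP_SECONDS = 10 * 60  # gaps longer than 10 minutes are treated as parking


def split_into_trips(timestamps: List[float], gap: float = GAP_SECONDS) -> List[List[float]]:
    """Split one vehicle's sorted sample timestamps (in seconds) into trips.

    A new trip starts whenever two consecutive samples are more than `gap` apart.
    """
    trips: List[List[float]] = []
    current: List[float] = []
    for t in timestamps:
        if current and t - current[-1] > gap:
            trips.append(current)
            current = []
        current.append(t)
    if current:
        trips.append(current)
    return trips


# Hypothetical timestamps: one sample per minute, with two long gaps.
samples = [0, 60, 120, 1000, 1060, 3000]
trips = split_into_trips(samples)
```

With these made-up timestamps the vehicle produces three trips, because the gaps between 120 and 1000 and between 1060 and 3000 both exceed 10 minutes.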


Data Quality Metrics and Requirements

Data privacy algorithms increase privacy through

Omission, perturbation, or generalization of data

Trade-off between data quality (which affects utility) and privacy

Road map representation

Road map: graph with set of road segments

Segments: Road stretch between 2 intersections


Data Quality Metrics and Requirements

Congestion map generation at the data analysis engine

High spatial accuracy

Spatial accuracy   Update interval   Accuracy
50 m               1 s               99.5%
50 m               45 s              98%
100 m              1 min             Few min delay

Data Analysis Engine

Mapping location samples to road segments


Calculate mean link travel time per segment from location samples mapped on it


Convert mean link travel time into congestion index
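A rough sketch of the last two steps of the data analysis engine. It assumes map matching has already been done and that each observation is a (segment id, travel time) pair. The slides do not define the congestion index, so the ratio of mean travel time to a free-flow travel time used here is purely an assumption, as are all names.

```python
from collections import defaultdict
from statistics import mean


def congestion_by_segment(observations, free_flow_time):
    """Return {segment_id: congestion index} from map-matched observations.

    observations: iterable of (segment_id, travel_time_seconds) pairs.
    free_flow_time: {segment_id: travel time in seconds without congestion}.
    The index is ASSUMED to be mean travel time / free-flow travel time.
    """
    per_segment = defaultdict(list)
    for segment_id, travel_time in observations:
        per_segment[segment_id].append(travel_time)
    return {
        segment_id: mean(times) / free_flow_time[segment_id]
        for segment_id, times in per_segment.items()
    }


# Hypothetical observations for two road segments.
obs = [("A", 30.0), ("A", 50.0), ("B", 20.0)]
index = congestion_by_segment(obs, {"A": 20.0, "B": 20.0})
```

Under this assumed definition, an index of 1.0 means free-flowing traffic and larger values mean heavier congestion.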



Data Quality Metrics and Requirements

Other factors:

Road coverage depends on the penetration rate: the percentage of vehicles carrying traffic monitoring equipment

Target penetration: 3% on freeways, 5% on surface streets

Measure relative weighted coverage metric for algorithm based on the following heuristics

Coverage decreases as more samples are withheld

Information from busier roads is more important

Coverage limited by probe-vehicles on the roads


Data Quality Metrics and Requirements

Measure the effect of removed samples on road coverage

Divide the area into 1 km x 1 km cells and count the samples n_i from cell i in a day (fig. 1b)

Allocate each location sample a weight based on how busy its cell is, by normalizing n_i with the sum of all samples:

$$ w_i = \frac{n_i}{\sum_j n_j} $$

Relative weighted road coverage for a modified set of location samples L' taken from the original set L, where c(l) is the function giving the cell index of sample l: divide the summed weights of the modified samples by the summed weights of the original samples:

$$ \mathrm{coverage}(L') = \frac{\sum_{l \in L'} w_{c(l)}}{\sum_{l \in L} w_{c(l)}} $$
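The coverage metric described above can be sketched as follows. This is an illustrative reading of the slide text, assuming samples are given as (x, y) positions in kilometres on a local grid; the function names are hypothetical.

```python
from collections import Counter

CELL_KM = 1.0  # 1 km x 1 km cells, as on the slide


def cell_index(x_km, y_km):
    """Map a position (in km) to the index of its grid cell."""
    return (int(x_km // CELL_KM), int(y_km // CELL_KM))


def cell_weights(original):
    """Weight of each cell: its sample count normalized by the total count."""
    counts = Counter(cell_index(x, y) for x, y in original)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def relative_weighted_coverage(released, original):
    """Summed weights of released samples divided by summed weights of all samples."""
    weights = cell_weights(original)
    numerator = sum(weights.get(cell_index(x, y), 0.0) for x, y in released)
    denominator = sum(weights[cell_index(x, y)] for x, y in original)
    return numerator / denominator


# Hypothetical data: three samples in a busy cell, one in a quiet cell.
original = [(0.2, 0.3), (0.5, 0.5), (0.7, 0.1), (1.5, 0.5)]
released = [(0.2, 0.3), (1.5, 0.5)]
coverage = relative_weighted_coverage(released, original)
```

Because samples from the busy cell carry a weight of 0.75 each and the quiet cell only 0.25, withholding two busy-cell samples drops coverage to 0.4 even though half of the samples are still released.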


Data Quality Metrics and Requirements


To measure data quality for traffic monitoring system

We use relative weighted road coverage

A segment is considered covered if there exists a data sample on it with 100 m accuracy


Privacy Leakage through anonymous location traces


Existing Privacy Algorithms

K-anonymity: a notion of strong anonymity (originally for database tables)

Generalize the data until each record is indistinguishable from at least k-1 other records

To satisfy this constraint, the algorithm needs knowledge of nearby vehicles

Implemented on a trusted server that can access the positions of all vehicles

Figure shows spatial accuracy using



Best accuracy among spatial algorithms, but it still fails to meet the accuracy requirements


Existing Privacy Algorithms

Best effort algorithms

Subsampling: collect data with reduced sampling frequency

Tang et al.: sampling at longer intervals (10 min). GPS chipsets offer a maximum sampling frequency of 1 Hz

Beresford and [...]: suppress [...] in high-density areas

Motivation: suppressing updates in high-density areas increases confusion

Path confusion algorithm

Also focuses on high-density areas, but perturbs location samples rather than suppressing them

Suppression and path confusion do not consider low-density areas, so they offer no clear advantage


Existing Privacy Algorithms

Subsampling is chosen as the best-effort baseline algorithm

Precision is 95%, except for density 2000 with 50% removal (60%)

No removal – 1 min interval

50% removal – 2 min interval

15 min and 20 min are tracking durations
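A minimal sketch of the subsampling baseline, assuming the 1 sample/min reporting rate mentioned earlier and assuming that samples are removed periodically (keeping every n-th sample). The slides do not say how samples are selected for removal, and the name `subsample` is hypothetical.

```python
def subsample(trace, keep_every):
    """Keep every `keep_every`-th sample of a trace (periodic subsampling)."""
    return trace[::keep_every]


# Hypothetical 10-minute trace with one timestamp (seconds) per minute.
trace = list(range(0, 600, 60))
halved = subsample(trace, 2)

# Fraction of samples removed and resulting reporting interval in seconds.
removal = 1 - len(halved) / len(trace)
interval = halved[1] - halved[0]
```

Keeping every second sample corresponds to the "50% removal, 2 min interval" case listed above.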


Existing Privacy Algorithms

Figure 3: Empirical distribution of travel times per vehicle trip

Per-trip travel time

The shorter the trip, the higher its frequency

By following traces for 10 min, an adversary can track a vehicle to home and sensitive areas


Existing Privacy Algorithms


Protecting all drivers using subsampling is difficult

A 1 min interval is already large, and low-density areas require even longer intervals

How should the interval be chosen? Traffic density changes with time and space


Privacy Metric and Adversary Model

Privacy risk depends on how long an adversary can follow a vehicle

Privacy breach (both risks increase with longer traces):

The trace contains a privacy-sensitive event

The adversary can identify the driver (known home or work location)

The adversary reconstructs the path of a vehicle from a mix of anonymous samples

Relies on maximum likelihood detection

Takes the sample with the highest probability

If more than one sample is likely, the adversary is confused and tracking ends


Privacy Metric and Adversary Model

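Privacy Metrics

Privacy is measured by the Mean Time To Confusion (MTTC)

For any point on a trace, the tracking uncertainty is $H = -\sum_i p_i \log p_i$

p_i: probability that sample i belongs to the vehicle being tracked

Lower H: more certainty, lower privacy

Tracking confidence: $C = 1 - H$

p_i is computed by first evaluating $\hat{p}_i = e^{-d_i/\mu}$, where d_i is the distance of sample i from the predicted position and µ is the significant distance difference, and then normalizing so that the p_i sum to 1

MTTC: mean tracking time during which H stays below the confusion threshold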



Privacy Metric and Adversary Model

Figure 4: Fitting distance errors in tracking with an exponential function; µ is obtained from the empirical pdf of the distance deviation
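The tracking uncertainty can be illustrated with a small sketch. It assumes that each candidate sample is weighted by an exponential of its distance to the predicted position with scale µ, and that the entropy is measured in bits (log base 2). Both choices, the value of µ, and the function names are assumptions made for illustration.

```python
import math


def tracking_probabilities(distances_m, mu_m):
    """Turn candidate distances (metres) into probabilities via exp(-d / mu)."""
    raw = [math.exp(-d / mu_m) for d in distances_m]
    total = sum(raw)
    return [r / total for r in raw]


def tracking_uncertainty(probs):
    """Entropy H = -sum(p * log2(p)) in bits; higher means more confusion."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Two candidates equally far from the prediction (mu = 500 m is hypothetical).
probs = tracking_probabilities([100.0, 100.0], mu_m=500.0)
H = tracking_uncertainty(probs)

# A single candidate leaves the adversary with no uncertainty at all.
H_single = tracking_uncertainty(tracking_probabilities([100.0], mu_m=500.0))
```

Two equally plausible candidates give one bit of uncertainty, while a lone candidate gives zero, which matches the intuition that tracking is easy in low-density areas.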


Path Privacy-preserving mechanism


Path Privacy-Preserving Mechanism

Present a privacy-preserving scheme that works even in low-density areas

Parameters: maximum allowable time to confusion and its uncertainty threshold

The algorithm needs to be aware of other vehicles

Start off centralized, then relax the reliance on a trustworthy server

To guarantee the maximum time to confusion, a sample is released only if

Time since last point of confusion < specified maximum time to confusion

OR, at the current time, tracking uncertainty > threshold
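The release rule above amounts to a simple predicate. The sketch below is one possible reading of the two conditions; the function and parameter names are hypothetical.

```python
def can_release(now_s, last_confusion_s, uncertainty, max_ttc_s, threshold):
    """Return True if a vehicle's sample may be published at time `now_s`.

    Condition 1: the time since the last point of confusion is still
                 below the maximum time to confusion.
    Condition 2: the current tracking uncertainty exceeds the threshold,
                 i.e. the adversary would be confused right now.
    """
    within_window = (now_s - last_confusion_s) < max_ttc_s
    confused_now = uncertainty > threshold
    return within_window or confused_now


# Hypothetical settings: 5-minute maximum time to confusion, threshold 0.9.
early = can_release(100, 0, 0.1, 300, 0.9)     # still inside the window
late = can_release(400, 0, 0.1, 300, 0.9)      # window expired, no confusion
confused = can_release(400, 0, 0.95, 300, 0.9) # window expired, but confused
```

Only the second case withholds the sample: the window has expired and the vehicle could still be followed with high certainty.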


Path Privacy-Preserving Mechanism


Process data recursively for each time interval, maintaining the state of each vehicle across recursions

Input: the set of GPS samples at each time step, the maximum time to confusion, and the associated uncertainty threshold

Output: the publishable set of GPS samples


Path Privacy-Preserving Mechanism


Identify vehicles that can be safely released based on time: time passed since the last point of confusion < maximum time to confusion

Identify further vehicles that can be revealed based on uncertainty: current tracking uncertainty > specified confusion level

Keep track of the time of the last confusion point and the last visible GPS samples


Path Privacy-Preserving Mechanism

Algorithms

Reduce complexity: calculate tracking uncertainty only with the k closest samples to the prediction point

Prune the candidate vehicles until all meet the uncertainty threshold

Key property to achieve ∀
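The pruning step can be sketched as an iterative filter. This is not the authors' algorithm, only one plausible reading of the bullets above: uncertainty is an entropy in bits over the vehicle's own sample plus its k nearest other candidates, weighted by exp(-distance / µ), and vehicles are dropped until every remaining one exceeds the threshold. All names and parameter values are hypothetical.

```python
import math


def uncertainty(pred, own, others, k, mu):
    """Entropy (bits) over the own sample and the k candidates nearest to `pred`."""
    nearest = sorted(others, key=lambda s: math.dist(pred, s))[:k]
    raw = [math.exp(-math.dist(pred, s) / mu) for s in [own] + nearest]
    total = sum(raw)
    if total == 0.0:  # all weights underflowed; treat as no confusion
        return 0.0
    probs = [r / total for r in raw]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def prune(candidates, k, mu, threshold):
    """Drop vehicles until every remaining one meets the uncertainty threshold.

    candidates: {vehicle_id: (predicted_xy, sample_xy)}.
    Removing one vehicle can lower the uncertainty of its neighbours,
    so the loop repeats until a full pass removes nothing.
    """
    released = dict(candidates)
    changed = True
    while changed:
        changed = False
        for vid, (pred, own) in list(released.items()):
            others = [s for v, (_, s) in released.items() if v != vid]
            if uncertainty(pred, own, others, k, mu) <= threshold:
                del released[vid]
                changed = True
    return released


# Hypothetical scene: two vehicles 10 m apart and one isolated vehicle 5 km away.
candidates = {
    "a": ((0.0, 0.0), (0.0, 0.0)),
    "b": ((10.0, 0.0), (10.0, 0.0)),
    "c": ((5000.0, 0.0), (5000.0, 0.0)),
}
released = prune(candidates, k=2, mu=100.0, threshold=0.5)
```

In this made-up scene the two nearby vehicles confuse each other and are released, while the isolated vehicle is withheld because nothing nearby could be mistaken for it.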



Algorithm Extension for Reacquisition Tracking Model

No adequate privacy guarantee under the reacquisition model

Reacquisition is possible over short time scales

No reacquisition over gaps longer than 10 min


Algorithm Extension for Reacquisition Tracking Model

The following extensions prevent reacquisition after a time window w (w = 10 min in this case)

After the [...]: confusion is calculated from prior released location samples within the last w minutes, and samples are only released if these confusion values exceed the threshold

Before the [...]: released samples must maintain confusion to samples released during the last w minutes and before [...] was last reset


Experimental Evaluation


Experimental Setup

Data set

Targeting tracking for 24


500 probe vehicles and 2000 probe vehicles.

Density scenarios: overlay several 24-hour traces from different dates

Evaluation Metric

Tracking time: time to confusion (Section 3) is used as the privacy metric. Maximum TTC and median TTC are reported

(Relative) Weighted Road Coverage.

Indicate data quality.

Provide reasonable privacy, and same coverage.

Also provide percentage of released sample. (Original traces = 100%)


Experimental Setup: uncertainty-aware path cloaking traces

Generated at off peak time in high density scenario

Generated at peak time in high density scenario

Gray dots: released samples Black dots: removed samples

More dots are retained in busier areas


Max value of TTC

Median value of TTC

Protection Against Target Tracking: Bounded Tracking Time without reacquisition

Uncertainty-aware path cloaking guarantees a 5 min bound even at higher data quality

Releases up to 92.5% of the original samples while achieving the bounded tracking property


Max value of TTC

Median value of TTC

Protection Against Target Tracking: Dependence on Reacquisition and Density

Maximum allowable


of released location sample is lowered

Median increases by 1 due to the change in tracking model


Max TTC without reacquisition

Max TTC with reacquisition

Protection Against Target Tracking: Dependence on Reacquisition and Density

While random sampling leads to a longer TTC, the proposed scheme keeps it at 5 min.


Maximum TTC against Weighted Road Coverage (Uncertainty-aware privacy algorithm)

Maximum TTC against Weighted Road Coverage (Uncertainty-aware privacy algorithm, with reacquisition)

Protection Against Target Tracking: Quality of Service Analysis


Number of Released Location Samples in Peak Time

Weighted Road Coverage in Peak Time

Protection Against Target Tracking: Quality of Service Analysis





Proposed a novel time-to-confusion metric for quantifying privacy in an anonymized set of location traces

Developed an uncertainty-aware privacy algorithm that guarantees a defined maximum time-to-confusion

Demonstrated through experiments on real-world GPS traces


