

Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments

Xiao Zhang¹, Wang-Chien Lee¹, Prasenjit Mitra¹,², Baihua Zheng³
¹ Department of Computer Science and Engineering, The Pennsylvania State University
² College of Information Science and Technology, The Pennsylvania State University
³ School of Information Systems, Singapore Management University

EDBT, Nantes, France, 03/28/2008


Outline:
- Background
- Problem Analysis
- New TNN Algorithms
- Optimization
- Experiments
- Conclusions & Future Work



What is TNN?
- S is a set of banks
- R is a set of restaurants
- TNN distance = 5 + 1 = 6

Background – TNN


What is TNN?
Given a query point p and two datasets S and R, TNN returns a pair of objects (s, r) such that ∀(s', r') ∈ S×R, dis(p, s) + dis(s, r) ≤ dis(p, s') + dis(s', r'), where dis(p, s) is the Euclidean distance between p and s.
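The definition above can be checked directly with a brute-force scan over S×R. This is only a minimal sketch for small inputs, not one of the algorithms from the talk; the function name `tnn_brute_force` and the sample coordinates are assumptions chosen so that the result matches the 5 + 1 = 6 example.

```python
import math
from itertools import product

def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tnn_brute_force(p, S, R):
    """Return (s, r, d) where (s, r) minimizes d = dis(p, s) + dis(s, r).

    Hypothetical helper: enumerates every pair in S x R, so it costs
    O(|S| * |R|) and is only meant to illustrate the definition.
    """
    s, r = min(product(S, R), key=lambda sr: dis(p, sr[0]) + dis(sr[0], sr[1]))
    return s, r, dis(p, s) + dis(s, r)

# Illustrative data (assumed, not from the slides).
p = (0.0, 0.0)
banks = [(3.0, 4.0), (10.0, 0.0)]
restaurants = [(3.0, 5.0), (0.0, 1.0)]
s, r, d = tnn_brute_force(p, banks, restaurants)
print(s, r, d)
```

Note that the restaurant nearest to p is (0, 1), yet the answer pair uses (3, 5): the transitive distance is minimized as a whole, not leg by leg.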

Background – TNN

First proposed by Zheng, Lee and Lee [1].

[1] B. Zheng et al. Transitive nearest neighbor search in mobile environments.




The server holds all the data and broadcasts it as radio signals over channels.
Mobile clients (cell phones and PDAs) tune in to the broadcast channels, download the necessary data, and process queries locally.

Background - broadcast

Broadcast vs. on-demand:
- Supports simultaneous access by an arbitrary number of mobile devices
- Efficient use of limited bandwidth
- Light workload on the server side


Assumption:
- Zheng, Lee and Lee assumed a single broadcast channel.
- Based on existing technology (dual-mode, dual-standby cell phones), we assume multiple channels.
- A mobile client can access information in multiple channels simultaneously.
Challenges:
- How to utilize the parallel processing ability of mobile clients to facilitate query processing?
- How to reduce access time?
- How to reduce energy consumption?

Background - motivation


1. We developed two new algorithms for TNN queries in multi-channel access environments.
2. We proposed two new distance metrics (MinTransDist and MinMaxTransDist) so that our new algorithms efficiently reduce search cost.
3. We proposed an optimization technique to reduce energy consumption.

Our contributions:


1. Two broadcast channels, one for S and one for R
2. 2-dimensional points
3. Air indexing: R-tree [2]
4. Broadcast in depth-first order, to avoid back-tracking
5. (1, m) interleaving [3]
6. Performance metrics (in number of pages):
   - Access time
   - Tune-in time

Background – settings

[2] A. [...]. R-trees: a dynamic index structure for spatial searching. In [...]
[3] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on air: organization and access.




Problem Analysis

Randomly choose a pair of objects (s', r') and use its transitive distance as the search range.
This range is guaranteed to enclose the answer pair (s, r).

Theorem [1]: the transitive distance determined by any pair of objects (s, r) is an upper bound.
General idea of answering TNN queries:
- Estimate: find a search range around the query point p by searching the index.
- Filter: discard unqualified data objects in the search range determined earlier, and find the pair of objects with the minimum transitive distance.
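The Estimate and Filter steps can be sketched on in-memory lists as follows. This is an illustration under assumptions: it ignores the broadcast index entirely, the name `estimate_filter_tnn` and the sample data are hypothetical, and the filter uses the fact that any object in the answer pair lies within the upper bound of p (by the triangle inequality).

```python
import math

def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def estimate_filter_tnn(p, S, R, seed_pair):
    """Estimate-then-filter sketch (hypothetical helper, not the paper's code).

    seed_pair is any (s0, r0) in S x R; by the theorem its transitive
    distance is an upper bound on the answer's transitive distance.
    """
    s0, r0 = seed_pair
    bound = dis(p, s0) + dis(s0, r0)          # Estimate: search range

    # Filter: objects farther than `bound` from p cannot be in the answer.
    cand_S = [s for s in S if dis(p, s) <= bound]
    cand_R = [r for r in R if dis(p, r) <= bound]

    best = (s0, r0, bound)
    for s in cand_S:
        for r in cand_R:
            d = dis(p, s) + dis(s, r)
            if d < best[2]:
                best = (s, r, d)
    return best, len(cand_S), len(cand_R)

# Illustrative data (assumed).
p = (0.0, 0.0)
S = [(3.0, 4.0), (10.0, 0.0), (50.0, 50.0)]
R = [(3.0, 5.0), (0.0, 1.0), (-40.0, 30.0)]
best, n_s, n_r = estimate_filter_tnn(p, S, R, seed_pair=(S[1], R[1]))
print(best, n_s, n_r)
```

A tighter seed pair yields a smaller range and fewer candidates, which is why the quality of the Estimate step matters.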

Problem Analysis


Deficiencies of existing algorithms:
- Approximate-TNN-Search:
  - Uses an equation to estimate the search range in the first step
  - The search range may be too large or too small
- Window-Based-TNN-Search:
  - Two sequential NN searches in the estimation step
  - Search range estimation is done in sequential order
  - Large access time

Problem Analysis


Algo 1: Double-NN-Search
- Issue two NN queries in the estimation step: p's NN in S, and p's NN in R
- (s1, r2)
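A rough sketch of the Double-NN estimation step on in-memory lists is given below. The two NN searches are independent of each other, which is what would let a client run them on two channels at once; here they simply run one after the other. Combining the two results into the range dis(p, s) + dis(s, r) is an assumption based on the upper-bound theorem, and `nn` and `double_nn_range` are hypothetical names.

```python
import math

def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nn(q, points):
    """Nearest neighbor of q by linear scan (stand-in for an R-tree search)."""
    return min(points, key=lambda x: dis(q, x))

def double_nn_range(p, S, R):
    """Estimate a TNN search range from p's NN in S and p's NN in R."""
    s = nn(p, S)   # would run on the channel carrying S
    r = nn(p, R)   # would run on the channel carrying R, independently of s
    return s, r, dis(p, s) + dis(s, r)

# Illustrative data (assumed).
p = (0.0, 0.0)
S = [(3.0, 4.0), (10.0, 0.0)]
R = [(3.0, 5.0), (0.0, 1.0)]
s, r, search_range = double_nn_range(p, S, R)
print(s, r, search_range)
```

In this example the estimated range is larger than the true TNN distance of 6, so the Filter step is still needed to find the answer pair.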

New TNN algorithms – algo1


Algo 2: Hybrid-NN-Search
- Increases interaction between the two channels
- Uses the result of the finished NN search to guide the unfinished one, in order to reduce the search range
- Uses new distance metrics to perform branch-and-bound
- Treats the TNN distance as a whole

New TNN Algorithms – algo2


NN in Channel 1 finishes first
- Already found s = p.NN(S)
- Looking for r2 instead of r1

New TNN Algorithms – algo 2


NN in Channel 2 finishes first
- Already found r = p.NN(R)
- Looking for s2 instead of s1
- Use new criteria when searching the index
- Need new distance metrics for branch-and-bound

New TNN Algorithms – algo 2


MinTransDist: lower bound for the transitive distance from p through an MBR to r.
MinMaxTransDist: upper bound for the transitive distance from p through an MBR to r.
Details are given in the paper.

New TNN Algorithms –




Algorithm description:
- Case 1: While neither of the two NN searches has finished, follow the Double-NN algorithm.
- Case 2: If the NN search in Channel 1 (dataset S) finishes first, let s = p.NN(S), use s as the new query point, and perform the NN search on the remaining portion of the R-tree for dataset R.
- Case 3: If the NN search in Channel 2 (dataset R) finishes first, switch distance metrics: use MinTransDist and MinMaxTransDist to perform branch-and-bound, and find the s that minimizes the transitive distance.

New TNN Algorithms -



Updating and pruning strategy
- Use a queue to keep potential MBRs, sorted by arrival time.
- Case 2 (s = p.NN(S) finishes first):
  - Switch the NN query point to s.
  - Initial upper-bound update: if there is an intermediate result r', update the upper bound with dis(p, s) + dis(s, r').
  - Scan the queue of MBRs using the distance metrics of traditional NN queries.

New TNN Algorithms - Hybrid


Updating and pruning strategy (cont.)
- Case 3 (r = p.NN(R) finishes first):
  - If there is an intermediate result s', use dis(p, s') + dis(s', r) as the new upper bound.
  - Then scan all the MBRs in the queue and use z = min_{M_i ∈ MBR_queue} MinMaxTransDist(p, M_i, r) to update the upper bound.
  - During traversal, use MinMaxTransDist to update the upper bound and MinTransDist for pruning.
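The Case 3 queue scan can be sketched as below. To keep the example short, the two metrics are passed in as callables, and the demo uses lookup tables with made-up values in place of real MinTransDist and MinMaxTransDist computations; `update_and_prune` and the toy MBR names are hypothetical.

```python
from typing import Callable, Hashable, List, Tuple

Point = Tuple[float, float]
Metric = Callable[[Point, Hashable, Point], float]

def update_and_prune(p: Point, r: Point, mbr_queue: List[Hashable],
                     upper: float,
                     min_trans_dist: Metric,
                     min_max_trans_dist: Metric):
    """Tighten the upper bound with z, then drop MBRs that cannot qualify.

    z is the smallest MinMaxTransDist over the queue; an MBR is pruned
    when its MinTransDist (a lower bound) exceeds the upper bound.
    """
    if mbr_queue:
        z = min(min_max_trans_dist(p, m, r) for m in mbr_queue)
        upper = min(upper, z)
    kept = [m for m in mbr_queue if min_trans_dist(p, m, r) <= upper]
    return upper, kept

# Toy metric tables (placeholder values, not computed from geometry).
lower_table = {"A": 3.0, "B": 8.0, "C": 5.0}
upper_table = {"A": 6.0, "B": 12.0, "C": 9.0}

new_upper, kept = update_and_prune(
    p=(0.0, 0.0), r=(4.0, 0.0), mbr_queue=["A", "B", "C"], upper=10.0,
    min_trans_dist=lambda p, m, r: lower_table[m],
    min_max_trans_dist=lambda p, m, r: upper_table[m],
)
print(new_upper, kept)
```

Here the bound drops from 10 to 6, after which B can be skipped because no point inside it can give a transitive distance below 8.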

New TNN Algorithms - Hybrid


Example for pruning:

New TNN Algorithms - Hybrid


Goal: reduce energy consumption
Analysis:
- Previous algorithms minimize the search range in the Estimate step by issuing an "exact" search
- Energy consumption in the Filter step is low
- Energy consumption in the Estimate step is high
Approach: use an "approximate" search in the Estimate step to save energy in this step



Approximate search:
- Relax the pruning condition
- Use the ratio of overlapping area to estimate the probability
- Compare the ratio with a threshold α
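One possible reading of the relaxed pruning condition is sketched below. The slides do not say which two regions overlap, so this sketch assumes the ratio is the share of an MBR's area covered by the current search region, and it approximates the circular search region by its bounding square to keep the arithmetic simple. The names `overlap_ratio` and `should_visit` are hypothetical.

```python
def overlap_ratio(mbr, center, radius):
    """Share of the MBR's area that falls inside the bounding square of
    the search circle (assumption: square stands in for the circle)."""
    xmin, ymin, xmax, ymax = mbr
    sx0, sy0 = center[0] - radius, center[1] - radius
    sx1, sy1 = center[0] + radius, center[1] + radius
    w = min(xmax, sx1) - max(xmin, sx0)
    h = min(ymax, sy1) - max(ymin, sy0)
    if w < 0 or h < 0:
        return 0.0                       # disjoint regions
    area = (xmax - xmin) * (ymax - ymin)
    if area == 0:
        return 1.0                       # degenerate MBR touching the region
    return (w * h) / area

def should_visit(mbr, center, radius, alpha):
    """Visit a node only if the overlap ratio exceeds the threshold alpha.

    With alpha = 0 any positive overlap is visited, which is close to the
    exact search; a larger alpha skips nodes that overlap only slightly.
    """
    return overlap_ratio(mbr, center, radius) > alpha

# Illustrative numbers (assumed).
ratio = overlap_ratio((0.0, 0.0, 10.0, 10.0), center=(0.0, 0.0), radius=2.0)
print(ratio, should_visit((0.0, 0.0, 10.0, 10.0), (0.0, 0.0), 2.0, alpha=0.1))
```

Skipped nodes save tune-in time, at the cost of a possibly looser search range from the Estimate step.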



How to determine α? Factors:
- R-tree height and node depth: use a small α at the root and a large α at the leaves.
- Difference in densities of the two datasets involved: use a small α (or 0) on the dataset with the smaller density.





exact search

approximate search


Dataset 1:
- 39,000 × 39,000 square region
- Densities: 10^-7.0, 10^-6.6, 10^-6.2, 10^-5.8, 10^-5.4, 10^-5.0, 10^-4.6, 10^-4.2
- Number of points: 152, 382, 960, 2411, 6055, 15210, 38206, 95969
Dataset 2:
- 39,000 × 39,000 square region
- Number of points: 2,000 to 30,000, in increments of 2,000
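The point counts for Dataset 1 follow from the densities and the region size, assuming density means points per unit area. A short check of the arithmetic:

```python
side = 39_000
area = side * side  # 1,521,000,000 square units

# Density exponents listed for Dataset 1.
exponents = [-7.0, -6.6, -6.2, -5.8, -5.4, -5.0, -4.6, -4.2]

# Expected number of points = density * area, rounded to the nearest integer.
counts = [round(10 ** e * area) for e in exponents]
print(counts)
```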

Performance Evaluation - settings


- R-tree as air index
- Broadcast in depth-first order
- STR packing algorithm [3]
- (1, m) interleaving [2]
- 1,000 query points generated for each of the experiments

Performance Evaluation - settings

Parameter        Size
Index pointer    2 bytes
Coordinate       4 bytes
Data content     1k bytes
Page capacity    64 – 512 bytes

[3] S. Leutenegger, M. Lopez and J. Edgington. STR: a simple and efficient algorithm for R-tree packing.
[2] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on air: organization and access.




Algorithms with exact search:
- Access time: Double-NN and Hybrid-NN have the same access time, which is smaller than that of Window-Based
- 1/40 ≤ size(S) / size(R) ≤ 1.8

Performance Evaluation


Algorithms with exact search:
- Tune-in time: when 0.01 ≤ size(S) / size(R) ≤ 0.4, Hybrid-NN gives the best tune-in time

Performance Evaluation


ANN vs. eNN (approximate vs. exact NN search)
- Improvement in tune-in time ranges from 11% to 20%

Performance Evaluation


Hybrid algorithm with ANN:

Performance Evaluation


Conclusions:
- Double-NN and Hybrid-NN effectively reduce access time
- The cases in which our algorithms reduce tune-in time are stated and discussed
- The optimization technique effectively reduces the tune-in time of all three algorithms



- Generalized TNN queries in broadcast environments:
  - More than 2 datasets are involved
  - Visiting order not specified
  - Complete route query
- Using the new distance metrics in disk-based environments

Future Work


Any questions?

Thank you!


Def 1 (MinTransDist): Given two points p and r, and an MBR M_S, MinTransDist(p, M_S, r) = dis(p, s) + dis(s, r), where s is a point on M_S such that for any point s' ∈ M_S with s' ≠ s, dis(p, s') + dis(s', r) ≥ MinTransDist(p, M_S, r).
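One way to evaluate Def 1 for an axis-aligned MBR is sketched below; the paper's own procedure is not shown in the slides, so this is an assumption-laden illustration. It relies on two observations: if p or r lies inside the MBR, the minimum is simply dis(p, r); otherwise the minimum is attained on the boundary, and on each side the best point is the reflection-style crossing point clamped to that side. The names `min_trans_dist` and `_min_on_edge` are hypothetical.

```python
import math

def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def _inside(q, mbr):
    xmin, ymin, xmax, ymax = mbr
    return xmin <= q[0] <= xmax and ymin <= q[1] <= ymax

def _min_on_edge(p, r, axis, fixed, lo, hi):
    """Minimum of dis(p, s) + dis(s, r) for s on one axis-aligned side.

    axis = 0 means the side lies on x = fixed with y in [lo, hi];
    axis = 1 means the side lies on y = fixed with x in [lo, hi].
    """
    free = 1 - axis
    a = abs(p[axis] - fixed)          # distance from p to the side's line
    b = abs(r[axis] - fixed)          # distance from r to the side's line
    # Unconstrained minimizer on the line, then clamp to the side
    # (valid because the objective is convex along the line).
    t = (b * p[free] + a * r[free]) / (a + b) if a + b > 0 else p[free]
    t = max(lo, min(hi, t))
    s = [0.0, 0.0]
    s[axis], s[free] = fixed, t
    return dis(p, s) + dis(s, r)

def min_trans_dist(p, mbr, r):
    """Lower bound on dis(p, s) + dis(s, r) over all points s in the MBR."""
    if _inside(p, mbr) or _inside(r, mbr):
        return dis(p, r)
    xmin, ymin, xmax, ymax = mbr
    return min(
        _min_on_edge(p, r, 0, xmin, ymin, ymax),
        _min_on_edge(p, r, 0, xmax, ymin, ymax),
        _min_on_edge(p, r, 1, ymin, xmin, xmax),
        _min_on_edge(p, r, 1, ymax, xmin, xmax),
    )

# Illustrative values (assumed).
crossing = min_trans_dist((0.0, 0.0), (1.0, -1.0, 3.0, 1.0), (4.0, 0.0))
detour = min_trans_dist((0.0, 0.0), (1.0, 3.0, 3.0, 5.0), (4.0, 0.0))
print(crossing, detour)
```

When the segment from p to r passes through the MBR, the bound collapses to dis(p, r); otherwise it reflects the detour needed to touch the MBR.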

New TNN Algorithms – distance metrics (backup slides)


Def 2 (MaxDist): Given two points p and r, and a line segment ℓ, MaxDist(p, ℓ, r) = max_{i=1,2} { dis(p, v_i) + dis(v_i, r) }, where v_i (i = 1, 2) are the two end points of ℓ.
MaxDist(p, ℓ, r) gives a tight upper bound on the transitive distance from p, through any point on ℓ, to r.

New TNN Algorithms – distance metrics (backup slides)




Def 3 (MinMaxTransDist): Given two points p and r, and an MBR M_S, MinMaxTransDist(p, M_S, r) = min_{1≤i≤4} { MaxDist(p, ℓ_i, r) }, where ℓ_i (1 ≤ i ≤ 4) are the four sides of MBR M_S.
Lemma: Given a starting point p, an ending point r, and an MBR M_S enclosing a point dataset S, ∃ s ∈ S such that dis(p, s) + dis(s, r) ≤ MinMaxTransDist(p, M_S, r).
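Defs 2 and 3 translate almost directly into code. The sketch below assumes an MBR is given as (xmin, ymin, xmax, ymax); the function names and sample coordinates are illustrative, not taken from the paper.

```python
import math

def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def max_dist(p, segment, r):
    """Def 2: the larger transitive distance through the two end points."""
    v1, v2 = segment
    return max(dis(p, v1) + dis(v1, r), dis(p, v2) + dis(v2, r))

def min_max_trans_dist(p, mbr, r):
    """Def 3: minimum of MaxDist over the four sides of the MBR."""
    xmin, ymin, xmax, ymax = mbr
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    sides = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    return min(max_dist(p, side, r) for side in sides)

# Illustrative values (assumed).
p, r = (0.0, 0.0), (4.0, 0.0)
mbr = (1.0, 3.0, 3.0, 5.0)
bound = min_max_trans_dist(p, mbr, r)
print(bound)

# Lemma check on a toy dataset whose MBR is `mbr`: every side of an MBR
# touches at least one data point, so some point must meet the bound.
S = [(1.0, 4.0), (2.0, 3.0), (3.0, 4.5), (2.5, 5.0)]
best_in_S = min(dis(p, s) + dis(s, r) for s in S)
print(best_in_S <= bound)
```

Because the bound holds for some actual data point, it can safely replace the current upper bound during traversal without visiting the node's children first.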

New TNN Algorithms – distance metrics (backup slides)
