Slide1
Equivalence Between Priority Queues and Sorting in External Memory
Zhewei Wei
Renmin University of China
MADALGO, Aarhus University
Ke Yi
The Hong Kong University of Science and Technology
Slide2
Priority Queue
- Maintain a set of keys
- Support insertions, deletions and findmin (deletemin)
Fundamental data structure
Used as a subroutine in greedy algorithms
- Dijkstra’s single-source shortest path algorithm
- Prim’s minimum spanning tree algorithm
Slide3
Sorting to Priority Queue
Priority queue can do sorting
- Given N unsorted keys
- Insert the keys into the priority queue
- Perform N deletemin operations (find minimum and delete it)
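The steps above can be sketched in a few lines of Python. This is only an illustration: it assumes an in-memory binary heap from the standard `heapq` module as the priority queue, and the function name `pq_sort` is hypothetical rather than taken from the talk.

```python
import heapq


def pq_sort(keys):
    """Sort keys using only priority-queue operations."""
    heap = []
    for key in keys:
        heapq.heappush(heap, key)  # N insertions
    # N deletemin operations return the keys in ascending order.
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(pq_sort([5, 1, 4, 2, 3]))
```

With a binary heap each operation costs O(log N), so this particular instance is an O(N log N) sort.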
If a priority queue supports insertion, deletion, and findmin in S(N) time, then the sorting algorithm runs in O(N*S(N)) time.
Slide4
Priority Queue to Sorting
Thorup [2007]: sorting can do priority queue!
A sorting algorithm sorts N keys in N*S(N) time in the RAM model
Use sorting algorithm as a black box
A priority queue supports all operations in O(S(N)) time
- O(N loglog N) sorting -> O(loglog N) priority queue
- O() sorting -> O() priority queue
Slide5
The I/O Model [Aggarwal and Vitter 1988]
- CPU
- Memory (size: M)
- Disk (unlimited size)
- Block (size: B)
Complexity: # of block transfers (I/Os)
CPU computations and memory accesses are free
Slide6
Cache-Oblivious Model
- CPU
- Memory (size: ?)
- Disk (unlimited size)
- Block (size: ?)
Optimal without knowledge of M and B
Optimal for all M and B
Slide7
Sorting in the I/O Model
Sorting bound: Sort(N) = Θ((N/B) * log_{M/B} N) I/Os
- Upper bound: external merge sort
- Lower bound: holds for the comparison model or under the indivisibility assumption (treat keys as atoms)
- Conjecture: lower bound holds for B not too small, even without the indivisibility assumption
Slide8
Priority Queue in External Memory
Tree-based: do not give any priority queue-to-sorting reduction
O((1/B) * log_{M/B} N) amortized cost
I/O model:
- Buffer tree [Arge 1995]
- M/B-ary heaps [Fadel et al. 1999]
- Array heaps [Brodal and Katajainen 1998]
Slide9
Priority Queue in External Memory
Cache-oblivious priority queue [Arge et al. 2002]
- Keys are moving around in loglog N levels
- O((1/B) * log_{M/B} N) with tall cache assumption M > B^2
Reduction: Given an external sorting algorithm that sorts N keys in N*S(N)/B I/Os, there is an external priority queue that supports all operations in O(S(N) loglog N/B) amortized I/Os
Slide10
Our Results
A sorting algorithm sorts N keys in N*S(N)/B I/Os in the I/O model
Use sorting algorithm as a black box
A priority queue supports all operations in (1/B) * Σ_{i≥0} S(B*log^(i)(N/B)) amortized I/Os
- The sum expands to S(N) + S(B*log N) + S(B*loglog N) + …
- S(N)/B for S(N) = Ω(2^(log* N)), or M = Ω(B*log^(c) N)
- Otherwise O((S(N) log* N)/B)
No new bounds for external priority queue
External priority queue lower bound -> external sorting lower bound
Slide11
Outline
- How Thorup did it (on a high level)
- How we extend it in external memory (on a high level)
- Open problems
Slide12
Thorup’s Reduction
Word RAM model:
- each word consists of w ≥ log N bits
- constant number of registers, each with capacity for one word
Atomic heap [Han 2004]: supports insertions, deletions, and predecessor queries in a set of O(log^2 N) size in constant time
Slide13
Thorup’s Reduction – O(S(N)*log N)
O(log N) levels: c keys, 2c keys, …, N/4 keys, N/2 keys, N keys
Keep min in the head
Invariant: keys in a higher level are larger than keys in a lower level
Slide14
Thorup’s Reduction – O(S(N)*log N)
- Rebalance cost for level 2^j: 2^j*S(N)
- # of sorts in N updates: N/2^j
- Amortized cost in level 2^j: S(N), with log N levels
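The per-level accounting can be written out as a short calculation. This is a sketch under two assumptions that the slide does not state explicitly: levels are indexed by j = 0, …, log N, and each sort of level 2^j is charged evenly to the N updates.

```latex
\[
\underbrace{\frac{N}{2^{j}}}_{\text{sorts of level } 2^{j}}
\cdot
\underbrace{2^{j}\,S(N)}_{\text{cost per sort}}
= N\,S(N)
\quad\Longrightarrow\quad
\text{amortized cost per update at level } 2^{j}
= \frac{N\,S(N)}{N} = S(N),
\]
\[
\sum_{j=0}^{\log N} S(N) = O\bigl(S(N)\log N\bigr).
\]
```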
O(log N) levels: c keys, 2c keys, …, N/4 keys, N/2 keys, N keys
Cost: O(S(N)*log N)
Slide15
Thorup’s Reduction
O(log N) levels: 1 base set, 2 base sets, …, N/(4 log N) base sets, N/(2 log N) base sets, N/log N base sets
Base sets of size log N
- Split/merge base sets: S(N) amortized
- Rebalancing level 2^j: 2^j*S(N)/log N
- # of rebalances in N updates: N/2^j
- Amortized cost for level 2^j: S(N)/log N
O(S(N)) amortized cost
Slide16
Thorup’s Reduction
Levels: 1 base set, 2 base sets, …, N/(4 log N) base sets, N/(2 log N) base sets, N/log N base sets
Base sets of size log N
Atomic heap of size log N: O(1) cost
- Split/merge base sets: S(N) amortized
- Rebalancing level 2^j: 2^j*S(N)/log N
- # of rebalances in N updates: N/2^j
- Amortized cost for level 2^j: S(N)/log N
O(S(N)) amortized cost
Slide17
Thorup’s Reduction
Amortized cost: O(S(N))
Atomic heap of size log N: O(1) cost
Levels: 1 base set, 2 base sets, …, N/(4 log N) base sets, N/(2 log N) base sets, N/log N base sets
Buffer sizes: …, N/(4 log N), N/(2 log N), N/log N
Slide18
Externalize Thorup’s Reduction
Where does B come in?
How to replace atomic heap?
How to handle deletions in external memory?
Slide19
Where does B come in?
Buffer of size B*log N
Levels: 1 base set, 2 base sets, …, N/(4B log N) base sets, N/(2B log N) base sets, N/(B log N) base sets
Buffer sizes: …, N/(4 log N), N/(2 log N), N/log N
Base sets of size B*log N
Slide20
I/O-efficient Flush Operation
Buffer of size R, k substructures
- Sort keys in buffer: O(R*S(R)/B)
- Distribute keys to k substructures: O(R/B + k)
Total I/O cost: O(R*S(N)/B + k)
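The two flush steps can be illustrated with a small in-memory sketch. Several things here are assumptions for illustration, not details from the talk: Python's built-in `sorted` stands in for the black-box external sort, the k substructures are identified by k - 1 sorted splitter keys, and the I/O count is a rough estimate (one read pass over the sorted buffer plus block writes into each substructure that receives keys).

```python
from math import ceil


def flush(buffer, splitters, block_size):
    """Sort the buffer, then distribute it to k = len(splitters) + 1 parts.

    Part i receives the keys in [splitters[i-1], splitters[i]).
    Returns the parts and an estimated I/O count for the distribution.
    """
    keys = sorted(buffer)  # stand-in for the black-box external sort
    parts = [[] for _ in range(len(splitters) + 1)]
    i = 0
    for key in keys:  # one sequential scan over the sorted keys
        while i < len(splitters) and key >= splitters[i]:
            i += 1
        parts[i].append(key)
    # Reading costs ceil(R/B) blocks; writing costs at most R/B + k blocks,
    # because every touched substructure may end on a partial block.
    reads = ceil(len(keys) / block_size)
    writes = sum(ceil(len(p) / block_size) for p in parts if p)
    return parts, reads + writes


parts, ios = flush([9, 1, 7, 3, 5, 2], splitters=[3, 6], block_size=2)
print(parts, ios)
```

Because the keys are already sorted, the distribution is a single scan, which is where the O(R/B + k) term comes from.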
If k = O(R/B), the total flush cost is O(R*S(N)/B), and the amortized cost per key is O(S(N)/B)
Slide21
Where does B come in?
If a level holds 2^j keys:
- Largest buffer size: 2^j/log N
- Largest # of base sets: 2^j/(B log N)
- Smallest base set (head) size: B*log N
Amortized I/O cost for flushing level buffers: O(S(N)/B)
Slide22
Replacing Atomic Heap
Buffer of size B*log N
- R = B*log N
- k = log N
Slide23
Replacing Atomic Heap
Buffer of size B*log N
Head of size O(B log N)
Amortized I/O cost: O(S(N)/B)
Recursively build the structure in the head
Slide24
Recursively Build Layers
O(log* N) layers: N keys, B*log(N/B) keys, B*loglog(N/B) keys, …, 2^c*B keys, cB keys
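To get a feel for how quickly the layers shrink, the sizes can be tabulated with a short script. This is a sketch with assumptions of its own: logarithms are base 2, the recursion stops once the iterated logarithm drops to a constant c ≥ 1, and the name `layer_sizes` is hypothetical.

```python
import math


def layer_sizes(n, b, c=2):
    """Sizes N, B*log(N/B), B*loglog(N/B), ..., c*B of the layers."""
    sizes = [n]
    x = n / b
    while True:
        x = math.log2(x)  # next iterated logarithm of N/B
        if x <= c:
            sizes.append(c * b)  # smallest layer holds c*B keys
            break
        sizes.append(int(b * x))
    return sizes


# Example with N = 2^26 keys and blocks of B = 2^10 keys.
print(layer_sizes(2**26, 2**10))
```

The number of entries returned grows like log*(N/B), which is the source of the log* N factor in the bounds.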
Levels rebalancing
- Move base sets around
- Redistribute buffer
- S(N)/(B log N) for one level
- S(N)/B for one layer
- S(N) log* N/B amortized I/O cost
Slide25
Recursively Build Layers
O(log* N) layers: N keys, B*log(N/B) keys, B*loglog(N/B) keys, …, 2^c*B keys, cB keys
Layers rebalancing
- Rebuild the first (last) level
- S(N)/B for one layer
- S(N) log* N/B amortized I/O cost
Slide26
Recursively Build Layers
O(log* N) layers: N keys, B*log(N/B) keys, B*loglog(N/B) keys, …, 2^c*B keys, cB keys
Slide27
Recursively Build Layers
Layers: N keys, B*log(N/B) keys, B*loglog(N/B) keys, …, 2^c*B keys, cB keys
Memory buffer of size O(B)
- R = B
- k = log* N
Slide28
Recursively Build Layers
Layers: N keys, B*log(N/B) keys, B*loglog(N/B) keys, …, 2^c*B keys, cB keys
Memory buffer of size O(B)
- Amortized cost: log* N/B
I/O cost per update: O(S(N) log* N/B)
Slide29
Handle Deletions
Following a pointer to perform a deletion takes 1 I/O per deletion
Deleting signals:
- Delete x -> Insert (-, x)
- Perform actual deletion afterwards
Unlike the buffer tree, we don’t have access to the “leaves” (base sets)
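The deleting-signal idea can be illustrated with an in-memory analogue. This is only a sketch, not the external structure from the talk: it uses a `heapq` binary heap, the class name `LazyDeletePQ` is hypothetical, and it assumes a key is present in the queue whenever `delete` is called on it. A signal is stored as `(x, 0)` so that it sorts just before the key `(x, 1)` it cancels, and signals are resolved only once they reach the head.

```python
import heapq


class LazyDeletePQ:
    """Priority queue where delete(x) just inserts a signal for x."""

    def __init__(self):
        self._heap = []

    def insert(self, key):
        heapq.heappush(self._heap, (key, 1))  # 1 marks a real key

    def delete(self, key):
        heapq.heappush(self._heap, (key, 0))  # 0 marks a deleting signal

    def _settle(self):
        # Resolve signals only when they sit at the head of the queue.
        while self._heap and self._heap[0][1] == 0:
            key = self._heap[0][0]
            pending = 0
            while self._heap and self._heap[0] == (key, 0):
                heapq.heappop(self._heap)
                pending += 1
            while pending and self._heap and self._heap[0] == (key, 1):
                heapq.heappop(self._heap)  # the actual deletion
                pending -= 1

    def findmin(self):
        self._settle()
        return self._heap[0][0]

    def deletemin(self):
        self._settle()
        return heapq.heappop(self._heap)[0]


pq = LazyDeletePQ()
for k in (5, 3, 8):
    pq.insert(k)
pq.delete(3)           # only a signal is inserted here
print(pq.deletemin())  # the signal cancels 3 at the head first
```

The point of the design is that `delete` never has to locate the key, so it costs the same as an insertion; the cancellation is deferred until both entries surface together.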
Invariant: only process deleting signals in the head
Slide30
Schedule
Avoid repeated sorting
If head or memory buffer unbalanced:
- Flush stage: flush all overflowed buffers and rebalance all unbalanced base sets
- Push stage: rebalance all overflowed layers and levels (expand)
- Pull stage: deal with delete signals and rebalance all underflowed layers and levels (shrink)
Slide31
Open problems
Optimal reduction?
- Priority queue that supports insertions/deletions in O(1/B) I/O cost for a set of size O(B*log^(c) N)
- New reduction framework
Better (than loglog N) reduction in the cache-oblivious model?
- Hard to do I/O-efficient flushing and rebalancing without knowing B
Slide32
Thank You!