Uploaded On 2017-09-05



Presentation Transcript

Slide1

Encodings of Range Maximum-Sum Segment Queries and Applications

Paweł Gawrychowski* and Pat Nicholson**

*University of Warsaw

**Max-Planck-Institut für Informatik

Slide2

Range Queries in Arrays

Input: an array $A[1..n]$

Preprocess the array to answer queries of the form

“Given a range $[i, j]$, find _____ in the subarray $A[i..j]$”

Where _____ is something like:

the index of the maximum/minimum element

the indices of the top-$k$ values

the index of the $k$-th largest/smallest number

the maximum-sum range

 Slide3

Encoding Range Queries in Arrays

How much space do we need to answer these queries?

As an example, think of range minimum queries (RMinQ):

If we return the value of the minimum, then we must store the array. Why? Because we can ask the query for each range $[i, i]$, and this allows us to recover the entire array.

If we return just the array index, then we can do much better: there is a succinct data structure that occupies $2n + o(n)$ bits and answers queries in constant time. Fischer and Heun (2011)
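The two observations above can be checked on a toy example. This is a minimal sketch, not part of the talk: the oracles `rmin_value` and `rmin_index` are hypothetical brute-force stand-ins for a real data structure, and indices are 0-based.

```python
def rmin_value(A, i, j):
    """Hypothetical oracle: value of the minimum in A[i..j] (0-based, inclusive)."""
    return min(A[i:j + 1])


def rmin_index(A, i, j):
    """Hypothetical oracle: index of the minimum in A[i..j] (0-based, inclusive)."""
    return min(range(i, j + 1), key=lambda k: A[k])


def recover(n, value_query):
    """Rebuild the whole array from the n point queries [i, i]."""
    return [value_query(i, i) for i in range(n)]


A = [5, 2, 7, 3, 9]
B = [50, 10, 70, 20, 90]  # different values, same relative order as A

# Value queries leak the array itself, so they cannot use less space than A.
recovered = recover(len(A), lambda i, j: rmin_value(A, i, j))

# Index queries give identical answers on A and B, so they carry less information.
same_answers = all(
    rmin_index(A, i, j) == rmin_index(B, i, j)
    for i in range(len(A))
    for j in range(i, len(A))
)
```

Here `recovered` equals `A`, while `same_answers` is true even though `A` and `B` store different numbers, which is why an index-only encoding can be much smaller than the input.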

 Slide4

Typical Data Structure

Input Data (Relatively Big)

Slide5

Typical Data Structure

Input Data (Relatively Big)

Preprocess

Data Structure

Slide6

Encoding Approach

Input Data (Relatively Big)

Slide7

Encoding Approach

Input Data (Relatively Big)

Preprocess w.r.t. Some Query

Encoding (Hope: much smaller)

Slide8

Encoding Approach

Input Data (Relatively Big)

Preprocess w.r.t. Some Query

Encoding (Hope: much smaller)

Slide9

Encoding Approach

Input Data (Relatively Big)

Preprocess w.r.t. Some Query

Encoding (Hope: much smaller)

Auxiliary Data Structures: (Should be smaller still)

Slide10

Encoding Approach

Input Data (Relatively Big)

Preprocess w.r.t. Some Query

Encoding (Hope: much smaller)

Auxiliary Data Structures: (Should be smaller still)

Succinct Data Structure: Minimum Space Possible

Slide11

Encoding Approach

Input Data (Relatively Big)

Preprocess w.r.t. Some Query

Encoding (Hope: much smaller)

Auxiliary Data Structures: (Should be smaller still)

Succinct Data Structure: Minimum Space Possible

Query (Hope: as fast as non-succinct counterpart)

Slide12

This Talk: Maximum-Sum Segments

From Jon Bentley’s “Programming Pearls”:

Input: an array $A[1..n]$ containing arbitrary numbers

Output: the range $[i, j]$ such that $\sum_{k=i}^{j} A[k]$ is maximized

Only non-trivial if the array contains negative numbers

Can be solved in linear time (credited to Kadane)

Applications:

Bentley [1986]: “[problem] is a toy – it was never incorporated into a system.”

Chen and Chao [2004]: “…plays an important role in sequence analysis.”

We focus on the range query case:

Given a query range $[i, j]$, find the range $[i', j'] \subseteq [i, j]$ such that $\sum_{k=i'}^{j'} A[k]$ is maximized

Also motivated by biological sequence analytics applications
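For reference, the linear-time scan credited to Kadane can be sketched as follows. The function name `max_sum_segment`, the 0-based indexing and the restriction to non-empty ranges are choices made for this illustration.

```python
def max_sum_segment(A):
    """Return (i, j, total) for a non-empty range A[i..j] of maximum sum.

    Indices are 0-based and inclusive. One left-to-right pass: keep the best
    segment ending at the current position, and restart it whenever its
    running sum has dropped below zero.
    """
    best_sum, best_i, best_j = A[0], 0, 0
    cur_sum, cur_i = A[0], 0
    for k in range(1, len(A)):
        if cur_sum < 0:
            cur_sum, cur_i = A[k], k      # a negative prefix never helps
        else:
            cur_sum += A[k]
        if cur_sum > best_sum:
            best_sum, best_i, best_j = cur_sum, cur_i, k
    return best_i, best_j, best_sum


result = max_sum_segment([31, -41, 59, 26, -53, 58, 97, -93, -23, 84])
```

On this input `result` is `(2, 6, 187)`, the range covering 59, 26, -53, 58, 97. The range query version asks for the same answer restricted to an arbitrary subarray, without rescanning it.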

 Slide13

Range Maximum-Sum Segment Queries

What was known:

Chen and Chao [ISAAC 2004, Disc. App. Math. 2007]

This can be done in $O(n)$ words of space and $O(1)$ query time

Very closely related to the range maximum problem:

An RMSSQ structure answers RMaxQ: pad elements with large negative numbers

RMinQ/RMaxQ structures answer RMSSQ: more complicated argument

Slide14

Range Maximum-Sum Segment Queries

What was not known:

Is there an efficient encoding structure for this problem?

That is: can we beat $O(n)$ words?

Really similar to RMaxQ

The known solution stores several word arrays and compares various array elements
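One way to see how close RMSSQ is to RMaxQ is the padding trick: interleave the array with negative numbers so large that no maximum-sum segment can afford to cross one. The sketch below is an illustration under assumptions, not the construction from the cited papers: `rmssq_bruteforce` is a hypothetical brute-force stand-in for an RMSSQ structure, segments are non-empty, and indices are 0-based.

```python
def rmssq_bruteforce(B, i, j):
    """Stand-in RMSSQ oracle: non-empty subrange of B[i..j] with maximum sum."""
    best = None
    for p in range(i, j + 1):
        total = 0
        for q in range(p, j + 1):
            total += B[q]
            if best is None or total > best[0]:
                best = (total, p, q)
    return best[1], best[2]


def pad(A):
    """Interleave A with a negative number too large for any segment to absorb."""
    big = 2 * max(abs(x) for x in A) + 1
    B = []
    for x in A:
        B.extend([x, -big])
    return B


def rmaxq_via_rmssq(B, i, j):
    """Index of a maximum of A[i..j], using only RMSSQ on the padded array B."""
    # A[k] sits at B[2k]; the best segment is always a single original element.
    p, _ = rmssq_bruteforce(B, 2 * i, 2 * j)
    return p // 2


A = [3, -1, 4, 1, -5, 9, 2]
B = pad(A)
answer = rmaxq_via_rmssq(B, 1, 4)   # maximum of A[1..4] is A[2] = 4
```

Any segment of `B` longer than one element contains a pad, which costs more than the other elements can add, so the maximum-sum segment collapses to the position of the range maximum.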

 Slide15

Range Maximum-Sum Segment Queries

Our main results:

We can encode these queries using $O(n)$ bits (rest of this talk)

A space lower bound, in bits, for any encoding of these queries (enumeration argument)

Application to computing $k$-covers ($k$ disjoint subranges that achieve the maximum sum)

Csűrös: “The problem arises in DNA and protein segmentation, and in postprocessing of sequence alignments.”

Slide16

Main Idea: $O(n)$ word solution

Define an array $C$ consisting of the partial sums of $A$

Slide17

Main Idea: $O(n)$ word solution

Imagine shooting a ray from each $C[j]$ to the left

Slide18

Main Idea: $O(n)$ word solution

Imagine shooting a ray from each $C[j]$ to the left

Slide19

Main Idea: $O(n)$ word solution

Now find the minimum in this range

Slide20

Main Idea: $O(n)$ word solution

Now find the minimum in this range

Slide21

Main Idea: $O(n)$ word solution

Define another array $P$ storing the positions of these minima

 Slide22

Candidate Pairs

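We call each pair $(P[j], j)$ a candidate

We define (yet another) array $Q$ as follows: $Q[j]$ is the score of the candidate $(P[j], j)$

That is: the sum within the range the candidate delimits, $C[j] - C[P[j]]$, where $C$ holds the partial sums

Slide23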

What Do They Store?

The array $C$: $O(n)$ words (cumulative sums)

The array $P$: $O(n)$ words (candidate partners)

Range min (RMinQ) structure on $C$: $2n + o(n)$ bits

Range max (RMaxQ) structure on the candidate scores $Q$: $2n + o(n)$ bits
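The arrays listed above can be built directly from their definitions. The sketch below is a quadratic-time illustration rather than a linear-time construction, and the names `build_candidates`, `C`, `P`, `Q` and the indexing convention are assumptions made for this example: `C[k]` is the sum of the first `k` elements, so the candidate `(P[j], j)` stands for the segment `A[P[j]..j-1]` in 0-based terms.

```python
from itertools import accumulate


def build_candidates(A):
    """Return (C, P, Q): partial sums, candidate partners, candidate scores.

    C[k] = A[0] + ... + A[k-1], so the segment A[p..q] has sum C[q+1] - C[p].
    Entries P[0] and Q[0] are unused placeholders.
    """
    C = [0] + list(accumulate(A))
    n = len(A)
    P = [None] * (n + 1)
    Q = [None] * (n + 1)
    for j in range(1, n + 1):
        # Shoot a ray left from C[j]: stop at the first l with C[l] >= C[j],
        # or at position 0 if the ray hits nothing.
        l = j - 1
        while l > 0 and C[l] < C[j]:
            l -= 1
        # Position of the minimum of C on [l, j-1]; ties go to the rightmost one.
        P[j] = max(range(l, j), key=lambda k: (-C[k], k))
        Q[j] = C[j] - C[P[j]]
    return C, P, Q


C, P, Q = build_candidates([2, -3, 4, -1, 2])
# C == [0, 2, -1, 3, 2, 4]
# P[1:] == [0, 1, 2, 3, 2]
# Q[1:] == [2, -3, 4, -1, 5]
```

For instance `Q[5] == 5` is the sum of `A[2..4]`, that is 4 - 1 + 2, the best segment ending at the last element.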

 Slide24

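Main Idea: $O(n)$ word solution

How to answer a query $[i, j]$: the easy case

Slide25

Main Idea: $O(n)$ word solution

Let $x = \mathrm{RMaxQ}(Q, i, j)$, the position of the best candidate score in the query range, and examine candidate pair $(P[x], x)$

Slide26

Main Idea: $O(n)$ word solution

If $P[x]$ is in the query range, return $(P[x], x)$

Slide27

Main Idea: $O(n)$ word solution

How to answer a query: the not so easy case

Slide28

Main Idea: $O(n)$ word solution

Let $x = \mathrm{RMaxQ}(Q, i, j)$ … this time $P[x]$ falls to the left of the query range

Slide29

Main Idea: $O(n)$ word solution

Let $t$ be the position of the minimum of the partial sums $C$ between the left end of the query range and $x$ …

Slide30

Main Idea: $O(n)$ word solution

Let $t$ be as before, and let $y$ be the position of the best candidate score to the right of $x$ in the query range

Slide31

Main Idea: $O(n)$ word solution

Return the greater sum: $(t, x)$, the candidate ending at $x$ clipped to the query range, or $(P[y], y)$, the best candidate to the right of $x$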
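The two cases of the query can be put together as follows. This is a sketch under assumptions, not the authors' implementation: linear scans (`argmin`, `argmax`) stand in for the constant-time RMinQ/RMaxQ structures, the arrays are built naively, and the names `build`, `query`, `C`, `P`, `Q` and the 0-based convention (`C[k]` is the sum of the first `k` elements) are chosen for this example.

```python
from itertools import accumulate


def build(A):
    """Partial sums C, candidate partners P and candidate scores Q (naive build)."""
    C = [0] + list(accumulate(A))
    P, Q = [None], [None]
    for j in range(1, len(C)):
        l = j - 1
        while l > 0 and C[l] < C[j]:        # ray shot to the left from C[j]
            l -= 1
        p = max(range(l, j), key=lambda k: (-C[k], k))
        P.append(p)
        Q.append(C[j] - C[p])
    return C, P, Q


def argmin(X, lo, hi):
    """Linear-scan stand-in for RMinQ on X[lo..hi]."""
    return min(range(lo, hi + 1), key=lambda k: X[k])


def argmax(X, lo, hi):
    """Linear-scan stand-in for RMaxQ on X[lo..hi]."""
    return max(range(lo, hi + 1), key=lambda k: X[k])


def query(C, P, Q, i, j):
    """Maximum-sum non-empty segment (p, q) inside A[i..j], 0-based inclusive."""
    x = argmax(Q, i + 1, j + 1)             # best candidate ending in the range
    if P[x] >= i:                           # easy case: its partner is in range
        return P[x], x - 1
    t = argmin(C, i, x - 1)                 # not so easy case: clip the left end
    best = (C[x] - C[t], t, x - 1)
    if x < j + 1:
        y = argmax(Q, x + 1, j + 1)         # best candidate to the right of x
        best = max(best, (Q[y], P[y], y - 1))
    return best[1], best[2]


A = [2, -3, 4, -1, 2]
C, P, Q = build(A)
whole = query(C, P, Q, 0, 4)    # easy case
suffix = query(C, P, Q, 3, 4)   # not so easy case: the partner of x lies left of i
```

On this array `whole` is `(2, 4)` with sum 5, and `suffix` is `(4, 4)` with sum 2. The comparison in the not so easy case is the only place where actual sums are needed, which is what makes the array of partial sums hard to drop.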

 

 

 

 Slide32

Reducing the Space

What are the bottlenecks in the data structure?

Storing the array $P$ of candidate partners: we need to store the candidate pairs

Storing the array $C$ of partial sums: we must compare scores of candidates in the not so easy case

Slide33

Dealing with $P$: Bottleneck I

Slide34

Dealing with $P$: Bottleneck I

 Slide35

Nested Is Good

Imagine indices as vertices, candidate pairs as edges

We can represent an $n$-edge nested graph in $O(n)$ bits

Also known as a one-page or outerplanar graph

Navigation is efficient: select vertices, follow edges, etc.

Jacobson (1989), Munro and Raman (2001); $O(n)$ bits

Figure: vertices 1 to 8, encoded by the parenthesis sequence ()(( ())( ()( ())) ())(( ()( ())) ())

Slide36

Dealing with $P$: Bottleneck I
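The nested-graph idea can be made concrete with a small encoder and decoder. The convention below is an assumption for illustration: each vertex writes `()`, then one `)` per edge arriving from the left, then one `(` per edge leaving to the right. Under that assumption the encoder reproduces the parenthesis string from the nested-graph slide, but succinct structures with rank/select support are not shown here; `encode` and `decode` are hypothetical helper names.

```python
def encode(n, edges):
    """Encode a nested (non-crossing) graph on vertices 1..n as parentheses.

    Every edge is a pair (u, v) with u < v. The output uses 2 symbols per
    vertex and 2 per edge, so 2 * (n + len(edges)) bits in total.
    """
    starts = [0] * (n + 1)
    ends = [0] * (n + 1)
    for u, v in edges:
        starts[u] += 1
        ends[v] += 1
    return "".join("()" + ")" * ends[v] + "(" * starts[v] for v in range(1, n + 1))


def decode(bits):
    """Recover (n, sorted edge list) from the parenthesis string alone."""
    edges, open_edges, v, k = [], [], 0, 0
    while k < len(bits):
        if bits.startswith("()", k):      # a vertex marker
            v += 1
            k += 2
        elif bits[k] == "(":              # an edge leaves the current vertex
            open_edges.append(v)
            k += 1
        else:                             # an edge arrives at the current vertex
            edges.append((open_edges.pop(), v))
            k += 1
    return v, sorted(edges)


slide_string = "".join(["()((", "())(", "()(", "()))", "())((", "()(", "()))", "())"])
n, edges = decode(slide_string)
```

Decoding the 30-symbol string gives 8 vertices and the 7 nested edges (1,2), (1,5), (2,4), (3,4), (5,7), (5,8), (6,7), and `encode(n, edges)` returns the same string. Because the edges never cross, matching parentheses is enough to recover every pair, which is why the candidate pairs need only a linear number of bits.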

 Slide37

Dealing with $C$: Bottleneck II

Slide38

Dealing with $C$: Bottleneck II

  

 

 Slide39

Dealing with $C$: Bottleneck II

We call the point […] the left sibling of […]

Knowing the left sibling, we can handle the not so easy case.

 

 

 

 Slide40

Recall The Query Algorithm

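Return the greater sum: $(t, x)$, the candidate ending at $x$ clipped to the query range, or $(P[y], y)$, the best candidate to the right of $x$

Slide41

Recall The Query Algorithm

Return the greater sum: $(t, x)$ or $(P[y], y)$

If the left sibling of […] is […]

Slide42

Recall The Query Algorithm

Return the greater sum: $(t, x)$ or $(P[y], y)$

If the left sibling of […] is […]

Slide43

Recall The Query Algorithm

Return the greater sum: $(t, x)$ or $(P[y], y)$

Left sibling of […] can’t be here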

 

 Slide44

Dealing with $C$: Bottleneck II

Problem: cannot store the left siblings explicitly

Slide45

Dealing with $C$: Bottleneck II

Idea: try to find something that is nested

Slide46

Dealing with $C$: Bottleneck II

Solution: the pairs formed by a point and its left sibling are nested

 Slide47

Dealing with $C$: Bottleneck II

Slide48

Dealing with $C$: Bottleneck II

 Slide49

What Do We Store?

The graph representing candidates: $O(n)$ bits

The graph representing left siblings: $O(n)$ bits

Range min (RMinQ) structure on the partial sums $C$: $2n + o(n)$ bits

Range max (RMaxQ) structure on the candidate scores $Q$: $2n + o(n)$ bits

Grand total: $O(n)$ bits… (the constant can be reduced slightly with more tricks)

 Slide50

Thank You