Approximate On-line Palindrome Recognition, and Application

Uploaded On 2016-06-12




Presentation Transcript

Slide1

Approximate On-line Palindrome Recognition, and Applications

Amihood Amir

Benny Porat

Slide2

Moskva River

Slide3

Confluence of 4 Streams

Palindrome Recognition

Approximate Matching

Interchange Matching

Online Algorithms

CPM 2014

Slide4

Palindrome Recognition

"- Voz'mi-ka slovo ropot, - govoril Cincinnatu ego shurin, ostriak, - i prochti obratno. A? Smeshno poluchaetsia?"

Vladimir Nabokov, Invitation to a Beheading (1)

"Take the word ropot [murmur]," Cincinnatus' brother-in-law, the wit, was saying to him, "and read it backwards. Eh? Comes out funny, doesn't it?" [read backwards: topor, the axe]

A palindrome is a string that is the same whether read from right to left or from left to right.

Examples:

доход (Russian: income)

A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal-Panama!

Slide5

Palindrome Example

Ibn Ezra: Medieval Jewish philosopher, poet, Biblical commentator, and mathematician.

Was asked:

"אבי אל חי שמך למה מלך משיח לא יבא"

[My Father, the Living God, why does the king messiah not arrive?]

His response:

"דעו מאביכם כי לא בוש אבוש, שוב אשוב אליכם כי בא מועד"

[Know you from your Father that I will not be delayed. I will return to you when the time will come.]

Slide6

Palindromes in Computer Science

Great programming exercise in CS 101.

Example of a problem that can be solved by a RAM in linear time, but not in linear time by a 1-tape Turing machine.

(Can be done in linear time by a 2-tape TM)

Slide7

Palindrome Concatenation

We may be interested in finding out whether a string is a concatenation of palindromes of length > 1.

Example: ABCCBABBCCBCAACB

Why would we be interested in such a funny problem? We'll soon see.

Exercise: Do this in linear time…
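A minimal sketch of the problem statement, not a solution to the exercise: this is a straightforward quadratic-time dynamic program rather than the linear-time algorithm the slide asks for. The function name `is_palindrome_concatenation` is hypothetical.

```python
def is_palindrome_concatenation(s: str) -> bool:
    """Return True if s splits into palindromes that each have length > 1."""
    n = len(s)
    # pal[i][j] is True when the slice s[i:j] is a palindrome.
    pal = [[False] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        pal[i][i] = True          # empty slice
        if i < n:
            pal[i][i + 1] = True  # single character
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            pal[i][j] = s[i] == s[j - 1] and pal[i + 1][j - 1]
    # ok[j] is True when the prefix s[:j] is a valid concatenation.
    ok = [False] * (n + 1)
    ok[0] = True
    for j in range(2, n + 1):
        # The last piece is s[i:j]; i <= j - 2 keeps its length above 1.
        ok[j] = any(ok[i] and pal[i][j] for i in range(j - 1))
    return ok[n]


print(is_palindrome_concatenation("ABCCBABBCCBCAACB"))
```

The table `pal` costs quadratic space; it is kept here only because it makes the recurrence easy to read.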

The example decomposes as: ABCCBA | BB | CC | BCAACB

Slide8

Stream 2 - Approximations

As in exact matching, there may be errors. Find the minimum number of errors that, if fixed, will give a string that is a concatenation of palindromes of length > 1.

Example: ABCCBCBBCCBCABCB

For Hamming distance: A-Porat [ISAAC 13]: algorithm of time $O(n^2)$.
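One way to reach a quadratic bound for Hamming distance is a textbook dynamic program over split points. The sketch below is an illustration under that assumption and is not claimed to be the algorithm of the cited paper; the function name is hypothetical.

```python
def min_errors_to_palindrome_concatenation(s: str) -> int:
    """Fewest substitutions that turn s into a concatenation of palindromes of length > 1."""
    n = len(s)
    if n == 1:
        raise ValueError("a single character cannot be split into palindromes of length > 1")
    # cost[i][j]: substitutions needed to make the slice s[i:j] a palindrome,
    # i.e. the number of mirrored pairs inside the slice that disagree.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cost[i][j] = cost[i + 1][j - 1] + (s[i] != s[j - 1])
    # best[j]: minimum number of substitutions for the prefix s[:j].
    best = [float("inf")] * (n + 1)
    best[0] = 0
    for j in range(2, n + 1):
        best[j] = min(best[i] + cost[i][j] for i in range(j - 1))
    return best[n]


print(min_errors_to_palindrome_concatenation("ABCCBCBBCCBCABCB"))
```

A result of 0 means the input is already a concatenation of palindromes of length > 1.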

Fixing two errors in the example gives: ABCCBA | BB | CC | BCAACB

Slide9

Stream 3 - Reversals

Why is this funny problem interesting?

Sorting by reversals: In the evolutionary process a substring may "detach" and "reconnect" in reverse:

ABCA BCDAABC BAD → ABCA CBAADCB BAD

Slide10

Sorting by Reversals

What is the minimum number of reversals that, when applied to string A, result in string B?

History:

Introduced: Bafna & Pevzner [95]

NP-hard: Caprara [97]

Approximations: Christie [98]; Berman, Hannenhalli, Karpinski [02]; Hartman [03]

Slide11

Sorting by Reversals – Polynomial time Relaxations

Signed reversals: Hannenhalli & Pevzner [99]; Kaplan, Shamir, Tarjan [00]; Tannier & Sagot [04]; . . .

Disjointness: Swap Matching, Muthu [96]

Two constraints: The length of the reversed substring is limited to 2. All swaps are disjoint.

Slide12

Pattern Matching with Disjoint Reversals

Reversal Distance (RD): The RD between $s_1$ and $s_2$ is the minimum number k such that there exists $s_2'$ where $\mathrm{HAM}(s_2, s_2') = k$ and $s_1$ reversal-matches $s_2'$.

[Figure: example strings $S_1$ and $S_2$ with $RD(S_1, S_2) = 2$]

Slide13

Connection between Reversal Matching and Palindrome Matching

Interleave Strings:

[Figure: the characters of $S_1$ and $S_2$ are interleaved, alternating one character from each string]
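A minimal sketch of the connection: a segment of $S_1$ is the reverse of the same segment of $S_2$ exactly when the matching stretch of the interleaved string is an even-length palindrome. The two example strings are read off the even and odd positions of the slide's interleaved string, which is an assumption about the lost figure. All function names are hypothetical, and the check is a simple quadratic-split dynamic program, not an online algorithm.

```python
def interleave(s1: str, s2: str) -> str:
    """Alternate characters: s1[0], s2[0], s1[1], s2[1], ..."""
    if len(s1) != len(s2):
        raise ValueError("strings must have equal length")
    return "".join(a + b for a, b in zip(s1, s2))


def is_even_palindrome_concatenation(t: str) -> bool:
    """True if t splits into palindromes of even length."""
    n = len(t)
    ok = [False] * (n + 1)
    ok[0] = True
    for j in range(2, n + 1, 2):
        # Every piece has even length, so split points are even.
        ok[j] = any(ok[i] and t[i:j] == t[i:j][::-1] for i in range(0, j, 2))
    return ok[n]


def reversal_match(s1: str, s2: str) -> bool:
    """True if s2 can be obtained from s1 by reversing disjoint segments."""
    return len(s1) == len(s2) and is_even_palindrome_concatenation(interleave(s1, s2))


s1, s2 = "ADCBAEDBA", "CDAABABDE"  # assumed reading of the slide's example
print(interleave(s1, s2), reversal_match(s1, s2))
```

A segment of length 1 that is left in place shows up as a two-character palindrome such as `AA`.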

Interleaved string: A C D D C A B A A B E A D B B D A E

Slide14

On-line Input

Suppose that we get the input a byte at a time.

For the palindrome problem:

[Figure: the input string arrives one character at a time]

Slide15

On-line Input

Suppose that we get the input a byte at a time.

For the reversal problem:

[Figure: the two strings arrive in parallel, one pair of characters at a time]

Slide16

Main Idea – Palindrome Fingerprint

$S = s_0, s_1, s_2, \dots, s_{m-1}$

The Rabin-Karp Fingerprint:

$\Phi(S) = r^{1}s_0 + r^{2}s_1 + \dots + r^{m}s_{m-1} \bmod p$

The Reversal Fingerprint:

$\Phi^R(S) = r^{-1}s_0 + r^{-2}s_1 + \dots + r^{-m}s_{m-1} \bmod p$

If $r^{m+1}\Phi^R(S) = \Phi(S)$ then S is a palindrome w.h.p.

Slide17

Palindrome Fingerprint

If $r^{m+1}\Phi^R(S) = \Phi(S)$ then S is a palindrome.

Example: S = A B C B A

$r^{6}\Phi^R(S) = r^{6}\left(\frac{1}{r}A + \frac{1}{r^{2}}B + \frac{1}{r^{3}}C + \frac{1}{r^{4}}B + \frac{1}{r^{5}}A\right) = r^{5}A + r^{4}B + r^{3}C + r^{2}B + rA = \Phi(S)$
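The two fingerprints and the palindrome test can be sketched as follows. The modulus `P`, the base `R`, and the use of `ord` as the character encoding are assumptions made for illustration; in a randomized setting r would be drawn at random. The modular inverse uses the three-argument `pow` with a negative exponent, which needs Python 3.8 or later.

```python
P = 2_147_483_647  # assumed prime modulus (2**31 - 1)
R = 1_000_003      # assumed base, fixed here so the example is reproducible


def fingerprints(s: str, r: int = R, p: int = P) -> tuple[int, int]:
    """Return (Phi(S), Phi^R(S)) with exponents 1..m and -1..-m."""
    r_inv = pow(r, -1, p)  # modular inverse of r
    phi = phi_rev = 0
    power = inv_power = 1
    for ch in s:
        power = power * r % p              # r^(k+1) for the k-th character
        inv_power = inv_power * r_inv % p  # r^-(k+1)
        phi = (phi + power * ord(ch)) % p
        phi_rev = (phi_rev + inv_power * ord(ch)) % p
    return phi, phi_rev


def looks_like_palindrome(s: str, r: int = R, p: int = P) -> bool:
    """Test r^(m+1) * Phi^R(S) == Phi(S); a True answer can be wrong with small probability."""
    phi, phi_rev = fingerprints(s, r, p)
    return pow(r, len(s) + 1, p) * phi_rev % p == phi


print(looks_like_palindrome("ABCBA"), looks_like_palindrome("ABCBB"))
```

Multiplying $\Phi^R(S)$ by $r^{m+1}$ yields the ordinary fingerprint of the reversed string, which is why a single comparison suffices.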

$\Phi^R(S) = r^{-1}s_0 + r^{-2}s_1 + \dots + r^{-m}s_{m-1} \bmod p$

$\Phi(S) = r^{1}s_0 + r^{2}s_1 + \dots + r^{m}s_{m-1} \bmod p$

Slide18

Simple Online Algorithm for Finding a Palindrome in a Text

$t_1, t_2, t_3, \dots, t_i, t_{i+1}, t_{i+2}, \dots, t_{i+m}, t_{i+m+1}, \dots, t_n$

$\Phi^R = r^{-1}t_i + r^{-2}t_{i+1} + \dots + r^{-m}t_{i+m-1} \bmod p$

$\Phi = r^{1}t_i + r^{2}t_{i+1} + \dots + r^{m}t_{i+m-1} \bmod p$

If $r^{m+1}\Phi^R = \Phi$ then there is a palindrome starting in the i-th position.

If not, then for the next position:

$\Phi = \Phi + r^{m+1}t_{i+m} \bmod p$

$\Phi^R = \Phi^R + r^{-(m+1)}t_{i+m} \bmod p$
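A minimal sketch of the update rule for a palindrome starting at the first text position: each arriving character extends both fingerprints by one term, and the test is a single comparison. The class name, the constants, and the `ord` encoding are assumptions made for illustration (Python 3.8+ for the modular inverse).

```python
class OnlinePalindromeDetector:
    """Reads a text one character at a time and tests the prefix read so far."""

    def __init__(self, r: int = 1_000_003, p: int = 2_147_483_647):
        self.r, self.p = r, p
        self.r_inv = pow(r, -1, p)
        self.power = 1      # r^m, where m characters have been read
        self.inv_power = 1  # r^-m
        self.phi = 0        # Phi of the prefix
        self.phi_rev = 0    # Phi^R of the prefix

    def push(self, ch: str) -> bool:
        """Consume one character; True means the prefix is a palindrome w.h.p."""
        self.power = self.power * self.r % self.p
        self.inv_power = self.inv_power * self.r_inv % self.p
        code = ord(ch)
        self.phi = (self.phi + self.power * code) % self.p
        self.phi_rev = (self.phi_rev + self.inv_power * code) % self.p
        # r^(m+1) * Phi^R is the fingerprint of the reversed prefix.
        return self.power * self.r % self.p * self.phi_rev % self.p == self.phi


detector = OnlinePalindromeDetector()
print([detector.push(ch) for ch in "ABACABA"])
```

Each call does a constant number of modular multiplications and keeps only a constant number of values.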

Note: This algorithm finds online whether a prefix of the text is a palindrome. For finding online whether the text is a concatenation of palindromes, assume even-length palindromes; otherwise, every text is a concatenation of length-1 palindromes.

Slide19

Palindrome with mismatches

Start with the 1-mismatch case.

Slide20

1-Mismatch

$S = s_0, s_1, s_2, \dots, s_{m-1}$

Choose $\ell$ prime numbers $q_1, \dots, q_\ell < m$ such that $\prod_{i=1}^{\ell} q_i > m$.

Slide21

1-Mismatch

$S = s_0, s_1, s_2, \dots, s_{m-1}$

mod 2:

$S_{2,0} = s_0, s_2, s_4, \dots$

$S_{2,1} = s_1, s_3, s_5, \dots$

mod 3:

$S_{3,0} = s_0, s_3, s_6, \dots$

$S_{3,1} = s_1, s_4, s_7, \dots$

$S_{3,2} = s_2, s_5, s_8, \dots$

For each $q_i$ construct $q_i$ subsequences of S as follows: subsequence $S_{q_i,j}$ is all elements of S whose index is j mod $q_i$.
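The construction amounts to taking every q-th element starting at offset j. A minimal sketch, with a hypothetical helper name and a six-character string standing in for $s_0, \dots, s_5$:

```python
def partition(s: str, q: int) -> list[str]:
    """Return [S_{q,0}, ..., S_{q,q-1}]; S_{q,j} holds the elements whose index is j mod q."""
    return [s[j::q] for j in range(q)]


# "abcdef" plays the role of s0..s5.
print(partition("abcdef", 2))  # ['ace', 'bdf']
print(partition("abcdef", 3))  # ['ad', 'be', 'cf']
```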

Examples: $q_1 = 2$, $q_2 = 3$

Slide22

Example

$S = s_0, s_1, s_2, s_3, s_4, s_5$

mod 2:

$S_{2,0} = s_0, s_2, s_4$

$S_{2,1} = s_1, s_3, s_5$

mod 3:

$S_{3,0} = s_0, s_3$

$S_{3,1} = s_1, s_4$

$S_{3,2} = s_2, s_5$

Slide23

1-Mismatch

We need to compare:

$s_0, s_1, s_2, \dots, s_{m-2}, s_{m-1}$

$s_{m-1}, s_{m-2}, s_{m-3}, \dots, s_1, s_0$

We prove that in the partition strings:

$S_{q,j} = S^R_{q,(m-1-j) \bmod q}$

Slide24

Example

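$S = s_0, s_1, s_2, s_3, s_4, s_5$

$S^R = s_5, s_4, s_3, s_2, s_1, s_0$

Partitions of S:

$S_{2,0} = s_0, s_2, s_4$

$S_{2,1} = s_1, s_3, s_5$

$S_{3,0} = s_0, s_3$

$S_{3,1} = s_1, s_4$

$S_{3,2} = s_2, s_5$

Comparisons (here $S^R_{q,k}$ is $S_{q,k}$ read backwards):

$S_{2,0} = S^R_{2,1}$: $s_0, s_2, s_4$ against $s_5, s_3, s_1$

$S_{3,0} = S^R_{3,2}$: $s_0, s_3$ against $s_5, s_2$

$S_{3,1} = S^R_{3,1}$: $s_1, s_4$ against $s_4, s_1$

Slide25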

Exact Matching

Lemma: $S = S^R \iff S_{q,j} = S^R_{q,(m-1-j) \bmod q}$ for all q and all $0 \le j < q$.

Slide26

1-Mismatch

Lemma: S is a palindrome with 1-mismatch $\iff$ for each q, there is exactly one j such that:

$\Phi(S_{q,j}) \ne r^{|S_{q,j}|+1}\,\Phi^R(S_{q,(m-1-j) \bmod q})$

Slide27

1-Mismatch

Lemma: There is exactly one mismatch $\iff$ there is exactly one subpattern in each group that does not match.
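A minimal sketch of the per-group test, using direct string comparison instead of fingerprints so that the logic is easy to follow. The tests for j and for its partner $(m-1-j) \bmod q$ compare the same character pairs, so the sketch counts each unordered reversal pair once; that bookkeeping choice and the function name are assumptions made for illustration. The sketch only shows the forward direction on small examples.

```python
def failing_reversal_pairs(s: str, q: int) -> set[frozenset[int]]:
    """Reversal pairs {j, (m-1-j) mod q} whose subsequences do not match."""
    m = len(s)
    failing = set()
    for j in range(q):
        k = (m - 1 - j) % q
        # S_{q,j} must equal S_{q,k} read backwards.
        if s[j::q] != s[k::q][::-1]:
            failing.add(frozenset((j, k)))
    return failing


for text in ("ABCCBA", "ABCCXA"):
    print(text, [len(failing_reversal_pairs(text, q)) for q in (2, 3)])
```

For the palindrome no pair fails; after one character is changed, exactly one pair fails for each of the two primes.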

C.R.T. (Chinese Remainder Theorem)

Slide28

Chinese Remainder Theorem

Let n and m be two positive integers.
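The argument that follows relies on a standard consequence of the Chinese Remainder Theorem. Stated in the notation of the 1-Mismatch slides (this wording is an assumption, not the slide's own statement):

```latex
\text{Let } q_1,\dots,q_\ell \text{ be distinct primes with } \prod_{t=1}^{\ell} q_t > m,
\text{ and let } 0 \le i, j < m. \\
\text{If } i \equiv j \pmod{q_t} \text{ for every } t,
\text{ then } \prod_{t=1}^{\ell} q_t \,\Big|\, (i - j). \\
\text{Since } |i - j| < m < \prod_{t=1}^{\ell} q_t, \text{ this forces } i = j.
```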

In our case: if two indices, i and j, have an error, and only one subsequence is erroneous for each q, then i and j fall in the same subsequence for every q. Since the product of all q's > m, it means that i = j.

Slide29

Complexity

There exists a constant c such that, for any x < m, there are at least x/log m prime numbers between x and cx.
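The density bound is what makes a handful of small primes sufficient. A minimal sketch, assuming trial division, base-2 logarithms, and a hypothetical helper `choose_primes` that collects consecutive primes starting near log m until their product exceeds m:

```python
import math


def is_prime(x: int) -> bool:
    """Trial division; adequate for the small values used here."""
    return x >= 2 and all(x % d for d in range(2, math.isqrt(x) + 1))


def choose_primes(m: int) -> list[int]:
    """Consecutive primes from about log2(m) upward whose product exceeds m (assumes m >= 2)."""
    q = max(2, math.ceil(math.log2(m)))
    primes, product = [], 1
    while product <= m:
        if is_prime(q):
            primes.append(q)
            product *= q
        q += 1
    return primes


print(choose_primes(1000))  # [11, 13, 17]
```

The constant c is not computed here; the loop simply stops as soon as the product condition holds.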

Therefore, choose prime numbers between log m and c log m.

Slide30

Complexity

For each $q_i$ we compute $2q_i$ different fingerprints.

Overall space:

Each character participates in exactly two fingerprints (the regular and the reverse).

Overall time:

Slide31

Online

All fingerprint calculations can be done online.

We know m at every input character, so we can compute the comparisons.

Conclusion: Our algorithm is online.

Slide32

k-Mismatches

Use Group testing…

Slide33

k-Mismatches

Group Testing: Given n items with some positive ones, identify all positive ones by a small number of tests.

Each test is on a subset of items.

Test outcome is positive iff there is a positive item in the subset.

Slide34

k-Mismatch

Group: partition of the text.

Test: distinguish (using the 1-mismatch algorithm) between:

match

1-mismatch

more than 1-mismatch

Slide35

k-Mismatches

$S = s_0, s_1, s_2, \dots, s_{m-1}$

mod 2:

$S_{2,0} = s_0, s_2, s_4, \dots$

$S_{2,1} = s_1, s_3, s_5, \dots$

mod 3:

$S_{3,0} = s_0, s_3, s_6, \dots$

$S_{3,1} = s_1, s_4, s_7, \dots$

$S_{3,2} = s_2, s_5, s_8, \dots$

Similar to the 1-mismatch algorithm, just with more prime numbers…

Each $S_{q,j}$ is a group in our group testing.

Slide36

Our tests

We define the reversal pair of $S_{q,j}$ to be $S^R_{q,(m-1-j) \bmod q}$.

Each partition is "tested against" its reversal pair.

Slide37

Correctness

$s_0, s_1, s_2, \dots, s_j, \dots, s_{m-1}$

For any group of k characters $i_1, i_2, \dots, i_k$, there exists a partition where $s_j$ appears alone.
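The isolation claim can be illustrated by a direct search: given an index j, a set of other indices, and a list of primes, look for a prime under which none of the other indices shares j's residue class. The function name and the sample numbers are hypothetical, and the sketch ignores the pairing of each index with its mirror position.

```python
from typing import Iterable, Optional


def isolating_prime(j: int, others: Iterable[int], primes: Iterable[int]) -> Optional[int]:
    """Return a prime q such that no index in `others` is congruent to j mod q, else None."""
    others = [i for i in others if i != j]
    for q in primes:
        if all(i % q != j % q for i in others):
            return q
    return None


print(isolating_prime(5, [2, 7, 9], [2, 3, 5, 7]))  # 5
```

With too few primes the search can fail, which is why the k-mismatch variant needs more primes than the 1-mismatch case.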

[Figure: positions $i_2, i_5, i_7, i_9, \dots$ marked along S]

C.R.T.

Slide38

Correctness

$s_0, s_1, s_2, \dots, s_j, \dots, s_{m-1}$

If $s_j$ is involved in a mismatch, we will catch it.

[Figure: positions $i_2, i_5, i_7, i_9, \dots$ marked along S]

Slide39

Complexity

Overall space:

Overall time:

Slide40

Approximate Reversal Distance

Using the palindrome up to k-mismatches algorithm, can be solved in

time, and

space.

Slide41

Thank you (спасибо)