Readability - PowerPoint Presentation


Combinatorial Pattern Matching (CPM), June 29, 2015. Rayan Chikhi (CNRS Lille), Sofya Raskhodnikova (Penn State), Paul Medvedev (Penn State), Martin Milanič (University of Primorska).



Presentation Transcript

Slide1

Readability

Combinatorial Pattern Matching (CPM), June 29, 2015

Rayan Chikhi, CNRS Lille
Sofya Raskhodnikova, Penn State
Paul Medvedev, Penn State
Martin Milanič, University of Primorska

Slide2

Overlap Digraph (definition)

A string $x$ overlaps a string $y$ if there is a suffix of $x$ that is equal to a prefix of $y$. They overlap properly if, in addition, the suffix and prefix are both proper.

The overlap digraph of a set of strings is a digraph where each string is a vertex and there is an edge $(x, y)$ if and only if $x$ properly overlaps $y$.

Variants of overlap graphs are used in bioinformatics applications.

[Figure: example strings ACGTA, GTAAC, CCCCT, GGACT]

Slide3
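Recalling the overlap digraph definition, a minimal Python sketch can build the digraph for the four example strings. The function names `properly_overlaps` and `overlap_digraph` are illustrative and do not come from the slides. The sketch assumes that overlaps must be nonempty and that self-loops are allowed, since the definition as stated does not exclude a string properly overlapping itself.

```python
from itertools import product


def properly_overlaps(x: str, y: str) -> bool:
    """True if a nonempty proper suffix of x equals a proper prefix of y."""
    longest = min(len(x), len(y)) - 1  # proper: strictly shorter than both strings
    return any(x[-k:] == y[:k] for k in range(1, longest + 1))


def overlap_digraph(strings):
    """Return the set of edges (x, y) such that x properly overlaps y."""
    # Assumption: ordered pairs with x == y are considered, so self-loops can appear.
    return {(x, y) for x, y in product(strings, repeat=2) if properly_overlaps(x, y)}


edges = overlap_digraph(["ACGTA", "GTAAC", "CCCCT", "GGACT"])
for x, y in sorted(edges):
    print(x, "->", y)
```

Under these assumptions, GGACT ends up isolated and ACGTA carries a self-loop, because its suffix "A" equals its prefix "A".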

Questions

Do overlap digraphs have any properties or structure that can be exploited?

Given a graph, Braga and Meidanis (2002) showed how to label the vertices so that the graph is an overlap graph.

How does the set of graphs generated depend on the string length?

BM labeling used strings of length

Limiting the string length limits the graphs that can be generated.

[Figure: graph whose vertex labels are shown as "?"]

Slide4

Readability in the digraph model

A labeling is an assignment of strings to vertices.

Let $D$ be a directed graph.

An overlap labeling is a labeling such that $(x, y)$ is an edge if and only if the string of $x$ properly overlaps the string of $y$.

The readability of a digraph $D$, denoted $r(D)$, is the smallest nonnegative integer $r$ such that there exists an injective overlap labeling of $D$ with strings of length $r$.

[Figure: example labels ACGTA, GTAAC, CCCCT, GGACT]
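The definition of an injective overlap labeling can be checked mechanically. The sketch below is one possible checker, not the authors' code: the name `is_injective_overlap_labeling` and its parameters are hypothetical, and it assumes overlaps must be nonempty. It verifies that all labels have length `r`, that no two vertices share a label, and that edges coincide exactly with proper overlaps.

```python
def properly_overlaps(x: str, y: str) -> bool:
    """True if a nonempty proper suffix of x equals a proper prefix of y."""
    return any(x[-k:] == y[:k] for k in range(1, min(len(x), len(y))))


def is_injective_overlap_labeling(vertices, edges, labeling, r):
    """Check that `labeling` is an injective overlap labeling with strings of length r."""
    labels = [labeling[v] for v in vertices]
    if any(len(s) != r for s in labels):
        return False  # every string must have length r
    if len(set(labels)) != len(labels):
        return False  # injective: distinct vertices get distinct strings
    edge_set = set(edges)
    # (u, v) must be an edge exactly when label(u) properly overlaps label(v).
    return all(
        ((u, v) in edge_set) == properly_overlaps(labeling[u], labeling[v])
        for u in vertices
        for v in vertices
    )


vertices = [1, 2, 3, 4]
labeling = {1: "ACGTA", 2: "GTAAC", 3: "CCCCT", 4: "GGACT"}
edges = {(1, 1), (1, 2), (2, 1), (2, 3)}
print(is_injective_overlap_labeling(vertices, edges, labeling, r=5))
```

A successful check with strings of length `r` shows only that the readability is at most `r`; establishing the exact value also requires ruling out every shorter length.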

 Slide5

Readability in the bipartite graph model

Let $G$ be a bipartite graph.

An overlap labeling is a labeling such that $(x, y)$ is an edge if and only if the string of $x$ properly overlaps the string of $y$.

The readability of a bipartite graph $G$, denoted $r(G)$, is the smallest nonnegative integer $r$ such that there exists an injective overlap labeling of $G$ with strings of length $r$.

Thm: There exists a bijection such that for all

= set of bipartite graphs with nodes in each part

= set of all digraphs with nodes.

[Figure: example labels ACA, CAC, AGA, CAT]

Slide6

Examples

Complete bipartite graph on vertices ()

Even cycle on vertices ()

[Figure: labels 41, 12, 12, 23, 23, 34, 34, 41]

Slide7

Is there a simple and useful string-free formulation of readability?

Slide8

 

 

 

 

 

 

 

$P_4$-rule and $P_4$ Lemma

A decomposition of size $k$ is a weight function $w : E \to \{1, \dots, k\}$.

Given an overlap labeling $\ell$, the $\ell$-decomposition is a decomposition assigning each edge $(x, y)$ the length of the minimum overlap between $\ell(x)$ and $\ell(y)$.

$P_4$ Lemma: If $\ell$ is an overlap labeling, then the $\ell$-decomposition satisfies the following (called the $P_4$-rule):

For every induced $P_4$ with edges $e_1, e_2, e_3$ in path order, if the middle edge $e_2$ has the maximum weight, then $w(e_2) \ge w(e_1) + w(e_3)$.
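The decomposition induced by a labeling and the $P_4$-rule can be illustrated with a short Python sketch for the bipartite model. Several things here are assumptions rather than content of the slides: the function names, the representation of edges as (left, right) pairs, the use of proper overlaps only, and the form of the rule itself, taken as: if the middle edge weight $w_2$ is the maximum on an induced four-vertex path, then $w_2 \ge w_1 + w_3$.

```python
def min_overlap(x: str, y: str) -> int:
    """Length of the shortest proper overlap of x onto y, or 0 if there is none."""
    for k in range(1, min(len(x), len(y))):
        if x[-k:] == y[:k]:
            return k
    return 0


def ell_decomposition(edges, labeling):
    """Assign each edge (x, y) the minimum overlap length between its two labels."""
    return {(x, y): min_overlap(labeling[x], labeling[y]) for x, y in edges}


def satisfies_p4_rule(edges, w) -> bool:
    """Check the assumed P4-rule on every induced path x1 - y1 - x2 - y2."""
    edge_set = set(edges)
    lefts = {x for x, _ in edge_set}
    rights = {y for _, y in edge_set}
    for x1 in lefts:
        for x2 in lefts - {x1}:
            for y1 in rights:
                for y2 in rights - {y1}:
                    path = [(x1, y1), (x2, y1), (x2, y2)]
                    # Induced: the three path edges exist and the end vertices are not adjacent.
                    if all(e in edge_set for e in path) and (x1, y2) not in edge_set:
                        w1, w2, w3 = (w[e] for e in path)
                        if w2 >= max(w1, w3) and w2 < w1 + w3:
                            return False
    return True


# Hypothetical labeling of a four-vertex path; the strings are made up for illustration.
labeling = {"x1": "pqa", "x2": "qab", "y1": "abz", "y2": "bcd"}
edges = [("x1", "y1"), ("x2", "y1"), ("x2", "y2")]
w = ell_decomposition(edges, labeling)
print(w, satisfies_p4_rule(edges, w))
```

In this example the middle edge has weight 2 and the outer edges have weight 1 each, so the assumed inequality holds with equality; giving all three edges weight 1 would violate it.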

 

 

 

 

 

 

 

 Slide9

Trees

Given a decomposition $w$, we say that a labeling $\ell$ achieves $w$ if it is an overlap labeling and $w$ is the $\ell$-decomposition.

Let $T$ be a tree.

Theorem:

$P_4$ Lemma implies

Claim: if $w$ satisfies the $P_4$-rule, then there exists a labeling achieving $w$.

Order edges by non-decreasing weight, and define

Inductively construct labeling for . Let

Note that , because of the $P_4$-rule and is -free

Relabel and with where A has length and is composed of new, non-repeating characters

[Figure: relabeling step, with the new block A shown in two labels]

 

 

 Slide10

Proof of claim (key idea)

[Figure: two cases; the diagram for the first case shows the block A in two labels]

 

 

 

 

 

 

 

 

 

 

 Slide11

For cycles, the theorem is not true.

[Figure: cycle with numeric labels 2, 4, 2, 3, 1, 2, 3]

Slide12

-free bipartite graphs

The strict $P_4$-rule is: for every induced $P_4$, if the middle edge has the maximum weight, then

Theorem: For a -free bipartite graph

For graphs with , the theorem is not true.

[Figure: graph with numeric labels 4, 2, 3, 3, 1, 1, 1]

Slide13

General bipartite graphs

Let be the subgraph of including only edges with weight .

Define as the size of the smallest decomposition satisfying the HUB-rule: for all

bicliques: is a disjoint union of bicliques

hierarchical: If and have the same neighborhoods in , then they have the same neighborhoods in for .

Thm:
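The biclique condition of the HUB-rule asks whether a bipartite graph is a disjoint union of bicliques. A minimal Python sketch of that single test follows; the function name and the (left, right) edge representation are assumptions, and the sketch does not attempt the hierarchical condition. It relies on a standard fact: a bipartite graph is a disjoint union of bicliques exactly when any two left vertices have neighborhoods that are either identical or disjoint.

```python
def is_disjoint_union_of_bicliques(edges) -> bool:
    """True if the bipartite graph given by (left, right) edges is a disjoint union of bicliques."""
    neighbors = {}
    for left, right in edges:
        neighbors.setdefault(left, set()).add(right)
    lefts = list(neighbors)
    for i, a in enumerate(lefts):
        for b in lefts[i + 1:]:
            shared = neighbors[a] & neighbors[b]
            # Overlapping but unequal neighborhoods produce an induced four-vertex path.
            if shared and neighbors[a] != neighbors[b]:
                return False
    return True


# K_{2,2} plus a separate single edge: a disjoint union of two bicliques.
good = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "z")]
# A four-vertex path a - x - b - y: not a disjoint union of bicliques.
bad = [("a", "x"), ("b", "x"), ("b", "y")]
print(is_disjoint_union_of_bicliques(good), is_disjoint_union_of_bicliques(bad))
```

Applied to the rule, this predicate would be run once per weight-defined subgraph; isolated vertices are ignored because they do not affect the test.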

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 Slide14

How large can readability be?

Theorem: Almost all graphs have readability via counting argument

 Slide15

Distinctness

Distinctness of two vertices in the same part of the bipartition is the number of vertices in one neighborhood and not the other (taking the max of the two values).

Distinctness of the graph is the minimum distinctness over all pairs.

Thm:

Consider the decomposition of an optimal labeling.

Case 1: every is a matching. Adding a matching can increase the distinctness by at most one.

Case 2: Let be the last one that is not a matching. Using the fact that the decomposition satisfies the HUB-rule
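The definition of distinctness translates directly into code. The following Python sketch is illustrative only: the function names and the graph representation (two vertex lists plus (left, right) edges) are assumptions. It computes the distinctness of a pair as the larger of the two one-sided neighborhood differences, and the distinctness of the graph as the minimum over all pairs within the same part.

```python
from itertools import combinations


def pair_distinctness(nu: set, nv: set) -> int:
    """Max of |N(u) minus N(v)| and |N(v) minus N(u)| for two neighborhoods."""
    return max(len(nu - nv), len(nv - nu))


def graph_distinctness(left, right, edges):
    """Minimum pair distinctness over all pairs of vertices in the same part."""
    nbrs = {v: set() for v in list(left) + list(right)}
    for l, r in edges:
        nbrs[l].add(r)
        nbrs[r].add(l)
    values = [
        pair_distinctness(nbrs[u], nbrs[v])
        for part in (left, right)
        for u, v in combinations(part, 2)
    ]
    # Returns None when neither part contains two vertices (no pairs to compare).
    return min(values, default=None)


left, right = ["a", "b"], ["x", "y", "z"]
edges = [("a", "x"), ("a", "y"), ("b", "y"), ("b", "z")]
print(graph_distinctness(left, right, edges))
```

Two vertices with identical neighborhoods have distinctness 0, so any graph containing such twins has distinctness 0.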

 

 

 

 

 

 

 

 Slide16

Hadamard Graphs

bipartite graph

vertices assigned -long binary codewords

edge if the inner product of the codewords is odd

[Figure: codewords 00, 01, 10, 11 on each side]

Theorem:
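The Hadamard graph construction can be sketched in a few lines of Python. The function name `hadamard_graph` and the parameter `k` for the codeword length are assumptions, as is the choice to include every binary codeword (including the all-zero one) on both sides, which matches the four codewords per side in the figure.

```python
from itertools import product


def hadamard_graph(k: int):
    """Return (codewords, edges) for binary codewords of length k on each side."""
    words = ["".join(bits) for bits in product("01", repeat=k)]
    edges = {
        (a, b)
        for a in words
        for b in words
        # Edge when the inner product of the two codewords is odd.
        if sum(int(p) * int(q) for p, q in zip(a, b)) % 2 == 1
    }
    return words, edges


words, edges = hadamard_graph(2)
print(sorted(edges))
```

For `k = 2` the all-zero codeword is isolated on both sides, and the left vertex 11 is not adjacent to the right vertex 11 because their inner product is 2.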

 Slide17

Trees

Thm: For all trees ,

For full k-ary tree of height k,

Assume, for the sake of contradiction, that there exists an optimal decomposition of size

A path from root to leaf with distinct edge weights, with values, with edges

 

 

 

 

 

 

 Slide18

Conclusions

Results

A string-free formulation of readability that is
exactly equivalent for trees
asymptotically equivalent for -free bipartite graphs
“weakly” equivalent for general graphs

Existence of a graph family with readability of

Open problems

Find other rules that an -decomposition must satisfy to close the gap:

Let

We know

Do there exist graphs with ?

Complexity

Understand graphs that have poly-logarithmic readability

 Slide19

The end

Slide20

General graphs

Define for as the subgraph of including only edges with weight at most .

Lem: An -decomposition satisfies the following (HUB-rule), for all

is a disjoint union of bicliques

If and have the same neighborhoods in , then they have the same neighborhoods in for .

Define as the size of the smallest decomposition satisfying the HUB-rule.

Thm:

 Slide21

Questions/Results

Do there exist graphs with readability

Slide22

Almost all graphs have readability

Counting argument

There are bipartite graphs with vertices.

There are at most labellings of length