Branching Programs Part 3

sherrill-nordquist
Uploaded On 2018-02-27


Presentation Transcript

Slide1

Branching Programs Part 3

Paul Beame

University of Washington

Slide2

Outline

Branching program basics

Space (size) lower bounds

Multi-output functions

Time-Space tradeoff lower bounds for general BPs

Single-output functions

Restricted classes of BPs

OBDDs, Read-once (FBDDs), Oblivious, Read-k

Lower bound methods for restricted classes

Lower bound methods for general BPs

Applying tradeoffs: BPs and static data structures

Multi-output functions using single-output techniques

Lower bound for encoding good codes

Slide3

Limited Branching Program Forms

Structure-based

Oblivious: for each BP level, all the nodes on that level have the same variable name (e.g. Parity BP)

Time-based

Read-Once: on every path through the BP each variable is queried at most once (e.g. Parity BP)

Read-$k$: on every path through the BP each variable is queried at most $k$ times

Time-Bounded: every path in the BP has length at most the time bound $T$
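The definitions above can be checked mechanically on a small example. The following is a minimal sketch, assuming a BP is stored as a dictionary `node -> (variable, child_on_0, child_on_1)` with sinks named `"0"` and `"1"`; this encoding and the helper names are illustrative assumptions, not notation from the slides.

```python
from itertools import product

def parity_bp(n):
    """Leveled BP for parity: node (i, p) reads x_i with running parity p."""
    def succ(i, q):
        return (i + 1, q) if i + 1 < n else str(q)
    bp = {(i, p): (i, succ(i, p), succ(i, p ^ 1))
          for i in range(n) for p in (0, 1)}
    return bp, (0, 0)

def evaluate(bp, root, x):
    node = root
    while node in bp:
        var, lo, hi = bp[node]
        node = hi if x[var] else lo
    return int(node)

def max_reads_on_any_path(bp, root):
    """Largest number of queries to one variable over all root-to-sink paths,
    including paths that no input follows. The BP is read-k iff this is <= k."""
    best = 0
    def walk(node, counts):
        nonlocal best
        if node not in bp:
            best = max(best, max(counts.values(), default=0))
            return
        var, lo, hi = bp[node]
        counts[var] = counts.get(var, 0) + 1
        walk(lo, counts)
        walk(hi, counts)
        counts[var] -= 1
    walk(root, {})
    return best

def is_oblivious(bp, root):
    """True if all nodes at the same depth query the same variable
    (assumes the BP is leveled)."""
    level = {root}
    while level:
        internal = [v for v in level if v in bp]
        if len({bp[v][0] for v in internal}) > 1:
            return False
        level = {c for v in internal for c in bp[v][1:]}
    return True
```

On the parity BP both checks succeed: it is oblivious, and no path queries a variable more than once. The path enumeration is exponential in general, so this is only meant for toy instances.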

Slide4

Oblivious vs Read-k and Best Bounds

Recall argument for oblivious BPs:

If the length is at most $kn$ then at least $n/2$ variables are read at most $2k$ times.

So oblivious length $kn$ ≈ Read-$2k$.

Exponential Read-$k$ size lower bounds for simple explicit Boolean functions for $k = O(\log n)$. Inspired by 2-party communication complexity [Borodin-Razborov-Smolensky 1989] [Okol'nishnikova 1989]

Exponential size lower bound for an explicit function over a large domain for $k = O(\log^2 n)$. Inspired by multiparty communication complexity [B-Vee 2002]

Drawback: the function is not known to be in NP.

No larger $k$ is possible until the oblivious case is improved.

Slide5

Read-k BPs

On every path through the BP each variable is queried at most $k$ times.

Unlike read-once, there may be paths that are not consistent with any input. We assume that those paths are restricted, too.

Lower bound methods for read-$k$ BPs often also apply to nondeterministic read-$k$ BPs; the "every path" constraint is essential there.

Defn: Nondeterministic BPs (NBPs) generalize BPs by allowing many out-edges from a vertex with the same label. An NBP outputs 1 on input $x$ iff there is some path that $x$ can take that leads to the 1-sink node.

Slide6

Read-k BPs

We can also separate the levels of the Read-$k$ BP hierarchy:

e.g., a small Read-2 BP can tell whether an $n \times n$ binary matrix is a permutation matrix, unlike small Read-Once BPs.

There are explicit Boolean functions with small Read-$(k+1)$ BPs that require exponential size Read-$k$ BPs for $k \le [\ldots]/2$, even allowing nondeterminism or randomization [Jayram S. Thathachar 1998]

Techniques are easier but are similar enough to those for general time-bounded BPs in the case of non-Boolean inputs that we do them together.

Slide7

Limited Branching Program Forms

Structure-based

Oblivious: for each BP level, all the nodes on that level have the same variable name (e.g. Parity BP)

Time-based

Read-Once: on every path through the BP each variable is queried at most once (e.g. Parity BP)

Read-$k$: on every path through the BP each variable is queried at most $k$ times

Time-Bounded: every path in the BP has length at most the time bound $T$

Slide8

Breaking up a BP via its Traces

Split BP $P$ with input set $D^n$ into layers.

Let $\mathrm{traces}(P) = \{\mathrm{trace}(x) \mid x \in D^n\}$.

For $\tau \in \mathrm{traces}(P)$ let $f_\tau$ be the function that has value 1 on input $x$ iff $\mathrm{trace}(x) = \tau$ and $f(x) = 1$.

Since $P$ computes $f$, $f = \bigvee_{\tau \in \mathrm{traces}(P)} f_\tau$.

The $f_\tau$ are disjoint and $|\mathrm{traces}(P)| \le 2^{S}$.

Can extend this to nondeterministic BPs:

Each $x$ may have multiple traces.

The $f_\tau$ are no longer disjoint.
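A small sketch of the decomposition by traces, for a deterministic BP over Boolean inputs. The transcript does not preserve the definition of trace(x); the code assumes it is the tuple of nodes that the computation on $x$ occupies at the chosen layer boundaries. The BP encoding (`node -> (variable, child_on_0, child_on_1)`, sinks `"0"` and `"1"`) and all function names are likewise illustrative assumptions.

```python
from itertools import product
from collections import defaultdict

def parity_bp(n):
    """Leveled BP for parity, used only as a toy example."""
    def succ(i, q):
        return (i + 1, q) if i + 1 < n else str(q)
    bp = {(i, p): (i, succ(i, p), succ(i, p ^ 1))
          for i in range(n) for p in (0, 1)}
    return bp, (0, 0)

def path(bp, root, x):
    """Nodes visited on input x, from the root to the sink (inclusive)."""
    nodes, node = [root], root
    while node in bp:
        var, lo, hi = bp[node]
        node = hi if x[var] else lo
        nodes.append(node)
    return nodes

def trace(bp, root, x, boundaries):
    """Assumed definition: nodes occupied at the given depths (layer boundaries)."""
    p = path(bp, root, x)
    return tuple(p[d] for d in boundaries if d < len(p))

def split_by_trace(bp, root, n, boundaries):
    """Map each trace tau to the set {x : trace(x) = tau and f(x) = 1}."""
    pieces = defaultdict(set)
    for x in product((0, 1), repeat=n):
        if path(bp, root, x)[-1] == "1":
            pieces[trace(bp, root, x, boundaries)].add(x)
    return pieces
```

For parity on 4 bits with a single boundary at depth 2, the accepted inputs split into two disjoint pieces, one per boundary node, and together they cover all of $f^{-1}(1)$.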

Slide9

Read-k BPs and Traces

Split BP $P$ with input set $D^n$ into layers: $f = \bigvee_{\tau \in \mathrm{traces}(P)} f_\tau$ and $|\mathrm{traces}(P)| \le 2^{S}$.

If $P$ is a read-$k$ BP then w.l.o.g. for every pair of nodes $u$, $v$ in $P$ the same set of variables is read on every path from $u$ to $v$.

We only need to avoid variables read $k$ times on some pair of paths above $u$ and below $v$.

So, each trace $\tau$ yields a fixed sequence of $L$ sets of variables read, each of size $\le kn/L$.

Can assign the layers for each $f_\tau$ as we did for oblivious BPs.

Get $2^{S}$ assignments, one for each $\tau$.

Slide10

Recall: Strategy for Assigning Layers

Assign each of $L$ layers to either Alice or Bob.

Goal: maximize # of bits per player $m$, while minimizing $L$.

Flip an independent coin for each layer: $m = n/2^{k+1}$, $L = 8k^2 2^k$ equal length layers [Borodin-Razborov-Smolensky 1989, B-Jayram-Saks 2001]

Slide11

Recall: Strategy for Assigning Layers

Assign each of $L$ layers to either Alice or Bob.

Goal: maximize # of bits per player $m$, while minimizing $L$.

Flip an independent coin for each layer: $m = n/2^{k+1}$, $L = 8k^2 2^k$ equal length layers [Borodin-Razborov-Smolensky 1989, B-Jayram-Saks 2001]

Use $4k^2$ equal layers. Give a random subset of $2k$ of them to Bob: $m = n/(2ek)^{2k}$, $L = 4k^2$ [Okol'nishnikova 1989, Ajtai 1999]
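The coin-flip strategy can be simulated on a concrete read sequence. This is a rough sketch under stated assumptions: the sequence of variable indices is cut into equal-length layers, each layer gets an owner, and a variable counts as private to a player when every layer that reads it belongs to that player. The function names are hypothetical.

```python
import random

def private_sets(sequence, num_layers, owner):
    """owner[j] in {"A", "B"} is the player assigned layer j.
    Returns (variables read only in Alice's layers,
             variables read only in Bob's layers)."""
    layer_len = -(-len(sequence) // num_layers)  # ceiling division
    owners_of = {}
    for pos, var in enumerate(sequence):
        owners_of.setdefault(var, set()).add(owner[pos // layer_len])
    alice = {v for v, o in owners_of.items() if o == {"A"}}
    bob = {v for v, o in owners_of.items() if o == {"B"}}
    return alice, bob

def coin_flip_assignment(num_layers, rng):
    """Independent fair coin per layer, as in the first strategy above."""
    return [rng.choice("AB") for _ in range(num_layers)]

if __name__ == "__main__":
    rng = random.Random(0)
    n, k = 64, 2
    sequence = list(range(n)) * k          # every variable is read k times
    rng.shuffle(sequence)
    num_layers = 8 * k * k * 2 ** k        # L = 8 k^2 2^k
    owner = coin_flip_assignment(num_layers, rng)
    alice, bob = private_sets(sequence, num_layers, owner)
    print(len(alice), len(bob))
```

Variables read in layers of both players are the ones "seen by both parties"; they are neither in Alice's nor in Bob's private set.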

Slide12

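Communication Complexity

[Figure: Alice holds one part of the input and Bob holds the other; they exchange messages such as 010, 11, 0.]

Communication complexity = # of bits Alice and Bob need to exchange to compute the function on their joint input.

Slide13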

Communication Complexity

Defn: A (combinatorial) rectangle in $X \times Y$ is a subset $U \times V$ where $U \subseteq X$ and $V \subseteq Y$.

Lemma: Any deterministic $c$-bit protocol for $f : X \times Y \to \{0,1\}$ yields a partition of $X \times Y$ into at most $2^c$ rectangles on which $f$ is constant.

Lemma: Nondeterministic $c$-bit protocols correspond to coverings of $f^{-1}(1)$ by $2^c$ rectangles.

To prove that $f$ requires large (non)deterministic communication complexity it suffices to prove:

$f^{-1}(1)$ is large

Any rectangle in $f^{-1}(1)$ is small
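The "large set, small rectangles" argument can be made concrete by brute force on a tiny function. The sketch below is illustrative only (the function names are assumptions, and it assumes $f$ takes the value 1 somewhere): it finds the largest rectangle inside $f^{-1}(1)$ and divides, which gives a lower bound on the number of rectangles needed to cover $f^{-1}(1)$ and hence on the number of bits.

```python
from itertools import combinations
from math import ceil, log2

def largest_one_rectangle(f, X, Y):
    """Max |U| * |V| over rectangles U x V contained in f^{-1}(1).
    Brute force over U; V is then the set of all y compatible with U."""
    best = 0
    for r in range(1, len(X) + 1):
        for U in combinations(X, r):
            V = [y for y in Y if all(f(x, y) for x in U)]
            best = max(best, len(U) * len(V))
    return best

def cover_lower_bound(f, X, Y):
    """Returns (min number of 1-rectangles, min number of bits) implied by
    |f^{-1}(1)| / (largest 1-rectangle). Assumes f is not identically 0."""
    ones = sum(1 for x in X for y in Y if f(x, y))
    rectangles = ceil(ones / largest_one_rectangle(f, X, Y))
    return rectangles, ceil(log2(rectangles))
```

For equality on a 4-element set, every rectangle inside $f^{-1}(1)$ is a single point, so 4 rectangles (2 bits) are needed.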

[Figure: a rectangle $U \times V$ inside $X \times Y$]

Slide14

Best Partition and Fixed Variables

For BP lower bounds, we don't know a priori how the input $\{0,1\}^n$ or $[m]^n$ to $f$ is partitioned into $X \times Y$.

Need to analyze rectangles for all possible partitions of $[n]$: "best partition" communication complexity.

Also, in the case of oblivious BPs, most input variables (ones seen by both parties) were fixed ahead of time.

Implicitly, rectangle size was only important relative to the space of unfixed variables.

For Read-$k$ and general time-bounded BPs this aspect is even more important so we need to make it explicit.

Slide15

Embedded Rectangles

Defn: For disjoint sets $A, B \subseteq [n]$ and $\alpha \in D^{[n]-A-B}$, the embedded rectangle $R \subseteq D^n$ with footprint $(A, B)$, tail $\alpha$, and body $R_A \times R_B$ for $R_A \subseteq D^A$, $R_B \subseteq D^B$ is the set $R = \{x \in D^n \mid x_A \in R_A,\ x_B \in R_B,\ x_{[n]-A-B} = \alpha\}$.
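To make the definition concrete, the sketch below enumerates the points of an embedded rectangle from its footprint, tail and body. The data layout (0-based indices, index lists for $A$ and $B$, a dictionary for the tail, tuples for body elements) is an assumption chosen for illustration.

```python
from itertools import product

def embedded_rectangle(n, A, B, tail, R_A, R_B):
    """All x in D^n whose restriction to A lies in R_A, whose restriction
    to B lies in R_B, and which agree with `tail` everywhere else.

    A, B  : disjoint lists of coordinates (the footprint)
    tail  : dict mapping every coordinate outside A and B to its fixed value
    R_A   : set of tuples giving values on A, in the order of A
    R_B   : set of tuples giving values on B, in the order of B
    """
    assert not set(A) & set(B), "footprint sets must be disjoint"
    assert set(tail) == set(range(n)) - set(A) - set(B)
    points = set()
    for a, b in product(R_A, R_B):
        x = [None] * n
        for i, v in tail.items():
            x[i] = v
        for i, v in zip(A, a):
            x[i] = v
        for i, v in zip(B, b):
            x[i] = v
        points.add(tuple(x))
    return points
```

With $n = 5$, $A = \{0, 1\}$, $B = \{3\}$ and the remaining coordinates fixed by the tail, a body with 2 choices on $A$ and 2 on $B$ yields exactly 4 points, all sharing the same tail.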

Defn: The density $\delta_R$ of $R$ is $|R_A \times R_B| / |D^A \times D^B|$.

The footsize $m_R$ of $R$ is $\min\{|A|, |B|\}$.

For oblivious BP lower bounds, the footprint $(A, B)$ is the same for all of the embedded rectangles associated with the partition of the input sequence $\sigma$.

This lets us use communication lower bounds over the reduced variable set $A \cup B$.

Slide16

Lower Bounds from Embedded Rectangles

Strategy:

Write $f = \bigvee_{i=1}^{E} f_i$ where each $f_i^{-1}(1)$ is a union of embedded rectangles with footsize $m$, the same footprint $(A_i, B_i)$, but different tails.

For Read-$k$ BPs $E \le 2^{LS}$ (one per trace).

For general length $kn$ BPs?

Show that any embedded rectangle in $f^{-1}(1)$ with footsize $\ge m$ has density $\le \delta$.

Implies that $E \cdot \delta \ge |f^{-1}(1)| / |D^n|$.

Slide17

Decomposition for Length $kn$

Recall that $f = \bigvee_{\tau} f_\tau$. Fix one of the $2^{S}$ traces $\tau$.

Apply layer assignment separately for each $x$ with trace $\tau$ to the sequence of variables queried on input $x$.

One of $\le 2^{L}$ possible layer assignments $\lambda = \mathrm{layers}(x)$.

Let $\mathrm{private}(x)$ be the pair consisting of the first $m$ variables in the private inputs for Alice and Bob, respectively, under $\lambda$.

At most $[\ldots]$ choices $(A, B)$ for $\mathrm{private}(x)$.

Claim: For disjoint $A, B \subseteq [n]$ with $|A| = |B| = m$, values $\alpha \in D^{[n]-A-B}$, trace $\tau$ and layer assignment $\lambda$, $R = \{x \in D^n \mid \mathrm{private}(x) = (A, B),\ x_{[n]-A-B} = \alpha,\ \mathrm{trace}(x) = \tau,\ \mathrm{layers}(x) = \lambda\}$ is an embedded rectangle with footprint $(A, B)$ in $f^{-1}(1)$.

Claim ⇨ we can choose $E \le 2^{S}\, 2^{L}$

Slide18

Proving the Claim

Claim: For disjoint $A, B \subseteq [n]$ with $|A| = |B| = m$, values $\alpha \in D^{[n]-A-B}$, trace $\tau$ and layer assignment $\lambda$, $R = \{x \in D^n \mid \mathrm{private}(x) = (A, B),\ x_{[n]-A-B} = \alpha,\ \mathrm{trace}(x) = \tau,\ \mathrm{layers}(x) = \lambda\}$ is an embedded rectangle with footprint $(A, B)$ in $f^{-1}(1)$.

Slide19

Lower Bounds from Embedded Rectangles

Strategy:

Write $f = \bigvee_{i=1}^{E} f_i$ where each $f_i^{-1}(1)$ is a union of embedded rectangles with footsize $m$, the same footprint $(A_i, B_i)$, but different tails.

For Read-$k$ BPs: $E \le 2^{LS}$ (one per trace).

For general length $kn$ BPs: $E \le 2^{(S+1)[\ldots]}$

Show that any embedded rectangle in $f^{-1}(1)$ with footsize $\ge m$ has density $\le \delta$.

Implies that $E \cdot \delta \ge |f^{-1}(1)| / |D^n|$.
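The counting step behind $E \cdot \delta \ge |f^{-1}(1)|/|D^n|$ can be spelled out. For a fixed footprint $(A_i, B_i)$ there are $|D|^{n-|A_i|-|B_i|}$ possible tails, and for each tail the rectangle has at most $\delta\,|D|^{|A_i|+|B_i|}$ points, so each $f_i^{-1}(1)$ has at most $\delta\,|D|^n$ points; summing over the $E$ pieces gives the inequality. The short sketch below just evaluates the two quantities involved; the function names and the sample numbers are illustrative assumptions.

```python
def density(R_A, R_B, domain_size, A, B):
    """delta_R = |R_A x R_B| / |D^A x D^B| for an embedded rectangle
    with footprint (A, B) and body R_A x R_B."""
    return (len(R_A) * len(R_B)) / (domain_size ** (len(A) + len(B)))

def min_pieces(accept_fraction, delta):
    """Smallest E allowed by E * delta >= |f^{-1}(1)| / |D^n|."""
    return accept_fraction / delta

if __name__ == "__main__":
    # A body with 2 of the 4 assignments to A and both assignments to B.
    print(density({(0, 0), (1, 1)}, {(0,), (1,)}, 2, [0, 1], [3]))  # 0.5
    # If f accepts a quarter of all inputs and every rectangle has
    # density at most 2^-10, at least 256 pieces are needed.
    print(min_pieces(0.25, 2 ** -10))  # 256.0
```

Combined with an upper bound on $E$ in terms of the space $S$, this lower bound on $E$ is what produces the space lower bound.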

Slide20

Functions with Embedded Rectangle Tradeoffs

Show that any embedded rectangle in $f^{-1}(1)$ with footsize $= m$ has density $\le \delta$:

Cannot be smaller than $\delta = |D|^{-2m}$ (just one point)

Functions $f$ with $\delta = |D|^{-\varepsilon m}$:

Hamming separation $\mathrm{HAM}_\gamma$: [Ajtai 2002] e.g. $D = [n^6] = \{0,1\}^{6 \log n}$. Output is 1 iff $\Delta(x_i, x_j) \ge 5 \log_2 n$ for all $i \ne j$.

Membership in linear codes over finite field $\mathbb{F}$ [Jukna 2009]

Middle bit of integer multiplication of numbers with $D = \{0,1\}^b$, i.e. $n$ $b$-bit blocks. [Sauerhoff-Woelfel 2003]

All have $|f^{-1}(1)| / |D^n| \ge 1/|D|$.

Slide21

Lower Bounds for BPs/NBPs/RAMs for large $D$

Suppose that $[\ldots]$ and BRS layer assignment based on independent coin-flips is used. Then for 99% of $x$: $m = n/2^{k+1}$, $L = 8k^2 2^k$.

$E \cdot \delta \ge 0.99\,|f^{-1}(1)| / |D^n| \ge 0.99/|D|$

$E \le 2^{(S+1)[\ldots]}$ and $\delta = |D|^{-\varepsilon m}$

For these values $[\ldots] < 2^{(2k+6)m}$.

If $\log_2 |D| > 4k/\varepsilon$ then $2^{S} \ge |D|^{\varepsilon m/4}$.

Taking logs we get $S \ge \varepsilon' m \log_2 |D|$.

Plugging in as before yields $T = \Omega(n \log((n \log D)/S))$ [B-Jayram-Saks 2001, B-Saks-Sun-Vee 2003]

Slide22

Boolean Domains and $[\ldots]$

For $D = \{0,1\}$ we only have $\delta = 2^{-\varepsilon m}$.

Over $\mathbb{F}_2^{3n}$ Ajtai defined an explicit cubic form $f(x, y) = x^{\mathsf{T}} M_y\, x$ that requires $\delta = 2^{-\varepsilon m}$.

Alternatively: $f(x) = 1$ iff the # of $(i, j)$ pairs s.t. $i < j$, $x_i = x_j = x_{i+j} = 1$ is odd.
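The alternative function is easy to state in code. This is a minimal sketch with one assumption that the transcript does not settle: pairs with $i + j > n$ are simply ignored (no wrap-around of indices). Inputs are 1-indexed to match the slide, so position 0 of the list is unused.

```python
def triple_parity(x):
    """x is a list with x[0] unused and x[1..n] in {0, 1}.
    Returns 1 iff the number of pairs i < j with i + j <= n and
    x_i = x_j = x_{i+j} = 1 is odd, else 0."""
    n = len(x) - 1
    count = sum(
        1
        for i in range(1, n + 1)
        for j in range(i + 1, n - i + 1)   # j > i and i + j <= n
        if x[i] and x[j] and x[i + j]
    )
    return count % 2
```

For the all-ones input of length 3 the only qualifying pair is $(1, 2)$, so the output is 1; for the all-ones input of length 5 there are four qualifying pairs, so the output is 0.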

[Figure: the matrix $M_y$, with entries drawn from $y_1, y_2, \ldots, y_{2n-1}$ and 0]

Slide23

BP Lower Bound Technology for $\delta = 2^{-\varepsilon m}$

Much more complicated argument that holds only up to small amounts of nondeterminism. [Ajtai 2005]

Uses:

correlations between $\mathrm{private}(x)$ values for related inputs

an independent layer assignment that leaves most layers unassigned to either player

a probability of assigning a layer for input $x$ to one of Alice or Bob that depends on the typical # of different layers in which input variables are read on input $x$

Theorem: [Ajtai 2005, B-Saks-Sun-Vee 2003] $[\ldots]$ and $x^{\mathsf{T}} M_y\, x$ both require $[\ldots]$.

Slide24

BPs and Static Data Structures

Theorem: [Miltersen-Nisan-Safra-Wigderson 1998] With query set $\{0,1\}^m$, time lower bounds of $\omega(m)$ for size $n^{O(1)}$ static cell-probe data structures require non-trivial time-space tradeoffs (i.e. $O(\log n)$ space requires superlinear time).

Proof:

View the query $x \in \{0,1\}^m$ as the input vector.

For each fixed dataset $\mathcal{D}$, have a different branching program $P_{\mathcal{D}}$.

Can use each cell of the cell-probe data structure to hold a node of $P_{\mathcal{D}}$. Values are the index of the variable and the names of the two pointers.

Size $n^{O(1)}$ BP implies $n^{O(1)}$ size cell-probe data structure with $w = 3 \log n$-bit words (simply follow the branching program).

Time is preserved.
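The simulation in the proof can be sketched directly: each cell stores one BP node as a triple (variable index, address on 0, address on 1), and a query is answered by following pointers, one probe per BP step. The BP encoding (`node -> (variable, child_on_0, child_on_1)`, sinks `"0"` and `"1"`) and the function names are illustrative assumptions.

```python
def bp_to_cells(bp, root):
    """Lay out a BP in an array of cells. Internal cells hold
    (variable index, address if the bit is 0, address if the bit is 1);
    sink cells hold ("sink", output value). The root is stored at address 0."""
    order = [root] + [v for v in bp if v != root] + ["0", "1"]
    addr = {v: a for a, v in enumerate(order)}
    cells = []
    for v in order:
        if v in bp:
            var, lo, hi = bp[v]
            cells.append((var, addr[lo], addr[hi]))
        else:
            cells.append(("sink", int(v)))
    return cells

def query(cells, x):
    """Follow the branching program stored in `cells` on query x.
    Returns (answer, number of internal cells probed)."""
    a, probes = 0, 0
    while True:
        cell = cells[a]
        if cell[0] == "sink":
            return cell[1], probes
        probes += 1
        var, lo, hi = cell
        a = hi if x[var] else lo

if __name__ == "__main__":
    # Toy BP computing x_0 AND x_1.
    bp = {"r": (0, "0", "b"), "b": (1, "0", "1")}
    cells = bp_to_cells(bp, "r")
    print(query(cells, (1, 1)))  # (1, 2)
```

The number of probes equals the length of the path followed in the BP, which is the sense in which time is preserved; each cell needs room for three $O(\log n)$-bit fields when the BP has $n^{O(1)}$ nodes.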

Can extend each step to a full tree of height $k$ at the cost of a $2^k$ factor larger word-size $w$. Saves a factor $k$ in time.

Slide25

A Converse

Theorem: [B-Vee 2002] A static data structure with $2^{S}$ cells plus extra work space at most $S$, and a time $T$ query algorithm that reads $\le k$ consecutive bits of the query in one step, yields a $2^k$-way BP $P_{\mathcal{D}}$ of time $O(T)$ and space $O(S + \log T)$ for every dataset $\mathcal{D}$.

Proof: Each BP node corresponds to a cell name + configuration of the extra storage. Memory contents are fixed by $\mathcal{D}$. The input bits accessed are determined by the algorithm and the fixed memory cell contents just read.

Slide26

Application to $\lambda$-Near Neighbor [B-Vee 2002]

Hamming separation $\mathrm{HAM}_\gamma$: e.g. $D = [n^6] = \{0,1\}^{6 \log n}$. Output is 1 iff $\Delta(x_i, x_j) \ge 5 \log_2 n$ for all $i \ne j$.

Can solve $\mathrm{HAM}_\gamma$ using a $\lambda$-Near Neighbor data structure:

Encode each coordinate $x_i$ in $D$ using twice the bits so that its distance from 0 is fixed.

Choose $\mathcal{D}$ to be the set of all possible strings of the form $0^{i-1}\, a\, 0^{j-i-1}\, a\, 0^{n-j}$.

$\mathrm{HAM}_\gamma(x) = 0$ iff $\mathcal{D}$ contains a close string to $x$.

So a BP lower bound for $\mathrm{HAM}_\gamma$ implies:

Theorem: Any $\lambda$-Near Neighbor data structure for Hamming distance on $\{0,1\}^m$ that reads $O(\log n)$ consecutive bits per time step and uses $[\ldots]$ memory cells requires time $\Omega(m)$.

Slide27

Larger bounds for Huge Domains

Inspired by multiparty NOF communication complexity

Uses embedded cylinder intersections instead of embedded rectangles

Theorem: There is an explicit function over a huge domain for which $T = \Omega(n \log^2((n \log |D|)/S))$ is needed. [B-Vee 2002]

Drawbacks:

Domain size $|D|$ requires $\Theta(\log^3 n)$ bits to encode.

Function, which is based on tensored, interleaved Reed-Solomon codes, is not known to be in NP.

Slide28

Single-Output Methods for Multi-output Problems

Slide29

Open Problems

Prove general BP lower bounds for out-degree 2 (arbitrary) directed graph reachability.

Savitch's Theorem implies $S = O(\log^2 n)$ and we don't expect $S = O(\log n)$ is possible at all. (Would imply NL/poly = L/poly.)

Prove that $S = O(\log n)$ implies $T = \Omega(n^2)$ or $T = \Omega(n^{1+\varepsilon})$.

At least match oblivious BP bound of $T = \Omega(n \log^2(n/S))$ for out-degree 1.

Improve best lower bound for Boolean functions from $[\ldots]$ to $T = \Omega(n \log(n/S))$ to match the large domain and oblivious BP bounds.

Generalize embedded rectangle techniques for Boolean inputs to embedded cylinder intersections.

Slide30

Open Problems

Prove any oblivious BP lower bound for an explicit single-output function that holds for time $T = n \log^{\omega(1)} n$ or even $T = \omega(n \log^2 n)$.

Prove $T = \Omega(n \log^2(n/S))$ oblivious BP lower bound for a wider range of natural functions.

Slide31

Open Problems

Prove an $\Omega(n^2)$ size lower bound for an explicit Boolean function.

Find better time-space tradeoff lower bounds for other multi-output functions, e.g.

Encoding asymptotically good error-correcting codes. [Bazzi-Mitter 2005] conjectured $[\ldots]$

Element distinctness in sliding windows [B-Clifford-Machmouchi 2013] ?

Slide32

Okol'nishnikova Strategy for Layers

Use $4k^2$ equal-length layers. Give a random subset of $2k$ of them to Bob.

$r \le 4k+1$ follows automatically.

If the length is at most $kn$ then at least $n/2$ variables are read at most $2k$ times.

Each such variable appears in at most $2k$ layers.

Probability all of those layers are given to Bob is $\ge (2ek)^{-2k}$, so the expected size for Bob's set is $\ge (2ek)^{-2k}\, n/2$.

Choose some subset that achieves the average.

Bob only sees $\le 2k\,(kn/(4k^2)) = n/2$ inputs so Alice's set size is always $\ge n/2$.
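The probability bound used above can be checked numerically. If a variable appears in $t \le 2k$ of the $N = 4k^2$ layers, the chance that Bob's random $2k$-subset contains all $t$ of them is $\binom{N-t}{2k-t} / \binom{N}{2k}$, which is smallest at $t = 2k$, where it equals $1/\binom{N}{2k}$; the standard estimate $\binom{N}{r} \le (eN/r)^r$ then gives $(2ek)^{-2k}$. The sketch below (function names are illustrative) evaluates both sides.

```python
from math import comb, e

def prob_all_layers_to_bob(k, t):
    """Probability that t fixed layers all land in a uniformly random
    2k-subset of the 4k^2 layers (requires 0 <= t <= 2k)."""
    total_layers = 4 * k * k
    return comb(total_layers - t, 2 * k - t) / comb(total_layers, 2 * k)

def layer_bound(k):
    """The lower bound (2ek)^(-2k) quoted for this strategy."""
    return (2 * e * k) ** (-2 * k)

if __name__ == "__main__":
    for k in range(1, 6):
        print(k, prob_all_layers_to_bob(k, 2 * k), layer_bound(k))
```

In every row the exact probability is at least the quoted bound, and fewer occupied layers only make the probability larger.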