Download
# IEEE TRANSACTIONS ON INFORMATION THEORY VOL PDF document - DocSlides

alida-meadow | 2014-12-12 | General

### Presentations text content in IEEE TRANSACTIONS ON INFORMATION THEORY VOL

Show

Page 1

1166 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Correspondence ________________________________________________________________________ Efficient Coding Schemes for the Hard-Square Model Ron M. Roth , Senior Member, IEEE , Paul H. Siegel , Fellow, IEEE and Jack Keil Wolf , Life Fellow, IEEE Abstract The hard-square model, also known as the two-dimensional (2-D) (1 -RLL constraint, consists of all binary arrays in which the ’s are isolated both horizontally and vertically. Based on a certain prob- ability measure defined on those arrays, an efficient variable-to-fixed en- coder scheme is presented that maps unconstrained binary words into ar- rays that satisfy the hard-square model. For sufficiently large arrays, the average rate of the encoder approaches a value which is only 0.1% below the capacity of the constraint. A second, fixed-rate encoder is presented whose rate for large arrays is within 1.2% of the capacity value. Index Terms Constrained codes, enumerative coding, hard-square model, maxentropic probability measure, permutation codes, two-di- mensional (2-D) run-length-limited (RLL) constraints, variable-to-fixed encoders. I. I NTRODUCTION In current digital optical and magnetic recording systems, such as disks and tapes, the data is written along tracks , thus visualized as a one-dimensional (1-D) long sequence. To ensure reliability, the raw data typically undergoes lossless coding into a binary sequence that sat- isfies certain constraints. One of the most commonly used constraints is the (1-D) d;k -run-length-limited (RLL) constraint, which consists of all finite binary words in which the run lengths of ’s are at least and run lengths of ’s between two consecutive ’s do not exceed [18], [19], [25]. Recent developments in optical storage—especially in the area of holographic memory—are attempting to increase the recording density by exploiting the fact that the recording device is a surface . Under this new model, the recorded data is regarded as two–dimensional (2-D), as opposed to the track-oriented 1-D recording paradigm. The new approach, however, introduces new types of error patterns and con- straints—those now become 2-D rather than 1-D (see [2], [4], [12], [13], [22], [23]). Manuscript received September 5, 1999; revised August 24, 2000. This work was supported in part by the United States–Israel Binational Science Foundation (BSF), Jerusalem, Israel, under Grants 95-00522 and 98-00199, the National Science Foundation (NSF) under Grant NCR-9612802, and by the Center for Magnetic Recording Research at the University of California, San Diego. The material in this correspondence was presented in part at the SPIE International Symposium on Optical Science, Engineering and Instrumentation, Denver, CO, July 2000 and at the IEEE International Symposium on Information Theory, Sorrento, Italy, June 2000. R. M. Roth is with the Computer Science Department, Technion–Israel Insti- tute of Technology, Haifa 32000, Israel (e-mail: ronny@cs.technion.ac.il). P. H. Siegel is with the Department of Electrical and Computer Engineering, 0407, University of California, San Diego, La Jolla, CA 92093-0407 USA (e-mail: psiegel@ucsd.edu). J. K. Wolf is with the Center for Magnetic Recording Research, Uni- versity of California, San Diego, La Jolla, CA 92093-0401 USA (e-mail: jwolf@ucsd.edu). Communicated by E. Soljanin, Associate Editor for Coding Techniques. Publisher Item Identifier S 0018-9448(01)01331-1. Fig. 1. Parallelogram The treatment of 2-D constraints seems to be much more difficult than the 1-D case. This is, in part, due to the fact that in the general constrained setting, there are problems that are easy to solve in the 1-D case, yet they become undecidable when we shift to two dimensions [3], [24]. One important example of a 2-D constraint is the 2-D extension of the (1 -RLL constraint. This constraint, which is also referred to as the hard-square model, has been treated in quite a few papers in the past several years; see, for example, [6], [10], [11], [20], [29]. This constraint will also be the focus of this work. We define next the hard- square model, borrowing terms from [5]. Let be a finite subset of the integer plane and let be a finite set, referred to as an alphabet .A configuration is a mapping The value of at location i;j will be denoted by ;j We say that a -configuration satisfies the hard-square model if = and for every two distinct locations i;j ;j j 1= i;j =0 or ;j =0) Equivalently, if we write down the values of the -configuration in the integer plane, then the ’s are isolated both horizontally and vertically (either by ’s or by unassigned locations). The set of all -configura- tions that satisfy the hard-square model will be denoted by The subsets considered in this work will be either rectangles m;n i;j :0 i j (1) or parallelograms m;n i;j :0 i j (2) (see Fig. 1). We will be mainly concentrating on the hard-square model, as the known literature, as well as the results obtained herein, are elab- orate enough already for this special case. The capacity , or the topological entropy , of the hard-square model is given by )= lim m;n !1 log jS m;n mn = lim m;n !1 log jS ( m;n mn The limits indeed exist and are equal [5], [16]. The value of is known to be approximately 5878911162 ; see [6], [10], [11], [29]. 0018–9448/01$10.00 © 2001 IEEE

Page 2

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1167 Much less is known about efficient (i.e., polynomial-time, or low- complexity) high-rate coding schemes for this constraint. In [26], the idea of 2-D bit-stuffing was introduced, resulting in a variable-to-fixed encoder whose expected rate was bounded from below in [26] by ap- proximately 5515 . Note that in a variable-to-fixed scheme, the set of preimages, denoted , consists of binary words that are not neces- sarily of the same length; still, every sufficiently long binary uncon- strained word has exactly one element in as a prefix (namely, the set is prefix-free and complete). For the purpose of computing the rate, we define a probability measure on , where a preimage of length has probability . Indeed, by the properties of it follows that =1 (the Kraft equality). The expected rate of such a coding scheme is given by m;n mn A very simple coding scheme into ( m;n at a fixed rate 1: 2 is implied by [14, Lemma 1(e)]: entries i; j m;n such that is even are filled with the input bit stream, while the remaining entries are set to zero. We do not know of any other published efficient fixed-rate encoders at (significantly) higher rates for the 2-D (1 -RLL con- straint. The main goal of this work is designing efficient coding schemes for mapping, in a one-to-one manner, unconstrained binary words into ele- ments of m;n or ( m;n . Based on the idea of 2-D bit-stuffing introduced in [26], we present in Section III a variable-to-fixed encoder into ( m;n . Our coding scheme attains a rate which is approxi- mately 587277 , namely, only 0.1% below the value of Our variable-to-fixed rate encoder effectively realizes a certain prob- ability measure m;n on ( m;n . This measure is defined in Section II and its properties are proved in Section IV. In particular, we show that the marginal probability induced by m;n at every given row—and, re- spectively, at every given diagonal—of a random m;n -configuration in ( m;n is a first-order Markov process. With a slight compromise on the coding rate, we can also obtain an efficient fixed-rate encoder into m;n . Such an encoder is pre- sented in Section V with a rate that approaches 581074 for large values of or ; this rate is within 1.2% of the value of II. P ROBABILITY EASURE ON ARALLELOGRAMS Let m;n be the parallelogram defined by (2) and shown in Fig. 1. Row in m;n consists of all locations i; j such that j Diagonal consists of all locations i; d such that i .Row will be denoted by (h) and will be referred to as the horizontal boundary of m;n . Similarly, Diagonal , denoted (d) , will be referred to as the diagonal boundary of m;n . Those boundaries are depicted as thick lines in Fig. 1. The set m;n ( (h) (d) (i.e., the parallelogram excluding its boundaries) will be denoted by m;n A random m;n -configuration taking values from ( m;n (ac- cording to some probability measure) will be denoted by , and its value at location i; j will be denoted by i;j Let m;n be a probability measure defined on ( m;n ; that is, m;n )= for every 2S ( m;n . The (measure- theoretic) entropy of m;n is defined by m;n )= mn 2S ( m;n )log m;n The value m;n is the largest possible expected rate of any encoder that maps, in a one-to-one manner, a set of input binary words into ( m;n , with a probability measure defined on that induces the measure m;n on ( m;n . This clearly implies the inequality m;n log jS ( m;n mn (3) Now, suppose that m;n m;n =1 is a (2-D) sequence of prob- ability measures, where each individual measure m;n is defined on ( m;n . For a m;n -configuration 2S ( m;n , let (h) be the set of all +1 ;n -configurations in ( +1 ;n obtained from by appending an +1) st row. Similarly, let (d) be the set of all m;n +1 -configurations in ( m;n +1 obtained from by appending an +1) st diagonal. We say that the sequence m;n m;n is nested if for every m; n and 2S ( m;n m;n )= +1 ;n )= m;n +1 In other words, for every and , the measure m;n is the marginal distribution on ( m;n which is induced by the measure ;n ( ;n [0 1] . The nesting property allows us to regard as a measure which is an infinite extension of the individual measures m;n . The entropy of is defined by ) = lim m;n !1 m;n (by subadditivity the limit exists), and from (3) we have [5]. An (infinite extension) measure for which )= is called a maxentropic measure . Such a measure indeed ex- ists [5]. Our coding scheme effectively defines nested measures m;n ( m;n [0 1] for every m; n As we show, the sequence m;n m;n satisfies ) = lim m;n !1 m;n 587277 Since the limit is very close to the known bounds on , we can say that is “almost maxentropic.” The expected rate of our coding scheme approaches, through the values of m;n , the value For every 2S ( m;n , the value m;n )= takes the following form: m;n )= (h) ;x ;x ;n (d) ;x ... ;x 1) =1 +1 i;j i;j ;x ;j ;x ;j +1 (4) The components (h) , and (d) define the measure on location (0 0) and on the horizontal and diagonal boundaries, respectively, and will be specified in more detail below. The function [0 1] is defined through two parameters, [0 1) and (0 1] as follows: (0 u; y ; v )= if =0 otherwise (5)

Page 3

1168 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Fig. 2. Location of the arguments of the function u;y;v and (1 u; y ; v )= 1 (0 u; y ; v . The distribution on ;j can be described verbally as follows. As dictated by the hard-square model, the value of i; j is forced to be unless i; j ;j =0 . When the latter condition is met, then i; j will be a Bernoulli random bit whose distribution depends on the value of ;j +1 ; if that value is , then i; j takes the value with probability ; otherwise, i; j takes the value with probability . Fig. 2 shows the values that determine the distribution of i; j ; the location i; j is marked by a box. The measures on the boundaries, defined by (h) and (d) , are set so that the nonboundary values have a stationary distribution in the sense stated in Proposition 2.1 below. Specifically, (h) will take the form of a first-order Markov process (h) ;w ... ;w )= =1 (h) (6) where (h) [0 1] is given by (h) (0 )= ; if =0 otherwise (7) for some [0 1] , and (h) (1 )=1 (h) (0 The values of g! [0 1] will be set to the stationary prob- abilities of the first-order Markov process (h) as follows: (0) = and (1) = 1 , where (0) = (8) As for the diagonal boundary, (d) will be a first-order Markov process of the form (d) ;w ... ;w )= =1 (d) (9) where (d) [0 1] is given by (d) (0 )= (10) with q and q (11) and (d) (1 )=1 (d) (0 (since , the denominators in (11) are guaranteed to be positive). The values in (11) are consistent with the stationary distribution along the horizontal boundary: as we show in Section IV-A, (11) implies that i; =0 and, furthermore, i; =0 ;j ;j =0 for all i and j The nesting property of the measures m; n is easily verified. Next, we state other properties of those measures that will be proved in Sec- tion IV. Hereafter, the notation ( m; n will mean that the random m; n -configuration is taken from the sample space ( m; n according to the distribution m; n We say that row in ( m; n forms a first-order Markov process identical to the horizontal boundary if for every j< and every nonempty word of length i; j =0 i; j i; j i; j ; if =0 if =1 provided that the event we condition on has positive probability. The main result in Section IV-A is the following. Proposition 2.1: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Then the entries in each row in ( m; n form a first-order Markov process identical to the horizontal boundary if and only if ;q )= +4 (1 2(1 (12) As we show in Section IV-B, there exists a counterpart of Proposition 2.1 also for the diagonals in ( m; n Remark: The definition of m; n through a “local” conditional mea- sure j u; y ; v on as given by (5) somewhat resembles the Pickard random fields defined in [21], except that columns therein are replaced here by diagonals. Note, however, that Pickard fields assume that the measure is invariant under all the symmetries of the square, whereas we require less: the distribution along rows may (and will) differ from the distribution along diagonals. Thus, the result in Propo- sition 2.1 is different than the first-order Markov property of Pickard fields. We now turn to the measure-theoretic entropy of m; n . Define the real-valued function :[0 1] [0 1] by )= log (1 ) log (1 where (0) = (1) = 0 We show in Section IV-A the following lower and upper bounds on m; n Proposition 2.2: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)––(11) and (12). Then m; n 1)( 1) mn )+(1 )) mn By Proposition 2.2 we have )= lim m; n !1 m; n )= )+ (1 ))

Page 4

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1169 which, with (8) and (11), yields )= ;q )) = )+ (1 )) (2 )( q where ;q is given by (12). The numerator and denominator can be made linear in by observing that (12) implies that =(1 (1 . This yields ;q )) = (1 )+( (1 (2 +3 )+ (1 + (13) To obtain the largest rate, we maximize ;q )) with respect to and . The maximum is attained for 671833 and 566932 , and the maximum value is ;q )) 587277 Our analysis depends strongly on the particular structure of the mea- sure m;n —in particular, on conditioning the probability of the event i;j =0 only on the values of the three entries ;j ;j +1 and i;j , as shown in Fig. 2. We note that such conditioning is causal in that we can select—according to the measure m;n —an el- ement of ( m;n by determining the values of its entries consecu- tively row-by-row or diagonal-by-diagonal: in such a process, the dis- tribution of values of the next entry to be determined is well-defined as this distribution depends on values that have already been set. Such a feature enables using the measure m;n for encoding, as we show in Section III. Clearly, we may maintain causality and still approach capacity by conditioning the value of i;j on more entries in the “past.” However, it appears that the analysis thus becomes much more complex. For ex- ample, when i;j is conditioned also on ;j +2 , we no longer have even a second-order Markov process along rows. III. V ARIABLE -R ATE NCODING CHEME We describe how the estimate on m;n , given in Proposition 2.2, can be approached by a variable-to-fixed rate coding scheme. The objective is to realize the probability measure m;n on ( m;n in the output of the encoder. The encoder consists of the following components. 1) A distribution transformer that maps, in a one-to-one manner, sequences of fair coin flips (i.e., independent Bernoulli random bits, each equaling with probability ), into sequences of independent Bernoulli random bits such that each bit equals with probability . There are known methods [8, Sec. 5.12] to implement variable-to-fixed rate transformers such that, for as the code length goes to infinity, the following holds: a) the expected rate (i.e., expected number of input bits per each output bit) of is at least )(1 b) all the words of the original Bernoulli source, except for a fraction whose probability is less than , are generated by with probability that differs from the original prob- ability by a factor within . Namely, the typical words of the original source are generated by with virtually the same probability. 2) A distribution transformer , similar to , except that the output is with probability . The rate of can get arbitrarily close to 3) Probabilistic boundary generator , to be explained below. 4) Constrained coder , to be explained below. Fig. 3. Encoding of by The raw input bits are fed into the transformers and , each input bit entering exactly one of the transformers. The coder then queries the outputs of and throughout the encoding process. The order of queries determines which transformer is fed by any given input bit. The encoding procedure starts by generating the entry at the origin, the entries ;j j , along the horizontal boundary, and the entries i; i , along the diagonal boundary. Those entries are generated by probabilistically, using (internal) sources of Bernoulli random trials (i.e., internal coin flips), with probabilities of success , and , as given by (12), (8), and (11). Note that these coin flips can be driven by external sources (as is done in and ), thus, contributing to the rate; however, since the boundaries occupy only bits out of the mn bits of m;n , such a rate contribution becomes marginal when and are large. The main coding task is performed by , which is fed by the outputs of and . At each encoding step, generates a value i;j in a new location i; j in m;n , as described in Fig. 3. The value of i;j depends on the values ;j ;j +1 , and i;j (which are assumed to have already been generated), and also on at most one output bit of or . To this end, there are two natural orders in which the values i;j can be computed: they can be generated row-by-row, or diagonal-by-diagonal. As we show in Corollary 4.3 in Section IV-A, when m;n satisfies (4)–(11) and (12) we have i;j ;j +1 =0 for all i; j m;n (h) . Hence, the expected number of locations i; j m;n for which i;j ;j +1 =0 ,is =( 1)( 1) : This is the expected number of times that or are queried by The expected number of times that (respectively, ) is queried is N (respectively, (1 ). Therefore, the expected rate of the overall coding scheme is m;n )+ (1 )) (1 m;n mn 1)( 1) mn )+ (1 ))(1 m;n 1)( 1) mn (1 m;n where lim m;n !1 m;n =0 . Namely, for 2f , we bound from below the rates of by )(1 m;n ; the factor m;n also in- corporates the ratio between the probability with which a typical word is generated by , compared to the probability with which such a word is generated by an ideal Bernoulli source (defined by ). Simulations suggest that the rate m;n is attained regardless of the boundary values set by ; yet we have not proved this. On the other hand, there is clearly a fixed assignment for the boundaries that yields expected rate at least m;n . If we knew such an assignment, we could hard-wire it into the decoder, in which case it would be sufficient to

Page 5

1170 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 transmit only the 1)( 1) nonboundary values of , making redundant. Decoding is carried out as follows. Bits are read from the received m;n -configuration in the order they were generated by the encoder, disregarding each that immediately follows a horizontally or ver- tically. The remaining bits are then divided into two bit streams ac- cording to the transformer that generated each individual bit. The bit streams are then fed into the decoders (i.e., inverse mappings) of the respective transformers. Our coding scheme can be simplified by combining the distribu- tion transformers and , in which case our encoder becomes the bit-stuffing encoder in [26] (except that the analysis here takes into ac- count that stuffed ’s overlap, thereby improving on the lower bound of [26] on the expected rate). In such a case, we maximize ;q )) in (13) under the restriction . The maximum is attained for 644400 , and the maximum value is ;q )) 583056 , which is within 1% of the capacity . We mention that this latter rate can be attained also by tuning the parameters in the method presented recently and independently in [15]. IV. F IRST -O RDER ARKOV ROPERTIES OF A. Horizontal First-Order Markov Process In this section, we provide proofs for Propositions 2.1 and 2.2. We start by verifying that the value of in (8) is the stationary prob- ability of the first-order Markov process along the horizontal and diag- onal boundaries. It is easy to see that (h) (0 0) (h) (0 1)(1 )= +1 thus implying that is indeed the stationary probability of ;j =0 along the horizontal boundary. Similarly, by the choice of and in (11) we also have (d) (0 0) (d) (0 1)(1 )= (1 making also the stationary probability of i; =0 along the diagonal boundary. Denote by the set of all binary words of length , and by the set of all nonempty binary words of length For i; j m;n and +1 , denote by i;j the event i;j )= i;j i;j i;j i;j +1 Also, for i; j m;n rm and 2f , define the event i;j )= i;j \f ;j +1 (see Fig. 4). We also define the vectors i;j )= fA (0) i;j fA (1) i;j Note that by the way we set the diagonal boundary we have i; (0)= i; =0 ;X =0 i; =0 ;X =1 (1 q (1 (14) Fig. 4. Event and i; (1)= i; =1 ;X =0 i; =1 ;X =1 (1 (1 )(1 q (1 (15) Given ;q ; [0 1] , we define the following matrices: q (1 (1 )0 (1 )(1 )0 and =0 . The notations and will stand for the identity matrix and the row vector (11) , respectively. Recall that we say that row in ( m; n forms a first- order Markov process identical to the horizontal boundary if for every j and every word i; j =0 i; j i; j i; j ; if =0 if =1 (16) provided that the event we condition on has positive probability. The following lemma is easily verified. Lemma 4.1: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Suppose that for some in the range i ,row in ( m; n forms a first-order Markov process identical to the horizontal boundary. Then i; j bc )= b; c i; j for all j and Proof of Proposition 2.1: We start with the “if” part and prove by induction on that row forms a first-order Markov process identical to the horizontal boundary. We do this by showing that (16) holds for every j and every word of length exactly (which clearly implies that it holds for all shorter words). First note that the sample space ( m; n of forces (16) to hold whenever the first bit in is =1 We now consider words with =0 . Our induction proof for row assumes that row forms a first-order Markov process identical to the horizontal boundary. Clearly, this trivially holds for the induction

Page 6

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1171 base =1 . Write =0 where 2f , in which case (16) becomes fA i; j (00 fA i; j (0 This, in turn, is equivalent to i; j (00 )= i; j (0 (17) By Lemma 4.1 and the induction hypothesis we have i; j (00 )= i; j (0 Hence, (17) can be rewritten as I i; j (0 )=0 It follows that in order to show (16), it suffices to prove that for j and 2f , the vector i; j (0 is either the zero vector or a (right) eigenvector of associated with the eigenvalue Now, it is easy to see that (12) implies that is a nonnegative eigen- value of ; indeed, is the nonnegative root of the quadratic equa- tion (1 )+ q =0 (18) obtained from the equality det( I )=det q (1 =0 We now distinguish between two cases for the value of the word Hereafter stands for the all-zero word of length Case 1: . By (14) it follows that i; (0) is an eigen- vector of associated with the eigenvalue . Hence, by Lemma 4.1 and the induction hypothesis, we have i; j )= i; (0)= i; (0) j i: Namely, i; j is also an eigenvector of associated with the eigenvalue Case 2: . Write .If s then starts with a , or else we are in the trivial case in which the event we are conditioning on in (16) has zero probability (i.e., i; j (1 is zero). By Lemma 4.1 and the induction hypothesis, we obtain i; j (1 )= i; j (0) (1 (1 )(1 (1 (19) for some real , where (0) is the first coordinate of i; j and where the last equality in (19) follows from (18). In fact, (19) also applies to , in which case is the empty word and i; j (1 )= i; (1) , which, in turn, is given by (15). Combining (19) with Lemma 4.1 we obtain i; j (01 )= i; j (1 (1 (1 That is, i; j (01 , if nonzero, is an eigenvector of associated with the eigenvalue . Now, by Lemma 4.1 we have i; j (0 )= i; j (01 )= i; j (01 We thus conclude that i; j (0 , if nonzero, is an eigenvector of associated with the eigenvalue for every j and 2f . This establishes the induction step. We now turn to the “only if” part. We show that the equality =1 =0 =1 (20) implies that satisfies (18). Indeed, =01 =001 =0001 =0101 ((1 +(1 )(1 )) Combining this with (20), we obtain (1 )= ((1 +(1 )(1 )) which, with (8) and (11), yields (18). Corollary 4.2: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11) and (12). Then, for every i; j m; n and , the vector i; j (0 ,if nonzero, is an eigenvector of associated with the eigenvalue Proof: By (16) we have i; j (00 )= i; j (0 for every i; j m; n and ; so, by Lemma 4.1 I i; j (0 )=0 (21) Now, the matrix I is singular since is an eigenvalue of On the other hand, implies that =1 ; so, the vector 1( I is nonzero. Hence, that vector spans the rows of I , thus implying by (21) that i; j (0 is an eigenvector of associated with the eigenvalue Corollary 4.3: For m; n [0 1) , and (0 1] let m; n be a measure defined on ( m; n by (4)–(11) and (12). If ( m; n then i; j ;j +1 =0 for every i; j m; n (h) Proof: Recall that for a m; n -configuration 2S ( m; n we denote by (d) the set of all m; n +1 -configurations in ( m; n +1 obtained from by appending another diagonal. By the nesting property of m; n we have m; n )= m; n +1 for every 2S ( m; n . Hence, it suffices to show that when ( m; n +1 , then i; j ;j +1 =0 for every i; j m; n (h) , or, equivalently, for every i; j +1) m; n +1

Page 7

1172 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Fig. 5. Event By Proposition 2.1 and Corollary 4.2, when applied to ( m; n +1 , it follows that for every i; j +1) m; n +1 , the vec- tors i; j (0) are eigenvectors of associated with the eigenvalue ; namely, i; j (0)= i; j =0 ;Z ;j +1 =0 i; j =0 ;Z ;j +1 =1 i; j (1 (22) for some constants i; j . On the other hand, we also have i; j (0)= i; j =0 : Combining the latter equation with (22) we thus obtain i; j (0)= +(1 (1 (1 for every i; j +1) m; n +1 (compare with (14)). Proof of Proposition 2.2: By (4) and Corollary 4.3 we have mn H m; n )+( 1) +( 1)( )+(1 )) =1 +1 i; j ;j =0 ;j +1 =0 i; j ;j =0 g ;j +1 =1 i; j ;j =0 g )) )+( 1) +( 1)( )+(1 )) +( 1)( 1) )+(1 )) Therefore, m; n 1)( 1) mn )+(1 )) mn as claimed. B. Diagonal First-Order Markov Process In this section, we present a counterpart of Proposition 2.1 for the diagonals of ( m; n . We state the respective claims and point out the difference in proofs compared to those in Section IV-A. For i; d m; n and , denote by d; i the event d; i )= i; d ;d 1) ;d 2) +1 ;d +1) Also, for i; d m; n (d) and 2f , define the event d; i )= d; i \f i; d (see Fig. 5). We also define the vectors d; i )= fB (0) d; i fB (1) d; i The counterparts of (14) and (15) take the form d; (0)= ;d =0 ;X ;d =0 ;d =0 ;X ;d =1 and d; (1)= ;d =1 ;X ;d =0 ;d =1 ;X ;d =1 (1 and the counterparts of the matrices b; c are (1 )0 00 (1 )0 00 where and are given by (11). We say that diagonal in ( m; n forms a first-order Markov process identical to the diagonal boundary if for every i< and every word i; d =0 ;d 1) ;d 2) `; d (23) provided that the event we condition on has positive probability. Lemma 4.4: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Suppose that for some in the range d< n , diagonal in ( m; n forms a first-order Markov process identical to the diagonal boundary. Then d; i bc )= b; c d; i for all i and Proposition 4.5: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Then the en- tries in each diagonal ( m; n form a first-order Markov process identical to the diagonal boundary if and only (12) holds. Proof: The proof follows along the lines of the proof of Proposi- tion 2.1, with a notable difference in the treatment of the case =1 in (23). Specifically, in the “if” part, we also need to show that fB d; i (01 fB d; i (1 for all i; d m; n and (24)

Page 8

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1173 The proof is carried out by induction on , and the induction step assumes that (23) holds for diagonal . Defining fB (0) d;i (1 we have d;i (1 )= By the induction hypothesis and Lemma 4.4 we obtain d;i (01 )= d;i (1 )= Now, by (11), we have +1 Hence, d;i (01 )= d;i (1 )=( +1 d;i (1 thus implying (24). The case =0 in (23) is treated by showing through the induction on that for every i; d m;n and 2f , the vector d;i (0 , if nonzero, is an eigenvector of associated with the eigenvalue given in (11). V. F IXED -R ATE NCODING CHEME Let m;n be the rectangle defined by (1). A m;n -configuration =( i;j 2S m;n is called circular if for every i , the entries i; and i;n are not both . The set of all m;n -configura- tions in m;n that are circular will be denoted by m;n . Note also that if 2S m;n and if we define the m;n -configuration =( i;j by i;j i;j if i;n otherwise i; j m;n then 2S ( m;n In this section, we present a fixed-rate coding scheme into m;n with a rate that approaches 581074 for large values of (or ). Our scheme borrows ideas from permutation codes [27], [30] combined with enumerative coding [9]. Even though the circular property is not necessary for the coding, it will make the analysis simpler. The m;n -configurations generated by the encoder will have the additional property that all rows in them have the same (Hamming) weight n for a value [0 1] that will be determined in the sequel. Let be in m;n and assume that for some in the range i ,row in has weight . Let ;j ;j be the indexes for which ;j =1 . Clearly, we must have i;j =0 for every . Define the words i;j +1 i;j +2 i;j k and i;j +1 i;n i; i;j The word is called the th phrase of row in . Note that row in is obtained by shifting the word (1) (2) cyclically entries to the right. The length of is called the th phrase length in row of . Denoting that length by , the list ;` ;` is called the phrase profile of row in . Clearly, +1 , where +1 is defined to be . Hence, the phrase profile of row is completely determined by row . Note also that =1 . We mention that a (somewhat different) definition of phrases is used also in 1-D permutation codes [27], [30]. For a positive integer , let ;` denote the set of all words of length that satisfy the 1-D (1 -RLL constraint. Similarly, we define ;` . Also, denote by `;r (respectively, `;r ) the set of words in (respectively, `;r ) of weight . It is easy to see that m;n -configuration 2S m;n is in m;n if and only if every row in is in Lemma 5.1: For every two positive integers and such that , there is a mapping a;b S !S S that is one-to-one and weight-preserving. Proof: Since and and play symmetrical roles, we can assume that .For 2S and , we define a;b ;y )=( ;w where and are determined as follows: if =0 or =0 , then we set and (25) Otherwise (if =01 and =1 +1 +2 and (26) It is easy to see that the mapping a;b is into S and is weight- preserving. To show that it is one-to-one, we need to verify that we can distinguish pairs ;w generated by (25) from those that are generated by (26). Indeed, only in the latter, the last entry of and the first entry of are both equal to Let and be two words in . We say that is consistent with if and form the rows of an array in ;n . In other words, and do not have ’s in the same position. Define n; t )= =0 jS +2 ;t (27) Lemma 5.2: For every word 2S n;t there are at least n; t words 2S n;t that are consistent with Proof: The proof is based on the observation that the number of possible assignments for depends only on the phrase profile of and only through the multiplicity (but not the order) with which each phrase length appears in that phrase profile. This phrase profile, in turn, is completely determined by Assume first that induces on the phrase profile ;` ;` where =2 (and, so, +2 ). We refer to this profile as the worst profile for length and weight . Each of the phrases of length in can take a value from 00 01 10 .It follows that for s , there are ways to assign values to the phrases of length in so that their overall weight is . If the overall weight of is , then the remaining phrase, of length +2 in must have weight . This proves the lemma assuming that induces the worst profile on It remains to establish that the worst profile is indeed the worst, in the sense that it leads to the smallest possible number of assignments for . We show this by descending induction on the number of phrases in whose lengths equal . Clearly, if at least phrase lengths equal , then, up to permutation of phrase lengths, the phrase profile is a worst profile and the claim immediately follows. Turning to the induction step, suppose that induces on a phrase profile =( ;` ;` in which =2 and =2 . We can further assume that ; indeed, if all phrase lengths in were less than , then we would have , in which case n; t )=0 We denote by the set of all words in n;t that are consistent with Let be a word in n;t that induces the phrase profile =(2 ;` ;` ;` ;` on every word that is consistent with . The set of all such words in n;t will be denoted by . Observe that has more phrase lengths

Page 9

1174 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Fig. 6. Enumerative coding into a row of an array in equaling than does. So, by the induction hypothesis on we have j n; t We next define a mapping as follows. Given the first two phrases of are obtained by applying the mapping ;` of Lemma 5.1 to the first two phrases of . The remaining phrases in are identical to their counterparts in . By Lemma 5.1, the mapping is one-to-one and weight-preserving and, so, j j n; t Let max max be the value of a nonnegative integer for which n; t is maximized. Given , and (e.g., max ), Lemma 5.2 suggests a coding scheme at a fixed rate log n; t mn into the set m;n as follows. For =0 ;m we select row from n;t so that it is consistent with row (for the case =0 , we can assume a particular word from n;t to serve as a “phantom” row ). Lemma 5.2 guarantees that we have at least n; t words in n;t that can be selected for row . This, in turn, implies the following result. Proposition 5.3: log jS m;n mn log n; t max The effective computation of row in the suggested coding scheme can be done by enumerative coding, as we describe next [7], [25, Ch. 6], [28]. Let ;` ... ;` be the phrase profile of row as induced by row . For this particular phrase profile, denote by k;s the number of possible assignments for the first phrases of row so that their overall weight is .Wehave k;s =0 jS ;r j ;s (28) where =1 and ;s =0 for s> . The values jS `;r , in turn, can be computed by the recurrence jS `;r jS ;r jS ;r ;` (29) where jS =1 jS ;r =0 for =0 jS jS =1 , and jS ;r =0 for r= 2f We can rewrite (28) and (29) in polynomial notation as follows. Let be an indeterminate, and define the polynomials )= =0 `;r =0 jS `;r j ;` and )= =0 k;s t: Then (29) becomes )= )+ ;` (30) where )=1 and )=1+ . The recurrence (28), in turn, can be written as )(mod +1 where )=1 .So =1 )(mod +1 t: (31) The latter formula can be used to accelerate the computation of the values k;s through fast techniques of polynomial multiplication [1]. Note also that the polynomials need to be computed for +1 only once for the whole array. The enumerative coding algorithm of row is presented in Fig. 6. The unconstrained input stream to be coded into row is regarded as an integer in the range p n; t , and the phrase profile of row is also assumed to be available. The main loop of the algorithm computes the phrases of row , in reverse order, starting with the th phrase. In each iteration of the main loop, the variable determines the weight of the th phrase, and equals the overall weight of the first phrases. It can be easily verified by descending induction on that each loop iteration starts with a value of in the range p k;s the induction base following from p n; t t;t . Similarly, the value of at the end of each loop iteration lies in the range < ; jS ; . The mapping from into a word in ; assumes an ordering on the elements of each set `;r . If the standard lexicographic ordering is used, then such a mapping can be efficiently implemented by (a second level of) enumerative coding, using the recurrence (29) (or (30)). We next obtain an asymptotic estimate for n; t which will enable us to compute an asymptotic lower bound on (log n; t max )) =n The following lemma is a well-known asymptotic estimate for the bi- nomial coefficients (see [17, p. 309]). Lemma 5.4: For and `> log r=` `; r ))

Page 10

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1175 where lim !1 max `;r =0 Lemma 5.5: For and `> log jS `;r =log =( r= )) r;r )) where lim !1 max `;r =0 Proof: A word is in `;r if and only if it can be written as a sequence of nonoverlapping blocks, of which equaling 10 and the remaining equaling (if the last entry in is , then the last block will also include the first entry in ). Hence, jS `;r equals the number of combinations of elements (being the indexes of the blocks 10 within ) out of It follows from Lemma 5.5 and the continuity of the function 7! that for every real [0 2] we have lim !1 (1 =` log jS `; ` =(1 = (1 )) where stands for the smallest integer not greater than . (Indeed, (1 = (1 )) is the entropy of a first-order Markov process de- fined on the (1 -RLL constraint, in which the probability of having following is = (1 ; the stationary probability of is then .) Observing that jS +3 ;r jjS +2 ;r jjS t;r jjS t;r and that we get from (27) the lower bound log n;t max s +log +log +log jS t;t (32) and the upper bound log n;t max s +log +log )+log jS +3 ;t (33) Write t=n s=n , and =( . Assuming that =(1 and , we can incorporate Lemmas 5.4 and 5.5 into the lower bound (32) to yield that, whenever !n is a nonnegative integer less than n log n;n !n +log n !n +log +log jS (1 n; (1 n =( != )+(1 (1 = (1 )) (1)) where (1) stands for an expression that goes to zero as goes to infinity. We now observe that (1 and that implies = (1 . Hence, for every fixed rational [0 3] and every such that n is an integer (1 =n log n;n sup ; (1) (34) where ; )= [1+ ((1 = 3) )]+(1 [(1 = (1 )) and the supremum in the right-hand side of (34) is taken over all rational in the range min = (1 . In fact, from the upper bound (33) it follows that the inequality in (34) can be replaced by an equality. Furthermore, since the function ; is continuous we have liminf !1 (1 =n log n; n )=max ; where the maximum is taken over all real [0 min = (1 We now maximize the expression ; over real values of [0 3] and [0 min = (1 . By taking partial deriva- tives of ; with respect to and , we get the equations (23 4)(29 4) (8357 8357 +3098 518 +38 1)=0 and (369 101 +4) 1469 682 +95 The maximum is attained for max ; max (0 216594 248986) in which case liminf !1 (1 =n log n;t max )) =liminf !1 max (1 =n log n; n sup liminf !1 (1 =n log n; n =max ; ; )= max ; max 581074 (in fact, one can easily show that in the third step—where we change the order between maximizing over and taking the limit over —the inequality can be replaced by an equality). CKNOWLEDGMENT The authors wish to thank S. Forchhammer, J. Justesen, E. Or- dentlich, A. Orlitsky, and K. Zeger for very helpful discussions and comments. EFERENCES [1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms . Reading, MA: Addison-Wesley, 1974. [2] J. J. Ashley, M. Blaum, and B. H. Marcus, “Report on coding techniques for holographic storage,” IBM, Res. Rep. RJ 10013, 1996. [3] R. Berger, “The undecidability of the domino problem, Mem. Amer. Math. Soc. , vol. 66, 1966. [4] D. Brady and D. Psaltis, “Control of volume holograms, J. Opt. Soc. Amer. A , vol. 9, pp. 1167–1182, 1992. [5] R. Burton and J. E. Steif, “Non-uniqueness of measures of maximal en- tropy for subshifts of finite type, Ergod. Theory Dynam. Syst. , vol. 14, pp. 213–235, 1994. [6] N. Calkin and H. S. Wilf, “The number of independent sets in a grid graph, SIAM J. Discr. Math. , vol. 11, pp. 54–60, 1997. [7] T. M. Cover, “Enumerative source encoding, IEEE Trans. Inform. Theory , vol. IT-19, pp. 73–77, Jan. 1973. [8] T. M. Cover and J. A. Thomas, Elements of Information Theory .New York: Wiley, 1991. [9] S. Datta and S. W. McLaughlin, “An enumerative method for run length- limited codes: Permutation codes, IEEE Trans. Inform. Theory , vol. 45, pp. 2199–2204, Sept. 1999. [10] K. Engel, “On the Fibonacci number of an lattice, Fibonacci Quart. , vol. 28, pp. 72–78, 1990. [11] S. Forchhammer and J. Justesen, “Entropy bounds for constrained 2-D random fields, IEEE Trans. Inform. Theory , vol. 45, pp. 118–127, Jan. 1999. [12] J. F. Heanue, M. C. Bashaw, and L. Hesselink, “Volume holographic storage and retrieval of digital data, Science , vol. 265, pp. 749–752, 1994. [13] J. F. Heanue, M. C. Bashaw, and L. Hesselink, “Channel codes for digital holographic data storage, J. Opt. Soc. Amer. A , vol. 12, 1995. [14] H. Ito, A. Kato, Z. Nagy, and K. Zeger, “Zero capacity region of multidi- mensional run length constraints, Electron. J. Combin. , vol. 6, p. R33, 1999.

Page 11

1176 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 [15] J. Justesen and Y. M. Shtarkov, “Simple models of two-dimensional in- formation sources and codes,” in Proc. 1998 IEEE Int. Symp. Informa- tion Theory (ISIT’98) , Cambridge, MA, 1998, p. 412. [16] A. Kato and K. Zeger, “On the capacity of two-dimensional run-length constrained channels, IEEE Trans. Inform. Theory , vol. 45, pp. 1527–1540, July 1999. [17] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes . Amsterdam, The Netherlands: North Holland, 1977. [18] B. H. Marcus, R. M. Roth, and P. H. Siegel, “Constrained systems and coding for recording channels,” in Handbook of Coding Theory ,V.S. Pless and W. C. Huffman, Eds. Amsterdam, The Netherlands: Elsevier, 1998, pp. 1635–1764. [19] B. H. Marcus, P. H. Siegel, and J. K. Wolf, “Finite-state modulation codes for data storage, IEEE J. Select. Areas Comm. , vol. 10, pp. 5–37, 1992. [20] Z. Nagy and K. Zeger, “Capacity bounds for the three-dimensional (0 1) run length limited channel, IEEE Trans. Inform. Theory , vol. 46, pp. 1030–1033, May 2000. [21] D. K. Pickard, “A curious binary lattice process, J. Appl. Probab. , vol. 14, pp. 717–731, 1977. [22] D. Psaltis and F. Mok, “Holographic memories, Scientific Amer. , vol. 273, no. 5, pp. 70–76, 1995. [23] D. Psaltis, M. A. Neifeld, A. Yamamura, and S. Kobayashi, “Optical memory disks in optical information processing, Appl. Opt. , vol. 29, pp. 2038–2057, 1990. [24] R. M. Robinson, “Undecidability and nonperiodicity for tilings of the plane, Inventiones Math. , vol. 12, pp. 177–209, 1971. [25] K. A. Schouhamer Immink, Codes for Mass Data Storage Sys- tems . The Netherlands: Shannon Foundation, 1999. [26] P. H. Siegel and J. K. Wolf, “Bit-stuffing bounds on the capacity of 2-di- mensional constrained arrays,” in Proc. 1998 IEEE Int. Symp. Informa- tion Theory (ISIT’98) , Cambridge, MA, 1998, p. 323. [27] D. Slepian, “Permutation codes, Proc. IEEE , vol. 53, pp. 228–236, 1965. [28] D. T. Tang and L. R. Bahl, “Block codes for a class of constrained noise- less channels, Inform. Contr. , vol. 17, pp. 436–461, 1970. [29] W. Weeks IV and R. E. Blahut, “The capacity and coding gain of certain checkerboard codes, IEEE Trans. Inform. Theory , vol. 44, pp. 1193–1203, May 1998. [30] J. K. Wolf, “Permutation codes, d;k codes and magnetic recording, in Proc. 1990 IEEE Colloq. South America, Argentina, Brazil, Chile, Uruguay , W. Tompkins, Ed., 1990. Quantum Codes from Cyclic Codes over GF Andrew Thangaraj , Student Member, IEEE, and Steven W. McLaughlin , Senior Member, IEEE Abstract We provide a construction for quantum codes (Hermi- tian-self-orthogonal codes over GF (4) ) starting from cyclic codes over GF (4 . We also provide examples of these codes some of which meet the known bounds for quantum codes. Index Terms -ary image, quantum Bose–Chaudhuri–Hocquenghem (BCH) code, quantum code. I. I NTRODUCTION In this correspondence, we use the ideas of Calderbank et al. [1] to construct a new class of quantum codes from cyclic codes over GF (4 . In particular, the following theorem from [1] can be used directly to obtain quantum codes from certain codes over GF (4) Theorem 1: Suppose is an n;k linear code over GF (4) self- orthogonal with respect to the Hermitian inner product. Suppose also that the minimum weight of is . Then, an [[ n;n k; d ]] quantum code can be obtained from The Hermitian inner product of u; v GF (4) is defined to be u:v where for GF (4) . From now on, orthogonality over GF (4) will be with respect to the Hermitian inner product defined above. In this correspondence, we consider self-orthogonal codes over GF (4) that are obtained as -ary images of -ary cyclic codes of length (4 1) . Binary images of self-orthogonal codes over GF (2 have been used to obtain quantum codes in [2]. Using Theorem 1 and working with -ary images usually results in smaller dimension quantum codes at the same efficiency and minimum distance. We begin with a discussion of -ary images of cyclic codes over GF . Then we present a method to obtain self-orthogonal codes over GF (4) as images of codes over GF (4 II. ARY MAGES A. Some Definitions and Notation The finite field GF can be considered to be a vector space of dimension over the field GF . Let ; ... ; be a basis for GF over GF . The map GF 7! GF mn is defined so that for ;u ;u g2 GF )= 11 ... ;u ;u 12 ... ;u ... ;u nm where =1 ij ;u ij GF n: Notice that is not the obvious expansion of in terms of but a permutation of the expansion. Manuscript received March 2, 2000; revised October 28, 2000. The authors are with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: andrew@ ee.gatech.edu; swm@ee.gatechedu). Communicated by P. W. Shor, Associate Editor for Quantum Information Theory. Publisher Item Identifier S 0018-9448(01)01526-7. 0018–9448/01$10.00 © 2001 IEEE

47 NO 3 MARCH 2001 Correspondence Efficient Coding Schemes for the HardSquare Model Ron M Roth Senior Member IEEE Paul H Siegel Fellow IEEE and Jack Keil Wolf Life Fellow IEE ID: 22955

- Views :
**138**

**Direct Link:**- Link:https://www.docslides.com/alida-meadow/ieee-transactions-on-information-563
**Embed code:**

Download this pdf

DownloadNote - The PPT/PDF document "IEEE TRANSACTIONS ON INFORMATION THEORY ..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.

Page 1

1166 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Correspondence ________________________________________________________________________ Efficient Coding Schemes for the Hard-Square Model Ron M. Roth , Senior Member, IEEE , Paul H. Siegel , Fellow, IEEE and Jack Keil Wolf , Life Fellow, IEEE Abstract The hard-square model, also known as the two-dimensional (2-D) (1 -RLL constraint, consists of all binary arrays in which the ’s are isolated both horizontally and vertically. Based on a certain prob- ability measure defined on those arrays, an efficient variable-to-fixed en- coder scheme is presented that maps unconstrained binary words into ar- rays that satisfy the hard-square model. For sufficiently large arrays, the average rate of the encoder approaches a value which is only 0.1% below the capacity of the constraint. A second, fixed-rate encoder is presented whose rate for large arrays is within 1.2% of the capacity value. Index Terms Constrained codes, enumerative coding, hard-square model, maxentropic probability measure, permutation codes, two-di- mensional (2-D) run-length-limited (RLL) constraints, variable-to-fixed encoders. I. I NTRODUCTION In current digital optical and magnetic recording systems, such as disks and tapes, the data is written along tracks , thus visualized as a one-dimensional (1-D) long sequence. To ensure reliability, the raw data typically undergoes lossless coding into a binary sequence that sat- isfies certain constraints. One of the most commonly used constraints is the (1-D) d;k -run-length-limited (RLL) constraint, which consists of all finite binary words in which the run lengths of ’s are at least and run lengths of ’s between two consecutive ’s do not exceed [18], [19], [25]. Recent developments in optical storage—especially in the area of holographic memory—are attempting to increase the recording density by exploiting the fact that the recording device is a surface . Under this new model, the recorded data is regarded as two–dimensional (2-D), as opposed to the track-oriented 1-D recording paradigm. The new approach, however, introduces new types of error patterns and con- straints—those now become 2-D rather than 1-D (see [2], [4], [12], [13], [22], [23]). Manuscript received September 5, 1999; revised August 24, 2000. This work was supported in part by the United States–Israel Binational Science Foundation (BSF), Jerusalem, Israel, under Grants 95-00522 and 98-00199, the National Science Foundation (NSF) under Grant NCR-9612802, and by the Center for Magnetic Recording Research at the University of California, San Diego. The material in this correspondence was presented in part at the SPIE International Symposium on Optical Science, Engineering and Instrumentation, Denver, CO, July 2000 and at the IEEE International Symposium on Information Theory, Sorrento, Italy, June 2000. R. M. Roth is with the Computer Science Department, Technion–Israel Insti- tute of Technology, Haifa 32000, Israel (e-mail: ronny@cs.technion.ac.il). P. H. Siegel is with the Department of Electrical and Computer Engineering, 0407, University of California, San Diego, La Jolla, CA 92093-0407 USA (e-mail: psiegel@ucsd.edu). J. K. Wolf is with the Center for Magnetic Recording Research, Uni- versity of California, San Diego, La Jolla, CA 92093-0401 USA (e-mail: jwolf@ucsd.edu). Communicated by E. Soljanin, Associate Editor for Coding Techniques. Publisher Item Identifier S 0018-9448(01)01331-1. Fig. 1. Parallelogram The treatment of 2-D constraints seems to be much more difficult than the 1-D case. This is, in part, due to the fact that in the general constrained setting, there are problems that are easy to solve in the 1-D case, yet they become undecidable when we shift to two dimensions [3], [24]. One important example of a 2-D constraint is the 2-D extension of the (1 -RLL constraint. This constraint, which is also referred to as the hard-square model, has been treated in quite a few papers in the past several years; see, for example, [6], [10], [11], [20], [29]. This constraint will also be the focus of this work. We define next the hard- square model, borrowing terms from [5]. Let be a finite subset of the integer plane and let be a finite set, referred to as an alphabet .A configuration is a mapping The value of at location i;j will be denoted by ;j We say that a -configuration satisfies the hard-square model if = and for every two distinct locations i;j ;j j 1= i;j =0 or ;j =0) Equivalently, if we write down the values of the -configuration in the integer plane, then the ’s are isolated both horizontally and vertically (either by ’s or by unassigned locations). The set of all -configura- tions that satisfy the hard-square model will be denoted by The subsets considered in this work will be either rectangles m;n i;j :0 i j (1) or parallelograms m;n i;j :0 i j (2) (see Fig. 1). We will be mainly concentrating on the hard-square model, as the known literature, as well as the results obtained herein, are elab- orate enough already for this special case. The capacity , or the topological entropy , of the hard-square model is given by )= lim m;n !1 log jS m;n mn = lim m;n !1 log jS ( m;n mn The limits indeed exist and are equal [5], [16]. The value of is known to be approximately 5878911162 ; see [6], [10], [11], [29]. 0018–9448/01$10.00 © 2001 IEEE

Page 2

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1167 Much less is known about efficient (i.e., polynomial-time, or low- complexity) high-rate coding schemes for this constraint. In [26], the idea of 2-D bit-stuffing was introduced, resulting in a variable-to-fixed encoder whose expected rate was bounded from below in [26] by ap- proximately 5515 . Note that in a variable-to-fixed scheme, the set of preimages, denoted , consists of binary words that are not neces- sarily of the same length; still, every sufficiently long binary uncon- strained word has exactly one element in as a prefix (namely, the set is prefix-free and complete). For the purpose of computing the rate, we define a probability measure on , where a preimage of length has probability . Indeed, by the properties of it follows that =1 (the Kraft equality). The expected rate of such a coding scheme is given by m;n mn A very simple coding scheme into ( m;n at a fixed rate 1: 2 is implied by [14, Lemma 1(e)]: entries i; j m;n such that is even are filled with the input bit stream, while the remaining entries are set to zero. We do not know of any other published efficient fixed-rate encoders at (significantly) higher rates for the 2-D (1 -RLL con- straint. The main goal of this work is designing efficient coding schemes for mapping, in a one-to-one manner, unconstrained binary words into ele- ments of m;n or ( m;n . Based on the idea of 2-D bit-stuffing introduced in [26], we present in Section III a variable-to-fixed encoder into ( m;n . Our coding scheme attains a rate which is approxi- mately 587277 , namely, only 0.1% below the value of Our variable-to-fixed rate encoder effectively realizes a certain prob- ability measure m;n on ( m;n . This measure is defined in Section II and its properties are proved in Section IV. In particular, we show that the marginal probability induced by m;n at every given row—and, re- spectively, at every given diagonal—of a random m;n -configuration in ( m;n is a first-order Markov process. With a slight compromise on the coding rate, we can also obtain an efficient fixed-rate encoder into m;n . Such an encoder is pre- sented in Section V with a rate that approaches 581074 for large values of or ; this rate is within 1.2% of the value of II. P ROBABILITY EASURE ON ARALLELOGRAMS Let m;n be the parallelogram defined by (2) and shown in Fig. 1. Row in m;n consists of all locations i; j such that j Diagonal consists of all locations i; d such that i .Row will be denoted by (h) and will be referred to as the horizontal boundary of m;n . Similarly, Diagonal , denoted (d) , will be referred to as the diagonal boundary of m;n . Those boundaries are depicted as thick lines in Fig. 1. The set m;n ( (h) (d) (i.e., the parallelogram excluding its boundaries) will be denoted by m;n A random m;n -configuration taking values from ( m;n (ac- cording to some probability measure) will be denoted by , and its value at location i; j will be denoted by i;j Let m;n be a probability measure defined on ( m;n ; that is, m;n )= for every 2S ( m;n . The (measure- theoretic) entropy of m;n is defined by m;n )= mn 2S ( m;n )log m;n The value m;n is the largest possible expected rate of any encoder that maps, in a one-to-one manner, a set of input binary words into ( m;n , with a probability measure defined on that induces the measure m;n on ( m;n . This clearly implies the inequality m;n log jS ( m;n mn (3) Now, suppose that m;n m;n =1 is a (2-D) sequence of prob- ability measures, where each individual measure m;n is defined on ( m;n . For a m;n -configuration 2S ( m;n , let (h) be the set of all +1 ;n -configurations in ( +1 ;n obtained from by appending an +1) st row. Similarly, let (d) be the set of all m;n +1 -configurations in ( m;n +1 obtained from by appending an +1) st diagonal. We say that the sequence m;n m;n is nested if for every m; n and 2S ( m;n m;n )= +1 ;n )= m;n +1 In other words, for every and , the measure m;n is the marginal distribution on ( m;n which is induced by the measure ;n ( ;n [0 1] . The nesting property allows us to regard as a measure which is an infinite extension of the individual measures m;n . The entropy of is defined by ) = lim m;n !1 m;n (by subadditivity the limit exists), and from (3) we have [5]. An (infinite extension) measure for which )= is called a maxentropic measure . Such a measure indeed ex- ists [5]. Our coding scheme effectively defines nested measures m;n ( m;n [0 1] for every m; n As we show, the sequence m;n m;n satisfies ) = lim m;n !1 m;n 587277 Since the limit is very close to the known bounds on , we can say that is “almost maxentropic.” The expected rate of our coding scheme approaches, through the values of m;n , the value For every 2S ( m;n , the value m;n )= takes the following form: m;n )= (h) ;x ;x ;n (d) ;x ... ;x 1) =1 +1 i;j i;j ;x ;j ;x ;j +1 (4) The components (h) , and (d) define the measure on location (0 0) and on the horizontal and diagonal boundaries, respectively, and will be specified in more detail below. The function [0 1] is defined through two parameters, [0 1) and (0 1] as follows: (0 u; y ; v )= if =0 otherwise (5)

Page 3

1168 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Fig. 2. Location of the arguments of the function u;y;v and (1 u; y ; v )= 1 (0 u; y ; v . The distribution on ;j can be described verbally as follows. As dictated by the hard-square model, the value of i; j is forced to be unless i; j ;j =0 . When the latter condition is met, then i; j will be a Bernoulli random bit whose distribution depends on the value of ;j +1 ; if that value is , then i; j takes the value with probability ; otherwise, i; j takes the value with probability . Fig. 2 shows the values that determine the distribution of i; j ; the location i; j is marked by a box. The measures on the boundaries, defined by (h) and (d) , are set so that the nonboundary values have a stationary distribution in the sense stated in Proposition 2.1 below. Specifically, (h) will take the form of a first-order Markov process (h) ;w ... ;w )= =1 (h) (6) where (h) [0 1] is given by (h) (0 )= ; if =0 otherwise (7) for some [0 1] , and (h) (1 )=1 (h) (0 The values of g! [0 1] will be set to the stationary prob- abilities of the first-order Markov process (h) as follows: (0) = and (1) = 1 , where (0) = (8) As for the diagonal boundary, (d) will be a first-order Markov process of the form (d) ;w ... ;w )= =1 (d) (9) where (d) [0 1] is given by (d) (0 )= (10) with q and q (11) and (d) (1 )=1 (d) (0 (since , the denominators in (11) are guaranteed to be positive). The values in (11) are consistent with the stationary distribution along the horizontal boundary: as we show in Section IV-A, (11) implies that i; =0 and, furthermore, i; =0 ;j ;j =0 for all i and j The nesting property of the measures m; n is easily verified. Next, we state other properties of those measures that will be proved in Sec- tion IV. Hereafter, the notation ( m; n will mean that the random m; n -configuration is taken from the sample space ( m; n according to the distribution m; n We say that row in ( m; n forms a first-order Markov process identical to the horizontal boundary if for every j< and every nonempty word of length i; j =0 i; j i; j i; j ; if =0 if =1 provided that the event we condition on has positive probability. The main result in Section IV-A is the following. Proposition 2.1: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Then the entries in each row in ( m; n form a first-order Markov process identical to the horizontal boundary if and only if ;q )= +4 (1 2(1 (12) As we show in Section IV-B, there exists a counterpart of Proposition 2.1 also for the diagonals in ( m; n Remark: The definition of m; n through a “local” conditional mea- sure j u; y ; v on as given by (5) somewhat resembles the Pickard random fields defined in [21], except that columns therein are replaced here by diagonals. Note, however, that Pickard fields assume that the measure is invariant under all the symmetries of the square, whereas we require less: the distribution along rows may (and will) differ from the distribution along diagonals. Thus, the result in Propo- sition 2.1 is different than the first-order Markov property of Pickard fields. We now turn to the measure-theoretic entropy of m; n . Define the real-valued function :[0 1] [0 1] by )= log (1 ) log (1 where (0) = (1) = 0 We show in Section IV-A the following lower and upper bounds on m; n Proposition 2.2: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)––(11) and (12). Then m; n 1)( 1) mn )+(1 )) mn By Proposition 2.2 we have )= lim m; n !1 m; n )= )+ (1 ))

Page 4

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1169 which, with (8) and (11), yields )= ;q )) = )+ (1 )) (2 )( q where ;q is given by (12). The numerator and denominator can be made linear in by observing that (12) implies that =(1 (1 . This yields ;q )) = (1 )+( (1 (2 +3 )+ (1 + (13) To obtain the largest rate, we maximize ;q )) with respect to and . The maximum is attained for 671833 and 566932 , and the maximum value is ;q )) 587277 Our analysis depends strongly on the particular structure of the mea- sure m;n —in particular, on conditioning the probability of the event i;j =0 only on the values of the three entries ;j ;j +1 and i;j , as shown in Fig. 2. We note that such conditioning is causal in that we can select—according to the measure m;n —an el- ement of ( m;n by determining the values of its entries consecu- tively row-by-row or diagonal-by-diagonal: in such a process, the dis- tribution of values of the next entry to be determined is well-defined as this distribution depends on values that have already been set. Such a feature enables using the measure m;n for encoding, as we show in Section III. Clearly, we may maintain causality and still approach capacity by conditioning the value of i;j on more entries in the “past.” However, it appears that the analysis thus becomes much more complex. For ex- ample, when i;j is conditioned also on ;j +2 , we no longer have even a second-order Markov process along rows. III. V ARIABLE -R ATE NCODING CHEME We describe how the estimate on m;n , given in Proposition 2.2, can be approached by a variable-to-fixed rate coding scheme. The objective is to realize the probability measure m;n on ( m;n in the output of the encoder. The encoder consists of the following components. 1) A distribution transformer that maps, in a one-to-one manner, sequences of fair coin flips (i.e., independent Bernoulli random bits, each equaling with probability ), into sequences of independent Bernoulli random bits such that each bit equals with probability . There are known methods [8, Sec. 5.12] to implement variable-to-fixed rate transformers such that, for as the code length goes to infinity, the following holds: a) the expected rate (i.e., expected number of input bits per each output bit) of is at least )(1 b) all the words of the original Bernoulli source, except for a fraction whose probability is less than , are generated by with probability that differs from the original prob- ability by a factor within . Namely, the typical words of the original source are generated by with virtually the same probability. 2) A distribution transformer , similar to , except that the output is with probability . The rate of can get arbitrarily close to 3) Probabilistic boundary generator , to be explained below. 4) Constrained coder , to be explained below. Fig. 3. Encoding of by The raw input bits are fed into the transformers and , each input bit entering exactly one of the transformers. The coder then queries the outputs of and throughout the encoding process. The order of queries determines which transformer is fed by any given input bit. The encoding procedure starts by generating the entry at the origin, the entries ;j j , along the horizontal boundary, and the entries i; i , along the diagonal boundary. Those entries are generated by probabilistically, using (internal) sources of Bernoulli random trials (i.e., internal coin flips), with probabilities of success , and , as given by (12), (8), and (11). Note that these coin flips can be driven by external sources (as is done in and ), thus, contributing to the rate; however, since the boundaries occupy only bits out of the mn bits of m;n , such a rate contribution becomes marginal when and are large. The main coding task is performed by , which is fed by the outputs of and . At each encoding step, generates a value i;j in a new location i; j in m;n , as described in Fig. 3. The value of i;j depends on the values ;j ;j +1 , and i;j (which are assumed to have already been generated), and also on at most one output bit of or . To this end, there are two natural orders in which the values i;j can be computed: they can be generated row-by-row, or diagonal-by-diagonal. As we show in Corollary 4.3 in Section IV-A, when m;n satisfies (4)–(11) and (12) we have i;j ;j +1 =0 for all i; j m;n (h) . Hence, the expected number of locations i; j m;n for which i;j ;j +1 =0 ,is =( 1)( 1) : This is the expected number of times that or are queried by The expected number of times that (respectively, ) is queried is N (respectively, (1 ). Therefore, the expected rate of the overall coding scheme is m;n )+ (1 )) (1 m;n mn 1)( 1) mn )+ (1 ))(1 m;n 1)( 1) mn (1 m;n where lim m;n !1 m;n =0 . Namely, for 2f , we bound from below the rates of by )(1 m;n ; the factor m;n also in- corporates the ratio between the probability with which a typical word is generated by , compared to the probability with which such a word is generated by an ideal Bernoulli source (defined by ). Simulations suggest that the rate m;n is attained regardless of the boundary values set by ; yet we have not proved this. On the other hand, there is clearly a fixed assignment for the boundaries that yields expected rate at least m;n . If we knew such an assignment, we could hard-wire it into the decoder, in which case it would be sufficient to

Page 5

1170 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 transmit only the 1)( 1) nonboundary values of , making redundant. Decoding is carried out as follows. Bits are read from the received m;n -configuration in the order they were generated by the encoder, disregarding each that immediately follows a horizontally or ver- tically. The remaining bits are then divided into two bit streams ac- cording to the transformer that generated each individual bit. The bit streams are then fed into the decoders (i.e., inverse mappings) of the respective transformers. Our coding scheme can be simplified by combining the distribu- tion transformers and , in which case our encoder becomes the bit-stuffing encoder in [26] (except that the analysis here takes into ac- count that stuffed ’s overlap, thereby improving on the lower bound of [26] on the expected rate). In such a case, we maximize ;q )) in (13) under the restriction . The maximum is attained for 644400 , and the maximum value is ;q )) 583056 , which is within 1% of the capacity . We mention that this latter rate can be attained also by tuning the parameters in the method presented recently and independently in [15]. IV. F IRST -O RDER ARKOV ROPERTIES OF A. Horizontal First-Order Markov Process In this section, we provide proofs for Propositions 2.1 and 2.2. We start by verifying that the value of in (8) is the stationary prob- ability of the first-order Markov process along the horizontal and diag- onal boundaries. It is easy to see that (h) (0 0) (h) (0 1)(1 )= +1 thus implying that is indeed the stationary probability of ;j =0 along the horizontal boundary. Similarly, by the choice of and in (11) we also have (d) (0 0) (d) (0 1)(1 )= (1 making also the stationary probability of i; =0 along the diagonal boundary. Denote by the set of all binary words of length , and by the set of all nonempty binary words of length For i; j m;n and +1 , denote by i;j the event i;j )= i;j i;j i;j i;j +1 Also, for i; j m;n rm and 2f , define the event i;j )= i;j \f ;j +1 (see Fig. 4). We also define the vectors i;j )= fA (0) i;j fA (1) i;j Note that by the way we set the diagonal boundary we have i; (0)= i; =0 ;X =0 i; =0 ;X =1 (1 q (1 (14) Fig. 4. Event and i; (1)= i; =1 ;X =0 i; =1 ;X =1 (1 (1 )(1 q (1 (15) Given ;q ; [0 1] , we define the following matrices: q (1 (1 )0 (1 )(1 )0 and =0 . The notations and will stand for the identity matrix and the row vector (11) , respectively. Recall that we say that row in ( m; n forms a first- order Markov process identical to the horizontal boundary if for every j and every word i; j =0 i; j i; j i; j ; if =0 if =1 (16) provided that the event we condition on has positive probability. The following lemma is easily verified. Lemma 4.1: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Suppose that for some in the range i ,row in ( m; n forms a first-order Markov process identical to the horizontal boundary. Then i; j bc )= b; c i; j for all j and Proof of Proposition 2.1: We start with the “if” part and prove by induction on that row forms a first-order Markov process identical to the horizontal boundary. We do this by showing that (16) holds for every j and every word of length exactly (which clearly implies that it holds for all shorter words). First note that the sample space ( m; n of forces (16) to hold whenever the first bit in is =1 We now consider words with =0 . Our induction proof for row assumes that row forms a first-order Markov process identical to the horizontal boundary. Clearly, this trivially holds for the induction

Page 6

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1171 base =1 . Write =0 where 2f , in which case (16) becomes fA i; j (00 fA i; j (0 This, in turn, is equivalent to i; j (00 )= i; j (0 (17) By Lemma 4.1 and the induction hypothesis we have i; j (00 )= i; j (0 Hence, (17) can be rewritten as I i; j (0 )=0 It follows that in order to show (16), it suffices to prove that for j and 2f , the vector i; j (0 is either the zero vector or a (right) eigenvector of associated with the eigenvalue Now, it is easy to see that (12) implies that is a nonnegative eigen- value of ; indeed, is the nonnegative root of the quadratic equa- tion (1 )+ q =0 (18) obtained from the equality det( I )=det q (1 =0 We now distinguish between two cases for the value of the word Hereafter stands for the all-zero word of length Case 1: . By (14) it follows that i; (0) is an eigen- vector of associated with the eigenvalue . Hence, by Lemma 4.1 and the induction hypothesis, we have i; j )= i; (0)= i; (0) j i: Namely, i; j is also an eigenvector of associated with the eigenvalue Case 2: . Write .If s then starts with a , or else we are in the trivial case in which the event we are conditioning on in (16) has zero probability (i.e., i; j (1 is zero). By Lemma 4.1 and the induction hypothesis, we obtain i; j (1 )= i; j (0) (1 (1 )(1 (1 (19) for some real , where (0) is the first coordinate of i; j and where the last equality in (19) follows from (18). In fact, (19) also applies to , in which case is the empty word and i; j (1 )= i; (1) , which, in turn, is given by (15). Combining (19) with Lemma 4.1 we obtain i; j (01 )= i; j (1 (1 (1 That is, i; j (01 , if nonzero, is an eigenvector of associated with the eigenvalue . Now, by Lemma 4.1 we have i; j (0 )= i; j (01 )= i; j (01 We thus conclude that i; j (0 , if nonzero, is an eigenvector of associated with the eigenvalue for every j and 2f . This establishes the induction step. We now turn to the “only if” part. We show that the equality =1 =0 =1 (20) implies that satisfies (18). Indeed, =01 =001 =0001 =0101 ((1 +(1 )(1 )) Combining this with (20), we obtain (1 )= ((1 +(1 )(1 )) which, with (8) and (11), yields (18). Corollary 4.2: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11) and (12). Then, for every i; j m; n and , the vector i; j (0 ,if nonzero, is an eigenvector of associated with the eigenvalue Proof: By (16) we have i; j (00 )= i; j (0 for every i; j m; n and ; so, by Lemma 4.1 I i; j (0 )=0 (21) Now, the matrix I is singular since is an eigenvalue of On the other hand, implies that =1 ; so, the vector 1( I is nonzero. Hence, that vector spans the rows of I , thus implying by (21) that i; j (0 is an eigenvector of associated with the eigenvalue Corollary 4.3: For m; n [0 1) , and (0 1] let m; n be a measure defined on ( m; n by (4)–(11) and (12). If ( m; n then i; j ;j +1 =0 for every i; j m; n (h) Proof: Recall that for a m; n -configuration 2S ( m; n we denote by (d) the set of all m; n +1 -configurations in ( m; n +1 obtained from by appending another diagonal. By the nesting property of m; n we have m; n )= m; n +1 for every 2S ( m; n . Hence, it suffices to show that when ( m; n +1 , then i; j ;j +1 =0 for every i; j m; n (h) , or, equivalently, for every i; j +1) m; n +1

Page 7

1172 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Fig. 5. Event By Proposition 2.1 and Corollary 4.2, when applied to ( m; n +1 , it follows that for every i; j +1) m; n +1 , the vec- tors i; j (0) are eigenvectors of associated with the eigenvalue ; namely, i; j (0)= i; j =0 ;Z ;j +1 =0 i; j =0 ;Z ;j +1 =1 i; j (1 (22) for some constants i; j . On the other hand, we also have i; j (0)= i; j =0 : Combining the latter equation with (22) we thus obtain i; j (0)= +(1 (1 (1 for every i; j +1) m; n +1 (compare with (14)). Proof of Proposition 2.2: By (4) and Corollary 4.3 we have mn H m; n )+( 1) +( 1)( )+(1 )) =1 +1 i; j ;j =0 ;j +1 =0 i; j ;j =0 g ;j +1 =1 i; j ;j =0 g )) )+( 1) +( 1)( )+(1 )) +( 1)( 1) )+(1 )) Therefore, m; n 1)( 1) mn )+(1 )) mn as claimed. B. Diagonal First-Order Markov Process In this section, we present a counterpart of Proposition 2.1 for the diagonals of ( m; n . We state the respective claims and point out the difference in proofs compared to those in Section IV-A. For i; d m; n and , denote by d; i the event d; i )= i; d ;d 1) ;d 2) +1 ;d +1) Also, for i; d m; n (d) and 2f , define the event d; i )= d; i \f i; d (see Fig. 5). We also define the vectors d; i )= fB (0) d; i fB (1) d; i The counterparts of (14) and (15) take the form d; (0)= ;d =0 ;X ;d =0 ;d =0 ;X ;d =1 and d; (1)= ;d =1 ;X ;d =0 ;d =1 ;X ;d =1 (1 and the counterparts of the matrices b; c are (1 )0 00 (1 )0 00 where and are given by (11). We say that diagonal in ( m; n forms a first-order Markov process identical to the diagonal boundary if for every i< and every word i; d =0 ;d 1) ;d 2) `; d (23) provided that the event we condition on has positive probability. Lemma 4.4: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Suppose that for some in the range d< n , diagonal in ( m; n forms a first-order Markov process identical to the diagonal boundary. Then d; i bc )= b; c d; i for all i and Proposition 4.5: For m; n [0 1) , and (0 1] , let m; n be a measure defined on ( m; n by (4)–(11). Then the en- tries in each diagonal ( m; n form a first-order Markov process identical to the diagonal boundary if and only (12) holds. Proof: The proof follows along the lines of the proof of Proposi- tion 2.1, with a notable difference in the treatment of the case =1 in (23). Specifically, in the “if” part, we also need to show that fB d; i (01 fB d; i (1 for all i; d m; n and (24)

Page 8

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1173 The proof is carried out by induction on , and the induction step assumes that (23) holds for diagonal . Defining fB (0) d;i (1 we have d;i (1 )= By the induction hypothesis and Lemma 4.4 we obtain d;i (01 )= d;i (1 )= Now, by (11), we have +1 Hence, d;i (01 )= d;i (1 )=( +1 d;i (1 thus implying (24). The case =0 in (23) is treated by showing through the induction on that for every i; d m;n and 2f , the vector d;i (0 , if nonzero, is an eigenvector of associated with the eigenvalue given in (11). V. F IXED -R ATE NCODING CHEME Let m;n be the rectangle defined by (1). A m;n -configuration =( i;j 2S m;n is called circular if for every i , the entries i; and i;n are not both . The set of all m;n -configura- tions in m;n that are circular will be denoted by m;n . Note also that if 2S m;n and if we define the m;n -configuration =( i;j by i;j i;j if i;n otherwise i; j m;n then 2S ( m;n In this section, we present a fixed-rate coding scheme into m;n with a rate that approaches 581074 for large values of (or ). Our scheme borrows ideas from permutation codes [27], [30] combined with enumerative coding [9]. Even though the circular property is not necessary for the coding, it will make the analysis simpler. The m;n -configurations generated by the encoder will have the additional property that all rows in them have the same (Hamming) weight n for a value [0 1] that will be determined in the sequel. Let be in m;n and assume that for some in the range i ,row in has weight . Let ;j ;j be the indexes for which ;j =1 . Clearly, we must have i;j =0 for every . Define the words i;j +1 i;j +2 i;j k and i;j +1 i;n i; i;j The word is called the th phrase of row in . Note that row in is obtained by shifting the word (1) (2) cyclically entries to the right. The length of is called the th phrase length in row of . Denoting that length by , the list ;` ;` is called the phrase profile of row in . Clearly, +1 , where +1 is defined to be . Hence, the phrase profile of row is completely determined by row . Note also that =1 . We mention that a (somewhat different) definition of phrases is used also in 1-D permutation codes [27], [30]. For a positive integer , let ;` denote the set of all words of length that satisfy the 1-D (1 -RLL constraint. Similarly, we define ;` . Also, denote by `;r (respectively, `;r ) the set of words in (respectively, `;r ) of weight . It is easy to see that m;n -configuration 2S m;n is in m;n if and only if every row in is in Lemma 5.1: For every two positive integers and such that , there is a mapping a;b S !S S that is one-to-one and weight-preserving. Proof: Since and and play symmetrical roles, we can assume that .For 2S and , we define a;b ;y )=( ;w where and are determined as follows: if =0 or =0 , then we set and (25) Otherwise (if =01 and =1 +1 +2 and (26) It is easy to see that the mapping a;b is into S and is weight- preserving. To show that it is one-to-one, we need to verify that we can distinguish pairs ;w generated by (25) from those that are generated by (26). Indeed, only in the latter, the last entry of and the first entry of are both equal to Let and be two words in . We say that is consistent with if and form the rows of an array in ;n . In other words, and do not have ’s in the same position. Define n; t )= =0 jS +2 ;t (27) Lemma 5.2: For every word 2S n;t there are at least n; t words 2S n;t that are consistent with Proof: The proof is based on the observation that the number of possible assignments for depends only on the phrase profile of and only through the multiplicity (but not the order) with which each phrase length appears in that phrase profile. This phrase profile, in turn, is completely determined by Assume first that induces on the phrase profile ;` ;` where =2 (and, so, +2 ). We refer to this profile as the worst profile for length and weight . Each of the phrases of length in can take a value from 00 01 10 .It follows that for s , there are ways to assign values to the phrases of length in so that their overall weight is . If the overall weight of is , then the remaining phrase, of length +2 in must have weight . This proves the lemma assuming that induces the worst profile on It remains to establish that the worst profile is indeed the worst, in the sense that it leads to the smallest possible number of assignments for . We show this by descending induction on the number of phrases in whose lengths equal . Clearly, if at least phrase lengths equal , then, up to permutation of phrase lengths, the phrase profile is a worst profile and the claim immediately follows. Turning to the induction step, suppose that induces on a phrase profile =( ;` ;` in which =2 and =2 . We can further assume that ; indeed, if all phrase lengths in were less than , then we would have , in which case n; t )=0 We denote by the set of all words in n;t that are consistent with Let be a word in n;t that induces the phrase profile =(2 ;` ;` ;` ;` on every word that is consistent with . The set of all such words in n;t will be denoted by . Observe that has more phrase lengths

Page 9

1174 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 Fig. 6. Enumerative coding into a row of an array in equaling than does. So, by the induction hypothesis on we have j n; t We next define a mapping as follows. Given the first two phrases of are obtained by applying the mapping ;` of Lemma 5.1 to the first two phrases of . The remaining phrases in are identical to their counterparts in . By Lemma 5.1, the mapping is one-to-one and weight-preserving and, so, j j n; t Let max max be the value of a nonnegative integer for which n; t is maximized. Given , and (e.g., max ), Lemma 5.2 suggests a coding scheme at a fixed rate log n; t mn into the set m;n as follows. For =0 ;m we select row from n;t so that it is consistent with row (for the case =0 , we can assume a particular word from n;t to serve as a “phantom” row ). Lemma 5.2 guarantees that we have at least n; t words in n;t that can be selected for row . This, in turn, implies the following result. Proposition 5.3: log jS m;n mn log n; t max The effective computation of row in the suggested coding scheme can be done by enumerative coding, as we describe next [7], [25, Ch. 6], [28]. Let ;` ... ;` be the phrase profile of row as induced by row . For this particular phrase profile, denote by k;s the number of possible assignments for the first phrases of row so that their overall weight is .Wehave k;s =0 jS ;r j ;s (28) where =1 and ;s =0 for s> . The values jS `;r , in turn, can be computed by the recurrence jS `;r jS ;r jS ;r ;` (29) where jS =1 jS ;r =0 for =0 jS jS =1 , and jS ;r =0 for r= 2f We can rewrite (28) and (29) in polynomial notation as follows. Let be an indeterminate, and define the polynomials )= =0 `;r =0 jS `;r j ;` and )= =0 k;s t: Then (29) becomes )= )+ ;` (30) where )=1 and )=1+ . The recurrence (28), in turn, can be written as )(mod +1 where )=1 .So =1 )(mod +1 t: (31) The latter formula can be used to accelerate the computation of the values k;s through fast techniques of polynomial multiplication [1]. Note also that the polynomials need to be computed for +1 only once for the whole array. The enumerative coding algorithm of row is presented in Fig. 6. The unconstrained input stream to be coded into row is regarded as an integer in the range p n; t , and the phrase profile of row is also assumed to be available. The main loop of the algorithm computes the phrases of row , in reverse order, starting with the th phrase. In each iteration of the main loop, the variable determines the weight of the th phrase, and equals the overall weight of the first phrases. It can be easily verified by descending induction on that each loop iteration starts with a value of in the range p k;s the induction base following from p n; t t;t . Similarly, the value of at the end of each loop iteration lies in the range < ; jS ; . The mapping from into a word in ; assumes an ordering on the elements of each set `;r . If the standard lexicographic ordering is used, then such a mapping can be efficiently implemented by (a second level of) enumerative coding, using the recurrence (29) (or (30)). We next obtain an asymptotic estimate for n; t which will enable us to compute an asymptotic lower bound on (log n; t max )) =n The following lemma is a well-known asymptotic estimate for the bi- nomial coefficients (see [17, p. 309]). Lemma 5.4: For and `> log r=` `; r ))

Page 10

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1175 where lim !1 max `;r =0 Lemma 5.5: For and `> log jS `;r =log =( r= )) r;r )) where lim !1 max `;r =0 Proof: A word is in `;r if and only if it can be written as a sequence of nonoverlapping blocks, of which equaling 10 and the remaining equaling (if the last entry in is , then the last block will also include the first entry in ). Hence, jS `;r equals the number of combinations of elements (being the indexes of the blocks 10 within ) out of It follows from Lemma 5.5 and the continuity of the function 7! that for every real [0 2] we have lim !1 (1 =` log jS `; ` =(1 = (1 )) where stands for the smallest integer not greater than . (Indeed, (1 = (1 )) is the entropy of a first-order Markov process de- fined on the (1 -RLL constraint, in which the probability of having following is = (1 ; the stationary probability of is then .) Observing that jS +3 ;r jjS +2 ;r jjS t;r jjS t;r and that we get from (27) the lower bound log n;t max s +log +log +log jS t;t (32) and the upper bound log n;t max s +log +log )+log jS +3 ;t (33) Write t=n s=n , and =( . Assuming that =(1 and , we can incorporate Lemmas 5.4 and 5.5 into the lower bound (32) to yield that, whenever !n is a nonnegative integer less than n log n;n !n +log n !n +log +log jS (1 n; (1 n =( != )+(1 (1 = (1 )) (1)) where (1) stands for an expression that goes to zero as goes to infinity. We now observe that (1 and that implies = (1 . Hence, for every fixed rational [0 3] and every such that n is an integer (1 =n log n;n sup ; (1) (34) where ; )= [1+ ((1 = 3) )]+(1 [(1 = (1 )) and the supremum in the right-hand side of (34) is taken over all rational in the range min = (1 . In fact, from the upper bound (33) it follows that the inequality in (34) can be replaced by an equality. Furthermore, since the function ; is continuous we have liminf !1 (1 =n log n; n )=max ; where the maximum is taken over all real [0 min = (1 We now maximize the expression ; over real values of [0 3] and [0 min = (1 . By taking partial deriva- tives of ; with respect to and , we get the equations (23 4)(29 4) (8357 8357 +3098 518 +38 1)=0 and (369 101 +4) 1469 682 +95 The maximum is attained for max ; max (0 216594 248986) in which case liminf !1 (1 =n log n;t max )) =liminf !1 max (1 =n log n; n sup liminf !1 (1 =n log n; n =max ; ; )= max ; max 581074 (in fact, one can easily show that in the third step—where we change the order between maximizing over and taking the limit over —the inequality can be replaced by an equality). CKNOWLEDGMENT The authors wish to thank S. Forchhammer, J. Justesen, E. Or- dentlich, A. Orlitsky, and K. Zeger for very helpful discussions and comments. EFERENCES [1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms . Reading, MA: Addison-Wesley, 1974. [2] J. J. Ashley, M. Blaum, and B. H. Marcus, “Report on coding techniques for holographic storage,” IBM, Res. Rep. RJ 10013, 1996. [3] R. Berger, “The undecidability of the domino problem, Mem. Amer. Math. Soc. , vol. 66, 1966. [4] D. Brady and D. Psaltis, “Control of volume holograms, J. Opt. Soc. Amer. A , vol. 9, pp. 1167–1182, 1992. [5] R. Burton and J. E. Steif, “Non-uniqueness of measures of maximal en- tropy for subshifts of finite type, Ergod. Theory Dynam. Syst. , vol. 14, pp. 213–235, 1994. [6] N. Calkin and H. S. Wilf, “The number of independent sets in a grid graph, SIAM J. Discr. Math. , vol. 11, pp. 54–60, 1997. [7] T. M. Cover, “Enumerative source encoding, IEEE Trans. Inform. Theory , vol. IT-19, pp. 73–77, Jan. 1973. [8] T. M. Cover and J. A. Thomas, Elements of Information Theory .New York: Wiley, 1991. [9] S. Datta and S. W. McLaughlin, “An enumerative method for run length- limited codes: Permutation codes, IEEE Trans. Inform. Theory , vol. 45, pp. 2199–2204, Sept. 1999. [10] K. Engel, “On the Fibonacci number of an lattice, Fibonacci Quart. , vol. 28, pp. 72–78, 1990. [11] S. Forchhammer and J. Justesen, “Entropy bounds for constrained 2-D random fields, IEEE Trans. Inform. Theory , vol. 45, pp. 118–127, Jan. 1999. [12] J. F. Heanue, M. C. Bashaw, and L. Hesselink, “Volume holographic storage and retrieval of digital data, Science , vol. 265, pp. 749–752, 1994. [13] J. F. Heanue, M. C. Bashaw, and L. Hesselink, “Channel codes for digital holographic data storage, J. Opt. Soc. Amer. A , vol. 12, 1995. [14] H. Ito, A. Kato, Z. Nagy, and K. Zeger, “Zero capacity region of multidi- mensional run length constraints, Electron. J. Combin. , vol. 6, p. R33, 1999.

Page 11

1176 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 [15] J. Justesen and Y. M. Shtarkov, “Simple models of two-dimensional in- formation sources and codes,” in Proc. 1998 IEEE Int. Symp. Informa- tion Theory (ISIT’98) , Cambridge, MA, 1998, p. 412. [16] A. Kato and K. Zeger, “On the capacity of two-dimensional run-length constrained channels, IEEE Trans. Inform. Theory , vol. 45, pp. 1527–1540, July 1999. [17] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes . Amsterdam, The Netherlands: North Holland, 1977. [18] B. H. Marcus, R. M. Roth, and P. H. Siegel, “Constrained systems and coding for recording channels,” in Handbook of Coding Theory ,V.S. Pless and W. C. Huffman, Eds. Amsterdam, The Netherlands: Elsevier, 1998, pp. 1635–1764. [19] B. H. Marcus, P. H. Siegel, and J. K. Wolf, “Finite-state modulation codes for data storage, IEEE J. Select. Areas Comm. , vol. 10, pp. 5–37, 1992. [20] Z. Nagy and K. Zeger, “Capacity bounds for the three-dimensional (0 1) run length limited channel, IEEE Trans. Inform. Theory , vol. 46, pp. 1030–1033, May 2000. [21] D. K. Pickard, “A curious binary lattice process, J. Appl. Probab. , vol. 14, pp. 717–731, 1977. [22] D. Psaltis and F. Mok, “Holographic memories, Scientific Amer. , vol. 273, no. 5, pp. 70–76, 1995. [23] D. Psaltis, M. A. Neifeld, A. Yamamura, and S. Kobayashi, “Optical memory disks in optical information processing, Appl. Opt. , vol. 29, pp. 2038–2057, 1990. [24] R. M. Robinson, “Undecidability and nonperiodicity for tilings of the plane, Inventiones Math. , vol. 12, pp. 177–209, 1971. [25] K. A. Schouhamer Immink, Codes for Mass Data Storage Sys- tems . The Netherlands: Shannon Foundation, 1999. [26] P. H. Siegel and J. K. Wolf, “Bit-stuffing bounds on the capacity of 2-di- mensional constrained arrays,” in Proc. 1998 IEEE Int. Symp. Informa- tion Theory (ISIT’98) , Cambridge, MA, 1998, p. 323. [27] D. Slepian, “Permutation codes, Proc. IEEE , vol. 53, pp. 228–236, 1965. [28] D. T. Tang and L. R. Bahl, “Block codes for a class of constrained noise- less channels, Inform. Contr. , vol. 17, pp. 436–461, 1970. [29] W. Weeks IV and R. E. Blahut, “The capacity and coding gain of certain checkerboard codes, IEEE Trans. Inform. Theory , vol. 44, pp. 1193–1203, May 1998. [30] J. K. Wolf, “Permutation codes, d;k codes and magnetic recording, in Proc. 1990 IEEE Colloq. South America, Argentina, Brazil, Chile, Uruguay , W. Tompkins, Ed., 1990. Quantum Codes from Cyclic Codes over GF Andrew Thangaraj , Student Member, IEEE, and Steven W. McLaughlin , Senior Member, IEEE Abstract We provide a construction for quantum codes (Hermi- tian-self-orthogonal codes over GF (4) ) starting from cyclic codes over GF (4 . We also provide examples of these codes some of which meet the known bounds for quantum codes. Index Terms -ary image, quantum Bose–Chaudhuri–Hocquenghem (BCH) code, quantum code. I. I NTRODUCTION In this correspondence, we use the ideas of Calderbank et al. [1] to construct a new class of quantum codes from cyclic codes over GF (4 . In particular, the following theorem from [1] can be used directly to obtain quantum codes from certain codes over GF (4) Theorem 1: Suppose is an n;k linear code over GF (4) self- orthogonal with respect to the Hermitian inner product. Suppose also that the minimum weight of is . Then, an [[ n;n k; d ]] quantum code can be obtained from The Hermitian inner product of u; v GF (4) is defined to be u:v where for GF (4) . From now on, orthogonality over GF (4) will be with respect to the Hermitian inner product defined above. In this correspondence, we consider self-orthogonal codes over GF (4) that are obtained as -ary images of -ary cyclic codes of length (4 1) . Binary images of self-orthogonal codes over GF (2 have been used to obtain quantum codes in [2]. Using Theorem 1 and working with -ary images usually results in smaller dimension quantum codes at the same efficiency and minimum distance. We begin with a discussion of -ary images of cyclic codes over GF . Then we present a method to obtain self-orthogonal codes over GF (4) as images of codes over GF (4 II. ARY MAGES A. Some Definitions and Notation The finite field GF can be considered to be a vector space of dimension over the field GF . Let ; ... ; be a basis for GF over GF . The map GF 7! GF mn is defined so that for ;u ;u g2 GF )= 11 ... ;u ;u 12 ... ;u ... ;u nm where =1 ij ;u ij GF n: Notice that is not the obvious expansion of in terms of but a permutation of the expansion. Manuscript received March 2, 2000; revised October 28, 2000. The authors are with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: andrew@ ee.gatech.edu; swm@ee.gatechedu). Communicated by P. W. Shor, Associate Editor for Quantum Information Theory. Publisher Item Identifier S 0018-9448(01)01526-7. 0018–9448/01$10.00 © 2001 IEEE

Today's Top Docs

Related Slides