Weighted Finite-State Transducer Algorithms: An Overview

Mehryar Mohri
AT&T Labs - Research, Shannon Laboratory
180 Park Avenue, Florham Park, NJ 07932, USA
mohri@research.att.com

May 14, 2004

Abstract

Weighted finite-state transducers are used in many applications such as text, speech and image processing. This chapter gives an overview of several recent weighted transducer algorithms, including composition of weighted transducers, determinization of weighted automata, a weight pushing algorithm, and minimization of weighted automata. It briefly describes these algorithms, discusses their running time complexity and conditions of application, and shows examples illustrating their application.

1 Introduction

Weighted transducers are used in many applications such as text, speech and image processing [9, 12, 5]. They are automata in which each transition, in addition to its usual input label, is augmented with an output label from a possibly new alphabet, and carries some weight element of a semiring. Transducers can be used to define a mapping between two different types of information sources, e.g., word and phoneme sequences. The weights are critically needed to model the uncertainty or the variability of such information sources. Weighted transducers can be used, for example, to assign different pronunciations to the same word but with different ranks or probabilities. Their weights are typically derived from large data sets using various sophisticated statistical learning techniques.

Much of the theory of weighted transducers and rational power series was developed more than two decades ago [15, 7, 4]. However, many essential weighted transducer algorithms, such as determinization and minimization of weighted transducers [9], are recent and raise new questions, both theoretical and algorithmic. This chapter gives an overview of several recent weighted transducer algorithms, including composition of weighted transducers, determinization of weighted automata, a weight pushing algorithm, and minimization of weighted automata.
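| Semiring    | Set                                   | $\oplus$        | $\otimes$ | $\overline{0}$ | $\overline{1}$ |
| Boolean     | $\{0, 1\}$                            | $\vee$          | $\wedge$  | $0$            | $1$            |
| Probability | $\mathbb{R}_+$                        | $+$             | $\times$  | $0$            | $1$            |
| Log         | $\mathbb{R} \cup \{-\infty, +\infty\}$ | $\oplus_{\log}$ | $+$       | $+\infty$      | $0$            |
| Tropical    | $\mathbb{R} \cup \{-\infty, +\infty\}$ | $\min$          | $+$       | $+\infty$      | $0$            |

Table 1: Semiring examples. $\oplus_{\log}$ is defined by: $x \oplus_{\log} y = -\log(e^{-x} + e^{-y})$.

2 Preliminaries

This section introduces the definitions and notation used in the following.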
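The semirings of Table 1 can be written down directly in code. The following is a minimal sketch, not part of the original chapter: the `Semiring` class, the `log_plus` helper and the `path_weight` function are hypothetical names chosen for illustration, and `log_plus` only handles finite values and $+\infty$.

```python
import math
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A semiring given by its two operations and their identity elements."""
    plus: Callable[[Any, Any], Any]   # the additive operation
    times: Callable[[Any, Any], Any]  # the multiplicative operation
    zero: Any                         # identity of plus, annihilator of times
    one: Any                          # identity of times


def log_plus(x, y):
    """x (+)_log y = -log(exp(-x) + exp(-y)), computed stably.

    Assumption of this sketch: arguments are finite or +inf.
    """
    if x == math.inf:
        return y
    if y == math.inf:
        return x
    return min(x, y) - math.log1p(math.exp(-abs(x - y)))


BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
PROBABILITY = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
LOG = Semiring(log_plus, lambda a, b: a + b, math.inf, 0.0)
TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0.0)


def path_weight(sr, weights):
    """Multiply the weights along a path, starting from the identity `one`."""
    return reduce(sr.times, weights, sr.one)


if __name__ == "__main__":
    print(path_weight(TROPICAL, [1, 3, 0]))   # weights add up along a path
    print(TROPICAL.plus(4, 7))                # alternatives keep the minimum
    print(LOG.plus(0.0, 0.0))                 # equals -log(2)
```

Keeping the operations as plain functions makes it easy to run the same algorithm over several semirings by swapping a single argument.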

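Definition 1 ([7]) A system $(\mathbb{K}, \oplus, \otimes, \overline{0}, \overline{1})$ is a semiring if: $(\mathbb{K}, \oplus, \overline{0})$ is a commutative monoid with identity element $\overline{0}$; $(\mathbb{K}, \otimes, \overline{1})$ is a monoid with identity element $\overline{1}$; $\otimes$ distributes over $\oplus$; and $\overline{0}$ is an annihilator for $\otimes$: for all $a \in \mathbb{K}$, $a \otimes \overline{0} = \overline{0} \otimes a = \overline{0}$.

Thus, a semiring is a ring that may lack negation. Table 1 lists some familiar semirings. A semiring is said to be commutative when the multiplicative operation $\otimes$ is commutative. It is said to be left divisible if for any $x \neq \overline{0}$, there exists $y \in \mathbb{K}$ such that $y \otimes x = \overline{1}$, that is, if all elements of $\mathbb{K}$ admit a left inverse. $(\mathbb{K}, \oplus, \otimes, \overline{0}, \overline{1})$ is said to be weakly left divisible if for any $x$ and $y$ in $\mathbb{K}$ such that $x \oplus y \neq \overline{0}$, there exists at least one $z$ such that $x = (x \oplus y) \otimes z$. When the $\otimes$ operation is cancellative, $z$ is unique and we can then write: $z = (x \oplus y)^{-1} x$. When $z$ is not unique, we can still assume that we have an algorithm to find one of the possible $z$ and call it $(x \oplus y)^{-1} x$. Furthermore, we will assume that $z$ can be found in a consistent way, that is: $((u \otimes x) \oplus (u \otimes y))^{-1} (u \otimes x) = (x \oplus y)^{-1} x$ for any $x, y, u \in \mathbb{K}$ such that $u \neq \overline{0}$. A semiring is zero-sum-free if for any $x$ and $y$ in $\mathbb{K}$, $x \oplus y = \overline{0}$ implies $x = y = \overline{0}$.

In the following definitions, $\mathbb{K}$ is assumed to be a left semiring or a semiring.

Definition 2 A weighted finite-state transducer $T$ over a semiring $\mathbb{K}$ is an 8-tuple $T = (\Sigma, \Delta, Q, I, F, E, \lambda, \rho)$ where: $\Sigma$ is the finite input alphabet of the transducer; $\Delta$ is the finite output alphabet; $Q$ is a finite set of states; $I \subseteq Q$ the set of initial states; $F \subseteq Q$ the set of final states; $E \subseteq Q \times (\Sigma \cup \{\epsilon\}) \times (\Delta \cup \{\epsilon\}) \times \mathbb{K} \times Q$ a finite set of transitions; $\lambda : I \to \mathbb{K}$ the initial weight function; and $\rho : F \to \mathbb{K}$ the final weight function mapping $F$ to $\mathbb{K}$.

We denote by $|T|$ the sum of the number of states and transitions of $T$. Weighted automata are defined in a similar way by simply omitting the input or output labels. Given a transition $e \in E$, we denote by $p[e]$ its origin or previous state, by $n[e]$ its destination state or next state, and by $w[e]$ its weight. A path $\pi = e_1 \cdots e_k$ is an element of $E^*$ with consecutive transitions: $n[e_{i-1}] = p[e_i]$, $i = 2, \ldots, k$. We extend $n$ and $p$ to paths by setting: $n[\pi] = n[e_k]$ and $p[\pi] = p[e_1]$. The weight function $w$ can also be extended to paths by defining the weight of a path as the $\otimes$-product of the weights of its constituent transitions: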
$$w[\pi] = w[e_1] \otimes \cdots \otimes w[e_k].$$

We denote by $P(q, q')$ the set of paths from $q$ to $q'$ and by $P(q, x, y, q')$ the set of paths from $q$ to $q'$ with input label $x \in \Sigma^*$ and output label $y \in \Delta^*$. These definitions can be extended to subsets $R, R' \subseteq Q$ by: $P(R, x, y, R') = \bigcup_{q \in R,\, q' \in R'} P(q, x, y, q')$. A transducer $T$ is regulated if the output weight associated by $T$ to any pair of input-output strings $(x, y)$ by:

$$[\![T]\!](x, y) = \bigoplus_{\pi \in P(I, x, y, F)} \lambda[p[\pi]] \otimes w[\pi] \otimes \rho[n[\pi]] \tag{1}$$

is well-defined and in $\mathbb{K}$. $[\![T]\!](x, y) = \overline{0}$ when $P(I, x, y, F) = \emptyset$. If for all $q \in Q$, $\bigoplus_{\pi \in P(q, \epsilon, \epsilon, q)} w[\pi] \in \mathbb{K}$, then $T$ is regulated. In particular, when $T$ does not have any $\epsilon$-cycle, it is regulated. We define the domain of $T$, $\mathrm{Dom}(T)$, as: $\mathrm{Dom}(T) = \{(x, y) : [\![T]\!](x, y) \neq \overline{0}\}$.

3 Composition of Weighted Transducers

Composition is a fundamental algorithm used to create complex weighted transducers from simpler ones. Let $\mathbb{K}$ be a commutative semiring and let $T_1$ and $T_2$ be two weighted transducers defined over $\mathbb{K}$ such that the input alphabet of $T_2$ coincides with the output alphabet of $T_1$. Assume that the infinite sum $\bigoplus_z T_1(x, z) \otimes T_2(z, y)$ is well-defined and in $\mathbb{K}$ for all $(x, y) \in \Sigma^* \times \Omega^*$. This condition holds for all transducers defined over a closed semiring [8, 11] such as the Boolean semiring and the tropical semiring, and for all acyclic transducers defined over an arbitrary semiring. Then, the result of the composition of $T_1$ and $T_2$ is a weighted transducer denoted by $T_1 \circ T_2$ and defined for all $x, y$ by [3, 6, 15, 7]:

$$[\![T_1 \circ T_2]\!](x, y) = \bigoplus_z T_1(x, z) \otimes T_2(z, y) \tag{2}$$

There exists a general and efficient composition algorithm for weighted transducers [13, 12]. States in the composition $T_1 \circ T_2$ of two weighted transducers $T_1$ and $T_2$ are identified with pairs of a state of $T_1$ and a state of $T_2$. Leaving aside transitions with $\epsilon$ inputs or outputs, the following rule specifies how to compute a transition of $T_1 \circ T_2$ from appropriate transitions of $T_1$ and $T_2$:

$$(q_1, a, b, w_1, q_2) \text{ and } (q_1', b, c, w_2, q_2') \implies ((q_1, q_1'), a, c, w_1 \otimes w_2, (q_2, q_2')) \tag{3}$$

See [13, 12] for a detailed presentation of the algorithm, including the use of a transducer filter for dealing with $\epsilon$-multiplicity in the case of non-idempotent semirings. In the worst case, all transitions of $T_1$ leaving a state $q_1$ match all those of $T_2$ leaving state $q_1'$, thus the space and time complexity of composition is quadratic: $O(|T_1||T_2|)$. However, an on-the-fly implementation of composition can be used to construct just the part of the composed transducer that is needed. Figures 1(a)-(c) illustrate the algorithm when applied to the transducers of Figures 1(a)-(b) defined over the tropical semiring.

Note that we use a matrix notation for the definition of composition as opposed to a functional notation. This is a deliberate choice motivated in many cases by improved readability.
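The transition-matching rule (3) can be sketched as a breadth-first construction over reachable state pairs. This is an illustration under stated assumptions rather than the algorithm of [13, 12]: it ignores $\epsilon$ labels (so no filter is needed), assumes a single initial state with initial weight $\overline{1}$, and uses a hypothetical dictionary layout (`start`, `finals`, `arcs`) and toy transducers that do not come from Figure 1.

```python
from collections import deque


def compose(t1, t2, times=lambda a, b: a + b):
    """Epsilon-free composition; `times` is the semiring product (tropical: +).

    Each transducer is a dict with keys "start", "finals" (state -> weight)
    and "arcs" (list of (src, input, output, weight, dst)).
    Only state pairs reachable from the start pair are built.
    """
    start = (t1["start"], t2["start"])
    arcs, finals = [], {}
    seen, queue = {start}, deque([start])
    while queue:
        q1, q2 = queue.popleft()
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals[(q1, q2)] = times(t1["finals"][q1], t2["finals"][q2])
        for p1, a, b, w1, n1 in t1["arcs"]:
            if p1 != q1:
                continue
            for p2, b2, c, w2, n2 in t2["arcs"]:
                # Rule (3): the output label of T1 must equal the input of T2.
                if p2 != q2 or b2 != b:
                    continue
                dst = (n1, n2)
                arcs.append(((q1, q2), a, c, times(w1, w2), dst))
                if dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}


# Hypothetical toy transducers over the tropical semiring.
T1 = {"start": 0, "finals": {1: 2}, "arcs": [(0, "a", "b", 1, 1)]}
T2 = {"start": 0, "finals": {1: 4},
      "arcs": [(0, "b", "c", 3, 1), (0, "a", "c", 7, 1)]}

if __name__ == "__main__":
    result = compose(T1, T2)
    print(result["arcs"])    # one arc: a:c with weight 1 + 3
    print(result["finals"])  # final pair (1, 1) with weight 2 + 4
```

Because pairs are only created when first reached, the sketch builds just the accessible part of the product, in the spirit of the on-the-fly construction mentioned above.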
[Figure 1 (diagrams not reproduced). Labels recovered from the figure, in extraction order: a:b/0.1, a:b/0.2, b:b/0.3, 3/0.7, b:b/0.4, a:b/0.5, a:a/0.6, b:b/0.1, b:a/0.2, a:b/0.3, 3/0.6, a:b/0.4, b:a/0.5; panel (c): (0, 0), (1, 1), a:b/0.2, (0, 1), a:a/0.4, (2, 1), b:a/0.5, (3, 1), b:a/0.6, a:a/0.3, a:a/0.7, (3, 2), a:b/0.9, (3, 3)/1.3, a:b/1.]

Figure 1: (a) Weighted transducer $T_1$ over the tropical semiring. (b) Weighted transducer $T_2$ over the tropical semiring. (c) Composition of $T_1$ and $T_2$.

Intersection (or Hadamard product) of weighted automata and composition of finite-state transducers are both special cases of composition of weighted transducers. Intersection corresponds to the case where input and output labels of transitions are identical, and composition of unweighted transducers is obtained simply by omitting the weights.

In general, the definition of composition cannot be extended to the case of non-commutative semirings because the composite transduction cannot always be represented by a weighted finite-state transducer. Consider, for example, the case of two transducers $T_1$ and $T_2$ with $\mathrm{Dom}(T_1) = \mathrm{Dom}(T_2) = (a, a)^*$, with $[\![T_1]\!](a, a) = x \in \mathbb{K}$ and $[\![T_2]\!](a, a) = y \in \mathbb{K}$, and let $\tau$ be the composite of the transductions corresponding to $T_1$ and $T_2$. Then, for any non-negative integer $n$, $\tau(a^n, a^n) = x^n \otimes y^n$, which in general is different from $(x \otimes y)^n$ if $x$ and $y$ do not commute. An argument similar to the classical pumping lemma can then be used to show that $\tau$ cannot be represented by a weighted finite-state transducer.

When $T_1$ and $T_2$ are acyclic, composition can be extended to the case of non-commutative semirings. The algorithm would then consist of matching paths of $T_1$ and $T_2$ directly rather than matching their constituent transitions. The termination of the algorithm is guaranteed by the fact that the number of paths of $T_1$ and $T_2$ is finite. However, the time and space complexity of the algorithm is then exponential.

The weights of matching transitions and paths are $\otimes$-multiplied in composition. One might wonder if another operation, $\boxtimes$, can be used instead of $\otimes$, in particular when $\mathbb{K}$ is not commutative. The following proposition shows that it cannot.

Proposition 1 Let $(\mathbb{K}, \boxtimes, e)$ be a monoid. Assume that $\boxtimes$ is used instead of $\otimes$ in composition. Then, $\boxtimes$ coincides with $\otimes$ and $(\mathbb{K}, \oplus, \otimes, \overline{0}, \overline{1})$ is a commutative semiring.
Proof. Consider two sets of consecutive transitions of two paths: $\pi_1 = (p_1, a, a, x, q_1)(q_1, b, b, y, r_1)$ and $\pi_2 = (p_2, a, a, u, q_2)(q_2, b, b, v, r_2)$. Matching these transitions using $\boxtimes$ results in the following:

$$((p_1, p_2), a, a, x \boxtimes u, (q_1, q_2)) \text{ and } ((q_1, q_2), b, b, y \boxtimes v, (r_1, r_2)) \tag{4}$$

Since the weight of the path obtained by matching $\pi_1$ and $\pi_2$ must also correspond to the $\boxtimes$-multiplication of the weight of $\pi_1$, $x \otimes y$, and the weight of $\pi_2$, $u \otimes v$, we have:

$$(x \boxtimes u) \otimes (y \boxtimes v) = (x \otimes y) \boxtimes (u \otimes v) \tag{5}$$

This identity must hold for all $x, y, u, v \in \mathbb{K}$. Setting $u = y = e$ and $v = \overline{1}$ leads to $x = x \otimes e$ and similarly $x = e \otimes x$ for all $x$. Since the identity element of $\otimes$ is unique, this proves that $e = \overline{1}$. With $u = y = \overline{1}$, identity (5) can be rewritten as: $x \otimes v = x \boxtimes v$ for all $x$ and $v$, which shows that $\boxtimes$ coincides with $\otimes$. Finally, setting $x = v = \overline{1}$ gives $u \otimes y = y \boxtimes u$ for all $y$ and $u$, which shows that $\otimes$ is commutative.

4 Determinization

A weighted automaton is said to be deterministic or subsequential [16] if it has a unique initial state and if no two transitions leaving any state share the same input label.

There exists a natural extension of the classical subset construction to the case of weighted automata over a weakly left divisible semiring, called determinization [9]. The algorithm is generic: it works with any weakly left divisible semiring. Figures 2(a)-(b) illustrate the determinization of a weighted automaton over the tropical semiring. A state $r$ of the output automaton that can be reached from the start state by a path $\pi$ corresponds to the set of pairs $(q, x) \in Q \times \mathbb{K}$ such that $q$ can be reached from an initial state of the original machine by a path $\sigma$ with the same label as $\pi$, $l[\sigma] = l[\pi]$, and $\lambda[p[\sigma]] \otimes w[\sigma] = \lambda[p[\pi]] \otimes w[\pi] \otimes x$. Thus, $x$ is the remaining weight at state $q$. The worst case complexity of determinization is exponential even in the unweighted case. However, in many practical cases, such as for weighted automata used in large-vocabulary speech recognition, this blow-up does not occur. It is also important to notice that, just like composition, determinization admits a natural on-the-fly implementation [9] which can be useful for saving space.

Unlike the unweighted case, determinization does not halt for some input weighted automata. In fact, some weighted automata, non subsequentiable automata, do not even admit equivalent subsequential machines. We say that a weighted automaton $A$ is determinizable if the determinization algorithm halts for the input $A$. With a determinizable input, the algorithm outputs an equivalent subsequential weighted automaton [9].

We assume that the weighted automata considered are all such that for any string $x \in \Sigma^*$, $w[P(I, x, Q)] \neq \overline{0}$. This condition is always satisfied with trim machines over the tropical semiring or any zero-sum-free semiring.
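The weighted subset construction can be sketched for the tropical semiring as follows. This is a simplified illustration, not the implementation of [9]: it assumes a single initial state with initial weight 0, integer weights, and a `max_states` guard (an addition of this sketch) that stops the construction when the input appears not to be determinizable. The example automaton uses the arc labels of Figure 2(a); its connectivity is an assumption inferred from those labels.

```python
from collections import deque


def determinize(start, arcs, finals, max_states=50):
    """Weighted subset construction over the tropical semiring (min, +).

    arcs: list of (src, label, weight, dst); finals: state -> final weight.
    Output states are frozensets of (state, residual weight) pairs.
    """
    init = frozenset({(start, 0)})
    out_arcs, out_finals = [], {}
    seen, queue = {init}, deque([init])
    while queue:
        subset = queue.popleft()
        final_weights = [x + finals[q] for q, x in subset if q in finals]
        if final_weights:
            out_finals[subset] = min(final_weights)
        # Group the weights reachable from this subset by input label.
        by_label = {}
        for q, x in subset:
            for p, a, w, n in arcs:
                if p == q:
                    by_label.setdefault(a, []).append((n, x + w))
        for a, reached in sorted(by_label.items()):
            w_min = min(w for _, w in reached)  # weight of the output arc
            residual = {}
            for n, w in reached:
                # Left division in the tropical semiring is subtraction.
                residual[n] = min(residual.get(n, float("inf")), w - w_min)
            dst = frozenset(residual.items())
            out_arcs.append((subset, a, w_min, dst))
            if dst not in seen:
                if len(seen) >= max_states:
                    raise RuntimeError(
                        "state limit reached; input may not be determinizable")
                seen.add(dst)
                queue.append(dst)
    return init, out_arcs, out_finals


# Arc labels of Figure 2(a); the connectivity is an assumption of this sketch.
ARCS_A = [(0, "a", 1, 1), (0, "a", 2, 2), (1, "b", 3, 1),
          (2, "b", 3, 2), (1, "c", 5, 3), (2, "d", 6, 3)]

if __name__ == "__main__":
    init, arcs, finals = determinize(0, ARCS_A, {3: 0})
    for src, label, weight, dst in arcs:
        print(sorted(src), label, weight, sorted(dst))
```

Under these assumptions the output has three states, with arcs a/1, b/3, c/5 and d/7, which agrees with the labels recovered for Figure 2(b).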
[Figure 2 (diagrams not reproduced). Labels recovered from the figure, in extraction order: (a) a/1, a/2, b/3, 3/0, c/5, b/3, d/6; (b) (0,0), (1,0),(2,1), a/1, b/3, (3,0)/0, c/5, d/7; (c) a/1, a/2, b/3, 3/0, c/5, b/4, d/6.]

Figure 2: Determinization of weighted automata. (a) Weighted automaton $A$ over the tropical semiring. (b) Equivalent weighted automaton $B$ obtained by determinization of $A$. (c) Non-determinizable weighted automaton over the tropical semiring; states 1 and 2 are non-twin siblings.

There exists a general property, the twins property, first formulated for finite-state transducers by C. Choffrut [3], later generalized to weighted automata over the tropical semiring by [9], that provides a characterization of determinizable weighted automata under some general conditions. Let $A$ be a weighted automaton over a weakly left divisible left semiring $\mathbb{K}$. Two states $q$ and $q'$ of $A$ are said to be siblings if there exist two strings $x$ and $y$ in $\Sigma^*$ such that both $q$ and $q'$ can be reached from $I$ by paths labeled with $x$ and there is a cycle at $q$ and a cycle at $q'$ both labeled with $y$. When $\mathbb{K}$ is a commutative and cancellative semiring, two sibling states are said to be twins iff for any string $y$:

$$w[P(q, y, q)] = w[P(q', y, q')] \tag{6}$$

$A$ has the twins property if any two sibling states of $A$ are twins. Figure 2(c) shows an unambiguous weighted automaton over the tropical semiring that does not have the twins property: states 1 and 2 can be reached by paths labeled with $a$ from the initial state and admit cycles with the same label $b$, but the weights of these cycles (3 and 4) are different.

Theorem 1 ([9]) Let $A$ be a weighted automaton over the tropical semiring. If $A$ has the twins property, then $A$ is determinizable.

With trim unambiguous weighted automata, the condition is also necessary.

Theorem 2 ([9]) Let $A$ be a trim unambiguous weighted automaton over the tropical semiring. Then the three following properties are equivalent:
1. $A$ is determinizable.
2. $A$ has the twins property.
3. $A$ is subsequentiable.

There exists an efficient algorithm for testing the twins property for weighted automata [2]. The test of the twins property for finite-state transducers and weighted automata over other semirings is also discussed by [2]. Note that any acyclic weighted automaton over a zero-sum-free semiring has the twins property and is determinizable.
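A short worked computation may help to see why the automaton of Figure 2(c) is not determinizable. It assumes the connectivity suggested by the figure labels (transitions $a/1$ from state 0 to state 1, $a/2$ from state 0 to state 2, a loop $b/3$ at state 1 and a loop $b/4$ at state 2), and writes each subset as a set of pairs (state, residual weight) in the tropical semiring:

```latex
\begin{align*}
a     &\;\longmapsto\; \{(1, 1 - 1),\ (2, 2 - 1)\} = \{(1, 0),\ (2, 1)\} \\
ab    &\;\longmapsto\; \{(1, 3 - 3),\ (2, 1 + 4 - 3)\} = \{(1, 0),\ (2, 2)\} \\
ab^2  &\;\longmapsto\; \{(1, 3 - 3),\ (2, 2 + 4 - 3)\} = \{(1, 0),\ (2, 3)\} \\
      &\;\;\vdots \\
ab^n  &\;\longmapsto\; \{(1, 0),\ (2, 1 + n)\}
\end{align*}
```

Each additional $b$ increases the residual weight at state 2 by the difference $4 - 3 = 1$ between the two cycle weights, so the construction keeps creating new subsets and never halts. With equal cycle weights, as in Figure 2(a), the residual stays at 1 and the subset $\{(1, 0), (2, 1)\}$ is simply revisited.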
5 Weight Pushing

The choice of the distribution of the total weight along each successful path of a weighted automaton does not affect the definition of the function realized by that automaton, but it may have a critical impact on efficiency in many applications, e.g., natural language processing applications, when a heuristic pruning is used to visit only a subpart of the automaton. There exists an algorithm, weight pushing, for normalizing the distribution of the weights along the paths of a weighted automaton or, more generally, a weighted directed graph.

Let $A$ be a weighted automaton over a semiring $\mathbb{K}$. Assume that $\mathbb{K}$ is zero-sum-free and weakly left divisible. For any state $q \in Q$, assume that the following sum is well-defined and in $\mathbb{K}$:

$$d[q] = \bigoplus_{\pi \in P(q, F)} (w[\pi] \otimes \rho[n[\pi]]) \tag{7}$$

$d[q]$ is the shortest-distance from $q$ to $F$ [11]. $d[q]$ is well-defined for all $q \in Q$ when $\mathbb{K}$ is a $k$-closed semiring [11]. The weight pushing algorithm consists of computing each shortest-distance $d[q]$ and of reweighting the transition weights, initial weights and final weights in the following way:

$$\forall e \in E \text{ s.t. } d[p[e]] \neq \overline{0}, \quad w[e] \leftarrow d[p[e]]^{-1} \otimes w[e] \otimes d[n[e]] \tag{8}$$

$$\forall q \in I, \quad \lambda[q] \leftarrow \lambda[q] \otimes d[q] \tag{9}$$

$$\forall q \in F \text{ s.t. } d[q] \neq \overline{0}, \quad \rho[q] \leftarrow d[q]^{-1} \otimes \rho[q] \tag{10}$$

Each of these operations can be assumed to be done in constant time, thus reweighting can be done in linear time $O(T_\otimes |A|)$ where $T_\otimes$ denotes the worst cost of an $\otimes$-operation. The complexity of the computation of the shortest-distances depends on the semiring. In the case of $k$-closed semirings such as the tropical semiring, $d[q]$, $q \in Q$, can be computed using a generic shortest-distance algorithm [11]. The complexity of the algorithm is linear in the case of an acyclic automaton: $O(|Q| + (T_\oplus + T_\otimes)|E|)$, where $T_\oplus$ denotes the worst cost of an $\oplus$-operation. In the case of a general weighted automaton over the tropical semiring, the complexity of the algorithm is $O(|E| + |Q| \log |Q|)$.

In the case of closed semirings such as $(\mathbb{R}_+, +, \times, 0, 1)$, a generalization of the Floyd-Warshall algorithm for computing all-pairs shortest-distances can be used. The complexity of the algorithm is $\Theta(|Q|^3 (T_\oplus + T_\otimes + T_*))$ where $T_*$ denotes the worst cost of the closure operation. The space complexity of these algorithms is $\Theta(|Q|^2)$. These complexities make it impractical to use the Floyd-Warshall algorithm for computing $d[q]$, $q \in Q$, for relatively large graphs or automata of several hundred million states or transitions. An approximate version of the shortest-distance algorithm of [11] can be used instead to compute $d[q]$ efficiently [10].

Roughly speaking, the algorithm pushes the weights of each path as much as possible towards the initial states. Figures 3(a)-(c) illustrate the application of the algorithm in a special case both for the tropical and probability semirings.
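The reweighting formulas (8)-(10) can be sketched for an acyclic automaton as follows. The function `push_weights` and its parameters (`plus`, `times`, `ldiv`, `zero`) are hypothetical names for this illustration; shortest distances are computed by a simple memoized recursion, which is only valid for acyclic inputs and is not the generic algorithm of [11]. The example uses the arc labels of Figure 3(a) with a connectivity inferred from those labels; the initial and final weights used for the probability semiring (both 1) are also assumptions.

```python
from fractions import Fraction


def push_weights(arcs, finals, initials, plus, times, ldiv, zero):
    """Weight pushing on an acyclic automaton.

    arcs: list of (src, label, weight, dst); finals/initials: state -> weight.
    ldiv(d, w) returns d^{-1} (x) w; zero is the additive identity.
    """
    dist = {}

    def d(q):
        # Shortest distance from q to the final states, as in equation (7).
        if q not in dist:
            total = finals.get(q, zero)
            for p, _, w, n in arcs:
                if p == q:
                    total = plus(total, times(w, d(n)))
            dist[q] = total
        return dist[q]

    # Equation (8): transitions leaving a state with d = zero are unchanged.
    new_arcs = [(p, a, w if d(p) == zero else times(ldiv(d(p), w), d(n)), n)
                for p, a, w, n in arcs]
    # Equation (9).
    new_initials = {q: times(lam, d(q)) for q, lam in initials.items()}
    # Equation (10).
    new_finals = {q: rho if d(q) == zero else ldiv(d(q), rho)
                  for q, rho in finals.items()}
    return new_arcs, new_initials, new_finals


# Arc labels of Figure 3(a); the connectivity is an assumption of this sketch.
ARCS = [(0, "a", 0, 1), (0, "b", 1, 1), (0, "c", 5, 1),
        (0, "d", 0, 2), (0, "e", 1, 2),
        (1, "e", 0, 3), (1, "f", 1, 3), (2, "e", 4, 3), (2, "f", 5, 3)]


def push_tropical():
    return push_weights(ARCS, {3: 0}, {0: 0}, min,
                        lambda a, b: a + b, lambda d, w: w - d, float("inf"))


def push_probability():
    return push_weights(ARCS, {3: 1}, {0: 1}, lambda a, b: a + b,
                        lambda a, b: a * b, lambda d, w: Fraction(w) / d, 0)


if __name__ == "__main__":
    print(push_tropical())
    print(push_probability())
```

Under these assumptions the tropical result carries weights d/4 and e/5 into state 2 and e/0, f/1 out of it, and the probability result has initial weight 15, in agreement with the labels recovered for Figures 3(b) and 3(c).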
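[Figure 3 (diagrams not reproduced). Labels recovered from the figure, in extraction order: (a) a/0, b/1, c/5, d/0, e/1, e/0, f/1, e/4, f/5; (b) 0/0, a/0, b/1, c/5, d/4, e/5, 3/0, e/0, f/1, e/0, f/1; (c) 0/15, a/0, b/(1/15), c/(5/15), d/0, e/(9/15), 3/1, e/0, f/1, e/(4/9), f/(5/9); (d) 0/0, a/0, b/1, c/5, 3/0, e/0, f/1.]

Figure 3: Weight pushing algorithm. (a) Weighted automaton $A$. (b) Equivalent weighted automaton $B$ obtained by weight pushing in the tropical semiring. (c) Weighted automaton $C$ obtained from $A$ by weight pushing in the probability semiring. (d) Minimal weighted automaton over the tropical semiring equivalent to $A$.

Note that if $d[q] = \overline{0}$, then, since $\mathbb{K}$ is zero-sum-free, the weight of all paths from $q$ to $F$ is $\overline{0}$. Let $A$ be a weighted automaton over the semiring $\mathbb{K}$. Assume that $\mathbb{K}$ is closed or $k$-closed and that the shortest-distances $d[q]$ are all well-defined and in $\mathbb{K} - \{\overline{0}\}$. Note that in both cases we can use the distributivity over the infinite sums defining shortest distances. Let $e'$ ($\pi'$) denote the transition $e$ (path $\pi$) after application of the weight pushing algorithm. $e'$ ($\pi'$) differs from $e$ (resp. $\pi$) only by its weight. Let $\lambda'$ denote the new initial weight function, and $\rho'$ the new final weight function.

Proposition 2 Let $B = (\Sigma, Q, I, F, E', \lambda', \rho')$ be the result of the weight pushing algorithm applied to the weighted automaton $A$, then

1. the weight of a successful path $\pi$ is unchanged after application of weight pushing:

$$\lambda'[p[\pi']] \otimes w[\pi'] \otimes \rho'[n[\pi']] = \lambda[p[\pi]] \otimes w[\pi] \otimes \rho[n[\pi]] \tag{11}$$

2. the weighted automaton $B$ is stochastic, i.e., writing $E'[q]$ for the set of transitions of $B$ leaving $q$,

$$\forall q \in Q, \quad \bigoplus_{e' \in E'[q]} w[e'] = \overline{1} \tag{12}$$

Proof. Let $\pi' = e_1' \cdots e_k'$. By definition of $\lambda'$ and $\rho'$,

$$\lambda'[p[\pi']] \otimes w[\pi'] \otimes \rho'[n[\pi']] = \lambda[p[e_1]] \otimes d[p[e_1]] \otimes d[p[e_1]]^{-1} \otimes w[e_1] \otimes d[n[e_1]] \otimes \cdots \otimes d[p[e_k]]^{-1} \otimes w[e_k] \otimes d[n[e_k]] \otimes d[n[e_k]]^{-1} \otimes \rho[n[\pi]] = \lambda[p[\pi]] \otimes w[\pi] \otimes \rho[n[\pi]]$$

which proves the first statement of the proposition. Let $q \in Q$,

$$\bigoplus_{e' \in E'[q]} w[e'] = \bigoplus_{e \in E[q]} d[q]^{-1} \otimes w[e] \otimes d[n[e]]$$
$$= d[q]^{-1} \otimes \bigoplus_{e \in E[q]} w[e] \otimes d[n[e]]$$
$$= d[q]^{-1} \otimes \bigoplus_{e \in E[q]} w[e] \otimes \bigoplus_{\pi \in P(n[e], F)} (w[\pi] \otimes \rho[n[\pi]])$$
$$= d[q]^{-1} \otimes \bigoplus_{e \in E[q],\, \pi \in P(n[e], F)} (w[e] \otimes w[\pi] \otimes \rho[n[\pi]])$$
$$= d[q]^{-1} \otimes d[q] = \overline{1}$$

where we used the distributivity of the multiplicative operation over infinite sums in closed or $k$-closed semirings. This proves the second statement of the proposition.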
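As a quick numerical check of the first statement, consider the path labeled $ef$ that goes through state 2 in Figure 3. The computation below assumes the connectivity suggested by the figure labels (transition $e/1$ from state 0 to state 2 and $f/5$ from state 2 to state 3 in Figure 3(a)), and initial and final weights equal to $\overline{1}$ before pushing. The shortest-distance factors introduced by pushing cancel along the path:

```latex
\begin{align*}
\text{tropical, before pushing:}\quad
  & 0 + (1 + 5) + 0 = 6 \\
\text{tropical, after pushing (Figure 3(b)):}\quad
  & 0 + (5 + 1) + 0 = 6 \\
\text{probability, before pushing:}\quad
  & 1 \times (1 \times 5) \times 1 = 5 \\
\text{probability, after pushing (Figure 3(c)):}\quad
  & 15 \times \left(\tfrac{9}{15} \times \tfrac{5}{9}\right) \times 1 = 5
\end{align*}
```

In both semirings the total weight of the path is preserved, while the individual transition weights have moved towards the initial state.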

These two properties of weight pushing are illustrated by Figures 3(a)-(c): the total weight of a successful path is unchanged after pushing; at each state of the weighted automaton of Figure 3(b), the minimum weight of the outgoing transitions is 0, and at each state of the weighted automaton of Figure 3(c), the weights of outgoing transitions sum to 1. Weight pushing can also be used to test the equivalence of two weighted automata [9].

6 Minimization

A deterministic weighted automaton is said to be minimal if there exists no other deterministic weighted automaton with a smaller number of states and realizing the same function. Two states of a deterministic weighted automaton are said to be equivalent if exactly the same set of strings with the same weights label paths from these states to a final state, the final weights being included. Thus, two equivalent states of a deterministic weighted automaton can be merged without affecting the function realized by that automaton. A weighted automaton is minimal when it admits no two distinct equivalent states after any redistribution of the weights along its paths.

There exists a general algorithm for computing a minimal deterministic automaton equivalent to a given weighted automaton [9]. The algorithm consists of first applying the weight pushing algorithm to normalize the distribution of the weights along the paths of the input automaton, and then of treating each pair (label, weight) as a single label and applying the classical (unweighted) automata minimization.

Theorem 3 ([9]) Let $A$ be a deterministic weighted automaton over a semiring $\mathbb{K}$. Assume that the conditions of application of the weight pushing algorithm hold, then the execution of the following steps:
1. weight pushing,
2. (unweighted) automata minimization,
leads to a minimal weighted automaton equivalent to $A$.
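The second step can be sketched with a simple partition refinement in which each pair (label, weight) is treated as a single symbol. This illustration uses Moore-style refinement rather than the faster algorithms of [14] or [1], and the function name `equivalence_classes` is hypothetical. The two example automata use the arc labels of Figures 3(a) and 3(b), with a connectivity inferred from those labels.

```python
def equivalence_classes(states, arcs, finals):
    """Moore-style refinement treating each (label, weight) pair as one symbol.

    arcs: list of (src, label, weight, dst); finals: state -> final weight.
    Returns a dict mapping each state to the id of its equivalence class.
    """
    # Initial partition: group states by final weight (None if not final).
    block = {q: finals.get(q) for q in states}
    while True:
        signature = {}
        for q in states:
            outgoing = frozenset(((a, w), block[n])
                                 for p, a, w, n in arcs if p == q)
            signature[q] = (block[q], outgoing)
        ids = {}
        new_block = {q: ids.setdefault(signature[q], len(ids)) for q in states}
        # Each round only splits classes, so an unchanged count means stable.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block


STATES = [0, 1, 2, 3]
FINALS = {3: 0}
# Before pushing (labels of Figure 3(a); connectivity is an assumption).
ARCS_BEFORE = [(0, "a", 0, 1), (0, "b", 1, 1), (0, "c", 5, 1),
               (0, "d", 0, 2), (0, "e", 1, 2),
               (1, "e", 0, 3), (1, "f", 1, 3), (2, "e", 4, 3), (2, "f", 5, 3)]
# After pushing in the tropical semiring (labels of Figure 3(b)).
ARCS_PUSHED = [(0, "a", 0, 1), (0, "b", 1, 1), (0, "c", 5, 1),
               (0, "d", 4, 2), (0, "e", 5, 2),
               (1, "e", 0, 3), (1, "f", 1, 3), (2, "e", 0, 3), (2, "f", 1, 3)]

if __name__ == "__main__":
    print(equivalence_classes(STATES, ARCS_BEFORE, FINALS))
    print(equivalence_classes(STATES, ARCS_PUSHED, FINALS))
```

On the unpushed automaton no two states fall in the same class, whereas after pushing states 1 and 2 share a class and can be merged, which is why the pushing step has to come first.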
[Figure 4 (diagrams not reproduced). Labels recovered from the figure, in extraction order: (a) a/1, b/2, c/3, d/4, e/5, 3/1, e/.8, f/1, e/4, f/5; (b) 0/(459/5), a/(1/51), b/(2/51), c/(3/51), d/(20/51), e/(25/51), 2/1, e/(4/9), f/(5/9); (c) 0/25, a/.04, b/.08, c/.12, d/.80, e/1, 2/1, e/.8, f/1.]

Figure 4: Minimization of weighted automata. (a) Weighted automaton $A'$ over the probability semiring. (b) Minimal weighted automaton $B'$ equivalent to $A'$. (c) Minimal weighted automaton $C'$ equivalent to $A'$.

The complexity of automata minimization is linear in the case of acyclic automata, $O(|Q| + |E|)$ [14], and in $O(|E| \log |Q|)$ in the general case [1]. Thus, in view of the complexity results given in the previous section, in the case of the tropical semiring, the total complexity of the weighted minimization algorithm is linear in the acyclic case, $O(|Q| + |E|)$, and in $O(|E| \log |Q|)$ in the general case.

Figures 3(a), 3(b), and 3(d) illustrate the application of the algorithm in the tropical semiring. The automaton of Figure 3(a) cannot be further minimized using the classical unweighted automata minimization since no two states are equivalent in that machine. After weight pushing, the automaton (Figure 3(b)) has two states (1 and 2) that can be merged by the classical unweighted automata minimization.

Figures 4(a)-(c) illustrate the minimization of an automaton defined over the probability semiring. Unlike the unweighted case, a minimal weighted automaton is not unique, but all minimal weighted automata have the same graph topology; they only differ by the way the weights are distributed along each path. The weighted automata $B'$ and $C'$ are both minimal and equivalent to $A'$. $B'$ is obtained from $A'$ using the algorithm described above in the probability semiring and it is thus a stochastic weighted automaton in the probability semiring.

For a deterministic weighted automaton, the first operation of the semiring can be arbitrarily chosen without affecting the definition of the function it realizes. This is because, by definition, a deterministic weighted automaton admits at most one path labeled with any given string. Thus, in the algorithm described in Theorem 3, the weight pushing step can be executed in any semiring $\mathbb{K}'$ whose multiplicative operation matches that of $\mathbb{K}$. The minimal weighted automaton obtained by pushing the weights in $\mathbb{K}'$ is also minimal in $\mathbb{K}$ since it can be interpreted as a (deterministic) weighted automaton over $\mathbb{K}$.

In particular, $A'$ can be interpreted as a weighted automaton over the $(\max, \times)$-semiring $(\mathbb{R}_+, \max, \times, 0, 1)$. The application of the weighted minimization algorithm to $A'$ in this semiring leads to the minimal weighted automaton $C'$ of Figure 4(c). $C'$ is also a stochastic weighted automaton in the sense that, at any state, the maximum weight of all outgoing transitions is one.
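The effect of changing only the additive operation can be sketched on the automaton of Figure 4(a). The code below is an illustration under stated assumptions: the connectivity is inferred from the figure labels, the initial weight is taken to be 1, the helper `push` is a hypothetical name, and exact fractions are used so that the results can be compared with the labels of Figures 4(b) and 4(c).

```python
from fractions import Fraction as F

# Arc labels of Figure 4(a); the connectivity is an assumption of this sketch.
ARCS = [(0, "a", F(1), 1), (0, "b", F(2), 1), (0, "c", F(3), 1),
        (0, "d", F(4), 2), (0, "e", F(5), 2),
        (1, "e", F(4, 5), 3), (1, "f", F(1), 3),
        (2, "e", F(4), 3), (2, "f", F(5), 3)]
FINALS = {3: F(1)}


def push(arcs, finals, plus, start=0):
    """Weight pushing with multiplication as the product.

    `plus` combines a list of weights: `sum` gives the probability semiring,
    `max` gives the (max, x) semiring. Assumes an acyclic input in which
    every shortest distance is non-zero.
    """
    dist = {}

    def d(q):
        if q not in dist:
            terms = [w * d(n) for p, _, w, n in arcs if p == q]
            if q in finals:
                terms.append(finals[q])
            dist[q] = plus(terms)
        return dist[q]

    pushed = [(p, a, w * d(n) / d(p), n) for p, a, w, n in arcs]
    # With an initial weight of 1, the new initial weight is d(start).
    return pushed, d(start)


def outgoing(arcs, state):
    """The set of (label, weight) pairs leaving a state."""
    return {(a, w) for p, a, w, n in arcs if p == state}


if __name__ == "__main__":
    for name, plus in (("probability", sum), ("max-times", max)):
        pushed, initial = push(ARCS, FINALS, plus)
        print(name, "initial weight:", initial)
        print("  states 1 and 2 mergeable:",
              outgoing(pushed, 1) == outgoing(pushed, 2))
```

Under these assumptions both choices make states 1 and 2 mergeable. Pushing with `sum` yields the initial weight 459/5 of Figure 4(b), and pushing with `max` yields the initial weight 25 of Figure 4(c).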
This fact leads to several interesting observations. One is related to the complexity of the algorithms. Indeed, we can choose a semiring $\mathbb{K}'$ in which the complexity of weight pushing is better than in $\mathbb{K}$. The resulting automaton is still minimal in $\mathbb{K}$ and has the additional property of being stochastic in $\mathbb{K}'$. It only differs from the weighted automaton obtained by pushing weights in $\mathbb{K}$ in the way weights are distributed along the paths. They can be obtained from each other by application of weight pushing in the appropriate semiring. In the particular case of a weighted automaton over the probability semiring, it may be preferable to use weight pushing in the $(\max, \times)$-semiring since the complexity of the algorithm is then equivalent to that of classical single-source shortest-paths algorithms. The corresponding algorithm is a special instance of the generic shortest-distance algorithm given by [11].

Another important point is that the weight pushing algorithm may not be defined in $\mathbb{K}$ because the machine is not zero-sum-free or for other reasons. But an alternative semiring $\mathbb{K}'$ can sometimes be used to minimize the input weighted automaton.

The results just presented were all related to the minimization of the number of states of a deterministic weighted automaton. The following proposition shows that minimizing the number of states coincides with minimizing the number of transitions.

Proposition 3 ([9]) Let $A$ be a minimal deterministic weighted automaton, then $A$ has the minimal number of transitions.

Proof. Let $A$ be a deterministic weighted automaton with the minimal number of transitions. If two distinct states of $A$ were equivalent, they could be merged, thereby strictly reducing the number of its transitions. Thus, $A$ must be a minimal deterministic automaton. Since minimal deterministic automata have the same topology, in particular the same number of states and transitions, this proves the proposition.

7 Conclusion

We surveyed several recent weighted finite-state transducer algorithms. These algorithms can be used in a variety of applications to create efficient and complex systems. They have been used with weighted transducers of several hundred million states and transitions to create large-vocabulary speech recognition and complex spoken-dialog systems. Other algorithms such as $\epsilon$-removal and synchronization of weighted transducers also play a critical role in the design of such large-scale systems.

References

[1] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison Wesley: Reading, MA, 1974.
[2] Cyril Allauzen and Mehryar Mohri. Efficient Algorithms for Testing the Twins Property. Journal of Automata, Languages and Combinatorics, 8(2), 2003.
[3] Jean Berstel. Transductions and Context-Free Languages. Teubner Studienbücher: Stuttgart, 1979.
[4] Jean Berstel and Christophe Reutenauer. Rational Series and Their Languages. Springer-Verlag: Berlin-New York, 1988.
[5] Karel Culik II and Jarkko Kari. Digital Images and Formal Languages. In Grzegorz Rozenberg and Arto Salomaa, editors, Handbook of Formal Languages, volume 3, pages 599–616. Springer, 1997.
[6] Samuel Eilenberg. Automata, Languages and Machines, volume A. Academic Press, 1974.
[7] Werner Kuich and Arto Salomaa. Semirings, Automata, Languages. Number 5 in EATCS Monographs on Theoretical Computer Science. Springer-Verlag, Berlin, Germany, 1986.
[8] Daniel J. Lehmann. Algebraic Structures for Transitive Closures. Theoretical Computer Science, 4:59–76, 1977.
[9] Mehryar Mohri. Finite-State Transducers in Language and Speech Processing. Computational Linguistics, 23:2, 1997.
[10] Mehryar Mohri. General Algebraic Frameworks and Algorithms for Shortest-Distance Problems. Technical Memorandum 981210-10TM, AT&T Labs - Research, 62 pages, 1998.
[11] Mehryar Mohri. Semiring Frameworks and Algorithms for Shortest-Distance Problems. Journal of Automata, Languages and Combinatorics, 7(3):321–350, 2002.
[12] Mehryar Mohri, Fernando C. N. Pereira, and Michael Riley. Weighted Automata in Text and Speech Processing. In Proceedings of the 12th biennial European Conference on Artificial Intelligence (ECAI-96), Workshop on Extended finite state models of language, Budapest, Hungary. ECAI, 1996.
[13] Fernando C. N. Pereira and Michael D. Riley. Speech recognition by composition of weighted finite automata. In Emmanuel Roche and Yves Schabes, editors, Finite-State Language Processing, pages 431–453. MIT Press, Cambridge, Massachusetts, 1997.
[14] Dominique Revuz. Minimisation of acyclic deterministic automata in linear time. Theoretical Computer Science, 92:181–189, 1992.
[15] Arto Salomaa and Matti Soittola. Automata-Theoretic Aspects of Formal Power Series. Springer-Verlag: New York, 1978.
[16] Marcel Paul Schützenberger. Sur une variante des fonctions séquentielles. Theoretical Computer Science, 4(1):47–57, 1977.