
Rapid Parameterized Model Checking of Snoopy Cache Coherence Protocols

E. Allen Emerson and Vineet Kahlon
Department of Computer Sciences and Computer Engineering Research Center, The University of Texas, Austin TX 78712, USA

Abstract. A new method is proposed for parameterized reasoning about snoopy cache coherence protocols. The method is distinctive for being exact (sound and complete), fully automatic (algorithmic), and tractably efficient. The states of most cache coherence protocols can be organized into a hierarchy reflecting how tightly a memory block in a given cache state is bound to the processor. A broad framework encompassing snoopy cache coherence protocols is proposed where the hierarchy implicit in the design of protocols is captured as a pre-order. This yields a new solution technique that hinges on the construction of an abstract history graph where a global concrete state is represented by an abstract state reflecting the occupied local states. The abstract graph also takes into account the history of local transitions of the protocol that were fired along the computation to get to the global state. This permits the abstract history graph to exactly capture the behaviour of systems with an arbitrary number of homogeneous processes. Although the worst case size of the abstract history graph can be exponential in the size of the transition diagram describing the protocol, the actual size of the abstract history graph is small for standard cache protocols. The method is applicable to all of the most common snoopy cache protocols described in Handy's book [19], from Illinois-MESI to Dragon. The experimental results for parameterized verification of each of those protocols document the efficiency of this new method in practice, with each protocol being verified in just a fraction of a second. It is emphasized that this is parameterized verification.

1 Introduction

Cache protocols provide a vital buffer between the ever growing performance of processors and lagging memory speeds, making them indispensable for applications such as shared memory multi-processors. Unfortunately, cache protocols are behaviorally complex. Ensuring their correct operation, in particular that they maintain the fundamental safety property of coherence so that different processes agree on their view of shared data items, can be subtle.

The difficulty of the problem is often magnified as the number $n$ of coordinating caches increases. Moreover, it is highly desirable that a cache protocol be correct independent of the magnitude of $n$. There is thus great practical as well as theoretical interest in uniform parameterized reasoning about systems comprised of $n$ homogeneous cache protocols so as to ensure correctness for systems of all sizes $n$. This general problem is known in the literature as the Parameterized Model Checking Problem (PMCP). It is in general algorithmically undecidable. Prior attempts to address the PMCP for cache protocols (cf. Section 5) have had a number of limitations, ranging from incompleteness to the need for considerable human intervention and ingenuity to potentially catastrophic inefficiency.

[Footnote: This work was supported in part by NSF grants CCR-009-8141 and CCR-020-5483, and SRC contract 2002-TJ-1026. The authors' email addresses are {emerson,kahlon}@cs.utexas.edu.]

In this paper we present a general method for solving the PMCP over snoopy cache coherence protocols of the sort commonly used in shared memory multiprocessors. Our framework includes all of the protocols in the book of
Handy [19]. Our method is specialized to dealing with safety properties, as is appropriate for reasoning about coherence. We give a solution for this PMCP over our cache framework for safety that is distinguished by being exact (sound and complete), fully automatic (algorithmic), and having complexity bounds that are quite tractable. The worst case complexity of our general algorithm is single exponential time in the size of the state diagram of a single cache unit; however, our experimental results show that our algorithm performs very efficiently in practice. We have applied our method to verify parameterized versions of the MSI, MESI, MOESI, Illinois (MESI-type), Berkeley, N+1, Dragon, and Firefly cache coherence protocols.

In our framework, we model cache coherence protocols using a specialized variant of broadcast protocols [14] that we call pre-ordered broadcast protocols, where processes coordinate using broadcast primitives plus boolean guards. A broadcast transmission corresponds to a cache protocol putting a message on the bus; reception of such a message corresponds to snooping the bus and taking appropriate action. Boolean guards make it possible to model protocols (e.g., Illinois, Firefly, Dragon) that need to determine the presence or absence of the required memory block in other caches.

Our approach exploits a feature common to most snoopy cache coherence protocols [8]: their states can be organized into a hierarchy based on how tightly a memory block in a given state is bound to the processor. Consider, for example, the MSI cache coherence protocol (cf. Figure 1). A memory block in the modified state is intended to be used by at most one processor and can be written to by that processor locally without generating any memory transactions across the bus. So it is tightly bound to the processor. However, a block in the shared state can potentially be shared by multiple processes and cannot be modified locally. Hence it is less tightly bound to the processor. We make precise this notion of tightness by capturing it as a pre-order $\le$ on the state set of an individual cache protocol. Intuitively, a state higher in the order is more tightly bound to the processor than a state that is comparably lower in the order. For instance, in the case of the MSI protocol, the pre-order is given by $I \le S \le M$.

[Footnote: A pre-order $\le$ on a finite set $S$ is a reflexive and transitive binary relation on $S$. There are several associated relations. We say $s$ is equivalent to $t$, written $s \equiv t$, if $s \le t$ and $t \le s$; $s$ strictly precedes $t$, written $s < t$, if $s \le t$ and not $t \le s$; $s$ is incomparable to $t$ if neither $s \le t$ nor $t \le s$.]

Our technique involves the construction of an abstract history graph over nodes of the form $(s, S')$ with $s \in S$ and $S' \subseteq S$, where $S$ is the set of states of the given cache protocol. The idea is the following: represent a global state $g$ of a system with $n$ caches by a tuple of the form $(s, S')$. Here $s$ denotes the local state of the process executing the most recent transition in the computation leading up to $g$ that flushes all the other processes into some unique fixed state. The set $S'$ denotes the maximal set of states of $S$ that could potentially be filled, given arbitrarily many processes, by firing (a stuttering of) the sequence of local transitions that were fired in the system with $n$ caches to get to $g$. The standard abstract graph construction used in, e.g., [25] just stores the set of local states occurring in a global state. The extra historical information in our new construction permits us to reason about an arbitrary number of caches in an exact fashion
with respect to safety properties. In the worst case, the size of the abstract graph may be exponential in the size of the state diagram of the given cache protocol. But in practice the abstract graph tends to be small, as documented by our empirical results. In our experiments, the number of abstract states remained small relative to the number of protocol states. We believe this may be a reflection of the tendency for broadcast transitions to drive recipients from a wider range of cache states to a narrower (lower in the pre-order) range of cache states, thereby reducing the number of degrees of freedom possible for abstract states. Finally, we discuss how our technique enables us to generate error traces once an error is detected.

The rest of the paper is organized as follows. We begin by introducing the system model in section 2. In section 3, we present a model checking algorithm for verifying parameterized safety properties based on the construction of the abstract history graph. Applications and experimental results are discussed in section 4, while a comparison with related works and some concluding remarks are given in the final section 5.

2 Preliminaries

2.1 Motivating Example

We use as an
example the simple MSI cache coherence protocol. The state transition diagram for the MSI protocol is shown in figure 1. The symbols M, S and I stand for the modified, shared and invalid states, respectively. The states are organized so that the closer the state is to the top, the more tightly is the memory block in that state bound to the processor. In our system model we capture this notion of tightness as a pre-order on the states of the cache protocol. The notation $a/b$ means that if the controller observes the event $a$ from the processor side of the bus then, in addition to the state change, it generates the bus transaction or action $b$. The null action is denoted by "-". Transitions due to observed bus transactions are shown as dashed arcs, while those due to local processor actions are shown in bold arcs.

The BusRd transaction is generated by a processor read (PrRd) request when the memory block is not in the cache. The newly loaded block is promoted, viz., moved up in the state diagram, from the invalid to the shared state in the requesting cache. If any other cache has the block in the modified state and it observes a BusRd transaction on the bus, then its copy is stale and so it demotes its copy to the shared state. We call such a transition a low-push broadcast. More generally, a broadcast transition from $s$ to $t$ is a low-push transition with respect to $t$ if it forces every other process in a local state that is strictly higher in the pre-order than $t$ to a state that is at most as high as $t$.

The BusRdX (bus read exclusive) transaction is generated by a PrWr to a block that is either not in the cache or is in the cache but not in the modified state. The cache controller puts the address on the bus and asks for an exclusive copy that it intends to modify. All other caches are invalidated. Once the cache obtains the exclusive copy, the write can be performed in the cache. This is an example of a flush broadcast transition, which forces every process other than the one firing the transition and in its non-initial state into a unique fixed state defined by the transition.

Fig. 1. The MSI Cache Coherence Protocol and its template

The template for a
protocol, such as MSI, is obtained from its state transition diagram through a simple abstraction, treating the behavior of the processors as purely non-deterministic. The transformation is straightforward, syntactic, and mechanical: each transition generated by processor actions (represented by a bold line) and labeled by $a/b$, where $b$ is not "-", is labeled with the broadcast send label $a!!$, while every transition generated by bus actions (represented by dashed lines) and labeled with $b/c$ is labeled with the matching broadcast receive label $a??$. In the original diagram the relationship between a broadcast send $a/b$ and its corresponding receive $b/c$ was established with the common symbol $b$, while in the template it is established by the common symbol $a$ in the labels $a!!$ and $a??$. Every bold transition labeled with $a/$- represents a local action and is therefore labeled with the local transition label. The natural pre-order on $\{I, S, M\}$ is $I \le S \le M$. All transitions labeled with PrRd are low-pushes with respect to S, while those labeled with PrWr are flushes.

2.2 The System Model: Pre-Ordered Broadcast Protocols

In this paper we consider families of systems of the form $U^n$ for which a pre-order $\le$ can be imposed on the states of template $U$ such that each transition of $U$ is either a local transition, or a flush broadcast, or a low-push broadcast with respect to $\le$. Furthermore, the transition could also be labeled with the specialized disjunctive guard or the specialized conjunctive guard. We call such systems pre-ordered broadcasts.

[Footnote: There is usually a natural and visually obvious pre-order, but there may be more than one suitable pre-order. A suitable pre-order can be constructed as shown in section 3.4.]

The process template $U$ is formally defined by the 4-tuple $(S, \Sigma, R, i)$
where $S$ is a finite, non-empty set of states with initial state $i \in S$, and $\Sigma$ is a finite set of labels including the local transition label $\epsilon$, broadcast send labels $a!!$ and receive labels $a??$. The local transition relation $R \subseteq S \times \Sigma \times S$ is such that each transition is either local, $(s, \epsilon, t)$, or a broadcast send, $(s, a!!, t)$, or a receive, $(s, a??, t)$. We assume that receives are deterministic: for each label $a$ appearing in some broadcast send and for each state $s$ in $S$, there is a unique corresponding receive transition on $a??$ out of $s$. The guard labeling each transition of $U$ is either the boolean expression true, or the specialized conjunctive guard, or the specialized disjunctive guard. We assume that the guard is true for receive transitions. In practice, the above mentioned guards suffice in modeling cache coherence protocols, as each cache only needs to know whether another cache has the memory block it requires, expressed using the specialized disjunctive guard, or whether no other cache has it, expressed using the specialized conjunctive guard.

We further stipulate a pre-ordering $\le$ on the state set $S$ of $U$ such that $i$ is the minimum element, i.e., for all local states $s \in S$ we have $i \le s$, and such that each broadcast transition is of either of the two forms:

1. Flush: Given a state $t'$ of $U$, a transition $(s, a!!, t)$ is called a $t'$-flush transition provided that there exists the matching receive transition $(i, a??, i)$ in $R$ and, for each state $s' \ne i$ of $U$, there is a matching receive transition of the form $(s', a??, t')$ in $R$. A flush transition is a $t'$-flush for some $t'$. Intuitively, a $t'$-flush transition pushes every process in its non-initial state, other than the one firing the transition, into local state $t'$.

2. Low-push: A transition $(s, a!!, t)$ is a low-push transition provided that $s \le t$ and, for each state $s'$ such that $t < s'$, there is a matching receive transition of the form $(s', a??, t')$ such that $t' \le t$ and, for all other states $s'$, there is a matching self-loop receive transition $(s', a??, s')$. Intuitively, a transition is a low-push if it pushes every process in a local state strictly higher than $t$ in the pre-order into a state at most as high as $t$, while leaving the rest of the processes untouched.

In practice, a natural pre-order is normally supplied along with the diagram of $U$, as it is drawn in appropriate levels. If not, there is given in section 3.4 an efficient algorithm ($O(|U|^2)$) to compute an appropriate pre-order if one exists.

To capture block replacement behavior, we also require that templates be
initializable. This means that from each state $s$ of the protocol, there is a local transition from $s$ to the initial state $i$. Such initializations model block replacement behavior, where a cache is non-deterministically pushed into its invalid state, irrespective of the current state of the block. For simplicity, re-initialization transitions and self-loop receptions are not drawn in state transition diagrams of cache protocols (cf. [8]).

[Footnote: Initializability is not needed for the mathematical results of section 3.1; however, it is needed for the results of section 3.2.]

Given the state transition diagram for $U$, the system $U^n$ with $n$ copies of $U$ is based on interleaving semantics in the standard way. A path $x$ of $U^n$ is a sequence of global states of $U^n$, starting at the initial state of $U^n$, such that every state in the sequence results from its predecessor when some process fires a transition. For a global state $g$ of $U^n$ and a process index $j$, we use $g(j)$ to denote the local state of process $j$ in $g$, and for a computation path $x$ of $U^n$ we use $x(j)$ to denote the local computation path of $j$ in $x$, viz., the sequence of local states of $j$ along $x$. We say that $x$ leads to $g$ to mean that the finite computation path $x$ of $U^n$ ends in global state $g$. In this paper we will focus on finite paths and computations, as they suffice for safety. Finally, given a global state $g$ of $U^n$ and a local state $s$ of $U$, the number of copies of $s$ in $g$ is the number of processes in local state $s$ in global state $g$.

3 Safety Properties

Given a state $s$ of $U$, we say that $s$ is reachable if there exists $n$ such that there is a finite computation of $U^n$ leading to a global state with a process in local state $s$. For cache coherence protocols, we are typically interested in pairwise reachability, viz., given a pair of local states $s$ and $t$ of template $U$, deciding whether for some $n$ there exists a reachable global state of $U^n$ with a process in each of the local states $s$ and $t$. For instance, in the case of
the MSI protocol, we are interested in showing that none of the pairs in the set $\{(M, M), (M, S)\}$ is pairwise reachable.

3.1 Systems without conjunctive guards

In this section, we assume that $U$ is a template without conjunctive guards; guards of the form true or the specialized disjunctive guard are permitted. This allows us to handle the MSI, MOESI, MESI (not the Illinois version, which is handled in the next section), Berkeley and N+1 protocols.

A standard technique for reasoning about parameterized systems involves the construction of an abstract graph to capture the behaviour of a system instance of arbitrary size. Classically, the abstract graph is defined to be a transition diagram over the set $2^S$, with a given concrete global state $g$ of a system instance $U^n$ being mapped, via a mapping $\alpha$ say, onto the set $\alpha(g)$ of local states occupied in $g$. A transition is introduced from $A$ to $B$ in the abstract graph if there exist $n$ and concrete states $g$ and $g'$ of $U^n$ such that $\alpha(g) = A$, $\alpha(g') = B$, and $g'$ results from $g$ by firing a concrete transition of $U^n$. There is a loss of information in the mapping, which is reflected in the fact that it might not be possible to identify a unique successor of $A$ in the abstract graph that results by firing a transition from $s$ to $t$ where $s \in A$. For instance, if it is a local transition, then two different successors are possible: $B_1 = (A \setminus \{s\}) \cup \{t\}$ and $B_2 = A \cup \{t\}$, depending, respectively, on whether there is exactly one or at least two copies of $s$ in the concrete state that maps onto $A$. To preserve soundness we cover for both cases and introduce both $B_1$ and $B_2$ as possible successors. However, this may generate bogus paths in the abstract graph, viz., paths for which there do not exist matching concrete computations. Thus there might exist paths in the abstract graph that don't "lift" to concrete computations, and hence the above technique, though sound, is not complete.
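
The ambiguity of the classical abstraction can be made concrete with a small sketch. The function name `classical_local_successors` and the encoding of abstract states as Python frozensets are illustrative assumptions, not notation from the paper; the sketch only covers the local-transition case described above.

```python
def classical_local_successors(abstract_state, src, dst):
    """Successors of a set of occupied local states under a local
    transition src -> dst in the classical (history-free) abstraction.

    The abstraction forgets how many processes sit in src, so both
    outcomes have to be kept to stay sound.
    """
    if src not in abstract_state:
        return set()  # the transition is not enabled
    # Exactly one copy of src existed, so src is vacated.
    only_copy_moved = (abstract_state - {src}) | {dst}
    # At least two copies of src existed, so src stays occupied.
    some_copy_stayed = abstract_state | {dst}
    return {frozenset(only_copy_moved), frozenset(some_copy_stayed)}


# Hypothetical usage: states I and S occupied, a process in S moves to M.
successors = classical_local_successors(frozenset({"I", "S"}), "S", "M")
```

Keeping both successors is what makes the classical graph sound but possibly incomplete: a path that repeatedly chooses the "some copy stayed" branch may need more processes than any single concrete run on that path supplies.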

In this paper, to check pairwise reachability, we use the abstract history graph of $U$, denoted by $\mathcal{A}_U$, where we bypass the above problem by mapping each concrete state onto a tuple of the form $(s, S')$ that denotes a formal state with at least one copy of state $s$ and finite but arbitrarily many copies of each state in $S'$. As we later show, this permits us to reason about safety properties in a sound and complete fashion.

Definition (representative) Given a template $U$ and a finite computation $x$ of $U^n$ ending in global state $g$, we define rep($x$) to be the tuple $(s, S')$ where, if no flush transition was fired along $x$, then $s = i$ and $S'$ is the set of local states occurring in $g$; and if $p$ is the process to last fire a flush transition along $x$, then $s$ is the local state of $p$ in $g$ and $S'$ is the set of local states in $g$ of the remaining processes.

Given a template $U$, the abstract history graph $\mathcal{A}_U$ is a transition diagram defined over tuples of the form $(s, S')$ with $s \in S$ and $S' \subseteq S$. For a finite computation $x$ of $U^n$ ending in $g$, for some $n$, we will show how to map $x$ onto a tuple of the form $(s, S')$. This mapping depends not only on the global state $g$ but also on $x$, viz., the history of the computation leading to $g$, and thus the term abstract history graph. Essentially, in tuple $(s, S')$, state $s$ records the local state in $g$ of the process executing the last flush along $x$, whereas $S'$ is a superset of the set of the local states of the remaining processes. This dichotomy is justified on the basis of the fact that we can pump up the multiplicity of each local state in $S'$ to any desired value, except possibly of the current local state in $g$ of the process to last execute a flush along $x$, which could have multiplicity exactly one, as we later show.

We now define the transition relation. Towards that end, given a tuple $(s, S')$ and a local or broadcast send transition $tr$, we define the successor of $(s, S')$ via $tr$ as either the state-successor, denoted by state-succ($(s, S')$, $tr$), or the set-successor, denoted by set-succ($(s, S')$, $tr$). As
mentioned above, we think of $(s, S')$ as a state with finite but arbitrarily many copies of each state in $S'$ plus one copy of $s$. The case of the state-successor captures the scenario when a process in local state $s$, which possibly has multiplicity only one, fires $tr$, while the case of the set-successor captures the scenario when a process in a local state of $S'$, with arbitrarily large multiplicity, fires the enabled transition $tr$.

Definition (state-successor) Let $(s, S')$ be a tuple and let transition $tr$ from $s$ to $t$, if labeled by a guard, be enabled in $(s, S')$. Then state-succ($(s, S')$, $tr$) $= (t, S'')$ where, if $tr$ is a local transition, then $S'' = S'$, and if $tr$ is a broadcast send transition, then $S'' = \{t' : \exists s' \in S'$ such that $(s', a??, t')$ is the matching receive for $tr\}$.

As an example, since firing the transition (I, PrRd!!, S) of the MSI protocol affects only processes in state M, by causing them to transit to state S, the state-successor of a tuple $(I, S')$ via this transition is $(S, S'')$, where $S''$ is $S'$ with M replaced by S.

Fig. 2. The abstract history graph for the MSI Cache Coherence Protocol

Definition (set-successor) Let $(s, S')$ be a tuple and let transition $tr$ from $u$ to $v$, where $u \in S'$, be such that if $tr$ is labeled by a guard then it is enabled in $(s, S')$. Then set-succ($(s, S')$, $tr$) is defined as the tuple

- $(v, S'')$, where $S''$ consists of the states into which the matching receives for $tr$ drive the other processes, if $tr$ is a flush transition;
- $(s, S' \cup \{v\})$ if $tr$ is a local transition. Note that since we had arbitrarily many copies of $u$ to start with, even after firing the local transition we are guaranteed arbitrarily many processes in local state $u$, which is therefore not excluded from the second component of the resulting tuple;
- $(s'', S'' \cup \{v\})$ if $tr$ is a low-push broadcast transition, where $(s, a??, s'')$ is the (unique) matching receive for $tr$ from $s$ and $S'' = \{t' : \exists s' \in S'$ such that $(s', a??, t')$ is the matching receive for $tr\}$. As
in the previous case, since we have arbitrarily many copies of the source state $u$ of $tr$, in the second component we include the local state that results from firing the matching receive for $tr$ from $u$, which by definition of a low-push transition (and the fact that $u$ is not strictly higher than the target of $tr$) is $u$ itself.

As an example, since firing a PrWr!! transition of the MSI protocol flushes every other process into state I, the resulting set-successor is (M, {I}).

We now formally define the abstract history graph of template $U$.

Definition (Abstract History Graph) Given template $U$, the abstract history graph of $U$ is defined to be the transition diagram with node set $S \times 2^S$ and an edge from tuple $a$ to tuple $b$ whenever $b$ = state-succ($a$, $tr$) or $b$ = set-succ($a$, $tr$) for some local or broadcast send transition $tr$ of $U$.

As an example, the abstract history graph for the MSI protocol is shown in figure 2. Self loops are omitted for the sake of simplicity. For convenience, we have labeled each transition of the graph by the label of the transition responsible for "firing" it. Note that, as opposed to the classical construction, given a tuple and a transition $tr$, both the set-successor and the state-successor of the tuple via $tr$ are uniquely defined. This is because, as will be shown in proposition 3.3, we can have arbitrarily many copies of each state in the second component of the tuple, thereby alleviating the problem of considering the different successors that may arise from concrete states with different counts of local states, as was the case with the classical abstract graph construction. This permits us to give exact path correspondences between the parameterized family of concrete systems and the abstract
history graph, as we now show. Since we are dealing with systems of "disjunctive" nature, having (arbitrarily many) extra copies does not disable any transitions.

Given a finite computation $x$ of $U^n$, the precise mapping of $x$ onto a tuple of $\mathcal{A}_U$ is given by the $\mathcal{A}$-representative of $x$, denoted by $\mathcal{A}$-rep($x$).

Definition ($\mathcal{A}$-representative) Let $x$ be a finite computation path of $U^n$. Then we define the $\mathcal{A}$-representative of $x$, denoted by $\mathcal{A}$-rep($x$), as the tuple defined as follows: if $x$ consists of the initial state alone, then $\mathcal{A}$-rep($x$) $= (i, \{i\})$; else suppose that the last step of $x$, extending a shorter computation $y$, is initiated by transition $tr$ of $U$ fired locally by process $p$, and let $q$ be the process to last execute a flush transition in $y$. Then $\mathcal{A}$-rep($x$) is the state-successor of $\mathcal{A}$-rep($y$) via $tr$ if $p = q$, and the set-successor of $\mathcal{A}$-rep($y$) via $tr$ otherwise.

The tuple rep($x$) specifies the actual set of states present in the global state $g$ reached by following path $x$ through $U^n$. In contrast, the $\mathcal{A}$-representative $\mathcal{A}$-rep($x$) incorporates not only the local states present in $g$ but also the states that could potentially be present, given sufficiently many processes, in a global state that results from firing (a stuttering of) the same local transitions as were fired along $x$ to get to $g$. Thus, $\mathcal{A}$-rep($x$) drags along some "history" of the computation leading to $g$ and thereby stores more information than rep($x$). This is formalized as follows.

Proposition 3.1 (Containment Property) Given $x$ such that rep($x$) $= (s, S')$ and $\mathcal{A}$-rep($x$) $= (t, T')$, we have $s = t$ and $S' \subseteq T'$.

We now establish a "path correspondence" between finite computations of $U^n$ and finite paths of $\mathcal{A}_U$ starting at $(i, \{i\})$.

Proposition 3.2 (Projection) For any finite path $x$ in $U^n$, there exists a finite path in $\mathcal{A}_U$ starting at $(i, \{i\})$ and ending at $\mathcal{A}$-rep($x$).

For the other direction, we have

Proposition 3.3 (Lifting) Let $y$ be a path of $\mathcal{A}_U$ starting at $(i, \{i\})$ and leading to tuple $(s, S')$ of $\mathcal{A}_U$. Then, given $k$, there exists a finite computation $x$ of $U^n$, for some $n$, such that rep($x$) $= (s, S')$ and the final state of $x$ has at least $k$ copies of each state in $S'$ plus a copy of $s$.

Combining the previous three results, we have

Theorem 3.4 (Decidability Result) Pair $(a, b)$ is pairwise reachable iff there exists a path in $\mathcal{A}_U$ starting at $(i, \{i\})$ to a tuple of the form $(s, S')$ where either $s = a$ and $b \in S'$, or $s = b$ and $a \in S'$, or $a \in S'$ and $b \in S'$.

Thus we have reduced the problem of pairwise reachability for a pair of local states of a given template $U$ to the problem of reachability in the abstract history graph constructed from $U$. Since the size of the abstract graph is at most exponential in $|U|$, we have

Corollary 3.5 The pairwise reachability problem for a pair of local states of a given template $U$ can be solved in time exponential in $|U|$, where $|U|$ is the size of template $U$ as measured by the number of states and transitions in $U$.
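
The reduction to graph reachability can be sketched for the MSI protocol as follows. This is a minimal sketch under one reading of the successor definitions, not the authors' implementation: the encoding of the template (`LOCAL`, `SENDS`, `RECV`, `FLUSHES`), the use of bus transaction names as broadcast labels, and the choice to let the previously distinguished process also receive a flush are all assumptions made for illustration. Self-loop local transitions are omitted because they do not change any tuple.

```python
from collections import deque

# Assumed encoding of the MSI template, with pre-order I <= S <= M.
INIT = "I"
LOCAL = [("S", "I"), ("M", "I")]  # block replacement (initialization)
SENDS = [("I", "BusRd", "S"), ("I", "BusRdX", "M"), ("S", "BusRdX", "M")]
RECV = {
    "BusRd": {"I": "I", "S": "S", "M": "S"},   # low-push with respect to S
    "BusRdX": {"I": "I", "S": "I", "M": "I"},  # flush into I
}
FLUSHES = {"BusRdX"}


def image(label, states):
    """States reached when every state in `states` takes its receive."""
    return frozenset(RECV[label][q] for q in states)


def successors(node):
    """State-successors and set-successors of a tuple (s, T)."""
    s, T = node
    out = set()
    for u, v in LOCAL:
        if u == s:
            out.add((v, T))          # the single copy of s moves
        if u in T:
            out.add((s, T | {v}))    # one of many copies of u moves
    for u, label, v in SENDS:
        if u == s:
            out.add((v, image(label, T)))
        if u in T:
            if label in FLUSHES:
                # The sender becomes the distinguished process.
                out.add((v, image(label, T | {s})))
            else:
                out.add((RECV[label][s], image(label, T) | {v}))
    return out


def reachable_tuples():
    """Breadth-first exploration from the initial tuple (I, {I})."""
    start = (INIT, frozenset({INIT}))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def pairwise_reachable(a, b, tuples):
    """Check the three cases of the decidability condition."""
    return any(
        (s == a and b in T) or (s == b and a in T) or (a in T and b in T)
        for s, T in tuples
    )


TUPLES = reachable_tuples()
```

Under this encoding only a handful of tuples are reachable, and neither `pairwise_reachable("M", "M", TUPLES)` nor `pairwise_reachable("M", "S", TUPLES)` holds, which is the coherence property one expects of MSI. Only tuples reachable from the initial tuple are explored, so the full exponential node set is never materialized.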

Note that in the construction of $\mathcal{A}_U$ it suffices to consider only the set of tuples reachable from the initial tuple $(i, \{i\})$. In practice, the number of states of this graph may be much smaller than the worst case scenario, where it could be exponential in the number of template states. This is illustrated clearly by our experimental results in section 4.2.

3.2 Adding the Specialized Conjunctive Guard

To reason about systems wherein the templates are augmented with the specialized conjunctive guard, along with the assumption of initializability, we use a modification of the abstract history graph. Broadly speaking, the intuition behind the modification is that we can make the specialized conjunctive guard of a process evaluate to true
starting at any global state, by driving all the other processes into their respective initial states by making use of the local initializing transition mentioned above. Thus for every tuple $(s, S')$ in the abstract history graph, we add a transition of the form $(s, S') \to (t, \{i\})$, where either $t = s$ or $t \in S'$.

Definition (Modified Abstract History Graph) Given a template $U$ and its abstract history graph, define the modified abstract graph to be the graph over the same tuples whose transition relation is the set of all transitions $(s, S') \to (t, T')$ where one of the following holds:

- $T' = \{i\}$ and either $t = s$ or $t \in S'$. This transition corresponds to the successive firing of the local initializing transitions that leave one process in state $t$ and the rest of the processes in their initial states, thereby enabling the conjunctive guard labeling its transitions.
- The transition arises from a template transition labeled by the conjunctive guard, fired from a tuple in which all other processes are in their initial states. This corresponds to the firing of a transition labeled with the conjunctive guard.
- The transition arises from a template transition $tr$, labeled either by the disjunctive guard or by true, such that either $(t, T')$ = state-succ($(s, S')$, $tr$) or $(t, T')$ = set-succ($(s, S')$, $tr$). This corresponds to the firing of transitions labeled with the disjunctive guard or true.

Then, as in section 3.1, we can show a "path correspondence" between concrete finite computations of $U^n$ and finite paths in the modified abstract graph starting at $(i, \{i\})$. The proofs are similar and are therefore omitted. Thus, as in section 3.1, we have the following decidability result, from which it follows, as before, that for this model of computation pairwise reachability can be decided in time exponential in $|U|$, where $|U|$ is the size of the template $U$.

Theorem 3.6 (Decidability Result) Pair $(a, b)$ is pairwise reachable iff there exists a path in the modified abstract graph starting at $(i, \{i\})$ to a tuple of the form $(s, S')$ where either $s = a$ and $b \in S'$, or $s = b$ and $a \in S'$, or $a \in S'$ and $b \in S'$.

3.3 Generating Error Traces

A critical part of the verification process, once an error is detected, is the generation of a concrete computation of the system at hand leading to an erroneous global state.

Fig. 3. The template for the Broken MSI Protocol and its abstract history graph

Till now we have shown how to reduce the verification process for safety properties of the parameterized version of a given cache protocol to reachability analysis over the corresponding abstract history graph. This only allows us to detect an erroneous state in the abstract history graph and thereby construct a path in the abstract graph to an erroneous state. To get back a concrete computation of an instance of an original system leading to a concrete erroneous state, we make use of the construction used in proving proposition 3.3. Given a path $y$ starting at the initial tuple $(i, \{i\})$ and leading to an erroneous tuple $(s, S')$ of the abstract history graph, this construction can be used to give a fully automated procedure to construct a finite computation $x$ of a concrete system $U^n$, for some $n$, ending in a state $g$ such that rep($x$) $= (s, S')$. In general, $x$ is of size linear in the length of $y$, viz., exponential in $|U|$ in
the worst case. But, as mentioned above, in practice the number of states of the abstract history graph reachable from its initial state tends to be small, and consequently so does the length of $y$. The ability to automatically generate error traces distinguishes our work from [9], where no effective way to generate error traces was given.

We now illustrate the construction with a broken version of the MSI protocol (figure 3). The MSI protocol is clobbered by replacing the flush transition labeled with PrWr from the shared state to the modified state by a low-push transition labeled with PrWr. In the abstract history graph, self loops are omitted for simplicity reasons and erroneous tuples are shaded. Note that the erroneous pair (M, S) can be reached via a path of the abstract history graph obtained by firing a transition labeled with PrRd followed by a transition labeled with PrWr. From this path we can get back a concrete computation of a system with two caches by firing transitions labeled with PrRd, PrRd and PrWr in the order listed, a stuttering of the sequence PrRd, PrWr. The resulting concrete computation is:

$(I, I) \xrightarrow{PrRd_1} (S, I) \xrightarrow{PrRd_2} (S, S) \xrightarrow{PrWr_1} (M, S)$

Here a symbol $a_j$ labeling a transition indicates that process $j$ fires a transition of template $U$ labeled with $a$.
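
The concrete error trace can be replayed mechanically. The following is a minimal sketch, assuming a hypothetical encoding of the broken template: the label `BrokenWrite` stands for the clobbered write out of the shared state, whose receives are all self-loops, and process indices are zero-based. None of these names come from the paper.

```python
# Receive tables: how a cache in each state reacts to a bus transaction.
RECV = {
    "BusRd": {"I": "I", "S": "S", "M": "S"},        # ordinary read miss
    "BusRdX": {"I": "I", "S": "I", "M": "I"},       # flush, still used from I
    "BrokenWrite": {"I": "I", "S": "S", "M": "M"},  # clobbered S -> M write
}
# Send table: (local state, processor event) -> (new state, bus transaction).
SEND = {
    ("I", "PrRd"): ("S", "BusRd"),
    ("I", "PrWr"): ("M", "BusRdX"),
    ("S", "PrWr"): ("M", "BrokenWrite"),
}


def fire(global_state, proc, event):
    """Process `proc` sends; every other process takes the matching receive."""
    new_local, bus = SEND[(global_state[proc], event)]
    return tuple(
        new_local if j == proc else RECV[bus][q]
        for j, q in enumerate(global_state)
    )


def replay(n, schedule):
    """Run a schedule of (process, event) steps from the all-invalid state."""
    g = ("I",) * n
    trace = [g]
    for proc, event in schedule:
        g = fire(g, proc, event)
        trace.append(g)
    return trace


# Two caches: both read the block, then the first one writes it.
trace = replay(2, [(0, "PrRd"), (1, "PrRd"), (0, "PrWr")])
```

The final state of `trace` has one cache in M while the other still holds the block in S, which is the incoherent pair the abstract graph flags. Swapping `BrokenWrite` back to `BusRdX` in the `("S", "PrWr")` entry restores the flush and the second cache is invalidated instead.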

3.4 Automatic Construction of Pre-order

In practice, one can usually obtain the natural pre-order by drawing the diagram in levels, reflecting how tightly a memory block in a given cache state is bound to the processor. Such levels are used in the textbook by Culler et al. [8]. If not, we can efficiently exhibit a feasible pre-order that can be imposed, or determine that none exists. We proceed by constructing a labeled, directed graph $G$ whose nodes are the states of $U$ and whose edge set is $E$. An edge from $s$ to $t$ carries one of three labels, indicating respectively that $s \le t$, that $s < t$, or that $s$ is not strictly higher than $t$. We construct $G$ as follows.

1. Initially, $E$ contains an edge labeled $\le$ from $i$ to every state $s$. This is because of the assumption we made in the system model that for each $s \in S$ we have $i \le s$.

2. For each non-local, non-flush broadcast send transition $tr$ from $s$ to $t$ we have $s \le t$. Thus we augment $E$ by adding an edge labeled $\le$ from $s$ to $t$. Furthermore, if $(s', a??, t')$ is a matching receive for $tr$ such that $s' \ne t'$, then we have that $t < s'$ and $t' \le t$, and so we add the two corresponding edges to $E$. On the other hand, if $(s', a??, s')$ is a matching receive for $tr$, then we have that $s'$ is not strictly higher than $t$, and so we add the corresponding edge to $E$.

If $E$ already contains an edge from $s$ to $t$ with a weaker label, then in case we add an edge from $s$ to $t$ with a stronger label in the above step, we remove the weaker edge to ensure that there is only one such edge from $s$ to $t$. Let $G'$ be the subgraph of $G$ that we get by deleting all edges labeled "not strictly higher". Then we can impose a pre-order on the states of $U$ compatible with its transitions iff (1) there does not exist a cycle in $G'$ containing an edge labeled with $<$, and (2) for each edge of $G$ labeled "not strictly higher", from $s$ to $t$ say, there do not exist two distinct maximal strongly connected components of $G'$, one containing state $s$ and the other one containing state $t$, such that there is a path from $t$ to $s$ in $G'$. Since the maximal strongly connected components of $G'$ can be constructed in time linear in the
size of viz., linear in therefore the abo mentioned conditions and can be check ed in time quadratic in the size of Thus we can decide in O( ,+ time whether desired pre-order can be imposed on or not. pplications As applications, we consider model checking parameterized ersions of all of the snoop based cache protocols presented in [19]. The translation from the state transition dia- gram of gi en protocol to its template is straightforw ard and syntactic and can be performed in the same mechanical ashion as as done for the MSI protocol in section 2.1: Firing bold transition labeled with

and/or one that requires that no other cache currently possesses the desired memory block does not af fect the status of the memory block in an other cache. Such transition is therefore labeled with the lo- cal transition label and in the second case also guarded with the Otherwise, transition labeled by where is labeled with the broadcast send label * Flush broadcast send transitions can be identiﬁed syntactically as all their matching recei es from ery non-initial state transit to unique state with the matching recei from self- looping on itself. Local transitions can be

identified by the absence of matching receives.
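The syntactic identification described above can be sketched in a few lines of Python. This is a minimal sketch under one reading of the rule, not the authors' implementation: a send label with no matching receives is treated as local; a send whose matching receives leave every other cache in one common state (omitted receives being treated as self-loops) is treated as a flush send; anything else is treated as a low-push send. The function name `classify_sends`, the dictionary layout, the `Hit` label, and the example labels and states are illustrative assumptions.

```python
def classify_sends(states, send_labels, receives):
    """Classify each send label as 'local', 'flush' or 'low-push'.

    receives maps a label to {source_state: target_state}. A state with no
    explicit entry is assumed to self-loop on that receive (an assumption).
    """
    kinds = {}
    for label in send_labels:
        explicit = receives.get(label, {})
        if not explicit:
            # No matching receive anywhere: a local transition.
            kinds[label] = "local"
            continue
        # Where does each possible receiver end up after the broadcast?
        targets = {explicit.get(s, s) for s in states}
        kinds[label] = "flush" if len(targets) == 1 else "low-push"
    return kinds


# Hypothetical encoding of MESI-style matching receives.
STATES = ["I", "S", "E", "M"]
RECEIVES = {
    "PrRd": {"M": "S", "E": "S"},
    "PrWr": {"M": "I", "E": "I", "S": "I"},
}
KINDS = classify_sends(STATES, ["Hit", "PrRd", "PrWr"], RECEIVES)
print(KINDS)
```

Treating omitted receives as self-loops mirrors the convention that self-looping receive transitions are not drawn in the diagrams; with a different convention the flush test would need to be adjusted.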


[Figure: state transition diagram with edges labeled PrRd, PrWr, BusRd, BusRd(S), BusRdX and Flush, and the template with edges labeled PrRd!!, PrRd??, PrWr!! and PrWr??]

Fig. 4. The Illinois MESI Cache Coherence Protocol and its template

while every transition generated by bus actions (represented by dashed lines) and labeled with is labeled with the matching broadcast receive label If firing the transition additionally requires some other

cache to possess the desired memory block, then it is also guarded by a guard expressing that requirement. Below we consider only the Illinois MESI protocol in detail, with some others being handled in the full report [10].

4.1 The Illinois MESI Cache Coherence Protocol

The transition diagram and the template for the Illinois MESI cache coherence protocol are shown in figure 4. Formally, the template is defined as a tuple with state set $\{I, S, E, M\}$, the pre-order ordering the states as $I$, $S$, $E$, $M$, and label set $\{\epsilon, \mathit{PrRd}!!, \mathit{PrRd}??, \mathit{PrWr}!!, \mathit{PrWr}??\}$. The transitions are as defined below.

Empty Broadcasts (Local Transitions): $(M, \epsilon, I)$, $(E, \epsilon, I)$, $(S, \epsilon, I)$, $(I, \epsilon, E)$, $(S, \epsilon, S)$, $(E, \epsilon, E)$, $(M, \epsilon, M)$
Note that the first three transitions are included because of the assumption of initializability and are, for simplicity, not shown in the figure, nor are broadcast receive transitions that are self loops.

Low-push sends: $(I, \mathit{PrRd}!!, S)$
Low-push receives: $(M, \mathit{PrRd}??, S)$, $(E, \mathit{PrRd}??, S)$
Flush sends: $(I, \mathit{PrWr}!!, M)$, $(S, \mathit{PrWr}!!, M)$
Flush receives: $(M, \mathit{PrWr}??, I)$, $(E, \mathit{PrWr}??, I)$, $(S, \mathit{PrWr}??, I)$

The transitions $(I, \mathit{PrRd}!!, S)$ and $(I, \epsilon, E)$ are guarded, respectively, by the condition that some other cache possesses the memory block and by its negation, with the rest of the transitions being labeled with the true guard. We need to decide whether the following pairs are pairwise reachable: $(M, M)$, $(M, E)$, $(M, S)$, $(E, E)$, $(E, S)$.
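To make the pairwise reachability question concrete, the following sketch explores the template by brute force for a small, fixed number of caches. It is not the abstract history graph construction of this paper and says nothing about arbitrary numbers of caches; it only illustrates what it means for a forbidden pair to be reachable. The transition tuples, the guard names `some` and `none`, the reading of "possesses the block" as "is not in state I", and all function names are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations

# Local transitions: (source, target, guard).
# Guard "none" (assumed name) = no other cache holds the block.
LOCAL = [("M", "I", None), ("E", "I", None), ("S", "I", None),
         ("I", "E", "none"), ("S", "S", None), ("E", "E", None),
         ("M", "M", None)]
# Broadcast sends: (source, label, target, guard).
# Guard "some" (assumed name) = some other cache holds the block.
SENDS = [("I", "PrRd", "S", "some"), ("I", "PrWr", "M", None),
         ("S", "PrWr", "M", None)]
# Matching receives; states that are not listed self-loop.
RECEIVES = {"PrRd": {"M": "S", "E": "S"},
            "PrWr": {"M": "I", "E": "I", "S": "I"}}
BAD_PAIRS = {("M", "M"), ("M", "E"), ("M", "S"), ("E", "E"), ("E", "S")}


def guard_holds(state, k, guard):
    """Evaluate a guard for cache k against all the other caches."""
    held_elsewhere = any(s != "I" for j, s in enumerate(state) if j != k)
    if guard == "some":
        return held_elsewhere
    if guard == "none":
        return not held_elsewhere
    return True


def successors(state):
    """Yield every global state reachable from `state` in one step."""
    for k, local in enumerate(state):
        for src, dst, guard in LOCAL:
            if src == local and guard_holds(state, k, guard):
                yield state[:k] + (dst,) + state[k + 1:]
        for src, label, dst, guard in SENDS:
            if src == local and guard_holds(state, k, guard):
                recv = RECEIVES[label]
                # The sender moves to dst; every other cache fires its receive.
                yield tuple(dst if j == k else recv.get(s, s)
                            for j, s in enumerate(state))


def reachable(n):
    """Breadth-first search from the state in which all n caches are in I."""
    start = ("I",) * n
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def violations(n):
    """Reachable global states in which two caches form a forbidden pair."""
    return [g for g in reachable(n)
            if any((a, b) in BAD_PAIRS or (b, a) in BAD_PAIRS
                   for a, b in combinations(g, 2))]
```

Calling `violations(n)` for a small `n` returns the reachable global states that contain one of the five pairs; an empty list means none of them is reachable for that particular `n`. The cost grows exponentially with `n`, which is exactly the limitation a parameterized method is meant to avoid.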


4.2 Experimental Results

Here we summarize the results for a wide range of examples of cache coherence protocols. For detailed descriptions of these protocols refer to [19]. The column giving the number of abstract states refers to the number of reachable states

in the abstract history graph for protocols that don't use conjunctive guards, viz., MSI, MESI, MOESI, Berkeley and N+1; and in the modified abstract history graph for ones that use conjunctive guards, viz., Illinois-MESI, Firefly and Dragon. It is worth noting that although the worst-case number of reachable abstract states in the modified abstract history graph corresponding to the template could be very large, in practice it typically turns out to be much smaller. For instance, in the MESI protocol, the number of reachable abstract states was 6, against a worst case

possibility of states. A similar scenario holds for the other protocols. Thus, in conclusion, the abstract history graph construction seems to work well in practice.

The experiments were carried out on a machine with a 797MHz Intel Pentium III processor and 256 Mb RAM. Below we tabulate the results for a variety of cache coherence protocols. The user time for verifying each of the cache coherence protocols was a fraction of a second.

Protocol    Pre-Order                                                                of Abstract States
MSI         Invalid, Shared, Modified
MESI        Invalid, Shared, Exclusive, Modified
Illinois    Invalid, Shared, Exclusive, Modified
MOESI       Invalid, Owned, Shared, Exclusive, Modified
N+1         Invalid, Valid, Dirty
Berkeley    Invalid, Owned Non-exclusively, Unowned; Unowned, Owned Exclusively
Firefly     Invalid, Shared, Dirty, Valid Exclusive
Dragon      Invalid, Shared Clean, Shared Modified, Exclusive; Exclusive, Modified

5 Concluding Remarks

The generally undecidable PMCP has received a good deal of attention in the literature. A number of interesting proposals have been put forth, and successfully applied to certain examples ([7, 6, 26, 20, 2, 3, 27, 21]). Most of these works, however, suffer from the drawbacks of

being either only partially automated or being sound but not guaranteed complete. Much human ingenuity may be required to develop, e.g., network invariants; the method may not terminate; the complexity may be intractably high; and the underlying abstraction may only be conservative, rather than exact. (However, for frameworks that handle specialized application domains, decision procedures can be given that are both sound and complete, fully automatic, and in some cases efficient ([13, 15, 11, 12, 5, 24]).) Similar limitations apply to prior work on PMCP for cache protocols. Pong and Dubois [25] described methods that were sound but not complete, as they were based on conservative, inexact abstractions. In [14] a general framework of parameterized broadcast protocols was introduced and it was shown how certain simple cache protocols could be modeled. That framework, however, did not admit guarded transitions, necessary to model many cache protocols such as Illinois (MESI). In [16], it was shown that PMCP for safety over such broadcast protocols of [14] is decidable using the general backward reachability procedure of [1]. However,

the backward reachability algorithm of [1] that [16] makes use of, although general, suffers from the handicap that the best known bound for its running time is not known to be primitive recursive [23]. In [22], Maidl, using a proof tree based construction, shows decidability of the PMCP for a broad class of systems including broadcast protocols, but again the decision procedure is not known to be primitive recursive. Moreover, [22, 16, 14] do not report experimental results for cache protocols. More recently, Delzanno [9] uses arithmetical constraints to model global states of systems with many

identical caches. This method uses invariant checking via the backward reachability analysis of [1] and provides a broad framework for reasoning about cache coherence protocols, but the procedure does not terminate on some examples. Furthermore, this technique does not provide a way to generate error traces when a bug is detected. In [17], it was shown that for a subclass of broadcast protocols called entropic broadcast protocols, a generalization of the Karp-Miller procedure for Petri nets terminates. While mathematically elegant, the model does not allow for boolean guards necessary for modeling

protocols like Illinois-MESI, Firefly and Dragon. Also, no explicit bounds were provided on the size of the resulting coverability tree (cf. [23]). In this paper we have exploited the hierarchical organization inherent in the design of snoopy cache protocols, representing and generalizing this organization using pre-orders. We then present a specialized variant of the broadcast protocols model called pre-ordered protocols, tailored to capture snoopy cache coherence protocols. This has allowed us to provide a unified, fully automated and efficient method to reason about

parameterized snoopy cache coherence protocols. Our method is unique in meeting all these important criteria: (a) it is sound and complete; (b) it is algorithmic; (c) it is rapid, meaning reasonably efficient in principle: worst case complexity single exponential; (d) it has broad modeling power: handles all examples from Handy's book; (e) it is rapid, also meaning demonstrably efficient in experimental practice: each example protocol was verified for parameterized correctness in a fraction of a second; and (f) it caters for error trace recovery.

References

1. P. A. Abdulla, K. Cerans, B. Jonsson, Y.-K. Tsay. General Decidability Theorems for Infinite State Systems. LICS 1996.
2. P. A. Abdulla, A. Bouajjani, B. Jonsson and M. Nilsson. Handling global conditions in parameterized systems verification. CAV 1999.
3. P. A. Abdulla and B. Jonsson. On the existence of network invariants for verifying parameterized systems. In Correct System Design: Recent Insights and Advances, 1710, LNCS, pp 180-197, 1999.
4. K. Apt and D. Kozen. Limits for automatic verification of finite-state concurrent systems. Information Processing Letters 15, pages 307-309, 1986.
5. T. Arons, A. Pnueli, S. Ruah, J. Xu and L. Zuck. Parameterized Verification with Automatically Computed Inductive Assertions. CAV 2001, LNCS 2102, 2001.
6. M.C. Browne, E.M. Clarke and O. Grumberg. Reasoning about Networks with Many Identical Finite State Processes. Information and Control 81(1), pages 13-31, April 1989.
7. E.M. Clarke, O. Grumberg and S. Jha. Verifying Parameterized Networks using Abstraction and Regular Languages. CONCUR. LNCS 962, pages 395-407, Springer-Verlag, 1995.
8. D. E. Culler and J. P. Singh. Parallel Computer Architecture: A Hardware/Software Approach. Morgan Kaufmann Publishers, 1998.
9. G. Delzanno. Automatic Verification of Parameterized Cache Coherence Protocols. CAV 2000, 51-68.
10. E.A. Emerson and V. Kahlon. This paper, full version. Available at http://www.cs.utexas.edu/users/emerson,kahlon/tacas03/
11. E.A. Emerson and V. Kahlon. Reducing Model Checking of the Many to the Few. CADE-17. LNCS, Springer-Verlag, 2000.
12. E.A. Emerson and V. Kahlon. Model Checking Large-Scale and Parameterized Resource Allocation Systems. TACAS, 2002.
13. E.A. Emerson and K.S. Namjoshi. Reasoning about Rings. POPL. pages 85-94, 1995.
14. E.A. Emerson and K.S. Namjoshi. On Model Checking for Non-Deterministic Infinite-State Systems. LICS 1998.
15. E.A. Emerson and K.S. Namjoshi. Automatic Verification of Parameterized Synchronous Systems. CAV, LNCS, Springer-Verlag, 1996.
16. J. Esparza, A. Finkel and R. Mayr. On the Verification of Broadcast Protocols. LICS 1999.
17. A. Finkel and J. Leroux. A finite covering tree for analyzing entropic broadcast protocols. Proc. VCL 2000. Report DSSE-TR-2000-6, Univ. Southampton, GB.
18. S.M. German and A.P. Sistla. Reasoning about Systems with Many Processes. JACM 39(3), July 1992.
19. J. Handy. The Cache Memory Book. Academic Press, 1993.
20. R. P. Kurshan and K. L. McMillan. A Structural Induction Theorem for Processes. PODC. pages 239-247, 1989.
21. D. Lesens, N. Halbwachs and P. Raymond. Automatic Verification of Parameterized Linear Networks of Processes. POPL 1997. pp 346-357, 1997.
Parallel Coordination Programs I. Acta Informatica 21, 1984.
22. M. Maidl. A Unifying Model Checking Approach for Safety Properties of Parameterized Systems. CAV 2001.
23. K. McAloon. Petri Nets and Large Finite Sets. Theoretical Computer Science 32, pp. 173-183, 1984.
24. A. Pnueli, S. Ruah and L. Zuck. Automatic Deductive Verification with Invisible Invariants. TACAS 2001, LNCS, 2001.
25. F. Pong and M. Dubois. A New Approach for the Verification of Cache Coherence Protocols. IEEE Transactions on Parallel and Distributed Systems, Vol. 6, No. 8, August 1995.
26. A. P. Sistla. Parameterized Verification of Linear Networks Using Automata as Invariants. CAV 1997.
27. P. Wolper and V. Lovinfosse. Verifying Properties of Large Sets of Processes with Network Invariants. In J. Sifakis (ed.), Automatic Verification Methods for Finite State Systems, Springer-Verlag, LNCS 407, 1989.
