Optimal Algorithms for Substrate Testing in Multi-Chip Modules

Andrew B. Kahng, Gabriel Robins and Elizabeth A. Walkup

Department of Computer Science, UCLA, Los Angeles, CA 90024-1596
Department of Computer Science, University of Virginia, Charlottesville, VA 22903-2442
Department of Computer Science, University of Washington, Seattle, WA 98195

Andrew B. Kahng is supported by NSF Young Investigator Award MIP-9257982. Gabriel Robins is supported by NSF Young Investigator Award MIP-9457412. Elizabeth A. Walkup is supported by an NSF Graduate Fellowship. Additional related papers may be found at http://www.cs.virginia.edu/~robins/ and http://ballade.cs.ucla.edu:8080/~abk/.

Abstract

Multi-chip module (MCM) packaging techniques present several new technical challenges, notably substrate testing. We formulate MCM substrate testing as a problem of connectivity verification in trees via $k$-probes, and present a linear-time algorithm which computes a minimum set of probes achieving complete open fault coverage. Since actual substrate testing also involves scheduling probe operations, we formulate efficient probe scheduling as a special type of metric traveling salesman optimization and give a provably-good heuristic. Empirical results using both random and industry benchmarks demonstrate reductions in testing costs of up to 21% over previous methods. We conclude with generalizations to alternate probe technologies and several open problems.

Keywords: MCMs, testing, graph algorithms, fault detection, circuit probing, TSP

1 Introduction

Multi-chip module (MCM) technology has recently emerged as an economically viable means for packaging complex, high-performance systems [2] [11] [21] [24]. Traditionally, system performance is limited by interconnection delays at the upper levels of the hierarchy (e.g., printed circuit board or backplane), and may be improved by increasing circuit density and die size. However, as we approach wafer-scale integration, poor manufacturing yield and incompatibility with mixed technologies make such a monolithic system implementation unattractive. The MCM approach resolves this dilemma, improving circuit density and yield while decreasing interconnect delay.

MCMs eliminate individual integrated circuit (IC) packages, allowing dies to be situated closer together. This shortens interconnect length and enables up to a three-fold increase in clock frequency, a seven-fold decrease in area, and a 30% decrease in power consumption over the best values achievable using high-density printed circuit boards (PCBs) [8]. A typical MCM (see Figure 1(a)) consists of a substrate containing inter-chip wiring, upon which are mounted a number of bare die. The MCM substrate is made of silicon, alumina, or cofired ceramic, and usually consists of multiple layers (up to thirty or more wiring layers). The bare die are bonded to pads on the upper-most "chip layer" of the substrate using solder bumps or tape-automated bonding (TAB) technology [22].

Figure 1: An example of a multi-chip module, showing the underlying substrate containing interconnect, as well as several mounted die.

The increased use of multi-chip module packaging for large, high-performance systems has focused attention on several new and challenging CAD problems, especially those related to layout, thermal reliability, and testing [9] [21]. Testing in particular presents one of the most persistent challenges of the MCM approach [1] [11] [24]. It is desirable to discover defects in the MCM substrate as early as possible, since the cost of locating and fixing a system fault increases geometrically with each successive stage of the system manufacturing and marketing process. Certainly, the fully-assembled MCM package can be tested using combinatorial IC testing techniques. However, the pre-assembly MCM substrate contains a set of disjoint wiring connections with no active devices; thus, the substrate cannot be tested using conventional techniques. With this in mind, we address verification of electrical connectivity in MCM substrates.

We model interconnect in the MCM substrate as follows. A net is a set of pins that are to be electrically connected. Each signal net is routed on multiple routing layers using a tree topology, where we assume without loss of generality that each leaf is a net terminal, each edge is a wire segment on a single wiring layer, and each internal node is a via between two or more routing layers (Figure 2). We wish to verify that the routing topology of each net is properly implemented, with no faults. Two
fault classes are of interest in MCM substrate testing: open faults, and short faults. An open fault is an electrical disconnection between two points that are to be connected. As will be discussed in Section 2 below, there are two types of open faults: wire opens, which correspond to edge failures in the tree topology, and cracked vias, which correspond to a physical form of node failure which arbitrarily disconnects subtrees and is not necessarily detected by tests designed to cover wire opens. A short fault is defined to be an electrical connection between two nets that are not intended to be connected.

Figure 2: A sample net (left) and its corresponding tree representation (right); pins become leaf nodes while vias become internal nodes. (Pins are labeled L1 to L4 and vias V1, V2.)

Traditional methods for connectivity checking involve either parallel probing of the circuit, or combinatorial exercising of the logic, neither of which apply to MCM substrate testing [2]. In verifying connectivity for PCBs, a bed-of-nails tester will simultaneously access every grid point, yielding an efficient, parallel checking procedure. However, this idea cannot be applied to MCMs as feature sizes are too small to allow such a grid-based methodology. A combinatorial approach, e.g., the boundary-scan method for hierarchical design, requires system-specific, built-in test circuitry [10] [23]. In general, this method will apply only to a completely assembled MCM, but not to a substrate which contains isolated interconnect with no active circuit elements.

Several groups have recently proposed new methods for verifying circuit connectivity during MCM manufacturing. Each of these new methodologies relies on sequential probing of the MCM substrate, in contrast to the standard approaches above which use parallel probing or combinatorial testing.

Golladay et al. [9] propose an electron-beam method to test MCM substrates for short/open faults by injecting charge into individual nets and then scanning them for faults. Unfortunately, electron-beam testers typically have a relatively small working window of access to the chip/substrate, so that probing a location outside that window requires physical motion of an apparatus. Also, an electron beam may require a long time to charge up large nets, so that a testing methodology based on this process can be prohibitively slow [20].

All other sequential probing approaches involve variants of $k$-probe testing, where $k$ "flying" probe heads simultaneously move around the circuit, measuring resistance and capacitance values to determine the existence of shorts between pairs of nets and opens between two pins of a single net. Formally, we define a $k$-probe to be a set of $k$ distinct net terminals which are visited simultaneously by $k$ movable probe heads (typical probe technology uses $k = 2$, but probe machines with higher values of $k$ are under development [19]). A single $k$-probe simultaneously verifies all paths between pairs of terminals in the probe set by measuring resistance and capacitance values. For example, when $k = 2$ the unique path between the two terminals is checked (Figure 3).

Figure 3: Testing the interconnect on the substrate using probes: for example, the (A,B) probe tests the shown A-B path for open faults.

A 2-probe sequential testing approach developed by Crowell et al. [7] for bare-board testing has been adopted by some MCM manufacturers [17]. The method of Crowell et al. uses only one probe for each net in the layout, placing the probe heads on the two pins of the net which are physically farthest apart. Unless the measured resistance deviates significantly from the value predicted for the correct circuit, one assumes that no open fault exists. Similarly, only when capacitance is far from the predicted value will a possible short fault between two nets be investigated more carefully.

The algorithm of Crowell et al. [7] is efficient in that it uses just one probing operation per net. However, an unfortunate choice of probe locations may yield measured capacitance and resistance very similar to the predicted values, even in the presence of a fault. For example, an open fault caused by a disconnected pad will be detected only by directly probing a path through the pad itself; probing any other path will fail to notice the small deviation in capacitance and resistance values. Indeed, the number of pads in the net induces a lower bound on the number of probe operations needed for fault coverage.

The incomplete fault coverage afforded by such methods as [7] is economically unacceptable. Thus,
MCM manufacturers are now adopting substrate test methodologies which provide complete open fault coverage for all nets [19] [25]. With this in mind, Yao et al. [25] have proposed a quadratic-time algorithm that determines a set of 2-probes which will check for all possible open faults (this work was later further developed in [3] [4] [5] [6] [26] [27]). In this method, sufficient capacitance measurements are taken during the open fault checking process to determine whether two nets have been shorted together (i.e., we will encounter a capacitance value that is too high) [7] [25]. Thus, the remainder of this discussion is confined to the issue of complete open fault coverage. In this paper, we give a linear-time algorithm which for any $k \ge 2$ determines a $k$-probe set which accomplishes complete open fault coverage of each net, using the minimum possible number of probes.

Once probes are found which adequately test the required classes of open faults, we must schedule the probes for execution by a mechanical tester. Obtaining a good schedule is critical, especially with large production runs. Previous groups [7] [25] have used generic greedy or iterative traveling salesman heuristics to attack this problem. We propose two effective heuristics for probe scheduling based on new observations concerning the metricity and allowable structure of the probe set.

The remainder of this paper is organized as follows. Section 2 formulates optimal open fault detection as a tree testing problem and presents linear-time algorithms which find an optimal number of probes to cover all possible open faults. Section 3 shows that probe scheduling to minimize total travel time is a form of metric traveling salesman problem (TSP); we present two effective heuristics, one of which has a small constant-factor error bound for scheduling any given set of probes. Section 4 gives experimental results on random and industry benchmark layouts, and Section 5 concludes with directions for future research. Preliminary versions of this work have appeared in [13] [14] [18].

2 Open Fault Detection

In this section we address the following problem:

Minimal Probe Generation (MPG) Problem: Given a routing topology for a signal net with $n$ leaves (i.e., pins), determine a minimum set of $k$-probes needed to verify the net routing.

We consider two levels of open fault coverage: (i) coverage of all open faults on wire segments, and (ii) coverage of all open faults on wire segments and "cracked" vias (see Subsection 2.2). This section presents optimal solutions for the two corresponding versions of the MPG problem. Due to the nature of current probing technology, the discussion assumes $k = 2$; extensions to arbitrary $k$ are straightforward.

2.1 Optimal Detection of Wire Open Faults

In order to test the integrity of all wire segments, every segment which is incident to a pin must be tested. Thus, the number $n$ of pins (leaves in the routing topology) induces a lower bound of $\lceil n/2 \rceil$ probes when $k = 2$. Our first probe generation algorithm orders the pins of a net as $p_1, \ldots, p_n$ via an arbitrary in-order traversal of the routing tree. Choosing the probes $(p_i, p_{i + \lfloor n/2 \rfloor})$, $1 \le i \le \lfloor n/2 \rfloor$, will cover all edges of the tree, as illustrated in Figure 4; if $n$ is odd, an additional probe $(p_1, p_n)$ is generated. Figure 5 gives a formal description of the algorithm, which we call PROBE1.

Figure 4: Selecting a minimal set of probes to detect the existence of any wire open faults. The probes $(p_i, p_{i + \lfloor n/2 \rfloor})$, $1 \le i \le \lfloor n/2 \rfloor$, provide complete wire open fault coverage.

PROBE1: Computing a minimum probe set for edge fault detection
Input: Tree $T = (V, E)$ with $n$ leaves $p_1, p_2, \ldots, p_n$
Output: Minimum probe set which detects all possible edge faults
1: Root the tree arbitrarily at an internal node
2: Induce an in-order labeling $p_1, \ldots, p_n$ of the leaves
3: Output the set of probes $\{(p_i, p_{i + \lfloor n/2 \rfloor}) \mid 1 \le i \le \lfloor n/2 \rfloor\}$
4: If $n$ is odd Then output the probe $(p_1, p_n)$

Figure 5: PROBE1: Generation of minimum probe set for edge fault detection.

PROBE1 is time-optimal because it requires time linear in the size of the input tree $T$. Optimality of PROBE1 in
terms of its probe set size follows from two simple observations.

Lemma 2.1 Every edge which is incident to a leaf node must be tested, implying a lower bound of $\lceil n/2 \rceil$ for the size of a probe set which detects all possible edge faults.

Lemma 2.2 For any edge $(u, v)$ in $T$ where $u$ is the father of $v$, let $T'$ denote the proper subtree of $T$ rooted at $v$. Then algorithm PROBE1 outputs some probe which connects a leaf in $T'$ to a leaf not in $T'$.

Proof: Assume toward a contradiction that every leaf in $T'$ is connected by a probe edge to a leaf which is also in $T'$. Because the set of leaves in $T'$ is nonempty, the in-order labeling implies that leaves $p_{\lfloor n/2 \rfloor}$ and $p_{\lfloor n/2 \rfloor + 1}$ are also in $T'$. Our assumption then implies that $p_1$ is also in $T'$, along with $p_n$ when $n$ is even or $p_{n-1}$ when $n$ is odd. For $n$ even, $T'$ thus contains all leaves of $T$, contradicting the fact that $T'$ is a proper subtree of $T$. For $n$ odd, any probe which tests the sole leaf not in $T'$ must connect to some leaf in $T'$, contradicting the assumption that probes involving leaves of $T'$ are internal to $T'$.

Theorem 2.3 Given an interconnection tree $T$ with $n$ leaves, $\lceil n/2 \rceil$ probes are necessary and sufficient for detection of all possible edge faults in $T$.

Proof: Necessity follows from Lemma 2.1. To see sufficiency, consider the graph $G$ formed by adding "probe edges" to $T$, i.e., add an edge to $T$ between each pair of leaves that corresponds to a probe output by PROBE1. Note that a set of probes will detect all possible edge faults if every edge lies on some simple cycle in $G$ that contains only one probe edge. As in the statement of Lemma 2.2, for an arbitrary edge $(u, v)$ in $T$ with $u$ the father of $v$, let $T'$ denote the proper subtree of $T$ rooted at $v$. By Lemma 2.2, PROBE1 outputs some probe $(p_i, p_j)$ with $p_i \in T'$ and $p_j \notin T'$. Then, the cycle $v, \ldots, p_i, p_j, \ldots, w, \ldots, u, v$ in $G$ (where $w$ is the lowest common ancestor of $p_i$ and $p_j$ in $T$) contains edge $(u, v)$ (Figure 6). Because PROBE1 outputs only $\lceil n/2 \rceil$ probes, this number of probes is sufficient (and algorithm PROBE1 is optimal).

2.2 Detection of Cracked Via Faults

In manufacturing the MCM substrate, a via can physically "crack" due to factors such as misalignment in lithography and thermal stress. In other words, subtrees rooted at this internal node of the net can become electrically separated (see Figure 7) [25], so that certain sets of probes will detect this open fault, while other sets will fail to find the cracked via. This section gives a linear-time algorithm,
called PROBE2, that tests for both wire faults and cracked vias using the minimum possible number of probes.

Figure 6: PROBE1 checks each edge $(u, v)$ for a fault using the probe $(p_i, p_j)$, whose probe edge completes a cycle containing $(u, v)$.

Figure 7: A cracked via in a routing: the two routing layers are depicted using different shadings, while the cracked via (depicted in black) disconnects the circuit as shown (left). The pair of probes $\{(A, B), (C, D)\}$ will detect all edge faults (middle), but may fail to detect some node faults arising from cracked vias (right).

PROBE2 (Figure 8) first chooses an internal node $r$ of maximum degree $d$ and then roots the tree at $r$ by orienting all edges towards $r$. Each leaf $p_i$ is given the label $i$. The algorithm propagates message lists containing labels, starting from the leaves in bottom-up order. Initially, each leaf $p_i$ sends to its parent a message list of size one, i.e., containing only its own label $i$. Phase I of the algorithm (lines 4-13) pertains to internal nodes $v \ne r$: when such a node $v$ has received message lists from all of its children, it iteratively generates a probe by pairing two labels from distinct incoming message lists as long as one of these lists is of size greater than 1; the two labels are then deleted from these lists. After the total number of labels in the incoming message lists has been reduced to $d$ or less, all remaining labels are concatenated into a single message list that is passed up to $v$'s parent. When only the root $r$ remains unprocessed, Phase II (lines 14-23) performs a simple cleanup step. Figure 9 traces the execution of PROBE2 on a small example. The algorithm statement in Figure 8 allows non-deterministic choice of label pairings, e.g., at lines 7-9. This can easily be made deterministic; however, our optimality result is stronger since it holds even for the given (non-deterministic) PROBE2 statement.

PROBE2: Computing a minimum probe set for edge and node fault detection
Input: Tree $T = (V, E)$ with $n$ leaves $p_1, p_2, \ldots, p_n$
Output: Minimum probe set which detects all possible edge and node faults
1: Let […]
2: Find internal node $r$ with maximum degree $d$
3: Root $T$ by directing all edges towards $r$
/* Phase I: processing internal nodes other than $r$ */
4: For $i = 1$ to $n$, send message list $\{i\}$ from $p_i$ to $parent(p_i)$
5: While $\exists\, v \ne r$ having received message lists $M_1, \ldots, M_{deg(v)-1}$
6:   While $\sum_{i=1}^{deg(v)-1} |M_i| > d$ /* note that this implies $\exists\, j$ with $|M_j| \ge 2$ */
7:     Choose arbitrary $x \in M_i$ for some $i$ with $|M_i| \ge 2$
8:     Choose arbitrary $y \in M_j$ for some $j \ne i$
9:     Output probe $(p_x, p_y)$
10:    $M_i = M_i - \{x\}$, $M_j = M_j - \{y\}$
11:  Concatenate message lists: $M = M_1 \cup \ldots \cup M_{deg(v)-1}$
12:  Send message list $M$ to $parent(v)$
13:  […]
/* Phase II: processing $r$, which has received message lists $M_1, \ldots, M_d$ from its children */
14: While there are at least 2 nonempty message lists with one having size […]
15:   Reorder $M_1, \ldots, M_d$ such that $|M_i| \ge |M_{i+1}|$ for all $1 \le i < d$
16:   Find maximum index $j$ such that […] /* $M_j$ is smallest non-empty list */
17:   Choose arbitrary […]
18:   Output probe $(p_x, p_y)$
19:   […]
20: Concatenate message lists: $M = M_1 \cup \ldots \cup M_d$
21: If […] Then output probes […] and terminate /* note that […] denotes the label at position […] in the concatenated list */
22: Else choose leaf node […] such that […] were not both passed by the same child of $r$
23: Output probe […] and terminate

Figure 8: PROBE2: Optimal detection of all edge and node faults.

Except at the root, each probe generated by PROBE2 will remove two distinct labels from the message lists being passed. At most $d$ labels will remain to be processed at the root $r$, requiring at most $d - 1$ additional probes. Therefore, to test an interconnection tree with $n$ leaves, PROBE2 uses at most $(n - d)/2 + (d - 1) = (n + d)/2 - 1$ probes; this bound is tight for, e.g., a star topology. Using a sequence of technical lemmas [13] [18] we can prove (i) that PROBE2 outputs a probe set which detects all possible edge and node faults, and (ii) that PROBE2 uses the minimum possible number of probes for complete fault detection.

Figure 9: Execution of PROBE2 on a tree containing 9 leaves and 5 internal nodes; a total of 5 probes are generated (shaded lines). Message lists are shown on the edges of the tree. (Panels (a) to (f); the probes output are (1,3), (4,6), (2,9), (1,8) and (5,7).)

The time complexity of PROBE2 is optimal as well: each node passes no more than $d$ labels to its parent, and thus each node will receive fewer than $d^2$ labels from its children. Assuming that $d$ is a technology-dependent constant, the amount of processing at each node is a constant, and since each node is processed only once, the overall time complexity of PROBE2 is linear in the size of the input tree.
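The pairing rule of PROBE1 (Section 2.1) is compact enough to sketch in Python. The block below is only an illustration under stated assumptions, not the authors' implementation: the tree is assumed to be given as an adjacency dictionary, the function names (`leaf_order`, `probe1`, `path_edges`, `covers_all_edges`) are hypothetical, and the extra probe for odd $n$ is taken to pair the first and last pins. The helper `covers_all_edges` checks wire-open coverage by collecting the tree edges on each probe's path.

```python
def leaf_order(adj, root):
    """Leaves in the order a depth-first traversal from `root` visits them.

    `adj` maps each node to a list of neighbours; `root` is assumed to be
    an internal node (a via), as in step 1 of PROBE1.
    """
    order, stack, seen = [], [root], {root}
    while stack:
        v = stack.pop()
        kids = [w for w in adj[v] if w not in seen]
        if not kids:
            order.append(v)           # no unvisited neighbour: v is a pin
        seen.update(kids)
        stack.extend(reversed(kids))  # keep left-to-right child order
    return order


def probe1(adj, root):
    """Pair pin i with pin i + floor(n/2); add one extra probe if n is odd."""
    pins = leaf_order(adj, root)
    n, half = len(pins), len(pins) // 2
    probes = [(pins[i], pins[i + half]) for i in range(half)]
    if n % 2 == 1:
        probes.append((pins[0], pins[-1]))  # assumption: partner is the first pin
    return probes


def path_edges(adj, a, b):
    """Edges on the unique tree path between a and b."""
    parent, stack = {a: None}, [a]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)
    edges = set()
    while parent[b] is not None:
        edges.add(frozenset((b, parent[b])))
        b = parent[b]
    return edges


def covers_all_edges(adj, probes):
    """True if every tree edge lies on the path of at least one probe."""
    all_edges = {frozenset((v, w)) for v in adj for w in adj[v]}
    covered = set()
    for a, b in probes:
        covered |= path_edges(adj, a, b)
    return covered == all_edges


# Hypothetical four-pin net with two vias, in the spirit of Figure 2.
net = {
    "V1": ["L1", "L2", "V2"], "V2": ["V1", "L3", "L4"],
    "L1": ["V1"], "L2": ["V1"], "L3": ["V2"], "L4": ["V2"],
}
probes = probe1(net, "V1")
print(probes, covers_all_edges(net, probes))
```

For this net the sketch pairs L1 with L3 and L2 with L4, and the two probe paths together touch all five wire segments. Note that the check covers edge faults only; it says nothing about cracked vias, which is what PROBE2 addresses.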
3 Efficient Probe Scheduling

Efficient probe scheduling algorithms are essential because testing cost depends largely on the total travel time of the probe heads. In mechanical probing, individual stepper motors control the $x$ and $y$ coordinates of each moving head. The distance $dist(A_i, B_i)$ traveled by the $i$-th probe head is given by

$dist(A_i, B_i) = \max(|x_{A_i} - x_{B_i}|, |y_{A_i} - y_{B_i}|)$

This distance function (also known as the Chebyshev or $L_\infty$ norm) reflects the fact that the maximum time interval for which any motor is engaged will determine the delay between consecutive probes; such a metric is typical in manufacturing applications and is quite accurate despite second-order effects such as acceleration and deceleration of the moving heads.

For $k$-probes, when $k = 2$, the cost of moving the probe heads from a set of pin locations $A = \{A_1, A_2\}$ to another set of locations $B = \{B_1, B_2\}$ is given by

$c(A, B) = \min\{\max[dist(A_1, B_1), dist(A_2, B_2)],\ \max[dist(A_1, B_2), dist(A_2, B_1)]\}$

For $k > 2$, the cost of moving the probe heads from $A = \{A_1, \ldots, A_k\}$ to $B = \{B_1, \ldots, B_k\}$ is given by

$c(A, B) = \min_{\sigma \in \Pi} \max[dist(A_1, B_{\sigma(1)}), dist(A_2, B_{\sigma(2)}), \ldots, dist(A_k, B_{\sigma(k)})]$

where $\Pi$ denotes the set of all permutations of the probe indices $1, \ldots, k$. In other words, we choose the mapping of $A$ onto $B$ in such a way that the maximum travel time of any probe head is minimized (see Figure 10).

Figure 10: An example showing the distance between two probes $A = \{A_1, A_2\}$ and $B = \{B_1, B_2\}$, with $A_1 = (6,5)$, $A_2 = (2,4)$, $B_1 = (0,0)$ and $B_2 = (11,2)$. We have $dist(A_1, B_1) = 6$, $dist(A_2, B_2) = 9$, $dist(A_2, B_1) = 4$, and $dist(A_1, B_2) = 5$; thus, the distance between the two probes $A$ and $B$ is $\min(\max(6, 9), \max(4, 5)) = \min(9, 5) = 5$ (i.e., the best strategy will move one probe head from $A_2$ to $B_1$ while the other probe head moves from $A_1$ to $B_2$).

In some technologies, each probe head may be carried by its own moving horizontal bar, and collisions between probes become a concern (i.e., no two such parallel bars are allowed to cross each other's path, so the $y$ coordinates of the probe heads must satisfy $y_1 \le y_2 \le \ldots \le y_k$ at all times). Thus, the probe head coordinates are always sorted lexicographically [25]. This constraint clearly yields a metric, which we call the collision-free metric; in contrast, the metric discussed above will be referred to as the generalized metric. The collision-free metric is more restrictive, since there is always a unique feasible permutation of the probe heads in traveling from one set of locations to another. In particular, for $k = 2$ the cost under the collision-free metric of moving the probe heads from $\{(x_1, y_1), (x_2, y_2)\}$ to $\{(x'_1, y'_1), (x'_2, y'_2)\}$ is given by $\max\{|x_1 - x'_1|, |y_1 - y'_1|, |x_2 - x'_2|, |y_2 - y'_2|\}$.

The Minimal $k$-Probe Scheduling ($k$-MPS) Problem: Given a set of $k$-probes, minimize the total probe moving cost required in executing all probes.

A straightforward reduction from the geometric traveling salesman problem [15] yields:

Theorem 3.1 The $k$-MPS problem is NP-hard.

Proof: We can transform a geometric instance of TSP into an instance of $k$-MPS by introducing $k$ copies of each site, then considering each set of $k$ identical copies of a site as a single $k$-probe. Distances between probes will correspond to the original distances between the corresponding sites in the TSP instance.

The probe scheduling problem seems quite unapproachable, both due to its theoretical intractability and because the distance and travel cost functions are not easily intuited. Thus, previous work relies on generic traveling salesman heuristics to optimize the probe schedule. For example, when $k = 2$ probe heads are available, Crowell et al. [7] use a bandsort algorithm to optimize the movement of one of the probe heads. Unfortunately, the other probe head may be forced to travel very large distances between probes, and indeed the resulting schedule is often exceedingly inefficient. Yao et al. [25] use simulated annealing and the Kernighan-Lin 2-opt criterion [16] as the basis of an iterative interchange approach; their schedules save up to 83% of travel costs over the method of [7]. All of the heuristics proposed in [7] and [25], however, have unbounded error.

In this section, we first show that the $k$-probe travel costs are actually metric (although clearly not geometric), i.e., distances between $k$-probes satisfy the triangle inequality for all values of $k \ge 1$. As a consequence, traveling salesman heuristics with constant-factor error bound apply [15]. Second, we exploit flexibility in the choice of probes to find probe sets which can co-exist in an efficient probe schedule.
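The two travel-cost functions described above can be written out directly. The following Python sketch is illustrative only: the function names are hypothetical, probes are assumed to be lists of $(x, y)$ tuples, and the collision-free ordering is assumed to sort heads by $y$ with ties broken by $x$ (the text only says the coordinates are kept in lexicographic order). The generalized cost enumerates all $k!$ head assignments, which is reasonable only for small $k$.

```python
from itertools import permutations


def dist(a, b):
    """Chebyshev (L-infinity) distance between two head positions."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def generalized_cost(A, B):
    """Generalized metric: best assignment of heads at A to targets in B,
    where an assignment costs as much as its longest single-head move."""
    return min(
        max(dist(a, B[j]) for a, j in zip(A, perm))
        for perm in permutations(range(len(B)))
    )


def collision_free_cost(A, B):
    """Collision-free metric: heads keep their sorted order, so exactly
    one assignment is feasible."""
    key = lambda p: (p[1], p[0])  # assumption: order by y, ties by x
    return max(dist(a, b) for a, b in zip(sorted(A, key=key), sorted(B, key=key)))


def schedule_cost(probes, cost=generalized_cost):
    """Total travel cost of executing the probes in the given order."""
    return sum(cost(p, q) for p, q in zip(probes, probes[1:]))


# Coordinates taken from Figure 10.
A = [(6, 5), (2, 4)]
B = [(0, 0), (11, 2)]
print(generalized_cost(A, B), collision_free_cost(A, B))
```

On the Figure 10 coordinates both functions return 5, matching the value worked out in the caption. The two metrics can differ sharply, though: moving from {(0,0), (10,1)} to {(10,0), (0,1)} costs 1 under the generalized metric (each head shifts by one unit) but 10 under the collision-free metric, because the heads are not allowed to swap roles.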
3.1 Metricity of the $k$-MPS Problem

For the collision-free metric, the travel costs of the probe heads satisfy the triangle inequality, since the probe head coordinates are always in lexicographic order. Thus, moving the probe heads from $A$ to $C$ via an intermediary $B$ yields the same final probe permutation as would result by moving directly from $A$ to $C$. Metricity follows from the metricity of the Chebyshev norm.

For arbitrary $k$-probes $A$, $B$ and $C$, we may view the travel costs $c(A, B)$, $c(B, C)$ and $c(A, C)$ in the generalized metric as being respectively determined by the optimal permutations $\sigma_{AB}$, $\sigma_{BC}$ and $\sigma_{AC}$. Comparing the composed permutation $\sigma_{BC} \circ \sigma_{AB}$ with the permutation $\sigma_{AC}$ yields the following:

Theorem 3.2 For any three $k$-probes $A$, $B$ and $C$, the travel costs $c(A, B)$, $c(B, C)$ and $c(A, C)$ in the generalized metric satisfy the triangle inequality, i.e., $c(A, B) + c(B, C) \ge c(A, C)$.

Proof: Compare the set of edges of permutation $\sigma_{AC}$ that defines $c(A, C)$ with the induced permutation $\sigma' = \sigma_{BC} \circ \sigma_{AB}$ (see Figure 11). Define $\max(\sigma)$ to be the maximum distance traveled by any probe head according to the permutation $\sigma$. Clearly $c(A, C) \le \max(\sigma')$, since $\sigma'$ is not necessarily the minimum-cost permutation between $A$ and $C$. On the other hand, $\max(\sigma') \le c(A, B) + c(B, C)$ by the triangle inequality and the metricity of the Chebyshev norm. It follows that $c(A, C) \le c(A, B) + c(B, C)$.

Figure 11: Metricity of the probe travel cost function.

Theorem 3.2 allows us to apply heuristics which achieve bounded error for metric TSP instances. In particular, Christofides' combination of a minimum spanning tree construction and minimum matching [15] yields:

Corollary 3.3 Given a set of $n$ $k$-probes, for any fixed $k$, a heuristic probe schedule with cost at most $3/2$ times optimal can be found within $O(n^3)$ time.

3.2 Varying the Probe Set

A further optimization of the tour schedule is possible because the set of probes is itself variable. Figure 12 depicts an instance where a "smarter" choice of probes reduces the optimal tour cost by one-quarter. Most tree topologies can be tested with the minimum number of probes in many distinct ways. For example, each three-pin net in Figure 12 can be tested by a minimal set of 2-probes in three distinct ways (i.e., any two of the three possible probes can be used); in fact, the 2-pin net is the only connection topology with a unique minimum probe set.

Figure 12: An example of how judicious probe selection can reduce the total tour length by as much as one-quarter: four probes are required for complete wire open fault coverage over the two 3-pin nets $A_1 = (0, 0)$, $A_2 = (0, 1)$, $A_3 = (1, 0)$ and $B_1 = (\epsilon, \epsilon)$, $B_2 = (\epsilon, 1 + \epsilon)$, $B_3 = (1 + \epsilon, \epsilon)$. Assuming that the probe tour must start and end at the origin, the probe set on the left will be optimally ordered as an A-probe, two B-probes, and a final A-probe, requiring about four units of travel time. The probe set on the right may be ordered in the same A, B, B, A pattern, requiring only about three units of travel time.

We thus obtain a new type of compatibility TSP problem, where sets of $k$-probes are selected to cover every net such that the optimal tour cost for the union of all probe sets is minimized. In other words, we wish to exploit the synergy between the choice of probes and the optimal tour cost:

The Minimal Probe Generation/Scheduling (MPG/S) Problem:
Given a routing topology for a signal net, determine and schedule a set of probes so that the total probe moving cost is minimized.

In order to hybridize the probe-generation phase with the tour-scheduling phase, and to take advantage of the non-determinism inherent in probe selection, we propose the heuristic PROBE3 (Figure 13). PROBE3 is based on a minimum-cost insertion strategy, i.e., it schedules all probes for a small subset of nets, then iteratively adds the probe which has lowest insertion cost in the tour while still allowing a minimum probe set. Note that a probe set which allows us to minimize travel cost may have more than the minimum possible number of probes. However, the heuristics discussed in this section require that the number of probes is minimum. As seen in the following section, PROBE3 yields significantly shorter schedules than other methods.

PROBE3: Insertion-based method for probe selection
Input: A collection of nets and their routing tree topologies
Output: An efficient heuristic probe schedule
Compute a minimal set of probes $P$ which verifies a subset of the nets
Compute a heuristic schedule (tour) $P_1, \ldots, P_m, P_1$ of $P$
While $\exists$ a net not having complete fault coverage
  Find a probe $P'$ for any net $N$ such that
    (i) $N$ is still coverable by a minimal number of probes after $P'$ is added, and
    (ii) the probe's minimum insertion cost between consecutive probes is minimized, i.e.,
         $\min_{\text{feasible } P'} \min_i\, [c(P_i, P') + c(P', P_{i+1}) - c(P_i, P_{i+1})]$
  Insert $P'$ into the tour between probes $P_i$ and $P_{i+1}$, where $i$ was the tour index where $P'$ had minimum insertion cost

Figure 13: PROBE3: An insertion-based heuristic for probe selection.

4 Experimental Results

We tested our algorithms on an MCM benchmark design obtained from Hughes Aircraft Co., containing 44 components and 199 nets (this is the same benchmark used by Yao et al. in [25]). We also used two randomized versions of the Hughes benchmark, where the same net topologies were retained, but with pin coordinates reassigned randomly from a uniform distribution in the layout region.

Algorithm PROBE2 was used to generate minimal probe sets which cover all possible wire open and cracked via faults. The schedules for these probe sets were optimized using the 2-opt TSP heuristic, as well as by 2-opt followed by 3-opt (in a separate run). We also tested a variant of PROBE3 on the same benchmark, as described in Section 3.2 above. We first generated a minimal set of probes for all nets other than the power, ground, and nets with 3 or fewer pins, then computed a heuristic tour for these probes, using the 2-opt TSP heuristic (again, in a separate run, we used 2-opt followed by 3-opt). Finally, we iteratively added additional probes for the remaining nets which (i) could be inserted into the current tour with minimum cost, and (ii) were compatible with previously chosen probes in some minimum probe set.

In all cases, a total of 634 probes were generated by our algorithm, the same number as that generated by the algorithm of [25]. With each of the PROBE3 experiments, 226 probes were initially chosen to cover the nets which had more than 3 pins and which were neither power nor ground; the remaining 408 probes were added incrementally. In the PROBE3 experiments, we optionally ran 2-opt improvement after every 10 probes added, and optionally ran 3-opt improvement after every 50 probes added. All of the above benchmarks were run with the collision-free distance function, as well as with the generalized distance function. These results are summarized in Table 1.

MCM      Metric          PROBE2         PROBE2             PROBE3         PROBE3
                         + 2-Opt        + 2-Opt + 3-Opt    + 2-Opt        + 2-Opt + 3-Opt
Hughes   generalized     160,435,000    153,185,000        126,210,000    118,497,000
         collision-free  163,202,000    157,600,000        131,010,000    126,637,500
Random1  generalized     294,164,000    286,679,000        265,276,000    257,838,000
         collision-free  302,684,000    289,843,000        269,346,000    260,897,000
Random2  generalized     295,956,000    285,379,000        271,869,000    260,150,000
         collision-free  304,885,000    294,421,000        270,767,000    263,113,000

Table 1: Performance of PROBE2 and PROBE3 variants on the industry benchmark and on random examples. Note that the best probe schedule cost obtained by Yao et al. for the industry benchmark, using simulated annealing, was 150,525,000 units. The tour obtained by PROBE3 + 2-opt + 3-opt gives savings of up to 21% over this value. Each benchmark was run with the collision-free distance function, as well as with the generalized distance function.

As expected, the PROBE3 variants, being able to carefully choose probes while constructing the heuristic tour, outperformed PROBE2 by a considerable margin. Results are

somewhat b etter when 3-opt is incorp orated, also as exp ected. F or the b enc hmark design, the b est tour obtained in [25] using sim ulated annealing had cost 150,525,000; in comparison, our PR OBE3 v arian ts obtain up to 21% impro emen to er the results of [25 ]. Since sim ulated annealing usually giv es solutions quite close to optimal [12 ], our results con rm that careful c hoice of compatible prob es is an imp ortan t issue. uture W ork Substrate testing for op en faults is a critical phase in the pro duction of m ulti-c hip mo dule pac ages. eha e form ulated MCM substrate testing as

a problem of connectivit yv eri cation for trees using -prob es, and presen ted linear-time algorithms for optimal prob e generation. Our algorithms yield minim um prob e sets for co ering all p ossible wire op en and crac ed via faults. Since the asso ciated prob e sc heduling problem is metric, a b ounded-error sc heduling heuristic can b e obtained. W e presen ted an e ectiv e insertion-based heuristic whic h exploits the synergy b et een c hoice of prob es and the resulting optimal sc hedule cost. 16
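
The minimum-cost insertion step at the core of PROBE3 (Figure 13) can be illustrated with a short Python sketch. This is not the authors' implementation: the function names, the data layout, and the feasibility rule are assumptions made for illustration. In particular, each net is assumed to need a fixed number of additional probes drawn from its own candidate list (for example, any two of the three possible probes of a three-pin net), which stands in for the paper's test that a net remains coverable by a minimum probe set. The distance function is supplied by the caller; the demo uses Manhattan distance between points as a toy stand-in, not the collision-free or generalized distance functions.

```python
from typing import Callable, Dict, Hashable, List

Probe = Hashable
Dist = Callable[[Probe, Probe], float]


def insertion_cost(tour: List[Probe], p: Probe, i: int, dist: Dist) -> float:
    """Extra length from inserting p between tour[i] and its cyclic successor."""
    a, b = tour[i], tour[(i + 1) % len(tour)]
    return dist(a, p) + dist(p, b) - dist(a, b)


def tour_length(tour: List[Probe], dist: Dist) -> float:
    """Length of the closed tour (last probe returns to the first)."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))


def insert_remaining(
    tour: List[Probe],
    candidates: Dict[str, List[Probe]],
    need: Dict[str, int],
    dist: Dist,
) -> List[Probe]:
    """Greedy minimum-cost insertion of probes for not-yet-covered nets.

    tour       -- non-empty initial tour over the probes already chosen
    candidates -- hypothetical map: net name -> candidate probes for that net
    need       -- hypothetical map: net name -> probes still required
    """
    tour = list(tour)                                   # do not mutate inputs
    remaining = {net: list(ps) for net, ps in candidates.items()}
    need = dict(need)
    while any(k > 0 for k in need.values()):
        best = None                                     # (cost, net, probe, index)
        for net, k in need.items():
            if k == 0:
                continue                                # net already covered
            for p in remaining[net]:
                for i in range(len(tour)):
                    c = insertion_cost(tour, p, i, dist)
                    if best is None or c < best[0]:
                        best = (c, net, p, i)
        if best is None:
            raise ValueError("no candidate probe left for an uncovered net")
        _, net, p, i = best
        tour.insert(i + 1, p)                           # place p right after tour[i]
        remaining[net].remove(p)
        need[net] -= 1
    return tour


def manhattan(a, b):
    """Toy metric used only for the demo below."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


if __name__ == "__main__":
    start = [(0, 0), (10, 0), (10, 10), (0, 10)]
    result = insert_remaining(start, {"n1": [(5, 0), (5, 5), (0, 5)]}, {"n1": 2}, manhattan)
    print(result, tour_length(result, manhattan))
```

Each round of this sketch scans every candidate probe against every tour position, which is adequate for illustration but would need pruning or incremental bookkeeping for an instance on the scale of the 634-probe benchmark.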
There remain a number of interesting open problems. The fact that many different probe sets can cover a given net yields an interesting TSP variant, as noted above. It is possible that a "prize-collecting salesman" formulation (e.g., at least two of the three possible probes must be "collected" for each three-pin net) can be solved with constant-factor error via an LP-relaxation scheme. This would be quite useful, as the bounded error heuristic of Corollary 3.3 in Section 3 applies only when all of the probes have been fixed. Analyzing the maximum error inherent in arbitrarily fixing the probes is also of interest.

Advances in probe technology allow k > 2 probe heads to move simultaneously, which affords even greater freedom in choosing the probe sets. Thus, the interplay between probe choice and tour cost will continue to be important. More sophisticated strategies for efficiently inserting probes into a partial tour are possible (e.g., we may iteratively look for the best improving combination of added and deleted probes). Finally, the concept of verifying connectivity by checking paths, rather than edges, as well as the "physical" node failure mode (via cracking), can be applied to both trees and arbitrary graphs arising in other fields of study.

Acknowledgements

We thank Brian Tien and Ed Shi for access to the benchmark. We are also grateful to Professor C. K. Cheng, Nan-Chi Chou, David Yao, and Tom Russell, for many interesting discussions.

References

[1] R. W. Bassett, P. S. Gillis, and J. J. Shushereba, Testing and Diagnosis of High-Density CMOS Multichip Modules, in Proc. IEEE Workshop on Multichip Modules, Santa Cruz, March 1991, pp. 108-113.
[2] R. H. Bruce, W. P. Meuli, and J. M. Ho, Multi Chip Modules, in Proc. ACM/IEEE Design Automation Conf., Las Vegas, June 1989, pp. 389-393.
[3] R. Carragher, N. C. Chou, C. K. Cheng, and T. Russell, Distortion Mapping for Cofire Ceramic Substrate Testing, in Proc. Intl. Symp. on Microelectronics, November 1993, pp. 295-300.
[4] N. C. Chou and C. K. Cheng, Optimal Test Size and Efficient Probe Scheduling for Substrate Verification Using Two-Probe Testers, in Proc. Intl. Symp. on Microelectronics, November 1993, pp. 276-281.
[5] N. C. Chou, C. K. Cheng, and T. Russell, Dynamic Probe Scheduling Optimization for MCM Substrate Test, IEEE Trans. on Components, Hybrids, and Manufacturing Tech., (1994), pp. 182-189.
[6] N. C. Chou, C. K. Cheng, and T. C. Russell, High-Performance Microelectronic Substrate Verification Using Probe Testers, in Proc. IEEE Intl. ASIC Conf., Rochester, NY, September 1992, pp. 230-233.
[7] J. C. Crowell, R. Keogh, and J. Conti, Moving Probe Bare Board Tester Offers Unlimited Testing Flexibility, Industrial Electronics Equipment Design, (1984).
[8] W. W. Dai, Performance Driven Layout of Thin-film Substrates for Multichip Modules, in Proc. IEEE Workshop on Multichip Modules, Santa Cruz, March 1991, pp. 114-121.
[9] S. Golladay, N. Wagner, J. Rudert, and R. Schmidt, Electron-Beam Technology for Open/Short Testing of Multi-Chip Substrates, IBM J. Res. Develop., 34 (1990), pp. 250-259.
[10] J. T. A. Group, JTAG Boundary-Scan Architecture Standard Proposal, version 2.0 ed., March 1988.
[11] D. Herrell, Multichip Module Technology at MCC, in Proc. IEEE Intl. Symp. Circuits and Systems, New Orleans, LA, June 1990, pp. 2099-2103.
[12] D. S. Johnson, C. R. Aragon, L. A. McGeoch, and C. Schevon, Optimization by Simulated Annealing: An Experimental Evaluation (part 1), Operations Research, 37 (1989), pp. 865-892.
[13] A. B. Kahng, G. Robins, and E. A. Walkup, On Connectivity Verification in Multi-Chip Module Substrates, Tech. Rep. CSD-TR-910074, Computer Science Department, UCLA, 1991.
[14] A. B. Kahng, G. Robins, and E. A. Walkup, New Results and Algorithms for MCM Substrate Testing, in Proc. IEEE Intl. Symp. Circuits and Systems, San Diego, CA, May 1992, pp. 1113-1116.
[15] E. L. Lawler, The Traveling Salesman Problem: a Guided Tour of Combinatorial Optimization, Wiley, Chichester, New York, 1985.
[16] S. Lin, Computer Solutions of the Traveling Salesman Problem, Bell System Technical Journal, 44 (1965), pp. 2245-2269.
[17] B. McWilliams, private communication (invited talk at CANDE meeting), San Marcos, CA, April 1991.
[18] G. Robins, On Optimal Interconnections, PhD thesis, Department of Computer Science, UCLA, CSD-TR-920024, 1992.
[19] T. Russell, private communication, August 1991. ALCOA Corp.
[20] R. G. Sartore, N. Shastry, U. Brahme, K. Jefferson, and R. Halaviati, Tutorial for Computer Aided Diagnostic E-Beam Testing of ASICs, in Proc. IEEE Intl. ASIC Conf., Rochester, NY, September 1991, pp. T8:1.1-T8:1.7.
[21] K. P. Shambrook, An Overview of Multichip Module Technologies, in Proc. IEEE Workshop on Multichip Modules, Santa Cruz, CA, March 1991, pp. 1-6.
[22] M. Taylor and W. W. Dai, TinyMCM, in Proc. IEEE Workshop on Multichip Modules, Santa Cruz, CA, March 1991, pp. 143-147.
[23] L. Wang, M. Marhoefer, and E. McCluskey, A Self-Test and Self-Diagnosis Architecture for Boards Using Boundary Scans, in Proc. European Test Conf., Paris, April 1989, pp. 119-126.
[24] S. Weber, For VLSI, Multichip Modules May Become the Packages of Choice, Electronics, (1989), pp. 106-112.
[25] S. Z. Yao, N. C. Chou, C. K. Cheng, and T. C. Hu, A Multi-Chip Module Substrate Testing Algorithm, in Proc. IEEE Intl. ASIC Conf., Rochester, NY, September 1991, pp. P9:4.1-P9:4.4.
[26] S. Z. Yao, N. C. Chou, C. K. Cheng, and T. C. Hu, An Optimal Probe Testing Algorithm for The Connectivity Verification of MCM Substrates, in Proc. IEEE Intl. Conf. Computer-Aided Design, Santa Clara, CA, November 1992, pp. 264-267.
[27] S. Z. Yao, N. C. Chou, C. K. Cheng, and T. C. Hu, A Multi-Probe Approach for MCM Substrate Testing, IEEE Trans. Computer-Aided Design, 13 (1994), pp. 110-121.
Biographies

Andrew B. Kahng (b. Oct. 1963, San Diego, CA) holds the A.B. degree in applied mathematics and physics from Harvard College, and the M.S. and Ph.D. degrees in computer science from the University of California at San Diego. Since July 1989, he has been on the faculty of the computer science department at UCLA, where he has been associate professor since 1994. He has received NSF Research Initiation and Young Investigator Awards, and co-directs both the VLSI CAD and Commotion (cooperative motion) Laboratories. His research interests include computer-aided design of high-performance VLSI circuits, practical global optimization, and the theory of cooperative task-solving.

Gabriel Robins received his Ph.D. in 1992 from UCLA, where he won a Distinguished Teaching Award and held an IBM Fellowship. Dr. Robins is now Assistant Professor of Computer Science at the University of Virginia, where he won an NSF Young Investigator Award, a Lilly Foundation Teaching Fellowship, and an All-University Outstanding Teaching Award. Dr. Robins is a member of an advisory board to the U.S. Department of Defense. He is General Chair of the 1996 ACM/SIGDA Physical Design Workshop, and he also serves on the program committees of several other leading conferences. He is a member of ACM, IEEE, SIAM, and MAA.

Elizabeth A. Walkup received bachelor's degrees in Computer Science and Theatre at the University of California, San Diego in 1989, and in 1992 a master of science degree in Computer Science at the University of Washington, where she is currently finishing her Ph.D. thesis, on "Optimization of Linear Max-Plus Systems with Application to Timing Analysis".