Testing and Diagnosis of Interconnect Faults in Cluster-Based FPGA Architectures

Presentation Transcript

Slide1

Testing and Diagnosis of Interconnect Faults in Cluster-Based FPGA Architectures

David Mohabir

University of Arizona

March 19th, 2012

Slide2

Testing and diagnosis of interconnect faults in cluster-based FPGA architectures

Section 1

Slide3

Motivation

Quickly identify faulty components

Design new, efficient testing methodologies to offset the complexity of FPGA testing as compared to ASIC testing

Knowing where defects are located is increasingly important because FPGAs can be reconfigured to avoid faults

Increased test generation complexity

Increased test application time

Multiple configurations are needed to test the assortment of switch settings

Slide4

Limitations

High complexity for test generation

Increased test application time

Need for external controllability and observability

Multiple configurations to test the assortment of switch settings, compared to a single configuration for an ASIC

As FPGAs have more programmable switch points, this becomes a bigger issue

Slide5

Previous and related work

FPGA testing has been divided into interconnect testing and FPGA logic testing

Reduction in the need for I/O pads for testing

Several configurations are required to ensure all FPGA logic is tested in some configuration

Unutilized FPGA logic and routing are being used to implement modular redundancy

Faults can be targeted across the entire FPGA structure, or only those that are application-specific

Slide6

Related work (cont’d)

Need for external controllability and observability has also been reduced using an iterative logic array (ILA) test architecture

One-dimensional configuration with one direction for signal propagation

A complete array of m × m LUT/RAM modules requires 4 test configurations, independent of the size of the array and of the modules [11]

Open problems: defining a set of test configurations for cluster-based architectures, and diagnosis

Slide7

Related work (cont’d)

The use of LUTs with logic checkers to implement testing schemes in interconnects

Using LUTs to form shift registers to easily check the output of the test pattern

Built-in self-test (BIST) architecture to locate any single faulty PLB and most multiple faulty PLBs

This addresses FPGA logic

Cluster-based FPGA test methodologies

Do not cover specific extra-cluster faults

Slide8

Geometric Scaling

Increased defect rates

Increased device variation

Increased change in device parameters

Increased single die capacity

Increased susceptibility to transient upsets

Slide9

Defect Tolerance

If device failure renders a bitop or an interconnect unusable, the device should be reconfigured to avoid these failing areas

Substitute good resources for bad ones

As defect rates increase, spare resources should be strategically reserved

Slide10

Interchangeable LUTs

Slide11

Interchangeability

Not all unused units will be substitutable, as location strongly affects interconnections to other logic blocks

Preferable to have fewer large pools of mostly interchangeable resources

Slide12

Cluster-based architectures

Primitive logic components are grouped into coarse-grained clusters

Richness of internal connectivity means large range of potential interconnect patterns

External access to internal test points becomes increasingly difficult as device sizes scale

Cluster I/O are the input and output pins of the cluster

Tile I/O pins include the endpoints of wire segments, which can connect to a neighboring tile via programmable interconnect points (PIPs)

Slide13

Structure

Slide14

Built-in Self Test

BIST overhead not an issue

Easily inserted and removed by reconfiguration

Test logic inside the FPGA enables test access to internal components

Each BISTER is composed of

Test pattern generator (TPG)

Output response analyzer (ORA)

Two blocks under test (BUTs)

Slide15

BISTER test structure

Slide16

BISTER

Slide17

BIST strategy

To guarantee testing of all tiles, the FPGA is reconfigured to shift the BISTERs across the entire array

All tiles will be tested by acting as a BUT

Perimeter tiles are tested by using the I/O pads to access the periphery

Total test application time is related to the area of the TPG/ORA logic

Decomposes the problem into many identical subproblems whose size is determined by the test requirements for a single tile

Slide18

Interconnect Fault Detection

High density of internal cluster interconnect makes test access difficult

Must test intra-cluster interconnect and extra-cluster interconnect

Four classes of faults

Permanent connection (tested with the PIP off)

Permanent disconnection (tested with the PIP on)

Stuck-at 0

Stuck-at 1

Slide19

Detection and Diagnosis

Defines testability and diagnosis requirements of each fault and fault pair

Some test pattern must exist to detect each fault and differentiate each fault pair

All LUTs are configured as 4-input XOR gates

The detectability of each fault can be expressed as a function of the tile I/O

Slide20

Fault Detection Conditions

Faulty line segment s1 must be both controllable by at least one tile input and observable by at least one tile output

Slide21

Fault Detection Conditions (cont’d)

Both segments of a faulty pair must be controllable, separately controllable from each other, and observable

The PIP between the two segments must be switched off

Slide22

Fault Detection Conditions (cont’d)

If s2 is the floating segment, then the non-floating segment must be controllable and the floating segment must be observable

The PIP between the two segments must be switched on

Slide23

Interconnect Fault Equivalence

Equivalent faults cannot be differentiated

Fault equivalence is determined by the FPGA configuration

Faults that are equivalent in one configuration may not be equivalent in another

Maximum diagnostic resolution is achieved when every pair of faults is non-equivalent in at least one configuration

Two faults are equivalent if their corresponding faulty machines produce the same output with all possible test patterns, at all outputs of the circuit
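
The simulation-based definition above can be tried on a toy circuit. The following is a minimal sketch, not the model from the presentation: it assumes two segments `s1` and `s2` joined by one PIP, tile inputs `a` and `b`, tile outputs `x` and `y`, and an undriven segment that floats to 0. The `Config` fields, fault names, and configurations `A`, `B`, `C` are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations, product

FLOAT_VALUE = 0  # assumption: an undriven segment floats to logic 0

@dataclass(frozen=True)
class Config:
    pip_on: bool       # PIP between s1 and s2 is programmed on
    b_drives_s2: bool  # tile input b drives s2 directly
    observe_s1: bool   # s1 is routed to tile output x

FAULTS = ("s1_stuck_at_0", "s2_stuck_at_0", "pip_disconnected")

def simulate(cfg, fault, a, b):
    """Return tile outputs (x, y) for tile inputs (a, b); fault=None is the good machine."""
    if cfg.pip_on and cfg.b_drives_s2:
        raise ValueError("s2 would have two drivers in this toy model")
    s1 = 0 if fault == "s1_stuck_at_0" else a
    if cfg.pip_on and fault != "pip_disconnected":
        s2 = s1            # s2 is driven through the PIP
    elif cfg.b_drives_s2:
        s2 = b
    else:
        s2 = FLOAT_VALUE   # nothing drives s2
    if fault == "s2_stuck_at_0":
        s2 = 0
    x = s1 if cfg.observe_s1 else 0
    return (x, s2)

def signature(cfg, fault):
    """Responses to all input patterns at all outputs."""
    return tuple(simulate(cfg, fault, a, b) for a, b in product((0, 1), repeat=2))

def equivalent(cfg, f1, f2):
    """Two faults are equivalent in cfg if their faulty machines respond identically."""
    return signature(cfg, f1) == signature(cfg, f2)

def differentiated_fraction(configs, faults=FAULTS):
    """Fraction of fault pairs that are non-equivalent in at least one configuration."""
    pairs = list(combinations(faults, 2))
    split = [p for p in pairs if any(not equivalent(c, *p) for c in configs)]
    return len(split) / len(pairs)

A = Config(pip_on=True, b_drives_s2=False, observe_s1=False)
B = Config(pip_on=True, b_drives_s2=False, observe_s1=True)
C = Config(pip_on=False, b_drives_s2=True, observe_s1=True)

print(equivalent(A, "s1_stuck_at_0", "s2_stuck_at_0"))
print(equivalent(B, "s2_stuck_at_0", "pip_disconnected"))
print(differentiated_fraction([A, B]), differentiated_fraction([A, B, C]))
```

In this sketch the two stuck-at faults look identical while only `y` is observed, and the disconnected PIP mimics `s2` stuck-at 0 until a configuration drives `s2` from another input. Adding configuration `C` is what raises the differentiated fraction to 1.0.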

Two segments are test equivalent in a configuration if the segments have identical control sets and identical observe sets

Slide24

Interconnect Fault Equivalence (cont’d)

Two segments are test equivalent when they are controlled by the same set of tile inputs and observed by the same set of tile outputs

Slide25

Interconnect Fault Equivalence (cont’d)

Each segment in a faulty segment pair must be test equivalent to a segment in the other faulty segment pair

Slide26

Interconnect Fault Equivalence (cont’d)

A pair of faults may be equivalent if a segment that is not driven by a signal floats to a value ‘v’

The two faults are equivalent if the floating segment is test equivalent to the segment associated with the stuck-at ‘v’ fault

The segment with the stuck-at fault and the floating segment must be controlled by the same set of tile inputs and observed by the same set of tile outputs

Slide27

Interconnect Fault Equivalence (cont’d)

The pair of segments involved in one fault are test equivalent to the pair of segments involved in the other fault

Each segment in a faulty segment pair must be test equivalent to a segment in the other faulty segment pair

Slide28

Test Configurations

Identifies a set of configurations for the tiles acting as BUTs in a BISTER

The number of configurations should be minimized to reduce test application time

Intra-cluster configurations are defined separately from extra-cluster configurations

Slide29

Intra-Cluster Configurations

Fault effect on a cluster input must propagate to at least one cluster output

Cluster outputs must be separately controllable

Slide30

BLE configurations

Observability of cluster inputs and BLE output branches must be achieved by propagating fault effects

Controllability of the BLE outputs must be achieved through the BLEs

Each BLE is composed of a LUT and a multiplexer

Both must be configured

Each LUT acts as a 4-input XOR gate

Good controllability because output value can be determined by controlling any single input

Good observability because a fault effect on any input will propagate to the output
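
The controllability and observability claims for the XOR configuration can be checked exhaustively. This is a small sketch, assuming a 4-input LUT modelled as a 16-entry truth table; the names `lut_eval`, `XOR4`, `AND4`, and `sensitized_fraction` are illustrative only. It counts how often flipping a single input flips the output, and uses an AND configuration as a contrast.

```python
from itertools import product

def lut_eval(table, inputs):
    """Look up a 4-input LUT output; inputs[0] is the least significant address bit."""
    address = sum(bit << k for k, bit in enumerate(inputs))
    return table[address]

# 16-entry truth tables: entry i holds the output for address i.
XOR4 = [bin(i).count("1") % 2 for i in range(16)]
AND4 = [1 if i == 15 else 0 for i in range(16)]

def sensitized_fraction(table):
    """Fraction of (input vector, input position) cases where one input flip changes the output."""
    hits = total = 0
    for vec in product((0, 1), repeat=4):
        for k in range(4):
            flipped = list(vec)
            flipped[k] ^= 1
            hits += lut_eval(table, vec) != lut_eval(table, flipped)
            total += 1
    return hits / total

print(sensitized_fraction(XOR4))  # every single-input change reaches the output
print(sensitized_fraction(AND4))  # only when the other three inputs are all 1
```

With the XOR table every one of the 64 cases propagates, which is why any single input can set the output and any single fault effect is visible there. The AND table propagates in only 8 of the 64 cases.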

Majority of test configurations bypass the flip-flop

A single configuration will test the interconnect associated with the flip-flops

Slide31

BLE input multiplexer configurations

Input muxes determine controllability of BLE outputs by determining the function which defines the output of each BLE ‘n’

BLE output function: all inputs XORed together
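
Because every LUT is an XOR, each BLE output reduces to the XOR of some set of cluster inputs, and the mux settings decide which set. The sketch below is an illustration under assumed names (`mux_select`, signals `I0`..`I6`, BLEs `B0` and `B1`), not the algorithm from the presentation. It derives that set for each BLE and rejects mux settings that would form a feedback loop.

```python
def output_supports(mux_select, cluster_inputs):
    """Map each BLE to the set of cluster inputs whose XOR equals its output.

    mux_select: dict of BLE name -> list of signal names chosen by its input muxes.
    cluster_inputs: iterable of cluster input names.
    """
    supports = {name: frozenset([name]) for name in cluster_inputs}
    visiting = set()

    def support_of(signal):
        if signal in supports:
            return supports[signal]
        if signal in visiting:
            raise ValueError(f"mux settings create a loop through {signal}")
        visiting.add(signal)
        acc = frozenset()
        for src in mux_select[signal]:
            acc = acc ^ support_of(src)  # symmetric difference: a signal XORed twice cancels
        visiting.discard(signal)
        supports[signal] = acc
        return acc

    return {ble: support_of(ble) for ble in mux_select}

def evaluate(support, input_values):
    """XOR together the values of the cluster inputs in a support set."""
    out = 0
    for name in support:
        out ^= input_values[name]
    return out

inputs = [f"I{k}" for k in range(7)]
mux_select = {
    "B0": ["I0", "I1", "I2", "I3"],
    "B1": ["B0", "I4", "I5", "I6"],  # one mux selects the output of B0
}
supports = output_supports(mux_select, inputs)
values = {name: 0 for name in inputs}
values["I0"] = 1
print(sorted(supports["B0"]), evaluate(supports["B0"], values))
print(sorted(supports["B1"]), evaluate(supports["B1"], values))
```

Here `B0` depends on `I0`..`I3` and `B1` on all seven inputs, so toggling `I4` changes `B1` while leaving `B0` untouched.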

Multiplexers are not configured to create loops

All BLE outputs are separately controllable from each other, and from all cluster inputs

Each input multiplexer is configured to select data from each of its inputs in at least one configuration

There is a sensitized path from each cluster input stem to a cluster output in every configuration

Slide32

Algorithm 1

Slide33

Input Multiplexer configurations

Slide34

Extra-Cluster Configurations

Defines current flow paths through the extra-cluster interconnect

Modeled as a flow graph

Create flow paths between tile I/O nodes which allow the detection criteria of each fault to be satisfied in at least one configuration

Flow paths are created from tile I/Os to every cluster input, and from every cluster output to tile I/Os

Slide35

Transparent Extra-Cluster Configuration

Slide36

Algorithm 2

Slide37

Algorithm 3

Slide38

Results

Assumptions

Cluster inputs and outputs are equally distributed around the sides of the cluster

Each cluster I/O on the north face may connect to all horizontal tracks via a set of PIPs

West face I/O connects to all vertical tracks

Cluster I/O for east and south faces connect directly to tracks in neighboring tiles

Results

Intra-cluster configurations, and two sets of extra-cluster configurations

Extra-Cluster (specific) is used once the fault-independent algorithm has reached its coverage limit

By using the fault-specific extra-cluster configuration algorithm, 100% fault coverage can be guaranteed

At the cost of an increased number of configurations

Fault Coverage Achieved

Percent of fault pairs which are differentiated across all configurations

A small set of test configurations can detect and diagnose nearly all targeted interconnect faults

Slide39

Results

Slide40

Summary

Approach is comprehensive and can guarantee 100% fault detection

Does require a good deal of computation time for the extra-cluster configurations

Does a good job of describing fault classes

I personally believe they could have described it using less mathematical jargon, so that it would make more sense to a digital logic engineer

Algorithms are described neatly in pseudocode

All details are covered

Slide41

Discussion topics

Section 2

Slide42

Discussion #1

Let’s discuss the logical ways to test circuitry for the various faults

Permanent open

Permanent closed

Stuck-at 0

Stuck-at 1
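
One way to reason about these four faults from the tile I/O alone is to restate the detection conditions from Section 1 as checks on control and observe sets. The sketch below reads "permanent open" as permanent disconnection and "permanent closed" as permanent connection. The `Segment` type, the tile pin names, and the interpretation of "separately controllable" as differing control sets are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    name: str
    control: frozenset  # tile inputs that can set the segment's value
    observe: frozenset  # tile outputs that the segment's value reaches

def stuck_at_detectable(seg):
    """Controllable by at least one tile input and observable by at least one tile output."""
    return bool(seg.control) and bool(seg.observe)

def separately_controllable(s1, s2):
    # Assumption: the control sets differ, so some tile input drives one segment but not the other.
    return s1.control != s2.control

def permanent_closed_detectable(s1, s2, pip_on):
    """Both segments controllable and observable, separately controllable, PIP switched off."""
    return (not pip_on
            and stuck_at_detectable(s1) and stuck_at_detectable(s2)
            and separately_controllable(s1, s2))

def permanent_open_detectable(driven, floating, pip_on):
    """Driven segment controllable, floating segment observable, PIP switched on."""
    return pip_on and bool(driven.control) and bool(floating.observe)

s1 = Segment("s1", frozenset({"in0"}), frozenset({"out0"}))
s2 = Segment("s2", frozenset({"in1"}), frozenset({"out1"}))
s3 = Segment("s3", frozenset(), frozenset({"out1"}))  # has no driver of its own

print(stuck_at_detectable(s1))                             # True
print(stuck_at_detectable(s3))                             # False
print(permanent_closed_detectable(s1, s2, pip_on=False))   # True
print(permanent_closed_detectable(s1, s2, pip_on=True))    # False
print(permanent_open_detectable(s1, s3, pip_on=True))      # True
```

The same pair of segments can be testable for one fault class and untestable for another depending only on the PIP setting, which is why several configurations are needed.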

How could you design test patterns without access to all internal signals?

Slide43

Discussion #2

Algorithms

Intra-cluster

Extra-cluster

Slide44

Discussion #3

Defect mapping

Annealing placers

Mark physical locations of defective units as costly or invalid

Routers

Mark wires and switches that are defective as in use or high cost

Avoid these defective components of the FPGA

Slide45

Discussion #4

Parity
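
The slide names only the topic, so the following is one possible starting point for the discussion. An n-input XOR, as used for the LUT configurations in Section 1, computes the parity of its inputs. The sketch below uses hypothetical helper names and shows the general property of parity: an odd number of flipped inputs changes the output, while an even number cancels out and goes unnoticed.

```python
def parity(bits):
    """XOR of all bits: 1 if an odd number of bits are 1."""
    p = 0
    for b in bits:
        p ^= b
    return p

def flip(bits, positions):
    """Return a copy of bits with the given positions inverted."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(bits)]

word = [1, 0, 1, 1]
print(parity(word))                   # 1
print(parity(flip(word, {2})))        # 0: a single error changes the parity
print(parity(flip(word, {0, 2})))     # 1: two errors cancel and are masked
```

This masking of even error counts is a general limit of parity checks.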