Reducing Metastability in FPGA Designs


Metastability – to the uninitiated, you could be forgiven for thinking this might be related to the integrity of some futuristic containment vessel, or force-field: "The metastability of the warp drive's flux triangulator and cryonic envelope is reaching critical levels Captain!".

To those of you who live and breathe digital electronics on a daily basis however, the term will likely be greeted with a mixture of disdain and respect.

This article looks to shed light on the concept of metastability with regard to digital circuits – and therefore FPGA designs – and how its 'appearance' can be greatly reduced by adhering to proven design principles that mitigate its effect.

Metastability Explained

Metastability concerns the outputs of registers (or clocked flip-flops in old money) within digital devices and the potential for such outputs to enter a metastable state. FPGA devices typically utilize D-type flip-flops. Before looking at how such a state can be entered, it is a good idea to refresh ourselves with some basic key timing elements related to the operation of a register:

  • Set-up time – the minimum time that the input to the register must be stable before the arrival of the clock edge. Typically appears as t_su in data sheets.
  • Hold time – the minimum time after the arrival of the clock edge that the input to the register must remain in the same stable state. Typically appears as t_h in data sheets.
  • Clock-to-Output Delay time – the time, after the clock edge arrives, that the register takes to present the new value on its output. This is also referred to as the register's 'settling time' or 'propagation delay'. Can appear in data sheets as, for example, t_co, or t_phl and t_plh.

Whenever a signal travels between two asynchronous clock domains – digital sub-circuits within the overall design that are running on different, or unrelated, clocks – there is the possibility of encountering metastability. This is also true of data transfer from an unclocked region of a design into a synchronous system – for example, external signals fed into an FPGA.


Examples of asynchronous signals entering a synchronous system. Top: A signal travels between different clock domains. Bottom: A signal from an unclocked system is fed into a clocked (synchronous) system.

The problem arises when a data signal from one clock domain arrives at the register logic in another clock domain. The incoming data signal from the source domain may transition at any time in comparison to the clock in the target domain – there is no synchronicity between the two domains, no knowledge of transitory speeds in the two logic sub-circuits. If the data signal transitions at a point that violates the required Set-up or Hold times for the destination register, the output of that register can enter a 'metastable state' – a state where the output signal is neither logical Low, nor logical High, but rather in the unstable area between the two.
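The violation condition described above can be sketched as a simple timing check. This is only an illustrative model: the function name, the nanosecond units and the example figures are assumptions, not values from any data sheet.

```python
def violates_setup_hold(data_edge_ns, clock_edge_ns, t_su_ns, t_h_ns):
    """Return True if a data transition lands inside the register's
    set-up/hold window around the given clock edge.

    All arguments are times in nanoseconds on a common time axis.
    """
    window_start = clock_edge_ns - t_su_ns  # data must be stable from here...
    window_end = clock_edge_ns + t_h_ns     # ...until here
    return window_start < data_edge_ns < window_end


# Hypothetical register: t_su = 0.5 ns, t_h = 0.2 ns, clock edge at 10 ns.
for edge in (9.0, 9.7, 10.1, 10.5):
    print(edge, violates_setup_hold(edge, 10.0, 0.5, 0.2))
```

A transition at 9.7 ns falls within the set-up time and one at 10.1 ns falls within the hold time, so both are flagged; the other two leave the register's input stable for the whole window.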

The length of time that the output continues to remain metastable may exceed the register's specified Clock-to-Output Delay time (settling time). In the majority of cases, registers will quickly resolve this output instability and return to one of the two defined (and stable) states. The problem for a design however, is in the minority of cases, when:

  • the time to settle to a stable state is not quick enough, or
  • the output signal resolves to the incorrect logic level.



The output of a register depending on transition of the input data signal. Input A: The input observes the register's Set-up and Hold times and the output is available after the device's Clock-to-Output Delay time. Input B: The input transitions during the register's Set-up time, with the output going metastable until settling to the correct stable level beyond the Clock-to-Output Delay time. Input C: The input transitions during the register's Hold time, with the output going metastable. Not only does the output settle to a stable state beyond the Clock-to-Output Delay time, it also settles to the wrong logic level!

If the output from the register feeds into more than one subsequent register in the circuit – in parallel – there is the possibility that these destination registers capture the data at differing logic levels, depending on whether the metastable output from the source register has settled to a stable state before each destination register is clocked to capture the next data. Path delays between the source and destination registers, added to the time for the metastable output to become stable, only compound the problem.

In summary, metastability is a statistical, probability-based foe to a designer. Depending on the devices used and the circuit layout of the design, metastable output states may occur, or they may not. If they do occur, they may be detrimental – causing failure of the design – or luck may be on your side and the settling times of devices, clock speeds and routed paths might just make their appearance benign. The question for a designer, however, is: can you really afford to take that 'chance'? What if the product you are designing is part of a medical installation or a commercial jet liner – failure of the design could be catastrophic.

Although metastability cannot be eradicated entirely – no device in the world can lay claim to operate absolutely free of potential metastability effects – it can be reduced to the point of becoming barely a 'blip on the radar'.

As a measure of the reliability of a design, with regard to metastability-induced failure, we talk about something called the Mean Time Between Failures – or MTBF. With metastability left unchecked – that is, no provisions are made in a design to mitigate its effect – the MTBF could be as little as seconds. By applying tried-and-tested digital design methodologies to combat metastability, and by making careful choices of the digital devices used in a design, the MTBF can be considerably increased. A thousand years between failures. A million years. Even a billion years if mathematically calculated and extrapolated. At these values for MTBF, such a design can be rubber-stamped as 'Highly Reliable' or virtually 'Fail-Safe' (or should that be 'Fail Free') – but you get the picture.

The following sections take a look at just how you, as a designer, can extend the MTBF, how device technology plays its part and which of Altium Designer's FPGA peripheral cores have inherent protection against metastability.

Synchronizing Asynchronous Signals

Perhaps the most prevalent and widely accepted solution to the metastability problem, is the addition of front-end circuitry to synchronize an incoming asynchronous signal with the clock of the target synchronous circuit. In its simplest form, this circuitry consists of one or more D-type flip-flops, chained together, and clocked using the target system clock. This is referred to as a synchronization register chain, or just plain synchronizer.

The additional delay imposed by each register allows the incoming signal to recover from any metastable state it may have entered. The more registers in the chain, the more delay and therefore the more time for a metastable output to resolve. The total delay is often known as the Metastability Settling Time. Typically the synchronization circuit will consist of two registers, but for critical applications – such as medical and military – three is not uncommon.
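To put rough numbers on the effect of settling time, a commonly quoted model for a synchronizer is MTBF = exp(t_met / C2) / (C1 × f_clk × f_data), where t_met is the available settling time and C1, C2 are device-dependent constants. The sketch below evaluates that model. The constants, frequencies and settling times are hypothetical placeholders chosen purely for illustration, and the assumption that an extra register stage adds about one clock period of settling time is a simplification.

```python
import math


def synchronizer_mtbf_seconds(t_met_s, f_clk_hz, f_data_hz, c1_s, c2_s):
    """Estimate synchronizer MTBF in seconds.

    t_met_s   -- settling time available to a metastable output (s)
    f_clk_hz  -- destination clock frequency (Hz)
    f_data_hz -- toggle rate of the asynchronous input (Hz)
    c1_s, c2_s -- device-dependent constants (hypothetical values below)
    """
    return math.exp(t_met_s / c2_s) / (c1_s * f_clk_hz * f_data_hz)


# Hypothetical figures, for illustration only.
F_CLK = 100e6          # 100 MHz destination clock -> 10 ns period
F_DATA = 1e6           # input toggles about 1e6 times per second
C1, C2 = 1e-9, 100e-12

two_stage = synchronizer_mtbf_seconds(8e-9, F_CLK, F_DATA, C1, C2)
# Assumption: a third stage adds roughly one clock period of settling time.
three_stage = synchronizer_mtbf_seconds(8e-9 + 1 / F_CLK, F_CLK, F_DATA, C1, C2)

print(f"improvement factor from the extra stage: {three_stage / two_stage:.3e}")
```

Because the settling time sits in the exponent, each additional clock period of settling time multiplies the MTBF by exp(T_clk / C2) rather than adding to it.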


Example of adding a 2-stage synchronizer to the front-end of a synchronous system, to synchronize an incoming asynchronous signal.
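The register chain itself is normally written in an HDL. The following is only a behavioural sketch in Python of what the chain does cycle by cycle; the class and method names are hypothetical. It models the added latency of the chain, not the analogue metastable behaviour of a real flip-flop.

```python
class SynchronizerChain:
    """Behavioural model of an N-stage synchronizer register chain.

    Every register is clocked by the destination clock. The output is
    taken from the last register in the chain.
    """

    def __init__(self, stages=2, reset_value=0):
        if stages < 1:
            raise ValueError("a synchronizer needs at least one register")
        self.regs = [reset_value] * stages

    def clock(self, async_in):
        """Apply one active clock edge and return the synchronized output."""
        # Each register captures the value its predecessor held before the
        # edge; the first register samples the asynchronous input.
        self.regs = [async_in] + self.regs[:-1]
        return self.regs[-1]


sync = SynchronizerChain(stages=2)
outputs = [sync.clock(bit) for bit in (1, 1, 1, 0, 0, 0)]
print(outputs)  # [0, 1, 1, 1, 0, 0]
```

In this model a change on the input appears at the output on the second clock edge after it is first sampled, which is the delay that gives the first register's output time to settle before downstream logic uses it.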

Handshaking logic between circuitry in different clock domains and/or FIFO logic is also used – in addition to front-end synchronization – to ensure reception of correct data values. This is of particular importance when dealing with a bussed grouping of multiple asynchronous signals, each of which could transition at any time and independently of each other.
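To see why per-bit synchronization alone is not enough for a bus, consider what can be captured when several bits change at once. The sketch below is a hypothetical worst-case model in which every changing bit independently ends up at either its old or its new level; the function name is illustrative.

```python
from itertools import product


def possible_captures(old_word, new_word, width):
    """Return every value that might be captured when each bit of a bus
    independently resolves to either its old or its new level."""
    results = set()
    for choice in product((0, 1), repeat=width):
        value = 0
        for bit, take_new in enumerate(choice):
            source = new_word if take_new else old_word
            value |= source & (1 << bit)  # copy this bit from old or new word
        results.add(value)
    return sorted(results)


# All four bits change between 0b0111 and 0b1000.
print(possible_captures(0b0111, 0b1000, 4))
# Only one bit changes between 0b0110 and 0b0111.
print(possible_captures(0b0110, 0b0111, 4))
```

When all four bits change, any of the sixteen 4-bit values could be captured, including ones that were never driven onto the bus. When only one bit changes, the captured word is always either the old or the new value.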

The Weakest Link...

In a digital design, there may be multiple different clock domains and a plethora of signals passing between them. In addition, there may be a variety of external, asynchronous signals – sourced from outside (especially for a design implemented in an FPGA and utilizing external peripheral components and communications interfaces). In such cases, it is not uncommon to find many synchronizing register chains, handling the different asynchronous signal transfers within the overall system.

In terms of MTBF, each synchronizing chain will have its own 'value'. The overall failure rate for a design is the sum of the individual failure rates of the synchronizing chains within it, and the failure rate is 1/MTBF. You can readily see, then, that a synchronizing chain with a lower MTBF than the others drags down the MTBF of the whole design. In fact, the MTBF for the design will essentially follow the MTBF of the worst synchronizer chain – which can be disastrous if five chains each had an MTBF of a million years and a sixth chain had an MTBF of 50 years!
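The arithmetic behind this can be sketched in a few lines, using the figures from the example above (the function name is illustrative):

```python
def combined_mtbf(chain_mtbfs):
    """Combine per-chain MTBF values into a design-level MTBF.

    Failure rates (1 / MTBF) add; the design MTBF is the reciprocal of
    the summed rate. Units of the result match the units of the inputs.
    """
    total_failure_rate = sum(1.0 / mtbf for mtbf in chain_mtbfs)
    return 1.0 / total_failure_rate


# Five chains at a million years each, plus one chain at 50 years.
design_mtbf_years = combined_mtbf([1e6] * 5 + [50])
print(f"{design_mtbf_years:.2f} years")
```

The result is just under 50 years: the five good chains contribute almost nothing to the total failure rate, so the weakest chain sets the MTBF of the design.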

To handle this, the solution is to add another register stage to the worst-performing synchronizer chain in the design, thus increasing the metastability settling time and enhancing the MTBF for that chain – and therefore the overall design – considerably (MTBF grows roughly exponentially with the available settling time).

Device Technology - Faster vs Smaller

To recap, metastability (although there's nothing stable about this state!) occurs when an incoming asynchronous signal transitions in violation of a register's Set-Up and/or Hold Time. The overall length of time, Set-Up + Hold, essentially defines the 'window' for metastability occurring – the metastability window if you will.

It stands to reason that the shorter a register's Set-Up and Hold times, the smaller the metastability window. Indeed, faster logic families exhibit shorter times and hence decrease the probability of a metastable event. Should a metastable event occur (remember metastability cannot be completely eradicated), registers in these families also recover more quickly. For example, a register in the 74F family would lead to a better MTBF than one from the 74LS family – two ends of the device-speed spectrum.
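The effect of the window size can be estimated with a simple model. It assumes that asynchronous transitions are uniformly distributed relative to the destination clock; the function names and timing figures are hypothetical. Note that a transition inside the window does not necessarily lead to a design failure, so this gives the rate of window violations, not the failure rate.

```python
def window_fraction(t_su_s, t_h_s, f_clk_hz):
    """Fraction of each clock period occupied by the set-up + hold window."""
    return (t_su_s + t_h_s) * f_clk_hz


def violations_per_second(t_su_s, t_h_s, f_clk_hz, f_data_hz):
    """Expected number of input transitions per second that land in the
    window, assuming transitions are uniformly spread over the period."""
    return window_fraction(t_su_s, t_h_s, f_clk_hz) * f_data_hz


# Hypothetical register: t_su = 60 ps, t_h = 40 ps, 100 MHz clock,
# asynchronous input toggling about 1e6 times per second.
print(window_fraction(60e-12, 40e-12, 100e6))
print(violations_per_second(60e-12, 40e-12, 100e6, 1e6))
```

Halving the combined Set-Up and Hold time halves the expected violation rate in this model, which is why faster registers start from a better position.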

With FPGAs, the decrease in process geometries (from 180nm, through 90nm and onward to 65nm, 40nm and beyond) lends itself to faster transistor switching speeds – typically improving the MTBF due to metastability. However, the benefits of reduced size are not without potential penalty. Shrinking geometries naturally bring reduced supply voltages. During a metastable state, the output from a register is typically at around half of the supply voltage. As the supply voltage gets smaller and smaller, the voltage difference between full and halfway narrows, leading to a reduction in the gain of the circuit and longer times for registers to recover from a metastable state.

FPGA vendors typically perform rigorous metastability analysis to ensure robustness against metastability in physical devices that utilize these ever-decreasing process geometries.

Altium Designer FPGA Peripherals with Inherent Metastability Protection

The following sections contain tables collectively listing the various FPGA peripheral cores, available for use in FPGA designs and shipped in libraries as part of the Altium Designer installation. For each peripheral, indication is given as to whether or not it incorporates provisions for metastability.

The following tables and the information they contain are currently a work-in-progress. Those peripherals, for which no information currently exists, will be addressed shortly.

Wishbone Peripherals

This table contains all Wishbone peripheral core components, located in \Library\Fpga\FPGA Peripherals (Wishbone).IntLib.

| Peripheral | Metastability Provisions? | Comments |
|---|---|---|
| BT656 | Not Needed/YES | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. Provisions for signals between different clock domains inside the core are included. |
| CAN_W | | |
| CANB_W | | |
| EMAC8_W | Not Needed | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. |
| EMAC8_MD_W | Not Needed | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. |
| EMAC32 | Not Needed/YES | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. Provisions for signals between different clock domains are included. |
| KEYPADA_W | | |
| PS2_W | | |
| TMR3_W | | |
| WB_ASP | Not Needed | Peripheral has no external signals. |
| WB_BOOTLOADER_V2 | YES | This peripheral is based on the WB_SPI core. SPI communications is, by definition, clock synchronous and should not suffer from metastability issues. |
| WB_DUALMASTER | Not Needed | Peripheral has no external signals. |
| WB_FPU | Not Needed | Peripheral has no external signals. |
| WB_GPS_NMEA | | |
| WB_I2CM | Not Needed | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. |
| WB_I2S | Not Needed | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. |
| WB_IDE | Not Needed | IDE signals are stable when latched by core. |
| WB_ILI9320 | Not Needed | No asynchronous inputs are used. |
| WB_INTERCON | Not Needed | Peripheral has no external signals. |
| WB_INTERFACE | | |
| WB_IR38KRX | YES | Contains a Synchronizer. |
| WB_IRRC | YES | |
| WB_JPGDEC_V2 | Not Needed | Peripheral has no external signals. |
| WB_LCDCTRL | | |
| WB_LCDCTRL_SRAM | | |
| WB_LED_CTRL | | |
| WB_MEM_CTRL | | |
| WB_MIDI | Not Needed | The MIDI peripheral is simply a combinatorial wrapper around a UART core – which itself has provisions against metastability. |
| WB_MP3DEC | | |
| WB_MULTIMASTER | Not Needed | Peripheral has no external signals. |
| WB_OWM_V2 | | |
| WB_PRTIO | | |
| WB_PWM8 | | |
| WB_PWMX | | |
| WB_SDCARD | YES | This peripheral is based on the WB_SPI core. SPI communications is, by definition, clock synchronous and should not suffer from metastability issues. |
| WB_SDHC | | |
| WB_SHARED_MEM_CTRL_NB2DSK01 | | |
| WB_SHARED_MEM_CTRL_NB3000 | | |
| WB_SPDIF | | |
| WB_SPI | YES | SPI communications is, by definition, clock synchronous and should not suffer from metastability issues. |
| WB_TSPENDOWN | NO | |
| WB_UART8_V2 | YES | Utilizes dual flip-flop 'synchronization chain' on the incoming RXD line. |
| WB_USB | | |
| WB_VGA | | |

Non-Wishbone Peripherals

This table contains all non-Wishbone peripheral core components, located in \Library\Fpga\FPGA Peripherals.IntLib.

| Peripheral | Metastability Provisions? | Comments |
|---|---|---|
| CAN | | |
| EMAC8 | Not Needed | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. |
| EMAC8_MD | Not Needed | Adheres to clock synchronization protocol and should not suffer from metastability issues from external signals. |
| I2CM | | |
| KEYPADA | | |
| LCD16X2A | | |
| LED_CTRL | | |
| MAX1104_DAC | | |
| PS2 | | |
| SRL0 | | |
| TMR3 | | |
| VGA | | |

Legacy Peripherals

This table contains all legacy peripheral core components, located in \Library\Fpga\Legacy Libraries\FPGA Legacy Peripherals.IntLib.

| Peripheral | Metastability Provisions? | Comments |
|---|---|---|
| I2CM_W | | |
| I2S_W | | |
| PRTIOxxx | | |
| PRTIOXxxx | | |
| PRTOxxx | | |
| SPI_W | YES | SPI communications is, by definition, clock synchronous and should not suffer from metastability issues. |
| SRL0_W | | |
| VGA32 | | |
| VGA32_16BPP | | |
| VGA32_TFT | | |
| WB_IRCODER | | |
| WB_IRDEC | | |
| WB_JPGDEC | Not Needed | Peripheral has no external signals. |
| WB_OWM | | |
| WB_UART8 | YES | Utilizes dual flip-flop 'synchronization chain' on the incoming RXD line. |

Useful Links

Use the following links to access external documents that take a closer, and more detailed look, at the phenomenon of metastability and how its effect is essentially rendered negligible in digital electronic designs. Many of these documents take a look at equations used to calculate the MTBF for a flip-flop, and subsequent MTBF for an entire design, and themselves provide reference to further information on the subject.
