Slide1
Neighbor-Cell Assisted Error Correction for MLC NAND Flash Memories
SIGMETRICS’14
Slide2
Summary
Problem: as the P/E cycle count increases, raw BER rises significantly, beyond the fixed ECC capability
Goal: extend flash lifetime by reducing the number of bit errors in a page to a level that ECC can correct
How to reduce the number of bit errors: when reading a page, use multiple sets of reference voltages instead of the conventional single set
Rationale for multiple sets of reference voltages: the authors observed “the threshold voltage distributions based on the values in the neighboring cell”, and reading with “the new reference voltage from the threshold voltage distributions” yields fewer erroneous bits
Solution: when a page read fails ECC, the Neighbor-cell Assisted Correction (NAC) mechanism re-reads the page several times using multiple reference voltages, which lowers the number of errors in the page to a level ECC can correct
Slide3
Background: Program Interference
Prior works characterized and modeled this error type
Threshold voltage of a cell (victim cell) can change when its neighbor cells (aggressor cells) are being programmed
Program interference from neighbor cells on the same WL is negligible
Program interference on a victim cell C(n, j) is dominated by the aggressor cell C(n+1, j) on the WL immediately above the victim WL
Slide4
Optimizing Read Reference Voltage
Neighboring states (Pi and Pi+1) overlap
For a given reference voltage (Vref), the blue area is the error due to “Pi misread as Pi+1” and the red area is the error due to “Pi+1 misread as Pi”
f(x) and g(x) are the probability density functions (PDFs) of cells programmed into states Pi and Pi+1, respectively
The optimal read reference voltage, which minimizes the BER above, is at the cross-point of the neighboring distributions
Slide5
Modeling Raw BER
Algebraic manipulation with a set of assumptions
Threshold voltage distributions (f(x) and g(x)) follow Gaussian distributions
The Gaussian distributions have equal variance (σ1 = σ2 = σ)
Random data are programmed (P0 = P1)
The optimum read reference voltage is used, v = (μ1+μ2)/2 from the previous slide
Raw BER is expressed through the function Q(x)
With x = (μ2-μ1)/2σ, Q(x) gives the minimum raw BER
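A hedged reconstruction of the algebra behind this result, for a single pair of neighboring states under the assumptions listed above. The integration variable t, the normalization P0 = P1 = 1/2, and the explicit form of Q are notation chosen here, not taken from the slide.

```latex
\begin{aligned}
\mathrm{BER}(v) &= P_0 \int_{v}^{\infty} f(t)\,dt + P_1 \int_{-\infty}^{v} g(t)\,dt \\
&= \tfrac{1}{2}\,Q\!\left(\frac{v-\mu_1}{\sigma}\right)
 + \tfrac{1}{2}\,Q\!\left(\frac{\mu_2-v}{\sigma}\right),
\qquad Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-u^{2}/2}\,du \\
\mathrm{BER}\!\left(\tfrac{\mu_1+\mu_2}{2}\right) &= Q\!\left(\frac{\mu_2-\mu_1}{2\sigma}\right)
\end{aligned}
```

The first term is the probability that a cell in the lower state is read above v, the second that a cell in the upper state is read below v; at the midpoint both Q arguments become (μ2-μ1)/2σ.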
As x increases, Q(x) (or raw BER) monotonically decreases
A higher value of (μ2-μ1)/2σ is desirable for minimizing raw BER
Larger threshold voltage distance (μ2-μ1) between neighboring distributions
Smaller variance (smaller σ) of the threshold voltage distribution, i.e., narrower distributions
Slide6
Observations on Voltage Distribution
We want to read the “victim page” (WL) “with minimum raw bit error”
Before the aggressor page (WL) is programmed, two neighboring distributions of the victim page are easy to distinguish
After the aggressor page is programmed, program interference causes the distributions to overlap, increasing raw BER
The threshold voltage distribution of all cells (overall distribution) can be further divided into four threshold voltage distributions (conditional distributions) based on the values of the aggressor cells
[Figure panels: before the aggressor WL is programmed; after the aggressor WL is programmed]
Slide7
Overall vs Conditional Distribution
The overall distribution is the sum of all four conditional distributions
From the perspective of minimizing raw BER, two factors matter:
Threshold voltage distance between neighboring distributions
Variance of the threshold voltage distribution
Distance: overall distribution ≈ conditional distribution
Variance: overall distribution > conditional distribution
Using the conditional distributions to read a page, instead of the overall distribution, can lower raw BER
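Why the overall variance exceeds the conditional variance can be illustrated with the law of total variance. The following is a minimal sketch, assuming each conditional distribution is Gaussian with the same spread and equal weight; the mean shifts and sigma are made-up numbers, not measurements from the paper.

```python
# Hypothetical numbers: four conditional distributions of one programmed state,
# one per aggressor-cell value, each with the same spread but a different mean
# shift caused by program interference.
COND_MEANS = {"11": 2.00, "10": 2.05, "01": 2.10, "00": 2.15}  # volts, assumed
COND_SIGMA = 0.10                                               # volts, assumed


def mixture_mean_and_variance(means, sigma):
    """Mean and variance of an equal-weight mixture (law of total variance)."""
    n = len(means)
    overall_mean = sum(means) / n
    # Variance contributed by the conditional means being spread apart.
    spread_of_means = sum((m - overall_mean) ** 2 for m in means) / n
    return overall_mean, sigma ** 2 + spread_of_means


overall_mean, overall_var = mixture_mean_and_variance(
    list(COND_MEANS.values()), COND_SIGMA
)
conditional_var = COND_SIGMA ** 2
print(f"conditional variance: {conditional_var:.6f}")
print(f"overall variance:     {overall_var:.6f}")
```

The overall variance is the conditional variance plus the spread of the conditional means, so it is larger whenever the aggressor value shifts the victim's threshold voltage.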
[Figure labels: variance of overall distribution; variance of conditional distribution; distance of conditional distribution pairs]
Slide8
Multiple Sets of Reference Voltages
The optimal read reference voltage is (μ1+μ2)/2 from the previous model
REFx is the single read reference voltage for the overall distribution
REFx11 is the read reference voltage for the conditional distribution whose neighbor (aggressor) cell is programmed with the value “11”
For 2-bit MLC flash, there can be four additional read voltages, one per conditional distribution
Because of the smaller variance (narrower distributions), using multiple sets of reference voltages (REFx11, REFx00, REFx10, REFx01) instead of a single set (REFx) can reduce raw BER
Slide9
Measurement Analysis
SNR (Signal to Noise Ratio): x = (μ2-μ1)/2σ, the argument of Q(x); a larger SNR gives a lower raw BER
Due to their smaller variance (narrower distribution), conditional distributions are likely to produce fewer errors when the threshold voltage is read
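The effect of a narrower distribution on raw BER can be checked numerically. This is a small sketch using the identity Q(x) = 0.5 * erfc(x / sqrt(2)); the means and sigmas are illustrative assumptions, not values from the paper.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def raw_ber(mu1, mu2, sigma):
    """Raw BER for two equal-variance Gaussian states read at (mu1 + mu2) / 2."""
    return q_function((mu2 - mu1) / (2.0 * sigma))


# Assumed values: same distance between states, different spreads.
MU1, MU2 = 2.0, 2.6            # volts, hypothetical
SIGMA_OVERALL = 0.15           # wider overall distribution, hypothetical
SIGMA_CONDITIONAL = 0.10       # narrower conditional distribution, hypothetical

ber_overall = raw_ber(MU1, MU2, SIGMA_OVERALL)          # x = 2
ber_conditional = raw_ber(MU1, MU2, SIGMA_CONDITIONAL)  # x = 3
print(f"overall:     x = 2, raw BER = {ber_overall:.5f}")
print(f"conditional: x = 3, raw BER = {ber_conditional:.5f}")
```

With the distance held fixed, shrinking σ from 0.15 to 0.10 raises x from 2 to 3 and lowers Q(x) by more than an order of magnitude in this toy setting.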
[Figure: conditional distributions]
Slide10
Neighbor-cell Assisted Correction
(1) read the target page using the overall read reference (REFx)
(2) check ECC; if it fails, NAC starts
(3) first read the neighbor (aggressor) pages (MSB/LSB) using REFx
(4) then read the target page using the conditional read references (REF00, 11, 10, 01)
(5) once the page is partially corrected, run ECC again
(6) if ECC fails, try another conditional read reference
(7) if ECC keeps failing after all conditional references have been tried, return an error
Slide11
NAC Implementation
Page-to-be-Corrected Buffer: stores the final read data
Neighbor LSB/MSB Page Buffers: store the aggressor page data
Bit1/Bit2: determine which conditional reference voltage is used
Local-Optimum-Read Buffer: stores the temporary page read with one of the four conditional references
Slide12
Prioritized NAC
NAC degrades latency due to the additional reads (up to +6 beyond the first read)
The first read with the overall read reference (1)
The reads for the neighbor MSB/LSB pages (2)
The reads with the conditional read references (4)
Observation reveals that specific errors are dominant
Pi+1(11)->Pi: a cell in state Pi+1 whose neighbor cell is 11 is misread as Pi
Try REFx11 first among the four conditional references
If ECC still fails, then use REFx10, REFx01, REFx00
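The prioritized NAC read flow can be sketched as below. This is an illustrative model, not the authors' implementation: `nac_read`, its three callbacks, and the toy page data are all hypothetical, the neighbor LSB/MSB reads are folded into one callback, and the toy ECC simply counts mismatches against known data instead of decoding a real code.

```python
from typing import Callable, List, Optional, Sequence

# Priority order taken from the slide: REFx11 first, then REFx10, REFx01, REFx00.
PRIORITY = ("11", "10", "01", "00")


def nac_read(
    read_target: Callable[[str], List[int]],
    read_neighbor: Callable[[], List[str]],
    ecc_passes: Callable[[List[int]], bool],
    priority: Sequence[str] = PRIORITY,
) -> Optional[List[int]]:
    """Hypothetical NAC read flow; all three callbacks are stand-ins."""
    data = read_target("overall")        # normal read with REFx
    if ecc_passes(data):                 # ECC succeeds: NAC is not needed
        return data
    neighbor = read_neighbor()           # aggressor value per cell, e.g. "11"
    for value in priority:               # one conditional reference per pass
        local = read_target(value)       # local-optimum read with REFx<value>
        # Keep the re-read bit only where the aggressor cell holds `value`.
        data = [new if nb == value else old
                for old, new, nb in zip(data, local, neighbor)]
        if ecc_passes(data):             # retry ECC on the partially corrected page
            return data
    return None                          # every conditional reference failed


# Toy model (entirely made up) to exercise the flow.
TRUTH = [1, 0, 1, 1, 0, 0, 1, 0]
NEIGHBOR = ["11", "00", "11", "10", "01", "11", "00", "10"]
reads_issued: List[str] = []


def toy_read_target(ref: str) -> List[int]:
    reads_issued.append(ref)
    if ref == "overall":
        # The overall reference misreads cells 0 and 2, whose aggressor is "11".
        return [0, 0, 0, 1, 0, 0, 1, 0]
    # A conditional reference is only trusted for cells with a matching aggressor;
    # the toy flips the other bits to show that the merge ignores them.
    return [bit if nb == ref else 1 - bit for bit, nb in zip(TRUTH, NEIGHBOR)]


def toy_ecc_passes(page: List[int]) -> bool:
    # Stand-in for ECC with a correction capability of 1 bit per page.
    return sum(a != b for a, b in zip(page, TRUTH)) <= 1


result = nac_read(toy_read_target, lambda: NEIGHBOR, toy_ecc_passes)
print(result == TRUTH, reads_issued)
```

In this toy run both errors sit in cells whose aggressor is "11", so the first conditional read already brings the page within ECC capability, which is the case the prioritization is meant to exploit.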
[Figure label: P/E cycle increases]
Slide13
Lifetime Extension
NAC with different strengths (different numbers of conditional references)
Lifetime can be largely increased by lowering raw BER (to a level ECC can correct)
Slide14
Performance Analysis
Low P/E cycles: performance improves slightly, since NAC's neighbor MSB/LSB page reads generate hits in the SSD buffer due to the good locality of some workloads
18K~24K P/E cycles: less than 5% degradation while providing 33% lifetime improvement
Over 25K P/E cycles: sharply increased latency, because one of every 3 reads requires NAC due to ECC failure
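A rough back-of-the-envelope for the last case, counting flash reads rather than latency. It is a sketch under simplifying assumptions: every NAC invocation issues two neighbor reads plus between one and four conditional reads, and SSD buffer hits are ignored. The function name `expected_flash_reads` is hypothetical.

```python
from fractions import Fraction


def expected_flash_reads(p_nac, conditional_reads):
    """Average flash reads per host read: 1 normal read, plus NAC reads on ECC failure."""
    neighbor_reads = 2  # neighbor LSB and MSB pages
    return 1 + p_nac * (neighbor_reads + conditional_reads)


p = Fraction(1, 3)  # "one of every 3 reads requires NAC"
best = expected_flash_reads(p, 1)   # the first conditional reference succeeds
worst = expected_flash_reads(p, 4)  # all four conditional references are needed
print(best, worst)
```

Under these assumptions a host read costs between 2 and 3 flash reads on average once a third of reads need NAC, which is consistent with the sharp latency increase reported at high P/E cycle counts.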