CAFO: Cost Aware Flip Optimization for Asymmetric Memories


Rakan Maddah, Seyed Mohammad Seyedzadeh and Rami Melhem, Computer Science Department, University of Pittsburgh. HPCA 2015.



Presentation Transcript

Slide1

CAFO: Cost Aware Flip Optimization for Asymmetric Memories

Rakan Maddah*, Seyed Mohammad Seyedzadeh and Rami Melhem
Computer Science Department, University of Pittsburgh

HPCA 2015

Slide2

Introduction

DRAM and NAND Flash are facing physical limitations that put their scalability into question
  DRAM: decrease in cell reliability and increase in power consumption
  NAND Flash: endurance degradation and increase in the number of transient and hard errors
Phase-Change Memory (PCM) and Spin-Transfer Torque Random Access Memory (STT-RAM) are promising alternatives
  Scalability, low access latency and close to zero leakage power
  Initial assessments and evaluations are encouraging

Slide3

Challenges

PCM and STT-RAM have a number of challenges that need to be dealt with before deployment in functional systems
  PCM suffers from limited endurance
  STT-RAM suffers from a high write bit error rate
Solution: bit flip minimization
  Service write requests while flipping as few bits as possible
  Preserves PCM’s endurance and improves STT-RAM’s write reliability

Slide4

Previous Work

Differential Write: compares old data against new data and flips only the differing cells.
[Figure: Old 00111, New 01001. Saves 2 bit flips]

Flip-N-Write: encodes write data into either its regular or its inverted form and picks the encoding that yields fewer flips against the old data.
[Figure: Old 00111, New 01001, inverted New 10110. Saves 3 bit flips]

Flip-Min: encodes write data into a set of data vectors and picks the vector that yields the fewest flips against the old data.
[Figure: Old 00111, candidates New1 01001, New2 10110, New3 10111. Saves 4 bit flips]
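The flip counts in these examples can be checked with a minimal sketch. The helper names `flips`, `invert` and `flip_n_write` are illustrative and not from the presentation, the bit patterns are the slide's example as read from this transcript, and the Flip-N-Write flag bit is ignored for simplicity.

```python
def flips(old: str, new: str) -> int:
    """Count the cells that differ between two equal-length bit strings."""
    if len(old) != len(new):
        raise ValueError("bit strings must have equal length")
    return sum(o != n for o, n in zip(old, new))


def invert(bits: str) -> str:
    """Return the bitwise complement of a bit string."""
    return "".join("1" if b == "0" else "0" for b in bits)


def flip_n_write(old: str, new: str):
    """Pick the regular or inverted form, whichever flips fewer cells.

    Simplification: the cost of updating the inversion flag is not counted.
    """
    best = min((new, invert(new)), key=lambda cand: flips(old, cand))
    return best, flips(old, best)


old, new = "00111", "01001"
print(flips(old, new))         # differential write: 3 flips (saves 2 of 5)
print(flip_n_write(old, new))  # ('10110', 2): saves 3 of 5
print(flips(old, "10111"))     # third Flip-Min candidate: 1 flip (saves 4 of 5)
```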

Slide5

Write Asymmetries

PCM
The RESET state is more detrimental to endurance than the SET state
[Figure: PCM programming pulses, power versus time, for SET (“1”) and RESET (“0”)]

STT-RAM
Anti-parallel magnetization is more prone to write errors than parallel magnetization
[Figure: two cell stacks (free layer, oxide layer, reference layer) showing parallel magnetization (“0”) and anti-parallel magnetization (“1”)]

Slide6

Contribution

Observation: existing schemes fail to exploit the write asymmetry

[Figure: Old 0001. Writing New 1111 saves 1 bit flip; writing New 0000 saves 3 bit flips]

Writing a “0” is 4 times more detrimental to endurance than writing a “1”
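A small sketch makes the point concrete: with the 4:1 ratio stated on the slide, the candidate that flips fewer cells is the more expensive one. The constants `SET_COST` and `RESET_COST` and the helper name are illustrative, and the bit patterns are the slide's example as read from this transcript.

```python
SET_COST = 1    # cost of flipping a cell to "1" (assumed unit cost)
RESET_COST = 4  # cost of flipping a cell to "0": 4x, as stated on the slide


def flips_and_cost(old: str, new: str):
    """Return (number of flipped cells, asymmetric write cost)."""
    n_flips = 0
    cost = 0
    for o, n in zip(old, new):
        if o != n:
            n_flips += 1
            cost += SET_COST if n == "1" else RESET_COST
    return n_flips, cost


old = "0001"
print(flips_and_cost(old, "1111"))  # (3, 3): more flips, lower cost
print(flips_and_cost(old, "0000"))  # (1, 4): fewer flips, higher cost
```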

The number of bit flips is oblivious to the write asymmetry!

Slide7

Contribution

Observation: existing schemes fail to exploit the write asymmetry
  Focusing solely on the number of bit flips is oblivious to the write asymmetry
Proposal: move from the concept of “bit flip reduction” to “cost reduction”
Cost Aware Flip Optimization (CAFO)
  Cost model: captures the write asymmetry and assigns a cost to a given write operation
  Coding engine: encodes the write data into a form that results in an overall cost reduction

Slide8

Cost Model

Compare write data to currently stored data and associate a cost with each cell
The costs “a”, “b”, “c” and “d” depend on the technology being modeled and the optimization objective (endurance, energy, error rate)

Currently stored data: 0 0 1 1 0 1 1 1
New data:              1 0 1 0 1 0 1 0
Cost of writing:       a c d b a b d b

a: 0→1, b: 1→0, c: 0→0, d: 1→1

Write cost: the sum of the per-cell costs (for the example above, C = 2a + 3b + c + 2d)

 

With a write cost, we can define a gain among different encodings

Slide9

Gain Calculation

a: 0→1, b: 1→0, c: 0→0, d: 1→1
Costs: a = 1, b = 2, c = 0, d = 0

Currently stored data: 0 0 1 1 0 1 1 1
New data:              1 0 1 0 1 0 1 0
Cost of writing:       a c d b a b d b
C = 2a + 3b + 1c + 2d = 8

Encoded data:          0 1 0 1 0 1 0 1
Cost of writing:       c a b d c d b d
C_encoded = 1a + 2b + 2c + 3d = 5

Gain: G = C - C_encoded = 8 - 5 = 3
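The arithmetic of this gain calculation can be reproduced with a short sketch. It assumes the encoded form is the bitwise inverse of the new data and that the labels denote stored-to-new transitions; the names `COSTS`, `write_cost` and `invert` are illustrative.

```python
# Costs keyed by stored bit + new bit: a = "01", b = "10", c = "00", d = "11".
COSTS = {"01": 1, "10": 2, "00": 0, "11": 0}


def write_cost(stored: str, new: str, costs=COSTS) -> int:
    """Sum the per-cell transition costs of writing `new` over `stored`."""
    return sum(costs[s + n] for s, n in zip(stored, new))


def invert(bits: str) -> str:
    """Bitwise complement of a bit string."""
    return bits.translate(str.maketrans("01", "10"))


stored, new = "00110111", "10101010"
encoded = invert(new)                      # "01010101"
c_plain = write_cost(stored, new)          # 2a + 3b + 1c + 2d
c_encoded = write_cost(stored, encoded)    # 1a + 2b + 2c + 3d
print(c_plain, c_encoded, c_plain - c_encoded)  # 8 5 3
```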

A positive gain implies that it is less costly to write the data encoded
How should the data be encoded?

Slide10

Encoding

[Figure: data matrix with its auxiliary bits labeled]

Auxiliary bits serve as inversion flags
Coding steps:
  Compute row gains
  Flip all rows with positive gain

Slide11

Encoding

Auxiliary bits serve as inversion flags
Coding steps:
  Compute row gains
  Flip all rows with positive gain
  Compute column gains
  Flip all columns with positive gain
  Repeat the process until all rows and columns show a zero or negative gain
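The coding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cafo_encode`, the matrix layout, and the `costs` dictionary keyed by (stored bit, written bit) are hypothetical, and the cost of updating the auxiliary bits themselves is ignored here.

```python
def cafo_encode(stored, data, costs):
    """Alternate row and column inversions until no line has a positive gain.

    stored, data: equal-sized matrices (lists of lists) of 0/1 ints.
    costs: dict mapping (stored_bit, written_bit) to a non-negative cost.
    Returns (encoded matrix, row flags, column flags).
    """
    rows, cols = len(data), len(data[0])
    row_flag = [0] * rows
    col_flag = [0] * cols

    def written(i, j):
        # Value that would be stored given the current inversion flags.
        return data[i][j] ^ row_flag[i] ^ col_flag[j]

    def line_gain(cells):
        # Cost as currently encoded minus cost with these cells inverted.
        return sum(costs[(stored[i][j], written(i, j))]
                   - costs[(stored[i][j], written(i, j) ^ 1)]
                   for i, j in cells)

    changed = True
    while changed:  # every flip strictly lowers the total cost, so this ends
        changed = False
        for i in range(rows):
            if line_gain([(i, j) for j in range(cols)]) > 0:
                row_flag[i] ^= 1
                changed = True
        for j in range(cols):
            if line_gain([(i, j) for i in range(rows)]) > 0:
                col_flag[j] ^= 1
                changed = True

    encoded = [[written(i, j) for j in range(cols)] for i in range(rows)]
    return encoded, row_flag, col_flag


flip_costs = {(0, 1): 1, (1, 0): 1, (0, 0): 0, (1, 1): 0}  # a = b = 1, c = d = 0
stored = [[0, 0, 0, 0] for _ in range(3)]
data = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
print(cafo_encode(stored, data, flip_costs))
# ([[0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0]], [0, 0, 0], [1, 1, 0, 0])
```

In this toy input no row has a positive gain, but two columns do, so the write drops from 5 flipped cells to 1. Rows are flipped one at a time as they are evaluated, which is equivalent to flipping all positive-gain rows at once because flipping one row does not change the gain of another row.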

Alternating between row and column flips yields additional cost reduction

Slide12

Encoding example

Costs: a = 1, b = 1, c = 0, d = 0. A “1” represents a cell that is to be flipped, “0” otherwise.

[Figure: bit matrix of cells to be flipped, with auxiliary bits initialized to 0 and the first gains computed (values shown: +2, 0, -2, +2)]

Slide13

Encoding example

Costs: a = 1, b = 1, c = 0, d = 0. A “1” represents a cell that is to be flipped, “0” otherwise.

[Figure: step “Flip rows with + gain”. Rows with positive gain are inverted, after which the column gains are computed (values shown: -2 +4 -2 -2 -2 +2 0 -4)]

Slide14

Encoding example

Costs: a = 1, b = 1, c = 0, d = 0. A “1” represents a cell that is to be flipped, “0” otherwise.

[Figure: step “Flip columns with + gain”, following “Flip rows with + gain”. After the column flips the column gains shown are -2 -4 -2 -2 -2 -2 0 -4, and one row still shows a gain of +2]

Slide15

Encoding example

Costs: a = 1, b = 1, c = 0, d = 0. A “1” represents a cell that is to be flipped, “0” otherwise.

[Figure: complete sequence of the example (flip rows with + gain, flip columns with + gain, flip rows with + gain). The flip count drops from 33 flips to 21 flips. Final gains shown: -6 0 -6 -2 0 -2 -4 -2 and 0 -6 -0 -6 -4 -4 -2 -2]

Encoding terminates as no row or column shows a positive gain

Slide16

Row only Inversion

[Figure: the same example encoded with row-only inversion (FNW). The flip count drops from 33 flips to 25 flips]

Slide17

Encoding example

Costs: a = 1, b = 1, c = 0, d = 0. A “1” represents a cell that is to be flipped, “0” otherwise.

[Figure: the completed example again (flip rows with + gain, flip columns with + gain, flip rows with + gain); 33 flips reduced to 21 flips]

Can we do better?

Slide18

Encoding Optimization

Write cost can be further reduced even if no row or column shows a positive gain

[Figure: example matrix with row and column gains, none of them positive. Flipping a row and a column together reduces 5 flips to 3 flips]

Slide19

Encoding Optimization

Write cost can be further reduced even if no row or column shows a positive gain
Flipping both a row and a column leaves their intersecting cell un-inverted
The local gain of the intersecting cell has to be subtracted from the total gain of the corresponding row and column
Gain is achieved if G_r + G_c - 2g_{r+c} > 0, where G_r is the row gain, G_c is the column gain and g_{r+c} is the local gain of the intersecting cell
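The condition can be checked on a small matrix. The sketch below assumes the flip-count cost model of the encoding example (a = b = 1, c = d = 0), where inverting a cell marked “1” saves one flip and inverting a cell marked “0” adds one; the helper names and the 4x4 matrix are illustrative and are not the slide's figure.

```python
def local_gain(bit: int) -> int:
    """Gain of inverting one cell under a = b = 1, c = d = 0."""
    return 1 if bit else -1


def row_gain(m, r):
    return sum(local_gain(b) for b in m[r])


def col_gain(m, c):
    return sum(local_gain(row[c]) for row in m)


def joint_gain(m, r, c):
    """Gain of flipping row r and column c together.

    The intersecting cell is inverted twice, so its local gain is removed
    once from the row gain and once from the column gain.
    """
    return row_gain(m, r) + col_gain(m, c) - 2 * local_gain(m[r][c])


# "1" marks a cell that must be flipped. No row or column gain is positive.
m = [[0, 1, 1, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 0]]
print([row_gain(m, r) for r in range(4)])   # [0, -2, -2, -4]
print([col_gain(m, c) for c in range(4)])   # [0, -2, -2, -4]
print(joint_gain(m, 0, 0))                  # 2: 4 flips become 2
```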

[Figure: the example matrix annotated with the column gain G_c, the row gain G_r and the local gain g_{r+c} of the intersecting cell; the row and the column are flipped together]

Slide20

Encoding Optimization (cont.)

Generalize to flipping 1 column with multiple rows (and vice versa)

[Figure: example matrix with row and column gains; 6 flips reduced to 4 flips. Caption as transcribed: “Flip 2 rows and 2column together”]

Slide21

Aux. Bits Cost

The cost of updating the auxiliary bits can be easily incorporated in the gain calculation

a: 0→1, b: 1→0, c: 0→0, d: 1→1
Costs: a = 1, b = 2, c = 0, d = 0

[Figure: currently stored data, new data and inverted data, each with its auxiliary bit and the per-cell cost of writing. Annotations: “Old aux bit has to be flipped to ‘0’” and “Old aux bit stays the same”]

Cost of writing the new data: C = 2a + 3b + 2c + d = 8
Cost of writing the inverted data: C_inverted = 2a + 2b + 2c + 2d = 6
Gain: G = C - C_inverted = 8 - 6 = 2
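One way to fold the auxiliary bit into the gain is to treat it as one more cell of the row. The sketch below assumes an aux bit of “1” means “stored inverted” and “0” means “stored as is”; that convention, the function names and the bit patterns are illustrative assumptions and do not reproduce the slide's figure.

```python
# Costs keyed by stored bit + new bit: a = "01", b = "10", c = "00", d = "11".
COSTS = {"01": 1, "10": 2, "00": 0, "11": 0}


def cost(stored: str, new: str) -> int:
    return sum(COSTS[s + n] for s, n in zip(stored, new))


def invert(bits: str) -> str:
    return bits.translate(str.maketrans("01", "10"))


def gain_with_aux(stored_data: str, stored_aux: str, new_data: str) -> int:
    """Gain of writing the row inverted, counting the aux bit update.

    Assumed convention: aux "0" = regular form, aux "1" = inverted form.
    """
    c_regular = cost(stored_data, new_data) + cost(stored_aux, "0")
    c_inverted = cost(stored_data, invert(new_data)) + cost(stored_aux, "1")
    return c_regular - c_inverted


# Data-only gain for this pattern is 8 - 5 = 3; the aux bit shifts it.
print(gain_with_aux("00110111", "0", "10101010"))  # 2: setting the aux bit costs a = 1
print(gain_with_aux("00110111", "1", "10101010"))  # 5: clearing the aux bit would cost b = 2
```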

Slide22

Decoding

Simple: XOR the corresponding vertical and horizontal aux bits
  Output of “1”: read cell value inverted
  Output of “0”: read cell value un-inverted
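The decoding rule amounts to one XOR per cell. In this sketch the function name `decode` and the list-of-lists layout are illustrative, with `row_aux[i]` and `col_aux[j]` standing for the horizontal and vertical aux bits of cell (i, j).

```python
def decode(encoded, row_aux, col_aux):
    """Recover the original data from an encoded matrix and its aux bits.

    For each cell, row_aux[i] XOR col_aux[j] is 1 when the cell was stored
    inverted (flipped by exactly one of its row or column) and 0 otherwise.
    """
    return [[bit ^ row_aux[i] ^ col_aux[j] for j, bit in enumerate(row)]
            for i, row in enumerate(encoded)]


encoded = [[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 1, 0, 0]]
print(decode(encoded, row_aux=[0, 0, 0], col_aux=[1, 1, 0, 0]))
# [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
```

A cell whose row and column were both flipped has an XOR of 0 and is read as stored, which matches the observation that the intersecting cell ends up un-inverted.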

[Figure: example matrix with its row and column aux bits, shown through an Encode step and a Decode step]

Slide23

Decoding

Simple: XOR the corresponding vertical and horizontal aux bits
  Output of “1”: read cell value inverted
  Output of “0”: read cell value un-inverted

[Figure: continuation of the Encode and Decode example from the previous slide]

Slide24

Evaluation

Compare against Flip-Min and Flip-N-Write (FNW)
Experiment with various block sizes of matching space overhead
Compute the average cost reduction achieved by every scheme relative to differential write
Experiment with a random input stream and memory traces collected from various SPEC benchmark programs
Model both PCM and STT-RAM by setting the cost labels to match the underlying technology

Slide25

Cost Reduction vs. Cost oblivious FNW and Flip-Min

[Figure panels: Overhead 3.125%, Overhead 6.25%, Overhead 12.5%]

Slide26

Cost Reduction vs. Cost oblivious FNW and Flip-Min

[Figure panels: Overhead 3.125%, Overhead 6.25%, Overhead 12.5%]

Slide27

Cost Reduction vs. Cost aware FNW and Flip-Min

[Figure panels: Overhead 3.125%, Overhead 6.25%, Overhead 12.5%]

The cost model improves FNW and Flip-Min

Slide28

Cost Model Improvement

Slide29

Optimization Isolation

At least 15% cost reduction is achieved without the encoding optimization

Slide30

STT-RAM Cost Reduction

Costs: a = 1, b = 0, c = 0, d = 0

[Figure panels: Overhead 3.125%, Overhead 6.25%, Overhead 12.5%]

Slide31

Benchmark Data

Costs: a = 1, b = 2, c = 0, d = 0

Block size: 128B (6.25% overhead)

Slide32

Conclusion

Bit flip minimization techniques are oblivious to write asymmetries
Move from the concept of bit flip minimization to cost reduction
CAFO
  Cost model that captures the asymmetry in the write cost
  2D encoder that minimizes the overall cost of write operations

Slide33

Questions?