Nhut-Minh Ho, Ramesh Vaddi and Weng-Fai Wong - PowerPoint Presentation

Uploaded On 2020-07-03



Presentation Transcript

Slide1

Nhut-Minh Ho, Ramesh Vaddi and Weng-Fai Wong

Multi-objective Precision Optimization of DNNs for Edge Devices

Slide2

Deep NN accelerators have boomed in recent years

Various approximation techniques have been applied

Edge devices: floating point => fixed-point inference engine => reduced fixed-point precision

If some classification error can be tolerated, how many bits are needed to represent each layer?

Introduction

21 March 2019

Multi-objective Precision Optimization of DNNs for Edge Devices


https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148

Slide3

Prior work:

Coarse granularity: one bitwidth for the whole network

Long analysis time (searching); some works did not publish their search time

This work:

Layer-wise granularity

Fast analysis time: complexity ~ O(L) runs of the DNN, where L is the number of layers

Does one bitwidth fit all hardware designs?

Parameterize hardware constraints into precision tuning.

Motivation


Slide4

Outline


Classification Accuracy

Input error

Output error

Fixed point rounding

Forward Error propagation

Backward analysis

Inter-layer relationship

Fixed point rounding

Multiple solutions

Slide5

Fixed-point rounding (of a value)

Fixed point number I.F: I: integer bitwidth, F: fractional bitwidth

Float => fixed point with 3 fraction bits (error = float - fixed point):

0.1234… => 0.125, error = -0.0016

4.5678… => 4.625, error = -0.0572

6.7890… => 6.75, error = 0.039

The error (the noise) is additive noise to the value. Analyzable.
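The rounding on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming round-to-nearest and ignoring saturation of the integer part; the helper name `to_fixed_point` is hypothetical and does not come from the slides.

```python
import math


def to_fixed_point(value: float, fraction_bits: int) -> float:
    """Round value to the nearest multiple of 2**-fraction_bits.

    Assumption: round-to-nearest, no saturation of the integer part.
    """
    scale = 2 ** fraction_bits
    return math.floor(value * scale + 0.5) / scale


# The three example values from the slide, rounded with 3 fraction bits.
for x in (0.1234, 4.5678, 6.7890):
    q = to_fixed_point(x, 3)
    print(f"{x} => {q}, error = {x - q:+.4f}")
```

With 3 fraction bits the step is 2^-3 = 0.125, so each value lands on the nearest multiple of 0.125, which matches the fixed-point values listed on the slide.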

Slide6

Fixed-point rounding (of arrays)

Array of data (N elements), e.g. 80,000 elements

Data in fixed point (I_bits.F_bits), e.g. 5.1 bits

[Figure: histogram of error]

… = log2 …

Slide7

Fixed-point rounding (of feature maps)

N images => N samples

M pixels rounded with the same fractional bitwidth => M random variables drawn from uniform distributions […, …]
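The uniform-noise view of rounding can be checked empirically. The sketch below assumes round-to-nearest, for which the error of a value rounded to F fraction bits lies within ±2^-(F+1); the interval bounds on the slide itself were lost in extraction, so this bound is an assumption of the sketch, as are the data range and the helper name.

```python
import math
import random
import statistics


def to_fixed_point(value, fraction_bits):
    # Round-to-nearest, no saturation (assumption of this sketch).
    scale = 2 ** fraction_bits
    return math.floor(value * scale + 0.5) / scale


random.seed(0)
FRACTION_BITS = 1     # one fractional bit, as in the "5.1 bits" example
N = 80_000            # array size taken from the slide's example

# Hypothetical data: values spread evenly over [-16, 16).
data = [random.uniform(-16.0, 16.0) for _ in range(N)]
errors = [x - to_fixed_point(x, FRACTION_BITS) for x in data]

half_step = 2.0 ** -(FRACTION_BITS + 1)
measured_sd = statistics.pstdev(errors)
# Standard deviation of a uniform distribution of width 2 * half_step.
uniform_sd = 2 * half_step / math.sqrt(12)

print("largest |error|:", max(abs(e) for e in errors), "bound:", half_step)
print("measured sd:", measured_sd, "uniform sd:", uniform_sd)
```

If the errors really behave like uniform noise, the measured standard deviation should sit close to the width of the interval divided by sqrt(12).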

 

Slide8

Dot products

Propagation of rounding error (cont)

Dot product with W (W: constants without rounding error)

Y = dot(X, W)

After cascading dot products, popular ReLUs and poolings, the output error still looks like a normal distribution (excluding zeroes)

Property 1
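A small simulation illustrates why the output error of a dot product tends toward a normal distribution. This sketch covers only a single dot product, not the cascade with ReLUs and poolings described above; the sizes, the weight distribution and the error bound are all hypothetical choices.

```python
import math
import random
import statistics

random.seed(1)
M = 256                  # inputs per dot product (hypothetical size)
N = 2000                 # number of simulated dot products
HALF_STEP = 2.0 ** -4    # input rounding error drawn from [-2^-4, 2^-4]

# W is treated as exact constants, as on the slide.
weights = [random.gauss(0.0, 1.0) for _ in range(M)]


def output_error():
    # dot(X + e, W) - dot(X, W) = dot(e, W), so X itself is not needed.
    return sum(random.uniform(-HALF_STEP, HALF_STEP) * w for w in weights)


errors = [output_error() for _ in range(N)]

input_sd = 2 * HALF_STEP / math.sqrt(12)
predicted_sd = input_sd * math.sqrt(sum(w * w for w in weights))
measured_sd = statistics.pstdev(errors)

# Excess kurtosis is 0 for a normal distribution and -1.2 for a uniform one.
mean = statistics.fmean(errors)
excess_kurtosis = (sum((e - mean) ** 4 for e in errors) / N) / measured_sd ** 4 - 3.0

print("predicted sd:", predicted_sd, "measured sd:", measured_sd)
print("excess kurtosis:", excess_kurtosis)
```

The output error is a weighted sum of many independent uniform errors, so the central limit theorem pushes it toward a normal shape, and its standard deviation scales with the input error.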

Slide9

Full DNN

Propagation of rounding error (cont)

Input error caused by fixed-point rounding

Propagated output error at layer L, caused by layer K. Standard deviation = …

Slide10

Near-linear relationship

[Figure: VGG]

ReLU, {max, average} pooling and batch norm do not affect the near-linear relationship

Property 2

Slide11

Full DNN

Propagation of rounding error (cont)


 

Measurable constants


Slide12

Measurement

Forward pass

Inject …

Measure the sd of (…)

Repeat 20 times for linear regression
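The measurement loop can be sketched on a toy network. Everything here is an assumption made for illustration: a two-layer ReLU network with random weights stands in for the DNN, noise is injected at the input rather than at an arbitrary layer K, and "20 times" is read as 20 different injected error levels. The actual implementation is the Caffe tool referenced in the conclusion.

```python
import math
import random
import statistics

random.seed(2)
N_IN, N_HID, N_OUT = 16, 16, 4    # toy layer sizes (hypothetical)
N_SAMPLES = 200


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0 / math.sqrt(cols)) for _ in range(cols)]
            for _ in range(rows)]


W1, W2 = rand_matrix(N_HID, N_IN), rand_matrix(N_OUT, N_HID)


def forward(x):
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in W1]  # ReLU
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]


inputs = [[random.uniform(-1.0, 1.0) for _ in range(N_IN)] for _ in range(N_SAMPLES)]
clean = [forward(x) for x in inputs]


def output_error_sd(delta):
    """Inject uniform noise in [-delta, delta]; return the sd of the output error."""
    diffs = []
    for x, y in zip(inputs, clean):
        noisy = forward([v + random.uniform(-delta, delta) for v in x])
        diffs.extend(a - b for a, b in zip(noisy, y))
    return statistics.pstdev(diffs)


deltas = [0.01 * k for k in range(1, 21)]    # 20 injected error levels
sds = [output_error_sd(d) for d in deltas]

# Ordinary least squares fit: sd ~ slope * delta + intercept.
mean_d, mean_s = statistics.fmean(deltas), statistics.fmean(sds)
slope = (sum((d - mean_d) * (s - mean_s) for d, s in zip(deltas, sds))
         / sum((d - mean_d) ** 2 for d in deltas))
intercept = mean_s - slope * mean_d
ss_res = sum((s - (slope * d + intercept)) ** 2 for d, s in zip(deltas, sds))
ss_tot = sum((s - mean_s) ** 2 for s in sds)
r_squared = 1.0 - ss_res / ss_tot

print("slope:", slope, "intercept:", intercept, "R^2:", r_squared)
```

The fitted slope and intercept play the role of the per-layer measurable constants: once they are known, an output error target can be mapped back to an input error without rerunning the network.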


Slide13

Combining the effect of rounding on different layers

Applications

Forward pass

Additive effect of multiple independent noise sources.

Property 3
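The additive effect rests on a standard fact: the variance of a sum of independent noise sources equals the sum of their variances. A short numerical check, with two hypothetical noise sources standing in for the errors contributed by two different layers:

```python
import random
import statistics

random.seed(3)
N = 50_000

# Two independent, hypothetical noise sources arriving at the same output.
noise_a = [random.uniform(-0.125, 0.125) for _ in range(N)]
noise_b = [random.gauss(0.0, 0.05) for _ in range(N)]
combined = [a + b for a, b in zip(noise_a, noise_b)]

var_a = statistics.pvariance(noise_a)
var_b = statistics.pvariance(noise_b)
var_combined = statistics.pvariance(combined)

print("var_a + var_b:", var_a + var_b)
print("var(combined):", var_combined)
```

Because variances add, the squared standard deviation at the output can be split into per-layer shares, which is what makes a layer-by-layer error budget possible.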

Slide14

Parameterize the objective

Relationship: classification accuracy, …, fixed-point bitwidth of layer K

Backward analysis: given desired …, find … of each layer K

Varying … = varying the approximation degree of each layer

where …

…; … => fraction bitwidth X1 = 1, X2 = 3

…; … => fraction bitwidth X1 = 2, X2 = 2
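The backward analysis can be sketched as follows. The slide's own symbols and formulas were lost in extraction, so every name here is hypothetical: `shares` are per-layer fractions of the output error variance that sum to 1, `lambdas` and `thetas` are the slope and intercept of an assumed linear relation sd = lambda * delta + theta for each layer, and the error bound of F fraction bits is taken as 2^-(F+1) (round-to-nearest). The constants are made up so that the two allocations mirror the X1, X2 examples on the slide.

```python
import math


def fraction_bits_for(delta):
    """Smallest F whose round-to-nearest error bound 2**-(F+1) is <= delta."""
    return max(0, math.ceil(-math.log2(delta) - 1))


def backward_analysis(sigma_out, shares, lambdas, thetas):
    """Map a target output sd to per-layer fraction bitwidths (sketch)."""
    assert abs(sum(shares) - 1.0) < 1e-9, "shares must sum to 1"
    bits = []
    for share, lam, theta in zip(shares, lambdas, thetas):
        sd_budget = sigma_out * math.sqrt(share)   # this layer's part of the output sd
        delta = (sd_budget - theta) / lam          # invert sd = lam * delta + theta
        bits.append(fraction_bits_for(delta))
    return bits


# Made-up constants for a two-layer example.
SIGMA_OUT = 0.3
LAMBDAS = (1.0, 1.2)
THETAS = (0.0, 0.0)

print(backward_analysis(SIGMA_OUT, (0.8, 0.2), LAMBDAS, THETAS))  # [1, 3]
print(backward_analysis(SIGMA_OUT, (0.5, 0.5), LAMBDAS, THETAS))  # [2, 2]
```

Shifting the error budget toward the first layer lets it use fewer fraction bits at the cost of more bits in the second layer, which is the trade-off the two examples on the slide show.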

 

Slide15

A simple working solution: … = 1/L

Workflow

Classification Accuracy

Binary search for …

Trained Weights

Measure … and … for each layer K

Parameterize the objective (energy, bandwidth) by using different …

Get …

Get the fixed-point representation of layer K

Accuracy ↓ when … ↑, monotonically
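Monotonicity is what makes a binary search valid here. The sketch below assumes the searched quantity is an output noise level `sigma`; `toy_accuracy` is a made-up stand-in for evaluating the network on a validation set with noise injected, not a model of any real DNN.

```python
def toy_accuracy(sigma):
    """Hypothetical stand-in: accuracy falls monotonically as noise grows."""
    return 0.90 / (1.0 + 4.0 * sigma * sigma)


def largest_tolerable_sigma(accuracy_fn, target, lo=0.0, hi=4.0, iterations=40):
    """Binary search for the largest noise level that still meets the target."""
    assert accuracy_fn(lo) >= target > accuracy_fn(hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if accuracy_fn(mid) >= target:
            lo = mid    # still accurate enough: allow more noise
        else:
            hi = mid    # too much noise: back off
    return lo


sigma_max = largest_tolerable_sigma(toy_accuracy, target=0.89)
print("largest tolerable sigma:", sigma_max)
```

Each iteration costs one accuracy evaluation, so the number of network runs grows with the logarithm of the search range rather than with the number of candidate noise levels.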

 

Slide16

First objective (MAC energy)

… = normalized MAC count of layer K

[Plot; horizontal axis: Layer]

Judd, Patrick, et al. "Stripes: Bit-serial deep neural network computing." Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 2016.

Slide17

Second objective (Input Read bandwidth)

… = normalized input count of layer K

[Plot; horizontal axis: Layer]
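The per-layer weights used by the two objectives can be computed from layer shapes. In this sketch the MAC count of a convolution is the usual output positions × output channels × kernel area × input channels. The slides do not say how the counts are normalized, so dividing by the network total is an assumption, and the two layers are made up.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    in_channels: int
    out_channels: int
    kernel: int      # square kernel side
    in_size: int     # square input feature map side
    out_size: int    # square output feature map side

    def mac_count(self) -> int:
        return (self.out_size ** 2 * self.out_channels
                * self.kernel ** 2 * self.in_channels)

    def input_count(self) -> int:
        return self.in_size ** 2 * self.in_channels


def normalize(counts):
    # Assumption: normalize so that the values sum to 1 over all layers.
    total = sum(counts)
    return [c / total for c in counts]


# Two made-up convolution layers.
layers = [ConvLayer(3, 16, 3, 32, 32), ConvLayer(16, 32, 3, 16, 16)]

mac_weights = normalize([layer.mac_count() for layer in layers])
input_weights = normalize([layer.input_count() for layer in layers])

print("normalized MAC counts:  ", mac_weights)
print("normalized input counts:", input_weights)
```

A layer with a large normalized count dominates the corresponding objective, so saving a bit there is worth more than saving a bit in a small layer.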

 

Slide18

Enough constraints

Using a non-linear optimization script (e.g. Octave's sqp) gives … in < 5 mins

More objectives can be added (minimum & maximum bitwidth, inter-layer bitwidth dependency, writeback output bandwidth)

Solving for …

L equations

No objective? Simply set … = 1/L
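To make the optimization step concrete, here is a toy version for two layers. It is not the sqp-based solver mentioned above: a plain grid search is swapped in so the example needs no external solver. The cost function (a weighted sum of fraction bits), the variance shares, the linear constants and all numbers are hypothetical.

```python
import math


def fraction_bits_for(delta):
    """Smallest F whose round-to-nearest error bound 2**-(F+1) is <= delta."""
    return max(0, math.ceil(-math.log2(delta) - 1))


def bits_for_shares(sigma_out, shares, lambdas):
    # Assumed relation: layer sd budget = sigma_out * sqrt(share) = lambda * delta.
    return [fraction_bits_for(sigma_out * math.sqrt(s) / lam)
            for s, lam in zip(shares, lambdas)]


def grid_search(sigma_out, lambdas, cost_weights, steps=100):
    """Try shares (i/steps, 1 - i/steps) and keep the cheapest allocation."""
    best = None
    for i in range(1, steps):
        shares = (i / steps, 1.0 - i / steps)
        bits = bits_for_shares(sigma_out, shares, lambdas)
        cost = sum(w * b for w, b in zip(cost_weights, bits))
        if best is None or cost < best[0]:
            best = (cost, shares, bits)
    return best


SIGMA_OUT, LAMBDAS = 0.3, (1.0, 1.2)
COST_WEIGHTS = (0.9, 0.1)    # made-up normalized per-layer cost (e.g. MAC share)

cost, shares, bits = grid_search(SIGMA_OUT, LAMBDAS, COST_WEIGHTS)
uniform_bits = bits_for_shares(SIGMA_OUT, (0.5, 0.5), LAMBDAS)
uniform_cost = sum(w * b for w, b in zip(COST_WEIGHTS, uniform_bits))

print("equal shares:", uniform_bits, "cost", uniform_cost)
print("grid search: ", bits, "cost", cost, "shares", shares)
```

Because equal shares are one of the grid points, the search can never do worse than the "no objective" default; a continuous solver such as sqp generalizes the same idea to many layers and extra constraints.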

 

Slide19

Minimizing input read bandwidth

Judd, Patrick, et al. "Stripes: Bit-serial deep neural network computing." Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 2016, and uniform bitwidth

Slide20

Minimizing Energy of MAC operations


Slide21

Can reduce bitwidth further with retraining & fine-tuning

Weight bitwidth is also important. Currently a uniform weight bitwidth is used across the DNN.

Need a better model of energy that covers all components, not the MAC alone.

Limitation & Future work

Slide22

Fast analysis time

Can embed hardware design constraints into the solver

Open-ended optimization

Implemented in Caffe, available at: https://www.github.com/minhhn2910/mupod

Conclusion

Slide23


Thank you

Q & A

Slide24

Appendix

 

where …

Derive bitwidth

Output error

Measured constants