Daya S. Khudia, Babak Zamirai, Mehrzad Samadi, and Scott Mahlke. June 17, 2015. Computer Engineering Laboratory, University of Michigan.
Slide1
Rumba: An Online Quality Management System for Approximate Computing
Daya S. Khudia, Babak Zamirai, Mehrzad Samadi, and Scott Mahlke
June 17, 2015
Computer Engineering Laboratory
University of Michigan
Slide2
Approximate Computing
[Figure: design points and the Pareto frontier, trading accuracy against performance and energy. Axes: accuracy; speedup and energy reduction. Labels: 90%, 10%, 10x]
[Esmaeilzadeh, CACM’15]
Slide3
Soft Applications
100% accuracy is not always required
Image Processing
Computer Vision
Data Analytics
Media Applications
Robotics
But …
Slide4
Quality is Important
[Figure: example outputs at 100%, 90%, and 80% accuracy]
Building acceptable systems out of inexact hardware/software components
Slide5
Approximation Challenge - 1
Large errors in output elements are critical
[Figure: error (0% to 100%) against percentage of output elements, separating high-error elements from low-error elements. Cases shown: no errors; 10% of the pixels have 100% error; 100% of the pixels have 10% error]
Requirement: Large errors should be detected and small errors can be ignored
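The two error cases named on this slide share the same mean error, so an average alone cannot tell them apart. A minimal sketch of that arithmetic, assuming a hypothetical 1,000-pixel output and an assumed 50% cut-off for a "large" error (neither value comes from the slides):

```python
# Hypothetical per-pixel relative errors for a 1,000-pixel output (illustrative only).
NUM_PIXELS = 1000

# Case A: 10% of the pixels have 100% error, the rest are exact.
case_a = [1.0] * (NUM_PIXELS // 10) + [0.0] * (NUM_PIXELS - NUM_PIXELS // 10)

# Case B: 100% of the pixels have 10% error.
case_b = [0.1] * NUM_PIXELS


def mean_error(errors):
    """Average relative error over all output elements."""
    return sum(errors) / len(errors)


def fraction_above(errors, threshold):
    """Fraction of output elements whose error exceeds the threshold."""
    return sum(1 for e in errors if e > threshold) / len(errors)


LARGE_ERROR = 0.5  # assumed cut-off for a "large" error

mean_a, mean_b = mean_error(case_a), mean_error(case_b)
large_a = fraction_above(case_a, LARGE_ERROR)
large_b = fraction_above(case_b, LARGE_ERROR)

print(f"mean error:   A={mean_a:.3f}  B={mean_b:.3f}")
print(f"large errors: A={large_a:.3f}  B={large_b:.3f}")
```

Both cases average out to about 10% error, but only case A contains elements above the cut-off, which is the situation a detector has to catch.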
Slide6
Approximation Challenge - 2
Output quality is input dependent
[Figure: average error for different inputs, as high as 23%]
Requirement: Online detection of errors
Slide7
Approximation Challenge - 3
Measuring output quality is expensive
Usually done by sampling over time
[Figure: quality over time, with a target and a band from target - delta to target + delta. Labels: "Check the quality", "I am here"]
Requirement: Inexpensive continuous monitoring
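A small sketch can illustrate why periodic sampling may miss quality violations. The error trace, the sampling period, and the error bound below are made-up values, not measurements from the talk:

```python
# Hypothetical per-invocation output errors; two short spikes exceed the bound.
errors = [0.02, 0.03, 0.02, 0.40, 0.03, 0.02, 0.02, 0.35, 0.03, 0.02, 0.02, 0.03]

MAX_ERROR = 0.10     # assumed acceptable error bound
SAMPLE_PERIOD = 5    # assumed: quality is measured on every 5th invocation


def violations(trace, indices):
    """Return the checked indices whose error exceeds the bound."""
    return [i for i in indices if trace[i] > MAX_ERROR]


# Sampling-based check: only every SAMPLE_PERIOD-th invocation is measured.
sampled = violations(errors, range(0, len(errors), SAMPLE_PERIOD))

# Continuous check: every invocation is examined.
continuous = violations(errors, range(len(errors)))

print("sampled check found:", sampled)
print("continuous check found:", continuous)
```

The sketch treats the true error of every invocation as known. In practice that measurement is the expensive part, which is why the requirement asks for a check that is both continuous and cheap.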
Slide8
Rumba: Solution Overview
[Figure: block diagram. Labels: Application and Inputs, Approximate Accelerator, Approximate Results, Rumba (Detect, Recover), CPU, Robust Results]
Rumba solves the approximation challenges by being:
Lightweight (detection and recovery)
Continuous and online
Tunable with quality feedback
Slide9
Inexact Hardware: Neural Processing Unit
[Figure: NPU workflow with Offline and Runtime phases. Labels: Program; Find approximable code; Train neural network; Find suitable number of layers and neurons in each layer; Get neural network weights; Transfer accelerator configuration; Neural Accelerator Execution; Inputs/Outputs]
Inexact but 2.3x faster and 3x more energy efficient
[Esmaeilzadeh, Micro’12]
Slide10
Rumba: Design
[Figure: block diagram. Labels: CPU, Rumba Recovery, Approximate Accelerator, Rumba Detection, Input Queue, Output Queue, Config Queue, Recovery Queue, Output Merger, Online Tuner, User Specification]
Detection Contributions
Output-based methods: calculate errors by observing the approximate output
Input-based methods: predict errors based on the current inputs
Slide11
Output Based Methods
Ex: Gamma correction on images
Pixels are read row-wise
Exhibit temporal similarity
Drastic changes only at the edges and endpoints
[Figure: flowchart. Inputs are approximated on the accelerator. Temporally similar? yes: useful output; no: discard accelerator output and recompute]
Slide12
Output Based Method: Using Anomalies
Compare with the history of previously computed elements
Detection system exploits temporal similarity in output elements
Many applications exhibit temporal similarity
[Figure: output elements over time compared against their history, with a false positive marked]
Slide13
Lightweight Detection
Maintain a moving average
Compare accelerator output with the moving average
Exponential Moving Average (EMA)
α = Smoothing factor
e = Current element
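A minimal sketch of this check, assuming the standard EMA update EMA = α·e + (1 − α)·EMA_previous and a relative-deviation test. The threshold, the value of α, the choice to update the average only with unflagged elements, and the function name are illustrative assumptions, not details taken from the slides:

```python
def ema_detect(outputs, alpha=0.5, threshold=0.5):
    """Flag outputs that deviate strongly from the EMA of earlier outputs.

    alpha     -- smoothing factor (assumed value)
    threshold -- assumed maximum relative deviation from the moving average
    Returns a list of booleans, True where an element is flagged for recovery.
    """
    flags = []
    ema = None
    for e in outputs:
        if ema is None:
            # No history yet: accept the first element and seed the average.
            flags.append(False)
            ema = e
            continue
        deviation = abs(e - ema) / max(abs(ema), 1e-9)
        flagged = deviation > threshold
        flags.append(flagged)
        if not flagged:
            # Assumption: only trusted elements update the history.
            ema = alpha * e + (1 - alpha) * ema
    return flags


# Hypothetical accelerator outputs for neighboring pixels; 0.95 is an outlier.
accelerator_outputs = [0.50, 0.52, 0.51, 0.95, 0.53, 0.52]
flags = ema_detect(accelerator_outputs)
print(flags)
```

The detector keeps a single running value per output stream, so its cost is a few arithmetic operations per element. A genuine edge in an image would be flagged in the same way as an accelerator error, which corresponds to the false-positive case.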
Limitation: What if there is no temporal similarity?
Solution: Input-based methods
Slide14
Observation: Input Dependent Error
Specific inputs show errors
[Figure: Inverse Kinematics; direction of increasing error indicated]
Slide15
Input Based Methods
Error prediction need not be very accurate
Since we just want to know whether the error is high or not
Prediction should be low-cost
So that the gains of approximation are not nullified
[Figure: flowchart. Inputs; approximate on accelerator; predict error. High error? no: useful output; yes: discard accelerator output and recompute]
Slide16
Error Prediction: Decision Tree
Cost is dependent on the number of levels
More levels => more accuracy + higher cost
Best configuration found by limited exhaustive search
Tree depth limited to a maximum of 7 levels
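A decision-tree error predictor of this kind can be sketched as nested threshold tests. The thresholds and leaf values below reuse the numbers from this slide's example tree, but the tree shape (which test is the root and where each leaf hangs), the 0.30 cut-off, and all function names are assumptions made for illustration:

```python
def predict_error(x):
    """Walk a small, hand-written decision tree and return the predicted error.

    x is the accelerator input vector. The arrangement of the nodes is assumed.
    """
    if x[0] <= 0.50:            # assumed root test
        if x[0] <= 0.01:
            return 0.19
        return 0.40
    if x[1] <= 0.15:
        return 0.01
    return 0.21


HIGH_ERROR = 0.30  # assumed threshold separating high from low predicted error


def process(x, approximate, exact):
    """Use the accelerator result unless the predictor expects a high error."""
    if predict_error(x) > HIGH_ERROR:
        return exact(x)          # discard the accelerator output and recompute
    return approximate(x)        # predicted error is low: keep the approximate output


# Stand-in kernels (hypothetical): the "accelerator" adds a small bias.
def exact_kernel(x):
    return x[0] + x[1]


def approx_kernel(x):
    return x[0] + x[1] + 0.05


kept = process([0.80, 0.10], approx_kernel, exact_kernel)
redone = process([0.20, 0.10], approx_kernel, exact_kernel)
```

Each additional level adds one comparison per prediction, which is the cost side of the accuracy trade-off noted above.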
[Figure: example decision tree with split nodes X[0] <= .01, X[1] <= .15, X[0] <= .50 and leaf predictions Error = .19, Error = .40, Error = .01, Error = .21]
Slide17
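Training Error Predictors
[Figure: training flow. A set of train1 inputs/outputs is used to train the NPU, giving the NPU model. A set of train2 inputs is run through the NPU model (approximate output) and the exact model (exact output); the errors are calculated and used to train the error predictors, giving the error predictor models]
Slide18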
Recovery: Re-computation
Recovery by re-computation
Possible due to data parallel computations
CPU re-computes elements with large errors
Accelerator and the CPU work in tandem
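The interplay of detection, recovery, and output merging might look like the following sketch. The function and queue names are hypothetical, the detector is passed in as a plain predicate, and the two phases run one after the other here, so the sketch does not model the accelerator and CPU working in tandem:

```python
from collections import deque


def run_with_recovery(inputs, approximate, exact, is_large_error):
    """Run all elements approximately, then recompute the flagged ones exactly."""
    outputs = [None] * len(inputs)
    recovery_queue = deque()

    # Accelerator pass: every element is computed approximately and checked.
    for i, x in enumerate(inputs):
        y = approximate(x)
        if is_large_error(x, y):
            recovery_queue.append(i)   # hand the element over to the CPU
        else:
            outputs[i] = y             # accept the approximate result

    # CPU pass: elements are independent (data parallel), so each flagged
    # one can be recomputed on its own.
    recomputed = len(recovery_queue)
    while recovery_queue:
        i = recovery_queue.popleft()
        outputs[i] = exact(inputs[i])  # merged back at its original position

    return outputs, recomputed


# Hypothetical kernel: squaring, with an "accelerator" that is wrong for x > 3.
def exact_square(x):
    return x * x


def approx_square(x):
    return x * x if x <= 3 else 0


def detector(x, y):
    # Assumed detector for the demo: a zero output for a non-zero input is suspicious.
    return x != 0 and y == 0


results, fixes = run_with_recovery([1, 2, 3, 4, 5], approx_square, exact_square, detector)
print(results, fixes)
```

Because recomputation only touches flagged elements, the number of fixes directly determines how much of the accelerator's speedup and energy saving is retained.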
[Figure: block diagram. Labels: CPU, Approximate Accelerator, Rumba Detection, Rumba Recovery, Input Queue, Output Queue, Recovery Queue, Output]
Slide19
Experimental Setup/Benchmarks
Application    Domain               Error metric
blackscholes   Financial analysis   Mean Relative Error
fft            Signal processing    Mean Relative Error
inversek2j     Robotics             Mean Relative Error
jmeint         3D gaming            # of Mismatches
jpeg           Compression          Mean Pixel Difference
kmeans         Machine learning     Mean Output Difference
sobel          Image processing     Mean Pixel Difference
Neural Processing Unit (NPU) accelerator
Using PyBrain (Artificial Neural Network) library
Energy modeling
GEM5 + McPAT for applications + Cacti
Slide20
Quality Vs. Recomputations (inversek2j)
[Figure: quality against recomputations for inversek2j. Labels: Better; Unchecked accelerator error; 15%; 62%]
Slide21
Performance Improvements
Quality monitoring has no performance overhead (average case) while reducing errors
Slide22
Energy Savings
Energy savings are determined by the number of fixes
Slide23
False Positives
Detected error that is actually not a large error
Low false positives => no unnecessary recoveries
Slide24
Conclusion
Quality is Important
[Figure: Rumba. Labels: Detect Errors, Re-execute, Robust Results]
Output-based methods are inexpensive
Input-based methods are broadly applicable
Rumba provides quality control in approximate computing and reduces average output error by 2x