Embracing Redundancy in Mobile Computing
Jason Flinn, MobiHeld 2011
Game plan
A bit of a rant about mobile system design
Motivated by difficulties deploying successful(?) research
Motivated by common mistakes I (we?) have made
Sadly, I’m not this guy
But, I will try to be a little provocative
Hopefully, we can start a good discussion
Resource scarcity
What distinguishes mobile systems?
One clear distinction: resource scarcity
Less powerful CPU
Less memory
Smaller storage
Poorer network quality
Limited battery power
…
What's the correct lesson for system design?
The real lesson?
Don't waste CPU cycles!
Every Joule is precious!
Minimize network usage!
Is this how we should build mobile systems?
[Slide graphic: "Stop! Danger! Caution! Warning!" signs over networks and phones]
Pitfall #1
Don't focus on computational resources
Network, battery, CPU, etc.
Easy to measure (esp. with micro-benchmarks)
Instead, focus on human resources
E.g., interactive response time
Application latency, not network bandwidth
Remember the person waiting for results
One systems challenge
Diversity of networks: how to exploit this?
Use option with strongest signal?
Use option with lowest RTT?
Use option with highest bandwidth?
Stripe data across all networks?
All optimize a computational resource
Focus on using the network efficiently
May or may not save human resources
Challenge (cont.)
Diversity of networks, diversity of behavior
Email: fetch messages
Media: download video
How to minimize response time?
Low-latency network for e-mail
Stripe across networks for video
Intentional Networking
Can't consider just network usage
Must consider impact on user response time
Unfortunately, no one-size-fits-all strategy
Must consider intentions in using the network
Ask the user? No, the user's time is precious
Instead, ask the applications
Work with Brett Higgins, Brian Noble, TJ Giuli, and David Watson
Parallels to Parallel Programming
We've seen this before!
Transition from uniprocessing to multiprocessing
Now: uninetworking to multinetworking
What lessons can we learn from history?
Automatic parallelization is hard!
More effective to provide abstractions to applications
Applications express policies for parallelization
System provides mechanism to realize policies
Abstractions
Multiple processors → Multiple networks
Multithreaded programs → Multi-sockets
Locks → IROBs
Condition variables → Ordering constraints
Priorities → Labels
Async. events/signals → Thunks
Key insight: find parallel abstractions for applications
Abstraction: Socket
Socket: logical connection between endpoints
[Diagram: client and server joined by a single socket]
Abstraction: Multi-socket
Multi-socket: virtual connection
Measures performance of each alternative
Encapsulates transient network failure
[Diagram: client and server joined by multiple physical networks under one multi-socket]
Abstraction: Label
Qualitative description of network traffic
Size: small (latency) or large (bandwidth)
Interactivity: foreground vs. background
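A label maps naturally onto a selection policy over the available networks. A minimal sketch of that mapping; `pick_networks`, the network table, and its fields are illustrative inventions, not the actual Intentional Networking API:

```python
# Sketch of label-driven network selection. All names here are hypothetical;
# the real system exposes labels on multi-socket sends.
SMALL, LARGE = "small", "large"        # latency- vs. bandwidth-sensitive
FOREGROUND, BACKGROUND = "fg", "bg"    # interactivity

networks = [
    {"name": "wifi", "rtt_ms": 80,  "bw_kbps": 5000},
    {"name": "3g",   "rtt_ms": 250, "bw_kbps": 800},
]

def pick_networks(size, interactivity):
    """Map a qualitative label to a sending strategy."""
    # (interactivity would additionally drive deferral policy; see thunks)
    if size == SMALL:
        # Latency matters: use only the lowest-RTT option
        return [min(networks, key=lambda n: n["rtt_ms"])]
    # Bandwidth matters: stripe the transfer across every network
    return networks

print([n["name"] for n in pick_networks(SMALL, FOREGROUND)])   # ['wifi']
print([n["name"] for n in pick_networks(LARGE, BACKGROUND)])   # ['wifi', '3g']
```

This is exactly the slide-7 policy: the e-mail fetch (small, foreground) lands on the low-latency network, while the video download is striped.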
Abstraction: IROB
IROB: Isolated Reliable Ordered Bytestream
Guarantees atomic delivery of a data chunk
Application specifies data, atomicity boundary
[Diagram: chunks 1–4 grouped into IROB 1]
Abstraction: Ordering Constraints
App specifies a partial ordering on IROBs
Receiving end enforces delivery order
[Diagram: IROB 3 is tagged "Dep: 2"; the server withholds it from use_data() until IROB 2 has been delivered]
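Receiver-side enforcement of ordering constraints amounts to a small dependency-drain loop: buffer each arriving IROB until everything it depends on has been delivered. A toy sketch (the `Receiver` class is hypothetical, not the real implementation):

```python
class Receiver:
    """Deliver IROBs only after their declared dependencies."""
    def __init__(self):
        self.order = []      # delivery order of IROB ids
        self.done = set()    # ids already delivered
        self.pending = []    # (irob_id, deps, data) awaiting dependencies

    def arrive(self, irob_id, deps, data):
        self.pending.append((irob_id, tuple(deps), data))
        self._drain()

    def _drain(self):
        # Repeatedly deliver any pending IROB whose deps are all satisfied
        progress = True
        while progress:
            progress = False
            for item in list(self.pending):
                irob_id, deps, _data = item
                if all(d in self.done for d in deps):
                    self.pending.remove(item)
                    self.done.add(irob_id)
                    self.order.append(irob_id)
                    progress = True

rx = Receiver()
rx.arrive(3, deps=[2], data=b"third")    # arrives early, must wait for 2
rx.arrive(1, deps=[], data=b"first")
rx.arrive(2, deps=[], data=b"second")    # unblocks IROB 3
print(rx.order)                          # [1, 2, 3]
```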
Abstraction: Thunk
What happens when traffic should be deferred?
Application passes an optional callback + state
Borrows from the PL domain
If no suitable network is available:
Operation will fail with a special code
Callback will be fired when a suitable network appears
Use case: periodic background messages
Send once, at the right time
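The thunk flow can be sketched in a few lines: a send with no suitable network returns a special code and registers the callback, which fires once a network appears. Names and the `DEFERRED` code are invented for illustration:

```python
DEFERRED = -2                      # hypothetical "deferred" result code

class MultiSocket:
    def __init__(self):
        self.network_up = False
        self.thunks = []           # (callback, state) pairs

    def send(self, data, on_ready=None, state=None):
        if self.network_up:
            return len(data)       # sent immediately
        if on_ready is not None:
            self.thunks.append((on_ready, state))  # defer via thunk
        return DEFERRED

    def network_appeared(self):
        self.network_up = True
        pending, self.thunks = self.thunks, []
        for callback, state in pending:
            callback(state)        # app decides what (if anything) to send now

sent = []
ms = MultiSocket()
# Periodic background update: send once, at the right time
result = ms.send(b"status", on_ready=lambda s: sent.append(s),
                 state="latest status")
assert result == DEFERRED
ms.network_appeared()
print(sent)                        # ['latest status']
```

Because the callback receives the application's state, a periodic background sender can transmit only the freshest update instead of every queued one.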
Evaluation: Methodology
Gathered network traces in a moving vehicle
Sprint 3G & open WiFi
Bandwidth up/down, RTT
Replayed in lab
[Map of the trace route]
Evaluation: Comparison Strategies
Generated from the network traces
Idealized migration: always use the best-bandwidth network; always use the best-latency network
Idealized aggregation: aggregate bandwidth, minimum latency
Upper bounds on application-oblivious performance
Evaluation Results: Email
Trace #2: Ypsilanti, MI
[Results chart; annotations: 3%, 7x]
Evaluation Results: Vehicular Sensing
Trace #2: Ypsilanti, MI
[Results chart; annotations: 48%, 6%]
But, some issues in practice…
Followed lab evaluation with a field trial
Deployed on Android phones
Lessons:
Networks failed in unpredictable ways
Hard to differentiate failures from transient delays
Timeouts either too conservative or too aggressive
Predictions not as effective as during lab evaluation
May be caused by walking vs. driving
May be caused by a more urban environment
Pitfall #2
It's harder than we think to predict the future
Many predictors work well in the lab
Mobile environments are more unpredictable
Traces often capture only limited types of variance
Modularity exacerbates the problem
Higher levels often assume predictions are 100% correct
Pick the optimal strategy given predictions
An exercise
Number generator picks 0 or 1:
Picking the correct number wins $10
Costs $4 to pick 0
Costs $4 to pick 1
Which one should you pick if 0 and 1 are equally likely?
Both! (trick question, sorry)
What if I told you that 0 is more likely than 1?
"How much more likely?"
"A lot" → pick 0; "a little" → pick both
(Most mobile systems just pick 0)
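The arithmetic behind the exercise: picking both always nets $10 − $8 = $2, while picking 0 alone nets 10·P(0) − 4, so a single pick only wins when P(0) > 0.6. A quick check under the stated payoffs:

```python
# Exercise payoffs: $10 for a correct pick, $4 cost per pick.
def expected_value(p0, picks):
    """Expected winnings given P(generator picks 0) = p0."""
    p_win = p0 if picks == {0} else (1 - p0) if picks == {1} else 1.0
    return 10 * p_win - 4 * len(picks)

for p0 in (0.5, 0.55, 0.75):
    best = max(({0}, {1}, {0, 1}), key=lambda s: expected_value(p0, s))
    print(p0, sorted(best), expected_value(p0, best))
```

At p0 = 0.5 or 0.55 ("a little" more likely), the redundant pick wins; at 0.75 ("a lot"), the single pick does.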
Strawman adaptation strategy
Estimate future conditions (e.g., latency, bandwidth)
Calculate the benefit of each option
E.g., response time
Calculate the cost of each option
Energy, wireless minutes
Equate diverse metrics
Utility function, constraints, etc.
Choose the best option
Optimal, but only if predictions are 100% accurate
A better strategy
Estimates should be probability distributions
Need to understand independence of estimates
Calculate the best single option as before
But also consider redundant strategies
E.g., send the request over both networks
Does the decrease in expected response time outweigh the costs?
Redundant strategies employed when uncertain
Single strategies used when confident
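The comparison can be made concrete: with latency estimates as distributions, the redundant send's response time is E[min(A, B)], which is always at most the best single option's mean; the question is whether the gap justifies the extra send. A sketch with made-up lognormal distributions (the numbers are illustrative, not from the traces):

```python
import random

random.seed(1)

def mean(xs):
    return sum(xs) / len(xs)

# Made-up latency distributions (seconds): WiFi is fast but highly
# variable, cellular is slower but steady. Samples stand in for the
# predictor's probability distributions (assumed independent here).
wifi = [random.lognormvariate(-2.0, 1.0) for _ in range(10000)]
cell = [random.lognormvariate(-1.0, 0.3) for _ in range(10000)]

single = min(mean(wifi), mean(cell))                       # best single option
redundant = mean([min(a, b) for a, b in zip(wifi, cell)])  # send on both

# E[min(A, B)] <= min(E[A], E[B]); redundancy pays off when this gap
# outweighs the energy/bandwidth cost of the second send.
print(f"best single: {single:.3f}s  redundant: {redundant:.3f}s")
```

The higher-variance WiFi makes the gap large, which is the slide's point: redundancy helps most when we are uncertain.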
Back to intentional networking
Consider latency-sensitive traffic
Predict the distribution of network latencies
Send over the lowest-latency network
Send over additional networks when benefit exceeds cost
But, not quite there yet…
Predict a 10 ms RTT for network x with high confidence
How confident are we after 20 ms with no response?
Pitfall #3
Re-evaluate predictions due to new information
Sometimes, lack of response is new information
Consider conditional distributions
Expected RTT given no response after n ms
Eventually, starting a redundant option makes sense
Intentional Networking: trouble mode
If no response after x ms, send the request over a 2nd network
Cost: bandwidth, energy
Benefit: less fragile system, less variance in response time
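The conditional distribution is easy to compute from samples: discard every outcome the elapsed time has already ruled out, then take the mean of what remains. A minimal sketch with invented RTT samples:

```python
def expected_rtt_given_no_response(samples, elapsed_ms):
    """E[RTT | no response after elapsed_ms], from sampled RTTs (ms)."""
    tail = [s for s in samples if s > elapsed_ms]
    if not tail:
        return float("inf")    # prediction exhausted; treat the link as failed
    return sum(tail) / len(tail)

# Mostly ~10 ms, with occasional long stalls (invented numbers)
samples = [8, 9, 10, 10, 11, 12, 200, 250]
print(expected_rtt_given_no_response(samples, 0))    # 63.75 (unconditional)
print(expected_rtt_given_no_response(samples, 20))   # 225.0 (only stalls remain)
```

After 20 ms of silence, the expected RTT jumps from ~64 ms to 225 ms, which is exactly when trouble mode should launch the redundant send.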
Generalizing: File systems
BlueFS can read data from multiple locations
Predict latency, energy for each; fetch from the best option
Fallacy: assumes perfect prediction
Network degradation → high latency
Detect network failures with a long timeout
Instead, embrace redundancy!
Uncertainty in device predictions
Devices largely independent
Consider fetching from >1 location
Generalizing: Cyber-foraging
By another name: remote execution, code offload
Idea: move computation from mobile to server
Not always a good idea
Shipping inputs and outputs over the network takes time, energy
Need to recoup costs through faster remote execution
Example systems: Spectra (I'll pick on this one), RPF, CloneCloud, MAUI, etc.
Example: Language Translation
[Pipeline: input text → Dictionary Engine, Glossary Engine, EBMT Engine → hypotheses → Language Modeler → output text]
4 components that could be executed remotely
Execution of each engine is optional
Input parameters: translation type and text size
Example: Execution Plan
Spectra chooses an execution plan:
which components to execute
where to execute them
[Pipeline diagram with each component assigned local or remote execution]
Choosing an Execution Plan
[Flowchart: predict resource supply + predict resource demand → calculate performance, energy, & quality → calculate utility → heuristic solver → choose the plan with maximum utility]
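The plan-selection loop can be sketched in the spirit of Spectra's solver: enumerate placements, predict each plan's cost, and pick the best. All numbers and names below are invented for illustration (and this toy optimizes latency alone, not the full utility over energy and quality; it also ignores that engines are optional):

```python
from itertools import product

components = ["dictionary", "glossary", "ebmt", "modeler"]
# Hypothetical predicted runtimes (seconds) per component
local_time = {"dictionary": 0.2, "glossary": 0.3, "ebmt": 4.0, "modeler": 1.5}
remote_time = {"dictionary": 0.05, "glossary": 0.08, "ebmt": 0.9, "modeler": 0.4}
transfer = 0.5   # predicted cost of shipping inputs/outputs, paid once

def plan_latency(plan):
    """Predicted end-to-end time for one (local/remote)^4 placement."""
    t = sum(remote_time[c] if where == "remote" else local_time[c]
            for c, where in zip(components, plan))
    return t + (transfer if "remote" in plan else 0.0)

best = min(product(["local", "remote"], repeat=4), key=plan_latency)
print(dict(zip(components, best)), round(plan_latency(best), 2))
```

The fragility the next slide calls out is visible here: `best` is only optimal if every predicted runtime and the transfer cost are accurate.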
Cyber-foraging: summary
Fallacy: assumes many perfect predictions
Network latency, bandwidth, CPU load, file system state
Also computation, communication per input
Reality: predicting supply and demand are both hard
A better approach:
Consider redundant execution (local & remote)
Re-evaluate the execution plan based on (lack of) progress
Embracing redundancy
When does it make sense to consider redundancy?
Imperfect predictions
Options that are (largely) non-interfering
Trading computational resources for human resources
Why is this painful?
With 20/20 hindsight, redundancy is always wrong!
But it's right when the future is unknown.
Pushing the needle too far?
Sometimes embracing redundancy opens new doors
Thought experiment for cyber-foraging:
What if redundant execution is the common case?
Work with Mark Gordon and Morley Mao
Reflections
What really distinguishes mobile systems?
One clear distinction: resource scarcity
Less powerful CPU, less memory, smaller storage
Poorer network quality, limited battery power, …
What really distinguishes mobile systems?
One clear distinction: resource variability
Variable CPU load, different platform capacities
Variable network bandwidth, variable network latency
Varying importance of energy, …
What's the correct lesson for system design?
Conclusions
Build systems robust to variance
Trade computational resources for human resources
Accept that our predictions are imperfect
Embrace redundancy
Questions?
Deterministic replay
Record an execution, reproduce it later
Academic and industry implementations
Uses include:
Debugging: reproducing a software bug
Forensics: trace actions taken by an attacker
Fault tolerance: run multiple copies of an execution
Many others…
How deterministic replay works
Most parts of an execution are deterministic
Only a few sources of non-determinism
E.g., system calls, context switches, and signals
[Diagram: the recorded execution logs its non-deterministic events; the replayed execution starts from the same initial state and consumes events from the log]
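The record/replay split can be shown in miniature: record every non-deterministic input to a log, then re-run the same deterministic code while supplying inputs from the log. A toy sketch (real systems intercept syscalls, scheduling, and signals, not a `get_input` method):

```python
import random

class Recorder:
    """Record non-deterministic events so a replay can reproduce them."""
    def __init__(self):
        self.log = []
    def get_input(self):
        value = random.random()    # stand-in for a syscall result / user input
        self.log.append(value)
        return value

class Replayer:
    """Re-execute deterministically by supplying events from the log."""
    def __init__(self, log):
        self.events = iter(log)
    def get_input(self):
        return next(self.events)

def app(source):
    # The application itself is deterministic given its inputs
    return sum(source.get_input() for _ in range(3))

rec = Recorder()
original = app(rec)
replayed = app(Replayer(rec.log))
assert original == replayed        # identical execution, bit for bit
```

This is why redundant execution only needs to ship the (small) log: everything deterministic is recomputed on the other side for free.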
Prior approaches to improving latency
Code offload: ship state to the server, compute (faster) there, return results
Disadvantages: need large compute chunks; need accurate CPU and network predictions
Remote execution: run the application on the server, interleaving its I/O with computation over the network
Disadvantages: poor interactive performance; app state disappears with the network
Idea: Replay-enabled redundancy
Execute 2 copies of the application
One on the mobile phone
One on the server
Use deterministic replay to make them the same
Potential advantages:
App state still on the phone if the server or network dies
Less waiting: most communication is asynchronous
No prediction: the fastest execution generates output
Redundant execution example
[Timeline: both copies start from the same initial app state; the phone gets user inputs "A" and "B", and the server copy is supplied "A" and "B" from the log]
Get the "best of both worlds" by overlapping I/O with the slower execution
How much extra communication?
Common sources of non-determinism:
User input (infrequent, not much data)
Network I/O (server can act as a network proxy)
File I/O (potentially can use distributed storage)
Scheduling (make the scheduler more deterministic)
Sensors (usually polled infrequently)
High-data-rate sensors (e.g., video) may be hard
Hypothesis: will not send much extra data for most mobile applications
What about energy usage?
Compare to local execution
Extra cost: sending initial state, logged events
Several opportunities to save energy:
Faster app execution → less energy used
No need for the mobile to send network output
Server proxy will send output on the mobile's behalf
Can skip some code execution on the mobile
Need to transmit a state delta at the end of a skipped section
A big win if the application terminates after the section