Slide1
RT-OPEX: Flexible Scheduling for Cloud-RAN Processing
Krishna C. Garikipati, Kassem Fawaz, Kang G. Shin
University of Michigan
Slide2
[Figure label: fronthaul network]
What is Cloud-RAN*?
Virtualization in Radio Access Network (RAN)
Benefits
Lower energy consumption (compute, HVAC)
Fewer site visits
Faster upgrade and replacement cycles
Advanced signal processing
* C-RAN
Slide3
C-RAN in Practice
Slide4
Deadlines
Periodic (sub)frames every 1 ms
Hard deadline of 3 ms
Transport, decode and respond to LTE uplink frame
Requires real-time scheduling
[Figure labels: ACK, fronthaul network]
Slide5
C-RAN Scheduling
Core network
Scheduler: assign basestations to computing nodes
Per-node scheduler: assign subframes to cores
[Figure: queued subframes (BS 0 subframe 0, BS 1 subframe 0, BS 0 subframe 1, BS 1 subframe 1) dispatched to core 0, core 1, core 2, ..., core N, with cores labeled BS 0, BS 1, BS 0, BS 1]
Slide6
State-of-the-Art Scheduling Architecture
Slide7
Two scheduling options:
Design for WCET (worst-case execution time): overprovisions resources
Design for average case: deadline misses
Real-world Traffic
[Figure labels: Band 17, Band 13, Max load]
Slide8
RT-OPEX
Offers flexible scheduling for C-RAN
Combines offline partitioned scheduling with runtime parallelism (work stealing)
Achieves resource pooling at finer time scale
Avoids over-provisioning of resources
Slide9
Slide10
End-to-End Model
Slide11
Uplink Processing Model
LTE processing in software
N = # antennas
K = modulation order
D = bits per carrier (load)
L = decoding iterations
Dominating terms
FFT, Equalization, Turbo decoding
Error term
Platform variations (kernel tasks/interrupt handling)
Comparable to benchmark stress test
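A minimal sketch of how a processing-time model in these variables could be evaluated. The linear form, the coefficient names `w0` to `w3`, and the function name are assumptions made for illustration; the slide only names the variables, the dominating terms, and an error term. The coefficients are left as inputs and are not mapped to the table values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCoefficients:
    """Hypothetical per-term weights; names and meaning are assumptions."""
    w0: float  # fixed cost
    w1: float  # cost per antenna (N)
    w2: float  # cost per unit of modulation order (K)
    w3: float  # cost per (bits per carrier x decoding iteration), i.e. D * L


def predicted_uplink_time(n_antennas, modulation_order, bits_per_carrier,
                          decoding_iterations, coeff, error=0.0):
    """Assumed linear model: w0 + w1*N + w2*K + w3*D*L + error."""
    return (coeff.w0
            + coeff.w1 * n_antennas
            + coeff.w2 * modulation_order
            + coeff.w3 * bits_per_carrier * decoding_iterations
            + error)
```

The error argument stands in for the platform variations listed above; in practice it would be a measured or sampled quantity rather than a constant.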
[Table: row GPP ( ) with values 31.4, 169.1, 49.7, 93.0, 0.992; term labels: FFT, Equalization; De-mapping, De-matching; Turbo-decoding; Error]
Slide12
Parallelism
Decoder Block
Independent w.r.t code blocks
FFT
Independent w.r.t antenna and OFDM symbols
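A small sketch of the two sources of parallelism above: one independent FFT subtask per (antenna, OFDM symbol) pair and one independent decode subtask per code block. The function names and the placeholder workers are hypothetical, and a Python thread pool is used only to show the structure, not to reproduce the performance of a native implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def fft_subtasks(n_antennas, n_symbols):
    """One independent FFT subtask per (antenna, OFDM symbol) pair."""
    return list(product(range(n_antennas), range(n_symbols)))


def decode_subtasks(n_code_blocks):
    """One independent decode subtask per code block."""
    return list(range(n_code_blocks))


def run_stage(worker, subtasks, max_workers=4):
    """Run every subtask of one stage in parallel; results keep subtask order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, subtasks))


def process_subframe(n_antennas, n_symbols, n_code_blocks):
    # Stage 1: FFT subtasks do not depend on each other.
    fft_done = run_stage(lambda t: ("fft", t),
                         fft_subtasks(n_antennas, n_symbols))
    # Stage 2: decoding starts only after the FFT stage has finished.
    dec_done = run_stage(lambda b: ("decode", b),
                         decode_subtasks(n_code_blocks))
    return fft_done, dec_done
```

For example, `process_subframe(2, 14, 3)` creates 28 FFT subtasks (2 antennas, 14 symbols) followed by 3 decode subtasks.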
Slide13
Parallelism
Task Model
Divide tasks into parallel and independent subtasks
[Figure labels: Parallel processing, Precedence constraints]
Slide14
End-to-End Model
Assuming Tx processing starts 1ms before deadline
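The resulting uplink processing budget can be sketched as below. This assumes the 3 ms hard deadline from the Deadlines slide, the 1 ms Tx lead stated above, and a single one-way transport delay (RTT/2) subtracted from the remainder. Under those assumptions an RTT/2 of 400 gives 1.6 ms, which agrees with the "RTT/2 > 400, Budget < 1.6 ms" figures on the Partitioned Scheduler slide if 400 is read as microseconds. Function and constant names are hypothetical.

```python
SUBFRAME_DEADLINE_US = 3000  # hard deadline of 3 ms (Deadlines slide)
TX_LEAD_US = 1000            # Tx processing starts 1 ms before the deadline


def rx_processing_budget_us(one_way_transport_us):
    """Time left for uplink processing after transport delay and Tx lead.

    Assumption: exactly one RTT/2 is charged against the budget.
    """
    budget = SUBFRAME_DEADLINE_US - TX_LEAD_US - one_way_transport_us
    return max(budget, 0)


def meets_deadline(processing_time_us, one_way_transport_us):
    """True if uplink processing fits inside the remaining budget."""
    return processing_time_us <= rx_processing_budget_us(one_way_transport_us)
```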
[Figure labels: RTT/2, Tx processing starts]
Slide15
Scheduling
Slide16
Conventional Approaches
Static
Deterministic, offline
Offers real-time guarantees
Deadline miss:
Global
Single queue of subframes
FIFO (or EDF) de-queuing
Non-deterministic, flexible
No real-time guarantees
Slide17
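The global option under Conventional Approaches, a single queue of subframes with EDF de-queuing, can be sketched with a binary heap. The class and field names are hypothetical; this illustrates earliest-deadline-first ordering only and is not the schedulers evaluated in the talk.

```python
import heapq
import itertools


class GlobalEdfQueue:
    """Single shared queue; the subframe with the earliest deadline pops first."""

    def __init__(self):
        self._heap = []
        self._tie = itertools.count()  # keeps insertion order for equal deadlines

    def push(self, bs_id, subframe_id, deadline_us):
        heapq.heappush(self._heap,
                       (deadline_us, next(self._tie), bs_id, subframe_id))

    def pop(self):
        deadline_us, _, bs_id, subframe_id = heapq.heappop(self._heap)
        return bs_id, subframe_id, deadline_us

    def __len__(self):
        return len(self._heap)
```

Any idle core would call `pop()` to fetch its next subframe, which is what makes the assignment flexible and also non-deterministic.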
Scheduling Gaps
WCET design + non-optimal design lead to gaps in execution
Slide18
RT-OPEX
Exploit the gaps dynamically at runtime
[Figure labels: core is idle]
Slide19
[Figure labels: Local FFT, decode]
RT-OPEX Migration
Subtasks migrated to cores with enough slack time
Local processing does not wait for migrated task
Ensures no performance degradation
Otherwise perform recovery
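One way to picture the slack check is a first-fit placement: a subtask moves to an idle core only if that core's slack covers the subtask's run time plus the migration overhead, and everything else stays local. The first-fit rule, the names, and the data layout are assumptions for illustration, not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class IdleCore:
    core_id: int
    slack_us: float  # idle time left before the core's own next task


def plan_migration(subtask_times_us, idle_cores, overhead_us):
    """Greedy first-fit placement of subtasks onto idle cores.

    Returns (migrated, local): migrated maps subtask index -> core_id,
    local lists the subtask indices kept on the originating core.
    """
    remaining = {c.core_id: c.slack_us for c in idle_cores}
    migrated, local = {}, []
    for idx, run_time in enumerate(subtask_times_us):
        cost = run_time + overhead_us  # migration is charged its overhead
        target = next((cid for cid, slack in remaining.items()
                       if slack >= cost), None)
        if target is None:
            local.append(idx)          # not enough slack anywhere: run locally
        else:
            migrated[idx] = target
            remaining[target] -= cost
    return migrated, local
```

The originating core would keep working on the `local` list without waiting for the migrated subtasks, in line with the bullets above.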
[Figure labels: Core 0, Core 1, Core 2, Core 3, Core 4, Start migration, Local FFT]
Slide20
Implementation & Evaluation
Slide21
RT-OPEX Implementation
OpenAirInterface (LTE Rel 10)
Modularize the tasks
Abstraction of FFT, Demod, Decode
Utilize pthread library
Migration
Data references from shared memory
Open-source
Enables different configurations
https://github.com/gkchai/RT-OPEX
Slide22
Evaluation Platform
GPP
32-core Intel Xeon E5, 128 GB RAM, 15 MB L3 cache
Ubuntu 14.0.4 low latency kernel
LTE data collection
USRP to collect load of 4 cellular towers
30000 subframes
Replay load from each BS trace
4 BS, 2 Antennas, 10MHz LTE FDD
1 UE per BS, 100% PRB utilization
Simulated transport delay (RTT/2)
Slide23
Performance Evaluation
Performance Comparison
[Figure labels: Large gaps, Narrower gaps]
Slide24
Migration Overhead
FFT median overhead is 26
Decoding overhead is 20
Overhead = cost of transferring OAI variables from shared memory to core
Account for overhead at migration
Slide25
Partitioned Scheduler
RTT/2 > 400
Budget < 1.6 ms
Subframes with MCS > 20 miss deadlines
Partitioned scheduler cannot exploit gaps
Slide26
Global Scheduler
Fails to deliver performance gains
Cache thrashing causes deadline performance to saturate beyond 8 cores
At MCS 27, processing time increases with more cores
Slide27
Conclusion
RT-OPEX: Real-Time Opportunistic Execution
Low overhead
Migration on top of partitioned scheduling
Flexible to resources
Exploits added resources for migration
Flexible to load
Leverages load variations to improve deadline miss rate
Slide28
Thank You!
Questions?
Slide29
RT-OPEX Performance
Lower RTT: larger gaps
Larger RTT: narrower gaps
Migrate decode tasks of high MCS: deadline miss goes to zero
Migrate only FFT subtasks: deadline miss reduced
Slide30
Transport Latency
Latency between and Radio
Fronthaul ( ):
Fixed latency (~20 us/km)
Cloud network latency ( ):
Switch, Ethernet and driver delay
Latency per packet
Average 0.15 ms
1 Gbps Ethernet to switch
1/10 Gbps Ethernet to GPP
Slide31
Uplink Processing
Dynamic and depends on:
MCS selection
Number of antennas
SNR of channel
2.8x increase w.r.t. MCS
0.5 ms increase w.r.t. L
50% increase w.r.t. SNR
per antenna
Slide32
RT-OPEX Performance
At miss rate threshold ≤ 0.01, RT-OPEX supports 4 Mbps of extra load
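A sketch of how a deadline miss rate could be computed from per-subframe processing times and compared against a threshold such as 0.01. The function name and the input format are hypothetical; how the talk's measurements were aggregated is not stated on the slide.

```python
def deadline_miss_rate(processing_times_us, budget_us):
    """Fraction of subframes whose processing time exceeds the budget."""
    if not processing_times_us:
        raise ValueError("need at least one subframe measurement")
    misses = sum(1 for t in processing_times_us if t > budget_us)
    return misses / len(processing_times_us)


def within_threshold(processing_times_us, budget_us, threshold=0.01):
    """True if the miss rate does not exceed the given threshold."""
    return deadline_miss_rate(processing_times_us, budget_us) <= threshold
```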
RTT/2 = 500
Slide33
RT-OPEX Challenges
When to migrate?
What to migrate?
How to migrate?
Slide34
RT-OPEX