Presentation Transcript

Slide1

Enabling Protocol Coexistence: Hardware-Software Codesign of Wireless Transceivers on Heterogeneous Computing Architectures

Benjamin Drozdenko

Graduate Research Assistant & Ph.D. Candidate

Northeastern University, Boston, MA

Dissertation Committee Members:

Prof. Miriam Leeser (RCL)

Prof. Kaushik Chowdhury (GENESYS)

Prof. Stefano Basagni

Ph.D. Dissertation Proposal Review

August 5, 2016

Slide2

Dissertation Proposal Outline

1. Introduction
2. Related Work
3. Preliminary Research
   - 802.11b on Host PC with USRP
   - IEEE 802.11a Background
   - 802.11a on Zynq SoC with FMComms3
4. Proposed Research
   - Codesign Estimation, Automation, Optimization
   - LTE / Wi-Fi Coexistence Study
   - Automated Transceive Chain Generation
5. Conclusion

Slide3

Introduction: Issues in Wireless Networking

Increasing Congestion: Many more devices using different protocols, now often on the same bandwidths

Spectrum Scarcity: All bandwidths are occupied by now; the FCC has opened up new ones for multiple protocols

Big Data, Fast Rates: Streaming applications need to send more in less time; protocols must have higher bit rates, lower error rates

Energy Efficiency: Wireless mobile devices must be recharged infrequently; devices must evolve to support lower power consumption

[Figure: LTE and Wi-Fi]

Slide4

Introduction: Issues in HW-SW Systems Codesign

Inflexible Hardware: Existing wireless transceivers usually made with static HW, in which modern protocols cannot be modified

Slow Software: Software-defined radios (SDR) usually made for low-end processors; too slow for all PHY layer processing blocks

FPGA Design Difficult: High-level designers may not have HDL experience; designs don’t consider FPGA-CPU-RFFE interfaces

Not Specific to Wireless: Existing HW-SW codesign environments don’t take into account wireless comms and transceiver design issues

Slide5

Introduction: Contributions

Theoretical Framework
- For approaching the problem of HW-SW codesign of modern wireless transceivers

Live Prototype on Heterogeneous Architecture
- Comprised of unlike computing elements, like FPGA & processor
- Example: Zynq SoC, and extension to apply to latest generation Ultrascale Multi-Processor SoC (MPSoC) devices

Protocol Coexistence at PHY Layer
- A means for researching issues regarding protocol coexistence at PHY layer using real RF front end HW

HW/SW Codesign for ABS Detection
- Modeling environment can study HW-SW codesign issues using multiple ABS detection as proof of concept

Slide6

Related Work

Hardware for RF Front End: AD9361/4, FMComms3, USRP

SW for Interface with RF Front End: GNURadio, MathWorks CST/USRP

High-Level SDR Frameworks: ATOMIX, CODIPHY

SDR/CR Platforms on CPU w/ FPGA: WARP, SORA, AirBlue, CoPR

SDR/CR Platforms on Zynq SoC: Iris, CRASH, NI LTE/Wi-Fi Testbed

LTE / Wi-Fi Coexistence Studies

[2Ettus] [6ADI] [4GNURadio] [5MathWorks]

Slide7

Related Work: Comparison of SDR Systems

Systems compared (table columns): ATOMIX, CODIPHY, WARP, Sora / Ziria, CoPR, Airblue, Iris, CRASH, NI LTE / Wi-Fi Testbed, Our Method

Features compared (table rows):
- System can be described at a high level
- Combines SW processor & reconfigurable HW
- Developer can program HW/FPGA
- Can use to study HW-SW codesign tradeoffs
- Easily modifiable for Next Generation protocols
- Can use to study protocol coexistence
- Full design open and available to public

Slide8

Preliminary Work: 802.11b on USRP

The first SDR platform we tried connected a host PC with a USRP, which allows for flexibility and tunability in RF-related parameters.

Two important lessons learned about SDRs:

To enact timing synchronization, we need to build the transceiver system functionality around the RF front end
- Rely on USRP clock for fixed radio time
- Build transceive() function as basis for state logic

Regardless of the RF front end used, certain algorithms are always needed to recover a signal and find the packet start
- RF Front End Algorithms: AGC, Frequency Offset Compensation
- Preamble Detection Algorithm

Slide9

Preliminary Work: 802.11b on USRP
Target HW and SW, Experimental Setup

System design must be based on the HW used

We relied on standard-fare PCs running Ubuntu OS

A general-purpose PC cannot guarantee that work executes at consistent time intervals

We instead relied upon the USRP N210 to provide our timing information

Slide10

Preliminary Work: 802.11b on USRP
Transceive Function for Synchronized Ops

    function dr = transceive(ft, d2s)
    % Initialization
    persistent hrx htx;
    dr = complex(zeros(nspf,1)); ns = 0;
    if isempty(hrx), hrx = ...; htx = ...; end
    if ~ft
        % Transmit one frame
        step(htx, d2s);
        % Receive one frame using polling
        while (ns == 0), [dr, ns] = step(hrx); end
    % Termination
    else, release(hrx); release(htx);
    end
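The transmit-then-poll pattern of transceive(), and a state machine that calls it once per step, can be sketched roughly as follows. This is a Python analogue rather than the MATLAB used in the slides; `LoopbackRadio`, the two state names, and the "any nonzero sample" transition rule are hypothetical stand-ins for the USRP objects and the real state logic, which the slides do not show.

```python
class LoopbackRadio:
    """Hypothetical stand-in for the USRP Tx/Rx objects: echoes the last frame sent."""

    def __init__(self):
        self._pending = None

    def send(self, frame):
        self._pending = list(frame)

    def receive(self):
        # Returns (samples, sample_count); a count of 0 means "nothing yet".
        frame, self._pending = self._pending, None
        if frame is None:
            return [], 0
        return frame, len(frame)


def transceive(radio, data_to_send):
    """Transmit one frame, then poll until one (non-empty) frame is received."""
    radio.send(data_to_send)
    received, count = [], 0
    while count == 0:
        received, count = radio.receive()
    return received


def run_fsm(radio, frames):
    """Call transceive() once per state visit and record the state after each call."""
    state, log = "IDLE", []
    for frame in frames:
        received = transceive(radio, frame)
        if state == "IDLE" and any(received):
            state = "RECEIVING"
        elif state == "RECEIVING" and not any(received):
            state = "IDLE"
        log.append(state)
    return log
```

Because every state visit goes through transceive(), the radio's frame clock paces the whole machine, which is the design point the slide makes. Frames must be non-empty here, otherwise the polling loop would never see a nonzero count.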

Reliance upon an external clock requires us to design the transceiver system around its interactions with the USRP

We create a transceive() function designed to execute every t_radio secs

We build higher levels using a finite state machine (FSM) that executes transceive() each time it enters or remains in a state

Slide11

Preliminary Work: 802.11b on USRP
RF Front End and Preamble Detection

RF Front End Algorithms: Detection of packets is not possible unless sufficient work is done first to recover the received signal, including:
- ensuring that the received signal's envelope magnitude is amplified to approximately 1
- removing frequency-domain artifacts that result from differences between the Tx's and Rx's local oscillators, to correct the received signal's spectrum
- Operations involving FFTs must complete in < t_radio sec

Preamble Detection Algorithms: Most essential component to reducing error rate
- The more accurate the algorithm, the longer it takes
- To save time, we use an FFT-based algorithm and 2 stages to confirm the presence of the SFD bits.
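The two recovery steps listed above (normalizing the envelope to about 1 and removing the local-oscillator offset) can be illustrated with a small pure-Python sketch. It assumes a BPSK-like signal and uses one common textbook estimator (square the samples to strip the ±1 modulation, then find the spectral peak); the slides do not say which estimator was actually used, and the function names are hypothetical.

```python
import cmath
import math


def agc(samples):
    """Scale samples so that the mean envelope magnitude is 1."""
    mean_mag = sum(abs(s) for s in samples) / len(samples)
    return [s / mean_mag for s in samples]


def estimate_offset_hz(samples, fs):
    """Estimate carrier offset of a BPSK signal from the DFT peak of its square.

    Squaring removes the +/-1 data, leaving a tone at twice the offset.
    Unambiguous only for offsets within +/- fs/4.
    """
    n = len(samples)
    squared = [s * s for s in samples]
    best_bin, best_mag = 0, -1.0
    for k in range(n):  # plain O(n^2) DFT, fine for short frames
        acc = sum(squared[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    if best_bin >= n // 2:
        best_bin -= n  # map upper bins to negative frequencies
    return best_bin * fs / (2 * n)


def compensate(samples, offset_hz, fs):
    """Rotate samples back by the estimated offset."""
    return [s * cmath.exp(-2j * math.pi * offset_hz * m / fs) for m, s in enumerate(samples)]
```

A typical use is `fixed = compensate(agc(rx), estimate_offset_hz(agc(rx), fs), fs)`. The resolution of this estimator is fs/(2N), so a finer frequency resolution costs a longer transform, which is the accuracy-versus-time tradeoff the slide refers to.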

Slide12

Preliminary Work: 802.11b on USRP
Calibration Results

Compiled MEX shows less overshoot and undershoot; conforms more strictly to expected t_radio = 7.04 ms

Compiled MEX shows better average time for RFFE processing block, and lower time for higher frequency resolution

Slide13

Preliminary Work: 802.11a on Zynq
Testbed Requirements

Recognizing timing limitations in the PC-based 802.11b system, I realized SDR needs a heterogeneous system

The next SDR platform contains an FPGA, which allows for flexibility and tunability in HW

We Develop a HW-SW Modeling Environment for SDR Needs
- Model Wireless Processing Blocks (PBs)
- Given a HW-SW Prototyping Platform
  - Reconfigurable Hardware (HW) Components
  - Software (SW) Tools for High-Level Design
- Model a HW-SW Divide Point
- Enact HW-SW Interfacing
- Can Reuse & Adapt to Modern Standards

Slide14

Preliminary Work: 802.11a on Zynq
Hardware Components: Xilinx Zynq

Zynq-Based Heterogeneous Computing System

Zynq-7000 series System-on-Chip (SoC)
- Processing System: ARM Cortex-A9 CPU
- Programmable Logic: FPGA with DSPs & BRAM

We prototype on 2 varieties: ZC706 & Zedboard

[Figure: Zynq SoC containing CPU and FPGA]

Slide15

Preliminary Work: 802.11a on Zynq
Hardware Components

Host Personal Computer (PC): Runs SW Tools
- JTAG (to FPGA)
- Ethernet (to CPU)

Zynq-Based Heterogeneous Computing System: Zynq SoC with CPU and FPGA

Radio Frequency (RF) Front End: ADI FMComms3 (AD9361, 2Tx 2Rx) in the FMC Slot

[Figure: numbered Transmit Path and Receive Path blocks across the CPU and FPGA]

Slide16

IEEE 802.11a Background:
Transmitter PHY Layer Processing Blocks

Scrambler: performs an XOR operation with a constant sequence to make the data more pseudo-random, unbiased, and independent

Convolutional Encoder: adds redundancy by producing parity bits using a Boolean polynomial function, producing a coded output of twice the original length

Block Interleaver: rearranges the bit indices to make random errors and burst errors seem more pseudorandom

BPSK Modulator: each coded bit is represented as a complex symbol; true becomes 1+0i, false becomes -1+0i

Symbol to Subcarrier Mapping: a set of 48 symbols is mapped to subcarriers, divisions of the transmission bandwidth, and Pilot signals inserted between them

OFDM Modulator: an Inverse Fast Fourier Transform (IFFT) converts 64 frequency-domain subcarriers (incl. 12 guard channels) to a 64-sample time-domain sequence

Prepend Cyclic Prefix: the final 16 time-domain samples are prepended to the start of the OFDM symbol to mitigate inter-symbol interference and to make channel estimation easier

Preamble Switch: Decide between sending pre-calculated Preamble data samples at the start of the sequence or the modulated DATA samples

[Figure: preamble structure with Short Preamble (2 frames) and Long Preamble (2 frames)]

[1IEEE]
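Several of the blocks described above are small enough to sketch directly. The Python below is an illustration under stated assumptions, not the Simulink implementation from the slides: it uses the 802.11a scrambler polynomial x^7 + x^4 + 1 and the rate-1/2, constraint-length-7 convolutional code with generators 133 and 171 (octal) as I recall them from IEEE Std 802.11a-1999 [1], plus the BPSK mapping and 16-sample cyclic prefix stated on the slide. The all-ones scrambler seed and the function names are illustrative choices.

```python
def scramble(bits, seed=(1, 1, 1, 1, 1, 1, 1)):
    """XOR data with the LFSR sequence of x^7 + x^4 + 1 (state holds x1..x7)."""
    state = list(seed)
    out = []
    for b in bits:
        feedback = state[3] ^ state[6]  # x4 XOR x7
        out.append(b ^ feedback)
        state = [feedback] + state[:6]
    return out


def conv_encode(bits):
    """Rate-1/2 encoder: two output bits (A, B) per input bit."""
    reg = [0] * 6  # reg[0] is the most recent past input
    out = []
    for b in bits:
        a_bit = b ^ reg[1] ^ reg[2] ^ reg[4] ^ reg[5]  # generator 133 octal
        b_bit = b ^ reg[0] ^ reg[1] ^ reg[2] ^ reg[5]  # generator 171 octal
        out += [a_bit, b_bit]
        reg = [b] + reg[:5]
    return out


def bpsk(bits):
    """Map true to 1+0j and false to -1+0j."""
    return [complex(1, 0) if b else complex(-1, 0) for b in bits]


def add_cyclic_prefix(symbol, prefix_len=16):
    """Prepend the last prefix_len time-domain samples to the symbol."""
    return symbol[-prefix_len:] + symbol
```

Since scrambling is an XOR with a fixed sequence, running `scramble` again with the same seed restores the data, which is why the receiver's descrambler can reuse the same block.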

Slide17

IEEE 802.11a Background: Receiver PHY Layer Processing Blocks

Frame Recovery

Matched Filter: Short Preamble

Update Rx Buffer

Matched Filter: Long Preamble

Preamble Found Decision:

Return Flag & Index
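The matched-filter stage that returns a flag and an index can be sketched as a normalized sliding correlation. This is a generic illustration in Python; the preamble coefficients, the threshold, and the function name are assumptions, and the real design chains a short-preamble and a long-preamble filter.

```python
def matched_filter_detect(rx, preamble, threshold):
    """Slide the known preamble over rx; return (found, index of best match)."""
    energy = sum(abs(p) ** 2 for p in preamble)
    best_idx, best_val = 0, 0.0
    for start in range(len(rx) - len(preamble) + 1):
        corr = sum(rx[start + k] * preamble[k].conjugate() for k in range(len(preamble)))
        val = abs(corr) / energy  # 1.0 for a perfect, unit-gain match
        if val > best_val:
            best_idx, best_val = start, val
    return best_val >= threshold, best_idx
```

The cost grows with the product of buffer length and preamble length, which is consistent with preamble detection showing up as the dominant receive-side block in the later profiling slides.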

Slide18

Preliminary Work: 802.11a on Zynq
Modeling for Wireless Processing Blocks

PB | Transmitter (Tx)     | Receiver (Rx)
1  | Scrambling           | Preamble Detection
2  | Convolutional Coding | OFDM Demodulation
3  | Block Interleaving   | BPSK Demodulation
4  | BPSK Modulation      | Block De-interleaving
5  | OFDM Modulation      | Viterbi Decoding
6  | Preamble Insertion   | Descrambling

Simulink Model for Tx or Rx path
- Tx: Data Bits in, Tx: Samples out
- Rx: Samples in, Rx: Data Bits out

Simulink: Design Synchronous Dataflow (SDF) Models

Integrated Profiling: Look at Entire 802.11a PHY Layer Processing Chain

Slide19

Preliminary Work: 802.11a on Zynq
Software Tools

[Figure: on the host PC, a MathWorks Simulink™ Model feeds HDL Coder™ (HDL Code, then Xilinx Vivado®, then FPGA Bitstream, sent over JTAG to the FPGA) and Embedded Coder™ (C Code, then ARM Executable, sent over Ethernet to the CPU) of the Zynq-Based Heterogeneous Computing System]

Host PC: Runs SW Tools
- HDL Coder: Create HW Description Language (HDL) code
- Vivado: Synthesize, Implement, and Generate FPGA Bitstream
- Embedded Coder: Generate C code for ARM Processor

Slide20

Preliminary Work: 802.11a on Zynq
Modeling 7 HW-SW Divide Points

[Figure: Zynq-Based Heterogeneous Computing System with the transmit and receive paths split between SW (CPU) and HW (FPGA) at divide points V1 through V7]

V1: SW-only model
V2: Adds Tx6 & Rx1 to HW
V3: Adds Tx5 & Rx2 to HW
V4: Adds Tx4 & Rx3 to HW
V5: Adds Tx3 & Rx4 to HW
V6: Adds Tx2 & Rx5 to HW
V7: HW-only model

Tx Path
1: Additive Scrambling
2: Convolutional Encoding
3: Block Interleaving
4: Digital (BPSK) Modulation
5: OFDM Modulation
6: Preamble Insertion

Rx Path
1: Preamble Detection
2: OFDM Demodulation
3: Digital Demodulation
4: Block Deinterleaving
5: Viterbi Decoding
6: Descrambling
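The seven variants follow one rule: each step moves the Tx block nearest the RF side and the Rx block nearest the RF side into HW. A short Python sketch of that rule follows; the block names come from the lists above, while the function name and the returned dictionary layout are illustrative assumptions.

```python
TX_BLOCKS = [
    "Additive Scrambling", "Convolutional Encoding", "Block Interleaving",
    "Digital (BPSK) Modulation", "OFDM Modulation", "Preamble Insertion",
]
RX_BLOCKS = [
    "Preamble Detection", "OFDM Demodulation", "Digital Demodulation",
    "Block Deinterleaving", "Viterbi Decoding", "Descrambling",
]


def partition(version):
    """Return the SW/HW split of both chains for model variant V1..V7."""
    if not 1 <= version <= 7:
        raise ValueError("version must be in 1..7")
    n_hw = version - 1                    # blocks moved to HW in each chain
    tx_split = len(TX_BLOCKS) - n_hw      # Tx moves its tail (RF side) first
    return {
        "tx_sw": TX_BLOCKS[:tx_split],
        "tx_hw": TX_BLOCKS[tx_split:],
        "rx_hw": RX_BLOCKS[:n_hw],        # Rx moves its head (RF side) first
        "rx_sw": RX_BLOCKS[n_hw:],
    }
```

For example, `partition(2)` places only Preamble Insertion and Preamble Detection in HW, matching "V2: Adds Tx6 & Rx1 to HW".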

Slide21

Preliminary Work: 802.11a on Zynq
Results: CPU Execution Time: Tx on Zynq

Moving one processing block from SW to HW does not necessarily cause speedup
- Increase in Tx frame time on ZC706 from V1 to V2 is proof

V1 is SW-only, requires no AXI communication
- Keeps all operations in SW

V2 adds small component to HW
- Time saved < time spent on CPU-FPGA data transfer

Our modeling environment can identify location at which HW-SW interface is best placed
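The "time saved < time spent on data transfer" effect can be reproduced with a toy cost model. Everything numeric below is hypothetical and chosen only to show the shape of the tradeoff; none of it is a measurement from the slides, and the additive model (SW time + one fixed transfer cost + HW time) is a simplifying assumption.

```python
# Hypothetical per-block costs in microseconds for the six Tx blocks (Tx1..Tx6).
SW_TIME_US = [5, 20, 8, 3, 30, 2]
HW_TIME_US = [1, 1, 1, 1, 1, 1]
TRANSFER_US = 10  # assumed fixed CPU-FPGA transfer cost once any block is in HW


def tx_frame_time_us(version):
    """Estimated Tx frame time for variant V1..V7 under the additive model."""
    n_hw = version - 1
    split = len(SW_TIME_US) - n_hw
    total = sum(SW_TIME_US[:split]) + sum(HW_TIME_US[split:])
    if n_hw > 0:
        total += TRANSFER_US
    return total


def best_version():
    """Variant with the lowest estimated frame time."""
    return min(range(1, 8), key=tx_frame_time_us)
```

With these made-up numbers, V2 is slower than V1 because moving the cheap last block saves 1 μs but costs a 10 μs transfer, while later variants recover the cost once heavier blocks move across.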

Slide22

Preliminary Work: 802.11a on Zynq
Results: CPU Execution Time: Rx on ZC706

Rx maximum CPU frame time decreases as more blocks are moved onto the FPGA

Preamble detection is revealed to be the biggest bottleneck in the Rx model
- Moving it in V2 results in the largest drop in frame time
- Also drops with FFT in V3 & Viterbi Decoder in V6

Moving Descrambler in V7 does not show decrease, suggesting we can put it in SW

Slide23

Preliminary Work: 802.11a on Zynq Results
FPGA Resource Utilization & Power Usage

[Figures: Transmitter Res Util; Receiver Res Util]

Power

PB | Tx   | Rx
1  | 1.53 | 1.57
2  | 1.82 | 2.34
3  | 1.84 | 2.35
4  | 1.84 | 2.11
5  | 1.84 | 2.11
6  | 1.85 | 2.11
7  | 1.84 | 2.12

Slide24

Preliminary Work: 802.11a on Zynq Results
Block Variants: Preamble Detection

MF Variant           | Default | HDL Long | HDL Training
Data Path Delay (ns) | 500     | 314      | 132
% LUTs               | 8.9     | 38.2     | 15.8
% Registers          | 4.3     | 2.0      | 1.3
% DSPs               | 99.2    | 35.3     | 14.7
Total Power (W)      | 2.65    | 2.34     | 2.09

Block uses a matched filter to correlate 2 frames with a fixed set of coefficients

1st MF manually assembled from adders & multipliers
- Not ideal: uses 99% of DSPs

2nd MF correlates with full long preamble
- But long preamble composed of repetitions of training seq

3rd MF correlates with only the training sequence
- 2.38X reduction in path delay
- 1.12X reduction in power

Slide25

Proposed Research Overview

1. Co-design Estimation, Automation, Optimization
   - Theoretical Metric Derivation
   - Optimization Exploration
   - Alternate Heterogeneous Platforms
2. Wi-Fi / LTE Coexistence
   - Commonalities in Processing Blocks
   - Joint Preamble Detection
   - Multiple ABS Detection
3. Automated Transceive Chain Generation

Slide26

Proposed Research: 1A: Theoretical Metric Derivation

Var  | Description             | Est Time | Est Freq
t_C  | FPGA Logic Fabric Clock | 5-10 ns  | 100-200 MHz
c_ST | Constant # Logic Stages | 8        |
t_D  | FPGA Data Path Delay    | 40 ns    | 25 MHz
t_S  | Sample Time             | 50 ns    | 20 MHz
t_F  | Frame Time              | 4 μs     | 250 kHz
c_U  | Upsampling Factor       | 1,000    |
t_P  | Processor Update Time   | 4 ms     | 250 Hz

Var | Description       | Zedboard | ZC706
u_S | # Logic Slices    | 13,300   | 54,650
u_L | # LUTs            | 53,200   | 218,600
u_R | # 1-bit Registers | 106,400  | 437,200
u_D | # DSPs            | 220      | 900
u_B | # BRAM blocks     | 140      | 545
p_H | Power of HW piece |          |
p_S | Power of SW piece | 1.530 W  | 1.566 W
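The timing entries above are mutually consistent under a few simple relations, sketched below in Python. These relations are my reading of the listed values, not formulas stated on the slide: t_D = c_ST × t_C (at the 5 ns clock), t_F = 80 × t_S (assuming an 80-sample frame, i.e. a 64-sample OFDM symbol plus a 16-sample cyclic prefix), and t_P = c_U × t_F. Each frequency is simply the reciprocal of the corresponding time. Times are kept as integer nanoseconds to avoid rounding noise.

```python
C_ST = 8                 # constant number of logic stages (from the table)
C_U = 1000               # upsampling factor (from the table)
T_S_NS = 50              # sample time in ns (from the table)
SAMPLES_PER_FRAME = 80   # assumption: 64-sample symbol + 16-sample cyclic prefix


def derive_times_ns(t_c_ns):
    """Derive the table's time entries (in ns) from the fabric clock period."""
    t_d = C_ST * t_c_ns                  # data path delay
    t_f = SAMPLES_PER_FRAME * T_S_NS     # frame time
    t_p = C_U * t_f                      # processor update time
    return {"t_C": t_c_ns, "t_D": t_d, "t_S": T_S_NS, "t_F": t_f, "t_P": t_p}


def freq_hz(period_ns):
    """Frequency corresponding to a period given in nanoseconds."""
    return 1e9 / period_ns
```

With a 5 ns clock this gives 40 ns, 4 μs and 4 ms for t_D, t_F and t_P, with reciprocal frequencies of 25 MHz, 250 kHz and 250 Hz, matching the table.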

 

A major design problem: we don’t know how long a Tx or Rx chain takes, or how much space or energy it requires

I plan to derive metrics for timing, utilization, and energy from the PBs in a chain

Slide27

Proposed Research: 1B: Optimization Exploration

Custom implementations for the processing blocks that show the longest data path delays

MathWorks tools allow developers to create their own Simulink blocks
- S-function for incorporating custom C code
- Black Box Interface for incorporating custom HDL code

Incorporate Xilinx IP cores to implement LTE downlink

Algorithms ideal for one protocol & target arch
- Example: Schmidl-Cox for 802.11a expects repeated training sequences
- Preamble detection via matched filter for multiple protocols

Slide28

Proposed Research: 1C: Alternate Heterogeneous Platforms

I plan to derive theoretical metrics on alternate SoC devices

Xilinx Zynq Ultrascale+ MPSoC has Cortex-A application & Cortex-R real-time processors

3 types of HW-SW divide points: HW-SWA, HW-SWR, SWA-SWR

[Figure: SWA, SWR, and HW partitions] [7Xilinx]

Slide29

Proposed Research: 2: Wi-Fi / LTE Coexistence Motivation

Due to scarcity, unlicensed spectrum valuable
- Mobile phones could use 2.4/5.8 GHz if unoccupied
- LTE variants developed to operate alongside Wi-Fi (e.g. Qualcomm LTE-U, MulteFire; Ericsson LAA)

Modern smartphones already speak both protocols, only using different bandwidths
- In case of one connection outage, switch to the other
- Issues with how to design a single SDR to accommodate both protocols while optimizing metrics

Challenges of coexistence
- Synchronization, OFDM vs. OFDMA
- DCF vs. flexible resource allocation in FDD/TDD
- CSMA/CA vs. eNodeB subchannel allocation

Slide30

Proposed Research: 2A: Commonalities in Processing Blocks

I plan to identify how easily processing blocks can adapt to support multiple standards

Feature                 | 802.11a          | 802.11b      | 802.11g/n          | LTE DL-SCH
Coding                  | Convolutional    | (opt.)       | Convolutional      | Turbo
Rates                   | 1/2, 2/3, 3/4    | ½ only       | 1/2, 2/3, 3/4      | 1/3, ½, 2/3, 5/6
Digital Modulation: PSK | BPSK, QPSK       | DBPSK, DQPSK | (D)BPSK, (D)QPSK   | QPSK only
Digital Modulation: QAM | 16QAM, 64QAM     | n/a          | 16QAM, 64QAM       | 16QAM, 64QAM
Interleaving            | Block            | n/a          | Block              | n/a
Spectrum Spreading      | n/a              | DSSS         | DSSS               | n/a
Mapping/Precoding       | To subcarriers   | n/a          | To subcarriers     | To layers, resource grid
OFDM                    | X                | n/a          | X                  | OFDMA
FFT size                | 64               | n/a          | 64                 | 128-2,048
Cyclic Prefix (μs)      | 0.8              | n/a          | 0.8                | 5.21-33.33

Slide31

Proposed Research: 2A: Block Variant: OFDM IFFT

IFFT Size            | 64   | 128  | 256  | 512  | 1024
Data Path Delay (ns) | 15.2 | 16.8 | 16.8 | 15.6 | 18.0
% LUTs               | 19.9 | 22.4 | 27.8 | 37.2 | 54.9
% Registers          | 12.3 | 14.5 | 19.1 | 27.7 | 44.1
% DSPs               | 6.4  | 7.7  | 9.1  | 10.5 | 11.8
Total Power (W)      | 1.84 | 1.84 | 1.85 | 1.85 | 1.87

In LTE, OFDM modulation uses different IFFT sizes to spread symbols onto a larger number of subcarriers

We vary the IFFT size to identify its impact on FPGA metrics
- Resources and power rise for higher IFFT sizes; delay generally rises too, with a dip at size 512
- Limiting factors: #LUTs for multiple IFFTs on FPGA

Slide32

Proposed Research: 2B: Wi-Fi/LTE Joint Preamble Detection

[8MathWorks]

I plan to prototype a preamble detection mechanism that suits both the LTE and Wi-Fi protocols

LTE Primary Synchronization Signal (PSS) is a Zadoff-Chu sequence

At first glance, no similarities between these preambles
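For reference, the LTE PSS can be generated in a few lines. The sketch below follows the frequency-domain definition as I recall it from 3GPP TS 36.211 (a length-63 Zadoff-Chu sequence with the middle element removed, root indices 25, 29 and 34); the details are worth checking against the specification before relying on them, and the function name is an illustrative choice.

```python
import cmath
import math

PSS_ROOTS = (25, 29, 34)  # one root per physical-layer identity, as recalled from TS 36.211


def lte_pss(root):
    """Return the 62-element frequency-domain PSS for the given Zadoff-Chu root."""
    seq = []
    for n in range(62):
        m = n if n <= 30 else n + 1  # skip the middle (DC) element of the length-63 sequence
        seq.append(cmath.exp(-1j * math.pi * root * m * (m + 1) / 63))
    return seq


def correlate(a, b):
    """Zero-lag correlation magnitude between two equal-length sequences."""
    return abs(sum(x * y.conjugate() for x, y in zip(a, b)))
```

Every element has unit magnitude, and sequences with different roots correlate weakly with each other. A matched filter therefore works here much as it does for the 802.11a training sequence, even though the sequences themselves share no structure.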

Plan to implement matched filters for each, then explore optimization methods

Slide33

Proposed Research: 2C: Multiple ABS Detection

[9Do]

In LTE, the Almost-Blank Subframe (ABS) helps the Enhanced Inter-Cell Interference Coordination (eICIC) process

By using a variation of energy detection on PL fabric, I can detect the presence of an ABS

If HW sends a flag from PL to PS each time it detects a single ABS appearance, a SW routine can recognize consistent ABS frames
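The HW/SW split described above can be mimicked in a few lines of Python: a per-subframe energy test stands in for the PL-side detector, and a periodicity check stands in for the PS-side routine. The threshold, the periodicity criterion, and all function names are assumptions made for illustration; the slides do not specify the actual detector.

```python
def subframe_energy(samples):
    """Mean power of one subframe of complex samples."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


def abs_flags(subframes, threshold):
    """HW-side stand-in: flag each subframe whose energy falls below threshold."""
    return [subframe_energy(sf) < threshold for sf in subframes]


def recurring_period(flags, max_period):
    """SW-side stand-in: smallest period at which the flag pattern repeats, or None."""
    if sum(flags) < 2:
        return None  # a single low-energy subframe is not a consistent pattern
    for period in range(1, max_period + 1):
        if all(flags[i] == flags[i + period] for i in range(len(flags) - period)):
            return period
    return None
```

The point of the split is that the per-sample work (energy accumulation) stays on the fabric, while only one flag per subframe crosses to the processor, where the slower pattern recognition runs.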

Thus we can tell LTE from 802.11 better than with preamble detection

Slide34

Proposed Research: 3: Automated Transceive Chain Generation

I plan to develop a user interface that automates the generation of a transmit and receive chain

HW-SW divide point(s) can be automatically chosen, or set by user

[Figure: SW / HW divide]

Slide35

Proposed Research: 3: ATCG HW-SW Interfacing

To do this, need to auto-determine how and when to transfer data
- Advanced eXtensible Interface (AXI): Bus to Connect CPU & FPGA
- Direct Memory Access (DMA): To Hold Data Sent b/w CPU & FPGA
- First-In First-Out (FIFO): Queue to Buffer Bits in Transit

Note: the data to transfer between CPU & FPGA has a different size and class for each model variant

[Figure: Zynq-Based Heterogeneous Computing System. On the Zynq SoC, the CPU and FPGA exchange Transmit Path and Receive Path data through an AXI DMA Controller and FIFOs with pack, unpack, slice, and concat stages. RF Front End: ADI FMComms3 with AD9361, 2Tx 2Rx; DAC: I1,2, Q1,2; ADC: I1,2, Q1,2]

Slide36

Proposed Research: 3: ATCG Data Packing

Variant | Data to Send | Data Type          | Size of 1 | #Elements
V1      | Samples      | Signed Fixed Point | 16 bits   | 80
V2      | Samples      | Signed Fixed Point | 16 bits   | 64
V3      | Symbols      | Signed Integer     | 1-8 bits  | 64
V4      | Coded Bits   | Boolean            | 1 bit     | 48
V5      | Coded Bits   | Boolean            | 1 bit     | 48
V6      | Data Bits    | Boolean            | 1 bit     | 24
V7      | Data Bits    | Boolean            | 1 bit     | 24

Before sending data between CPU & FPGA, we translate to a 32-bit unsigned integer format for transfer on AXI interconnect

We build a library of Packing blocks to facilitate this transfer
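What such packing blocks have to do can be sketched in Python for two of the data classes in the table: Boolean bits and 16-bit signed samples. The bit ordering (least significant bit first) and the I-in-low-half, Q-in-high-half layout are assumptions chosen for illustration; the slides do not state the layout actually used.

```python
def pack_bits(bits):
    """Pack Booleans into 32-bit unsigned words, least significant bit first."""
    words = []
    for start in range(0, len(bits), 32):
        word = 0
        for offset, bit in enumerate(bits[start:start + 32]):
            if bit:
                word |= 1 << offset
        words.append(word)
    return words


def unpack_bits(words, count):
    """Recover the first count Booleans from packed words."""
    return [bool((words[i // 32] >> (i % 32)) & 1) for i in range(count)]


def pack_iq16(i_val, q_val):
    """Pack one signed 16-bit I/Q pair into a 32-bit word (I low half, Q high half)."""
    return ((q_val & 0xFFFF) << 16) | (i_val & 0xFFFF)


def unpack_iq16(word):
    """Recover the signed 16-bit I/Q pair from a packed word."""
    def to_signed(x):
        return x - 0x10000 if x & 0x8000 else x
    return to_signed(word & 0xFFFF), to_signed((word >> 16) & 0xFFFF)
```

Under this layout the 24 data bits of V6/V7 fit in a single word and the 48 coded bits of V4/V5 need two, whereas each 16-bit complex sample of V1/V2 occupies a full word, so the transfer size differs substantially between variants.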

Slide37

Proposed Research: Timeline

Fall 2016:
- Derive theoretical framework for estimating time, utilization, & power metrics.
- Prototype Wi-Fi and LTE DL-SCH protocols on Zynq SoC.
- Research issues regarding protocol coexistence at the PHY layer.
- Study HW-SW codesign issues with one HW-SW divide point.

Spring 2017:
- Expand framework to more precisely model data path delay & AXI-S transfers.
- Employ optimizations to preamble detection for Wi-Fi and LTE DL-SCH chains.
- Research issues regarding protocol coexistence at MAC layer.
- Study HW-SW codesign issues with multiple HW-SW divide points.

Summer 2017:
- Expand framework to model Zynq Ultrascale+ MPSoC.
- Collect accuracy and error rate metrics from Wi-Fi and LTE DL-SCH chains.
- Research issues regarding protocol coexistence with cross-layer interaction.
- Study HW-SW codesign issues with both real-time and application processors.

Slide38

Conclusion (1)

I propose a method for modeling next generation protocol coexistence, and the implementation of a joint Wi-Fi/LTE processing chain as a proof of concept

The modeling environment will allow researchers to analyze any processing chain at a high level
- Derive theoretical estimates for time delay, utilization, power
- Assist in selection of one or multiple HW-SW divide points

Processing blocks will be optimized based on their relative size compared to the overall processing chain

Save time, improve reliability over alternate SDR solutions

Slide39

Conclusion (2)

Wi-Fi / LTE coexistence study of interest to wireless community
- Analyze the commonalities and differences between Wi-Fi and LTE processing blocks
- Prototype joint preamble detection to handle the most costly processing block in either chain
- Further distinguish an LTE from a Wi-Fi transmission by detecting multiple ABSs in a HW-SW collaborative manner

Automation of transceive chain generation, as time permits

A new path and procedure for studying protocol coexistence
- Can be utilized and expanded upon by wireless networking and HW-SW systems design research communities

Slide40

Publications

Published:

B. Drozdenko, R. Subramanian, K. Chowdhury, and M. Leeser, “Implementing a MATLAB-Based Self-configurable Software Defined Radio Transceiver,” in Cognitive Radio Oriented Wireless Networks - 10th International Conference, CROWNCOM 2015, Doha, Qatar, April 21-23, Revised Selected Papers, 2015, pp. 164–175.

R. Subramanian, B. Drozdenko, E. Doyle, R. Ahmed, M. Leeser, and K. R. Chowdhury, “High-Level System Design of IEEE 802.11b Standard-Compliant Link Layer for MATLAB-Based SDR,” IEEE Access, vol. 4, pp. 1494–1509, 2016.

Submitted, Pending:

B. Drozdenko, M. Zimmermann, T. Dao, K. Chowdhury, and M. Leeser, “Modeling Considerations for the Hardware-Software Co-design of Flexible Modern Wireless Transceivers,” in 2016 26th International Conference on Field Programmable Logic & Applications (FPL), Lausanne, Switzerland, August 29-September 2, 2016. [Accepted, to appear August 2016]

B. Drozdenko, M. Zimmermann, T. Dao, K. Chowdhury, and M. Leeser, “Hardware-Software Codesign of Wireless Transceivers on Zynq Heterogeneous Systems,” in IEEE Transactions on Emerging Topics in Computing, Special Issue on Next Generation Wireless Computing Systems, vol. 5, no. 1, March 2017. [Accepted with Major Revision in May 2016, expected publication March 2017]

Slide41

Posters & Acknowledgments

Extended Abstracts & Posters:

B. Drozdenko, R. Subramanian, K. Chowdhury, and M. Leeser, "Bi-Directional Transceiver Implementation with Optimal Parameter Selection", 5th Annual New England Workshop on Software Defined Radio (NEWSDR 2015), Worcester, MA, May 22, 2015.

B. Drozdenko, M. Zimmermann, T. Dao, K. Chowdhury, and M. Leeser, "High-Level Hardware-Software Codesign of an 802.11a Transceiver System using Zynq SoC", 2016 Boston Area Architecture Workshop (BARC 2016), Boston, MA, January 29, 2016.

B. Drozdenko, M. Zimmermann, T. Dao, M. Leeser, and K. Chowdhury, "High-Level Hardware-Software Co-design of an 802.11a Transceiver System using Zynq SoC", in Proceedings of the IEEE International Conference on Computer Communications, INFOCOM 2016, San Francisco, CA, April 10-15, 2016, pp. 469-470.

B. Drozdenko, M. Zimmermann, T. Dao, K. Chowdhury, and M. Leeser, “Profiling 802.11a Standard on the Xilinx Zynq System-on-Chip”, in 6th Annual New England Workshop on Software Defined Radio (NEWSDR 2016), Boston, MA, June 3, 2016.

Acknowledgments:

Slide42

References

[1] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications: High-speed Physical Layer in the 5 GHz Band, IEEE Std. 802.11a-1999, 1999.

[2] United States Department of Commerce, National Telecommunications & Information Administration, Office of Spectrum Management. “United States Frequency Allocations: The Radio Spectrum.” URL: http://www.ntia.doc.gov/files/ntia/publications/2003-allochrt.pdf.

[3] I. O. Ettus Research, “USRP N200/N210 networked series,” 2015. [Online]. URL: http://www.ettus.com.

[4] GNU Radio Project, “GnuRadio: The free and open source radio ecosystem,” 2015. [Online]. Available: http://www.gnuradio.org

[5] MathWorks, Inc. (2016) Zynq SDR support from communications system toolbox. [Online]. Available: http://www.mathworks.com/hardwaresupport/zynq-sdr.html

[6] Analog Devices, Inc. (2015) Integrated transceivers, transmitters, and receivers. [Online]. Available: http://www.analog.com/en/products/rfmicrowave/integrated-transceivers-transmitters-receivers.html.

[7] Xilinx, Inc. (2016) Zynq UltraScale+ All Programmable Heterogeneous MPSoC. [Online]. Available: http://www.xilinx.com/products/silicon-devices/soc/zynq-ultrascale-mpsoc.html.

[8] MathWorks, Inc. (2016) Synchronization Signals (PSS and SSS). [Online]. Available: http://www.mathworks.com/help/lte/ug/synchronization-signals-pss-and-sss.html.

[9] M. M. Do and H. J. Son. (2014) Interference Coordination in LTE/LTE-A (2): eICIC (enhanced ICIC). [Online]. Available: http://www.netmanias.com/en/post/blog/6551/lte-lte-a-eicic/interference-coordination-in-lte-lte-a-2-eicic-enhanced-icic.

Slide43

Supplementary Material: Introduction: Issue: Spectrum Scarcity

54-698 MHz: 802.11af TV Whitespace Reuse

3.55-3.65 GHz: Military RADAR Reuse

2.4, 5.8 GHz: 802.11a/b Designated ISM Bands

[2USNTIA]

Slide44

Supplementary Material: Barriers: Why Making Such Wireless Transceivers Is Hard

Comms protocols evolve; transceiver HW/SW must evolve too!

B1: HW-SW modeling environment
B2: HW & SW must be reconfigurable
B3: Each processing block (PB) must be same on HW & SW
B4: Map behaviors to HW or SW

[Figure: Modeling Environment in which y=FFT(x) corresponds to both "module FFT(x,y)" and "function FFT(x,y)"; ProcBlk1, ProcBlk2, and ProcBlk3 connected over an Effective Bus between SW and HW (FPGA); radio parameters f_c = 5.8 GHz, f_s = 20 Msps]

Slide45

Supplementary Material: Introduction: Approach to Modeling Transceivers for Coexistence

System goals and challenges
- Implementation Portability: Equivalent functionality on HW & SW
- Time Synchronization
- Low Energy
- Low Error Rate
- Spectrum Reuse: Put multiple protocols on specific bandwidths

Fundamental Research (R1-R5)
- Signal Processing for Wireless Communications at the PHY Layer
- Adapt wireless processing chain to new specifications and to handle contention
- Time & Energy Optimization Techniques
- Map processing blocks to HW & SW, and est. timing & utilization
- Wireless Networking: Study Effects of Multiple Protocols on Same BW

Enabling Technologies (T1-T3)
- Simulink/Vivado-based HW-SW Modeling Environ
- ADI FMComms board for Tunable Radio Parameters
- Zynq SoC: RISC processor, Reconfigurable HW (FPGA)

Slide46

Supplementary Material: Introduction to SDR: from Static HW to SW

Radio Hardware: Superheterodyne Transmitter/Receiver [10Wikipedia]

SDR: Universal Software Radio Peripheral (USRP) [3Ettus]

Slide47

Supplementary Material: SDR ESL Design: SWAP Tradeoffs

Slide48

Supplementary Material: IEEE 802.11 Std Characteristics & Processing Blocks

- PLCP Preamble
- Scrambling
- Convolutional Encoding
- Block Interleaving
- PSK Modulation
- Symbol-to-Subcarrier Mapping and Pilot Insertion
- OFDM Modulation with Cyclic Prefix Attachment

Slide49

Background: IEEE 802.11b Transmitter Processing Blocks

[Ref 9.3] Scrambling
[Ref 9.4] Modulation
[Ref 3] Spectrum Spreading
[Ref 7] Framing (802.11b)
[Ref 13.1] Raised Cosine Filter

Slide50

Background: IEEE 802.11b Receiver Processing Blocks

Receiver Front End:

Receiver Controller:

[Ref 9.1] AGC
[Ref 10] Frequency Offset Compensation
[Ref 9.2] [Ref 10] Synchronization
[Ref 2]

Slide51

Supplementary Material: 802.11b on USRP
Designated Transmitter & Receiver

RFFE: Radio Frequency Front End: AGC, Frequency Offset Estimation & Compensation, and Raised Cosine Receive Filter (RCRF)

PD: Preamble Detection

DDD: Despreading, Demodulation, and Descrambling

SMS: Scrambling, Modulation, and Spreading

RCTF: Raised Cosine Transmit Filter

Slide52

Supplementary Material: 802.11b on USRP
Preamble Detection Algorithm

1.) Compare Received Signal (complex samples) to Expected Spread Preamble (real samples)

Despread and demodulate to get real bits

2.) Compare Demodulated Signal to Expected Scrambled Preamble (real symbols)

3.) Compare Descrambled 2nd USRP Frame to Expected SFD Sequence (real bits)

[Figure labels: −Window1, +Window1; Descrambled Next USRP Frame; Expected SFD Sequence; −Window2, +Window2]

Slide53

Preliminary Work: 802.11a on Zynq Results
Block Variants: Viterbi Decoder

VD Variant           | Delay-Based | BRAM-Based
Data Path Delay (ns) | 308         | 314
% LUTs               | 41.0        | 40.3
% Registers          | 4.2         | 3.2
BRAM Tiles           | 0           | 2
Total Power (W)      | 2.36        | 2.36
VD Power (W)         | 0.011       | 0.005

Block reverses effects of Convolutional Encoder
- Requires memory to hold intermediate state values

1st VD uses delay blocks to hold state memory
- Exhibits lower path delay

2nd VD uses BRAM tiles to hold state memory
- Uses fewer LUTs and registers
- Slightly lower power

Illustrates tradeoff between time and power
- Can dynamically tune design to target either objective

Slide54

Supplementary Material: 802.11a Transmit and Receive Chains

802.11a: Frequency Band: 5.8 GHz; Rates: 6, 9, 12, 18, 24, 36, 48, 54 Mbps

Transmit chain blocks: Data Bits, Scramble, Convolutional Encode, Block Interleave, Digital Modulate, OFDM Modulate, Preamble Switch, To RF Board

Receive chain blocks: From RF Board, Preamble Detect, OFDM Demodulate, Digital Demodulate, Block Deinterleave, Viterbi Decode, Descramble, Data Bits

Block parameters
- Convolutional coding: ½, 2/3, ¾ rates
- Digital modulation: BPSK, QPSK, 16QAM, 64QAM
- FFT Size: 64
- Cyclic Prefix: 0.8 μs

For live, online tests, insert:
- Automatic Gain Control (AGC)
- Frequency Offset Compensation

For changing datasets, insert (transmit side):
- Zero Bit Padding
- PHY Layer Framing
- CRC Calculation & Appending

For changing datasets, insert (receive side):
- CRC Checking
- PHY Layer Deframing
- Zero Bit Removal

Slide55

Supplementary Material: 802.11b Transmit and Receive Chains

802.11b: Frequency Band: 2.4 GHz; Rates: 1, 2, 5.5, 11 Mbps

Transmit chain blocks: Data Bits, Scramble, Digital Modulate, Spectrum Spread, Preamble Switch, Raised Cosine Tx Filter, To RF Board

Receive chain blocks: From RF Board, Raised Cosine Rx Filter, Preamble Detect, Spectrum Despread, Digital Demodulate, Descramble, Data Bits

Block parameters
- Digital modulation: DBPSK, DQPSK
- DSSS: Direct Sequence Spread Spectrum
- Preamble: 128 One Bits, scrambled & spread, + fixed SFD sequence
- Optional: Convolutional Encode (transmit), Convolutional Decode (receive); PBCC: ½ rate only

For live, online tests, insert:
- Automatic Gain Control (AGC)
- Frequency Offset Compensation

For changing datasets, insert (transmit side):
- Zero Bit Padding
- PHY Layer Framing
- CRC Calculation & Appending

For changing datasets, insert (receive side):
- CRC Checking
- PHY Layer Deframing
- Zero Bit Removal

Slide56

Supplementary Material: 802.11g/n

802.11g: Frequency Band: 2.4 GHz; Rates: 1, 2, 5.5, 6, 9, 11, 12, 18, 24, 36, 48, 54 Mbps
- Effectively Combines 802.11a and 802.11b standards
- 802.11g preamble is combined 802.11b + 802.11a preambles

802.11n: Frequency Band: 2.4 GHz or 5.8 GHz; Additional Rates up to 600 Mbps
- Uses up to 4 Antennas to improve throughput using MIMO SDM

Slide57

Supplementary Material: LTE Downlink Shared Channel

LTE Downlink Shared Channel (DL-SCH): Frequency Bands: 700 MHz, 1.7 GHz, 1.9 GHz; Bit rates of 1.7-403.2 Mbps

LTE standard is up to 8x8 MIMO for DL, but we will prototype for 1 antenna

Transmit chain blocks: Data Bits, Turbo Encode, Rate Match, Scramble, Digital Modulate, Layer Map, Precode, OFDMA Modulate, PSS/SSS Switch, To RF Board

Receive chain blocks: From RF Board, PSS/SSS Detect, OFDMA Demodulate, De-precode, Layer De-map, Digital Demodulate, Turbo Decode, Descramble, Data Bits

Block parameters
- Turbo coding: 1/3, ½, 2/3, 5/6 rates
- Digital modulation: QPSK, 16QAM, 64QAM
- FFT Sizes: 128-2,048
- Cyclic Prefix: 5.21-33.33 μs

For LTE, synchronization signals determine the frame timing

For live, online tests, insert:
- Automatic Gain Control (AGC)
- Frequency Offset Compensation
- DL Channel Estimation

For changing datasets, insert (transmit side):
- Zero Bit Padding
- CRC Calculation & Appending
- Code Block Segmentation

For changing datasets, insert (receive side):
- Code Block Desegmentation
- CRC Checking
- Zero Bit Removal

Slide58

Supplementary Material: LTE Fragmented Bandwidths

Band | LTE UL          | LTE DL
1    | 1.92-1.98 GHz   | 2.11-2.17 GHz
2    | 1.85-1.91 GHz   | 1.93-1.99 GHz
3    | 1.71-1.785 GHz  | 1.805-1.88 GHz
4    | 1.71-1.755 GHz  | 2.11-2.155 GHz
5    | 824-849 MHz     | 869-894 MHz
6    | 830-840 MHz     | 875-885 MHz
7    | 2.5-2.57 GHz    | 2.62-2.69 GHz
8    | 880-915 MHz     | 925-960 MHz
9    | 1.75-1.785 GHz  | 1.845-1.88 GHz
10   | 1.71-1.77 GHz   | 2.11-2.17 GHz
11   | 1.428-1.448 GHz | 1.476-1.496 GHz
12   | 699-716 MHz     | 729-746 MHz
13   | 777-787 MHz     | 746-756 MHz
14   | 788-798 MHz     | 758-768 MHz
17   | 704-716 MHz     | 734-746 MHz
18   | 815-830 MHz     | 860-875 MHz
19   | 830-845 MHz     | 875-890 MHz
20   | 832-862 MHz     | 791-821 MHz
21   | 1.448-1.463 GHz | 1.496-1.511 GHz
22   | 3.41-3.49 GHz   | 3.51-3.59 GHz
23   | 2-2.02 GHz      | 2.18-2.2 GHz
24   | 1.627-1.661 GHz | 1.525-1.559 GHz
25   | 1.85-1.915 GHz  | 1.93-1.995 GHz
26   | 814-849 MHz     | 859-894 MHz
33   | 1.9-1.92 GHz    | 1.9-1.92 GHz
34   | 2.01-2.025 GHz  | 2.01-2.025 GHz
43   | 3.6-3.8 GHz     | 3.6-3.8 GHz

[Figure annotations: 80 MHz; TDD; 200 MHz, TDD; Rel11]