April 9, 2012 Case Study: Complex Multi-Core Virtual Platform Enablement Using TLM 2.0

Uploaded On 2020-08-04

Presentation Transcript

Slide1

April 9, 2012

Case Study: Complex Multi-Core Virtual Platform Enablement Using TLM 2.0 to co-simulate Diverse Simulation Models in a Multi-threaded Environment

Navaneet Kumar, Sandeep Jain, Rajesh Jain

Networking & Multimedia Solutions Group

Updated Jan 2012

Slide2

Outline

Virtual Platform – Challenges & Requirements
Significance of TLM 2.0
TLM LT Methodology
TLM LT Adapter
Moving to Multi-Threading
Challenges and Learnings of Multi-Threading
Challenges and Learnings of DMI
Performance Results
Conclusion
References

Slide3

Virtual Platform – Challenges & Requirements

SoCs are becoming more complex day by day
  Today's SoCs contain multiple heterogeneous cores, hardware accelerators, peripherals, and a complex memory hierarchy with hardware-supported coherency

High-fidelity models developed by different divisions and teams in a large organization follow diverse modeling frameworks
  Porting all the complex models to a common modeling framework is no easy task

Timely availability of a virtual platform is critical
  Useful for software driver development

A common methodology and infrastructure is required
  To enable interoperability of such diverse simulation models
  To demonstrate quick virtual platform integration and co-simulation

Slide4

Example of FSL B4860 Baseband Processor

Slide5

Significance of TLM 2.0

TLM 2.0 introduces an interoperability layer
  The generic payload class is suitable for carrying the most common payload information
  Core interfaces are suitable for various model-to-model communication scenarios

TLM 2.0 enables seamless integration of diverse simulation models
  Extensions can be used to carry any type of payload and phase information
  Models from different modeling frameworks can comply with the TLM 2.0 APIs
  Our virtual platform integration is a proof

Here, we present our TLM LT methodology as a case study

Slide6

TLM LT Methodology

[Flow diagram. Box labels: Specs; TLM definition; TLM Extension file; TLM Adapter Creation; TLM LT Adapters; C/C++ and SystemC Model Library; TLM integration]

Slide7

TLM LT Adapter – The Key Piece

Based on the blocking transport interface
  Provides fast functional model-to-model communication
  Has no dependency on the SystemC scheduler

Uses TLM multi-sockets to allow connectivity to multiple initiators/targets

Zero-delay model
  Doesn't buffer transactions at all
  Immediately maps a C/C++ API call to a TLM API call and vice versa
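The adapter behaviour described on this slide can be sketched in plain C++. This is only an illustration under stated assumptions, not the authors' adapter: `Payload`, `BlockingTarget`, `Memory` and `LtAdapter` are hypothetical stand-ins, and `Payload` mirrors just a few fields of `tlm::tlm_generic_payload`. The real TLM 2.0 `b_transport` also takes an `sc_core::sc_time&` delay argument, which is left out here because the slide describes a zero-delay adapter with no scheduler dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for tlm::tlm_generic_payload (only the fields used here).
enum class Command { Read, Write };
enum class Response { Incomplete, Ok, AddressError };

struct Payload {
    Command       command  = Command::Read;
    std::uint64_t address  = 0;
    std::uint8_t* data     = nullptr;
    std::size_t   length   = 0;
    Response      response = Response::Incomplete;
};

// Hypothetical stand-in for a target reached through blocking transport.
struct BlockingTarget {
    virtual ~BlockingTarget() = default;
    virtual void b_transport(Payload& trans) = 0;
};

// Minimal memory model that completes every transaction inside the call.
class Memory : public BlockingTarget {
public:
    explicit Memory(std::size_t size) : bytes_(size, 0) {}

    void b_transport(Payload& t) override {
        // Reject accesses that start or end outside the memory.
        if (t.address > bytes_.size() || t.length > bytes_.size() - t.address) {
            t.response = Response::AddressError;
            return;
        }
        std::uint8_t* cell = bytes_.data() + t.address;
        if (t.command == Command::Write) {
            std::memcpy(cell, t.data, t.length);
        } else {
            std::memcpy(t.data, cell, t.length);
        }
        t.response = Response::Ok;
    }

private:
    std::vector<std::uint8_t> bytes_;
};

// Adapter: exposes a native C++ API and maps each call straight onto one
// blocking transport call. Nothing is queued and no delay is modelled.
class LtAdapter {
public:
    explicit LtAdapter(BlockingTarget& target) : target_(target) {}

    bool read32(std::uint64_t addr, std::uint32_t& value) {
        return transfer(Command::Read, addr,
                        reinterpret_cast<std::uint8_t*>(&value), sizeof value);
    }

    bool write32(std::uint64_t addr, std::uint32_t value) {
        return transfer(Command::Write, addr,
                        reinterpret_cast<std::uint8_t*>(&value), sizeof value);
    }

private:
    bool transfer(Command cmd, std::uint64_t addr,
                  std::uint8_t* data, std::size_t len) {
        Payload trans;
        trans.command = cmd;
        trans.address = addr;
        trans.data    = data;
        trans.length  = len;
        target_.b_transport(trans);  // returns only when the access is done
        return trans.response == Response::Ok;
    }

    BlockingTarget& target_;
};
```

A caller would construct a `Memory`, wrap it in an `LtAdapter`, and use `write32`/`read32` as if they were the model's native API. Because the call returns only after the target has filled in the response, no transaction ever needs to be buffered in the adapter.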

Slide8

TLM LT Adapters – An example

[Block diagram. DSP-side C++ components (DSP Module, DSP custom interface) attach to a TLM LT DSPAdapter; Power Arch-side C++ components (MemAccess interface, Snoop interface) attach to a TLM LT PowerAdapter. The adapters exchange tlm_generic_payload transactions via b_transport over TLM 2.0 with the Central Interconnect Module, with a multi initiator/target socket pair on each side. Traffic labels: Mem access; Mem/Reg access; Snoop request-responses.]

Slide9

TLM Extension – Key Attributes

bool snoop_enable
  Helps to distinguish between cacheable and non-cacheable memory regions
  Transactions with this attribute set are broadcast to all masters with caches implemented, to maintain coherency

uint32_t decoration_attr
  Used with special decoration instructions, which can atomically set, clear, increment or decrement another region in memory along with a read/write to the specified region
  Set by the SC3900/e6500 cores on execution of a decoration instruction to tell the L3 cache to take the required actions

uint32_t addr_only_type
  Helps to model specific address-only transaction types, e.g. TLBIE causes broadcast of a TLB invalidate entry operation to all snoopers
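As a rough illustration of how an interconnect model might act on these three attributes, the sketch below uses a plain struct in place of a real extension class. The field names come from the slide; everything else is an assumption made for the example: the struct name `CoherencyExtension`, the numeric encodings (`kAddrOnlyTlbie`, `kDecorationIncrement`, and so on), and the `classify` and `needs_l3_decoration` helpers. In an actual TLM 2.0 model the struct would derive from `tlm::tlm_extension<>` and implement `clone()` and `copy_from()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encodings; the presentation does not give the real values.
constexpr std::uint32_t kAddrOnlyNone        = 0;
constexpr std::uint32_t kAddrOnlyTlbie       = 1;
constexpr std::uint32_t kDecorationNone      = 0;
constexpr std::uint32_t kDecorationIncrement = 1;

// Plain-struct stand-in for the extension carried with each transaction.
struct CoherencyExtension {
    bool          snoop_enable    = false;
    std::uint32_t decoration_attr = kDecorationNone;
    std::uint32_t addr_only_type  = kAddrOnlyNone;
};

// Routing decisions an interconnect model could take (names are assumptions).
enum class Route { TargetOnly, SnoopBroadcast, AddressOnlyBroadcast };

// Address-only operations such as TLBIE go to all snoopers; snoopable
// accesses go to all caching masters; everything else goes to the target.
inline Route classify(const CoherencyExtension& ext) {
    if (ext.addr_only_type != kAddrOnlyNone) {
        return Route::AddressOnlyBroadcast;
    }
    if (ext.snoop_enable) {
        return Route::SnoopBroadcast;
    }
    return Route::TargetOnly;
}

// True when the L3 cache model has extra decoration work to perform.
inline bool needs_l3_decoration(const CoherencyExtension& ext) {
    return ext.decoration_attr != kDecorationNone;
}
```

Keeping these fields outside the generic payload means models that ignore coherency can still process the transaction unchanged, while the interconnect and cache models can inspect the extension when it is present.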

Slide10

Moving to Multi-Threading (MT)

Full-system virtual platforms are becoming very complex
  Running a highly complex system in a single thread becomes a severe bottleneck to simulation performance

Multi-threading the simulator is the approach to speed up simulation
  Helps effective utilization of concurrent host system resources, which are heavily multi-core and multi-threaded

Logical partitioning of sub-system models to run in separate threads helps achieve multi-threading
  Most straightforward way to partition work among different threads
  Minimal changes needed in the single-threaded version of the code

Slide11

Example of FSL B4860 Baseband Processor

Slide12

Challenges & Learnings in Multi-Threading

TLM Adapter with multi-initiator/target socket became a severe bottleneck
  All threads on the DSP side contended for the shared resources in the TLM Adapter, even when they were accessing different resources on the Power Arch side

We chose to instantiate multiple TLM Adapters, one per DSP thread
  An adapter may still connect to multiple entities working on the same thread
  No contention due to TLM connectivity!

All DSP-side threads still contend for the shared memory
  Need to protect memory, and this again impacts simulation performance

TLM 2.0 DMI helps minimize this impact
  Most of the accesses are satisfied via DMI
  Protection needed only when acquiring DMI pointers
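The combination of per-thread adapters and DMI can be sketched as follows. This is a simplified illustration, not the authors' code: `SharedMemory`, `ThreadAdapter` and the `lock_count` counter are hypothetical names, and `acquire_dmi` stands in for the TLM 2.0 `get_direct_mem_ptr` call. The point shown is that the lock is taken once per adapter, when the pointer is acquired, and not on every access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Shared memory model. The mutex guards only the pointer hand-out.
class SharedMemory {
public:
    explicit SharedMemory(std::size_t size) : bytes_(size, 0) {}

    // Slow path: hand out a direct pointer under the lock.
    std::uint8_t* acquire_dmi(std::size_t& size_out) {
        std::lock_guard<std::mutex> guard(lock_);
        ++lock_count_;  // instrumentation for the example only
        size_out = bytes_.size();
        return bytes_.data();
    }

    // Only meaningful once all simulation threads have finished.
    int lock_count() const { return lock_count_; }

private:
    std::mutex                lock_;
    std::vector<std::uint8_t> bytes_;
    int                       lock_count_ = 0;
};

// One adapter per simulation thread, so its cached DMI pointer is
// thread-private and needs no synchronization of its own.
class ThreadAdapter {
public:
    explicit ThreadAdapter(SharedMemory& mem) : mem_(mem) {}

    bool write8(std::size_t addr, std::uint8_t value) {
        std::uint8_t* cell = pointer_for(addr);
        if (cell == nullptr) return false;
        *cell = value;
        return true;
    }

    bool read8(std::size_t addr, std::uint8_t& value) {
        std::uint8_t* cell = pointer_for(addr);
        if (cell == nullptr) return false;
        value = *cell;
        return true;
    }

private:
    // Fast path after the first call: plain pointer arithmetic, no lock.
    std::uint8_t* pointer_for(std::size_t addr) {
        if (dmi_base_ == nullptr) {
            dmi_base_ = mem_.acquire_dmi(dmi_size_);
        }
        return addr < dmi_size_ ? dmi_base_ + addr : nullptr;
    }

    SharedMemory& mem_;
    std::uint8_t* dmi_base_ = nullptr;
    std::size_t   dmi_size_ = 0;
};
```

In a multi-threaded platform each partition thread would own one `ThreadAdapter`. The sketch does not order concurrent accesses to the same bytes; that remains the responsibility of the models and of the coherency scheme layered on top of DMI.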

Slide13

Challenges and Learnings with DMI

DMI should be used cautiously!
  Causes any bus snooping capabilities provided by the interconnect model to be bypassed
  Certain device models have internal caches, and these need to be kept coherent with the main memory
  If a master modifies a memory region via DMI and that region is present in one of the cache models, then there is no way to keep the system coherent!

We employed a mechanism to allow DMI while maintaining coherency
  DMI acquire succeeds only if a memory region is not already cached
  All requests to a cached region go via the interconnect model – no DMI, snoop broadcast if asked
  Whenever a new region is cached, a DMI invalidate is sent to all masters to prevent them from using DMI for such regions
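The three rules of this mechanism can be illustrated with a small bookkeeping model. It is a sketch under assumptions, not the authors' implementation: memory is assumed to be divided into regions identified by a hypothetical `RegionId`, and `Master`, `CoherentInterconnect`, `acquire_dmi` and `mark_cached` are names invented for the example. In TLM 2.0 the corresponding interface calls are `get_direct_mem_ptr` on the forward path and `invalidate_direct_mem_ptr` on the backward path.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

using RegionId = std::uint64_t;  // hypothetical region identifier

// A bus master that remembers which regions it may access directly.
struct Master {
    std::set<RegionId> dmi_regions;

    // Called by the interconnect when direct access must stop.
    void invalidate_dmi(RegionId region) { dmi_regions.erase(region); }
};

class CoherentInterconnect {
public:
    void attach(Master& master) { masters_.push_back(&master); }

    // Rule 1: DMI is granted only for regions that no cache model holds.
    // A refused master keeps using normal transactions (rule 2), so the
    // interconnect can still snoop them.
    bool acquire_dmi(Master& master, RegionId region) {
        if (cached_.count(region) != 0) {
            return false;
        }
        master.dmi_regions.insert(region);
        return true;
    }

    // Rule 3: when a region becomes cached, revoke DMI from every master.
    void mark_cached(RegionId region) {
        cached_.insert(region);
        for (Master* master : masters_) {
            master->invalidate_dmi(region);
        }
    }

private:
    std::set<RegionId>   cached_;
    std::vector<Master*> masters_;
};
```

A master whose `acquire_dmi` call fails simply falls back to the transaction path through the interconnect. The sketch never removes a region from the cached set, since the presentation does not describe when, or whether, DMI is granted again after eviction.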

Slide14

Multi-Thread setup

[Block diagram. DSP side: DSP Partition 1, DSP Partition 2 and DSP Partition 3, with three TLM DSP Adapter boxes. Power Arch side: Power Partition 1 and Power Partition 2, with three TLM PA Adapter boxes. Both sides connect to the Inter-connect over TLM 2.0. Thread labels in the figure: Thread-1 through Thread-5.]

Slide15

Performance Chart

Slide16

Conclusion

A TLM 2.0 based methodology is a good technique to integrate and co-simulate diverse LT simulation models

TLM Adapters can be designed and deployed in a smart way that ensures coherency

Multi-threading is the way to go
  Great speed-up for virtual platform simulation

Slide17

References

SystemC IEEE 1666-2011 Language Reference Manual

http://www.freescale.com

Slide18

Thank You!
