Porting Source 2 to Vulkan
Dan Ginsburg
Valve

Summary
- Source 2 Overview
- Porting to Vulkan
- Shaders and Pipelines
- Command Buffers
- Memory Management
- Descriptor Sets

Source 2 Overview

Source 2
- OpenGL, DX9, DX11, Vulkan
- Windows, Linux, Mac
- Dota 2 Reborn

Source 2 Rendering
- DX11-like rendersystem abstraction
- Multithreaded
  - DX9/GL: software command buffers
  - DX11: deferred contexts
- Single submission thread

Source 2 Rendering (GL)

Source 2 Vulkan Port
- Started with GL and DX11 renderer
- DX11 deferred contexts mapped well to Vulkan command buffers
- Leveraged GLSL shader conversion

Shaders and Pipelines

Porting Shaders to Vulkan
- HLSL -> GLSL
  - See: Moving Your Games to OpenGL, Steam Dev Days 2014
- GLSL -> SPIR-V
  - Descriptor set layout qualifiers to GLSL
- Open source
  - glslang SPIR-V backend
  - SPVremapper for compression
  - https://github.com/KhronosGroup/glslang

Pipeline State Objects (PSOs)
- Each thread caches pipeline state
- Global pipeline manager
  - PSO Map
  - Pending and current
  - Reduces mutexing

[Diagram labels: End of Frame; Pending PSO Map; Current PSO Map; PSO0, PSO1, PSO2, PSO3; Pipeline State Hash; Create PSO; Lookup]

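The pending/current split above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Source 2 code: `Pipeline`, `PipelineManager`, `GetOrCreate`, and `EndFrame` are hypothetical names, the PSO is reduced to its state hash, and the per-thread cache that would sit in front of the manager is left out. The idea shown is that the current map is read without a lock during the frame, new PSOs go into a mutex-protected pending map, and the pending map is folded into the current map at end of frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a compiled pipeline (a VkPipeline in a real renderer).
struct Pipeline {
    std::uint64_t stateHash;
};

class PipelineManager {
public:
    // Called by recording threads. The current map only changes in EndFrame(),
    // so it can be read here without taking the mutex.
    Pipeline GetOrCreate(std::uint64_t stateHash) {
        auto hit = current_.find(stateHash);
        if (hit != current_.end()) return hit->second;

        // Miss: fall back to the pending map, which is shared and locked.
        std::lock_guard<std::mutex> lock(pendingMutex_);
        auto result = pending_.emplace(stateHash, Pipeline{stateHash});
        if (result.second) ++createCount_;  // a real renderer would build the PSO here
        return result.first->second;
    }

    // Called once per frame, after every recording thread has finished.
    void EndFrame() {
        std::lock_guard<std::mutex> lock(pendingMutex_);
        current_.insert(pending_.begin(), pending_.end());
        pending_.clear();
    }

    std::size_t CurrentSize() const { return current_.size(); }
    std::size_t PendingSize() const { return pending_.size(); }
    int CreateCount() const { return createCount_; }

private:
    std::unordered_map<std::uint64_t, Pipeline> current_;
    std::unordered_map<std::uint64_t, Pipeline> pending_;
    std::mutex pendingMutex_;
    int createCount_ = 0;
};
```

In this sketch a PSO requested for the first time costs one locked lookup per request until the next `EndFrame()`; after that, every lookup for it is lock-free.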
Command Buffers

Command Buffers
- Used where DX11 deferred contexts were used
- Each thread builds command buffer
- Single thread performs submission to queue

Command Buffers
- Recycled within per-thread pools

[Diagram labels: Thread 0 CommandPool, Thread 1 CommandPool, Thread 2 CommandPool, each holding CmdBufs; Thread0, Thread1, Thread2; Submit, Submit1, Submit2; Main; Check Fences]

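One way to picture the recycling shown on this slide is a per-thread pool that hands out command buffers, remembers which fence each submitted buffer is waiting on, and returns buffers to a free list once that fence has retired. The sketch below is an assumption-laden model rather than the Source 2 implementation: `ThreadCommandPool` and `CommandBuffer` are hypothetical, and the fence is modeled as a monotonically increasing submission counter instead of a `VkFence`.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for a VkCommandBuffer handle.
struct CommandBuffer {
    int id;
};

// One instance per recording thread, so no locking is needed inside.
class ThreadCommandPool {
public:
    // Reuses a retired command buffer if one is available, else creates one.
    CommandBuffer Acquire() {
        if (!free_.empty()) {
            CommandBuffer cb = free_.back();
            free_.pop_back();
            return cb;
        }
        return CommandBuffer{nextId_++};
    }

    // Records which fence value must retire before cb can be reused.
    void MarkSubmitted(CommandBuffer cb, std::uint64_t fenceValue) {
        assert(inFlight_.empty() || inFlight_.back().fenceValue <= fenceValue);
        inFlight_.push_back({cb, fenceValue});
    }

    // "Check Fences": moves every buffer whose fence has retired to the free list.
    void CheckFences(std::uint64_t completedFenceValue) {
        while (!inFlight_.empty() &&
               inFlight_.front().fenceValue <= completedFenceValue) {
            free_.push_back(inFlight_.front().cb);
            inFlight_.pop_front();
        }
    }

    int AllocatedCount() const { return nextId_; }

private:
    struct InFlight {
        CommandBuffer cb;
        std::uint64_t fenceValue;
    };
    std::vector<CommandBuffer> free_;
    std::deque<InFlight> inFlight_;
    int nextId_ = 0;
};
```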
Command Buffer Performance
- Submit in batches
  - vkQueueSubmit has cost on Windows
  - Faster to group submissions together
- Minimize number of command buffers
- Minimize memory referenced per command buffer
- Use VK_CMD_BUFFER_OPTIMIZE_ONE_TIME_SUBMIT_BIT (VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT in Vulkan 1.0)
  - Optimize for one-time submission

Memory Management

General Strategies
- Pool resources together
  - Reduces memory reference count
- Use per-thread pools to reduce contention
- Recycle dynamic pools on frame boundaries

Resources
- Static Resources
  - Global Pools
  - Device Only
    - Textures/Render Targets: 128MB Pools
    - VB/IB/CBs: 8MB Pools
- Dynamic Resources
  - Per-Thread Pools
  - Host Visible (Persistently Mapped)
    - VB/IB/CBs: 8MB Pools

Dynamic Vertex/Index Buffers
- To update:
  - Grab new offset from per-thread pool
  - memcpy into pool
  - Bind VBs with: vkCmdBindVertexBuffers(..,buffer,offset)
  - Bind IBs with: vkCmdBindIndexBuffer(..,buffer,offset,..)
- Recycle pools when last GPU fence of frame retires

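The update steps above amount to a bump allocator over a persistently mapped buffer. The following is a minimal sketch under assumptions: `DynamicBufferPool` is a hypothetical name, a `std::vector` stands in for the mapped 8MB pool memory, and the caller is assumed to call `Recycle()` only once the last GPU fence of the frame has retired. The offset it returns is the value that would be passed as the offset argument when binding the vertex or index buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// One instance per thread; the vector models persistently mapped pool memory.
class DynamicBufferPool {
public:
    explicit DynamicBufferPool(std::size_t sizeBytes) : mapped_(sizeBytes) {}

    // Copies `size` bytes into the pool at the next aligned offset.
    // Returns false (and copies nothing) if the pool is full.
    bool Upload(const void* data, std::size_t size, std::size_t alignment,
                std::size_t* outOffset) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t offset = (head_ + alignment - 1) & ~(alignment - 1);
        if (offset > mapped_.size() || size > mapped_.size() - offset) {
            return false;
        }
        std::memcpy(mapped_.data() + offset, data, size);
        head_ = offset + size;
        *outOffset = offset;
        return true;
    }

    // Call only after the GPU has finished with everything in the pool.
    void Recycle() { head_ = 0; }

    const unsigned char* Data() const { return mapped_.data(); }

private:
    std::vector<unsigned char> mapped_;
    std::size_t head_ = 0;
};
```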
Dynamic Uniform Buffers
- Differences from VB/IBs:
  - UBOs are bound via descriptors
  - Use dynamic UBOs to avoid vkUpdateDescriptors (vkUpdateDescriptorSets in Vulkan 1.0)
  - Pass UBO offset to vkCmdBindDescriptorSets

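With dynamic UBOs the descriptor keeps pointing at the same buffer and only the offset changes per draw. In released Vulkan, the offsets handed to vkCmdBindDescriptorSets must be multiples of the device limit `minUniformBufferOffsetAlignment`, so each draw's constants need an aligned slot. The helper below is an illustrative sketch of that arithmetic only; `AlignUp` and `DynamicOffsets` are hypothetical names and no Vulkan calls are made.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Rounds value up to the next multiple of alignment (alignment must be nonzero).
inline std::uint32_t AlignUp(std::uint32_t value, std::uint32_t alignment) {
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Returns the byte offset of each draw's constants inside one shared uniform
// buffer. Each returned value is what would be passed as a dynamic offset.
inline std::vector<std::uint32_t> DynamicOffsets(std::uint32_t firstFreeByte,
                                                 std::uint32_t constantsSize,
                                                 std::uint32_t minOffsetAlignment,
                                                 std::uint32_t drawCount) {
    const std::uint32_t stride = AlignUp(constantsSize, minOffsetAlignment);
    std::uint32_t offset = AlignUp(firstFreeByte, minOffsetAlignment);
    std::vector<std::uint32_t> offsets;
    offsets.reserve(drawCount);
    for (std::uint32_t i = 0; i < drawCount; ++i) {
        offsets.push_back(offset);
        offset += stride;
    }
    return offsets;
}
```

For example, 80 bytes of constants per draw with a 256-byte alignment limit occupy one 256-byte slot per draw.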
Dynamic Textures
- Staged in persistently mapped buffers
- Recycled per-frame
- Copy with vkCmdCopyBufferToImage

Descriptor Sets

Descriptor Set - Ideal
- Allocate and bake descriptor sets up front
- Group sets by update frequency
- Only update changed sets

Descriptor Set - Reality
- Difficult to bake descriptors with DX11-like abstraction
- Our approach:
  - Pre-allocate descriptor sets with fixed slots
  - Only bind to used slots
  - Update descriptors each draw

Allocating Descriptor Sets
- Allocated from per-thread pools
- Pre-allocate on pool creation and recycle on frame boundaries
- Grab new descriptor set by just incrementing offset

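The allocation scheme on this slide can be modeled as an array of sets created once, plus an index that is bumped per draw and reset per frame. This is a sketch under assumptions, not the actual implementation: `PerThreadDescriptorPool` and `DescriptorSet` are hypothetical, and the integer handle stands in for a `VkDescriptorSet` that would have been allocated when the pool was created.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a VkDescriptorSet handle.
struct DescriptorSet {
    int handle;
};

// One instance per recording thread, so Grab() needs no synchronization.
class PerThreadDescriptorPool {
public:
    // All sets are created up front, when the pool itself is created.
    explicit PerThreadDescriptorPool(std::size_t capacity) {
        sets_.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i) {
            sets_.push_back(DescriptorSet{static_cast<int>(i)});
        }
    }

    // Grabbing a set is just an index increment; returns nullptr when exhausted.
    DescriptorSet* Grab() {
        return next_ < sets_.size() ? &sets_[next_++] : nullptr;
    }

    // Called on a frame boundary, once the GPU no longer uses these sets.
    void Recycle() { next_ = 0; }

    std::size_t Used() const { return next_; }

private:
    std::vector<DescriptorSet> sets_;
    std::size_t next_ = 0;
};
```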
Future optimizations
- Group resources into multiple sets
- Use dynamic UBOs to avoid updating UBO descriptors
- Bake texture bindings higher up in abstraction
- Avoid updating descriptors where possible

Miscellaneous

Swap Queue Depth
- Manage yourself
- Window System Integration (WSI) extension
- Create N images for swap queue depth
- Fence or semaphore to manage swap queue depth

Image Layout Transitions
- Transitions require previous and new layout
  - Example: Color Attachment -> Shader Read
- Command buffers generated out-of-order
  - Problem: how do we know previous layout?
- Options
  - Return image layouts to known states
    - Good for CPU Perf, Bad for GPU Perf
  - Generate command buffers to transition layouts
    - Bad for CPU Perf, Good for GPU Perf
- Tried both:
  - Better for CPU perf
  - Better for GPU perf

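The first option, returning images to a known layout, can be sketched as a small per-command-buffer tracker. Because every command buffer starts from and ends in the same agreed layout, the previous layout is always known even when command buffers are recorded out of order; the price is the extra transition back at the end. Everything here is hypothetical and simplified to a single image: `Layout`, `Transition`, and `LayoutTracker` are illustrative names, and a recorded `Transition` stands in for an image memory barrier.

```cpp
#include <cassert>
#include <vector>

enum class Layout { ShaderRead, ColorAttachment };

// Stand-in for an image layout barrier recorded into a command buffer.
struct Transition {
    Layout from;
    Layout to;
};

// Tracks one image within one command buffer.
class LayoutTracker {
public:
    // knownLayout is the layout every command buffer may assume on entry.
    explicit LayoutTracker(Layout knownLayout)
        : known_(knownLayout), current_(knownLayout) {}

    // Records a transition only if the image is not already in `needed`.
    void Require(Layout needed) {
        if (needed == current_) return;
        transitions_.push_back({current_, needed});
        current_ = needed;
    }

    // Returns the image to the known layout before the command buffer ends.
    void End() { Require(known_); }

    const std::vector<Transition>& Transitions() const { return transitions_; }

private:
    Layout known_;
    Layout current_;
    std::vector<Transition> transitions_;
};
```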
Coordinate Systems
- D3D11 conventions, except:
  - Y direction of clipspace is inverted
- If porting from HLSL:
  - Invert gl_Position.y
- Reasoning: kept all spaces consistent

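In a converted vertex shader the fix is a single statement at the end of the shader that negates gl_Position.y. To keep the examples in one language, the sketch below shows the same operation on the CPU in C++; `Vec4` and `FlipClipSpaceY` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Minimal clip-space position, mirroring a GLSL vec4.
struct Vec4 {
    float x, y, z, w;
};

// C++ stand-in for the shader statement `gl_Position.y = -gl_Position.y;`.
// Only y changes; x, z, and w keep their D3D11-style values.
inline Vec4 FlipClipSpaceY(Vec4 position) {
    position.y = -position.y;
    return position;
}
```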
Summary
- Source 2 Overview
- Porting to Vulkan
- Shaders and Pipelines
- Command Buffers
- Memory Management
- Descriptor Sets

Questions?
- dang@valvesoftware.com
- Khronos BOF:
  - Wed. August 12th, 5:30-7:30
  - JW Marriott LA Live in the Platinum Ballroom Salon F-I