Slide 1
Nathan Reed, VR Software Engineer
GameWorks VR

Slide 2
How is VR rendering different?

Slide 3
How is VR rendering different?
High framerate, low latency:
- 90 frames per second
- Motion to photons in ≤ 20 ms
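
(For context: 90 fps means a rendering budget of 1000/90 ≈ 11.1 ms per frame, so a ≤ 20 ms motion-to-photon target leaves less than two frame-times for the whole pipeline, from sampling the head tracker to photons leaving the display.)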

Slide 4
How is VR rendering different?
Stereo rendering: two eyes, same scene

Slide 5
How is VR rendering different?
Correcting for lens distortion
[Images: rendered image → distorted image]

Slide 6
GameWorks VR: SDK for VR headset and game developers
- Increase rendering performance by putting your pixels where they count
- Scale performance with multiple GPUs
- Minimize head-tracking latency with asynchronous timewarp
- Plug-and-play compatibility from GPU to headset
- Reduce latency by rendering directly to the front buffer

Slide 7
DesignWorks VR: extra VR features for professional graphics
- Warp & Blend: API for geometry and intensity adjustments for seamless multi-monitor display
- Synchronization: provides tear-free VR environments by synchronizing scanout across GPUs
- GPU Affinity: fine-grained control to pin OpenGL contexts to specific GPUs
- GPU Direct for Video: reduces latency for video transfer to and from the GPU

Slide 8
VR SLI

Slide 9
VR SLI: two eyes... two GPUs!

Slide 10
Typical SLI: GPUs render alternate frames
[Timeline diagram with CPU, GPU 0, GPU 1, and display rows: the CPU submits frames N, N+1, ...; the GPUs render alternate frames (N on GPU 0, N+1 on GPU 1); the latency arrow spans from CPU submission to display scanout]

Slide 11
VR SLI: each GPU renders one eye, giving lower latency
[Timeline diagram: the CPU submits each frame once; GPU 0 renders the left eye and GPU 1 the right eye of the same frame in parallel, so each frame reaches the display sooner and the latency span shrinks]

Slide 12
VR SLI: GPU affinity masking gives full control
- Shadow maps, GPU physics, etc.: broadcast to both GPUs
- Left eye rendering: GPU 0 only
- Right eye rendering: GPU 1 only
(A sketch of this pattern follows below.)
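
A minimal sketch of the affinity-masking pattern in C++. The entry points SetGpuMask, RenderShadowMaps, and RenderEye are illustrative stand-ins (this talk predates the public beta of the API), but the structure matches the slide: shared work is broadcast, per-eye work is masked to one GPU.

    // Shared work (visible to both eyes): broadcast to both GPUs at once,
    // so shadow maps, physics, etc. are not computed twice.
    SetGpuMask(ctx, GPU_0 | GPU_1);
    RenderShadowMaps(ctx);

    // Eye-specific passes: mask each one to a single GPU so the two eyes
    // render in parallel.
    SetGpuMask(ctx, GPU_0);
    RenderEye(ctx, kLeftEye);

    SetGpuMask(ctx, GPU_1);
    RenderEye(ctx, kRightEye);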

Slide 13
VR SLI: broadcasting reduces CPU overhead
Submit the scene's draw calls once; they are broadcast to both GPUs, with GPU 0 producing the left eye and GPU 1 the right.

Slide 14
VR SLI: per-GPU constant buffers, viewports, and scissors
The engine issues one stream of commands through the multi-GPU API, but each GPU can receive different constant buffer contents, viewports, and scissors, so the same draw calls render the left eye on GPU 0 and the right eye on GPU 1 (see the sketch below).
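
A sketch of the broadcast pattern, again with illustrative names (UploadConstantsPerGpu stands in for whatever the multi-GPU API's per-GPU constant buffer update is called): upload a different view matrix to each GPU, then submit the scene once.

    struct EyeConstants { float viewProj[16]; };
    EyeConstants left  = MakeEyeConstants(kLeftEye);    // app-specific
    EyeConstants right = MakeEyeConstants(kRightEye);

    // Same constant buffer, different contents on each GPU.
    UploadConstantsPerGpu(ctx, eyeCB, /*GPU 0*/ &left, /*GPU 1*/ &right);

    // Submit the scene once; the draws are broadcast to both GPUs, and
    // each GPU shades with its own eye's matrices.
    SetGpuMask(ctx, GPU_0 | GPU_1);
    RenderScene(ctx);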

Slide 15
VR SLI: cross-GPU data transfer via PCI Express
After each GPU renders its eye, the right-eye image is copied from GPU 1 to GPU 0 over PCI Express, so one GPU can composite and scan out both eyes.
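
A sketch of the transfer step, with CopyTextureAcrossGpus as a hypothetical stand-in for the API's cross-GPU copy. Under VR SLI each resource exists on every GPU, so the copy moves GPU 1's contents of the right-eye texture into GPU 0's copy of the same resource:

    // Pull the right eye across PCIe; afterwards GPU 0 holds both eye
    // images and can run distortion/compositing and present the frame.
    CopyTextureAcrossGpus(ctx, rightEyeTex, /*srcGpu*/ 1, /*dstGpu*/ 0);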

Slide 16
VR SLI: multi-GPU API extensions for DX11
- Explicitly control work distribution for up to 8 GPUs
- Not automatic: needs renderer integration
- Public beta very soon
- OpenGL extension too, under NDA (ask us!)

Slide 17
Single-GPU Stereo Rendering

Slide 18
Single-GPU stereo: reducing CPU overhead
- The GPU still has to draw everything twice; not much we can do there
- The CPU also has to submit everything twice; can we solve that?

Slide 19
Single-GPU stereo: DX11/12 command lists
- Record the rendering API calls in a command list
- Replay it for each eye with minimal CPU overhead
- Store the view matrix in a global constant buffer and update it between replays (see the sketch below)
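
A minimal sketch of this pattern with stock D3D11; the helper RecordSceneDrawCalls and the buffer names are illustrative, not the talk's code.

    #include <d3d11.h>

    void RecordSceneDrawCalls(ID3D11DeviceContext* ctx);  // app-specific

    // Record the scene once on a deferred context, then replay it per eye,
    // swapping the eye's view-projection matrix in the global constant
    // buffer between the two replays.
    void RenderStereo(ID3D11DeviceContext* immediate,
                      ID3D11DeviceContext* deferred,
                      ID3D11Buffer* eyeCB,
                      const float leftViewProj[16],
                      const float rightViewProj[16])
    {
        RecordSceneDrawCalls(deferred);    // binds eyeCB; no per-eye state
        ID3D11CommandList* cmdList = nullptr;
        deferred->FinishCommandList(FALSE, &cmdList);

        // The updates and replays execute in order on the immediate context.
        immediate->UpdateSubresource(eyeCB, 0, nullptr, leftViewProj, 0, 0);
        immediate->ExecuteCommandList(cmdList, FALSE);

        immediate->UpdateSubresource(eyeCB, 0, nullptr, rightViewProj, 0, 0);
        immediate->ExecuteCommandList(cmdList, FALSE);

        cmdList->Release();
        // Per-eye render targets/viewports need similar handling, e.g.
        // rendering each eye into its half of a shared side-by-side target.
    }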

Slide 20
Single-GPU stereo: GL_NV_command_list
- Same idea: record once, replay per eye
- The app writes bytecode-like rendering commands into a buffer
- Submit the whole buffer with one API call
- Spec: GL_NV_command_list
- For more info: GPU-Driven Large Scene Rendering (GTC 2015)

Slide 21
Single-GPU stereo: stereo instancing
- Generate two instances of everything; the vertex shader projects one instance to each eye (see the sketch below)
- One viewport + user clip planes to keep the views separate
- Or two viewports + a passthrough GS to route instances (pretty fast)
- Or GL_AMD_vertex_shader_viewport_index (supported on recent NVIDIA GPUs too)
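
A sketch of the clip-plane variant, with the HLSL embedded as a string the way it might be fed to D3DCompile. The layout (both eyes side by side in one render target, even instances = left eye) and all names are illustrative:

    // Vertex shader: route each instance to one eye and clip at the seam.
    const char* kStereoVS = R"hlsl(
    cbuffer EyeCB : register(b0) {
        float4x4 viewProj[2];              // [0] = left eye, [1] = right eye
    };
    struct VSOut {
        float4 pos  : SV_Position;
        float  clip : SV_ClipDistance0;    // user clip plane at the seam
    };
    VSOut main(float3 posW : POSITION, uint inst : SV_InstanceID) {
        uint eye = inst & 1;               // even = left, odd = right
        VSOut o;
        o.pos = mul(viewProj[eye], float4(posW, 1));
        // Squash each eye into its half of the side-by-side render target.
        o.pos.x = o.pos.x * 0.5 + (eye ? 0.5 : -0.5) * o.pos.w;
        // Keep left-eye geometry left of center and vice versa.
        o.clip = eye ? o.pos.x : -o.pos.x;
        return o;
    }
    )hlsl";

    // CPU side: double the instance count. Per-instance data, if any,
    // would be indexed with (inst >> 1) in the shader.
    // ctx->DrawIndexedInstanced(indexCount, instanceCount * 2, 0, 0, 0);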

Slide 22
Multi-Resolution Shading

Slide 23
VR headset optics: distortion and counter-distortion

Slide 24
VR headset optics: distortion and counter-distortion
[Diagram: image displayed → optics → user's view; the displayed image is counter-distorted so that it looks correct after passing through the lens]

Slide 25
Distorted rendering: render normally, then resample
[Images: rendered image → distorted image]

Slide 26
Distorted rendering: over-rendering the outskirts
[Images: rendered image vs. distorted image; the distortion pass samples the edges of the rendered image at much lower density, so the periphery is shaded at higher resolution than ever reaches the display]

Slide 27
Multi-resolution shading: subdivide the image, and shrink the outskirts

Slide 28
Multi-resolution shading: fast viewport broadcast on NVIDIA Maxwell GPUs
[Diagram: the geometry pipeline broadcasts each primitive to viewports 1 through N in a single pass]
(A sketch of a multi-res viewport layout follows below.)
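
A sketch of how the 3x3 viewport grid might be laid out in D3D11; the split fractions are illustrative, not NVIDIA's presets. The center tile keeps full resolution, while the outer ring is rendered smaller and stretched back out during the distortion pass:

    #include <d3d11.h>

    // centerFrac: fraction of each axis kept at full resolution (e.g. 0.6).
    // outerScale: resolution scale of the border tiles (e.g. 0.5).
    void SetMultiResViewports(ID3D11DeviceContext* ctx,
                              float width, float height,
                              float centerFrac, float outerScale)
    {
        const float border = (1.0f - centerFrac) * 0.5f * outerScale;
        const float colW[3] = { border * width,  centerFrac * width,  border * width  };
        const float rowH[3] = { border * height, centerFrac * height, border * height };

        D3D11_VIEWPORT vp[9];
        float y = 0.0f;
        for (int r = 0; r < 3; ++r) {
            float x = 0.0f;
            for (int c = 0; c < 3; ++c) {
                vp[r * 3 + c] = { x, y, colW[c], rowH[r], 0.0f, 1.0f };
                x += colW[c];
            }
            y += rowH[r];
        }
        // Each draw is then broadcast to all nine viewports in one pass
        // (via Maxwell's fast path, or a passthrough GS writing
        // SV_ViewportArrayIndex).
        ctx->RSSetViewports(9, vp);
    }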

Slide 29
Standard rendering: maximum pixel density everywhere
[Plot: ideal pixel density vs. rendered pixel density across the image; standard rendering shades at peak density everywhere, far above the ideal in the periphery]

Slide 30
Multi-resolution shading: 25% of pixels saved
[Plot: the rendered density steps down toward the edges while staying at or above the ideal density]

Slide 31
Multi-resolution shading: 50% of pixels saved
[Plot: a more aggressive setting; the rendered density follows the ideal curve more closely in the periphery]
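
(The savings fall out of the viewport arithmetic: if the central fraction c of each axis stays at full resolution and the outer ring is rendered at scale s, the shaded area is (c + (1 - c)·s)² of the original. For example, c = 0.7 and s = 0.5 give (0.7 + 0.15)² ≈ 0.72, roughly the 25% case; shrinking the ring further approaches the 50% case. These particular numbers are illustrative, not NVIDIA's presets.)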

Slide 32
Multi-resolution shading: SDK coming soon
- Needs renderer integration (especially postprocessing)
- DX11 & OpenGL API extensions in development
- Requires a Maxwell GPU: GTX 900 series, Titan X, Quadro M6000

Slide 33
GPU Multitasking and VR

Slide 34
GPU multitasking: how does it work?
- The GPU is massively parallel under the hood (vertices, triangles, pixels)
- But it is fed by a serial command buffer (state changes, draw calls)

Slide 35
GPU multitasking: how does it work?
- Many running apps may want to use the GPU, including the desktop compositor!
- DX/GL contexts create command packets
- Windows enqueues the packets for the GPU to execute

Slide 36
GPU multitasking: cooperative more than preemptive
- Problem: long packets can't be interrupted (before Windows 10!)
- One app using the GPU heavily can slow down the entire system
- The desktop compositor needs a reliable 60 Hz for a good experience

Slide 37
GPU multitasking: the low-latency node
- An extra WDDM "node" (device) on which work can be scheduled
- The GPU time-slices between nodes in 1 ms intervals, preempting and later resuming the main node
- But current GPUs can only switch at draw call boundaries

Slide 38
GPU multitasking: the low-latency node
- The desktop compositor uses the low-latency node
- It still runs at 60 Hz, even if other apps don't

Slide 39
VR applications also need a reliable framerate
- Must refresh at 90 Hz for a good experience
- Hitches are really bad in VR!
- Need protection similar to the desktop compositor's

Slide 40
VR compositor: a lot like the desktop compositor
- Oculus and Valve both use a VR compositor process
- VR apps submit frames to the compositor; it owns the display
- Combines multiple VR apps, layers, etc.
- Warps old frames to the newest head pose (asynchronous timewarp)
- Safety: if an app hangs or crashes, fall back to a basic VR environment

Slide 41
Context priority API: enables control over GPU prioritization
- The VR compositor can use the low-latency node too (see the sketch below)
- DX11 extension API to create a low-latency context
- Future: take advantage of Win10 scheduling improvements
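
A sketch of how a VR compositor might use such a context; CreateLowLatencyContext and the helpers are hypothetical names, since the talk only states that a DX11 extension API exists:

    // Work submitted on this context is scheduled on the WDDM low-latency
    // node, so it preempts normal rendering instead of queuing behind it.
    ID3D11DeviceContext* llCtx = CreateLowLatencyContext(device);

    for (;;) {                        // once per 90 Hz display refresh
        WaitUntilJustBeforeVsync();
        HeadPose pose = SampleLatestHeadPose();    // freshest tracking data
        // Re-project (timewarp) the app's most recent frame to the new
        // pose and present it, even if the app missed this frame.
        TimewarpAndPresent(llCtx, latestAppFrame, pose);
    }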

Slide 42
Direct Mode: plug-and-play compatibility for VR headsets
- Hides the headset from the OS, so the desktop doesn't extend onto it
- VR apps get exclusive access to the display
- Low-level access to video modes, vsync timing, and the flip chain

Slide 43
Front buffer rendering: for low-level wizards
- The front buffer is normally not accessible in DX11
- Direct Mode enables access to it
- Enables advanced latency optimizations

Slide 44
NVIDIA VR toolkits

                  GameWorks VR                 DesignWorks VR
    Audience      HMD & game developers        HMD & application developers
    Environments  HMD                          HMD, CAVE & cluster solutions
    APIs          DirectX 11, OpenGL           DirectX 11, OpenGL
    Features      VR SLI                       Synchronization
                  Context Priority             GPU Direct for Video
                  Direct Mode                  Warp & Blend
                  Front Buffer Rendering       GPU Affinity
                  Multi-Res Shading (Alpha)

Slide 45
Questions?
nreed@nvidia.com
pDevice->Flush();
Slides will be posted at http://developer.nvidia.com