Slide1
Large Ray Packets for Real-time Whitted Ray Tracing
Ryan Overbeck Columbia University
Ravi Ramamoorthi Columbia University
William R. Mark University of Texas at Austin, Intel Corporation
Slide2
Current Real-time Ray Tracing
Use large ray packets (16-256 rays)
Focused on primary visibility and point-light shadows
Images somewhat dull
Easily generated using rasterization algorithms
Slide3
This Work: Whitted Ray Tracing
Whitted ray tracing:
Primary Visibility
Point-light Shadows
Reflections
Refractions
Slide4
Mission
Study large ray packets for Whitted ray tracing
Scene traversal (BVH)
Partition Traversal: New!
Frustum Culling
General secondary ray packets: New!
Slide5
Mission
Evaluation of old and new algorithms
Ray packet size
Scene complexity
Ray recursion complexity
Slide6
Result: Real-time Whitted RT
Slide7
Outline
Motivation
Frustum Culling for Secondary Rays
Scene Traversal with Large Ray Packets
Results
Slide8
Frustum Culling
Primary Rays: Reshetov et al. 2005
Point-Light Shadow Rays: Boulos et al. 2006, Wald et al. 2007
Reflection + Refraction Rays: New!
Slide9
Frustum Culling: Point Light
Choose dominant axis:
Find intersection points at .
Get min/max coords.
[Figure; label: 1]
Slide10
Frustum Culling: Point Light
Choose dominant axis:
Find intersection points at .
Get min/max coords.
Create 4 corner rays.
Compute plane normals.
[Figure; label: 1]
Slide11
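The point-light frustum steps listed on the preceding slide (dominant axis, plane intersections, min/max coordinates, corner rays, plane normals) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The slide text does not say where the intersection plane sits, so the sketch assumes a plane one unit along the dominant axis from the light; the names `point_light_frustum` and `outside` are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def point_light_frustum(dirs):
    """Return (corner_rays, inward_normals) for rays leaving one shared origin."""
    # 1. Dominant axis: the axis with the largest summed |direction| component.
    sums = [sum(abs(d[k]) for d in dirs) for k in range(3)]
    k = sums.index(max(sums))
    u, v = (k + 1) % 3, (k + 2) % 3
    sign = 1.0 if dirs[0][k] > 0 else -1.0
    if any(d[k] * sign <= 0 for d in dirs):
        raise ValueError("rays must all travel the same way along the dominant axis")

    # 2. Intersect each ray with the plane one unit along axis k (assumed distance).
    us = [d[u] / abs(d[k]) for d in dirs]
    vs = [d[v] / abs(d[k]) for d in dirs]

    # 3. Min/max coordinates of the intersection points.
    u_lo, u_hi, v_lo, v_hi = min(us), max(us), min(vs), max(vs)

    # 4. Four corner rays through the corners of that rectangle.
    def corner(cu, cv):
        c = [0.0, 0.0, 0.0]
        c[k], c[u], c[v] = sign, cu, cv
        return tuple(c)

    corners = [corner(u_lo, v_lo), corner(u_hi, v_lo),
               corner(u_hi, v_hi), corner(u_lo, v_hi)]

    # 5. Plane normals from adjacent corner rays, flipped to point inward.
    #    (A packet with zero extent in u or v gives degenerate normals.)
    centre = corner(0.5 * (u_lo + u_hi), 0.5 * (v_lo + v_hi))
    normals = []
    for i in range(4):
        n = cross(corners[i], corners[(i + 1) % 4])
        if dot(n, centre) < 0:
            n = (-n[0], -n[1], -n[2])
        normals.append(n)
    return corners, normals


def outside(origin, normals, p):
    """True if point p lies outside at least one side plane of the frustum."""
    rel = (p[0] - origin[0], p[1] - origin[1], p[2] - origin[2])
    return any(dot(n, rel) < 0 for n in normals)
```

A BVH cell could then be culled when all eight of its corners are outside the same plane; that box test is left out here to keep the sketch short.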
Frustum Culling: Reflection
Slide12
Frustum Culling: Reflection
[Figure labels: Samples’ Bounding Box, Scene Bounding Box]
Slide13
Frustum Culling: Reflection
Choose dominant axis:
Find intersections with 2 planes: one from the scene AABB, one from the samples’ AABB.
Get min/max coords.
Create 4 corner rays.
Compute plane normals.
Slide14
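For reflection rays the origins differ per ray, which is why the preceding slide uses two planes. The sketch below is one possible reading of those steps, not the authors' code. It assumes the near plane is the face of the samples' AABB that lies behind the ray origins and the far plane is the scene AABB face the rays head toward; the function names and the `(lo, hi)` box layout are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def reflection_frustum(origins, dirs, samples_box, scene_box):
    """Return four (inward_normal, point_on_plane) side planes bounding the rays."""
    # Dominant axis: largest summed |direction| component.
    sums = [sum(abs(d[k]) for d in dirs) for k in range(3)]
    k = sums.index(max(sums))
    u, v = (k + 1) % 3, (k + 2) % 3
    forward = dirs[0][k] > 0
    if any(d[k] == 0 or (d[k] > 0) != forward for d in dirs):
        raise ValueError("rays must all travel the same way along the dominant axis")

    # Two planes (assumed choice): one from the samples' AABB, one from the scene AABB.
    near = samples_box[0][k] if forward else samples_box[1][k]
    far = scene_box[1][k] if forward else scene_box[0][k]

    def rectangle(plane):
        # Min/max of the points where the rays' supporting lines cross the plane.
        us, vs = [], []
        for o, d in zip(origins, dirs):
            t = (plane - o[k]) / d[k]
            us.append(o[u] + t * d[u])
            vs.append(o[v] + t * d[v])
        return min(us), max(us), min(vs), max(vs)

    def corners(plane, rect):
        u_lo, u_hi, v_lo, v_hi = rect
        pts = []
        for cu, cv in ((u_lo, v_lo), (u_hi, v_lo), (u_hi, v_hi), (u_lo, v_hi)):
            p = [0.0, 0.0, 0.0]
            p[k], p[u], p[v] = plane, cu, cv
            pts.append(tuple(p))
        return pts

    near_pts = corners(near, rectangle(near))
    far_pts = corners(far, rectangle(far))
    inside = tuple(sum(p[i] for p in near_pts + far_pts) / 8.0 for i in range(3))

    axis_u = tuple(1.0 if i == u else 0.0 for i in range(3))
    axis_v = tuple(1.0 if i == v else 0.0 for i in range(3))
    planes = []
    for i in range(4):
        # Corner ray i runs from near corner i to far corner i; the side plane
        # contains it and the rectangle edge that starts at that corner.
        edge = axis_u if i % 2 == 0 else axis_v
        n = cross(edge, sub(far_pts[i], near_pts[i]))
        if dot(n, sub(inside, near_pts[i])) < 0:
            n = (-n[0], -n[1], -n[2])
        planes.append((n, near_pts[i]))
    return planes


def contains(planes, p):
    """True if p is on the inner side of all four side planes."""
    return all(dot(n, sub(p, a)) >= 0 for n, a in planes)
```

Because each ray stays inside both rectangles, it also stays inside the four side planes between the two chosen planes, so the frustum is conservative over that stretch.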
Outline
Motivation
Frustum Culling for Secondary Rays
Scene Traversal with Large Ray Packets
Results
Slide15
3 Algorithms for Scene Traversal
Masked Traversal: Control Method
Based on Wald 2001, Reshetov 2005 (MLRT)
Only good for small packets
Ranged Traversal: State-of-the-art
Wald et al. 2007
Degrades performance for incoherent rays!
Partition Traversal: New!
Robust to incoherent rays and large packets
Slide16
Scene Traversal
Use 2x2 ray packets as atomic ray primitive.
Group packets based on the image raster.
Slide17
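Grouping 2x2 atomic packets by the image raster, as the Scene Traversal slide describes, could look like the following sketch. The function name `packets` and the square-tile layout are assumptions made for illustration; the slides do not show how the grouping is coded.

```python
def packets(width, height, side):
    """Yield ((x0, y0), atoms) for each side x side tile of the image.

    `atoms` lists the top-left pixel of every 2x2 atomic packet in the tile.
    """
    if side % 2 or width % side or height % side:
        raise ValueError("sketch assumes even tiles that divide the image evenly")
    for y0 in range(0, height, side):
        for x0 in range(0, width, side):
            atoms = [(x, y)
                     for y in range(y0, y0 + side, 2)
                     for x in range(x0, x0 + side, 2)]
            yield (x0, y0), atoms
```

For a 1024x1024 image and 16x16 packets this gives 4096 tiles of 64 atoms (256 rays) each.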
Masked Traversal
[Animation, slides 17-47: rays 0-4 are shown against a BVH Inner Cell, a Near BVH Leaf Cell, and a Far BVH Leaf Cell. Each ray carries an Alive/Dead flag. All five rays start Alive, rays are marked Dead one by one over the following frames, all five are Alive again at slide 41, and by the last frame only ray 2 is still Alive.]
Causes extra ray-cell tests
Slide48
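The masked traversal walked through on the preceding slides can be approximated by the sketch below, which only counts intersection tests instead of intersecting triangles. It is a simplified illustration under stated assumptions: every ray slot is tested at every visited cell (as a SIMD packet would do), only Alive rays are counted as ray-triangle tests, and children are visited in a fixed order. `Node`, `hits_box`, `demo`, and the tiny two-leaf scene are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    lo: tuple
    hi: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prims: Optional[list] = None  # triangle ids; set only on leaves


def hits_box(o, d, lo, hi):
    """Slab test of the ray o + t*d (t >= 0) against the box [lo, hi]."""
    t0, t1 = 0.0, float("inf")
    for k in range(3):
        if d[k] == 0.0:
            if o[k] < lo[k] or o[k] > hi[k]:
                return False
            continue
        a, b = (lo[k] - o[k]) / d[k], (hi[k] - o[k]) / d[k]
        if a > b:
            a, b = b, a
        t0, t1 = max(t0, a), min(t1, b)
        if t0 > t1:
            return False
    return True


def masked_traverse(node, rays, alive, stats):
    # Every ray slot is tested, including rays that are already Dead.
    hit = [hits_box(o, d, node.lo, node.hi) for o, d in rays]
    stats["ray_cell"] += len(rays)
    mask = [a and h for a, h in zip(alive, hit)]
    if not any(mask):
        return
    if node.prims is not None:
        stats["ray_tri"] += sum(mask) * len(node.prims)
        return
    # The parent's mask is untouched, so rays are Alive again for the sibling.
    masked_traverse(node.left, rays, mask, stats)
    masked_traverse(node.right, rays, mask, stats)


def demo():
    near = Node((0, 0, 0), (1, 1, 1), prims=[0, 1])
    far = Node((2, 0, 0), (3, 1, 1), prims=[2, 3, 4])
    root = Node((0, 0, 0), (3, 1, 1), left=near, right=far)
    d = (0.0, 0.0, 1.0)
    rays = [((0.5, 0.5, -1.0), d), ((2.5, 0.5, -1.0), d),
            ((1.5, 0.5, -1.0), d), ((5.0, 0.5, -1.0), d)]
    return root, rays


root, rays = demo()
stats = {"ray_cell": 0, "ray_tri": 0}
masked_traverse(root, rays, [True] * len(rays), stats)
print(stats)  # {'ray_cell': 12, 'ray_tri': 5}
```

In this toy scene three cells are visited and all four ray slots are tested at each, which is where the 12 ray-cell tests come from even though only two rays ever reach a leaf.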
Ranged Traversal
[Animation, slides 48-86: rays 0-4 are traversed through the BVH. In the later frames the ray labels appear in the order 2, 1, 0, 4, 3.]
Reduces ray-cell tests
Increases ray-triangle tests
Slide87
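A stripped-down version of ranged traversal, again counting tests rather than intersecting geometry, might look like this. It is a sketch in the spirit of the first-active-ray scheme the slides attribute to Wald et al. 2007, with the frustum early-out and the search for the last active ray deliberately left out. `Node`, `hits_box`, `demo`, and the toy scene are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    lo: tuple
    hi: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prims: Optional[list] = None  # triangle ids; set only on leaves


def hits_box(o, d, lo, hi):
    """Slab test of the ray o + t*d (t >= 0) against the box [lo, hi]."""
    t0, t1 = 0.0, float("inf")
    for k in range(3):
        if d[k] == 0.0:
            if o[k] < lo[k] or o[k] > hi[k]:
                return False
            continue
        a, b = (lo[k] - o[k]) / d[k], (hi[k] - o[k]) / d[k]
        if a > b:
            a, b = b, a
        t0, t1 = max(t0, a), min(t1, b)
        if t0 > t1:
            return False
    return True


def ranged_traverse(node, rays, first, stats):
    # Advance from the parent's first active ray until some ray hits this cell.
    i = first
    while i < len(rays):
        stats["ray_cell"] += 1
        o, d = rays[i]
        if hits_box(o, d, node.lo, node.hi):
            break
        i += 1
    if i == len(rays):
        return
    if node.prims is not None:
        # Every ray from the first hit onward is tested against the leaf's
        # triangles, whether or not that ray touches the leaf.
        stats["ray_tri"] += (len(rays) - i) * len(node.prims)
        return
    ranged_traverse(node.left, rays, i, stats)
    ranged_traverse(node.right, rays, i, stats)


def demo():
    near = Node((0, 0, 0), (1, 1, 1), prims=[0, 1])
    far = Node((2, 0, 0), (3, 1, 1), prims=[2, 3, 4])
    root = Node((0, 0, 0), (3, 1, 1), left=near, right=far)
    d = (0.0, 0.0, 1.0)
    rays = [((0.5, 0.5, -1.0), d), ((2.5, 0.5, -1.0), d),
            ((1.5, 0.5, -1.0), d), ((5.0, 0.5, -1.0), d)]
    return root, rays


root, rays = demo()
stats = {"ray_cell": 0, "ray_tri": 0}
ranged_traverse(root, rays, 0, stats)
print(stats)  # {'ray_cell': 4, 'ray_tri': 17}
```

Only four ray-cell tests are needed here, but rays that miss a leaf are still run against its triangles, which matches the trade-off stated on the slide.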
Partition Traversal (New)
[Animation, slides 87-118: rays 0-4 are shown in two rows. One row keeps the order 2, 1, 0, 4, 3. The other row is reordered as traversal proceeds: 0, 1, 2, 3, 4, then 0, 1, 4, 3, 2, then 0, 4, 1, 3, 2. Markers A, B, and E appear on the frames.]
More ray-cell intersection tests
No unnecessary ray-triangle intersection tests
Robust to incoherent rays
Slide119
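The idea behind partition traversal, reordering ray indices so that rays hitting the current cell come first and only those continue downward, can be sketched as below. This is an illustrative reconstruction that counts tests only; it is not the authors' implementation, and it does not model the A, B, and E markers from the animation. `Node`, `hits_box`, `demo`, and the toy scene are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    lo: tuple
    hi: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prims: Optional[list] = None  # triangle ids; set only on leaves


def hits_box(o, d, lo, hi):
    """Slab test of the ray o + t*d (t >= 0) against the box [lo, hi]."""
    t0, t1 = 0.0, float("inf")
    for k in range(3):
        if d[k] == 0.0:
            if o[k] < lo[k] or o[k] > hi[k]:
                return False
            continue
        a, b = (lo[k] - o[k]) / d[k], (hi[k] - o[k]) / d[k]
        if a > b:
            a, b = b, a
        t0, t1 = max(t0, a), min(t1, b)
        if t0 > t1:
            return False
    return True


def partition_traverse(node, rays, ids, n_active, stats):
    # Partition ids[:n_active] in place: rays that hit this cell move to the front.
    hit_end = 0
    for i in range(n_active):
        stats["ray_cell"] += 1
        o, d = rays[ids[i]]
        if hits_box(o, d, node.lo, node.hi):
            ids[i], ids[hit_end] = ids[hit_end], ids[i]
            hit_end += 1
    if hit_end == 0:
        return
    if node.prims is not None:
        # Only rays that actually hit the leaf are tested against its triangles.
        stats["ray_tri"] += hit_end * len(node.prims)
        return
    # Children reorder only ids[:hit_end], so the active set survives recursion.
    partition_traverse(node.left, rays, ids, hit_end, stats)
    partition_traverse(node.right, rays, ids, hit_end, stats)


def demo():
    near = Node((0, 0, 0), (1, 1, 1), prims=[0, 1])
    far = Node((2, 0, 0), (3, 1, 1), prims=[2, 3, 4])
    root = Node((0, 0, 0), (3, 1, 1), left=near, right=far)
    d = (0.0, 0.0, 1.0)
    rays = [((0.5, 0.5, -1.0), d), ((2.5, 0.5, -1.0), d),
            ((1.5, 0.5, -1.0), d), ((5.0, 0.5, -1.0), d)]
    return root, rays


root, rays = demo()
ids = list(range(len(rays)))
stats = {"ray_cell": 0, "ray_tri": 0}
partition_traverse(root, rays, ids, len(ids), stats)
print(stats)  # {'ray_cell': 10, 'ray_tri': 5}
```

Every active ray is tested at each visited cell, so ray-cell tests are plentiful, but no ray is run against triangles in a leaf it misses.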
Outline
Motivation
Frustum Culling for Secondary Rays
Scene Traversal with Large Ray Packets
Results
Slide120
Experimental Setup: System
Dual Quad-core 2.0 GHz Intel Xeon Processors (8 cores total).
All timings include ray casting and shading (without time to send image to graphics card).
1024x1024 Images.
BVH Acceleration Structure.
All surfaces set to reflective and/or refractive.
Slide121
3 Performance Variables
Ray Packet Size
Scene Complexity
Ray Recursion Complexity
Slide122
3 Performance Variables
Ray Packet Size
Scene Complexity
Ray Recursion Complexity
[Figure: ray packet sizes 4x4, 8x8, 16x16, 32x32]
Slide123
3 Performance Variables
Ray Packet Size
Scene Complexity
Ray Recursion Complexity
[Figure: test scenes]
ERW6 (804 Triangles)
Toasters (11,141 Triangles)
Fairy (172,669 Triangles)
Rings (217,812 Triangles)
Slide124
3 Performance Variables
Ray Packet Size
Scene Complexity
Ray Recursion Complexity
[Figure: recursion levels]
Primary Visibility
Primary Visibility + Point-light Shadows
Primary Visibility + 2-Bounce Reflections
Primary Visibility + 2-Deep Refractions
Slide125
Traversal vs. Ray Packet Size
[Chart: Time (Million CPU Cycles, axis 0-1500) vs. Ray Packet Size (4x4, 8x8, 16x16, 32x32) for Masked, Ranged, and Partition traversal. Fairy Scene, 2-Bounce Reflections.]
Masked Traversal: performance degrades for > 4x4 packets
Ranged Traversal: performance degrades for > 8x8 packets
Partition Traversal: performance increases with packet size
Slide126
Traversal vs. Recursion Complexity
[Chart: Time (Million CPU Cycles, axis 0-1200) vs. Recursion Complexity (Primary, Primary + Shadows, Primary + 2-Deep Refractions, Primary + 2-Bounce Reflections) for Masked, Ranged, and Partition traversal. Fairy Scene, 16x16 Packets.]
Masked Traversal: Degrades with recursion complexity
Ranged Traversal: Degrades with recursion complexity
Partition Traversal: More robust to increasing recursion complexity
Slide127
Traversal vs. Recursion Complexity
[Chart: Time (Million CPU Cycles, axis 0-4000) vs. Recursion Complexity (Primary, Primary + Shadows, Primary + 2-Deep Refractions, Primary + 2-Bounce Reflections) for Masked, Ranged, and Partition traversal. Rings Scene, 16x16 Packets. Annotation: 2x.]
Difference between Partition and Ranged more pronounced on more complex scenes
Slide128
Traversal vs. Scene Complexity
[Chart: Time (Million CPU Cycles, axis 0-4000) vs. Scene Complexity (ERW6, Toasters, Fairy, Rings) for Masked, Ranged, and Partition traversal. 2-Bounce Reflections, 16x16 Packets.]
Ranged Traversal: Degrades with scene complexity
Partition Traversal: More robust to increasing scene complexity
Slide129
Ranged vs. Partition Traversal
Ranged Traversal:
Most ray-cell tests deep in BVH
More extra ray-triangle tests for incoherent rays
Partition Traversal:
Most ray-cell tests high in BVH
No extra ray-triangle tests
Slide130
Traversal: Summary
Use Ranged Traversal:
Primary rays and coherent point-light shadow rays
Small to medium packets: <= 8x8 rays
Use Partition Traversal:
Incoherent rays: deep reflections + refractions
Large ray packets: >= 16x16 rays
Slide131
Frustum Culling vs. Recursion Complexity
[Chart: Time (Million CPU Cycles, axis 0-400) vs. Recursion Complexity (Primary, Primary + Shadows, Primary + 2-Deep Refractions, Primary + 2-Bounce Reflections), With Culling vs. No Culling. Toasters Scene, 16x16 Packets.]
Frustum culling provides modest benefits
Slide132
Frustum Culling: Summary
Primary + Point-light Shadows:
Up to 2x performance benefit possible
But more often ~1.5x or less
Reflections + Refractions:
Expect 1.2x-1.3x performance benefit
Slide133
Conclusion
Use ranged and partition traversal in the correct situation
16x16 packets: 3x-6x benefit over 2x2
Whitted Ray Tracing is now interactive to real-time
[Images with frame rates: 11.8 FPS, 6.7 FPS, 8.5 FPS, 4 FPS]
Slide134
Real-Time Whitted Ray Tracing
Primary Visibility: Ranged Traversal
Point-Light: Ranged Traversal
1-Bounce Reflections: Ranged Traversal
Slide135
Conclusion
Primary Visibility: Ranged Traversal
3-Bounce Reflections: Partition Traversal
Slide136
Conclusion
Primary Visibility: Ranged Traversal
Point-Light Shadows: Partition Traversal
3-Deep Refractions: Partition Traversal
1-Bounce Reflections: Partition Traversal
Slide137
Future Work
Study traversal algorithms and frustum culling for other global illumination algorithms:
Distributed ray tracing.
Path tracing.
Partition traversal on other hardware:
Wider SIMD.
This Session! @2:20: “Coherent Ray Tracing using Stream Filtering”, Gribble and Ramani.
Slide138
Acknowledgements
Thanks to:
Lab mates at Columbia University.
Paid for by:
NSF (grants CCF 03-05322, CCF 04-46916, CCF 07-01775), a Sloan Research Fellowship, and an ONR Young Investigator Award N00014-17-1-0900.
Intel fellowship to Ryan Overbeck and related equipment donations from Intel and NVIDIA.
Slide139
Questions?