Slide1
Technical Architect
Ubisoft Montreal
Michael LAVAIRE
Technical Lead
Ubisoft Montreal
Remi QUENIN
Squeeze the Juice out of CPUs
Post Mortem of a Data-Driven Scheduler
Slide2
Table of Contents
Issues with common MT architectures
“Shears” Solution
Tips and Tricks
Slide3
1. Common architectural designs
Usual multithreading patterns
Slide4
Folded loop
[Diagram labels: Gameplay, Engine Loop, Graphic, Smaller Engine Loop, Background Loading; Thread 1, Thread 2, Thread 3; Frame N, N+1, N+2]
Slide5
Sync point
[Diagram labels: Gameplay, Graphic, Background Loading; Sync (x4); Applying modifications (x4)]
Slide6
Tasks Scheduling
[Diagram labels: Stage 1, Stage 2, Stage 3; Waiting; Gate; Task A, Task B, Task C, Task D, Task E, Task F]
Slide7
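One way to read the Gate in the task scheduling diagram above is as a countdown that opens the next stage once every task of the current stage has finished. The following is a minimal sketch under that assumption; `Gate`, `TaskDone` and `IsOpen` are hypothetical names that do not come from the slides, and `std::atomic` is used as a portable stand-in for whatever primitives the engine provides.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical gate between two stages: it opens once every task of the
// previous stage has reported completion.
class Gate
{
public:
    explicit Gate( int pendingTasks ) : m_pending( pendingTasks ) {}

    // Called once by each task of the stage when it finishes.
    // Returns true only for the call that opens the gate.
    bool TaskDone() { return m_pending.fetch_sub( 1 ) == 1; }

    // True once all tasks of the stage are done.
    bool IsOpen() const { return m_pending.load() <= 0; }

private:
    std::atomic<int> m_pending;
};
```

Threads that reach a closed gate have nothing to run, which is the "Waiting" state shown in the diagram.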
2. “Shears” Solution
60 FPS for everyone
Slide8
Objectives
Slide9
2.1 Data Driven Scheduling
Focus on data, not on the code
Slide10
Data driven scheduling
[Diagram labels: Task A (repeated), Task B, Task C; Data D0, Data D1]
Slide11
Data driven scheduling
[Diagram labels: Task A (repeated), Task B, Task C; Data D0, Data D1]
Slide12
Data driven scheduling
[Diagram labels: D0, D1; Task A, Task B, Task C]
Slide13
Data driven scheduling
[Diagram labels: D0, D1; Task A, Task B, Task C]
Slide14
Data driven scheduling
Slide15
Data driven scheduling
Slide16
Data driven scheduling
Slide17
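The data driven scheduling slides above only show tasks (A, B, C) and data blocks (D0, D1), so the following is a sketch of one possible reading, not the scheduler described in the talk. It assumes each task declares the data it reads and writes, and that ordering is derived from those declarations alone. `TaskDesc`, `DependsOn` and `BuildWaves` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical task description: only the data it touches is declared.
struct TaskDesc
{
    std::string name;
    std::vector<int> reads;   // ids of the data blocks the task reads
    std::vector<int> writes;  // ids of the data blocks the task writes
};

static bool Contains( const std::vector<int>& ids, int id )
{
    for( int x : ids )
        if( x == id ) return true;
    return false;
}

// 'later' must wait for 'earlier' when they conflict on some data block:
// write/read, write/write or read/write on the same id.
static bool DependsOn( const TaskDesc& later, const TaskDesc& earlier )
{
    for( int id : earlier.writes )
        if( Contains( later.reads, id ) || Contains( later.writes, id ) ) return true;
    for( int id : earlier.reads )
        if( Contains( later.writes, id ) ) return true;
    return false;
}

// Groups tasks (given in submission order) into waves. Tasks in the same
// wave have no data conflict and could run in parallel.
std::vector<std::vector<std::string>> BuildWaves( const std::vector<TaskDesc>& tasks )
{
    std::vector<std::size_t> wave( tasks.size(), 0 );
    std::size_t waveCount = 0;
    for( std::size_t i = 0; i < tasks.size(); ++i )
    {
        for( std::size_t j = 0; j < i; ++j )
            if( DependsOn( tasks[i], tasks[j] ) && wave[j] + 1 > wave[i] )
                wave[i] = wave[j] + 1;
        if( wave[i] + 1 > waveCount ) waveCount = wave[i] + 1;
    }
    std::vector<std::vector<std::string>> result( waveCount );
    for( std::size_t i = 0; i < tasks.size(); ++i )
        result[wave[i]].push_back( tasks[i].name );
    return result;
}
```

For example, if task A writes D0, task B reads D0 and writes D1, and task C only reads D0, then A forms the first wave while B and C share the second one. No ordering is written in code; it follows from the data declarations.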
2.2 Workloads
No locks, be scalable
Slide18
Workload
Slide19
Lock-free: Internal
[Diagram labels: Container, State; Thread 1: Container::Add(); Thread 2: Container::Remove(); Test & Set (x2); outcomes: SUCCESS!, FAILED!, SUCCESS!]
Slide20
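The Test & Set diagram above (one thread succeeds, the other fails and tries again) matches the usual compare-and-swap retry loop. The following is a minimal sketch of that loop, assuming the whole container state fits in one atomic word. `LockFreeState` is a hypothetical name, and `std::atomic` stands in for the engine's own atomic primitives, which the slides do not show.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical container whose whole state is a single atomic element count.
class LockFreeState
{
public:
    // Each call returns the number of lost races (failed attempts).
    int Add()    { return Update( +1 ); }
    int Remove() { return Update( -1 ); }

    int Count() const { return m_state.load(); }

private:
    int Update( int delta )
    {
        int races = 0;
        int expected = m_state.load();
        // On failure, compare_exchange_strong reloads 'expected' with the
        // current state, so the next attempt starts from fresh data.
        while( !m_state.compare_exchange_strong( expected, expected + delta ) )
            ++races;
        return races;
    }

    std::atomic<int> m_state{ 0 };
};
```

The value returned by `Add` and `Remove` is the kind of figure a race counter would accumulate: a failed attempt costs one retry, but no thread ever blocks.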
Lock-free
Slide21
Lock-free: Comparison Q6600

Thread Count                       1        2        3        4
Lock-free
  Mean Op Time (cycles)          233      678     2145     3278
  Op/msec                      10300     3540     1119      732
  Race count                       0     9571    91205   161371
Crit.Sec.
  Mean Op Time (cycles)          771    24862    36835    49079
  Op/msec                       3113       97       65       49
  Idle %                         57%      74%      83%      87%
Lock-free / Crit.Sec. ratio     3.31    36.67    17.17    14.97
Slide22
Lock-free: Comparison X360

Thread Count                       1        2        3        4        5        6
Lock-free
  Mean Op Time (usec)           0.29      0.5      0.6     0.83     0.91      1.2
  Op/msec                       3448     2000     1667     1205     1099      833
  Race count                       0    10447    72716   152283   218379   329422
Crit.Sec.
  Mean Op Time (usec)           0.96    10.06    13.76    19.16    24.66    29.88
  Op/msec                       1042       99       73       52       41       33
  Idle %                         23%      75%      85%      89%      92%      93%
Lock-free / Crit.Sec. ratio     3.31    20.12    22.93    23.08    27.10    24.90
Slide23
Lock-free: Comparison PS3

Thread Count                       1        2
Lock-free
  Mean Op Time (usec)           0.27     0.46
  Op/msec                       3704     2174
  Race count                       0      253
Crit.Sec.
  Mean Op Time (usec)           1.15     5.15
  Op/msec                        870      194
  Idle %                         25%      60%
Lock-free / Crit.Sec. ratio     4.26    11.20
Slide24
2.3 Working with SPU
Easy cross-platforming, easy debugging
Slide25
Working with SPU: Cross Platform API
?
Slide26
Working with SPU: Cross Platform API
[Diagram labels: Task; Satellite; Memory Access Interface; Main Memory Impl; DMA Impl]
Slide27
Working with SPU: Easy Debugging
[Diagram labels: Task; Satellite; Memory Access Interface; Main Memory Impl; DMA Impl]
Slide28
Working with SPU: Easy Debugging
Slide29
Working with SPU: Easy Debugging
[Diagram labels: Task; Satellite; Memory Access Interface; Main Memory Impl; DMA Impl; Named Pipe]
Slide30
Slide31
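The diagrams above place a Memory Access Interface between the task and two implementations, Main Memory Impl and DMA Impl. The following is a rough sketch of how such an interface could keep task code identical across platforms. All class and function names are hypothetical, and the "DMA" path is simulated with `memcpy` into a local buffer, since real SPU transfers are outside the scope of a portable example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical interface: a task only reaches its input data through it.
class MemoryAccess
{
public:
    virtual ~MemoryAccess() {}
    // Returns a pointer the task may read 'size' bytes from.
    virtual const void* Fetch( const void* src, std::size_t size ) = 0;
};

// Main memory implementation: data is directly addressable, no copy needed.
class MainMemoryAccess : public MemoryAccess
{
public:
    const void* Fetch( const void* src, std::size_t ) override { return src; }
};

// Stand-in for a DMA implementation: data is first copied into a local
// buffer (memcpy simulates the transfer), then read from there.
class SimulatedDmaAccess : public MemoryAccess
{
public:
    const void* Fetch( const void* src, std::size_t size ) override
    {
        m_local.resize( size );
        if( size > 0 )
            std::memcpy( m_local.data(), src, size );
        return m_local.data();
    }

private:
    std::vector<unsigned char> m_local;
};

// The task itself is written once and does not know which path is used.
int SumTask( MemoryAccess& memory, const int* values, std::size_t count )
{
    const int* local = static_cast<const int*>( memory.Fetch( values, count * sizeof( int ) ) );
    int sum = 0;
    for( std::size_t i = 0; i < count; ++i )
        sum += local[i];
    return sum;
}
```

Because the task only depends on the interface, the same task can be run through the main memory path on a PC, where it is easier to debug, and through the DMA path on the SPU.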
3. Tips & tricks
Multithreading & peace of mind
Slide32
Tip 1: Clever profiling
Slide33
Tip 2: Watchdog
Slide34
Tip 3: Unit Tests
Slide35
Trick 1: Perturbation
Slide36
Trick 1: Perturbation
[Diagram labels: Thread A, Thread B; Loop n, Loop n+1; Test A, Test B, Test C, Test D in each loop; New thread synchronization]
Slide37
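The perturbation diagram above shows the same tests meeting a new thread synchronization on each loop. One common way to get that effect, offered here as an interpretation and not as the authors' implementation, is to inject random yields at synchronization points so that every loop interleaves the threads differently. `Perturbation` and `MaybeYield` are hypothetical names.

```cpp
#include <cassert>
#include <random>
#include <thread>

// Hypothetical helper, one instance per thread. The seed makes a failing
// run reproducible; the probability controls how often a delay is injected.
class Perturbation
{
public:
    Perturbation( unsigned seed, double probability )
        : m_rng( seed ), m_chance( probability ) {}

    // Call at synchronization points.
    // Returns true when the thread gave up its time slice.
    bool MaybeYield()
    {
        if( !m_chance( m_rng ) )
            return false;
        std::this_thread::yield();
        return true;
    }

private:
    std::mt19937 m_rng;
    std::bernoulli_distribution m_chance;
};
```

Running the unit tests in a loop with a different seed per iteration shifts the relative timing of the threads, which tends to surface races that a fixed schedule would hide.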
Trick 2: State validation
Slide38
Trick 2: State validation
[Diagram labels: State 1, State 2, State 3, State X; Process A, Process B, Process C, Process X; Assert!]
Slide39
Trick 2: State validation
class StateChecker
{
public:
    enum State { State1, State2, State3 };

    StateChecker() { m_state = State1; }

    bool SetState( State oldState, State newState )
    {
        return Atomic::TestAndSet( &m_state, oldState, newState ) == oldState;
    }

private:
    volatile State m_state;
};
Slide40
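The `StateChecker` class above relies on the engine's `Atomic::TestAndSet`, which is not shown in the slides. The sketch below swaps in `std::atomic` and `compare_exchange_strong` as a portable stand-in, assuming `TestAndSet` returns the previous value (which is what the `== oldState` comparison suggests).

```cpp
#include <atomic>
#include <cassert>

// Portable variant of the slide's StateChecker, using std::atomic as an
// assumed equivalent of Atomic::TestAndSet.
class StateChecker
{
public:
    enum State { State1, State2, State3 };

    StateChecker() : m_state( State1 ) {}

    // Succeeds only if the object is currently in 'oldState'.
    bool SetState( State oldState, State newState )
    {
        State expected = oldState;
        return m_state.compare_exchange_strong( expected, newState );
    }

private:
    std::atomic<State> m_state;
};
```

Each process asserts on the result, for example `ASSERT( checker.SetState( StateChecker::State1, StateChecker::State2 ) )`. A process that runs out of order, or twice, finds the object in an unexpected state and the assert fires.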
Trick 3: Access verification
Slide41
Trick 3: Access verification
class AccessChecker
{
public:
    AccessChecker() { m_access = 0; }

    bool StartReadAccess()  { return Atomic::Inc( &m_access ) > 0; }
    bool EndReadAccess()    { return Atomic::Dec( &m_access ) >= 0; }
    bool StartWriteAccess() { return Atomic::Dec( &m_access ) == -1; }
    bool EndWriteAccess()   { return Atomic::Inc( &m_access ) == 0; }

private:
    volatile int m_access;
};
Slide42
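In the `AccessChecker` class above, the counter is positive while readers are active and exactly -1 while a single writer is active, so any overlap between a writer and another access makes one of the calls return false. The sketch below reproduces it with `std::atomic`, assuming the engine's `Atomic::Inc` and `Atomic::Dec` return the new value; that return convention is an assumption, not something the slides state.

```cpp
#include <atomic>
#include <cassert>

// Portable variant of the slide's AccessChecker.
class AccessChecker
{
public:
    bool StartReadAccess()  { return Inc() > 0; }    // false if a writer is active
    bool EndReadAccess()    { return Dec() >= 0; }
    bool StartWriteAccess() { return Dec() == -1; }  // false if anyone else is active
    bool EndWriteAccess()   { return Inc() == 0; }

private:
    // Assumed to mirror Atomic::Inc / Atomic::Dec: both return the new value.
    int Inc() { return m_access.fetch_add( 1 ) + 1; }
    int Dec() { return m_access.fetch_sub( 1 ) - 1; }

    std::atomic<int> m_access{ 0 };
};
```

Several readers may overlap, but a write that starts during a read, or a read that starts during a write, is reported. The checker detects illegal access patterns and does not prevent them, so each call is meant to be wrapped in an assert.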
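Trick 4: Multithreaded Assert
Slide43
Trick 4: Multithreaded Assert
// Declared in a header, defined once in a source file.
extern volatile bool g_waitOnAssert;
volatile bool g_waitOnAssert = false;

// While one thread is reporting an assert, the other threads spin here
// instead of running on and altering the state being inspected.
#define ASSERT( condition ) \
    do { \
        while( g_waitOnAssert ) {} \
        if( !(condition) ) \
        { \
            g_waitOnAssert = true; \
            DoAssert(); \
            g_waitOnAssert = false; \
        } \
    } while( 0 )
Slide44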
Squeeze the Juice!
Slide45
Inspiration
Game Programming Gems 6: Lock-free Algorithms, by Toby Jones
Design and Implementation of Multi-Threaded Games, by Bruce Dawson
Floodgate: Maximizing SPU parallelism without sacrificing cross platform development, by David Asbell & Michael Noland
SPU Shaders, by Mike Acton
Slide46
michael.lavaire@ubisoft.com
remi.quenin@ubisoft.com
Slide47
Ubisoft is recruiting!
Come see us at the Ubisoft Booth in the Career Pavilion (CP 2308, South Hall)
You can also check out: www.creatorsofemotions.com