Slide1
Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads
Iraklis Psaroudakis (EPFL), Tobias Scheuer (SAP AG), Norman May (SAP AG), Anastasia Ailamaki (EPFL)
Slide2
Scheduling for high concurrency
Queries >> H/W contexts
Limited number of H/W contexts
How should the DBMS use available CPU resources?
Slide3
Scheduling for mixed workloads
OLTP: short-lived, reads & updates, scan-light, single thread
OLAP: long-running, read-only, scan-heavy, parallelism
How to schedule highly concurrent mixed workloads?
Contention in highly concurrent situations
Slide4
Scheduling tactics
OS scheduler: context switches, cache thrashing, overutilization
Admission control: coarse granularity of control, underutilization or overutilization
We need to avoid both underutilization and overutilization
[Figure: timelines of queries 1-3 under each tactic, and # threads over time plotted against # H/W contexts]
Slide5
Task scheduling
A task can contain any code
run() {
...
}
One worker thread per core processing tasks
[Figure: task queues for Socket 1 and Socket 2]
Provides a solution to efficiently utilize CPU resources
OLAP queries can parallelize w/o overutilization
Distributed queues to minimize sync contention
Task stealing to fix imbalance
Slide6
Task scheduling problems for DBMS
OLTP tasks can block
Problem: underutilization of CPU resources
Solution: flexible concurrency level
OLAP queries can issue an excessive number of tasks in highly concurrent situations
Problem: unnecessary scheduling overhead
Solution: concurrency hint
Slide7
Outline
Introduction
Flexible concurrency level
Concurrency hint
Experimental evaluation with SAP HANA
Conclusions
Slide8
Fixed concurrency level
Typical task scheduling (fixed concurrency level):
Bypasses the OS scheduler
OLTP tasks may block
Underutilization
A fixed concurrency level is not suitable for DBMS
Slide9
Flexible concurrency level
Issue additional workers when tasks block
Cooperate with the OS scheduler
Concurrency level = # of worker threads
Active concurrency level = # of active worker threads
Goal: active concurrency level = # H/W contexts
The OS schedules the threads
Slide10
Worker states
Task scheduler
Inactive workers
Watchdog: monitors the workers, gathers statistics, and takes actions
Keeps active concurrency level ≈ # of H/W contexts
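The watchdog's role can be sketched as a single adjustment step. This is a minimal illustration, not SAP HANA's implementation: the `Scheduler` class, its counters, and the `watchdog_step` policy (reuse a parked worker if one exists, otherwise create a new one, and park a worker when too many are active) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Scheduler:
    """Hypothetical bookkeeping for a task scheduler's worker pool."""
    hw_contexts: int        # number of hardware contexts
    active: int = 0         # workers currently running tasks
    parked: int = 0         # idle workers kept around for later reuse
    pending_tasks: int = 0  # tasks waiting in the queues


def watchdog_step(s: Scheduler) -> str:
    """One watchdog pass: nudge the active concurrency level toward hw_contexts."""
    if s.active < s.hw_contexts and s.pending_tasks > 0:
        # Some workers are blocked or inactive: bring in a replacement.
        if s.parked > 0:
            s.parked -= 1
            action = "unpark"
        else:
            action = "spawn"
        s.active += 1
        return action
    if s.active > s.hw_contexts:
        # Blocked workers became active again: park one to avoid overutilization.
        s.active -= 1
        s.parked += 1
        return "park"
    return "none"


s = Scheduler(hw_contexts=4, active=4, pending_tasks=3)
s.active -= 1                 # a worker blocks in a syscall
first = watchdog_step(s)      # no parked worker yet, so a new one is created
s.active += 1                 # the blocked worker becomes active again
second = watchdog_step(s)     # now one worker too many: park it
print(first, second, s.active, s.parked)
```

In this sketch the concurrency level (total workers) grows when tasks block, while the active concurrency level returns to the number of H/W contexts after each pass.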
Blocked in syscall
Inactive by user
Waiting for a task
Parked workers
Active workers
Watchdog
Other threads
We dynamically re-adjust the scheduler's concurrency level
Slide11
Outline
Introduction
Flexible concurrency level
Concurrency hint
Experimental evaluation with SAP HANA
Conclusions
Slide12
Partitionable operations
Can be split into a variable number of tasks
Each operation calculates its task granularity: 1 ≤ # tasks ≤ # H/W contexts
Problem: calculation independent of the system’s concurrency situation
High concurrency: excessive number of tasks
Unnecessary scheduling overhead
We should restrict task granularity under high concurrency
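To illustrate a partitionable operation, the sketch below splits a sum (Σ) into a chosen number of tasks and combines the partial results. The function names `partition` and `parallel_sum`, and the use of Python's `ThreadPoolExecutor` as a stand-in for the task scheduler, are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, num_tasks):
    """Split data into num_tasks contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), num_tasks)
    chunks, start = [], 0
    for i in range(num_tasks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum(data, num_tasks, hw_contexts):
    """Sum data using between 1 and hw_contexts tasks."""
    num_tasks = max(1, min(num_tasks, hw_contexts))  # 1 <= # tasks <= # H/W contexts
    with ThreadPoolExecutor(max_workers=hw_contexts) as pool:
        partials = list(pool.map(sum, partition(data, num_tasks)))
    return sum(partials), len(partials)


total, tasks_used = parallel_sum(list(range(100)), num_tasks=16, hw_contexts=4)
print(total, tasks_used)
```

The request for 16 tasks is clamped to the 4 available H/W contexts, but the clamp ignores how busy the system already is, which is the problem this slide raises.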
[Figure: an aggregation (Σ) over Partition 1, Partition 2, and Partition 3, combined into the final result]
Slide13
Restricting task granularity
Existing frameworks for data parallelism
Not straightforward for a commercial DBMS
Simpler way?
free worker threads = max(0, # of H/W contexts - # of active worker threads)
The concurrency hint serves as an upper bound for # tasks
concurrency hint = exponential moving average of free worker threads
Slide14
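The free-worker formula and the moving average defined above can be sketched as follows. The class name `ConcurrencyHint`, the smoothing factor `alpha`, the initial value, and the truncation of the hint to an integer task count are assumptions for illustration; the slides do not specify them.

```python
class ConcurrencyHint:
    """Exponential moving average of the number of free worker threads."""

    def __init__(self, hw_contexts, alpha=0.5):
        self.hw_contexts = hw_contexts
        self.alpha = alpha                # assumed smoothing factor
        self.value = float(hw_contexts)   # assume an idle system at start

    def update(self, active_workers):
        # free worker threads = max(0, # H/W contexts - # active worker threads)
        free = max(0, self.hw_contexts - active_workers)
        self.value = self.alpha * free + (1 - self.alpha) * self.value
        return self.value

    def max_tasks(self, requested):
        # The hint is an upper bound on # tasks; always allow at least one task.
        return max(1, min(requested, int(self.value)))


hint = ConcurrencyHint(hw_contexts=8)
low = hint.max_tasks(requested=8)     # idle system: full parallelism
for _ in range(4):
    hint.update(active_workers=8)     # all hardware contexts busy
high = hint.max_tasks(requested=8)    # busy system: a single task
print(low, high)
```

With these assumed parameters the hint decays from 8 toward 0 while all contexts stay busy, so a partitionable operation that would have issued 8 tasks issues only 1.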
Concurrency hint
Low concurrency situations: concurrency hint → # H/W contexts; low latency
High concurrency situations: concurrency hint → 1; high latency, low scheduling overhead, higher throughput
Lightweight way to restrict task granularity under high concurrency
Slide15
Outline
Introduction
Flexible concurrency level
Concurrency hint
Experimental evaluation with SAP HANA
Conclusions
Slide16
Experimental evaluation with SAP HANA
Workloads: TPC-H SF=10; TPC-H SF=10 + TPC-C WH=200
Configuration: 8x10 Intel Xeon E7-8870 2.40 GHz, with hyperthreading, 1 TB RAM, 64-bit SMP Linux (SuSE) 2.6.32 kernel
Several iterations. No caching. No think times.
We compare:
Fixed (fixed concurrency level)
Flexible (flexible concurrency level)
Hint (flexible concurrency level + concurrency hint)
Slide17
TPC-H – Response time
Task granularity can affect OLAP performance by 11%
Slide18
TPC-H - Measurements
Unnecessary overhead from too many tasks under high concurrency
Slide19
TPC-H - Timelines
[Figure: timelines for Fixed and Hint]
Slide20
TPC-H and TPC-C
Throughput experiment
Variable TPC-H clients = 16-64. TPC-C clients = 200.
Slide21
Conclusions
Task scheduling for resource management
For DBMS:
Handle tasks that block
Solution: flexible concurrency level
Correlate task granularity of analytical queries with concurrency to avoid unnecessary scheduling overhead
Solution: concurrency hint
Thank you! Questions?
Slide22
Outline
Introduction
Flexible concurrency level
Concurrency hint
Experimental evaluation with SAP HANA
Conclusions
Slide23
TPC-H and TPC-C - Measurements
Slide24
SAP HANA’s architecture
[Figure: architecture diagram; component labels follow]
Persistence Layer (Logging, Recovery, Page Management)
Row-Store
Column-Store
Graph Engine
Text Engine
Scheduler
Execution Engine
Executor
Dispatcher
Optimizer and Plan Generator
Calculation Engine
Various access interfaces (SQL, SQL Script, etc.)
Metadata Manager
Authorization
Transaction Manager
Connection and Session Management
Receivers
Network
Slide25
Our task scheduler
Task graph: root node, non-root nodes
Uninitiated graphs
Task queues
Worker threads
Distributed queues to minimize sync contention
Task stealing for load-balancing
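The distributed queues and task stealing can be sketched as below. This is a single-threaded simulation under stated assumptions: the `WorkerQueues` class, one deque per worker, and the policy of stealing from the longest queue are illustrative choices, not the scheduler's actual design.

```python
from collections import deque


class WorkerQueues:
    """One task queue per worker, with stealing when a worker's queue is empty."""

    def __init__(self, num_workers):
        self.queues = [deque() for _ in range(num_workers)]

    def submit(self, worker, task):
        self.queues[worker].append(task)

    def next_task(self, worker):
        """Return (task, stolen); task is None when every queue is empty."""
        own = self.queues[worker]
        if own:
            return own.popleft(), False   # local task, no stealing needed
        # Own queue is empty: steal from the most loaded queue, if any.
        victim = max(self.queues, key=len)
        if victim:
            return victim.pop(), True     # take from the opposite end
        return None, False


q = WorkerQueues(num_workers=2)
for name in ("a", "b", "c"):
    q.submit(0, name)                     # all work lands on worker 0

print(q.next_task(0))                     # worker 0 runs its own task
print(q.next_task(1))                     # worker 1 steals from worker 0
```

Keeping a queue per worker means workers mostly touch their own queue, and stealing only happens when a worker runs dry, which is how the imbalance gets corrected.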