Sunil Agarwal Principal Program Manager SQL Server sunilaMicrosoftcom BRK4552 Definition and Value Prop Operational Analytics with DiskBased Tables Operational Analytics with InMemory OLTP ID: 551666
Download Presentation The PPT/PDF document "Operational Analytics in SQL Server" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Slide1Slide2
Operational Analytics in SQL Server
Sunil AgarwalPrincipal Program Manager, SQL Serversunila@Microsoft.com
BRK4552Slide3
Definition and Value Prop
Operational Analytics with Disk-Based TablesOperational Analytics with In-Memory OLTP
AgendaSlide4
Refers to Operational Workload (i.e. OLTP)
Examples: Enterprise Resource Planning (ERP) – Inventory, Order, Sales, Machine Data – Data from machine operations on factory floor
Online Stores (e.g. Amazon, Expedia)
Stock/Security trades
Mission Critical
No downtime (High Availability) – impact on revenueLow latency and high transaction throughput
What is Operational?Slide5
Analytics
Studying past data (e.g. operational, social media) to identify potential trends To
analyze the effects of certain decisions or
events (e.g. Ad campaign)
Analyze past/current data to predict outcomes (e.g. credit score)
GoalsEnhance the business by gaining knowledge to make improvements or changes.
What is Analytics?
Source – MIT/SLOAN Management ReviewSlide6
Traditional Operational/Analytics Architecture
SQL Server
Database
Application Tier
Presentation Layer
IIS
Server
SQL Server
Relational DW
Database
ETL
BI and analytics
Dashboards
Reporting
SQL Server
Analysis Server
Key Issues
Complex Implementation
Requires two Servers (
CapEx
and
OpEx
)
Data Latency
in Analytics
More businesses demand/require real-time Analytics
Hourly, Daily, WeeklySlide7
Minimizing Data Latency for Analytics
SQL Server
Database
Application Tier
Presentation Layer
IIS
Server
BI and analytics
Dashboards
Reporting
Benefits
No Data Latency
No ETL
No Separate DW
Challenges
Analytics queries are resource intensive and can cause blocking
How to minimize Impact on Operational workload
Sub-optimal execution of Analytics on relational schema
Add analytics specific indexes
This is
OPERATIONAL ANALYTICS
SQL Server
Analysis ServerSlide8
Operational Analytics
Ability to run Analytics Queries concurrently with
Operational workload using the same schema.
Not
a
replacement for Extreme Analytics Queries performance possible using schemas customized (e.g. Star/Snowflake) and pre-aggregated cubes
Data coming
from non-relational sources
Data coming from multiple relational sources requiring integrated analyticsSlide9
Operational Analytics
SQL Server 2016Slide10
Goals
Minimal impact on Operational Workload with concurrent analyticsPerformant Analytics on operational schema
SQL Server 2016: Operational Analytics
Achieved using columnstore IndexSlide11
Columnstore Index: Why?
Improved compression:
Data from same domain compress better
Reduced I/O
:
Fetch only columns needed
…
Data stored as rows
Data stored as columns
Ideal for OLTP
Efficient
operation on small set of rows
C1
C2
C3
C5
C4
Improved Performance
:
More data fits in memory
Optimized for CPU utilization
Ideal for DW WorkloadSlide12
Clustered Columnstore Performance: TPC-HSlide13
Operational Analytics
Disk-Based TablesSlide14
Operational Analytics: With Columnstore Index
Key Points
Create an
updateable
non-clustered
columnstore index
(NCCI) for analytics queries
Drop all other indexes that were created for analytics.
No Application
changes.
ColumnStore index is maintained just like any other
index
Query Optimizer will choose columnstore index where needed
Relational Table
(Clustered Index/Heap
)
Btree
Index
Delete bitmap
Nonclustered columnstore index (NCCI)
Delta rowgroupsSlide15
Operational Analytics:
Columnstore Index Overhead
DML Operations on OLTP workload
Operation
BTREE (NCI)
Non
Clustered ColumnStore Index (NCCI)
Insert
Insert row
into
btree
Insert
row into
btree
(delta store)
Delete
Seek row(s) to be deleted
Delete the rowSeek for the row in the delta stores (there can be multiple)If row found, then deleteOtherwise insert the key into delete row buffer
UpdateSeek the row(s) UpdateDelete the row (steps same as above)Insert the updated row into delta storeSlide16
Operational Analytics: Minimizing Columnstore overhead
Key Points
Create Columnstore only on cold data – using filtered predicate to minimize maintenance
Analytics query accesses both columnstore and ‘hot’ data transparently
Example –
Order Management Application –
create nonclustered columnstore index ….. where
order_status
= ‘SHIPPED’
Relational Table
(Clustered Index/Heap
)
Btree
Index
Delete bitmap
Nonclustered columnstore index (NCCI) – filtered index
HOT
Delta rowgroups
DML OperationsSlide17
Operational Analytics: Minimizing Columnstore overhead
Key Points
Mission Critical Operational Workloads typically configured for High Availability using
AlwaysOn
Availability Groups
You can offload analytics to readable secondary replica
Primary
Replica
Log records
Log records
Log records
Secondary Replica
Secondary
Replica
Secondary
Replica
Analytic
Queries
AlwaysOn
Availability GroupSlide18
Demo
Sunil AgarwalSlide19
Operational Analytics
In-Memory TablesSlide20
Operational Analytics:
Columnstore on In-Memory Tables
SQL Server 2016 – CTP2 limitation
You can create columnstore index on empty table
All columns must be included in the columnstore
No explicit delta rowgroup
Rows (tail) not in columnstore stay in in-memory OLTP table
No columnstore index overhead when operating on tail
Background task migrates rows from tail to columnstore in chunks of 1 million rows not changed in last 1 hour.
Deleted Rows Table (DRT) – Tracks deleted rows
Columnstore data fully resident in memory
Persisted together with operational data
No application changes required.
In-Memory OLTP Table
Updateable CCI
DRT
Tail
Range
Index
Hash Index
Hot
Like
Delta rowgroupSlide21
Operational Analytics:
Columnstore Overhead
DML Operations on
In-Memory OLTP
Operation
Hash
or Range Index
HK-CCI
Insert
Insert
row into HK
Insert
row into HK
Delete
Seek row(s) to be deleted
Delete the row
Seek row(s) to be deleted
Delete the row in HKIf row in TAIL then return else insert <colstore-RID> into DRT
UpdateSeek the row(s) Update (delete/insert)Seek the row(s) Update (delete/insert) in HK
If row in TAIL then return else insert <colstore-RID> into DRTSlide22
Operational Analytics: Minimizing Columnstore overhead
DML
Operations
In-Memory OLTP Table
Updateable CCI
DRT
Tail
Range
Index
Hash Index
Like
Delta rowgroup
Hot
Keep hot data only in in-memory tables
Example – data stays hot for 1 day, 1 week…
CTP2: Work-Around
Use TF – 9975 to disable auto-compression
Force compression using a spec-
proc
“
sp_memory_optimized_cs_migration
”
Analytics Queries
Offload Analytics to
AlwaysON
Readable SecondarySlide23
Summary – Operational Analytics
Analytics in real-time with no data latency
Rich set of options to control impact on Operational workload
Industry leading solution Integrating in-memory OLTP with in-memory Analytics
No Application changes requiredSlide24
Visit
Myignite
at
http://myignite.microsoft.com
or download and use the
Ignite
Mobile
App
with
the QR code above.
Please evaluate this session
Your feedback is important to us!Slide25Slide26
Operational Analytics: with CCI
CCI
Btree
Index
delta
HOT DATA
SQL
V.Next
Allows creating one or more NCIs on CCI
Allows locking @
rowlevel
for updates/deletes
Ability to delay compression of rows in delta rowgroup
Some Limitations
No Triggers
No transaction replication
Cursor based access not allowed
Comparison with NCCI
Seek of the row in the compressed store is comparatively expensive
Short-range scans comparatively expensive
Lower concurrency: when rowgroup is
compressed (TM),
it is not available for UPDATE/DELETE