Marco Russo Senior Consultant SQLBI DBI319 Marco Russo marcosqlbicom httpsqlblogcomblogsmarcorusso marcorus What well discuss History recap amp roadmap xVelocity Patterns comparison ID: 759772
Download Presentation The PPT/PDF document "BISM: Multidimensional vs. Tabular" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Slide1
BISM:Multidimensional vs. Tabular
Marco RussoSenior ConsultantSQLBI
DBI319
Slide2Marco Russo
marco@sqlbi.com
http://sqlblog.com/blogs/marco_russo
@
marcorus
Slide3What we’ll discuss
History recap & roadmap
xVelocity
Patterns comparison
Considerations about porting/migration
Conclusions
Slide4UDM was born in 2005
Once upon a time, there was UDM
Unified Dimensional Model
Only one BI data modeling tool
Slide52012: BI Semantic Model
There
was
UDM,
now
there
is
BISM
Business Intelligence
Semantic
Model
Only
one
BI data
modeling
tool
Divided
in
two
flavors
Tabular
Multidimensional
Now
there
are
two
modeling
techniques
Under a single
technology
Slide6BI Semantic Model: Vision
BI Semantic Model
Data model
Business logic
and queries
Data access
ROLAP
MOLAP
xVelocity
Direct
Query
MDX
DAX
Multi-
dimensional
Tabular
Third-party
applications
Reporting
Services
Excel
PowerPivot
Databases
LOB Applications
Files
OData
Feeds
Cloud Services
SharePoint
Insights
Power
View
Slide7Tabular: the new kid in town
Part of the BISM
vision
Super-
duper
fast
xVelocity
in-
memory
engine
Amazing
compression
ratio of
tables
Based
on the
relational
data model
Programmed
with the new DAX
language
When
compared
with OLAP
Quite
always
faster
.
Much
faster
Easier
to
learn
and use
Slide8Is This The Right Approach?
I am xVelocity, your new engineThou shalt have no other engine before me
Slide9Or is it Better to Investigate?
I
have
two
engines
now
What
are
their
strenghts
?
When
should
I
choose
one
against
the
other
?
Slide10Analysis Services Server Mode
Choose server mode of Analysis ServicesDeploymentMode setting in msmdsrv.ini0 – Multidimensional and Data Mining (default)1 – SharePoint2 – TabularDifferent icons in Object ExplorerSame code, different behaviorChoice at the INSTANCE levelTwo SSAS instances needed to use both
Slide11What is xVelocity in-memory?
In-memory database
Based on the relational methodology
Column oriented database
Data is stored in a compressed format
Slide12Row Storage Layout
IDNameAddressCityStateBal Due1Bob………3,0002Sue………5003Ann………1,7004Jim………1,5005Liz………06Dave………9,0007Sue………1,0108Bob………509Jim………1,300
1
Bob
…
…
…3,0002Sue………5003Ann………1,700
4Jim………1,5005Liz………06Dave………9,000
7Sue………1,0108Bob………509Jim………1,300
Nothing special here. This is the standard way database systems have been laying out tables on disk since the mid 1970s.Technically, it is called a “row store”
Customers Table
Slide13Column Storage Layout
IDNameAddressCityStateBal Due1Bob………3,0002Sue………5003Ann………1,7004Jim………1,5005Liz………06Dave………9,0007Sue………1,0108Bob………509Jim………1,300
Tables are stored “column-wise” with all values from a single column stored in a single block
Customers Table
ID
1
2
3
456789
NameBobSueAnnJimLizDaveSueBobJim
Address………………………
City………………………
State………………………
Bal
Due
3,000
500
1,700
1,500
0
9,000
1,010
50
1,300
Slide14Column vs Row Storage
Column Storage
Quick access to a single column
Time needed to materialize rows
Trade CPU
vs
I/O
Row Storage
Quick access to a single row
No materialization needed
Trade I/O
vs
CPU
Slide15Run Length Encoding (RLE)
QuarterQ1Q1Q1Q1Q1Q1…Q2Q2Q2Q2Q2Q2Q2Q2Q2…
Quarter
Start
Count
Q11310Q2311290………
ProdIDStartCount115263………15152563
ProdID11111222…11111222
Price
100
120
315
100
315198450320320150256450192184310251266
Price100120315100315198450320320150256450192184310251266
RLE
Compression
applied
only
when
size
of
compressed
data
is
smaller
than
original
Slide16xVelocity
Store
Dictionary Encoding
Quarter
Q1Q1Q1Q1Q2Q2…Q2Q3Q3Q3Q3Q4Q4Q4Q4…
Only 4 values.2 bits are enough torepresent it
DISTINCT
Q.IDQuarter0Q11Q22Q33Q4
Q.ID111122…233334444…
R.L.E.
Q.ID
Start
Count
1
1
4
2
5
10
3
11
4
4
15
15
Slide17xVelocity in-memory Compression
Dictionary Encoding
Always happens
Makes tables
datatype
-independent
RLE Encoding
Only if compressed data is smaller than original
Strongly depends on data order
SSAS automatically chooses best sorting
Slide18xVelocity works only in-memory
Compressed data
Must fit in memory
Otherwise, it simply does not work
On the hardware side this means
Very fast CPU
Very fast memory
Disks are not important at all
Slide19Optimize Degenerate Dimensions
Storing an ID for
DrillThrough
is expensive
One different value for every row
Large dictionary in large fact table
Consider
spliting
in more columns
Every column has a smaller dictionary
Impact on query performance
Good for drillthrough or single lookup
Bad for distinct count / filters
Slow response time
Requires memory for spooling
Slide20Split String Column
SELECT LEFT( TransactionID, 5 ) AS TransactionHighID, SUBSTRING( TransactionID, 6, LEN( TransactionID ) - 5 ) AS TransactionLowID, Quantity, PriceFROM Fact
Split a 10-character length
string into
two 5-character strings
Slide21Split Integer Column
SELECT TransactionID / 10000 AS TransactionHighID, TransactionID % 10000 AS TransactionLowID, Quantity, PriceFROM Fact
Split 100 million range in two 10.000 ranges
Beware of possible materialization later
on
Slide22Split Column Optimization
Splitting saves memory but increases process timeQuery performance penalty for materialization
Number of Columns
Process Time
Cores Used
Disk Size
1 (original)
02:48
1
2,811 MB
2
03:21
up to 8
191 MB
3
03:49
up to 8
129 MB
4
04:01
up to 8
97 MB
8
05:32
up to 8
105 MB
Slide23Which Server for SSAS?
FeatureMultidimensionalTabularRAMSome (16/32 Gb)A lot (64/128 Gb)RAM SpeedImportantCrucialNumber of cores4/8/164/8/16Core speedLess ImportantCrucialDisk speedVery ImportantUselessSSD Disk UsageStrongly recommendedUselessNetwork speedImportantImportantConcurrencyPretty goodNot enough experience…
Don’t use the same
server for both
Slide24SQL Server Appliances
SQL Engine
appliance
PowerPivot appliance
Slide25Main Limitations of Tabular
Database
must
fit
in
memory
Missing
Features
No support for
translations
No
concept
of «Set»
MDX Script
not
available
Only
plain
vanilla
1:M
relationships
Slide26Distinct Count in Multidimensional
In
Multidimensional
they
are a
pain
Dedicated
measure
group
Partitioning
needed
for performance
Based
on the
number
of
cores
Designed
for the hardware
Poor
processing performance
ORDER BY
during
processing
On the
fact
table
…
Slide27Distinct Count in Tabular
Simply
… easy
as
you
might
want
Distinct
Count
on
any
column
Dictionary
encoding
helps
a
lot
In-
memory
calculations
No special processing for
Distinct
Counts
No
partitioning
needed
No special
measure
group
Slide28What about aggregations?
Hard to design
If
you
don’t
rely
on the
wizard
Which
you
should
not
!
Improvement
on a subset of
queries
Need
to
check
them
periodically
Usage
Based
Optimizations
helps
But
it
is
hard to
manage
In Tabular?
They
are
gone
. No
aggregation
!
Slide29Tool Dimensions
Common SSAS
technique
One
tool
dimensions
modifies
calculations
Frequently
used
for the time
dimension
Technique
based
on
[
Measures
].
CurrentMember
SCOPE statement
Features
not
available
in DAX…
Complex
DAX code
needed
to solve
it
Slide30Parent / Child Hierarchies
Multidimensional
supports
P/C
hierarchies
But
they
are slow
Thus
,
you
naturalize
them
Tabular
supports
only
naturalized
hierarchies
They
are standard
hierarchies
And
they
are fast
HideMemberIf
property
not
available
Can be
created
using
DAX
Not
an easy task,
but
it
works
just fine
Slide31Unary Operators
Multidimensional
supports
unary
operators
Frequently
used
on P/C
hierarchies
Pretty
slow
calculations
Used
on
financial
cubes
On
pre-aggregated
data
Provide
a
very
nice
user
experience
No support in Tabular for
unary
operators
Use DAX to
implement
them
Formulas
tend
to be
very
complex
Slide32Custom Rollups
Unary
operators
on
steroids
Data
driven
calculations
Pretty
slow,
but
they
work
Again
, on small
financial
cubes
Or to
implement
currency
exchange
No support in Tabular
Data-
driven
cannot
be
implemented
in DAX
But
currency
exchange
can
run
faster
Slide33Hierarchy Usage
Hierarchies
in
Multidimensional
One
of the
modeling
pillars
E.g. «Ratio To
Parent
»
Hierarchies
in Tabular
Collection of
column
No
function
to
handle
them
in DAX
Ratio to
parent
is
pretty
hard
Not
a big
issue
,
after
all
Users
will
adapt
to the
limitation
Slide34Leaf-Level Calculations
Several
scenarios
Weighted
aggregations
Currency
exchange
Easy to compute in DAX
Much
faster
than
MDX
Don’t
require
changes
in the data model
Slide35Classical Weighted Avg Solution
MeanPrice :=SUM( [PriceMultipliedByQuantity] )/ SUM( [OrderQuantity] )
Slide36Weighted Aggregation in DAX
MeanPrice =SUMX( FactResellerSales, [OrderQuantity] * [UnitPrice] )/SUM( [OrderQuantity] )
Slide37MDX Script is not Available
Usages
of MDX Script
Simple
calculations
More
complex
SCOPES
Entire
business
logic
Should
you
store
business
logic
in MDX?
No,
really
Sometimes
it
is
necessary
And DAX
is
not
the best option
Anyway
,
try
to
move
business
logic
in ETL
Slide38Should I Port My Solutions?
Not
necessarily
Different
modeling
options
Calculated
columns
Complex
Relationships
in DAX
Different
calculation
techniques
Evaluation
Contexts
Porting
means
building a new
project
Slide39Multidimensional: The BI Path
Data
Model
Complexity
User
Requirements
SSAS Multidimensional
Visual Studio
Slide40Tabular: The BI Path
Data
Model
Complexity
User
Requirements
PowerPivotfor Excel
PowerPivotfor SharePoint
SSAS Tabular
EXCEL
Visual Studio
Slide41So… the conclusions?
Tabular
is
not
the new UDM
It
is
one
of the
flavors
of BISM
Side by side with
Multidimensional
DAX
is
easier
to
learn
and to use
New on SSAS? Go Tabular
New
project
?
Consider
Tabular
as
an option
Beware
of the
limitations
You
need
to
learn
DAX in
advance
Different
modeling
techniques
Slide42What Next?
Learn
DAX
Seriously
,
learn
it!
DAX
is
not
easy,
but
it
is
simple
!
Learn
different
modeling
techniques
Spend
time to
get
acquainted
with
this
new
kid
Build
some
prototype
Don’t
make
a
porting
,
build
a new
project
Learn
to
think
in DAX
Slide43Related Content
DBI305
Developing
and Managing a BI Semantic Model in
Analysis
Services
DBI413 Many-to-Many Relationships in BISM Tabular
DBI62-HOL Optimizing a MS SQL Server 2012 Tabular BI Semantic Model
Slide44Resources
Connect. Share. Discuss.
http://northamerica.msteched.com
Learning
Microsoft Certification & Training Resources
www.microsoft.com/learning
TechNet
Resources for IT Professionals
http://microsoft.com/technet
Resources for Developers
http://microsoft.com/msdn
Slide45Track Resources
@sqlserver
@ms_teched
m
v
a
Microsoft Virtual Academy
SQL Server 2012 Eval Copy
Get Certified!
Hands-On Labs
Slide46Required Slide
Complete an evaluation on CommNet and enter to win!
Slide47MS Tag
Scan the Tag
to evaluate thissession now onmyTechEd Mobile
Slide48©
2012 Microsoft
Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the
part
of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT
MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Slide49Who’s Speaking?
BI Expert and ConsultantFounder of www.sqlbi.com Problem SolvingComplex Project AssistanceDataWarehouse Assesments and DevelopmentCourses, Trainings and WorkshopsBook WriterMicrosoft Business Intelligence PartnerSSAS Maestro – MVP – MCP
Slide50