Faculty Computer Science and Engineering York University CA Course CSE 6421 Advanced Database Systems Instructor Prof Jarek Gryz Speaker Benedikt Iltisberger ID: 500839
Download Presentation The PPT/PDF document "Energy Consumption of small Database Sys..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Slide1
Energy Consumption of small Database Systems
Faculty
:
Computer Science
and
Engineering, York University (CA)
Course
:
CSE
6421
- Advanced Database
Systems
Instructor:
Prof.
Jarek
Gryz
Speaker:
Benedikt Iltisberger
Date
:
2010-
12-01Slide2
Presentation Time:
35-45
minutesQuestions: I am looking forward to answer your questions during the presentation.
Slide
2Slide3
Agenda of the Presentation
What is the topic?
Why is it important?How to solve the problem?What do you have to keep in mind?Does it work in the real world?What is the conclusion?What can be done in the future?Slide 3Slide4
What is the Topic?
Show a way how to measure the power consumption of a single query!
Slide 4
Speed
Energy
ConsumptionSlide5
What are the Scientific Challenges?
Never done before
[12]Could/should be implemented in any DBMSMultiple factors need to be consideredUpcoming and important research areaSlide 5Slide6
Energy Consumption
Slide
6Source: [7]Slide7
Why is it worth to focus on energy saving?
Saving the environment
Growing market Job opportunitySaving costs for companiesEnergy bills are rising quicklyVery important research area in EuropeInteresting from the technical point of viewSlide 7Slide8
Benchmarking Organizations
SPEC
“New SPECweb2009 benchmark adds ability to measure power consumption of web servers” (June 5, 2009)[4]TPC“Transaction Processing Performance Council Announces First Results for its TPC-Energy Specification” (July 7, 2010)[5]Slide 8Slide9
Energy consumption in the IT
Slide
9Source: [6]Slide10
What have DBs to do with that?
Core Service
Delivering data for any business processStore the foundation of a companyNeed to be accessible 24/7Today integrated even in small projectsData warehouses and Data martsLong running energy consuming queriesSlide 10Slide11
Reasons for Decision-Makers
Reducing costs
Keep up a good image (Green IT)InnovationKeep up with the timesBeing faster than competitorsDepending on the area of workAdditional company policies
Slide
11Slide12
Idea of this Proposal
Estimate the energy consumption of a single query to
Compare queries in terms of efficiencyOptimize queries for energy savingGet energy information in real timeTeach the optimizer to do this taskGreen IT initiativeSlide 12Slide13
How to achieve this goal?
DBMS (e.g. DB2) can generate reports about executed queries containing:
Original statement -> optimized statementAccess Plan with many details like:CPU Cost CPUI/O Cost HDDBufferpool Buffers Main Memory
Slide
13Slide14
What information is given?
IBM
DATABASE 2 Explain Table Format ToolDB2 version: 09.07.3Detailed information about queries includingAccess Plan (also graphical)Original query Optimized QueryDetailed information about the different costsDetailed information about tables, indexes, …Slide 14Slide15
Information from the Statistic File
Slide
15Database Context:----------------Parallelism: NoneCPU Speed: 1.889377*107Comm Speed: 100Buffer Pool size: 6000Sort Heap size: 1000
Database Heap size:
900
Lock List size:
4096
Maximum Lock List:
22
Average Applications:
1
Locks Available:
288354) SORT : (Sort)----------------Cumulative Total Cost: 27541.2
Cumulative CPU Cost: 5.39421*109Cumulative I/O Cost: 23667.2Cumulative Re-Total Cost: 27541.1Cumulative Re-CPU Cost: 5.39388*109
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost:
27541.2
Estimated
Bufferpool
Buffers:
585
Detailed information: [
9, 10, 11
]Slide16
Key Metrics in Detail
CPU Cost
Instructions needed to execute a queryI/O CostNumber of seeks and page transfersBufferpool BufferNumber of used pagesTotal CostMeasured in Timerons (proprietary IBM algo.) Slide 16Slide17
The Optimizer’s Access Plan
Slide
17Slide18
Slide 18Slide19
Slide 19
| /------------------+------------------\ 18000 211.307 299.933 TABLE: TPCDS ^HSJOIN FETCH ITEM ( 8) ( 18) Q2 26913.4 30.4522 23078.2 4 /----------+-----------\ /---+----\ 13889.9 29221.1 300 300 NLJOIN TBSCAN IXSCAN TABLE: TPCDS ( 9) ( 17) ( 19) PROMOTION
17710.9 9200.94 0.105098 Q1
14832.2 8246 0
/-------+--------\ | |
352.257 39.4311 1.9208e+006 300
TBSCAN FETCH TABLE: TPCDS INDEX: SYSIBM
( 10) ( 13) CUSTOMER_DEMOGRAPHICS SQL100701103316930
723.246 39.5703 Q4 Q1
656 32.9996
| /----+-----
\ 352.257 39.4311 2.8804e+006
SORT RIDSCN DP-TABLE: TPCDS ( 11) ( 14) STORE_SALES 723.246 7.59231 Q5 656 1 | | 352.257 39.4311 TBSCAN SORT 3 ( 12) ( 15
)Slide20
Parse and Compute
Perl script to gather important information
Grab all CPU, HDD and Memory operationsSum them upProvide script with following parameters:Energy consumption of 1 CPU cylceEnergy consumption of 1 HDD operationEnergy consumption of 1 Memory operationCalculate the query's power consumptionUsing the information from no. 1 and no. 2Slide 20Slide21
Perl Script
Slide
21Slide22
Assumed DB System
CPU
[1]Intel Xeon X7560 @2.266 GHz, 130W8 Cores, 16 Threads, 24MB L3 CacheHDD[2]Seagate Cheetah 15K, 600GB, 16MB CacheI/O Data transfer rate 600MB/s, 16.35W (avg operating power consumption)RAM[6]
4GB DDR3 RAM (2 * 2GB DIMM
#
) with 9W per DIMM
18W
Additional Hardware
[
6
]
:30% of average power consumption of other components which leads to (130W + 16.35W + (2 * 9W)) * 0.3 = 49,305WSlide 22Source: [
6], Page 1233, Section 3.2#) DIMM Dual In-Line Memory ModuleOverall:213.66WSlide23
How to measure the parameters? I
CPU
Peak power consumption (p) of the manufacturer (Watts per hour) Total MIPS per hour (c)Calculate avg power consumption (a)p / c = aProblemsSlide calculation flaws may end in high differencesMany instructions take more the one cycleThis also depends on the CPU typeCurrent multicore architecture need special customizationSlide 23Slide24
How to measure the parameters? II
Slide
24HDDPeak power consumption (p) of the manufacturer (Watts per hour) Calculate the max. avg I/O operations per hour (c)Calculate avg power consumption per I/O and hour (a)p / c = aProblemsDifferent energy consumption for read & writeRAID-n or other storage systemsSlide25
How to measure the parameters? III
Main Memory
Peak power consumption (p) of the manufacturer (Watts per hour) Calculate the max. page change rate per hour (c)Calculate avg power consumption per page change (a)p / c = aProblems:Very complicated and inpreciseStrongly dependent on memory architecture and manufacturing processSlide 25Slide26
Exemplary Technical Issue:“Samsung DDR3”
Slide
26Source: [8]Slide27
Automation of Data Collection about CPU, HDD, RAM
Proposal: “
energyDB”Name, Modell, Speed, Energy consumption, … Online service to provide needed informationComparable with CDDB[13] or freedb[14]Useful for other projects as well
Slide
27Slide28
Optimizing Queries
Areas of optimization:
Execution speed (typical parameter)When time is relevantResource usageFor many concurrent users/queriesNew: Energy ConsumptionSlide 28Slide29
Optimizing Queries No. 1
Original Query:
SELECT i_item_id, CAST(avg(CAST(ss_quantity AS DECFLOAT)) AS DECIMAL(10,6)) agg1, CAST(avg(ss_list_price) AS DECIMAL(10,6)) agg2, CAST(avg(ss_coupon_amt) AS DECIMAL(10,6)) agg3, CAST(avg
(
ss_sales_price
) AS DECIMAL(10,6)) agg4
FROM
tpcds.store_sales
,
tpcds.customer_demographics
, tpcds.date_dim
, tpcds.item, tpcds.promotion WHERE ss_sold_date_sk = d_date_sk and ss_item_sk =
i_item_sk and ss_cdemo_sk = cd_demo_sk and ss_promo_sk = p_promo_sk and cd_gender = 'M' and cd_marital_status = 'S' and cd_education_status
= 'College' and (p_channel_email = 'N' or
p_channel_event
= 'N') and
d_year
= 2000
GROUP BY
i_item_id
ORDER BY
i_item_id
FETCH FIRST
100
ROWS ONLY
Slide
29Slide30
Optimizing Queries (Results)
Original Query 1
(with index)Declined Query 1(index deleted)CPU Cost4.96336*1095.39446*109I/O Cost
8636.78
23667.20
Buffer Pool
usage*
68303.99
70816.80
Energy Consumption
TBD
TBD
Total Cost
12119.327541.2Slide 30
*) Sum of all buffer pool operations for all involved queries.Slide31
Optimizing Queries No. 2
Original Query:
SELECT i_item_id, CAST(avg(CAST(ss_quantity AS DECFLOAT)) AS DECIMAL(10,6)) agg1, CAST(avg(ss_list_price) AS DECIMAL(10,6)) agg2, CAST(avg(ss_coupon_amt) AS DECIMAL(10,6)) agg3,
CAST
(
avg
(
ss_sales_price
) AS DECIMAL(10,6)) agg4
FROM
tpcdsnp.store_sales
, tpcdsnp.customer_demographics, tpcdsnp.date_dim, tpcdsnp.item, tpcdsnp.promotion WHERE ss_sold_date_sk
= d_date_sk and ss_item_sk = i_item_sk and ss_cdemo_sk = cd_demo_sk and ss_promo_sk = p_promo_sk and
cd_gender = 'M’ and cd_marital_status
= 'S' and
cd_education_status
=
'
College'
and (
p_channel_email
= 'N'
or
p_channel_event
= 'N') and
d_year
= 2000
GROUP BY
i_item_id
ORDER BY
i_item_id
FETCH FIRST100 ROWS ONLY
Slide
31Slide32
Optimizing Queries (Results)
Original Query 2
(with index)Declined Query 2(index deleted)CPU Cost4.51373*1071.45693*109I/O Cost
654.55
8495.00
Buffer Pool
usage*
3971.55
17575.20
Energy Consumption
TBD
TBD
Total Cost
1124.438772.47Slide 32*) Sum of all buffer pool operations for all involved queries.Slide33
Optimizing Queries (Problems)
Better performance but higher energy usage:
Precomputation for faster response timesMaterialized ViewsCPUCheap in terms of time consumption butExpensive in terms of energy consumptionSlide 33Slide34
Conclusions I
Reducing costs is always welcome in
companiesHigh need for this research in the long termDue to corporate identity and rising energy costsEnvironmental savingQuery optimization does not necessarily lead to power savingProposed technique can be added to the optimizerSlide 34Slide35
Conclusions IICons
You
need to measure the single components in the first placeNot easily applicable to complex systemsAccuracy must be determinedChecking results with hardware based measurement dataHow to measure VMs?Slide 35Slide36
Future Work I
Add power consumption optimization techniques directly to the optimizer
IBM seems to be working on that[12]Optional output of estimate power consumptionImprove calculation techniquesEven slide inaccuracy can have a big impactRAM usage is pretty inaccurate at the momentAutomization of the processSlide 36Slide37
Future Work II
Implementation
of complex Dataware House systems or for distributed database systemsInaccuracy with measuring multi CPU systems
„
energyDB
“ web
service
implementation
Impact
of
hardware power savingGPU for databases, SSD, Dynamic voltage scaling, ...
Slide 37Slide38
Concluding …
This t
opic is worth for further research!Slide 38Slide39
Reference:
http
://www.intel.com/p/en_US/products/server/processor/xeon7000/specificationshttp://www.seagate.com/www/en-us/products/enterprise-hard-drives/cheetah-15k#tTabContentSpecificationshttp://www.gruene-it.de/index.php/2010/09/17/ram-stromverbrauch-server/http://www.spec.org/web2009/press/release.htmlhttp://www.tpc.org/information/press/tpcpress20100707.aspM. Poess
, R. O.
Nambiar
, Energy Cost,
“The
Key Challenge of Today’s Data Centers: A Power Consumption Analysis of TPC-C
results”,Oracle
Corporation and Hewlett-Packard Company, N.A.
.
Thank
you
for
your
attention
!
Any
questions
?
Slide
39Slide40
Reference (cont.):
http://www.bp.com/liveassets/bp_internet/globalbp/globalbp_uk_english/reports_and_publications/statistical_energy_review_2008/STAGING/local_assets/2010_downloads/statistical_review_of_world_energy_full_report_2010.
pdfhttp://www.samsung.com/global/business/semiconductor/Greenmemory/Applications/ServerStorage/ServerStorage_DDR3.htmlhttp://www-01.ibm.com/support/docview.wss?rs=0&uid=swg21207055http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000295.htmhttp://www.ibm.com/developerworks/data/library/techarticle/0212wieser/index.html
http://www.freepatentsonline.com/y2009/0281986.
html
http://www.gracenote.com
/
http://www.freedb.org
R.
Ramahrishnan
, J.
Gehrke
, „Database Management Systems”, Third Edition, McGraw-Hill, Boston, 2003.
Slide 40