
Parallel Databases

COMP3017 Advanced Databases

Dr Nicholas Gibbins - nmg@ecs.soton.ac.uk

2012-2013

Definitions

2

Parallelism

An arrangement or state that permits several operations or tasks to be performed simultaneously rather than consecutively

Parallel Databases

have the ability to split

processing of data

access to data

across multiple processors

Why Parallel Databases

3

Hardware trends

Reduced elapsed time for queries

Increased transaction throughput

Increased scalability

Better price/performance

Improved application availability

Access to more data

In short, for better performance

Tightly coupled

Symmetric Multiprocessor (SMP)

P = processor, M = memory

Shared Memory Architecture

4

Diagram: three processors (P) connected to a single Global Memory

Less complex database software

Limited scalability

Single buffer

Single database storage

Software

Shared Memory

5

Diagram: three processors (P) connected to a single Global Memory

Loosely coupled

Distributed Memory

Shared Disc Architecture

6

Diagram: three processors (P), each with its own memory (M), connected to a shared disc (S)

Avoids memory bottleneck

Same page may be in more than one buffer at once – can lead to incoherence

Needs global locking mechanism

Single logical database storage

Each processor has its own database buffer

Software

Shared Disc

7

Diagram: three processors (P), each with its own memory (M), connected to a shared disc (S)

Massively Parallel

Loosely Coupled

High Speed Interconnect (between processors)

Shared Nothing Architecture

8

Diagram: three processors (P), each with its own memory (M), connected by a high-speed interconnect

One page is only in one local buffer – no buffer incoherence

Needs distributed deadlock detection

Needs multiphase commit protocol

Needs to break SQL requests into multiple sub-requests

Each processor has its own database buffer

Each processor owns part of the data

Software - Shared Nothing

9

Diagram: three processors (P), each with its own memory (M), connected by a high-speed interconnect

Hardware vs. Software Architecture

10

It is possible to use one software strategy on a different hardware arrangement

Also possible to simulate one hardware configuration on another

Virtual Shared Disk (VSD) makes an IBM SP shared nothing system look like a shared disc setup (for Oracle)

From this point on, we deal only with the shared nothing software model

Shared Nothing Challenges

11

Partitioning the data

Keeping the partitioned data balanced

Splitting up queries to get the work done

Avoiding distributed deadlock

Concurrency control

Dealing with node failure

Hash Partitioning

12

Diagram: rows of a table distributed across nodes by a hash function applied to the partitioning key
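A minimal sketch of the idea in Python, assuming a four-node cluster and CUSTKEY as the partitioning key (both illustrative):

NUM_NODES = 4  # illustrative cluster size

def node_for(custkey):
    # A row lives on the node given by hashing its partitioning key
    return hash(custkey) % NUM_NODES

partitions = {n: [] for n in range(NUM_NODES)}
for row in [(1, "Smith"), (2, "Jones"), (3, "Patel"), (42, "Lee")]:
    partitions[node_for(row[0])].append(row)

Queries that supply the partitioning key can be routed to a single node; queries on other columns must be sent to every node.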

Range Partitioning

13

Diagram: table rows divided across nodes by key ranges (A-H, I-P, Q-Z)
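The same placement decision for range partitioning, assuming the A-H / I-P / Q-Z split above is applied to the first letter of a name column (illustrative):

import bisect

BOUNDARIES = ["I", "Q"]  # node 0 holds A-H, node 1 holds I-P, node 2 holds Q-Z

def node_for(name):
    # bisect finds how many boundaries the key's leading letter has passed
    return bisect.bisect_right(BOUNDARIES, name[0].upper())

assert node_for("Harris") == 0
assert node_for("Jones") == 1
assert node_for("Smith") == 2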

Schema Partitioning

14

Diagram: whole tables (Table 1, Table 2) assigned to different nodes

Rebalancing Data

15

Data in proper balance

Data grows, performance drops

Add new nodes and disc

Redistribute data to new nodes

Dividing up the Work

16

Diagram: an Application sends work to a Coordinator Process, which divides it among several Worker Processes

DB Software on each node

17

Diagram: three nodes each run the DBMS with worker processes (W1, W2); the first node also runs coordinator C1 for App1 and the third runs coordinator C2 for App2, while the middle node runs only workers

Transaction Parallelism

18

Improves throughput

Inter-Transaction

Run many transactions in parallel, sharing the DB

Often referred to as Concurrency

Query Parallelism

19

Improves response times (lower latency)

Intra-operator (horizontal) parallelism

Operators decomposed into independent operator instances, which perform the same operation on different subsets of data

Inter-operator (vertical) parallelism

Operations are overlapped

Pipeline data from one stage to the next without materialisation
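A minimal single-process sketch of vertical parallelism using Python generators; each stage consumes rows as the previous stage produces them, with no intermediate materialisation (in a real system each stage would run on a different processor):

def scan(table):
    for row in table:          # produce rows one at a time
        yield row

def select(rows, predicate):
    for row in rows:           # filter rows as they stream past
        if predicate(row):
            yield row

table = [{"c1": float(i), "c2": i * 10} for i in range(100)]
for row in select(scan(table), lambda r: r["c1"] > 5.5):
    pass  # a downstream join or sort would consume rows here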

Intra-Operator Parallelism

20

Diagram: an SQL query is split into subset queries, each executed by a different processor

Intra-Operator Parallelism

21

Example query:

SELECT c1,c2 FROM t1 WHERE c1>5.5

Assumptions:

100,000 rows

Predicates eliminate 90% of the rows

I/O Shipping

22

Diagram: four workers each ship their 25,000 rows across the network to the node acting as coordinator and worker, which applies the predicate and produces the 10,000 (c1, c2) result rows

Function Shipping

23

Diagram: each of four workers applies the predicate locally and ships only its 2,500 qualifying rows across the network to the coordinator, which returns the 10,000 (c1, c2) result rows

Function Shipping Benefits

24

Database operations are performed where the data are, as far as possible

Network traffic is minimised

For basic database operators, code developed for serial implementations can be reused

In practice, a mixture of Function Shipping and I/O Shipping has to be employed
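To make the contrast concrete, a toy sketch of the two strategies for the earlier SELECT c1,c2 ... WHERE c1>5.5 example; the four workers, 25,000-row partitions and 10% selectivity are the figures from the slides, the data itself is made up:

# 4 workers, 25,000 rows each; exactly 10% of rows satisfy c1 > 5.5
partitions = [
    [{"c1": 6.0 if i % 10 == 0 else 1.0, "c2": i} for i in range(25_000)]
    for _ in range(4)
]
predicate = lambda r: r["c1"] > 5.5

# I/O shipping: workers ship raw rows; the coordinator applies the predicate
shipped_io = [row for part in partitions for row in part]   # 100,000 rows cross the network
result_io = [(r["c1"], r["c2"]) for r in shipped_io if predicate(r)]

# Function shipping: each worker filters locally and ships only matching rows
shipped_fn = [[(r["c1"], r["c2"]) for r in part if predicate(r)] for part in partitions]
result_fn = [row for part in shipped_fn for row in part]    # 10,000 rows cross the network

assert len(result_io) == len(result_fn) == 10_000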

Inter-Operator Parallelism

25

Diagram: a time axis showing Scan, Join and Sort stages overlapped (pipelined) rather than run consecutively

Resulting Parallelism

26

Diagram: a time axis showing the combined effect - multiple Scan, Join and Sort instances running in parallel and overlapped in time

Basic Operators

27

The DBMS has a set of basic operations which it can perform

The result of each operation is another table

Scan a table

sequentially or selectively via an index

to produce another smaller table

Join two tables together

Sort a table

Perform aggregation functions (sum, average, count)

These are combined to execute a query

The Volcano Architecture

28

The Volcano Query Processing System was devised by Goetz Graefe (Oregon)

Introduced the Exchange operator

Inserted between the steps of a query to:

Pipeline results

Direct streams of data to the next step(s), redistributing as necessary

Provides mechanism to support both vertical and horizontal parallelism

Informix ‘Dynamic Scalable Architecture’ based on Volcano
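A minimal sketch of what an exchange operator does, assuming hash-based redistribution between producer and consumer operator instances (the interface is illustrative, not Volcano's actual API, and a real exchange streams rows rather than buffering whole partitions):

def exchange(producers, num_consumers, hash_on):
    # Consume rows from every producer instance and redistribute them,
    # here by hashing a chosen column, to the consumer instances
    outputs = [[] for _ in range(num_consumers)]
    for producer in producers:
        for row in producer:
            outputs[hash(row[hash_on]) % num_consumers].append(row)
    return outputs

# Two parallel scans of customer feeding three hash-join instances
scan_a = [{"customer_id": 1, "county": "Hants"}, {"customer_id": 4, "county": "Dorset"}]
scan_b = [{"customer_id": 2, "county": "Kent"}, {"customer_id": 3, "county": "Hants"}]
join_inputs = exchange([scan_a, scan_b], num_consumers=3, hash_on="customer_id")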

Exchange Operators

29

Example query:

SELECT county, SUM(order_item)
FROM customer, order
WHERE order.customer_id = customer.customer_id
GROUP BY county
ORDER BY SUM(order_item)

Exchange Operators

30

Diagram: serial plan - SCAN of Customer and SCAN of Order feed a HASH JOIN, whose output is grouped (GROUP) and then sorted (SORT)

Exchange Operators

31

Diagram: two parallel SCANs of Customer feed an EXCHANGE, which redistributes their rows to three HASH JOIN instances

Exchange Operators

32

Diagram: a second EXCHANGE is added - three parallel SCANs of Order redistribute their rows to the same three HASH JOIN instances

Diagram: the complete parallel plan - SCANs of Customer and Order feed EXCHANGEs that redistribute rows to three HASH JOIN instances; further EXCHANGEs route the join output to two GROUP instances and then to a final SORT

33

Query Processing

Some Parallel Queries

35

Enquiry

Collocated Join

Directed Join

Broadcast Join

Repartitioned Join

Orders Database

36

CUSTOMER (CUSTKEY, C_NAME, C_NATION)

ORDER (ORDERKEY, DATE, CUSTKEY, SUPPKEY)

SUPPLIER (SUPPKEY, S_NAME, S_NATION)

Enquiry/Query

37

“How many customers live in the UK?”

Diagram: in response to data from the application, slave tasks SCAN and COUNT the multiple partitions of the customer table and return subcounts to the coordinator, which SUMs them and returns the result to the application
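A sketch of this enquiry as function-shipped counting, assuming CUSTOMER is partitioned across three nodes (partition contents are made up):

partitions = [
    [{"custkey": 1, "c_nation": "UK"}, {"custkey": 2, "c_nation": "FR"}],
    [{"custkey": 3, "c_nation": "UK"}, {"custkey": 4, "c_nation": "UK"}],
    [{"custkey": 5, "c_nation": "DE"}],
]

def slave_count(partition):
    # SCAN + COUNT performed where the data lives
    return sum(1 for row in partition if row["c_nation"] == "UK")

subcounts = [slave_count(p) for p in partitions]  # one subcount per node
total = sum(subcounts)                            # SUM at the coordinator
assert total == 3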

Collocated Join

38

“Which customers placed orders in July?”

Requires a JOIN of CUSTOMER and ORDER

Tables both partitioned on CUSTKEY (the same key) and therefore corresponding entries are on the same node

Diagram: on each node, SCANs of the local CUSTOMER and ORDER partitions feed a local JOIN; the per-node results are combined with a UNION

Directed Join

39

“Which customers placed orders in July?”

(tables have different keys)

Diagram: slave tasks SCAN ORDER and direct each row by CUSTKEY to the slave task holding the matching CUSTOMER partition, which performs the JOIN; results are returned via the coordinator to the application

ORDER partitioned on ORDERKEY, CUSTOMER partitioned on CUSTKEY

Retrieve rows from ORDER, then use ORDER.CUSTKEY to direct appropriate rows to nodes with CUSTOMER
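A sketch of the directed join, assuming CUSTOMER is hash placed on CUSTKEY and ORDER rows are routed to the owning node before a local join (placement rule and sample rows are illustrative):

NUM_NODES = 2

def owner(custkey):
    # CUSTOMER is hash partitioned on CUSTKEY; this locates the owning node
    return hash(custkey) % NUM_NODES

customer_parts = {n: [] for n in range(NUM_NODES)}
for cust in [{"custkey": 1, "c_name": "Smith"}, {"custkey": 2, "c_name": "Jones"}]:
    customer_parts[owner(cust["custkey"])].append(cust)

orders = [{"orderkey": 10, "custkey": 1, "date": "1995-07-03"},
          {"orderkey": 11, "custkey": 2, "date": "1995-07-19"}]

# Direct each ORDER row to the node holding the matching CUSTOMER rows
directed = {n: [] for n in range(NUM_NODES)}
for order in orders:
    directed[owner(order["custkey"])].append(order)

# Each node now joins purely locally
joined = [(c, o) for n in range(NUM_NODES)
          for o in directed[n] for c in customer_parts[n]
          if c["custkey"] == o["custkey"]]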

Broadcast Join

40

“Which customers and suppliers are in the same country?”

Diagram: slave tasks SCAN SUPPLIER and BROADCAST its rows to every node holding a CUSTOMER partition, where the JOIN with the local CUSTOMER rows is performed; results are returned via the coordinator to the application

SUPPLIER partitioned on SUPPKEY, CUSTOMER on CUSTKEY

Join required on *_NATION

Send all SUPPLIER rows to each CUSTOMER node

Repartitioned Join

41

“Which customers and suppliers are in the same country?”

SUPPLIER partitioned on SUPPKEY, CUSTOMER on CUSTKEY

Join required on *_NATION. Repartition both tables on *_NATION to localise and minimise the join effort

Diagram: one set of slave tasks SCANs CUSTOMER and another SCANs SUPPLIER; both repartition their rows on *_NATION to a third set of slave tasks, which perform the JOIN and return results via the coordinator
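A sketch of the repartitioned join: both inputs are re-hashed on the join attribute (*_NATION) so that matching rows land on the same join task before a purely local join (the number of join tasks and the rows are illustrative):

NUM_JOIN_TASKS = 3

def repartition(rows, key):
    # Re-hash rows on the join attribute so matching values co-locate
    parts = [[] for _ in range(NUM_JOIN_TASKS)]
    for row in rows:
        parts[hash(row[key]) % NUM_JOIN_TASKS].append(row)
    return parts

customers = [{"custkey": 1, "c_nation": "UK"}, {"custkey": 2, "c_nation": "FR"}]
suppliers = [{"suppkey": 9, "s_nation": "UK"}, {"suppkey": 8, "s_nation": "DE"}]

cust_parts = repartition(customers, "c_nation")
supp_parts = repartition(suppliers, "s_nation")

# Each join task joins only its own pair of partitions
matches = [(c, s) for cp, sp in zip(cust_parts, supp_parts)
           for c in cp for s in sp if c["c_nation"] == s["s_nation"]]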

Query Handling

42

Critical aspect of Parallel DBMS

User wants to issue simple SQL, and not be concerned with parallel aspects

DBMS needs structures and facilities to parallelise queries

Good optimiser technology is essential

As the database grows, the effects (good or bad) of the optimiser become increasingly significant

Further Aspects

43

A single transaction may update data in several different places

Multiple transactions may be using the same (distributed) tables simultaneously

One or several nodes could fail

Requires concurrency control and recovery across multiple nodes for:

Locking and deadlock detection

Two-phase commit to ensure ‘all or nothing’

Deadlock Detection

Locking and Deadlocks

45

With Shared Nothing architecture, each node is responsible for locking its own data

No global locking mechanism

However:

T1 locks item A on Node 1 and wants item B on Node 2

T2 locks item B on Node 2 and wants item A on Node 1

Distributed Deadlock

Resolving Deadlocks

46

One approach

Timeouts

Timeout T2, after wait exceeds a certain interval

Interval may need random element to avoid ‘chatter’

i.e. both transactions give up at the same time and then try again

Rollback T2 to let T1 proceed

Restart T2, which can now complete

Resolving Deadlocks

47

More sophisticated approach (DB2)

Each node maintains a local ‘wait-for’ graph

Distributed deadlock detector (DDD) runs at the catalogue node for each database

Periodically, all nodes send their graphs to the DDD

DDD records all locks found in wait state

Transaction becomes a candidate for termination if found in same lock wait state on two successive iterations
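A sketch of the global check, assuming each node reports its local wait-for edges as (waiter, holder) pairs and the detector simply looks for a cycle in the merged graph (the two-successive-iterations refinement is omitted):

def find_deadlocked(local_graphs):
    # Merge the per-node wait-for graphs into one global graph
    edges = {}
    for graph in local_graphs:
        for waiter, holder in graph:
            edges.setdefault(waiter, set()).add(holder)

    def reachable(start, target, seen):
        for nxt in edges.get(start, ()):
            if nxt == target or (nxt not in seen and reachable(nxt, target, seen | {nxt})):
                return True
        return False

    # A transaction that can reach itself is on a deadlock cycle
    return [t for t in edges if reachable(t, t, set())]

# Node 1: T2 waits for T1 (item A); Node 2: T1 waits for T2 (item B)
print(find_deadlocked([[("T2", "T1")], [("T1", "T2")]]))  # ['T2', 'T1']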

Reliability

Reliability

49

We wish to preserve the ACID properties for parallelised transactions

Isolation is taken care of by 2PL protocol

Isolation implies Consistency

Durability can be taken care of node-by-node, with proper logging and recovery routines

Atomicity is the hard part. We need to commit all parts of a transaction, or abort all parts

Two-phase commit protocol (2PC) is used to ensure that Atomicity is preserved

Two-Phase Commit (2PC)

Distinguish between:

The global transaction

The local transactions into which the global transaction is decomposed

Global transaction is managed by a single site, known as the coordinator

Local transactions may be executed on separate sites, known as the participants

50

Phase 1: Voting

Coordinator sends “prepare T” message to all participants

Participants respond with either “vote-commit T” or “vote-abort T”

Coordinator waits for participants to respond within a timeout period

Phase 2: Decision

If all participants return “vote-commit T” (i.e. vote to commit), send “commit T” to all participants. Wait for acknowledgements within the timeout period.

If any participant returns “vote-abort T”, send “abort T” to all participants. Wait for acknowledgements within the timeout period.

When all acknowledgements have been received, the transaction is completed.

If a site does not acknowledge, resend the global decision until it is acknowledged.
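A minimal sketch of the coordinator's side of 2PC, assuming each participant exposes prepare/commit/abort calls; timeouts and retries are reduced to comments (all names are illustrative):

def two_phase_commit(participants, tid):
    # Phase 1: voting - send "prepare T" and collect votes
    votes = [p.prepare(tid) for p in participants]
    decision = "commit" if all(v == "vote-commit" for v in votes) else "abort"

    # Phase 2: decision - broadcast the global decision and collect acks
    for p in participants:
        if decision == "commit":
            p.commit(tid)   # "commit T"
        else:
            p.abort(tid)    # "abort T"
        # a real coordinator resends the decision until it is acknowledged
    return decision

class Participant:
    # Toy participant that always votes to commit
    def prepare(self, tid): return "vote-commit"
    def commit(self, tid): pass
    def abort(self, tid): pass

assert two_phase_commit([Participant(), Participant()], "T1") == "commit"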

Normal Operation

53

Diagram: message exchange between coordinator (C) and participant (P) - C sends “prepare T”, P replies “vote-commit T”; once “vote-commit T” has been received from all participants, C sends “commit T” and P replies with an ack

Parallel Utilities

Parallel Utilities

55

Ancillary operations can also exploit the parallel hardware

Parallel Data Loading/Import/Export

Parallel Index Creation

Parallel Rebalancing

Parallel Backup

Parallel Recovery