Slide 1: CC5212-1 Procesamiento Masivo de Datos, Otoño 2018
Lecture 8: NoSQL: Overview
Aidan Hogan
aidhog@gmail.com

Slide 2: Hadoop/MapReduce/Pig/Spark: Processing Un/Structured Information

Slide 3: Information Retrieval: Storing Unstructured Information

Slide 4: Storing Structured Information???

Slide 5: Big Data: Storing Structured Information

Slide 6: Relational Databases

Slide 7: Relational Databases: One Size Fits All?
Slide 9: RDBMS: Performance Overheads
- Structured Query Language (SQL): a declarative language with lots of rich features; difficult to optimise!
- Atomicity, Consistency, Isolation, Durability (ACID): makes sure your database stays correct, even if there's a lot of traffic!
- Transactions incur a lot of overhead: multi-phase locks, multi-versioning, write-ahead logging
- Distribution is not straightforward

Slide 10: Transactional overhead: the cost of ACID
- 640 transactions per second for a system with full transactional support (ACID)
- 12,700 transactions per second for the same system without logs, transactions, or lock scheduling
Slide 11: RDBMS: Complexity

Slide 12: Alternatives to Relational Databases for Big Data?

Slide 13: NoSQL
Anybody know anything about NoSQL?
Slide 14: Many types of NoSQL stores
- Using the relational model: Relational Databases, with a focus on scalability to compete with NoSQL while maintaining ACID
- Batch analysis of data
- Not using the relational model ("Not only SQL"): real-time stores, documents, maps, column-oriented stores, graph-structured data, decentralised stores, cloud storage

Slide 15: http://db-engines.com/en/ranking
Slide 16: NoSQL

Slide 17: NoSQL: Not only SQL
- Distributed! Sharding: splitting data over servers "horizontally"; replication
- Different guarantees: typically not ACID
- Often simpler languages than SQL; simpler ad hoc APIs; more work for the application
- Different flavours (for different scenarios): different CAP emphases, different scalability profiles, different query functionality, different data models
Slide 18: Limitations of distributed computing: the CAP Theorem

Slide 19: But first … ACID
For traditional (non-distributed) databases …
- Atomicity: transactions are all or nothing: fail cleanly
- Consistency: doesn't break constraints/rules
- Isolation: parallel transactions act as if sequential
- Durability: the system remembers changes
Slide 20: What is CAP?
Three guarantees a distributed system could make:
- Consistency: all nodes have a consistent view of the system
- Availability: every read/write is acted upon
- Partition tolerance: the system works even if messages are lost
C and A in CAP are not the same as C and A in ACID!
Slide 21: A Distributed System (with Replication)
[Diagram: four machines covering the key ranges A–E, F–J, K–S, and T–Z, with a second replica of each range.]
Slide 22: Consistency
[Diagram: both replicas hold the same view: "There are 891 users in 'M'."]
Slide 23: Availability
[Diagram: a client asks "How many users start with 'M'?" and receives an answer: 891.]
Slide 24: Partition-Tolerance
[Diagram: the query "How many users start with 'M'?" is still answered (891), even though messages between the replicas are lost.]
Slide 25: The CAP Question
Can a distributed system guarantee consistency (all nodes have the same up-to-date view), availability (every read/write is acted upon), and partition tolerance (the system works if messages are lost) at the same time?
What do you think?

Slide 26: The CAP Answer
Slide 27: The CAP Theorem
A distributed system cannot guarantee consistency (all nodes have the same up-to-date view), availability (every read/write is acted upon), and partition tolerance (the system works if messages are lost) at the same time!
Slide 28: The CAP "Proof"
[Diagram: a partition separates the replicas. One side applies an update ("There are 892 users in 'M'") that cannot reach the other side, which still holds "There are 891 users in 'M'"; the client is answered 891.]
Slide 29: The CAP "Proof" (in boring words)
Consider machines m1 and m2 on either side of a partition:
- If an update is allowed on m2 (Availability), then m1 cannot see the change (loses Consistency).
- To make sure that m1 and m2 have the same, up-to-date view (Consistency), neither m1 nor m2 can accept any requests/updates (loses Availability).
- Thus, only when m1 and m2 can communicate (lose Partition tolerance) can Availability and Consistency be guaranteed.
Slide 30: The CAP Triangle
C, A, P: choose two.
Slide 31: CAP Systems (no intersection)
- CA: guarantees to give a correct response, but only while the network works fine (Centralised / Traditional)
- CP: guarantees responses are correct even if there are network failures, but a response may fail (Weak availability)
- AP: always provides a "best-effort" response, even in the presence of network failures (Eventual consistency)
Slide 32: CA System
[Diagram: the update reaches both replicas, so both hold "There are 892 users in 'M'" and the client is answered 892; this works only while the network is fine.]
Slide 33: CP System
[Diagram: under a partition, one replica has "There are 892 users in 'M'" while the other still has 891; rather than risk an inconsistent answer, the query returns an error.]
Slide 34: AP System
[Diagram: under a partition, one replica has "There are 892 users in 'M'" while the other still has 891; the client is given the best-effort (stale) answer 891.]
Slide 35: BASE (AP)
- Basically Available: pretty much always "up"
- Soft State: replicated, cached data
- Eventual Consistency: stale data tolerated, for a while
In what way does Twitter act as a BASE (AP) system?
Slide 36: High fan-out creates a "partition"
Users may see retweets of celebrity tweets before the original tweet. Later, when the original tweet arrives, the timeline will be reordered and made consistent.
Slide 37: CAP in practical distributed systems
Fix P; choose a trade-off point between C and A.

Slide 38: Partition Tolerance

Slide 39: Faults
Slide 40: Fail–Stop Fault
A machine fails to respond or times out (often hardware or load). We need at least f + 1 replicated machines, where f = number of fail-stop failures.

[Example: a replicated word count; if one replica fails, another can still answer:]
Word  Count
de    4.575.144
la    2.160.185
en    2.073.216
el    1.844.613
y     1.479.936
…     …
Slide 41: Byzantine Fault
A machine responds incorrectly/maliciously.

[Example: two replicas return the counts (de 4.575.144; la 2.160.185; en 2.073.216; el 1.844.613; y 1.479.936; …), but a third returns different counts (el 4.575.144; po 2.160.185; sé 2.073.216; ni 1.844.613; al 1.479.936; …): which is correct?]

How many working machines do we need, in the general case, to be robust against Byzantine faults?
Slide 42: Byzantine Fault
A machine responds incorrectly/maliciously. We need at least 2f + 1 replicated machines, where f = number of (possibly Byzantine) failures: the f + 1 correct machines then always outvote the f faulty ones.

[Example: with f = 1, two replicas agree on the correct counts (de 4.575.144; …) and outvote the one faulty replica (el 4.575.144; …).]
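The 2f + 1 bound can be sketched in a few lines of Python. The `robust_answer` helper below is hypothetical (not from the slides): it simply takes a majority vote over the replicas' responses.

```python
# A minimal majority-vote sketch of why 2f + 1 replicas tolerate f
# Byzantine machines: the f + 1 correct responses always form a majority.
from collections import Counter

def robust_answer(responses):
    """Return the answer reported by a strict majority of replicas, else None."""
    answer, votes = Counter(responses).most_common(1)[0]
    return answer if votes > len(responses) / 2 else None

f = 1                                            # tolerate one Byzantine machine
correct, faulty = "de 4.575.144", "el 4.575.144"
responses = [correct] * (f + 1) + [faulty] * f   # 2f + 1 machines in total
assert robust_answer(responses) == correct
assert robust_answer([correct, faulty]) is None  # only 2f machines: ambiguous
```

With only 2f machines, f faulty replicas can tie the vote, which is why 2f + 1 is needed.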
Slide 43: Distributed Consensus

Slide 44: Distributed Consensus
Colour of the dress?

Slide 45: Distributed Consensus
Strong consensus: all nodes need to agree.
Votes: Blue, Blue, Blue, Blue, Blue → Consensus.
Slide 46: Distributed Consensus
Strong consensus: all nodes need to agree.
Votes: Blue, Blue, Blue, White, Blue → No consensus.
Slide 47: Distributed Consensus
Majority consensus: a majority of nodes need to agree.
Votes: Blue, Blue, Blue, White, White → Consensus.
Slide 48: Distributed Consensus
Majority consensus: a majority of nodes need to agree.
Votes: Blue, Blue, White, White, White → Consensus.
Slide 49: Distributed Consensus
Majority consensus: a majority of nodes need to agree.
Votes: Blue, Blue, Green, White, White → No consensus.
Slide 50: Distributed Consensus
Plurality consensus: a plurality of nodes need to agree.
Votes: Blue, Blue, Green, White, Orange → Consensus.
Slide 51: Distributed Consensus
Plurality consensus: a plurality of nodes need to agree.
Votes: Blue, Blue, Green, White, White → No consensus.
Slide 52: Distributed Consensus
Quorum consensus: n nodes need to agree.
Votes: Blue, Blue, Blue, White, White → n = 3: Consensus; n = 4: No consensus.
Slide 53: Distributed Consensus
Quorum consensus: n nodes need to agree.
Votes: Blue, Blue, Green, White, White → n = 2: Consensus (the first 2 machines asked, but not unique!).
Slide 54: Distributed Consensus
Quorum consensus: n nodes need to agree.
Votes: Blue, Blue, Green, White, White
Value of n needed for unique consensus with N nodes? n > N/2.
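Why n > N/2 gives unique consensus can be checked by brute force: any two quorums larger than half the nodes must share a node, so two different values cannot both reach quorum at once. A small sketch (hypothetical helper names, not part of any system on the slides):

```python
# Brute-force check that quorums of size n > N/2 always intersect,
# while smaller quorums can be disjoint (allowing two "consensuses").
from itertools import combinations

def all_quorums_intersect(N, n):
    """True if every pair of n-node quorums out of N nodes shares a node."""
    quorums = list(combinations(range(N), n))
    return all(set(a) & set(b) for a in quorums for b in quorums)

N = 5
assert all_quorums_intersect(N, 3)      # n > N/2: unique consensus possible
assert not all_quorums_intersect(N, 2)  # n = 2: two disjoint quorums exist
```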
Slide 55: Distributed Consensus
Consensus off: take the first answer.
Votes: Blue, Blue, Green, White, Orange → Consensus.
Slide 56: Distributed Consensus
From CP to AP:
- Strong consensus: all nodes need to agree
- Majority consensus: a majority of nodes need to agree
- Plurality consensus: a plurality of nodes need to agree
- Quorum consensus: a "fixed" n nodes need to agree
- Consensus off: take the first answer
CP vs. AP?
Slide 57: Distributed Consensus
From more replication to less replication:
- Strong consensus: all nodes need to agree
- Majority consensus: a majority of nodes need to agree
- Plurality consensus: a plurality of nodes need to agree
- Quorum consensus: a "fixed" n nodes need to agree
- Consensus off: take the first answer
Scale?
Slide 58: Distributed Consensus
- Strong consensus: all nodes need to agree
- Majority consensus: a majority of nodes need to agree
- Plurality consensus: a plurality of nodes need to agree
- Quorum consensus: a "fixed" n nodes need to agree
- Consensus off: take the first answer
The choice is application-dependent: many NoSQL stores allow you to choose the level of consensus/replication.
Slide 59: NoSQL: KEY–VALUE STORE
Slide 60: The Database Landscape
- Using the relational model: Relational Databases, with a focus on scalability to compete with NoSQL while maintaining ACID
- Batch analysis of data
- Not using the relational model ("Not only SQL"): real-time stores, document stores (semi-structured values), maps, column-oriented stores, graph-structured data, in-memory stores, cloud storage
Slide 61: Key–Value Store Model
It's just a map / associative array / dictionary:
- put(key, value)
- get(key)
- delete(key)

Key                  Value
Afghanistan          Kabul
Albania              Tirana
Algeria              Algiers
Andorra la Vella     Andorra la Vella
Angola               Luanda
Antigua and Barbuda  St. John's
…                    …
Slide 62: But You Can Do a Lot With a Map
… actually you can model any data in a map (but possibly with a lot of redundancy and inefficient lookups if unsorted).

Key                  Value
country:Afghanistan  capital@city:Kabul,continent:Asia,pop:31108077#2011
country:Albania      capital@city:Tirana,continent:Europe,pop:3011405#2013
city:Kabul           country:Afghanistan,pop:3476000#2013
city:Tirana          country:Albania,pop:3011405#2013
user:10239           basedIn@city:Tirana,post:{103,10430,201}
…                    …
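A minimal sketch of this idea in Python, using hypothetical put/get/delete helpers over a plain dict and the same string encoding as the example table; note that all value parsing is left to the application.

```python
# Modelling arbitrary data in a plain map: three operations only.
store = {}

def put(key, value):
    store[key] = value

def get(key):
    return store.get(key)

def delete(key):
    store.pop(key, None)

put("country:Albania",
    "capital@city:Tirana,continent:Europe,pop:3011405#2013")
put("city:Tirana", "country:Albania,pop:3011405#2013")

# The application must parse values itself, e.g. to follow the
# capital@city "link" from a country to its city entry:
fields = dict(f.split(":", 1) for f in get("country:Albania").split(","))
assert fields["capital@city"] == "Tirana"
assert get("city:" + fields["capital@city"]) == "country:Albania,pop:3011405#2013"
```

Following a relationship takes two lookups and manual string parsing: the redundancy and extra work the slide warns about.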
Slide 63: The Case of Amazon

Slide 64: The Amazon Scenario
Product listings: prices, details, stock

Slide 65: The Amazon Scenario
Customer info: shopping cart, account, etc.

Slide 66: The Amazon Scenario
Recommendations, etc.

Slide 67: The Amazon Scenario
Amazon customers

Slide 68: The Amazon Scenario

Slide 69: The Amazon Scenario
Databases struggling … but many Amazon services don't need SQL (a simple map is often enough), or even transactions, strong consistency, etc.
Slide 70: Key–Value Store: Amazon Dynamo(DB)
Goals:
- Scalability (able to grow)
- High availability (reliable)
- Performance (fast)
Don't need full SQL; don't need full ACID.
Slide 71: Key–Value Store: Distribution
How might we distribute a key–value store over multiple machines?

Slide 72: Key–Value Store: Distribution
What happens if a machine leaves or joins afterwards? How can we avoid rehashing everything?
Slide 73: Consistent Hashing
Avoid re-hashing everything:
- Hash using a ring
- Each machine picks n pseudo-random points on the ring
- A machine is responsible for the arc after its point
- If a machine leaves, its range moves to the previous machine
- If a machine joins, it picks new points
- Objects are mapped to the ring
How many keys (on average) would need to be moved if a machine joins or leaves?
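The ring described above can be sketched in a few lines of Python (hypothetical `Ring` class; MD5 stands in for Dynamo's 128-bit hash). The final check illustrates the key property: when a machine leaves, only the keys on its own arcs move.

```python
# A minimal consistent-hashing sketch: machines pick points on a ring;
# each key belongs to the machine whose point follows the key's hash.
import bisect, hashlib

def h(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, machines, points_per_machine=3):
        # Each machine picks several pseudo-random points on the ring.
        self.ring = sorted((h(f"{m}#{i}"), m)
                           for m in machines
                           for i in range(points_per_machine))

    def lookup(self, key):
        # First point clockwise of the key's hash (wrapping around).
        points = [p for p, _ in self.ring]
        i = bisect.bisect_right(points, h(key)) % len(self.ring)
        return self.ring[i][1]

ring = Ring(["m1", "m2", "m3"])
assert ring.lookup("country:Albania") in {"m1", "m2", "m3"}

# Remove m3: keys that m3 did not own keep their old owner.
smaller = Ring(["m1", "m2"])
moved = [k for k in ("a", "b", "c", "d")
         if ring.lookup(k) != "m3" and ring.lookup(k) != smaller.lookup(k)]
assert moved == []
```

On average only about K/M keys move when one of M machines (holding K keys in total) joins or leaves, versus nearly all keys under naive modulo hashing.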
Slide 74: Amazon Dynamo: Hashing
Consistent hashing (128-bit MD5).
Slide 75: Amazon Dynamo: Replication
- A set replication factor (e.g., 3)
- Commonly primary/secondary replicas
- A primary replica is elected from the secondary replicas in the case of failure of the primary
[Diagram: key–value pairs replicated around the ring across buckets A1, B1, C1, D1, E1.]
Slide77Amazon Dynamo: Object VersioningObject Versioning (per bucket)PUT doesn’t overwrite: pushes versionGET returns most recent version
Slide78Amazon Dynamo: Object VersioningObject Versioning (per bucket)DELETE doesn’t wipeGET will return not found
Slide79Amazon Dynamo: Object VersioningObject Versioning (per bucket)GET by version
Slide80Amazon Dynamo: Object VersioningObject Versioning (per bucket)PERMANENT DELETE by version … wiped
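The four versioning behaviours on these slides can be sketched with a small hypothetical class (illustrative only, not Dynamo's API): PUT pushes a version, DELETE pushes a tombstone, GET by version still works, and PERMANENT DELETE wipes.

```python
# Per-key object versioning: a history of (version, value) pairs.
TOMBSTONE = object()   # marker pushed by a logical DELETE

class VersionedKV:
    def __init__(self):
        self.versions = {}            # key -> list of (version, value)
        self.counter = 0

    def put(self, key, value):
        self.counter += 1
        self.versions.setdefault(key, []).append((self.counter, value))
        return self.counter

    def get(self, key, version=None):
        history = self.versions.get(key, [])
        if version is not None:       # GET by version
            return next((v for n, v in history if n == version), None)
        if not history or history[-1][1] is TOMBSTONE:
            return None               # deleted or never written
        return history[-1][1]         # most recent version

    def delete(self, key):
        self.put(key, TOMBSTONE)      # DELETE doesn't wipe

    def permanent_delete(self, key, version):
        self.versions[key] = [(n, v) for n, v in self.versions[key]
                              if n != version]

kv = VersionedKV()
v1 = kv.put("cart:42", ["book"])
v2 = kv.put("cart:42", ["book", "pen"])
assert kv.get("cart:42") == ["book", "pen"]
kv.delete("cart:42")
assert kv.get("cart:42") is None                  # GET returns "not found"
assert kv.get("cart:42", version=v1) == ["book"]  # … but GET by version works
```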
Slide 81: Amazon Dynamo: Model
A named table with a primary key and a value; the primary key is hashed/unordered.

Countries
Primary Key  Value
Afghanistan  capital:Kabul,continent:Asia,pop:31108077#2011
Albania      capital:Tirana,continent:Europe,pop:3011405#2013
…            …

Cities
Primary Key  Value
Kabul        country:Afghanistan,pop:3476000#2013
Tirana       country:Albania,pop:3011405#2013
…            …
Slide 82: Amazon Dynamo: CAP
Two options for each table:
- AP: eventual consistency, high availability
- CP: strong consistency, lower availability
What's a CP system again? What's an AP system again?
Slide 83: Amazon Dynamo: Consistency
Gossiping:
- Keep-alive messages sent between nodes with state
- Dynamo is largely decentralised (no master node)
Quorums:
- Multiple nodes are responsible for a read (R) or write (W)
- At least R or W nodes must acknowledge for success
- Higher R or W = higher consistency, lower availability
Hinted handoff:
- For transient failures
- A node "covers" for another node while it is down
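The R/W quorum trade-off can be simulated in a few lines (hypothetical helpers; the slides do not give code). The key property: if R + W > N, every read quorum overlaps the last write quorum, so at least one contacted replica holds the newest version.

```python
# Read/write quorums over N replicas: keep the highest version seen.
import random

N, W, R = 5, 3, 3
replicas = [{"value": None, "version": 0} for _ in range(N)]

def write(value, version):
    # W randomly chosen replicas acknowledge the write.
    for i in random.sample(range(N), W):
        replicas[i] = {"value": value, "version": version}

def read():
    # Ask R randomly chosen replicas; return the freshest answer.
    answers = [replicas[i] for i in random.sample(range(N), R)]
    return max(answers, key=lambda a: a["version"])["value"]

write("891 users", 1)
write("892 users", 2)
assert read() == "892 users"   # guaranteed here, since R + W > N
```

Lowering R or W below this threshold raises availability (fewer acknowledgements needed) but allows stale reads: the CP-vs-AP dial on the previous slide.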
Slide 84: Amazon Dynamo: Consistency
Vector clock: a list of pairs indicating a node and a time stamp; used to track branches of revisions.
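A minimal vector-clock sketch (hypothetical helper names; node names Sx/Sy/Sz are illustrative): each clock maps a node to a counter, and comparing clocks reveals whether one revision descends from another or the two are conflicting branches.

```python
# Vector clocks: per-node counters attached to each revision of a value.
def increment(clock, node):
    clock = dict(clock)
    clock[node] = clock.get(node, 0) + 1
    return clock

def descends(a, b):
    """True if clock a has seen everything clock b has."""
    return all(a.get(n, 0) >= t for n, t in b.items())

def conflict(a, b):
    """Neither descends from the other: concurrent branches."""
    return not descends(a, b) and not descends(b, a)

v0 = increment({}, "Sx")      # value first written at node Sx
v1 = increment(v0, "Sy")      # later updated via node Sy
v2 = increment(v0, "Sz")      # concurrently updated via node Sz
assert descends(v1, v0)       # v1 is simply a newer revision of v0
assert conflict(v1, v2)       # v1 and v2 are conflicting branches
```

Detecting a conflict is exactly the case where Dynamo returns multiple versions and asks the application to reconcile them.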
Slide 85: Amazon Dynamo: Consistency
Two versions of one shopping cart: the application knows best (… and must support multiple versions being returned).
How best to merge multiple conflicting versions of a value (known as reconciliation)?
Slide 86: Amazon Dynamo: Consistency
How can we efficiently verify that two copies of a block of data are the same (and find where the differences are)?

Slide 87: Amazon Dynamo: Merkle Trees
A Merkle tree is a hash tree:
- Leaf nodes compute hashes from the data
- Non-leaf nodes hold the hashes of their children
- Differences between two trees can be found level by level
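A minimal Merkle-tree sketch (hypothetical helper names): comparing the roots first answers "are the copies identical?" with one hash comparison; only when the roots differ do we look further down to locate the differing blocks.

```python
# A hash tree over data blocks (assumes a power-of-two block count).
import hashlib

def h(data):
    return hashlib.sha256(data).hexdigest()

def merkle_levels(blocks):
    """Bottom-up list of levels, each a list of hashes; last level = root."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def diff_blocks(a, b):
    """Indices of data blocks whose hashes differ."""
    la, lb = merkle_levels(a), merkle_levels(b)
    if la[-1] == lb[-1]:
        return []        # roots match: copies are identical
    return [i for i, (x, y) in enumerate(zip(la[0], lb[0])) if x != y]

copy1 = [b"aa", b"bb", b"cc", b"dd"]
copy2 = [b"aa", b"bb", b"XX", b"dd"]
assert diff_blocks(copy1, copy1) == []
assert diff_blocks(copy1, copy2) == [2]
```

Two replicas can thus exchange O(log n) hashes down the differing path rather than shipping whole blocks to compare them.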
Slide 88: Aside: Merkle Trees also used in …

Slide 89: Read More …

Slide 90: Other Key–Value Stores

Slide 91: Other Key–Value Stores

Slide 92: Other Key–Value Stores

Slide 93: Other Key–Value Stores

Slide 94: Other Key–Value Stores
Evolved into a tabular store …

Slide 95: Tabular / Column Family
Slide 96: Key–Value = a Distributed Map; Tabular = Multi-dimensional Maps

Countries (key–value):
Primary Key  Value
Afghanistan  capital:Kabul,continent:Asia,pop:31108077#2011
Albania      capital:Tirana,continent:Europe,pop:3011405#2013
…            …

Countries (tabular):
Primary Key  capital  continent  pop-value  pop-year
Afghanistan  Kabul    Asia       31108077   2011
Albania      Tirana   Europe     3011405    2013
…            …        …          …          …
Slide 97: Bigtable: The Original Whitepaper
By the MapReduce authors.

Slide 98: Bigtable used for …
Slide 99: Bigtable: Data Model
"a sparse, distributed, persistent, multi-dimensional, sorted map."
- sparse: not all values form a dense square
- distributed: lots of machines
- persistent: disk storage (GFS)
- multi-dimensional: values with columns
- sorted: sorted lexicographically by row key
- map: look up a key, get a value
Slide 100: Bigtable: in a nutshell
(row, column, time) → value
- row: a row id string, e.g., "Afghanistan"
- column: a column name string, e.g., "pop-value"
- time: an integer (64-bit) version time-stamp, e.g., 18545664
- value: the element of the cell, e.g., "31120978"
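The three-dimensional map can be sketched directly (hypothetical put/get helpers over a Python dict keyed on (row, column, time)); by default a lookup returns the most recent version of a cell.

```python
# (row, column, time) -> value, with "latest version wins" on reads.
cells = {}

def put(row, column, time, value):
    cells[(row, column, time)] = value

def get(row, column, time=None):
    if time is not None:              # exact-version lookup
        return cells.get((row, column, time))
    versions = [(t, v) for (r, c, t), v in cells.items()
                if r == row and c == column]
    return max(versions)[1] if versions else None

put("Afghanistan", "pop-value", 2, "31120978")
put("Afghanistan", "pop-value", 4, "31108077")
assert get("Afghanistan", "pop-value") == "31108077"          # latest (t=4)
assert get("Afghanistan", "pop-value", time=2) == "31120978"  # older version
```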
Slide 101: Bigtable: in a nutshell
(row, column, time) → value, e.g., (Afghanistan, pop-value, t4) → 31108077

Primary Key  capital     continent   pop-value                                  pop-year
Afghanistan  t1: Kabul   t1: Asia    t1: 31143292; t2: 31120978; t4: 31108077   t1: 2009; t4: 2011
Albania      t1: Tirana  t1: Europe  t1: 2912380; t3: 3011405                   t1: 2010; t3: 2013
…            …           …           …                                          …

Lookups are by primary key value only!
Slide 102: Bigtable: Sorted Keys
Benefits of sorted vs. hashed keys? Range queries and …

Primary Key       capital     pop-value                                  pop-year
Asia:Afghanistan  t1: Kabul   t1: 31143292; t2: 31120978; t4: 31108077   t1: 2009; t4: 2011
Asia:Azerbaijan   …           …                                          …
…                 …           …                                          …
Europe:Albania    t1: Tirana  t1: 2912380; t3: 3011405                   t1: 2010; t3: 2013
Europe:Andorra    …           …                                          …
…                 …           …                                          …

(The table is SORTED by primary key.)
Slide 103: Bigtable: Tablets
[The same sorted table as before, with the Asia:* rows forming one tablet (ASIA) and the Europe:* rows another (EUROPE).]
Benefits of sorted vs. hashed keys? Range queries and … locality of processing.
Slide 104: A real-world example of locality/sorting

Primary Key                language   title                                        links
com.imdb                   t1: en     t1: IMDb Home; t2: IMDB - Movies; t4: IMDb   t1: …; t4: …
com.imdb/title/tt2724064/  t1: en     t2: Sharknado                                t2: …
com.imdb/title/tt3062074/  t1: en     t2: Sharknado II                             t2: …
…                          …          …                                            …
org.wikipedia              t1: multi  t1: Wikipedia; t3: Wikipedia Home            t1: …; t3: …
org.wikipedia.ace          t1: ace    t1: Wikipèdia bahsa Acèh                     …
…                          …          …                                            …
Slide 105: Bigtable: Distribution
Split by tablet: horizontal range partitioning.
Slide 106: Bigtable: Column Families
- Group logically similar columns together
- Accessed efficiently together
- Access control and storage at the column-family level
- If of the same type, can be compressed

Primary Key       pol:capital  demo:pop-value                             demo:pop-year
Asia:Afghanistan  t1: Kabul    t1: 31143292; t2: 31120978; t4: 31108077   t1: 2009; t4: 2011
Asia:Azerbaijan   …            …                                          …
…                 …            …                                          …
Europe:Albania    t1: Tirana   t1: 2912380; t3: 3011405                   t1: 2010; t3: 2013
Europe:Andorra    …            …                                          …
…                 …            …                                          …
Slide 107: Bigtable: Versioning
Similar to Amazon Dynamo:
- Cell-level versioning
- 64-bit integer time stamps
- Inserts push down the current version
- Lazy deletions / periodic garbage collection
- Two options: keep the last n versions, or keep versions newer than a time t
Slide 108: Bigtable: SSTable Map Implementation
- 64k blocks (default) with an index in the footer (on GFS)
- The index is loaded into memory and allows for seeks
- Can be split or merged, as needed
- Writes?

[Diagram: the sorted table is stored in blocks; the footer index maps each block's first key to its offset:
Block 0 / Offset 0 / Asia:Afghanistan
Block 1 / Offset 65536 / Asia:Japan]
Slide 109: Bigtable: Buffered/Batched Writes
- WRITEs go to an in-memory Memtable, and to a tablet log on GFS
- READs merge-sort the Memtable with the on-disk SSTables (SSTable1, SSTable2, SSTable3)
What's the danger?
Slide 110: Bigtable: Redo Log
If the machine fails, the Memtable is redone from the tablet log.
Slide 111: Bigtable: Minor Compaction
When full, the Memtable is written out to GFS as a new SSTable (SSTable4), and a fresh Memtable is started.
Problem with performance?
Slide 112: Bigtable: Merge Compaction
Merge some of the SSTables (and the Memtable) into one SSTable.
Slide 113: Bigtable: Major Compaction
Merge all SSTables (and the Memtable) into one SSTable. Makes reads more efficient!
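The read path and compaction idea from the last few slides can be sketched with sorted runs of (key, value) pairs (hypothetical structures, not Bigtable's actual format): a read consults the Memtable first and then each SSTable, while a compaction merge-sorts all runs into one, keeping only the newest version of each key.

```python
# Sorted (key, value) runs: index 0 is the newest source (the Memtable).
import heapq

memtable = [("Asia:Afghanistan", "pop:31108077")]
sstables = [[("Asia:Japan", "pop:127250000")],
            [("Asia:Afghanistan", "pop:31120978"),   # older version
             ("Europe:Albania", "pop:3011405")]]

def read(key, runs):
    """Check runs newest-first; the first hit is the current value."""
    for run in runs:
        for k, v in run:
            if k == key:
                return v
    return None

def compact(runs):
    """Merge all sorted runs into one; newer runs win on duplicate keys."""
    merged = {}
    for key, value in heapq.merge(*runs, key=lambda kv: kv[0]):
        merged.setdefault(key, value)   # first occurrence = newest version
    return sorted(merged.items())

runs = [memtable] + sstables
assert read("Asia:Afghanistan", runs) == "pop:31108077"   # Memtable wins
assert compact(runs) == [
    ("Asia:Afghanistan", "pop:31108077"),
    ("Asia:Japan", "pop:127250000"),
    ("Europe:Albania", "pop:3011405"),
]
```

After a major compaction there is a single run, so a read touches one file instead of merge-sorting many: the efficiency gain the slide mentions.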
Slide 114: Bigtable: A Bunch of Other Things
- Hierarchy and locks: how to find and lock tablets
- Locality groups: group multiple column families together; assigned a separate SSTable
- Select storage: SSTables can be persistent or in-memory
- Compression: applied on SSTable blocks; custom compression can be chosen
- Caches: SSTable-level and block-level
- Bloom filters: find negatives cheaply …

Slide 115: Read More …
Slide 116: Aside: Bloom Filter
- Create a bit array of length m (initialised to 0s)
- Create k hash functions that map an object to an index of m
- Index x: set m[hash1(x)], …, m[hashk(x)] to 1
Slide 117: Aside: Bloom Filter
- Create a bit array of length m (initialised to 0s); create k hash functions that map an object to an index of m
- Index x: set m[hash1(x)], …, m[hashk(x)] to 1
- Query w:
  - if any of m[hash1(w)], …, m[hashk(w)] is set to 0 ⇒ not indexed
  - if all of m[hash1(w)], …, m[hashk(w)] are set to 1 ⇒ might be indexed
Slide 118: Aside: Bloom Filter
(As before: index x by setting m[hash1(x)], …, m[hashk(x)] to 1; query w: any 0 bit ⇒ not indexed; all 1s ⇒ might be indexed.)
Reject "empty" queries using very little memory!
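A minimal Bloom-filter sketch matching the slides; the parameter choices (m = 64 bits, k = 3 hash functions, indices derived from slices of one SHA-256 digest) are illustrative assumptions, not from the slides.

```python
# Bloom filter: m-bit array, k hash functions, no false negatives.
import hashlib

M, K = 64, 3
bits = [0] * M

def hashes(x):
    # Derive k indices from one digest (one simple way to get k hashes).
    d = hashlib.sha256(x.encode()).digest()
    return [int.from_bytes(d[4 * i:4 * i + 4], "big") % M for i in range(K)]

def index(x):
    for i in hashes(x):
        bits[i] = 1

def might_contain(w):
    # Any 0 bit => definitely not indexed; all 1s => might be indexed.
    return all(bits[i] for i in hashes(w))

index("Asia:Afghanistan")
index("Europe:Albania")
assert might_contain("Asia:Afghanistan")   # indexed keys always pass
# Non-indexed keys are usually rejected, with a small false-positive rate.
```

In Bigtable this check sits in front of an SSTable: most lookups for absent keys are answered from the in-memory bit array without touching disk.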
Slide 119: Tabular Store: Apache HBase

Slide 120: Tabular Store: Cassandra

Slide 121: Database Landscape
- Using the relational model: Relational Databases, with a focus on scalability to compete with NoSQL while maintaining ACID
- Batch analysis of data
- Not using the relational model ("Not only SQL"): real-time stores, documents, maps, column-oriented stores, graph-structured data, decentralised stores, cloud storage
Slide 122: Projects

Slide 123: Course Marking
- 55% for weekly labs (~5% a lab!): assignments each week, working in groups
- 15% for the class project: working in groups!
- 30% for 2x controls
- Only need to pass overall! No final exam!
Slide 124: Class Project
- Done in threes
- Goal: use what you've learned to do something cool/fun (hopefully)
- Expected difficulty: a bit more than a lab's worth, but without guidance (you can extend lab code)
- Marked on: difficulty, appropriateness, scale, good use of techniques, presentation, coolness, creativity, value
- Ambition is appreciated, even if you don't succeed
- Process: start thinking up topics / find interesting datasets!
- Deliverables: a 4-minute presentation & a short report
Slide 126: Questions?