Presentation Transcript

HDB++: High Availability with l TANGO Meeting l 20 May 2015 l Reynald Bourtembourg Page 1

Overview
- What is Cassandra (C*)?
- Who is using C*?
- CQL
- C* architecture
- Request Coordination
- Consistency
- Monitoring tool
- HDB++

What is Cassandra?
- Mythology: an excellent oracle not believed.
- A massively scalable open source NoSQL (Not Only SQL) database
- Created by Facebook
- Open source since 2008
- Apache License 2.0, compatible with GPLv3
- Peer-to-peer architecture
- No Single Point of Failure
- Replication
- Continuous Availability
- Multi Data Centers support
- 100s to 1000s nodes
- Java
- High Write Throughput
- Read efficiency

What is Cassandra?
[Figure not captured in the transcript.]
Source: http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

Who is using Cassandra?
[Slide content not captured in the transcript.]

Cassandra Query Language
- CQL: Cassandra Query Language
- Very similar to SQL
- But with restrictions and limitations:
  - JOIN requests are forbidden
  - No subqueries
  - String comparisons are limited (when not using Solr), e.g. select * from my_table where mystring like '%tango%'
  - No OR operator
  - Can only apply a WHERE condition on an indexed column (or primary key)

Cassandra Query Language
- Collections (64K limitation): list, set, map
- TTL
- INSERT = UPDATE (UPSERT)
- Doc: http://www.datastax.com/documentation/cql/3.1/cql/cql_intro_c.html
- cqlsh

Cassandra Query Language

    CREATE TABLE IF NOT EXISTS att_scalar_devdouble_ro (
        att_conf_id timeuuid,
        period text,
        data_time timestamp,
        data_time_us int,
        value_r double,
        quality int,
        error_desc text,
        PRIMARY KEY ((att_conf_id, period), data_time, data_time_us)
    ) WITH comment='Scalar DevDouble ReadOnly Values Table';

- Partition key: (att_conf_id, period)
- Clustering columns: data_time, data_time_us
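To make the split between partition key and clustering columns concrete, here is a minimal Python sketch. It is not Cassandra code, only an in-memory model of the table above: rows sharing (att_conf_id, period) land in the same partition, and inside a partition they stay sorted by (data_time, data_time_us). The helper name `store` and all sample values (including the format of `period`) are made up for illustration.

```python
from collections import defaultdict
import bisect

# partition key -> list of (clustering key, value), kept sorted by clustering key
partitions = defaultdict(list)

def store(att_conf_id, period, data_time, data_time_us, value_r):
    """Insert a row; a row with an identical full primary key is overwritten (upsert)."""
    partition_key = (att_conf_id, period)
    clustering_key = (data_time, data_time_us)
    rows = partitions[partition_key]
    keys = [k for k, _ in rows]
    i = bisect.bisect_left(keys, clustering_key)
    if i < len(rows) and rows[i][0] == clustering_key:
        rows[i] = (clustering_key, value_r)       # same primary key: overwrite
    else:
        rows.insert(i, (clustering_key, value_r)) # keep clustering order

# Hypothetical sample data
store("attr-1", "2015-05-20", 1000, 500, 1.5)
store("attr-1", "2015-05-20", 1000, 100, 1.2)
store("attr-1", "2015-05-21", 2000, 0, 2.0)
store("attr-1", "2015-05-20", 1000, 100, 9.9)     # upsert of the second row
```

After these calls the model holds two partitions, and the first one contains two rows ordered by time, with the upserted value in place of the original one.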

Cassandra Architecture
- Node: one Cassandra instance (Java process)
- Partition: ordered and replicable unit of data on a node, identified by a token
- Partitioner: distributes the data across the nodes (based on the Murmur3 algorithm by default)
- Rack: logical set of nodes
- Data Center: logical set of racks
- Cluster: full set of nodes which maps to a single complete token ring
- Token range: -2^63 to +2^63 - 1

[Diagram: eight nodes (Node 1 to Node 8) placed on a token ring, grouped into four racks (Rack 1 to Rack 4) and two data centers (Data Center 1, Data Center 2), which together form one Cassandra cluster.]
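The way a token ring maps a partition key to a node can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Cassandra's implementation: the hash below is an MD5-based stand-in (Murmur3 is not in the Python standard library), each node gets exactly one evenly spaced token, and the function names are hypothetical.

```python
import hashlib
from bisect import bisect_left

MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1

def token_for(partition_key: str) -> int:
    """Stand-in hash: maps a key to a signed 64-bit token (NOT Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

def build_ring(node_names):
    """Give each node one evenly spaced token; returns a sorted list of (token, node)."""
    step = 2**64 // len(node_names)
    return [(MIN_TOKEN + (i + 1) * step - 1, name) for i, name in enumerate(node_names)]

def owner(ring, token):
    """The owner is the first node whose token is >= the key's token, wrapping around."""
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token)
    return ring[i % len(ring)][1]

ring = build_ring([f"Node {n}" for n in range(1, 9)])
node = owner(ring, token_for("some-partition-key"))
```

With eight nodes the last token equals +2^63 - 1, so the ring covers the full token range given above.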

Request Coordination
- Coordinator: the node chosen by the client to receive a particular read or write request to its cluster

[Diagram: a client sends a read/write request to one of the four nodes of Data Center 1 (Node 1 to Node 4), which acts as coordinator.]

Request Coordination
- Any node can coordinate any request
- Each client request may be coordinated by a different node
- No Single Point of Failure

[Diagram: the coordinator acknowledges the request to the client.]

Request Coordination
- The Cassandra driver chooses the coordinator node
- Round-robin pattern, token-aware pattern
- Client library to manage requests
- Many open source drivers for many programming languages: Java, Python, C++, C#, Node.js, PHP, Perl, Go, Clojure, Haskell, R (GNU S), Ruby, Scala, Erlang, ODBC, Rust
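The two coordinator-selection patterns named on this slide can be illustrated with a small Python sketch. The class names `RoundRobinChooser` and `TokenAwareChooser` are hypothetical and do not correspond to any real driver API; the sketch only shows the idea that round-robin ignores data placement while token-aware selection prefers a node that holds a replica.

```python
from itertools import cycle

class RoundRobinChooser:
    """Hand out nodes in turn, regardless of which node owns the data."""
    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def pick(self, replicas=None):
        return next(self._nodes)

class TokenAwareChooser:
    """Prefer a node that holds a replica of the requested partition."""
    def __init__(self, fallback):
        self._fallback = fallback

    def pick(self, replicas=None):
        if replicas:
            return replicas[0]          # a replica can serve the request locally
        return self._fallback.pick()    # placement unknown: fall back to round-robin

nodes = ["Node 1", "Node 2", "Node 3", "Node 4"]
rr = RoundRobinChooser(nodes)
picks = [rr.pick() for _ in range(5)]   # cycles through the nodes and wraps around

ta = TokenAwareChooser(RoundRobinChooser(nodes))
aware_pick = ta.pick(replicas=["Node 3", "Node 4"])
fallback_pick = ta.pick()
```

Choosing a replica as coordinator avoids one network hop, which is the usual motivation for the token-aware pattern.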

Request Coordination
- The coordinator manages the replication process
- Replication Factor (RF): onto how many nodes should a write be copied
- The write will occur on the nodes responsible for that partition
- 1 ≤ RF ≤ (number of nodes in cluster)
- Every write is time-stamped

[Diagram: client driver, coordinator and four nodes (Node 1 to Node 4), with RF=3.]
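As a rough illustration of how RF translates into a set of replica nodes, the sketch below picks the owner of a partition plus the next nodes clockwise on the ring. This placement rule is an assumption made for the example (it ignores racks and data centers), and `replicas_for` is a hypothetical helper name.

```python
def replicas_for(ring_nodes, owner_index, rf):
    """Return rf distinct nodes: the owner plus the next nodes clockwise on the ring."""
    if not 1 <= rf <= len(ring_nodes):
        raise ValueError("RF must satisfy 1 <= RF <= number of nodes")
    return [ring_nodes[(owner_index + i) % len(ring_nodes)] for i in range(rf)]

ring_nodes = ["Node 1", "Node 2", "Node 3", "Node 4"]

# Partition owned by Node 3, RF=3: the write is copied to three of the four nodes.
replicas = replicas_for(ring_nodes, owner_index=2, rf=3)
```

The bounds check mirrors the constraint 1 ≤ RF ≤ (number of nodes in cluster) from the slide.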

Consistency
- The coordinator applies the Consistency Level (CL)
- Consistency Level (CL): number of nodes which must acknowledge a request
- Examples of CL: ONE, TWO, THREE, ANY, ALL, QUORUM (= floor(RF/2) + 1), EACH_QUORUM, LOCAL_QUORUM
- CL may vary for each request
- On success, the coordinator notifies the client (with the most recent partition data in case of a read request)
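The QUORUM formula uses integer division, which is easy to get wrong for even RF values. The Python sketch below works it out for a few single-DC levels and adds the usual overlap check (a read is guaranteed to see the latest acknowledged write when read acks + write acks > RF). The function names are hypothetical, and ANY, EACH_QUORUM and LOCAL_QUORUM are left out on purpose because they depend on hints or on the data center layout.

```python
def quorum(rf: int) -> int:
    """QUORUM = floor(RF / 2) + 1, i.e. a strict majority of the replicas."""
    return rf // 2 + 1

def required_acks(cl: str, rf: int) -> int:
    """Replica acknowledgements needed for a few single-DC consistency levels."""
    table = {"ONE": 1, "TWO": 2, "THREE": 3, "QUORUM": quorum(rf), "ALL": rf}
    return table[cl]

def read_sees_latest_write(read_cl: str, write_cl: str, rf: int) -> bool:
    """Read and write replica sets must overlap: R + W > RF."""
    return required_acks(read_cl, rf) + required_acks(write_cl, rf) > rf

# RF=3: QUORUM needs 2 acks; RF=4 and RF=5 both need 3.
quorums = {rf: quorum(rf) for rf in (3, 4, 5)}
```

For example, with RF=3, writing and reading at QUORUM overlaps (2 + 2 > 3), while writing and reading at ONE does not (1 + 1 ≤ 3).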

Consistency ONE - READ - Single DC
[Diagram, animated over four slides: client driver, coordinator and six nodes (Node 1 to Node 6), RF=3. Legend:]
- Direct Read Request
- Digest Read Request (Hash) + eventual read repair

Consistency QUORUM - READ - Single DC
[Diagram, animated over five slides: client driver, coordinator and six nodes (Node 1 to Node 6), RF=3. Legend:]
- Direct Read Request
- Digest Read Request (Hash)
- In case of inconsistency: the most recent data is returned
- Read repair if needed
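The read path described above (compare digests, return the most recent data, repair stale replicas) can be sketched as follows. This is a simplified model, not Cassandra's code: here every contacted replica returns its full value and timestamp, whereas in the slides only one replica gets the direct read and the others answer with a hash. The names `digest` and `coordinate_read` are hypothetical.

```python
import hashlib

def digest(value: str, timestamp: int) -> str:
    """Hash of a replica's answer, standing in for a digest read response."""
    return hashlib.sha256(f"{timestamp}:{value}".encode("utf-8")).hexdigest()

def coordinate_read(replica_answers):
    """replica_answers: {node: (value, write_timestamp)} from the contacted replicas.

    Returns (value_for_client, nodes_to_repair).
    """
    digests = {digest(v, ts) for v, ts in replica_answers.values()}
    newest_value, newest_ts = max(replica_answers.values(), key=lambda a: a[1])
    if len(digests) == 1:
        return newest_value, []                  # replicas agree, nothing to repair
    stale = [n for n, (_, ts) in replica_answers.items() if ts < newest_ts]
    return newest_value, sorted(stale)           # most recent data wins

# Hypothetical answers from two replicas (QUORUM with RF=3)
consistent = coordinate_read({"Node 1": ("20.1", 100), "Node 2": ("20.1", 100)})
inconsistent = coordinate_read({"Node 1": ("20.1", 100), "Node 2": ("20.4", 250)})
```

In the inconsistent case the client receives the value with the newest write timestamp, and Node 1 is flagged for read repair.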

Consistency ONE - WRITE - Single DC
[Diagram, animated over nine slides: client driver, coordinator and six nodes (Node 1 to Node 6), RF=3.]
- The coordinator sends the Write Request to the replicas, collects their ACKs and reports SUCCESS to the client
- Hinted handoff mechanism: a hint is kept for a replica that could not take the write (max_hint_window_in_ms property in the cassandra.yaml file)
- If node downtime > max_hint_window_in_ms: anti-entropy node repair
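A minimal Python sketch of the hinted handoff idea follows. It is a toy model with assumptions stated up front: the window value is chosen for the example, hints live in memory on a single coordinator, and the class and method names are hypothetical. It only shows the rule from the slides: hints are kept while the node's downtime stays within max_hint_window_in_ms, and beyond that only a repair can bring the replica back in sync.

```python
MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000   # assumed value for this sketch (3 hours)

class Coordinator:
    def __init__(self):
        self.hints = {}        # node -> list of writes the node has missed
        self.down_since = {}   # node -> time (ms) the node was first seen down

    def write(self, node, mutation, node_is_up, now_ms):
        """Deliver a write, or keep a hint while the node is within the hint window."""
        if node_is_up:
            return "delivered"
        first_down = self.down_since.setdefault(node, now_ms)
        if now_ms - first_down > MAX_HINT_WINDOW_MS:
            return "dropped"   # too long: this replica now needs a node repair
        self.hints.setdefault(node, []).append(mutation)
        return "hinted"

    def node_recovered(self, node):
        """Return the stored hints, in order, so they can be replayed to the node."""
        self.down_since.pop(node, None)
        return self.hints.pop(node, [])

c = Coordinator()
results = [
    c.write("Node 4", "w1", True, 0),
    c.write("Node 4", "w2", False, 1_000),
    c.write("Node 4", "w3", False, 2_000),
    c.write("Node 4", "w4", False, 1_000 + MAX_HINT_WINDOW_MS + 1),
]
replayed = c.node_recovered("Node 4")
```

The last write falls outside the window, so it is not hinted; when Node 4 comes back, only the two hinted writes are replayed.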

Consistency QUORUM - WRITE - Single DC
[Diagram, animated over seven slides: client driver, coordinator and six nodes (Node 1 to Node 6), RF=3.]
- Write Request, then ACKs from the replicas, then SUCCESS (shown for two scenarios)
- Write Request, then ACKs, then FAILURE (last scenario)
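Whether a QUORUM write ends in SUCCESS or FAILURE depends only on how many of the RF replicas acknowledge it. The sketch below models that decision in Python; the function name and the up/down scenarios are assumptions made for illustration, since the transcript does not say which nodes are down in each slide.

```python
def quorum_write(replica_up):
    """replica_up: {node: bool} for the RF replicas of one partition.

    Returns (status, nodes_that_acked, nodes_that_missed_the_write).
    """
    rf = len(replica_up)
    needed = rf // 2 + 1                              # QUORUM
    acks = [node for node, up in replica_up.items() if up]
    missed = [node for node, up in replica_up.items() if not up]
    status = "SUCCESS" if len(acks) >= needed else "FAILURE"
    return status, acks, missed

# Hypothetical scenarios with RF=3
all_up = quorum_write({"Node 1": True, "Node 2": True, "Node 3": True})
one_down = quorum_write({"Node 1": True, "Node 2": True, "Node 3": False})
two_down = quorum_write({"Node 1": True, "Node 2": False, "Node 3": False})
```

With RF=3 the write tolerates one unavailable replica; with two replicas down only one ACK arrives and the coordinator reports a failure.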

Monitoring tool: OpsCenter
http://cassandra2:8888

HDB++
[Architecture diagram, as far as it can be recovered from the transcript:]
- hdb++es-srv (several instances) and hdb++cm-srv <<use>> libhdb++
- libhdb++cassandra <<implements>> libhdb++
- libhdb++mysql <<implements>> libhdb++
- Back ends: MySQL, Cassandra (shown three times)

Conclusion: C* pros
- High Availability (SW upgrade with no downtime, HW failure)
- Linear scalability: need more performance? => Add nodes
- Big community with industrial support
- Can use Apache Spark for analytics (distributed processing)
- List, Set, Map data types (tuples and user defined types soon)
- Tries not to let you do actions which do not perform well
- Backups = snapshot = hard links => very fast
- Difficult to lose data
- Good fit for time series data

Conclusion: C* cons
- Requires more total disk space and machines
- The sstable format can change from one version to another
- No easy way to come back to a previous version once the sstables have been converted to a newer version
- Cannot rename keyspaces or tables easily (not foreseen in CQL)
- Difficult to modify existing partitions (the data needs to be duplicated at some point in the process)
- Different way of modelling
- Not designed for huge read requests
- Can be tricky to tune to avoid long GC pauses
- Maintenance: need to run nodetool repair regularly if some data are deleted, to avoid resurrections (CPU intensive operation)
- Can take quite some time to reclaim disk space after deletion in some specific cases

The End

Useful links
- http://cassandra.apache.org
- Planet Cassandra (http://planetcassandra.org)
- Datastax academy (https://academy.datastax.com)
- Cassandra Java Driver getting started (https://academy.datastax.com/demos/cassandra-java-driver-getting-started)
- Cassandra C++ Driver (https://github.com/datastax/cpp-driver)
- Datastax documentation (http://www.datastax.com/docs)
- Users mailing list: user-subscribe@cassandra.apache.org
- #Cassandra channel on IRC (http://webchat.freenode.net/?channels=#Cassandra)

Cassandra FUTURE DEPLOYMENT
(Slide from: Cassandra HDB++ Implementation Status l 9th April 2015 l Accelerator Control Unit)
- DC Prod: 1 partition/hour; Keyspace prod RF:3 (write LOCAL_QUORUM); 7200 RPM disks; Big CPU - 64 GB RAM
- DC Analytics 1: Keyspace prod RF:3 (read LOCAL_QUORUM); Keyspace analytics RF:3 (write LOCAL_QUORUM); SSD disks; Big CPU - 128 GB RAM
- DC Analytics 2: Keyspace analytics RF:5 (read LOCAL_QUORUM); 7200 RPM disks; Tiny CPU - 32 GB RAM

Cassandra's node-based architecture
[Figure not captured in the transcript.]

Basic Write Path Concept
[Figure not captured in the transcript.]

Basic Read Path Concept
[Figure not captured in the transcript.]