Slide 1
Deuteronomy: Transaction Support for Cloud Data
Justin Levandoski* (University of Minnesota)
David Lomet (Microsoft Research)
Mohamed Mokbel* (University of Minnesota)
Kevin Zhao* (University of California San Diego)
*Work done while at Microsoft Research
Slide 2
Motivation
We want ACID transactions for data anywhere in the cloud. Cloud providers currently support:
- Transactions only for data guaranteed to exist on the same node
- Eventual consistency
There is currently no support for transactions "across the cloud".
Slide 3
Application Motivation – Transactions Anywhere
[Figure: my new mobile application running against data spread across the cloud]
Slide 4
Application Motivation – Transactions Anywhere
What we want:
  Begin Transaction
    1. Add me to Dave's friend list
    2. Add Dave to my friend list
  End Transaction
What we have today: eventual consistency*
*Thanks to Divy Agrawal for the example
Slide 5
Talk Outline
- Application Motivation
- Technical Motivation
- Deuteronomy Architecture
- Transaction Component Implementation
- Performance
- Wrap Up
Slide 6
Technical Motivation
CIDR 2009: "Unbundling Transaction Services in the Cloud" (Lomet et al.)
- Partition the storage engine into two components:
  - Transactional component (TC): transactional recovery and concurrency control
  - Data component (DC): tables, indexes, storage, and cache
[Figure: an application talks to the TC, which coordinates DC 1 and DC 2]
CIDR 2011: "Deuteronomy: Transaction Support for Cloud Data"
- Reduce the number of round trips between TC and DC
- Round trips incur large latencies (network overhead, context switching)
- No caching at the TC
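The TC/DC split can be sketched as two narrow interfaces. This is an illustrative sketch, not the paper's code; the class and method names (`DataComponent`, `TransactionComponent`, `perform`, `execute`) are assumptions.

```python
# Hypothetical sketch of the TC/DC split: the TC owns locking, LSNs,
# and the logical log; the DC owns only physical record storage.

class DataComponent:
    """Knows only physical storage: records and their values."""
    def __init__(self):
        self.store = {}

    def perform(self, lsn, op, key, value=None):
        # Atomic record modification; returns the before image.
        before = self.store.get(key)
        if op in ("insert", "update"):
            self.store[key] = value
        elif op == "delete":
            self.store.pop(key, None)
        return before

class TransactionComponent:
    """Knows only logical names: generates LSNs and keeps a logical log."""
    def __init__(self, dc):
        self.dc = dc
        self.next_lsn = 1
        self.log = []  # logical log entries: (lsn, op, key, before image)

    def execute(self, op, key, value=None):
        lsn = self.next_lsn            # lock + LSN generation (locking elided)
        self.next_lsn += 1
        before = self.dc.perform(lsn, op, key, value)  # one TC->DC round trip
        self.log.append((lsn, op, key, before))        # log after the reply
        return lsn
```

The point of the split is that the TC never touches pages or indexes; it only names records logically, so the DC can live anywhere in the cloud.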
Slide 7
Technical Motivation
CIDR 2009 required logging before sending an operation to the DC.
Drawback: two messages and/or double logging to perform an operation (e.g., an update):
  1. Request before image
  2. Return image
  3. Log operation (generate LSN)
  4. Send operation
  5. Perform operation
  6. Log operation success
Our new protocol logs after the operation returns from the DC:
  1. Locking (generate LSN)
  2. Send operation
  3. Perform operation and send back before image
  4. Log operation
Consequences:
- Must deal with LSNs out of order on the log
- Also required us to rethink the TC/DC control protocol
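The out-of-order consequence above follows directly from logging after the DC reply: LSNs are assigned at lock time, but log records are appended only when each reply arrives, and replies from different DCs can interleave. A minimal sketch (names `run`, `op_a`, `op_b` are illustrative):

```python
# Why the new protocol produces out-of-order LSNs on the TC's log:
# LSNs are generated at lock time (step 1), but a log record is
# appended only when that operation's DC reply arrives (step 4).

def run(reply_order):
    next_lsn = 1
    lsn_of = {}
    log = []
    for op in ("op_a", "op_b"):   # step 1: lock, generate LSN in order
        lsn_of[op] = next_lsn
        next_lsn += 1
    for op in reply_order:        # steps 2-4: reply arrives, then log
        log.append(lsn_of[op])
    return log

print(run(["op_b", "op_a"]))  # -> [2, 1]: LSNs out of order on the log
```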
Slide 8
Talk Outline
- Application Motivation
- Technical Motivation
- Deuteronomy Architecture
- Transaction Component Implementation
- Performance
- Wrap Up
Slide 9
Deuteronomy Architecture
Transaction Component (TC):
- Guarantees ACID properties
- No knowledge of physical data storage
- Logical locking and logging
Data Component (DC):
- Physical data storage
- Atomic record modifications
- Data could be anywhere (cloud/local)
[Figure: client requests go to the TC; the TC sends record operations and control operations to the DC, which sits atop storage]
Interaction Contract between TC and DC:
- Reliable messaging: "at least once execution"
- Idempotence: "at most once execution"
- Causality: "if the DC remembers a message, the TC must also"
- Contract termination: "mechanism to release the contract"
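Reliable messaging plus idempotence together yield exactly-once effects: messages may be redelivered, but the DC must apply each at most once. One standard way to get this (a sketch; the paper does not prescribe this mechanism, and `IdempotentDC` is an invented name) is for the DC to remember the LSNs it has already applied:

```python
# Sketch of DC-side idempotence ("at most once execution") layered
# over at-least-once delivery: remember applied LSNs, ignore repeats.

class IdempotentDC:
    def __init__(self):
        self.store = {}
        self.applied = set()   # LSNs whose operations were already performed

    def perform(self, lsn, key, value):
        if lsn in self.applied:
            return False       # duplicate delivery: do nothing
        self.applied.add(lsn)
        self.store[key] = value
        return True
```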
Slide 10
[Architecture figure: a client opens sessions 1..N with the Transaction Component, which comprises a Session Manager, Lock Manager, Log Manager, and Meta-data Management. The TC issues record/table operations and TC-DC control operations to Data Components 1..N (each with its own session, table manager, and record manager), backed by cloud or local storage.]
Slide 11
[Same architecture figure, annotated with threading and logical responsibilities:]
- Session Manager: thread forking and pooling
- Record/table operations: thread aware, with some thread management
- Lock Manager (logical locking): protects lock data from race conditions; blocks threads on conflict
- Log Manager (logical logging): protects its data structures; blocks committing threads for log flush
Slide 12
Talk Outline
- Application Motivation
- Technical Motivation
- Deuteronomy Architecture
- Transaction Component Implementation
- Performance
- Wrap Up
Slide 13
Record Manager – An Insert Operation Example
1. Receive the "insert record" request and dispatch a session thread
2. Call the lock manager: lock the resource and generate a Log Sequence Number (LSN)
3. Send the LSN and operation to the DC
4. Call the log manager: log the operation with its LSN
[Figure: the Client & Session Manager hand the request to the TC Record Manager, which in turn calls the Lock Manager (step 2), the DC (step 3), and the Log Manager (step 4)]
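The four steps above can be sketched as one method on the record manager. This is an illustrative skeleton, not the actual implementation (which was multithreaded C#); all class and method names are assumptions:

```python
# Sketch of the insert flow: lock + LSN, send to DC, then log.

class LockManager:
    def __init__(self):
        self.next_lsn, self.locks = 1, set()

    def lock_and_lsn(self, key):
        self.locks.add(key)                 # step 2a: lock the resource
        lsn = self.next_lsn                 # step 2b: generate the LSN
        self.next_lsn += 1
        return lsn

class LogManager:
    def __init__(self):
        self.records = []

    def log(self, lsn, op, key):
        self.records.append((lsn, op, key))  # step 4: log with the LSN

class DC:
    def __init__(self):
        self.store = {}

    def perform(self, lsn, key, value):
        self.store[key] = value              # step 3 (DC side)

class RecordManager:
    def __init__(self, lock_mgr, log_mgr, dc):
        self.lock_mgr, self.log_mgr, self.dc = lock_mgr, log_mgr, dc

    def insert(self, key, value):
        # step 1 (request received, session thread dispatched) is elided
        lsn = self.lock_mgr.lock_and_lsn(key)   # step 2
        self.dc.perform(lsn, key, value)        # step 3
        self.log_mgr.log(lsn, "insert", key)    # step 4
        return lsn
```

Note the ordering matches the new protocol of Slide 7: the log append happens only after the operation has been sent to the DC.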
Slide 14
Lock Manager
- Deuteronomy locking hierarchy: Table → Partition → Record
- Supports locking of a range read using partitions, not next-key locking:
    SELECT * FROM Employees WHERE id > 10 AND id < 40
  Partitions to lock: [0, 30], [30, 60]
- So inserts DO NOT have to know or test the lock on the next key
- In charge of generating the LSN before sending an operation to the data components
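The partition mapping above can be sketched with fixed-width key partitions. The width of 30 matches the slide's example but is otherwise an assumption, as are the function names:

```python
# Sketch of partition-based range locking: a range read locks every
# partition overlapping its key range; an insert locks only its own
# partition, so it never needs to test a lock on the next key.

WIDTH = 30  # partition width; 30 matches the slide's example

def partitions_for_range(lo, hi):
    """Partitions covering keys lo..hi (inclusive), as [start, end] pairs."""
    first = (lo // WIDTH) * WIDTH
    last = (hi // WIDTH) * WIDTH
    return [[s, s + WIDTH] for s in range(first, last + 1, WIDTH)]

def partition_for_insert(key):
    s = (key // WIDTH) * WIDTH
    return [s, s + WIDTH]

# id > 10 AND id < 40 (keys 11..39) locks [0, 30] and [30, 60];
# an insert of id = 75 touches only [60, 90].
```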
Slide 15
Log Manager
Differs from a conventional log manager in two aspects, both requiring coordination of the TC's log with the DC's cache:
- Write-ahead logging: which updates are allowed to be made stable at the DC
- Log truncation: which updates must be made stable at the DC
Complexity: LSNs are stored out of order on the TC's physical log, but the DC only understands LSNs.
[Figure: the TC sends operations LSN 1 through LSN 5 to several DCs; the TC physical log holds LSN 4, LSN 2, LSN 1 (flushed) followed by LSN 3, LSN 5 (not flushed). What can a DC write to stable storage?]
Slide 16
Control Operations: End-Of-Stable-Log
The TC sends an LSN value eLSN to all DCs.
For a DC:
- Updates with LSN <= eLSN can be made stable
- Updates with LSN > eLSN must not be made stable and must be forgettable
For the TC: all log records with LSN <= eLSN must be flushed.
[Figure: physical log with LSN 4, LSN 2, LSN 1 flushed (stable) and LSN 3, LSN 6 not flushed; given the LSN vector (1..6), the TC announces eLSN = 2, since LSN 3 is the first unflushed record.]
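Because the log holds LSNs out of order, eLSN is not simply the last flushed record: it is the largest value below which there is no unflushed "hole". A minimal sketch of that computation (the function name is illustrative):

```python
# Sketch: eLSN is the largest value such that every issued LSN <= eLSN
# has reached the TC's stable log.

def end_of_stable_log(issued, flushed):
    e = 0
    for lsn in sorted(issued):
        if lsn in flushed:
            e = lsn
        else:
            break          # first unflushed LSN: stop before the hole
    return e

# Matching the figure: LSNs 1..6 issued, {1, 2, 4} flushed.
# LSN 3 is the first hole, so eLSN = 2 even though LSN 4 is stable.
```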
Slide 17
The Deuteronomy Project
Built the TC, DCs, and a test environment, and performed experiments. Different "flavors" of DCs were implemented:
- Cloud DC: Windows Azure used as cloud storage
  Collaborators: Roger Barga, Nelson Araujo, Brihadish Koushik, Shailesh Nikam
- Flash DC: storage manager from the Communications and Collaboration Systems group at MSR
  Collaborators: Sudipta Sengupta, Biplob Debnath, and Jin Li
[Figure: the TC, with its operation log, talks to the Cloud DC (a buffer pool over Windows Azure) and the Flash DC (flash storage)]
Slide 18
Talk Outline
- Application Motivation
- Technical Motivation
- Deuteronomy Architecture
- Transaction Component Implementation
- Performance
- Wrap Up
Slide 19
Performance of TC
- Adapted the TPC-W benchmark
- Controlled DC latencies from 2 ms to 100 ms
- High latency requires a high degree of multithreading, which appears to impact throughput
Ideas for improving throughput:
- Rework the threading mechanism
- Change the implementation language from C# to C/C++
Slide 20
Conclusion
- Application and technical motivations
- Overview of the project, teams, and development
- Architecture of our multithreaded TC
- A new TC-DC interface protocol to suit the cloud scenario
- Experiments that show good performance and the impact of cloud latency on performance