slides.02222011 - Background Distributed Key/Value stores...

Background
  Distributed key/value stores provide a simple put/get interface.
  Great properties: scalability, availability, reliability.
  Increasingly popular within data centers:
    Dynamo
    Cassandra
    Voldemort

Dynamo: Amazon's Highly Available Key-value Store
  Giuseppe DeCandia et al.
  Presented by: Tony Huang

Motivation
  Highly scalable and reliable.
  Tight control over the trade-offs between availability, consistency, cost-effectiveness, and performance.
  Flexible enough to let designers make those trade-offs.
  Simple primary-key access to the data store.
  Use cases: best-seller lists, shopping carts, customer preferences, session management, sales rank, etc.

Assumptions and Design Considerations
  Query model
    Simple read and write operations on a data item that is uniquely identified by a key.
    Small objects, usually under 1 MB.
  ACID (Atomicity, Consistency, Isolation, Durability)
    Trades consistency for availability.
    Does not provide any isolation guarantees.
  Efficiency
    Stringent SLA requirements.
  Assumes a non-hostile environment: no authentication or authorization.
  Conflict resolution is executed during reads instead of writes, so the store is always writable.
  Conflict resolution is performed either by the data store or by the application.

Amazon's Platform Architecture

Techniques
  Problem: Partitioning
    Technique: Consistent hashing
    Advantage: Incremental scalability.
  Problem: High availability for writes
    Technique: Vector clocks with reconciliation during reads
    Advantage: Version size is decoupled from update rates.
  Problem: Handling temporary failures
    Technique: Sloppy quorum and hinted handoff
    Advantage: Provides high availability and a durability guarantee when some of the replicas are not available.
  Problem: Recovering from permanent failures
    Technique: Anti-entropy using Merkle trees
    Advantage: Synchronizes divergent replicas in the background.
  Problem: Membership and failure detection
    Technique: Gossip-based membership protocol and failure detection
    Advantage: Preserves symmetry and avoids a centralized registry for storing membership and node liveness information.
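
The "vector clocks with reconciliation during reads" entry in the Techniques list can be illustrated with a minimal sketch. The names `descends`, `increment`, and `reconcile` are hypothetical helpers chosen for this example, not Dynamo's API, and a clock is assumed to be a plain mapping from node name to counter. On a read, versions that are ancestors of another version are dropped; versions that are concurrent are all returned so the data store or the application can resolve the conflict.

```python
from typing import Dict, List, Tuple

# Assumption for this sketch: a vector clock maps a node name to a counter.
VectorClock = Dict[str, int]


def descends(a: VectorClock, b: VectorClock) -> bool:
    """Return True if clock a is equal to, or a descendant of, clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of clock with node's counter advanced by one (a write at node)."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def reconcile(versions: List[Tuple[VectorClock, str]]) -> List[Tuple[VectorClock, str]]:
    """Drop every version that is a strict ancestor of another version.

    What remains is either a single winner or a set of concurrent
    siblings that the reader has to merge.
    """
    survivors: List[Tuple[VectorClock, str]] = []
    for i, (clock, value) in enumerate(versions):
        dominated = any(
            j != i and other != clock and descends(other, clock)
            for j, (other, _) in enumerate(versions)
        )
        if not dominated and (clock, value) not in survivors:
            survivors.append((clock, value))
    return survivors


# Two writes handled by node Sx, then two conflicting writes at Sy and Sz.
d1 = increment({}, "Sx")
d2 = increment(d1, "Sx")
d3 = increment(d2, "Sy")
d4 = increment(d2, "Sz")
siblings = reconcile([(d1, "v1"), (d2, "v2"), (d3, "v3"), (d4, "v4")])
```

Here `siblings` holds only the versions written at Sy and Sz: neither clock descends from the other, so both are handed back to the reader, while the two older versions are discarded as ancestors.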

Partitioning
  Consistent hashing: the output range of a hash function is treated as a fixed circular space, or ring.
  Virtual nodes: each physical node can be responsible for more than one virtual node.
    Node fails: its load is dispersed evenly across the remaining nodes.
    Node joins: its virtual nodes accept a roughly equivalent amount of load from each of the other nodes.
    Heterogeneity: the number of virtual nodes per node can reflect its capacity.

Load Distribution
  Strategy 1: T random tokens per node, partition by token value.
    Ranges vary in size and change frequently.
    Long bootstrapping.
    Difficult to take a snapshot.

Load Distribution (continued)
  Strategy 2: T random tokens per node, equal-sized partitions.
    Turns out to be the worst. Why?
  Strategy 3: Q/S tokens per node, equal-sized partitions....
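
The consistent hashing ring with virtual nodes from the Partitioning slide can be sketched as follows. This is a minimal illustration that places tokens by hashing, not Dynamo's implementation: the `Ring` class, its method names, and the `"node#i"` token naming scheme are assumptions made for the example, and replication is left out. A key is owned by the first token found clockwise from the key's position on the ring.

```python
import bisect
import hashlib
from typing import Dict, List


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (the 128-bit MD5 output space)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hash ring; each node owns several virtual nodes."""

    def __init__(self) -> None:
        self._tokens: List[int] = []      # sorted token positions
        self._owner: Dict[int, str] = {}  # token position -> physical node

    def add_node(self, node: str, vnodes: int = 8) -> None:
        # A higher-capacity node can be given more virtual nodes (heterogeneity).
        for i in range(vnodes):
            token = ring_position(f"{node}#{i}")
            bisect.insort(self._tokens, token)
            self._owner[token] = node

    def remove_node(self, node: str) -> None:
        # Dropping a node's tokens hands its ranges to the next tokens clockwise.
        self._tokens = [t for t in self._tokens if self._owner[t] != node]
        self._owner = {t: n for t, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        if not self._tokens:
            raise LookupError("ring is empty")
        # First token clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._tokens, ring_position(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


ring = Ring()
for name in ("A", "B", "C"):
    ring.add_node(name)

keys = [f"cart:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove_node("B")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only keys that were owned by node B appear in `moved`; every other key keeps its owner, which is the incremental-scalability property the slides attribute to consistent hashing.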