lundi 7 avril 2014

Amazon - lectures cohérentes de DynamoDB Strong, elles sont plus récentes et comment ? -Débordement de pile


In an attempt to use Dynamodb for one of projects, I have a doubt regarding the strong consistency model of dynamodb. From the FAQs



Strongly Consistent Reads — in addition to eventual consistency, Amazon DynamoDB also gives you the flexibility and control to request a strongly consistent read if your application, or an element of your application, requires it. A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.



From the definition above, what I get is that a strong consistent read will return the latest write value.


Taking an example: Lets say Client1 issues a write command on Key K1 to update the value from V0 to V1. After few milliseconds Client2 issues a read command for Key K1, then in case of strong consistency V1 will be returned always, however in case of eventual consistency V1 or V0 may be returned. Is my understanding correct?


If it is, What if the write operation returned success but the data is not updated to all replicas and we issue a strongly consistent read, how it will ensure to return the latest write value in this case?


The following link AWS DynamoDB read after write consistency - how does it work theoretically? tries to explain the architecture behind this, but don't know if this is how it actually works? The next question that comes to my mind after going through this link is: Is DynamoDb based on Single Master, multiple slave architecture, where writes and strong consistent reads are through master replica and normal reads are through others.




You can find answer to your question here: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/APISummary.html


When you issue a strongly consistent read request, Amazon DynamoDB returns a response with the most up-to-date data that reflects updates by all prior related write operations to which Amazon DynamoDB returned a successful response.


In your example, if the updateItem request to update the value from v0 to v1 was successful, the subsequent strongly consistent read request will return v1.


Hope this helps.




Short answer: Writing successfully in strongly consistent mode requires that your write succeed on a majority of servers that can contain the record, therefore any future consistent reads will always see the same data, because a consistent read must read a majority of the servers that can contain the desired record. If you do not perform a strongly consistent read, the system will ask a random server for the record, and it is possible that the data will not be up-to-date.


Imagine three servers. Server 1, server 2 and server 3. To write a strongly consistent record, you pick two servers at minimum, and write the data. Let's pick 1 and 2.


Now you want to read the data consistently. Pick a majority of servers. Let's say we picked 2 and 3.


Server 2 has the new data, and this is what the system returns.


Eventually consistent reads could come from server 1, 2, or 3. This means if server 3 is chosen by random, your new write will not appear yet, until replication occurs.


If a single server fails, your data is still safe, but if two out of three servers fail your new write may be lost until the offline servers are restored.


More explanation: DynamoDB uses a ring topology, where data is spread to many servers. Strong consistency is guaranteed because you directly query all relevant servers and get the current data from them. There is no master in the ring, there are no slaves in the ring. A given record will map to a number of identical hosts in the ring, and all of those servers will contain that record. There is no slave that could lag behind, and there is no master that can fail.


Feel free to read any of the many papers on the topic. A similar database called Apache Cassandra is available which also uses ring replication.


http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf



In an attempt to use Dynamodb for one of projects, I have a doubt regarding the strong consistency model of dynamodb. From the FAQs



Strongly Consistent Reads — in addition to eventual consistency, Amazon DynamoDB also gives you the flexibility and control to request a strongly consistent read if your application, or an element of your application, requires it. A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.



From the definition above, what I get is that a strong consistent read will return the latest write value.


Taking an example: Lets say Client1 issues a write command on Key K1 to update the value from V0 to V1. After few milliseconds Client2 issues a read command for Key K1, then in case of strong consistency V1 will be returned always, however in case of eventual consistency V1 or V0 may be returned. Is my understanding correct?


If it is, What if the write operation returned success but the data is not updated to all replicas and we issue a strongly consistent read, how it will ensure to return the latest write value in this case?


The following link AWS DynamoDB read after write consistency - how does it work theoretically? tries to explain the architecture behind this, but don't know if this is how it actually works? The next question that comes to my mind after going through this link is: Is DynamoDb based on Single Master, multiple slave architecture, where writes and strong consistent reads are through master replica and normal reads are through others.



You can find answer to your question here: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/APISummary.html


When you issue a strongly consistent read request, Amazon DynamoDB returns a response with the most up-to-date data that reflects updates by all prior related write operations to which Amazon DynamoDB returned a successful response.


In your example, if the updateItem request to update the value from v0 to v1 was successful, the subsequent strongly consistent read request will return v1.


Hope this helps.



Short answer: Writing successfully in strongly consistent mode requires that your write succeed on a majority of servers that can contain the record, therefore any future consistent reads will always see the same data, because a consistent read must read a majority of the servers that can contain the desired record. If you do not perform a strongly consistent read, the system will ask a random server for the record, and it is possible that the data will not be up-to-date.


Imagine three servers. Server 1, server 2 and server 3. To write a strongly consistent record, you pick two servers at minimum, and write the data. Let's pick 1 and 2.


Now you want to read the data consistently. Pick a majority of servers. Let's say we picked 2 and 3.


Server 2 has the new data, and this is what the system returns.


Eventually consistent reads could come from server 1, 2, or 3. This means if server 3 is chosen by random, your new write will not appear yet, until replication occurs.


If a single server fails, your data is still safe, but if two out of three servers fail your new write may be lost until the offline servers are restored.


More explanation: DynamoDB uses a ring topology, where data is spread to many servers. Strong consistency is guaranteed because you directly query all relevant servers and get the current data from them. There is no master in the ring, there are no slaves in the ring. A given record will map to a number of identical hosts in the ring, and all of those servers will contain that record. There is no slave that could lag behind, and there is no master that can fail.


Feel free to read any of the many papers on the topic. A similar database called Apache Cassandra is available which also uses ring replication.


http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf


0 commentaires:

Enregistrer un commentaire