I'm unclear as to what benefits I get from EBS vs. instance-store for my instances on Amazon EC2. If anything, it seems that EBS is way more useful (stop, start, persist + better speed) at relatively little difference in cost...? Also, is there any metric as to whether more people are using EBS now that it's available, considering it is still relatively new?
The bottom line is you should almost always use EBS backed instances.
Here's why
- EBS backed instances can be set so that they cannot be (accidentally) terminated through the API.
- EBS backed instances can be stopped when you're not using them and resumed when you need them again (like pausing a Virtual PC), at least with my usage patterns saving much more money than I spend on a few dozen GB of EBS storage.
- EBS backed instances don't lose their instance storage when they crash (not a requirement for all users, but makes recovery much faster)
- You can dynamically resize EBS instance storage.
- You can transfer the EBS instance storage to a brand new instance (useful if the hardware at Amazon you were running on gets flaky or dies, which does happen from time to time)
- It is faster to launch an EBS backed instance because the image does not have to be fetched from S3.
I'm a heavy user of Amazon and switched all of my instances to EBS backed storage as soon as the technology came out of beta. I've been very happy with the result.
Keep in mind that any piece of cloud-based infrastructure can fail at any time. Plan your infrastructure accordingly. While EBS-backed instances provide certain level of durability compared to ephemeral storage instances, they can and do fail. Have an AMI from which you can launch new instances as needed in any availability zone, back up your important data (e.g. databases), and if your budget allows it, run multiple instances of servers for load balancing and redundancy (ideally in multiple availability zones).
99% of our AWS setup is recyclable. So for me it doesn't really matter if I terminate an instance -- nothing is lost ever. E.g. my application is automatically deployed on an instance from SVN, our logs are written to a central syslog server.
The only benefit of instance storage that I see are cost-savings. Otherwise EBS-backed instances win. Eric mentioned all the advantages.
[2012-07-16] I would phrase this answer a lot different today.
I haven't had any good experience with EBS-backed instances in the past year or so. The last downtimes on AWS pretty much wrecked EBS as well.
I am guessing that a service like RDS uses some kind of EBS as well and that seems to work for the most part. On the instances we manage ourselves, we have got rid off EBS where possible.
Getting rid to an extend where we moved a database cluster back to iron (= real hardware). The only remaining piece in our infrastructure is a DB server where we stripe multiple EBS volumes into a software RAID and backup twice a day. Whatever would be lost in between backups, we can live with.
EBS is a somewhat flakey technology since it's essentially a network volume: a volume attached to your server from remote. I am not negating the work done with it – it is an amazing product since essentially unlimited persistent storage is just an API call away. But it's hardly fit for scenarios where I/O performance is key.
And in addition to how network storage behaves, all network is shared on EC2 instances. The smaller an instance (e.g. t1.micro, m1.small) the worse it gets because your network interfaces on the actual host system are shared among multiple VMs (= your EC2 instance) which run on top of it.
The larger instance you get, the better it gets of course. Better here means within reason.
When persistence is required, I would always advice people to use something like S3 to centralize between instances. S3 is a very stable service. Then automate your instance setup to a point where you can boot a new server and it gets ready by itself. Then there is no need to have network storage which lives longer than the instance.
So all in all, I see no benefit to EBS-backed instances what so ever. I rather add a minute to bootstrap, then run with a potential SPOF.
We like instance-store. It forces us to make our instances completely recyclable, and we can easily automate the process of building a server from scratch on a given AMI. This also means we can easily swap out AMIs. Also, EBS still has performance problems from time to time.
Eric pretty much nailed it. We (Bitnami) are a popular provider of free AMIs for popular applications and development frameworks (PHP, Joomla, Drupal, you get the idea). I can tell you that EBS-backed AMIs are significantly more popular than S3-backed. In general I think s3-backed instances are used for distributed, time-limited jobs (for example, large scale processing of data) where if one machine fails, another one is simply spinned up. EBS-backed AMIS tend to be used for 'traditional' server tasks, such as web or database servers that keep state locally and thus require the data to be available in the case of crashing.
One aspect I did not see mentioned is the fact that you can take snapshots of an EBS-backed instance while running, effectively allowing you to have very cost-effective backups of your infrastructure (the snapshots are block-based and incremental)
I've had the exact same experience as Eric at my last position. Now in my new job, I'm going through the same process I performed at my last job... rebuilding all their AMIs for EBS backed instances - and possibly as 32bit machines (cheaper - but can't use same AMI on 32 and 64 machines).
EBS backed instances launch quickly enough that you can begin to make use of the Amazon AutoScaling API which lets you use CloudWatch metrics to trigger the launch of additional instances and register them to the ELB (Elastic Load Balancer), and also to shut them down when no longer required.
This kind of dynamic autoscaling is what AWS is all about - where the real savings in IT infrastructure can come into play. It's pretty much impossible to do autoscaling right with the old s3 "InstanceStore"-backed instances.
I'm just starting to use EC2 myself so not an expert, but Amazon's own documentation says:
we recommend that you use the local instance store for temporary data and, for data requiring a higher level of durability, we recommend using Amazon EBS volumes or backing up the data to Amazon S3.
Emphasis mine.
I do more data analysis than web hosting, so persistence doesn't matter as much to me as it might for a web site. Given the distinction made by Amazon itself, I wouldn't assume that EBS is right for everyone.
I'll try to remember to weigh in again after I've used both.
I'm unclear as to what benefits I get from EBS vs. instance-store for my instances on Amazon EC2. If anything, it seems that EBS is way more useful (stop, start, persist + better speed) at relatively little difference in cost...? Also, is there any metric as to whether more people are using EBS now that it's available, considering it is still relatively new?
The bottom line is you should almost always use EBS backed instances.
Here's why
- EBS backed instances can be set so that they cannot be (accidentally) terminated through the API.
- EBS backed instances can be stopped when you're not using them and resumed when you need them again (like pausing a Virtual PC), at least with my usage patterns saving much more money than I spend on a few dozen GB of EBS storage.
- EBS backed instances don't lose their instance storage when they crash (not a requirement for all users, but makes recovery much faster)
- You can dynamically resize EBS instance storage.
- You can transfer the EBS instance storage to a brand new instance (useful if the hardware at Amazon you were running on gets flaky or dies, which does happen from time to time)
- It is faster to launch an EBS backed instance because the image does not have to be fetched from S3.
I'm a heavy user of Amazon and switched all of my instances to EBS backed storage as soon as the technology came out of beta. I've been very happy with the result.
Keep in mind that any piece of cloud-based infrastructure can fail at any time. Plan your infrastructure accordingly. While EBS-backed instances provide certain level of durability compared to ephemeral storage instances, they can and do fail. Have an AMI from which you can launch new instances as needed in any availability zone, back up your important data (e.g. databases), and if your budget allows it, run multiple instances of servers for load balancing and redundancy (ideally in multiple availability zones).
99% of our AWS setup is recyclable. So for me it doesn't really matter if I terminate an instance -- nothing is lost ever. E.g. my application is automatically deployed on an instance from SVN, our logs are written to a central syslog server.
The only benefit of instance storage that I see are cost-savings. Otherwise EBS-backed instances win. Eric mentioned all the advantages.
[2012-07-16] I would phrase this answer a lot different today.
I haven't had any good experience with EBS-backed instances in the past year or so. The last downtimes on AWS pretty much wrecked EBS as well.
I am guessing that a service like RDS uses some kind of EBS as well and that seems to work for the most part. On the instances we manage ourselves, we have got rid off EBS where possible.
Getting rid to an extend where we moved a database cluster back to iron (= real hardware). The only remaining piece in our infrastructure is a DB server where we stripe multiple EBS volumes into a software RAID and backup twice a day. Whatever would be lost in between backups, we can live with.
EBS is a somewhat flakey technology since it's essentially a network volume: a volume attached to your server from remote. I am not negating the work done with it – it is an amazing product since essentially unlimited persistent storage is just an API call away. But it's hardly fit for scenarios where I/O performance is key.
And in addition to how network storage behaves, all network is shared on EC2 instances. The smaller an instance (e.g. t1.micro, m1.small) the worse it gets because your network interfaces on the actual host system are shared among multiple VMs (= your EC2 instance) which run on top of it.
The larger instance you get, the better it gets of course. Better here means within reason.
When persistence is required, I would always advice people to use something like S3 to centralize between instances. S3 is a very stable service. Then automate your instance setup to a point where you can boot a new server and it gets ready by itself. Then there is no need to have network storage which lives longer than the instance.
So all in all, I see no benefit to EBS-backed instances what so ever. I rather add a minute to bootstrap, then run with a potential SPOF.
We like instance-store. It forces us to make our instances completely recyclable, and we can easily automate the process of building a server from scratch on a given AMI. This also means we can easily swap out AMIs. Also, EBS still has performance problems from time to time.
Eric pretty much nailed it. We (Bitnami) are a popular provider of free AMIs for popular applications and development frameworks (PHP, Joomla, Drupal, you get the idea). I can tell you that EBS-backed AMIs are significantly more popular than S3-backed. In general I think s3-backed instances are used for distributed, time-limited jobs (for example, large scale processing of data) where if one machine fails, another one is simply spinned up. EBS-backed AMIS tend to be used for 'traditional' server tasks, such as web or database servers that keep state locally and thus require the data to be available in the case of crashing.
One aspect I did not see mentioned is the fact that you can take snapshots of an EBS-backed instance while running, effectively allowing you to have very cost-effective backups of your infrastructure (the snapshots are block-based and incremental)
I've had the exact same experience as Eric at my last position. Now in my new job, I'm going through the same process I performed at my last job... rebuilding all their AMIs for EBS backed instances - and possibly as 32bit machines (cheaper - but can't use same AMI on 32 and 64 machines).
EBS backed instances launch quickly enough that you can begin to make use of the Amazon AutoScaling API which lets you use CloudWatch metrics to trigger the launch of additional instances and register them to the ELB (Elastic Load Balancer), and also to shut them down when no longer required.
This kind of dynamic autoscaling is what AWS is all about - where the real savings in IT infrastructure can come into play. It's pretty much impossible to do autoscaling right with the old s3 "InstanceStore"-backed instances.
I'm just starting to use EC2 myself so not an expert, but Amazon's own documentation says:
we recommend that you use the local instance store for temporary data and, for data requiring a higher level of durability, we recommend using Amazon EBS volumes or backing up the data to Amazon S3.
Emphasis mine.
I do more data analysis than web hosting, so persistence doesn't matter as much to me as it might for a web site. Given the distinction made by Amazon itself, I wouldn't assume that EBS is right for everyone.
I'll try to remember to weigh in again after I've used both.
0 commentaires:
Enregistrer un commentaire