Ask yc.news: Opinions on EC2?

kirubakaran · on Sept 19, 2007

I use EC2 and I do swear by it. I see the initial extra effort you need to put in to get it to work right (designing for failover etc) as: 'the more you sweat in practice, the less you bleed in battle'.

I first ported a toy app that I wrote (http://www.instantwordsearch.com) to EC2 - as an exercise. I personally learnt a lot in the process. Now I use EC2 for everything I do. It is a LOT of fun (imho).

Interesting links: http://jimmyg.org/2007/09/01/amazon-ec2-for-people-who-prefe... http://jimmyg.org/2007/09/01/custom-debian-ec2-amis-from-xen...

wmf · on Sept 19, 2007

I haven't used EC2, but the complexity would worry me if I were on a schedule. If you only have a few months of runway and you spend most of them sharpening your EC2 tools rather than coding features, you could be doomed.

kirubakaran · on Sept 19, 2007

I would say you can grok the whole thing in a weekend.

tx · on Sept 19, 2007

Hm... the toy app you ported took about 5 seconds to respond. Slo-o-o-o-ow. Is that EC2 overhead?

This is the kind of thing we were trying to avoid and ultimately (again) decided on our own servers in a data center.

kirubakaran · on Sept 19, 2007

It is near instantaneous for me and I am in Seattle, WA, USA. I've heard from many sources that AWS is slow if you are outside the US. Someone even brought this up at the AWS dev conference and the Amazon guys just dodged it and gave a generic 'things are being improved' response.

If a lookup on http://www.instantwordsearch.com/ is painfully slow but http://www.instantwordsearch.com/backstage.html (statically served 103KB text) is very fast, then you can conclude that the problem is with my app. Otherwise it might be safe to blame EC2.
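The comparison above can be roughed out in a few lines of Python. This is only a sketch: the function names and the 2-second "slow" threshold are assumptions made for illustration, not anything measured on this site.

```python
import time
import urllib.request


def fetch_seconds(url, timeout=30):
    """Return wall-clock seconds taken to download url completely."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.monotonic() - start


def diagnose(dynamic_seconds, static_seconds, slow=2.0):
    """Guess where a delay comes from, given timings for both pages.

    slow is an arbitrary illustrative threshold in seconds.
    """
    if dynamic_seconds < slow and static_seconds < slow:
        return "both fast"
    if static_seconds < slow:
        # Static file is quick but the dynamic page is not.
        return "app is the likely culprit"
    # Even the static file is slow, so the app is probably not to blame.
    return "network or host is the likely culprit"
```

To try it, pass the result of fetch_seconds() for the front page and for backstage.html into diagnose(). One sample from one location proves little, so a few runs at different times would be more convincing.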

tx · on Sept 20, 2007

I am in Austin, TX and the front page took 5 seconds. I did not type anything or click anywhere.

The problem with EC2 and other super-managed hosting solutions (Joyent comes to mind) is inconsistency: I am sure they're moving VMs around, and occasionally they end up on overloaded servers.

wmf · on Sept 20, 2007

But I am equally sure that EC2 does not oversell CPU/RAM resources and does not move VMs around. I don't think Xen even supports memory overcommit.

DocSavage · on Sept 19, 2007

I bet most of that response delay is from the autosuggest lookup against the dictionary. It seems to be querying every time you enter a letter, so if you quickly type a word, you'll be initiating multiple GET requests, each of which hits the DB.
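If per-keystroke lookups really are the cause (that is a guess, not something confirmed in this thread), the usual remedy is to debounce: only send a lookup once typing has paused. In a browser this would be done in JavaScript; the Python below just simulates the idea on a list of timestamped keystrokes. The function name and the 0.3-second quiet period are assumptions.

```python
def debounce(keystrokes, quiet_seconds=0.3):
    """Return the texts that would actually be sent as lookups.

    keystrokes: list of (timestamp_seconds, text_in_box) in time order.
    A lookup fires only when no further keystroke arrives within
    quiet_seconds of the current one.
    """
    fired = []
    for i, (stamp, text) in enumerate(keystrokes):
        is_last = i == len(keystrokes) - 1
        if is_last or keystrokes[i + 1][0] - stamp >= quiet_seconds:
            fired.append(text)
    return fired


# Five quick keystrokes turn into a single lookup instead of five.
typed = [(0.00, "h"), (0.10, "he"), (0.18, "hel"), (0.25, "hell"), (0.33, "hello")]
print(debounce(typed))  # ['hello']
```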

davidw · on Sept 19, 2007

So how'd you deal with its ephemeral nature?

kirubakaran · on Sept 19, 2007

I have my own AMI - so, I can bring the instance up as soon as it craps out. (This has never happened so far - which means I am either lucky or the fear is overblown). I have another server outside of AWS babysitting EC2.

I am designing a little bit on the paranoid side and making automatic incremental backups in S3. The recovery script restores from these backups automatically too. I've simulated failures and I know it is safe. (This is something you need to do anyway - there are no guarantees in life!)

I still haven't automated the DNS change that needs to be done when you get a new instance. I will be handling this soon. There are some cool hacks already out there if you google for "ec2 dns".
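A babysitter along the lines described above might look roughly like this. It is a sketch under assumptions, not the actual setup: the three callables are hypothetical hooks you would write yourself (no real AWS calls are shown), and the three-strikes rule is an arbitrary choice to avoid reacting to a single failed check.

```python
def watchdog_step(is_healthy, launch_replacement, restore_backup,
                  failures, max_failures=3):
    """Run one health check; return the updated consecutive-failure count.

    All three callables are hypothetical hooks supplied by the caller:
      is_healthy()          -> bool, e.g. an HTTP check against the instance
      launch_replacement()  -> identifier of a new instance started from your AMI
      restore_backup(inst)  -> loads the latest S3 backup onto that instance
    """
    if is_healthy():
        return 0
    failures += 1
    if failures >= max_failures:
        new_instance = launch_replacement()
        restore_backup(new_instance)
        return 0
    return failures


# Simulated run: one good check, then three bad ones in a row.
checks = iter([True, False, False, False])
log = []
count = 0
for _ in range(4):
    count = watchdog_step(lambda: next(checks), lambda: "new-instance",
                          log.append, count)
print(log)  # ['new-instance']
```

In practice this would be called on a timer from the machine outside AWS, and the DNS update would be one more hook after the restore.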

davidw · on Sept 19, 2007

Having a central server to run everything seems pretty key to the entire operation. Is that where you store your data, or do you just hope to be able to fix things from backups to S3 if you lose a server?

kirubakaran · on Sept 19, 2007

IMO EC2 lends itself very well to higher availability by redundancy - if that is what you want.

My current approach is more exploratory coz I am learning as I go. But, to answer your question, if the instance were to fail right this moment, I'll get a notification and I'll run a recovery script that I've set up. I won't have this manual-intervention method for long though.

SwellJoe · on Sept 19, 2007

We're announcing a product that manages EC2 instances in the next week or so (along with Solaris Zones, Xen, and vserver instances). So far we're seeing pretty good results with the people who are using EC2, but reliability is currently less than ideal. If EC2 were on the various hosting reliability charts I'm pretty sure it'd be down in the bottom half of the pack (and given some of the podunk three-server operations out there calling themselves "hosting providers" this is pretty dismal). As far as I can tell, you can expect several minutes of downtime every month, and sometimes longer. Most of the top tier hosts average a couple of minutes per year downtime...but the fluid nature of EC2 should make it possible to get near 100% uptime...when Amazon takes a network segment down, they can theoretically do a live migration of your instances. So far, they don't seem to actually have the capability to do so.

The S3 piece of the equation brings it back up in usefulness...but I don't know that I'd want to rely on it exclusively for my web applications just yet, unless high volume storage was a core part of my problem domain.
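To put the downtime figures above on one scale, here is the arithmetic as a small Python sketch. The comment gives no exact numbers, so 5 minutes per month and 2 minutes per year are assumed stand-ins for "several minutes every month" and "a couple of minutes per year".

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600


def availability_percent(downtime_minutes_per_year):
    """Percentage of the year a service is up, given yearly downtime in minutes."""
    return 100.0 * (1 - downtime_minutes_per_year / MINUTES_PER_YEAR)


# Assumed figures, see the note above.
ec2_like = availability_percent(5 * 12)  # 5 min/month -> 60 min/year
top_tier = availability_percent(2)       # 2 min/year
print(f"{ec2_like:.4f}% vs {top_tier:.4f}%")  # 99.9886% vs 99.9996%
```

Under those assumptions the gap is roughly "three nines" against "five nines", which is the difference being described.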

piers · on Sept 19, 2007

I've not used it, but have a read of this: http://www.25hoursaday.com/weblog/2007/07/04/AmazonEC2S3Does...

gojomo · on Sept 19, 2007

That post ("Amazon EC2 + S3 Doesn't Cut it for Real Applications") contains a false statement which the author has not corrected, even though commenters there quickly pointed out its falsehood. The falsehood is: "There is no persistent storage in EC2 so if your virtual server goes down for any reason such as taking it down to install security patches or a system crash, all your data is lost."

Instances can reboot, under operator control or due to a crash, and still retain their existing hard drive storage. Only less frequent 'instance termination' causes a loss of hard drive contents.

True, there are no guarantees that any instance won't be terminated by Amazon or other system failures at any time, so you're supposed to have your own backup and persistence strategy. But in practice, full terminations are rare, and Amazon has even given warnings when system upgrades mean large instance turnover is expected. (Though, there is no guarantee they will do so.)

So the falsehoods above are "[data is lost] if your virtual server goes down for any reason" and both concrete examples given, "taking it down to install security patches" and "system crash".

felipe · on Sept 20, 2007

Three days ago I received an email from Amazon alerting me that "one or more of your instances are running on hosts degraded due to hardware failure", and giving me a week to migrate to another instance. I have a script that creates daily instance images as a back-up procedure, so the "migration" was just a matter of starting another instance using the most recent image and I was up and running.

I looked back through my logs and found out that the particular degraded instance had been running since June without any problems or shutdowns. Not bad for a VPS...

I'm very aware that Amazon provides no guarantees, but I have to say that my confidence in EC2 increased after this incident.
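The "start from the most recent image" step is simple to automate once the daily images are catalogued somewhere. A minimal sketch, assuming you keep a list of (image id, creation date) pairs; the ids and dates below are made up for illustration and the actual script is not shown in this thread.

```python
from datetime import date


def latest_image(images):
    """Pick the most recent image id from (image_id, created_on) pairs."""
    if not images:
        raise ValueError("no images to restore from")
    return max(images, key=lambda pair: pair[1])[0]


# Hypothetical image ids and dates, for illustration only.
images = [
    ("image-a", date(2007, 9, 15)),
    ("image-b", date(2007, 9, 17)),
    ("image-c", date(2007, 9, 16)),
]
print(latest_image(images))  # image-b
```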

davidw · on Sept 19, 2007

From reading the comments, it's kind of difficult to figure out what gets persisted and what doesn't. I mean, if it's persistent until you shut the machine down, that's not so bad... it's like having a VPS somewhere, no?

wmf · on Sept 19, 2007

The local storage is persistent until Amazon shuts down the virtual machine whenever they want. (They say that it only happens due to hardware failures or EC2 platform upgrades, but you have no control over those events.)

kirubakaran · on Sept 19, 2007

You can design around this, given the upside of using EC2+S3. It is better to design your app with potential failure in mind anyway, don't you think?

staunch · on Sept 19, 2007

I think it's super useful for certain kinds of things where you need lots of servers on-demand. For most startups I think boring old-fashioned dedicated server accounts are the way to go. The price, simplicity, and control are unbeatable IMHO.

kvogt · on Sept 20, 2007

We run a live video cluster on EC2 that scaled from 10 nodes to 100+ and back again several times. Amazon keeps things running smoothly even when we're pushing close to 2 Gbps of video to 14,000 simultaneous clients. So I would say it performs very well for jobs requiring lots of computation or bandwidth.

However... we have our web servers in a colo. Ping times to EC2 aren't fantastic, so we use a traditional CDN for static content and a colo for application servers in order to keep things snappy.
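As a sanity check on the bandwidth figures above, the per-client rate works out as follows. This takes "close to 2 Gbps" as exactly 2 Gbps and assumes the load is spread evenly, so it is an upper bound rather than a measurement.

```python
def per_client_kbps(total_gbps, clients):
    """Average kilobits per second per client, assuming an even split."""
    return total_gbps * 1_000_000 / clients


print(round(per_client_kbps(2, 14_000)))  # 143
```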

breck · on Sept 19, 2007

I talked to a guy last night at TC40 who swears by EC2. But he recommends using RightScale on top of it. I've given only a cursory glance at RS, but it looks interesting. And the guy just sold his company in the past week so it definitely worked for him.

davidw · on Sept 19, 2007

First impression:

My requests time out an awful lot:

Read timeout. Please try again later. If this persists please visit the AWS developer forums to see if it's the result of a known issue.

That and doubts about permanent storage are downers.

kirubakaran · on Sept 19, 2007

Can you please give more details about your setup, the country you are accessing EC2 from, etc.? I am curious to know coz I am in the process of putting all my eggs in the EC2 basket now.

davidw · on Sept 19, 2007

I'm in Austria, and I quite often ssh into servers in the US, with no problems at all. I've been having lots of issues with the EC2 tools timing out, though.

kirubakaran · on Sept 19, 2007

Thanks. I've heard that many non-US users have problems with EC2.