How to Deploy a Rails app to EC2 in less than an hour using Rubber (ginzametrics.com)
90 points by rgrieselhuber on March 10, 2011 | 37 comments



I glanced over the rubber docs and sadly it looks like too many Ruby projects: big promises, a shaky implementation (opinionated in questionable ways), and extremely thin documentation.

There are better building blocks if you're going to roll your own EC2 stack (fog and sumo come to mind).

And if you lack the knowledge to roll your own then no, commands like the following are not the answer:

  ./script/generate vulcanize complete_passenger_mysql
Yes, this may work sometimes and set you up with "something". But there is insufficient documentation about what this "something" actually is, what the rationale behind it was, and how someone who needs a tool like rubber is supposed to keep it running and in good shape.


Full disclosure: I'm one of the rubber devs.

But, did you look at any of the wiki pages? I think all of your questions are answered right there:

https://github.com/wr0ngway/rubber/wiki

I think you largely missed the point of rubber. Rubber could use something like fog (and indeed, we plan to), but it's not a fog competitor. It allows you to do role-based deployment with capistrano.

At the core of it, you have a set of roles (e.g., app, db, redis, web, resque_worker) and a set of machines belonging to those roles. You have role-based config and you can interact with subsets of your cluster through those roles.
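
Roughly, the day-to-day workflow looks something like this (a sketch; check the wiki for the exact task names and variables):

  # spin up an instance and attach it to the app and web roles
  ALIAS=web01 ROLES=app,web cap rubber:create

  # install packages and push the role-based config onto it
  cap rubber:bootstrap

  # deploy to the whole cluster, or just a subset of it
  cap deploy
  FILTER=web01 cap deploy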

heroku is good at what it does. But there is simply no way I could have built Mogotest on it. And most of the rubber projects I've come across would never work on heroku.


> But, did you look at any of the wiki pages?

Yes.

> I think all of your questions are answered

No.

> I think you largely missed the point of rubber.

Could be. My perception is that it wants to be a turnkey solution for people with little admin knowledge. But it is nowhere near that, in its current shape and form.

> At the core of it [...]

Sure, it does some useful things. But none of that is adequately documented. For example, what exactly does the command I cited do? I couldn't find it in the wiki. What if I want postgres instead of mysql? What about backups? How do I upgrade system packages later? Where is the /etc/hosts magic mentioned in the blog post documented (a pretty bad idea, btw)? What kind of network topology will my instances have, and what are the security groups? How do I manage EBS volumes? What if I need redis, memcached, or other components? What AMI will my instances be running? What about load balancing and failover? What about cron jobs, e-mail, all the systems stuff? ...

I could go on for a while. Maybe I just didn't see the wiki-page where all this is explained. And I'm sure much of it can be "discovered" just by trying out and guessing.

However, Heroku exists, is free, well tested, and well documented. As such it's the better black box for someone who doesn't yet know what they're doing.

For people who know what they're doing there's fog, puppet and chef.


I have a feeling this discussion is going to go basically nowhere. But I'll bite.

The command you cited is a generator. It generates a complete passenger and MySQL setup. It's a sensible default that handles a large set of use cases. If you want postgres instead, there's a complete_passenger_postgresql. The full list of modules can be seen at:

https://github.com/wr0ngway/rubber/tree/master/lib/generator...
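
So the postgres variant of the command you cited is just:

  ./script/generate vulcanize complete_passenger_postgresql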

Backups are set up automatically. There's a whole mess of Capistrano tasks for this stuff; run "cap -T" to see them, which is the standard way of documenting such tasks.

Actually, the answers to a lot of the questions you list become evident once you actually try rubber. That's not a great answer, and I'm not saying the documentation is great. But, hey, the documentation for fog is virtually non-existent.

But, no, rubber is absolutely not meant to be a turnkey solution. We grossly simplify a ton of tasks, but at the end of the day, you need to know what you want your topology to look like, what roles you want on which machines, how many of each, how many read slaves you want, etc. The templates we provide make setting up things like backup and slaves trivial, but we don't do it for you automatically.


Well, I'm sorry for being entirely negative, but after looking over the templates... in my opinion you're doing it wrong. For example (at a glance) I don't see any mechanism to upgrade anything after the fact. Am I just missing that, or what's the approach there?

Either way, it seems you're effectively reinventing a subset of puppet. Wouldn't it be more sensible to implement rubber in the form of puppet manifests (or chef recipes)?


You're certainly entitled to your opinion. It's just a strong stance to take on something you surveyed for 15 minutes and then compared to a few tools where the comparison doesn't make a ton of sense. But that may very well be due to holes in the docs (hopefully the article did give a motivating example, though).

I'd have to check the timeline again (I wasn't involved with rubber from the outset), but I think it predates both puppet & chef. Regardless, we've discussed adding a chef adapter. However, it's unlikely that would ever work with the chef configuration server, since a core tenet of rubber is that your entire deployment and provisioning configuration is wholly contained in your project.

Re: upgrading. If you want to upgrade to the latest OS packages, you can "cap rubber:bootstrap" or use "cap invoke" to execute whatever command you want on the remote machines. If you're referring to the rubber modules, you just re-run the generator when we issue an update, much like with other generated source projects. Or modify any of the rubber-*.yml files to specify whatever version number you want, since it's in your version-controlled source tree.
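
In practice that's something like the following (a sketch; "cap -T" lists the real task names):

  # refresh packages/config on the existing instances
  cap rubber:bootstrap

  # or run an ad-hoc command across the cluster
  cap invoke COMMAND="sudo apt-get update && sudo apt-get upgrade -y"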

Don't get me wrong. I'm highly critical of the projects I'm involved in. It's the best motivator I can think of for improving things. I'm just trying to clear up any misunderstandings to elucidate real issues. I think the rubber mailing list might be a more appropriate venue for a technical deep dive, if you want. Or feel free to email me directly (nirvdrum@gmail). Or we could keep going back and forth here, but it doesn't feel very productive.


Oh my god, if you don't like the way rubber is implemented, either don't use it or submit some patches to fix the areas you dislike, but give the guy a break.


> My perception is that it wants to be a turnkey solution for people with little admin knowledge.

We're using it for two reasons. One, we wanted a way to assist in our migration to EC2 without doing a bunch of manual configuration. Two, we wanted a more sane way to scale out and provision our instances based on what had been working on Rackspace, but had all been manually configured. So, we had decent sysadmin experience and knew what the moving pieces were, but didn't want to keep going down the same path.

If I understand correctly, your criticism is two-fold: 1) the documentation is too thin and 2) there is too much voodoo going on.

The docs and surrounding blog posts that we have found have been enough to make pretty good progress in 2-3 days with a relatively distributed architecture. James' blog post is designed to further help people new to it.

Based on the brief amount of time that I've had to look at it, I didn't see too much black magic. It's basically just built on top of Capistrano and provides lots of configuration templates for different components and architectures. I'm sure there is a lot to improve but no complaints so far.

> For people who know what they're doing there's fog, puppet and chef.

This doesn't really move the conversation forward.


I don't know how other people do it, but I can deploy a Rails or a Django app in less than an hour without automated scripts.

Of course, I do have a step-by-step wiki document, reminding me what to do.

But if you know the basics of configuring a web-server, and you're sticking to a preferred Linux distro, like Ubuntu or Debian, the process itself is pretty painless after the first time.

That said, it's pretty useful to have an automated process for getting a new web server up and running in case you're experiencing scalability problems.


It's good to formalize your deploy & provisioning scripts to cut down on human error. And it's nice when bringing up a decent number of machines; I've used rubber to bring up 150 machines without any problems.

The other nice thing that a tool like rubber or moonshine helps out with is auto-updating configuration. E.g., add a new app server and it automatically gets added to the haproxy pool.
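
Conceptually the haproxy config is just a template that gets re-expanded over whatever hosts currently carry the app role, something like this (pseudo-ERB with a made-up helper, not rubber's actual template):

  # haproxy.cfg.erb (illustrative only)
  listen app_pool 0.0.0.0:80
    balance roundrobin
    <% hosts_for_role('app').each do |host| %>
    server <%= host %> <%= host %>:8080 check
    <% end %>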


I don't even know if I'd need any notes to be able to deploy to my platform of choice, Ruby Enterprise Edition and Phusion Passenger running on Debian, in less than an hour.

Not because I'm particularly skilled at it, but because the Phusion guys have excellent install scripts and docs.


I'm also taking into account the time it takes to install prerequisites like Postfix, create the necessary users, set up the firewall, etc...


Do you mind sharing your wiki?


It's internal, but I was planning on writing a big blog post about configuring EC2 / S3 / RDS.


I've been using Rubber for the past few weeks and I'm a huge fan of it. It's completely configurable, the developers and discussion forum are helpful, and the best part is that all of your server administration scripts live right inside your app. It took me a bit of time to tailor it to my needs, but I don't think I'll be moving away from it anytime soon.


Given how long it takes to deploy a rails app to heroku, an hour to deploy now seems like a horribly long period of time.


That's the time to do from-zero provisioning. It is roughly comparable with the amount of time it takes to get a bare-metal VPS spun up and running, if you have Capistrano tasks which are sufficiently up to date to do it for you. (Upload keys, create accounts, apt-get big nasty list of software, install rails, install all gems and dependencies, set config files, check out code, etc etc.)
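
That boils down to roughly this kind of thing, wrapped up in Capistrano tasks (a sketch, not anyone's actual recipe):

  # config/deploy.rb (sketch)
  namespace :provision do
    task :bare_metal do
      upload "config/deploy_key.pub", "/tmp/deploy_key.pub"
      run "sudo adduser --disabled-password --gecos '' deploy"
      run "sudo apt-get update && sudo apt-get install -y " +
          "build-essential git-core nginx mysql-server libmysqlclient-dev"
      run "sudo gem install bundler"
    end
  end

After that, the usual "cap deploy:setup" and "cap deploy" handle checking out the code.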

(If you are doing toy applications, you can get a Heroku instance spun up in under a minute, but I have never had a non-toy Heroku application work on the first try. There is virtually invariably a gem which works fine locally and then dies hard when you expose it to the Heroku environment. After you've debugged and addressed this, subsequent deploys on Heroku take about as long as subsequent deploys via Capistrano.)

Heroku is a wonderful, wonderful system, but "git push look-ma-no-sysadminning-whee" it is not.


We ported a pretty hefty application to Heroku and, surprisingly, didn't need much customization around gems. We did have a hand-written Nginx module that we had to toss out, and we handled what it did at the Rack level instead. The Rack middleware we had to write was really minimal and took only an hour or two to write and test. The application gets around 20-60 req/s on Heroku, and we haven't had to touch a thing since the migration (/me knocks on wood).


By having full access to EC2 you naturally can do things that are either considerably harder or just not possible with heroku. If you don't need that additional flexibility, then by all means, use heroku as it may be the best tool for the job.


Costs a lot less to bypass heroku for the larger apps


I wouldn't argue against this point or any of the others in response. I was more musing out loud about how an hour just doesn't seem that fast to me anymore, whereas a few years ago it would have.


According to the article, it's an hour (ish) for the initial setup, around 20 minutes to provision new infrastructure (add or remove instances), and around 10 seconds to deploy. The key point being you don't need to add a new server every time you push code.


Moe has some good points about maintaining these instances, keeping things up to date, and launching more sophisticated infrastructure beyond a vanilla Rails/MySQL stack.

In my experience, Chef is really not that difficult to get set up, especially for the benefits it gives you: idempotent recipes with a great deal of control over the configuration and setup of your infrastructure.

The blog post mentions having to comment out migrations that will fail. That's bad: when bootstrapping machines, a db:schema:load is far more appropriate and less brittle.
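
I.e. on a fresh box, instead of replaying every migration from day one, something like:

  RAILS_ENV=production rake db:schema:load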

Rubber seems like a half-solution for maintaining infrastructure. The blog post mentions that it was picked to prevent having to manually configure boxes, but it sounds like they chose a tool that only half solves their problem and then leaves them to do manual maintenance.

Now, with some experience, I can get a Rails machine booted and running an app in 20 minutes total with Chef. It's really pretty easy, and having tons of resources, lots of recipes, excellent support, and a sane system makes it super flexible.


The thing that annoyed me about Chef was the need for a Chef server (or their Opscode service). Rubber may turn out not to be ideal, so we'll see, but so far it seems pretty logical and not that difficult to maintain.

The part about commenting out a failing migration is a bug that shouldn't exist in that particular process, not relevant to Rubber.

I don't mean to sound like I'm vigorously defending Rubber because we just started playing with it but, so far, the criticism seems a little overblown.


> The thing that annoyed me about Chef was the need for a Chef server

You may not have heard of Chef-solo? I've been managing 8 EC2 servers for a project with it and Capistrano, relatively painlessly.
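
The basic pattern is to push your cookbooks to each box and run chef-solo over ssh, roughly like this (a sketch; paths and roles made up):

  # Capistrano task sketch: push cookbooks, then converge with chef-solo
  task :converge, :roles => :app do
    upload "chef", "/tmp/chef", :via => :scp, :recursive => true
    run "sudo chef-solo -c /tmp/chef/solo.rb -j /tmp/chef/node.json"
  end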


If you're looking for something similar for GAE, the folks at http://www.openbd.org/ have a new installer for their open source CFML engine. It's just a matter of choosing in the installer whether you want it on your local machine or on GAE.

I believe every language is going to have to simplify the cloud install.


When would you use this kind of setup over Heroku?


As a general guideline, I look into EC2 when:

1) My scale is significant enough to require at least 2 servers (unless one of them is RDS).

2) I have some background processing.

3) I can commit to reserving an instance for at least a year.

A small EC2 instance running 24x7 will cost me around $40 / month (+ bandwidth) if I reserve at least a year's worth of usage. That's 1.7GB of RAM and 1 EC2 compute unit. For $163.43 you can get a pretty honking large instance (way cheaper than any competing VPS offering). Add in ELB and RDS and you've got a very scalable setup for very little coin.

I'm not sure how Heroku dyno or worker performance measures up to EC2 compute units, but I'd love to hear from someone who knows.


It's interesting to see this line of questioning because the post to me was more about Rubber than about EC2 vs Heroku.

For me, Chef is a no brainer for bootstrapping and managing my instances.

Heroku is also a no brainer, but as rapind pointed out, it's for simpler setups or for getting an app up quickly without a lot of ceremony.

We've deployed to Heroku and eventually moved to EC2, but have kept several apps on Heroku.


Do you remember what your reasons were for moving to EC2 directly? I.e., how much more processing / memory / etc. did you need? I know that when I start adding in some typical features (background processing, redis and / or memcached, wildcard domains, solr search, hourly crons) the Heroku prices really start to add up. Significantly more than EC2.

For example, the Heroku addon for 100MB of memcached is $20 / month... which far exceeds the cost of an EC2 box.


Sorry, didn't see this reply.

We've been using EC2 for a while now, before most of these plugins were available for Heroku. We're extremely comfortable with the platform and our tools for managing it.

We have most of our customer-facing nodes on m1.large instances since we have a lot of caching (mostly loading all of the data in memory) up front and then the rest is CPU bound.

Scaling Heroku is nice since it's just some knobs you turn, but Chef isn't that much more difficult and it's just a different type of knob.


How much does an EC2 box cost? Less than $20/month?


I assume you're referring to the memcached addon reference? That's $20 for 100MB of cache RAM. The Amazon small instance is 1.7GB of RAM for around $40, which makes the memcached addon roughly an 8.5x per-megabyte markup.
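
Back-of-the-envelope:

  memcached addon:  $20 / 100MB   = $0.200 per MB-month
  small instance:   $40 / 1700MB ~= $0.024 per MB-month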


Is there something like this for Django?


I have a fabfile I made to deploy to EC2. https://gist.github.com/860576 Edit: It takes a lot less than 1 hour


cool, I'll check it out!


I'm thinking about writing a series of how-tos describing how to set up something like this with EC2 and Puppet - in my tests, I can go from bare-metal to a running app in under ten minutes. That's including the OS install (for physical/VMWare-based servers) and provisioning time (for EC2-based servers).

I have recently been testing it with a pretty basic Django app, but I have previously had success with large PHP CMSs, including all of the moving parts such as the DB server etc.

Would you be interested in reading these, if I make them a bit more Django-centric and push all the code to github?



