EC2 to VPC: A transition worth doing. (kiip.me)
80 points by mitchellh on Nov 3, 2011 | hide | past | favorite | 19 comments



"All [EC2] nodes are internet addressable."

Not true. You'd have to adjust your Security Group policies accordingly for that to be true.

"All [EC2] nodes are on a shared network, and are addressable to each other."

Misleading. You'd have to be clinically brain dead to allow this to happen by explicitly setting that policy in your Security Groups. You should also try to avoid Tweeting your admin login creds if this is an issue.

VPC has good features not available on EC2 but this perspective on it boils down to "I need a sandbox because I don't understand how firewalls work."


All [EC2] nodes are internet addressable.

They all have internet-routable IPs/hostnames.

Proper use of security groups would address most of the concerns described in the post.

nodes in security group A (load balancer) can access nodes in security group B (app server)
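That SG-to-SG relationship can be sketched with a toy rule model. Everything below is hypothetical (the group names, ports, and rule format are made up for illustration, not real AWS identifiers or API calls):

```python
# Toy model of security-group ingress rules: each rule on a group says
# "traffic from `source` is allowed on `port`". Names are hypothetical.
rules = {
    "app": [{"source": "lb", "port": 8080}],       # app accepts the LB group
    "lb":  [{"source": "0.0.0.0/0", "port": 80}],  # LB is public on 80
}

def allowed(src, dst, port):
    """True if the dst group's rules admit traffic from src on port."""
    return any(r["source"] == src and r["port"] == port
               for r in rules.get(dst, []))

print(allowed("lb", "app", 8080))         # True: LB -> app is permitted
print(allowed("0.0.0.0/0", "app", 8080))  # False: the internet -> app is not
```

The key design point is that the app group's rule names another *group* as its source rather than an IP block, so the app tier is never directly internet-reachable.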


TL;DR: According to the author, VPC offers better security because your nodes live inside a non-publicly-facing VPN. This also simplifies pre- and post-deployment processes.

My take from the blog post: VPC is a transition not worth doing if you are a startup (we are on HN after all), since, and I quote, "[it] requires a substantial amount of domain knowledge [and] VPC is far more complicated than EC2." Keep things simple, and stay focused. EC2 is perfectly adequate for most endeavors.


Big news: you actually need to learn networking if you're going to manage a network.

VPC is just basic networking, reinvented by AWS because they screwed up that part real bad in the first EC2 arch. VPC is not complicated; it probably takes half a day to figure out. And compared to the ridiculous amount of time it takes to set up EC2 in the first place (when you come from a classic datacenter), it's a good investment that's not going to show in the budget.


Does this guy know what he is talking about? The default Amazon security policies don't allow anyone to access your machines by the internal IP (or external IP). His link to "Serious security issues" goes to a slideshare on caching. Can anyone write anything on a blog and be an expert? Also, he talks about EC2 & VPC like they are different things. VPC is part of the same infrastructure; it's more of a feature.


I think if you take a moment to read the context you would have a better understanding of the security issues involved. EC2 acts like one giant multi-tenant network; the only thing preventing access to your nodes is the SG policies, which are fairly trivial to mess up. The linked slides show people running memcached in EC2 with weak SG configurations, exposing their data to the public internet (this includes some very large companies).

The primary advantage of VPC is that it provides a private, single-tenant network, so you don't need to rely on complicated SG policies to maintain security.

Although VPC is part of the same infrastructure, the two have dramatically different network configurations, which makes them clearly distinct. The networking environment of VPC much more closely resembles what you would have on physical infrastructure: you have control over the routing tables, DHCP, DNS, network topology, etc. These are all things that are out of your control on EC2.


So: I'm trading the need to cluefully and actively manage and audit my security group policies for... the need to cluefully and actively manage and audit an entire network topology? Which requires a lot more knowledge and moving parts, does it not? Can't you mess up your VPC configuration just as easily as you mess up security groups?

Just because various famous companies failed to properly audit their security group settings doesn't mean it's rocket science. It just means that they didn't notice the problem in advance.

Write a script that captures your SG policies and writes them to a file in some nice parseable format. The dumbest possible version of that might be:

  ec2-describe-group | sort > /home/MY_SG_SETTINGS
where I threw in a `sort` because I don't know that ec2-describe-group returns rows in a consistent order.

Write another script that runs the above command every hour and yells loudly if the output ever changes. Wire that script to (e.g.) Nagios.
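A minimal sketch of that change detector, assuming the snapshot approach above. The dump rows below are made up, standing in for real ec2-describe-group output:

```python
import hashlib

def snapshot_digest(sg_dump: str) -> str:
    """Hash the sorted lines of a security-group dump, so row order
    doesn't matter (same reason as the `sort` in the one-liner above)."""
    lines = sorted(sg_dump.strip().splitlines())
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()

def changed(baseline_digest: str, current_dump: str) -> bool:
    """True when the current dump no longer matches the saved baseline."""
    return snapshot_digest(current_dump) != baseline_digest

# Hypothetical rows standing in for real ec2-describe-group output:
baseline = "GROUP sg-lb tcp 80 0.0.0.0/0\nGROUP sg-app tcp 8080 sg-lb"
digest = snapshot_digest(baseline)

reordered = "\n".join(reversed(baseline.splitlines()))
print(changed(digest, reordered))   # False: same rules, different order

tampered = baseline + "\nGROUP sg-db tcp 3306 0.0.0.0/0"
print(changed(digest, tampered))    # True: a new rule appeared
```

The cron/Nagios wiring is just "run this hourly, page on True"; storing only the digest rather than the full dump keeps the check stateless and cheap.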

Now, use the time you've saved not implementing VPC to write some tests that attempt to connect to various important ports from outside your security group. Log lots of scary warnings if they ever succeed. Wire those to Nagios, too.
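A bare-bones version of such a probe, using nothing but a TCP connect. The hosts and ports here are local stand-ins; in a real check you would run this from *outside* the security group against your instances' addresses and alarm on any True:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Attempt a plain TCP connect; True means something answered,
    i.e. the port is reachable from wherever this probe runs."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Quick local demonstration: listen on an ephemeral port and probe it.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
probe_port = srv.getsockname()[1]
print(port_open("127.0.0.1", probe_port))  # True: the listener answered
srv.close()
```

A connect-level probe like this deliberately knows nothing about the service behind the port; "anything answered at all" is exactly the scary condition the parent wants logged.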

Unless your security groups are numerous and complicated, or your developers demand the power to open and close arbitrary ports on arbitrary machines seven times per working day, this would seem adequate for most use cases.

I know there are use cases for VPC, but it doesn't feel like this is one.


I'm not saying that one solution avoids the need for setting up things in a sane way. In both cases, you need to configure things properly or you can shoot yourself in the foot.

However, depending on the systems your company uses and your configurations, one solution might be simpler than another. Network topologies tend to be very static. You might have a public subnet, and a few private subnets.

Over time, you might add or remove web servers to the public subnet, or add various services to the private subnets. Especially if you are using a Service Oriented Architecture (SOA), you might have many different services that need to interact with each other, but not with the public internet.

When you are using security groups, you need to make sure to have separate SGs for each category of machine (public, private), manage them at a service level, and manage the interaction between multiple SGs. For example, Web should be available on port 80 from the internet, but the DB should only be accessible on port XX from the Web SG, and also on the same port from a Service SG, etc.

So as you continue to iterate on your services and deploy new ones, you need to be constantly tweaking the security group configurations. With VPC, once you have a sane public/private topology you can forget about it.
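A toy illustration of why that bookkeeping scales badly (the service names are hypothetical):

```python
from itertools import permutations

# With one SG per service, every ordered pair of services that must
# communicate needs its own ingress rule to be written and audited.
services = ["web", "auth", "billing", "search", "db"]

# Worst case, every service calls every other: n*(n-1) rules.
pairwise_rules = list(permutations(services, 2))
print(len(pairwise_rules))  # 20 rules to keep correct for just 5 services

# A VPC public/private split replaces most of this with topology:
# internal services share a private subnet, so the rule count no
# longer grows with every new service-to-service edge.
```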

Additionally, most SOAs try to provide some form of high availability. For us, that means cross region / cross AZ replication and availability. Doing cross AZ is fairly simple in both EC2 and VPC, but doing cross region in EC2 is a pure nightmare. You cannot apply a security policy across regions, so you have no simple way to allow your nodes to communicate.

Since VPC acts as a distinct private network, we can simply use site-to-site VPN configurations between our regions, and nodes can easily and freely communicate with each other. There is nothing to worry about, since the private subnets are connected over VPN, and are using hostnames that are only routable within our private network.

Don't get me wrong, security groups can be properly used to provide a totally secure environment where only trusted nodes are allowed to communicate. You can add monitoring and configuration testing easily as well. But once you try to scale up past a few servers, move to a SOA, and provide cross-region availability, VPC becomes the simpler alternative.


Sorry, you're right that the default Amazon security policies only allow your nodes to access each other. However, the slideshare I linked to goes to a post about cache _mining_ of EC2 nodes, where open memcached instances were found on big sites such as Bit.ly, Gowalla, and PBS.

The point I meant to make was: yes, you can use SGs to make your network secure on EC2, but it's very easy to shoot yourself in the foot, as is evident when even large companies on EC2 make the mistake. VPC, on the other hand, provides a secure-unless-you-try-really-hard environment.

NOTE: Modified the post to hopefully make this more clear.


I agree (in a less caustic way). I'd make this jump only if I had ample time on my hands. I'm sure there is context here that justifies it which was left out of the post.


The slideshare is a talk about leveraging memcached access into more "interesting" access. From the slides: "Cached high scores suck; where's the good stuff?" Apparently EC2 was a target rich environment at the time of the talk. Using a non-publicly routable address for your memcached server (and back end servers in general) is one way to fail closed.


The showstopper for us, when it came down to whether or not we would use VPC, ended up being the lack of Elastic Load Balancing support:

http://aws.amazon.com/vpc/faqs/#E13

I will certainly revisit this topic when ELB is inevitably available for VPC based customers.


This can be worked around by adding a gateway.

Micro instances are also unsupported within a VPC, which is also a showstopper for some.


I recently launched a B2B platform that required a VPN connection to each of my customers' back-end servers running a proprietary software package on top of MSSQL. In order for my app server to have access to the client's MSSQL server, which was on a different machine than the VPN server itself, I needed the advanced routing in AWS VPC ...

EC2 was enough for me to prototype my system to get 2 clients on board, but I needed VPC to accomplish the multiple VPN connections that I use ...

Also, my storage servers have no need to be addressable by my clients and/or the Internet at large. So while I could have done that with security groups, having them in a private subnet made things much easier ... and now I know that from an IP level the file storage cannot be found from outside of my VPC.

I'm sure if I sat down and thought about it I would come up with more ...


Every time I think of VPC I ponder the missing features and wish that Amazon had made/will make a better job of building an ecosystem around AWS.

In particular, I would love to be able to create & sell managed loadbalancers, accelerators, component applications etc as DevPay billable AMIs. But that's not possible with VPC (and because I'm in Australia), so everyone using VPC has to reinvent a whole bunch of wheels to use it. It's a reduction of value all round.

So, ingenious though the general concept of VPC is, two years on it's still a half-baked product with a high opportunity cost.


Cripes - that's a lot of machinery - we simply use security groups as roles, no iptables at all.

The only groups/roles that allow external access are:

  - proxy (80,443 /0)
  - extranet (80,443 office-ips/32)
  - admin (22 office-ips/32)
The rest of the security groups are set up appropriately to the roles, e.g. db allows 3306 to app, extranet, ...

Instances can do discovery through the EC2 API: look for a machine with the role they require.

Once an instance is booted and the packages are installed, it is maybe a minute or less to availability.
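The discovery-by-role step described above can be sketched against canned data. The instance records below are made up; a real version would query the EC2 describe-instances API and read each instance's security-group names:

```python
# Hypothetical instance records standing in for a describe-instances
# response; "groups" plays the part of each instance's SG/role names.
instances = [
    {"id": "i-001", "groups": ["db"],           "private_ip": "10.0.1.5"},
    {"id": "i-002", "groups": ["app"],          "private_ip": "10.0.1.6"},
    {"id": "i-003", "groups": ["app", "admin"], "private_ip": "10.0.1.7"},
]

def discover(role: str) -> list:
    """Return private IPs of the instances carrying the given role."""
    return [i["private_ip"] for i in instances if role in i["groups"]]

print(discover("app"))  # ['10.0.1.6', '10.0.1.7']
print(discover("db"))   # ['10.0.1.5']
```

Treating SG membership as the single source of truth for roles is what keeps this scheme free of a separate service registry.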


It's very likely that the memcached instances that they found are companies that were using IP blocks in security groups instead of other security group IDs. This is their own fault.


VPC is definitely more like real hardware than EC2 but I take that as evidence VPC is premature optimization rather than support for undertaking this sort of migration.


Last I checked, you could not use cluster compute instances within VPC. Nor could you use the Elastic Load Balancer. Just be aware of what is supported and what isn't.



