With I/O being the top bottleneck in modern databases and web applications, a lot of what people call "unpredictable performance" or "virtualization overhead" has mostly to do with sharing (and thin provisioning) storage devices between multiple servers/nodes.
I think there would be a huge market for a performance-oriented VPS provider who could provide each node with its own, dedicated hard drives/SSDs. All major virtualisation tech (KVM/Xen) already supports raw disk mode.
Obviously, space and SATA ports inside servers are an expensive commodity with off-the-shelf hardware, so this project would require at least some custom hardware to offer competitive pricing. I think the tiny mPCIE SSDs sometimes found in laptops would be a good area to explore.
> With I/O being the top bottleneck in modern databases and web applications, a lot of what people call "unpredictable performance" or "virtualization overhead" has mostly to do with sharing (and thin provisioning) storage devices between multiple servers/nodes.
The thing is, most of the problem with sharing a disk simply goes away once you go SSD.
Spinning disk has the characteristic that sequential access is pretty good... and random access is terrible.
If you have two processes both streaming sequentially off the same disk at the same time, then unless your scheduling algorithm is willing to accept terrible latency for one of them, the disk ends up seeing random access. This is why spinning disk is so terrible for multi-user setups.
SSD doesn't have that problem. SSD has problems with writes that are not cell-aligned, but this is functionally similar to the problems raid5 has, and much like raid5, it's not a problem when it comes to reads.
EC2 is generally very expensive for CPU. RAM and storage are okay but CPU is crazy.
Anyone know what recommends EC2 over Digital Ocean, Vultr, Linode, etc.? Are they more reliable? Enterprise features? Network bandwidth? Cause right now they look hugely overpriced.
I've hosted on Digital Ocean and Vultr for some time and my uptime is great on both. I run constant ping testing and I do see little glitches from time to time between data centers, but that could be network weather on the global backbone. (I have a geo-distributed architecture so there's stuff running at five different locations.)
You can't look at EC2 as just a place to go for a VM. It's not worth using EC2 unless you either:
A) Already have a lot of other infrastructure on EC2.
B) Want to use the other services offered by AWS.
AWS is a collection of services and APIs that you can compose to build big, complex, scalable things. If you just need a VM host, AWS should be the last place you look.
On the flipside, if you have an application that could benefit from outsourcing some of your infrastructure, AWS could save you time and money. For example, we have AWS managing our DB server (RDS), Load Balancing (ELB), Redis/Memcache servers (ElastiCache), static media storage (S3), DNS (Route53), and some of our CDN capacity (CloudFront). We can manage all of these through the same API client (boto), and the inter-service latency is typically low.
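A minimal sketch of what driving several of those services through one client looks like with boto (the region and the specific calls are just illustrative examples):

    # Sketch only: one credential set, several AWS services, same library (boto v2).
    import boto          # S3 / Route53 connection helpers live at the top level
    import boto.ec2      # EC2 uses per-region connection helpers

    ec2 = boto.ec2.connect_to_region('us-east-1')   # region is just an example
    s3 = boto.connect_s3()
    dns = boto.connect_route53()

    print([i.id for r in ec2.get_all_instances() for i in r.instances])
    print([b.name for b in s3.get_all_buckets()])
    print([z.name for z in dns.get_zones()])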
Amazon offers a full ecosystem of services - compute, object storage, CDN, DNS, DbaaS and a lot more - DO and Linode are VPS services with hourly pricing.
I use Linode for production sites. It's easy to create private networks. Performance is better than DO in my testing. I use Digital Ocean for my hobby sites like blog and quick apps since they have inexpensive instances.
> EC2 is generally very expensive for CPU. RAM and storage are okay but CPU is crazy.
I'm leaning towards becoming an EC2 apologist on here, but here are some quick benchmarks of a t2.micro versus both the $5 and $10 Droplets.
sysbench --test=cpu --cpu-max-prime=40000 run
$5 Droplet ("2.0 GHz", bogomips 4000) - 99.4981s
$10 Droplet ("2.4 GHz", bogomips 4800) - 88.3740s
(I can't find any actual documentation detailing the $10 option being faster, so perhaps this is just random luck on instantiation)
t2.micro ("2.5 GHz", bogomips 5000) - 69.5248s
Now of course the t2.micro won't let you run that around the clock, which for many workloads is entirely fine: for a standard blog host and the like, or the overwhelming majority of server deployments, bursty CPU is exactly what natural workloads look like.
Of particular relevance is that the Amazon instance (E5-2670) exposes SSE4 and AVX to your VM, which for many workloads could dramatically increase its advantage.
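If you want to confirm what the hypervisor actually exposes to your guest, a quick look at /proc/cpuinfo is enough; a rough sketch (Linux only):

    # Rough sketch: check which SIMD instruction sets the guest actually sees (Linux only).
    with open('/proc/cpuinfo') as f:
        flags = set()
        for line in f:
            if line.startswith('flags'):
                flags.update(line.split(':', 1)[1].split())

    for isa in ('sse4_1', 'sse4_2', 'avx'):
        print(isa, 'yes' if isa in flags else 'no')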
I guess the whole point of this is that the vague CPU terminology that the various cloud vendors use is seldom really comparable. However, to your core question, Amazon becomes a value proposition when you are using all of the parts -- S3, load balancers, elastic IPs, shared volumes, availability zones, security zones, VPCs, private networks...they all act as multipliers on the value of the platform.
As a quick addition on this, the m3.medium -- running on the same processor but governed differently -- takes 160 seconds to run the same benchmark (after repeated runs).
Amazon used to promote their instances via the somewhat comparable ECU metric. Now, however, unless I'm missing something, you have to figure it out from the narrative descriptions, because 1 vCPU is very much not equal to 1 vCPU on other instance types.
They still use ECU, but I'm not sure it's comparable across generations of instances, e.g. an m3 with more ECU than an older m1 instance of similar size seems at times to be slower.
The value proposition of these instance types seems to be entirely focused on CPU burst performance, with no local storage, no EBS optimization (though there are provisioned IOPS), and only moderate network performance that is shared.
Relatively poor disk performance is somewhat expected. I'm not sure how fair it is to compare it to instance volumes on other platforms, given the significantly reduced flexibility that brings with it.
Any thoughts on how this compares to simply spinning up a standard instance for a few hours then turning it off when you don't need it?
I run a service that needs to run about 72 hours' worth of processing each day, and it all needs to happen during a 3-hour window. That's a natural fit for spinning up a couple dozen instances and then killing them when they finish.
I'd love to see a comparison of what would happen if I kept the same amount of compute power on standby 24/7 using this new instance type.
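Rough numbers for my case (the hourly rate is just a placeholder; plug in whatever instance type actually fits the job):

    # Back-of-the-envelope sizing for "72 instance-hours of work in a 3-hour window".
    work_hours_per_day = 72.0
    window_hours = 3.0
    rate_per_hour = 0.21                 # placeholder on-demand rate, not a quote

    instances_needed = work_hours_per_day / window_hours            # 24 instances
    burst_cost_per_day = instances_needed * window_hours * rate_per_hour
    always_on_cost_per_day = instances_needed * 24 * rate_per_hour

    print(instances_needed, burst_cost_per_day, always_on_cost_per_day)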
It seems like this fits two needs, for smaller companies and/or people just getting started with EC2.
1. Laziness. Which I don't necessarily mean in a pejorative sense. Maybe someone just doesn't have time, yet, to learn/configure/maintain spinning up an instance for limited times.
2. Single instance. To spin up an instance, you need another computer. If you want that "manager" computer to be an instance at EC2, too, now you need two instances. With this approach, you can set up just one instance and get much of the same economic benefit.
EDIT: Also...
3. Predictable cost. If your manual spun-up instance turns out to need to run for 4 hours instead of 2, you get a bigger bill. With the t2 instances, you'll get a slower compute (if you run out of "credits") but not a bigger bill.
Again, this probably appeals most to small/new customers?
> Single instance. To spin up an instance, you need another computer.
I think you can do this with CloudFormation, having it respond to the size of a work queue; however:
> Maybe someone just doesn't have time, yet, to learn/configure/maintain spinning up an instance for limited times.
This is why I can't answer the question above for certain; I got about that far into the documentation and went off to find a simpler solution (for me, tutum: https://www.tutum.co/ )
As far as #2 goes, you don't need another instance. You can do it with time-based autoscaling groups. This does present problems of its own, but ones that are not hard to solve.
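Something along these lines, sketched with boto's autoscale module (the group name and schedules are made up, and the exact keyword arguments vary a bit by boto version):

    # Hypothetical sketch: time-based scaling on an existing Auto Scaling group,
    # so no separate "manager" instance is needed. Names and times are made up.
    import boto.ec2.autoscale

    asc = boto.ec2.autoscale.connect_to_region('us-east-1')

    # Scale up to 24 instances at 21:00 UTC, back down to 0 at midnight.
    asc.create_scheduled_group_action('batch-workers', 'scale-up',
                                      desired_capacity=24, recurrence='0 21 * * *')
    asc.create_scheduled_group_action('batch-workers', 'scale-down',
                                      desired_capacity=0, recurrence='0 0 * * *')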
Yeah, exactly. I do think, however, that a scheme where they automatically repurchase credits when you run low would be nice, such that it just ends up costing something like double a normal instance when it does run low.
Well, a c3.xlarge is $0.21 per hour (their "compute optimized current generation cpu" line).
24 instances * $0.21 = $5.04 per hour
You could run those for almost two hours and match the lowest $9.50-per-month cost of what they're talking about in the blog.
The c3 approach would give you 96 vCPUs during that time. The t2.micro, for $9.36 or whatever per month, gives you one vCPU. I'd have to strongly favor spinning up 24 to 48 c3.xlarge instances and clocking the job in one to three hours if possible.
Burstable jobs? This is where Spot Requests shine. Current spot price for c3.xlarge is $0.032/hr
24 instances * $0.032 ≈ $0.77 per hour
I run all my CI infrastructure from spots for dirty cheap. Sure, it could all be yanked out from under me, but it's been running non-stop for over a year now. Plus, it keeps you from making "special snowflake" instances that you shouldn't. A minute and Puppet/Chef have got you a splendid new instance. ;)
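Putting the two sets of figures quoted above side by side (prices as quoted in these comments, not current rates):

    # 24 c3.xlarge instances for a 3-hour nightly window, using the rates quoted above.
    instances, hours = 24, 3
    on_demand = 0.21     # $/hr, as quoted above
    spot = 0.032         # $/hr spot price, as quoted above

    print('on-demand per night: $%.2f' % (instances * hours * on_demand))  # ~$15.12
    print('spot per night:      $%.2f' % (instances * hours * spot))       # ~$2.30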
If you are running a service that needs to be up all the time this is ideal. For other scenarios, what you describe is probably appropriate.
With burstable instances you accumulate CPU credits every hour (6 per hour on a t2.micro), so you can run at 100% load for a full hour roughly once every 10 hours; larger t2 sizes earn credits proportionally faster.
It would be nice to have a credit window greater than 24 hours though.
EDIT: ColinCera pointed out the math is incorrect. Updated and removed erroneous conclusion.
That's not correct. The way you accumulate credits lets you run for a full hour at 100% CPU, once every 10 hours. So if you have a lightly loaded server most of the day, you can burst to 100% CPU for 2+ hours per day.
A t2.medium starts with enough credits for about 15 minutes of 2 core CPU saturation and accumulates 12 minutes/hour thereafter. In non-burst mode it is about 5X slower. For this type of workload you'd likely be better off with a c3 or m3. t2 is a better fit for long term usage with periodic spikes (20% or less of total operational time).
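A toy model of the credit mechanics being described here (the per-hour earn rates are the published launch figures as I understand them, so treat them as assumptions):

    # Toy model of t2 CPU credits: one credit = one minute of a full vCPU at 100%.
    # Earn rates per hour are assumptions based on the published launch figures.
    EARN_PER_HOUR = {'t2.micro': 6, 't2.small': 12, 't2.medium': 24}

    def hours_to_bank_one_full_hour(instance, vcpus=1):
        needed = 60 * vcpus        # credits to saturate `vcpus` cores for an hour
        return needed / float(EARN_PER_HOUR[instance])

    print(hours_to_bank_one_full_hour('t2.micro'))       # 10.0 hours
    print(hours_to_bank_one_full_hour('t2.medium', 2))   # 5.0 hours for both cores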
Super interesting. If I did the math right, a 3-year heavy reserved t2.micro instance comes out to $4.48/mo, which is cost competitive with Digital Ocean. The proof will come in the benchmarks, but this may become my preferred hosting solution.
It's $77 for 1 year reserved if I'm reading it correctly. That's $6.44 per month for an instance with double the RAM of the DO $5 instance. The specs look like the size that DO is charging $10 for currently. For a 3 year reserved instance it's $4.48 a month for double the size of the DO $5 instance. There's also a free tier, so the first year is free to try it out.
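Amortizing those figures (this ignores the hourly reservation fee, storage and bandwidth, so it's a floor rather than the full bill):

    # Amortizing the 1-year reserved figure quoted above.
    upfront_1yr = 77.0                   # from the comment above
    print(round(upfront_1yr / 12, 2))    # ~$6.42/mo from the upfront alone, vs. DO's $5 and $10 tiers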
DO was competitive with EC2 on price but not on features (and certainly not on security); now, with the price advantage gone...
How many customers use more than 1GB of outbound traffic per month for a $5 server? Data transfer in is free on EC2 and the first 1GB outbound is free too according to the pricing page.
Isn't the first GB of your account traffic free with AWS vs DO giving you 1 TB per droplet?
While 1 EC2 instance may not use more than 1 GB (which is a very low quota unless you're CDNing everything), if you have a couple of instances you're almost certainly going over that.
OK, so about $6.90 per month for 1 year then, including storage. That's still less than a 1GB RAM DO server, which goes for $10 per month, and much less if you go for 3 years. Factoring in 1 year of free tier makes it even less.
It doesn't include storage
It doesn't include SSD storage
It doesn't include provisioned IOPS for SSD storage (well, ridiculously low)
It doesn't include bandwidth
It doesn't include support.
The issue is that this offering is complex to understand, as opposed to DO, which is incredibly simple. It's actually pretty funny how hard the AWS offering is to figure out; it takes many paragraphs of reading.
There's nothing to understand with DO because they don't tell you what their definition of a "CPU" is. For all I know it could be oversubscribed in a worse way than EC2.
Most people don't care about the exact speed of the CPU for a server, as long as it's stable and doesn't lock up under light load, especially when purchasing a Micro instance. If CPU utilization is high, then it's time to upgrade to another CPU or start another instance if possible.
It would be better to do real speed tests of each service to determine average "CPU" speed. I'm sure both services are constantly optimizing for both shared hardware usage and speed, so the stats would have to be updated regularly.
AWS is a simple model: pay for what you use. Most cheap hosts (i.e. please-kill-me DreamHost) give you no SLA and you're rolling the uptime dice. Three years on DH and I wanted to kill myself. Three years on AWS and life is just splendid!
To understand your billing, you need to understand what you're consuming, which you always should. These credits add a little wrinkle, but also make the service cheaper and more deterministic. If you have credits, you'll get the CPU you bought with them.
I would replace 'super interesting' with 'Super complex'. It's an example of how you can make the price of a $10-40/mo server complex to the extent that you need to read the blog post numerous times before you understand the construct.
And even then, one still needs to factor in the 'other' costs like I/O or IOPs, disk (persistent/EBS), IPs, internet and inter-region data transfer… before you understand the real cost.
And then you need to compare to other instance types (which soon will cover the full alphabet -- c, cg, cr, g, h, i, m, r, t… ) and then other providers.
You still have several unresolved issues -
1. Are your assumptions on usage (CPU, I/O, internet, etc.) correct? Will they change?
2. How do I compare performance across providers for a given VM specification?
3. Can I get support when I need it?
And I am sure there are others
It certainly means there is room for other players who just make it simple, whether they are infrastructure folk (like DO/Linode etc) or platform plays that make the pricing understandable by the audience they are trying to target (like Heroku/Ninefold)
The interesting thing about Amazon (vs the VPS market, where DigitalOcean, Linode, and I live) is that when amazon lowers prices, they lower prices for existing customers who don't make changes to their accounts. When a VPS provider like Linode lowers their prices, they usually charge existing customers the same amount, and simply give them more resources.
Just an observation. I'm not criticizing either way of doing things; obviously, lowering prices straight out is better for the customer, and keeping revenue stable while just upgrading hardware is better for the provider. Last time I lowered prices, I lowered prices directly, and just took the revenue hit. I'm planning my next upgrade now, and instead of lowering prices, I plan on giving everyone more ram/disk/ssd, while holding prices steady.
It is something I've thought about... the problem is that I'm going to have to go down by more than half, and it's way easier to lease enough hardware to more than double everyone's allocations than it is to double my customer base to make up for the lost revenue.
> Perhaps I have selective memory, but I've never seen Linode lower prices, they just keep upping the specs on the lowest tier.
Yes, exactly. I'm saying that is the standard way to do it in the VPS market, in part because until D.O. most of us were self-funding, and it's way easier to pay for double the compute resources than to deal with a 50% cut in revenue.
In the "cloud" market where amazon is, the standard way to do it is to directly lower prices.
AWS is a little bit a hybrid of both. If you're paying hour-to-hour, all cuts are immediate. But there is such a huge discount for reserved instances, that many large clients are using a large proportion of them. With a 3-year "heavy utilization" reserved instance, Amazon has gotten a significant % of the total price for running that instance up front, locked in for 3 years. Since the biggest part of the revenue (the reservation fee) is locked in, cutting the hourly rate only gives back a smallish part of the revenue to those kinds of clients.
Really? When they lower the price of the per-hour billing, they don't lower the locked-in fees?
Huh. In the VPS market, from what I've seen, the rule is "treat your existing customers as well as your new customers"
while, say, the co-location market is like the real-estate market. "Subsidize your new customers, and if they are still alive when the lease is up, take profits in the form of much higher renewal rent."
I guess what you describe with pre-pays is sort of inbetween. There's a difference in most minds, I think, between raising a price and just not lowering it when you perhaps could be expected to. Most people new to the real-estate market feel pretty bent out of shape when they find out that they have to pay significantly more in rent to renew their existing contract than they will pay if they move.
Thanks. That's good feedback. We are working hard on the upgrades, but I have been way too slow :(
I do observe that there seems to be a price-floor phenomenon: for any customer, any price below $x is largely equivalent; they will go for the best thing they can get for $x, so providing a better product helps, but lowering the price below $x doesn't change the equation for that customer. Of course, $x is different for each person, so lowering your price does get you customers who had a lower value for $x.
I've already lost most of the customers that had a value for $x greater than what they were paying me; at this point, I'm not losing customers nearly as quickly as I predicted. Right now, if I screw something up, of course, I lose the affected customers; I mean, it's really dramatic. You always lose some customers when you screw something up, but I lose way more now than when my prices were lower than the credible competition's. But other than that, things have largely stabilized.
I should note that I'm a hobbyist/enthusiast type of customer; if you're losing business on price grounds, I can definitely see where a price cut would help.
(Also, it doesn't help that the wiki is crufty and out of date, and the boot menu, last time I rebooted, was still on CentOS 5.)
> I should note that I'm a hobbyist/enthusiast type of customer; if you're losing business on price grounds, I can definitely see where a price cut would help.
I think similar principles govern business spending, only $x for them is usually higher. I have a couple of business co-lo customers who have been customers for like half a decade; some of them are still using the hardware they came in on. They could save a lot of money by upgrading hardware (and thus reducing their footprint) or even moving to "the cloud" at this point, because while co-locating modern hardware is cheaper than "The Cloud" - co-locating ancient hardware is not.
The idea is that it works for them, so they aren't going to fuck with it. I'd bet money, though, that if I fucked something up and caused them a serious outage, they'd be gone pretty quick.
> (Also, it doesn't help that the wiki is crufty and out of date, and the boot menu, last time I rebooted, was still on CentOS 5.)
I just want to acknowledge those problems. We only have vague plans for the wiki, but we're actively working on upgrading the rescue image and the hypervisor (which, I imagine, is the part of the boot menu you are complaining about.) - these changes will probably not be implemented until our switchover to the new ganeti-based system, but... that should be soonish.
Be careful here. Despite announcing 42 price reductions over the 7 years of AWS existence, the M1 tier was only reduced for the first time in 3 years after Google/Azure dropped their pricing. They push through a huge number of price drops, but sometimes (actually often) their headline drop is one very small, unique charge, leaving everything else unaffected.
They will, but only for people who didn't lock in a reserved instance. Amazon doesn't typically upgrade instance types, but rather introduces new ones. So if you lock in a t2.micro instance for a 3-year reserved period, you are stuck with exactly those specs for 3 years. If they introduce a t3.* instance class next year, you don't get a free upgrade to the new specs.
That increases the flexibility, but afaict doesn't help with the upgradeability angle. When Amazon passes along the technological dividend by creating new instances with a better price/performance profile, it's usually as new family types. So your reserved t2.* instance can swap with other t2.* instances, but won't be able to take advantage of any t3.* instance that's introduced in the future.
It's not a huge deal, just something to take into account when projecting out costs: the 3-year reserved instances are locking in today's prices until 2017, in return for a discount over today's non-reserved prices. Whether this produces long term gains requires some assumptions about how the market will change over the next 3 years.
And that's worth quite a lot, given Amazon's bandwidth is expensive. The 2 TB in the Linode $10 account will run you about $200 with Amazon. Doubtful anyone is burning 2 TB on a $10 account, and I'm sure Linode has that in mind, but even if you're using 200 GB of bandwidth that would still cost you over $20.
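The arithmetic behind those figures, assuming roughly $0.10/GB for EC2 outbound transfer (the actual tiered rate varies, so treat this as a ballpark):

    # Ballpark check on the transfer costs above; $0.10/GB is an assumption,
    # roughly in the range of EC2's tiered outbound pricing at the time.
    rate_per_gb = 0.10
    for gb in (2000, 200):
        print('%d GB -> $%.0f' % (gb, gb * rate_per_gb))   # $200 and $20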
Pricing plans for Amazon also include the best security and interconnects bar none, the fastest access to the wide array of AWS and AWS-hosted services, and an ecosystem of tools and apps.
And of course Amazon has transparent disclosure for outages and security issues unlike say Linode.
> The T2 instances use Hardware Virtualization (HVM) in order to get the best possible performance from the underlying CPU and you will need to use an HVM AMI.
I've always used paravirtual AMIs, as I understood that gets the best performance for a Linux box.
Given that I try to use the same self-baked base AMIs for various purposes (and instance sizes), I would either have to mix and match or switch everything to HVM. However, I have no clue what the practical consequences of that would be.
HVM gives the best performance because you can take advantage of certain hardware features through the hypervisor. It's basically more direct access to the hardware, which makes it faster as you don't have as much hypervisor overhead. Amazon's "enhanced" networking and SSDs need HVM to get a good chunk of performance.
Yes, you'd have to build new AMIs with HVM. It'd be easiest if you had some kind of configuration management so you didn't need as many AMIs baked. When I build machines I use a script to handle the creation and mounting of any extra volumes on a machine that I treat as "nonstandard". I have only 2 custom AMIs - one for PV and the other for HVM. You'll need at least both, because if you want to use certain instances (t1.micro and m1.small come to mind) you can only use PV.
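If you want to see which of your baked AMIs are PV versus HVM, a sketch with boto (the region is just an example):

    # Sketch: list your own AMIs and their virtualization type with boto (v2).
    import boto.ec2

    ec2 = boto.ec2.connect_to_region('us-east-1')     # example region
    for image in ec2.get_all_images(owners=['self']):
        print(image.id, image.name, image.virtualization_type)  # 'paravirtual' or 'hvm'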
HVM vs PV is confusing because as Xen improves, the performance characteristics of the two modes change. Brendan Gregg covered the differences in quite some detail in a recent blog post [1]. Basically, if you are running a new enough kernel on the guest OS you will get better performance from HVM.
This looks like it's a reaction to (and effective solution for) the problem with t1 instances that made them largely useless (or a gamble at best) due to sharing a CPU with instances that run at full load all the time.
Any recommendations for software builds? I usually go with c3.4xlarge for building Android platforms but wondering if there are alternatives out there.
That was my understanding too. At $9.50/mo, a server with 96GB of RAM would bring in $912/mo.
A quick click around Dell finds that a mid-range 1U rackmount server (R320) with that much RAM costs $3,135.
So a back-of-the-envelope calculation makes it seem workable, especially for high-RAM low-CPU configurations, which is what this is.
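Spelling that envelope out (hardware price and RAM figures from the comments above; colo, power and bandwidth are ignored):

    # Rough payback period for the oversubscription math above.
    server_price = 3135.0            # 1U box with 96 GB RAM, per the Dell quote above
    instances = 96                   # 1 GB slices sold at $9.50/mo
    revenue_per_month = instances * 9.50
    print(revenue_per_month)                      # $912/mo
    print(server_price / revenue_per_month)       # ~3.4 months to cover the hardware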
There are other tricks that they might be employing, such as swapping out part of RAM to SSDs behind the scenes, as well as compressing RAM contents. On low-load servers like these, typical usage would imply that RAM would be mostly static.