> the amount of compute resources and electricity that would be needed to power real time billing at AWS scale would be astronomical

You don't have to bill in real time.

You just have to provision funding for every resource except network bandwidth.

Customer sets a monthly spend limit. Every time they start up an instance, create a volume, allocate an IP, or do anything else that costs money, you subtract the monthly cost of that new thing from their spend limit. If the spend limit would go negative, you refuse to create the new resource.
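
A minimal sketch of that check-and-decrement, assuming a hypothetical per-account budget record with an atomic conditional update (every name here is illustrative, not a real AWS API):

    import threading

    class AccountBudget:
        """Tracks one account's remaining monthly spend limit (USD)."""

        def __init__(self, monthly_limit_usd: float):
            self.remaining_usd = monthly_limit_usd
            self._lock = threading.Lock()  # stand-in for a real atomic store

        def try_reserve(self, monthly_cost_usd: float) -> bool:
            """Atomically deduct a new resource's monthly cost.
            Returns False (i.e. refuse creation) if the limit would go negative."""
            with self._lock:
                if self.remaining_usd < monthly_cost_usd:
                    return False
                self.remaining_usd -= monthly_cost_usd
                return True

    budget = AccountBudget(monthly_limit_usd=100.00)
    if not budget.try_reserve(monthly_cost_usd=7.49):  # illustrative instance price
        raise RuntimeError("refused: monthly spend limit would go negative")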

If the spend limit is still positive, divide the remaining amount by the product of the seconds remaining in the month and the per-byte bandwidth cost; the result is that customer's network throughput limit in bytes per second. Update that setting in your routers (qdisc in Linux) as part of the API call that allocated the resource. If you claim your routers don't have a limit like this, I call shenanigans.
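
In other words, limit = remaining budget / (seconds left x price per byte). A sketch under stated assumptions: a flat per-GB egress price (the $0.09/GB figure is illustrative) and shaping a single Linux interface with tc's standard token bucket filter. The tc invocation is real; everything else is hypothetical:

    import calendar
    import subprocess
    from datetime import datetime, timezone

    EGRESS_USD_PER_BYTE = 0.09 / 1e9  # assumed flat $0.09/GB egress price

    def seconds_left_in_month(now: datetime) -> int:
        last_day = calendar.monthrange(now.year, now.month)[1]
        month_end = datetime(now.year, now.month, last_day,
                             23, 59, 59, tzinfo=timezone.utc)
        return max(1, int((month_end - now).total_seconds()))

    def throughput_limit_bytes_per_sec(remaining_usd: float, now: datetime) -> float:
        """Highest sustained rate that cannot overrun the remaining budget."""
        return remaining_usd / (seconds_left_in_month(now) * EGRESS_USD_PER_BYTE)

    def apply_limit(dev: str, bytes_per_sec: float) -> None:
        """Push the new ceiling into the kernel via a token bucket filter."""
        rate_bits = int(bytes_per_sec * 8)  # tc rates are in bits/sec
        subprocess.run(
            ["tc", "qdisc", "replace", "dev", dev, "root",
             "tbf", "rate", f"{rate_bits}bit", "burst", "32kbit", "latency", "400ms"],
            check=True)

    now = datetime.now(timezone.utc)
    apply_limit("eth0", throughput_limit_bytes_per_sec(42.00, now))

For scale: $42 of budget left with 15 days remaining works out to roughly 360 KB/s (about 2.9 Mbit/s) at $0.09/GB.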

This should work perfectly for one region.

There's probably a way to generalize it to multiple regions, but I'm sure most small/medium customers would be happy enough to have a budget for each region. They'd probably set most regions' budget to zero and just worry about one or two.

The web UI probably would need to be updated to show the customer "here is what your bandwidth limit for the rest of the month will be if you proceed; are you sure?". JSON APIs can return this value when invoked in dry-run mode.
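
A hypothetical payload for that dry-run response (illustrative field names, not any real AWS response shape), shown as the dict such an API might serialize:

    # Illustrative only: no real AWS API returns this shape.
    dry_run_response = {
        "DryRun": True,
        "WouldSucceed": True,
        "ResourceMonthlyCostUSD": 7.49,
        "RemainingBudgetUSD": 34.51,
        "ProjectedThroughputLimitBytesPerSec": 296000,  # for the rest of the month
    }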




> Customer sets a monthly spend limit. Every time they start up an instance, create a volume, allocate an IP, or do anything else that costs money, you subtract the monthly cost of that new thing from their spend limit. If the spend limit would go negative, you refuse to create the new resource.

AWS systems are highly distributed; this kind of sharp billing cap would necessarily introduce a new strong-consistency requirement across multiple services, many of which aren’t even strongly consistent when considered one at a time (and that’s often true even within a single region).

> Every time they start up an instance, create a volume, allocate an IP, or do anything else that costs money, you subtract the monthly cost of that new thing from their spend limit

For the motivating use case (avoiding a bill on the scale of even $200—possibly even $1—from a free-tier-eligible account), using monthly chunks doesn’t work; you suddenly couldn’t spin up a second simultaneous EC2 instance of any kind after an initial t3.micro instance, which would cut off many normal free-tier usage patterns.

I mean, that’s a good way of capping if you are using AWS as a super overpriced steady-state VPS, but that’s not really the usage pattern that causes the risks that the cap idea is intended to protect against.

This is a particularly poor solution to completely the wrong problem.


> AWS systems are highly distributed

Hogwash, I tried to spin up 100 of your F1 instances in us-east-1 a week or two after they first became available, and found out about this thing called "limits".

Wherever you're enforcing the limit on max number of instances per region is already a synchronization point of exactly the sort needed here.

I'm sorry, this just doesn't pass the bullshit test. Resource allocation API calls are not even remotely close to lightning-quick. There is no fundamental immutable constraint here.

> For the motivating use case (avoiding a bill on the scale of even $200—possibly even $1—from a free-tier-eligible account),

Avoiding a $1 bill is definitely not the motivating use case.

A lot of people would be happy to have a mechanism that could prevent them from being billed 5x their expected expenditure (i.e. they set their budget limit to 5x what they intend to spend). It doesn't matter that it isn't perfect. It is massively better than what you're offering right now.


> Hogwash, I tried to spin up 100 of your F1 instances

I don’t have any F1 instances. Have you mistaken me for an AWS employee rather than a user?

> in us-east-1 a week or two after they first became available, and found out about this thing called "limits".

Yes, individual services, especially in individual regions, and especially a single type of resource within a service within a region (like, say, instances in EC2), are often centralized enough to impose hard limits reasonably well.

Billing accounts (and individual real projects, of which AWS has only the weakest concept; that’s one disadvantage AWS has vs., say, GCP) tend to span multiple resource types across multiple services, and sometimes multiple regions.

> Resource allocation API calls are not even remotely close to lightning-quick.

Resource allocation API calls that have high latency aren’t the only API calls that cost money and would need coordination. Heck, API calls aren’t the only thing that costs money.


> Update that setting in your routers (qdisc in Linux) as part of the API call that allocated a resource. If you claim your routers don't have a limit like this I call shenanighans.

Eh. AWS's edge network is highly distributed. Unless you want an even split of your rate limit across every possible way out of the network, you'd be much better off settling for an even split across your EC2 instances, and there's no room for bursting in this model. Enforcing per-instance limits (on any dimension) sounds pretty feasible, though.
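
The even-split variant is just the account ceiling divided by the live instance count, re-applied whenever an instance starts or stops. A sketch, reusing the hypothetical apply_limit shaper sketched earlier in the thread:

    def rebalance(account_limit_bytes_per_sec: float, instance_devs: list[str]) -> None:
        """Evenly split the account's throughput ceiling across its instances."""
        share = account_limit_bytes_per_sec / max(1, len(instance_devs))
        for dev in instance_devs:
            apply_limit(dev, share)  # hypothetical per-instance shaper from above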

This wouldn't generalize straightforwardly to services that don't have obvious choke points that can impose this sort of throttling, such as, I think, DynamoDB.



