Hacker News
AWS announces per-second billing for EC2 instances (techcrunch.com)
305 points by jonny2112 on Sept 18, 2017 | 139 comments



That's sure nice, but I'm waiting for AWS to switch to automatic sustained use discounts [0] like GCP offers.

[0]: https://cloud.google.com/compute/docs/sustained-use-discount...


They would legitimately lose a lot of money if they did this now (and, I would argue, wouldn't make up for it in market share). Most companies I work with who switch to AWS think it's a 1:1 cost conversion from a data center and pretty much just "leave AWS on", not realizing that they are paying not only for compute but also for the hidden cost of being able to scale up more instances quickly.


It's really, really difficult to turn instances on and off as needed, like turning a test instance on in the morning when developers come in and turning it off in the evening when they leave.


I don't understand. If autoscaling groups are too complicated, just create the instance and stop and start it when you want. You can control permissions via IAM groups and instance tags if you need to be careful. Or you can indirect through a lambda if you need to be super careful.

Good morning! `aws ec2 start-instances --instance-ids i-12345678900`

Good night! `aws ec2 stop-instances --instance-ids i-12345678900`


You can implement this with a simple auto-scaling group or write a small Lambda function which runs at a certain time to start/stop the instance


Autoscaling groups can be scheduled without a lambda function.

http://docs.aws.amazon.com/autoscaling/latest/userguide/sche...


You typically want exactly 0 or 1 instances for test instances. I don't recall autoscaling groups being able to do that.


This is entirely possible: scheduled actions can handle it (simply schedule desired-capacity=1 when you want it on and desired-capacity=0 when you want it off), or you can simply run a command to do it on demand.
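Roughly, with boto3 (untested sketch; the group name, times, and sizes here are made up, and the cron expressions are evaluated in UTC):

    import boto3

    asg = boto3.client('autoscaling')

    # Bring the single test instance up on weekday mornings...
    asg.put_scheduled_update_group_action(
        AutoScalingGroupName='dev-test-asg',   # hypothetical group name
        ScheduledActionName='weekday-start',
        Recurrence='0 8 * * 1-5',              # 08:00 UTC, Mon-Fri
        MinSize=0, MaxSize=1, DesiredCapacity=1)

    # ...and take it back down in the evening.
    asg.put_scheduled_update_group_action(
        AutoScalingGroupName='dev-test-asg',
        ScheduledActionName='weekday-stop',
        Recurrence='0 18 * * 1-5',             # 18:00 UTC, Mon-Fri
        MinSize=0, MaxSize=1, DesiredCapacity=0)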


I have an autoscaling group that goes from 0 to 1 every day for some batch processing. It took a bit of finagling but it's definitely possible.



Tag it with something like shutdown=non-biz-hours, or you can even specify a schedule in the tag that a Lambda function picks up every minute or hour and parses to stop/start any instance that falls within that window.


Well, that works if you're a startup in one timezone with no teams depending on your service. If you work at a larger company, it's quite possible the sun never sets on your project (west coast, east coast, Europe, India, Australia or SE Asia).


If the sun never sets on your project, then you probably don't want to shut it down because you're using it 24x7.

But since a Lambda is a program, you can make the criteria as complex as you like.

Tag it with "shutdown=non-biz-hours", and "timezones=UTC-7,UTC+0,UTC+5:30" and your shutdown script can figure out when it's outside of business hours in all of those timezones and shut it down.


If the sun never sets on your project, why would you want to shut the instance down?


One reason would be that the savings is worth the increased latency. Depends on the situation.


I'd only do this strictly in dev or qa. Remote teams can always spin it back up if they need it on a wider schedule.


This is really easy in Azure; it's a built-in function of the admin console. I control my dev VMs manually usually, but have the cutoff set for a couple of hours after usual quitting time as a backstop.



Yes I still don’t know why this feature isn’t in the AWS Console yet. Even Oracle Cloud of all people has this feature in their Virtual Compute.


It is, it's just that AWS only gives you the building blocks.

To implement it, use an autoscaling group with a size of 1, then change the desired capacity to 0 or 1 as needed.
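Something like this (boto3 sketch; the group name is hypothetical):

    import boto3

    asg = boto3.client('autoscaling')

    def set_test_box(on):
        # Desired capacity 1 means "on", 0 means "off"; min/max on the group stay at 0/1.
        asg.set_desired_capacity(AutoScalingGroupName='dev-test-asg',
                                 DesiredCapacity=1 if on else 0)

    set_test_box(True)     # good morning
    # set_test_box(False)  # good night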


So basically you want to [pause] all the instances.


Which is why reserved instances exist.


The reserved instance pricing model is really poorly implemented. You have to commit to an instance class and a region; if you need to change, you can to a degree, and there is a secondary market. When compared to a sustained-use model, it is a disaster. Managing all that in a large org is a real pain in the ass.


There are so many options for RI purchases honestly.

Standard vs. Convertible: Convertible allows you to switch between instance families (like c3, m3, m4, i2, r3, etc.) but it requires you to make a 3yr commitment, commit to a specific AZ, and doesn't offer the same level of savings you'd get with a 3yr Standard; I think it's closer to the savings you'd get with a 1yr Standard RI. Standard RIs come in 1yr or 3yr commitments.

You can also choose between the default "Regional Benefit" option, which allows you to apply the RI to any instance that meets the RI criteria in a given region, or the "Capacity Reservation" option, which requires you to commit to a specific AZ to guarantee your reservation. That's right: your 'reserved' instance isn't necessarily reserved in the case of an outage unless you commit to an AZ.

One benefit that you get with Standard RIs is that they can be applied to any size instance in the family for which you've purchased them. Amazon allows you to convert between nano, micro, small, large, 2xlarge, etc. within an instance family (t2, c3, m3, r4, etc.) with Standard RIs, and they apply these conversions automatically on your bill.

Then you also have to choose how you pay, no-upfront and all monthly (smallest capex), partial upfront/partial monthly or all-upfront (largest capex).

It's frustrating because it's so complex that it makes me hesitate. I never feel like I'm getting the best deal and it just feels like AWS is taking advantage of the market by making pricing impossible to decipher.

Does anyone have any good methods for determining how many RI's to purchase, how often etc?


Have you seen/tried https://stax.io/


Reserved instances are really a cultural hack. It's easy to get opex budgets, but capex is a different, harder process. Reserving strays into capex.


Not just region, I think it's per AZ.


It is per region, but you can request a reserved instance be moved between AZs.


I do this with Python+Boto+Jenkins: https://giorgiosironi.github.io/talks/ec2_ci/slides/ Until now we were turning off at the end of the hour (e.g. 2:55:00, 3:55:00) but now we have to change the model a bit.


I'm waiting for GCP to support preemptible GPU instances. Would love to be able to spend ~50% less on GPU instances when running a batch job that can be stopped and reloaded.


Not sure that will happen anytime soon. It's the same with Local SSD. The virtualization of this environment is very challenging, maybe beyond a point where it makes sense for them to do so.


But GCE already has local SSDs for preemptible instances, and they even cost less.


Wasn't aware they added the support. Well, maybe they can figure out GPUs too.


I'd be surprised if it'd be that much additional work, if the local SSD options are NVMe -- which is also PCIe, like GPUs.


Wait what, would they expose the raw SSD pci device to your vm? What's stopping you from scraping all the leftover data from the previous customer?


Probably something along the lines of secure erase. Most modern SSDs/NVMe drives are encrypted by default in firmware. All the firmware needs to do is throw away the old keys and generate new ones. It's better than zeroing the drive, as there is no wear on the write cycles, and it guarantees that the slack space in the SSD is also cleared, which dd'ing to /dev/nvme0 won't be certain of. The nvme-format tool can be used for this: http://manpages.ubuntu.com/manpages/zesty/man1/nvme-format.1...


On newer SSDs, the sanitize command would be preferable for this use over the format command. IIRC, the format command doesn't require quite as strong a security guarantee as the sanitize command: the latter ensures that user data is cleaned from both the flash and all buffers, CMBs, etc.


I've seen some hints that GCE uses NVMe emulation not PCI passthrough. This would allow the hypervisor to implement features like live migration.


I find it interesting to think about virtualization on a spectrum with passthrough at one end and pure emulation at the other. Particularly when applied to I/O peripherals, if the bulk of the I/O can effectively be classed as payload, stepping slightly away from passthrough while maintaining roughly the same layer of abstraction grants the implementer considerable liberties in implementation with very little overhead.

If one were to look at storage, specifically, the move to 4k block sizes would seem like a particular boon in terms of increasing the volume of data covered by any given IOP.


Disclosure: Jon works on virtualization for Compute Engine :). (and I work adjacent-ish)


Your data is encrypted as soon as it exits the VM and before it's written to storage.

https://cloud.google.com/compute/docs/disks/#ssd_encryption


It wouldn't be a switch, it'd be a new feature. GCP also offers the equivalent of explicit instance reservations


Good point, poor choice of words on my part.


That is cool if you maintain a fixed number of instances but it doesn't seem to be very cost efficient if you are implementing autoscale


GCP billing is all calculated on a cpus/ram/hours basis. You're basically buying capacity units (and can even commit long-term for more discount) and can use that capacity in whatever way you want.

Running 2x 4cpu instances for 1 month is the same as 4x 2cpu instances for 1 month, and will come out the same even if you switch in the middle.



I really wish AWS would allow users to cap billing. Something that freezes all AWS services if the monthly bill exceeds X would make me a lot more comfortable when experimenting with AWS.


Agreed. Right now you can set billing alarms, but not actually freeze billing. http://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/...
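For reference, a rough boto3 sketch of such an alarm (assumes you've enabled "Receive Billing Alerts" in the account preferences; the threshold and SNS topic ARN are placeholders):

    import boto3

    # Billing metrics only exist in us-east-1, regardless of where your resources run.
    cw = boto3.client('cloudwatch', region_name='us-east-1')

    cw.put_metric_alarm(
        AlarmName='estimated-charges-over-100-usd',
        Namespace='AWS/Billing',
        MetricName='EstimatedCharges',
        Dimensions=[{'Name': 'Currency', 'Value': 'USD'}],
        Statistic='Maximum',
        Period=21600,                       # 6 hours; billing data only updates a few times a day
        EvaluationPeriods=1,
        Threshold=100.0,
        ComparisonOperator='GreaterThanThreshold',
        AlarmActions=['arn:aws:sns:us-east-1:123456789012:billing-alerts'])  # placeholder topic

It only notifies you, though; nothing actually stops the spend.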


Looking at the CloudWatch console, I see I can also add AutoScaling and EC2 actions that are triggered by an alarm. That still leaves open other risks like a bandwidth bill for S3 hosted content.


This is the main reason I keep a $40/month Linode running, when I'd probably save if I used AWS -- most of the time my instance isn't doing a lot. I just don't want the anxiety of a billing surprise due to a mistake, or a DDOS coming my way.


Have you looked at https://amazonlightsail.com/ as a way to get a billing system similar to Linode?


As someone who has used both of those as well as DigitalOcean, Lightsail performance is absolutely terrible compared to the others. It feels like you have perpetually run out of CPU credits.


Lightsail is just renamed t2 instances with storage and bandwidth bundled in. Also, last time I checked (which, granted, was the week Lightsail came out), you can't even check your CPU credit balance on a Lightsail instance. You also can't attach an EIP to a Lightsail instance.

Lightsail is dumbed down to the point of basically being worthless. There's really no reason to use Lightsail over Digital Ocean.


Lightsail is a baby step into the AWS environment. Once you're in the AWS console and Amazon has your credit card number, a psychological barrier has been breached, making it much easier to start experimenting with their other offerings.


I lost $1200 to an s3 mistake when I was broke years ago, it really sucked.


I mistakenly got a $1,400 bill. I just called and said it was a mistake and I didn't want to pay it, and they said OK. Probably too late for you, but maybe helpful for someone in the future.


They said my demo was publicly accessible and it was, I probably talked too much :(


Have you looked into a tool like Gorilla Stack - https://www.gorillastack.com/?

(Note: I have no investment in the company)


Thanks Matteblack (whoever you may be) - my name's Oliver and I got alerted to this mention of GorillaStack.

Without wishing to be too promotional, you can set a trigger to shut off EC2 when a cost threshold is reached. You can currently automate shut off of RDS but not yet from a cost threshold (that will be available shortly).

TLDR: we can do a lot of what you ask but not all of it.

Feel free to reach out if you'd like more info.


Oliver, curious to know: how did you get alerted to this mention?


Probably from some online brand monitoring service. I've used https://brand24.com/ and it was quite powerful.


This is really dangerous due to the way clouds calculate billing; it isn't a mistake that cloud providers want to enable users to make.


Per second billing is somewhat of a gimmick just so Amazon can say they are more granular than Google Compute. The difference between seconds and a minute of billing is fractions of a cent. Rounding errors.

The exception is Google Compute has a 10 minute minimum, so if you are creating machines and destroying them quickly, per second billing will be noticeable.


I think the useful comparison people are making is the difference between the previous per-hour billing and new per-second billing. Sure, if they can get some mileage comparing per-second to per-minute, great. At the end of the day isn't the increased granularity better?


I think "second vs. minute" might be into "distinction without a difference" territory. Sure, per-second will definitionally help at the margins, but I'm dubious about the amount of money it will actually save.

That all being said, it can enable a bunch of interesting things (e.g. more interesting stuff with AWS Lambda), and I look forward to see what per-second billing becomes an enabling technology for.


Sure. I read 'nodesocket as arguing that AWS is promoting per-second as a distinction between AWS and GCP ("somewhat as a gimmick" in their words). That's not mentioned in the submission, nor the AWS blog post announcement.

The important and very real distinction is the change from per-hour to per-second. If you're going to make it more granular (which from per-hour is a good thing), why would AWS stop at per-minute if it's the same or only marginally more difficult to make it per-second, particularly when they have the added benefit of being more granular that GCP? In other words, I don't see the reduction as primarily a marketing move on AWS's part. I'm sure they felt pressure to make it more granular. Stopping at parity doesn't necessarily make sense, nor should they be called out for doing more purely for marketing reasons.


Do you have anything in mind for "interesting stuff with AWS Lambda"? Like more granular chargeback for a customer, even though Lambda is already billed in sub-second increments for the most part.


This is one of the better things to happen in ec2 in years for me. We have a bunch of scripts so a spot instance can track when it came online and shut itself down effectively. It took far too much fiddling around to work around aws autoscale and get efficient billing with the per hour model. In the end we came up with a model where we protect the instances for scale in and then at the end of each hour, we have a cron that tries to shut all the worker services down, and if it can't it spins them all up again to run for another hour. If it can, then it shuts the machine down (which we have set on terminate to stop). The whole thing feels like a big kludge and for our workload we still have a load of wasted resources. We end up balancing not bringing up machines too fast during a spike against the long tail of wasted resource afterwards. This change by ec2 is going to make it all much easier.


Back to the future: this was how computing worked back in the punch card days. Minicomputers and personal computers were supposed to liberate you from this tyranny: computing so cheap that you could have a whole computer to your self for a while!


Somehow we've reached a point where a 2GHz, 2GB computer that fits in your pocket is only worth using as a terminal.


Apple is moving in the opposite direction with onboard-only face ID and ARKit.


The global scale makes even that cheap computing become fairly expensive. So, it's only natural that we'll take punch card type ideas (at local scale) from the past and apply them at our global scale.


We still have whole computers to ourselves. Those that wish to throw their money away renting can knock themselves out.


We have these things. It's even cheaper in most cases. :)


Likely due to GCP competition. I believe GCP was always per-second? [Edit: Misremembered that, they were always per-minute. Lots of good information below directly from the related parties.]

Azure looks to be per-hour [Edit: Wrong again, they are per-minute as well. Oddly enough, I did check their pricing page before, but missed the per-minute paragraph and only saw the hourly pricing] but I'm seeing something about container instances possibly being per-second.


GCP VMs are per-minute, with a minimum of 10 minutes (vs AWS' new minimum of 1 minute). Second resolution is nice, but I doubt it makes much difference in pricing for most workloads. https://cloud.google.com/compute/pricing#billingmodel

Azure's containers don't use a full VM-- they're more like AWS Lambda or other serverless frameworks, so they do per-second billing with no minimums.

Disclaimer: I work at Google on Container Engine.


Azure's EC2 equivalent is Azure Virtual Machines[0], which bills by the minute.

[0] - https://azure.microsoft.com/en-us/services/virtual-machines/

"Keep your budget in check with low-cost, per-minute billing. You only pay for the compute time you use."


I would disagree on "no minimums" and the equivalence to Lambda et al., as Azure Container Instances charge a creation fee (IIRC it's equivalent to 100 seconds of their minimum configuration) which sits on top of the per-second billing.


We don't mind per-minute billing on GCP, but would love to get the minimum down to 1 minute or even less. We have some tasks that finish under 4 minutes where scaling horizontally instead of vertically makes much more sense to us.


Hyper.sh runs docker containers, not VM's, and has per-second billing with a minimum of 10s.

I really want to use them to parallelise CI test runs, but haven't gotten round to setting this up yet.



The minimum of 1 minute makes a difference... Granted 10min is not that much of a problem :)


I was under the impression Azure was also per minute, and AWS was the last hold out of hourly billing among the big three until today.


Per-minute for GCP, but I am not sure it makes that much of a difference. Anyway, now the Game is on


This should enable some entirely new use cases, especially around CI and automation in general.

Per-second billing greatly reduces the overhead of bringing up an instance for a short task and then killing it immediately, so I can do that. There's no need to build a buffer layer that adds workers to a pool and leaves them there just so you don't end up paying for 30 hours of instance time to run 30 two-minute tasks within an hour.


For us it will mean we can spin down Bamboo elastic agents much quicker and save money.


I once considered writing an EC2 autoscaler that knew the exact timestamps of the instances so that it could avoid shutting down VMs that still had 59 minutes of "free" time left because they'd been up across another hour-long threshold. That sort of nonsense logic shouldn't be useful, but Amazon was giving a huge economic incentive for it.

This is certainly a long time coming.


If it helps, AWS' default auto scaling algorithm specifically takes into account instances which are nearest to their next billing hour and prioritizes those for termination accordingly to, in theory, save money.
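You can also set that preference explicitly on a group rather than relying on the default (boto3 sketch; the group name is hypothetical):

    import boto3

    asg = boto3.client('autoscaling')

    # Prefer terminating instances that are closest to the end of their billed hour.
    asg.update_auto_scaling_group(
        AutoScalingGroupName='worker-fleet',
        TerminationPolicies=['ClosestToNextInstanceHour'])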


> I once considered writing an EC2 autoscaler that knew the exact timestamps of the instances so that it could avoid shutting down VMs that still had 59 minutes of "free" time left because they'd been up across another hour-long threshold.

Years ago my boss at the time did this (this was when scaling had to mostly be done in code/by hand). I just recently updated all the code as I moved it to using spots. The low price of spots made it less important to shut ones down closer to the hour mark though.


Remember that AWS has been in the game for over a decade. Per-hour billing was amazing when it came in.

Also, is the economic incentive really that huge? Or is it just a nicety?


It's totally dependent on your workload. For some users there will be absolutely no difference; for others it could easily be thousands or tens of thousands of dollars of savings over a year.


Thanks, I stand corrected. I've read through a few other use cases in the comments here and I can see now that there's scope for savings, depending on workload.


> that it could avoid shutting down VMs that still had 59 minutes

AWS batch currently does this. I presume that will change now.


This is great news and a long time coming.

I really hope Amazon build something like Azure Container Instances [1], as per second billing would make this sort of thing feasible.

[1] https://azure.microsoft.com/en-us/services/container-instanc...


Ah, finally. They've ruined my idea for an optimal EMR job runner. Under the old system, if you have a linearly scalable Hadoop job, it's cheaper to, say, use 60 instances to do some work in an hour vs 50 instances to do the work in 70 minutes, assuming you're getting rid of the cluster once you're done. No more!


I think the per-second billing is beside the point. How does it help if the EC2 instance takes tens of seconds to launch and tens of seconds to bootstrap?

To make the most of per-second billing, the compute unit should be deployable within seconds, e.g. an immutable, prebaked container. You launch containers on demand and pay by the second.


It has a one minute minimum anyway. And does it not help? Let's say a deployment strategy has a temporary increase in instances so it can transition to a new version of the application. If your deployment takes 5 minutes, you're only paying for 5 minutes worth of extra instances whereas the hourly billing would get you for an entire hour. Am I completely misunderstanding something?


You're describing exactly https://hyper.sh.


Or Azure ACI.

Per-second billing doesn't add much value to EC2.


EC2's granularity was hourly before. That's the value being added.


Per-minute billing makes more sense for EC2, given the reason above.


Well, on average it's per-minute billing with a 30 second discount.

I agree it's basically marketing from AWS, but it's still strictly better than per minute billing


Really welcome, although per millisecond would be better.

It's now possible to boot operating systems in milliseconds and have them carry out a task (for example respond to a web request) and disappear again. Trouble is the clouds (AWS, Google, Azure, Digital Ocean) do not have the ability to support such fast OS boot times. Per second billing is a step in the right direction but needs to go further to millisecond billing, and clouds need to support millisecond boot times.


Just curious here, what OS can do millisecond boot times? How many milliseconds are you talking? And the constant boot time of the OS is so much less than the OS responding to the web request that this is actually worth it?



Interesting. There's a link to here on the Wikipedia page: http://zerg.erlangonxen.org/

But if I'm reading things correctly, it still took over two orders of magnitude longer to boot than it did to reply. So what sort of use case does millisecond boot help with? Very sporadic requests?


Cold EBS boot will still be super slow...


If you're concerned about cost, AWS is almost never the right place to host to begin with.


Agreed - just check cloud comparison, AWS is rarely at the top: https://www.cloudorado.com/cloud_server_comparison.jsp


Well, if you're going to use bad figures, then sure, AWS won't win. The default size there is 768MB RAM, 1 cpu, and 50GB disk... which it says AWS will provide for $54. Whereas in actuality a t2.micro with those specs only costs $14, lower than all the listed prices (which are all clearly out of date)

Not to mention all the big names missing from that list. For some reason Dimension Data makes the list (and it's woeful, from experience), but there's no Digital Ocean, OVH, Hetzner, etc...


As per my other reply: A t2.micro does not allow you to use more than 10% of the vCPU on a sustained basis. Any use over that needs to be earned, and you only earn 6 credits (for one minute each) per hour.


Wow thanks for sharing this link. Didn't know about this.

One thing I noticed though is the pricing seems a bit biased; for example for AWS it recommends an m1.small with 1GB Ram and 20GB of Storage at $35 a month ... However if you used a t2.micro that would give you the same specs for $10.79


Not quite the same, you don't own the whole core on the t2 and will get cpu throttled.


> you don't own the whole core

Moving the goalposts here. 'Not owning the whole core' is the default in the cloud.


For the other instances you get a specific number of units of processing capacity that you can use 100% of continuously if you like. For the micro instances, you get a base level and build up credits towards bursts, and can not maintain 100% utilization continuously. It's very much different and not the default. To quote Amazon:

> "A CPU Credit provides the performance of a full CPU core for one minute. Traditional Amazon EC2 instance types provide fixed performance, while T2 instances provide a baseline level of CPU performance with the ability to burst above that baseline level. The baseline performance and ability to burst are governed by CPU credits."

A t2.micro allows only 10% of the vCPU baseline performance. Anything above that needs to be "earned" at a rate of 6 credits per hour. The t2.micro can accumulate a maximum of 144 CPU credits (+ the 30 initial credits, that do not renew), each good for 1 minute of 100% use.

So in other words, you can on average only use 100% of the CPU for 6 minutes per hour.


m1.smalls are also ancient, when the current generation m4 is more than a year old at this point.

Odd site.


that's a very handy site, previously I had mostly been using http://www.ec2instances.info/ and http://www.gceinstances.info/

Thanks for pointing it out!


You miss the point.

The lifetime of a web request, for example, can be measured in milliseconds.

It is now possible, technically anyway, for operating systems to boot, service the request and disappear.

There needs to be pricing models that reflect computing models like this.


There is likely a point of diminishing returns in this type of scenario. The OS boot time, the web service setup time, the actual request and then the shutdown time. Also consider that unless you have an external caching layer, you may be processing some requests that could have been cached by an always-on server. If your site has predictable traffic patterns then I suspect the math would be in favor of always-on provisioned servers with scale up/down based on traffic. If you have a very high traffic site the extra OS boot time (even in milliseconds) is going to add up quickly. You'd have to be very sure the spin up/down time is less than the idle time of the always-on server.


>It is now possible, technically anyway, for operating systems to boot, service the request and disappear.

This was already possible even with a 2 second boot time. The problem is that it's a stupid use case because (unless the OS boots up in <10ms) the latency of waiting for the bootup is intolerable in any use case where a 2 second boot time was intolerable.


If you're doing that, why bother loading an entire operating system? Just use something like AWS lambda instead.


And that's great. But it's meaningless when the base-cost per unit of computation capacity is so high that it is in most cases cheaper to have a whole farm of servers running idle elsewhere.


Isn't that what lambda is all about? Sub-second billing?


Sounds like you're describing AWS Lambda / serverless architecture. But maybe I'm not understanding your use case?


There are a wide range of tiny operating systems that can boot in a matter of milliseconds.

The applications are "whatever you can imagine", but yes, one application is building FaaS (Function as a Service) in which the operating system carries out a single function.

Put another way, Docker is complex, overweight, and requires re-implementation of much computing infrastructure. You can meet many of the same goals as Docker in a much simpler way, not by building containers but by building tiny operating systems.


I'm somewhat amused by the idea of booting an operating system from scratch to service a single request being described as "much more simple" than the alternative of, y'know, having a single instance serve many requests.


From an ops perspective, spinning up an instance to process a single request seems like

1) a complete and utter nightmare to debug.

2) A huge waste of computing resources. Even with a unikernel you're wasting time initialising resources and getting in to a ready state to be able to process a request. Why bother when you can be ready and respond effectively instantly?


The OS does not serve requests -- applications do. While it may be possible to demo a toy OS+app, real-world applications take seconds if not minutes to start and warm up. Throughput on a cold cache is a fraction of that on a warm cache.

Starting in milliseconds is not the hard problem. Starting + warming caches in that time is -- that will get you a bunch of awards when you solve it.


Docker is just a manager of Linux namespaces. You'll need one to manage your operating systems anyway - start/stop them, copy them to the machine, delete them, etc.


Interesting that the techcrunch link has thrice as many upvotes as the amazon link


I suspect it's simply a function of which one happened to catch people's eye and where they started their discussion. Multiple submissions on the same topic (from the same or different sources) aren't that uncommon. Once one gets some momentum, it's likely to be reinforced: it'll appear higher on the front page, more people will notice it, more people will comment, more people will notice the comments on that article, which will make them notice that article, et cetera. I don't think you can read much more into it than that. I wouldn't be surprised if a mod comes along and merges the comments from one into the other, if they notice it.

What would be interesting is if they had exactly the same upvotes and comments.


Hackernews black magic :/


Serverless advocates/engineers are probably the only people celebrating this; everyone else keeps waiting for self-renewing instance reservations... last time I forgot about them it was too late.


The market for this is much broader. We do a bunch of data science so spinning up a heavy machine and only getting billed for the 5 minutes usage is a massive saving for us. I'm quite excited by this news!


That is basically serverless's main selling point. You are just one step away from automation (if you aren't already doing it), and it will be virtually the same as serverless.


You're right, this is effectively a serverless mode. But the serverless instances that are currently available (at least on aws) aren't powerful enough for some applications. For those of us stuck in the middle, needing big machines with serverless behaviour, this is a massive win.


I'm in the same boat as you: data science workloads that run intermittently on big hardware.

What are you running your big jobs on? Because I'm currently using Batch, but you've got to wait for the compute environment/VM to start up (if it's not already running), and that's a pain because it takes forever.

I wish I could just run containers on large hardware the same way we can run Lambdas: press the button and it just runs. I don't really care about having my own full compute environment, I just need enough memory and CPU to run it.


Ours are actually user generated and the running time of each task is variable (a few minutes to an hour). Users can dump anywhere between 1 and 200 tasks on it at a time.

The way we have it set up is:

- simple job queue with RQ (redis)

- monitoring watches the queue and pumps a metric into Cloud Watch (there are a few different types of job and it calculates a single aggregate value for "queue pressure")

- autoscale then sets the desired capacity for a fleet of r4.2xlarge machines (somewhere between 1 and 20)

- the autoscale config protects all those machines from scale-in so they have to be shutdown externally

- each of those machines has a cron on boot that tracks the start time

- this enables a cron to run just before the end of each hour. If that machine isn't doing anything at the time, it will shut itself down (roughly sketched after this list)

- the machines are set to terminate on shutdown so they die completely

- additionally, we've hacked RQ so that workers that are closer to death will move themselves to the back of the queue more frequently. This ensures that we have a higher chance of not being busy / shutting them down at the end of the hour.
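A very rough sketch of that end-of-hour cron (untested; the Redis host is a stand-in, and it uses RQ's worker registry rather than anything machine-specific for brevity):

    #!/usr/bin/env python
    # Runs from cron a few minutes before the end of each billed hour.
    import subprocess
    from redis import Redis
    from rq import Worker

    redis = Redis(host='queue.internal')   # hypothetical Redis host

    # Simplification: checks every registered worker; in practice you'd filter
    # to the workers running on this instance.
    busy = any(w.get_current_job() is not None
               for w in Worker.all(connection=redis))

    if not busy:
        # The instance is set to terminate on shutdown, so this kills it for good.
        subprocess.call(['sudo', 'shutdown', '-h', 'now'])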


This is great and will save a lot of people a good amount of money.



