
How does the author (or anyone else) propose that you do autoscaling without locking yourself into a single cloud’s APIs? K8s is not something we want to invest in, but it seems like the only cross-platform game in town (and we want cross-platform so we can take advantage of GPU prices and credits).



If multi-cloud is actually something an org cares about, which I'll concede ought to be exceedingly rare for an early-stage startup, there are ways you can design your infrastructure to mitigate that problem.

There is some up-front work, of course, but there's a ton of cloud-provider-agnostic tooling out there these days that you can use to build something like autoscaling (Pulumi, Terraform, et al.). You might not end up with full feature parity with what k8s offers, but you'd get 90% of the way there.
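
To make that concrete, here's roughly what an autoscaling group looks like in Pulumi's Python SDK. The AWS flavor is shown, but GCP and Azure have near-identical resources; the AMI, instance type, and subnet ID below are all placeholders:

    import pulumi_aws as aws

    # Launch template describing the GPU worker (placeholder AMI and instance type)
    template = aws.ec2.LaunchTemplate(
        "gpu-worker",
        image_id="ami-0123456789abcdef0",
        instance_type="g4dn.xlarge",
    )

    # The autoscaling group itself: the cloud replaces unhealthy
    # instances and scales between min and max for you
    group = aws.autoscaling.Group(
        "gpu-workers",
        min_size=0,
        max_size=8,
        vpc_zone_identifiers=["subnet-placeholder"],
        launch_template=aws.autoscaling.GroupLaunchTemplateArgs(
            id=template.id,
            version="$Latest",
        ),
    )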


Using Pulumi or Terraform doesn't really make your setup cloud-agnostic. The config becomes cloud-specific very quickly due to the differences between providers.

Kubernetes, in and of itself, only makes a few core things somewhat cloud agnostic (compute, LBs, and volumes). You can use those primitives to run cloud-agnostic alternatives to managed services within your cluster, but for most startups that's going to be a premature optimization.
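
For example, the LB primitive is just a Service of type LoadBalancer: the same spec gets fulfilled by an ELB on EKS, a GCP load balancer on GKE, and so on. A minimal sketch with Pulumi's Kubernetes SDK, keeping with Python (the app label and ports are made up):

    import pulumi_kubernetes as k8s

    # Each cloud's controller provisions its own native load balancer
    # to satisfy this one spec
    svc = k8s.core.v1.Service(
        "web",
        spec=k8s.core.v1.ServiceSpecArgs(
            type="LoadBalancer",
            selector={"app": "web"},
            ports=[k8s.core.v1.ServicePortArgs(port=80, target_port=8080)],
        ),
    )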


Definitely agree that the multi-cloud thing is a premature optimization.

But k8s isn't cloud agnostic either, at least not if you hold it to the same standard that you're holding pulumi/terraform.

Many of the lower-level native abstractions, like CNI plugins, aren't 100% interchangeable and don't always just work, depending on your use case. There's a reason AWS had to build its own VPC CNI plugin to get EKS fully functional across all of its networking services (particularly anything involving peering, like Direct Connect).


The real question here is what makes you think you need a product that automatically scales?

If you have to think about it, chances are you don't need it. That's true of a lot of optimizations. I've had clients ask for help sharding their database when that's almost never the correct course of action.


GP mentioned GPUs. GPUs + bursty traffic = either you autoscale, or you burn through a bunch of credits and VC cash on idle GPUs.


This. We burn a lot of cash for every idle machine. Manually scaling up and down resulted in both worse performance and a lot more cost.


For GPUs, startups can now leverage serverless GPU cloud providers: https://ramsrigoutham.medium.com/the-landscape-of-serverless....

Much simpler than setting up K8s with scale-to-zero autoscaling nodegroups.


Will have to revisit this. When we originally evaluated banana.dev and similar platforms, they lacked the ability to mount a network drive to quickly load a bunch of model weights after spinup, which is a weird requirement we had with a previous pipeline that we don’t need with other products.


Modal.com has a network filesystem feature called 'shared volumes'.

Disclaimer: I work for modal.com, but honestly Modal is the most comprehensive of the serverless GPU platforms because it didn't start as just a 'serverlessly run the latest open-source model' platform; it's aiming to be an end-to-end cloud platform.
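
For anyone curious, the mount looks roughly like this (a sketch; the volume name, mount path, and GPU type are just examples):

    import modal

    stub = modal.Stub("inference")

    # Persistent network volume holding model weights,
    # shared across container spin-ups
    weights = modal.SharedVolume().persist("model-weights")

    @stub.function(gpu="A100", shared_volumes={"/models": weights})
    def infer(prompt: str):
        # weights under /models are already on the network volume,
        # so a cold start skips the download
        ...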


I believe that being cloud independent is a bit of a pipe dream for startups.

Sure, Kubernetes gives you some independence but then you still depend on a lot of vendor specific services like S3, RDS, SES, SQS, etc.


You don't have to use those services, and there are some abstractions you can use to make things portable. For instance, each platform may have its own block storage, but you can have a different storage provisioner configuration for each platform so that you can move your application smoothly between them.
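
A sketch of what I mean (Pulumi Python; the class name and two-cluster setup are illustrative, and you'd apply one per cluster, but the provisioners are real CSI driver names). The application only ever asks for a class called "fast", and each cluster maps that name to its own driver:

    import pulumi_kubernetes as k8s

    # On the AWS cluster, "fast" is backed by EBS
    aws_sc = k8s.storage.v1.StorageClass(
        "fast-aws",
        metadata=k8s.meta.v1.ObjectMetaArgs(name="fast"),
        provisioner="ebs.csi.aws.com",
    )

    # On the GCP cluster, the same class name is backed by persistent disk
    gcp_sc = k8s.storage.v1.StorageClass(
        "fast-gcp",
        metadata=k8s.meta.v1.ObjectMetaArgs(name="fast"),
        provisioner="pd.csi.storage.gke.io",
    )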


> You don't have to use those services

Sure, you don't have to, but those services are the whole point of Clouds.

Otherwise, using regular old-school hosting providers is much, much cheaper.

> and there are some abstractions you can use to make things portable.

I would disagree with this point.

I'm sure there are some APIs that try to abstract away the Cloud service used, but in the end you are tied to the pricing and technical specifics of each service.

If I want to use a file storage service, I need to know how to authenticate to it, handle the access control, host static sites with it, handle CDN integration, configure access logging, etc.

All of this is possible in multiple cloud services but will be different for each provider. That is enough to make it a leaky abstraction.


There's plenty of value in just the block storage and compute at the large cloud providers, and these are not difficult to abstract. I know because I've done it. Yes, some of the abstractions are a bit leaky, but all those leaks are variables in our helm chart. My application code is written so that it doesn't care where it's running, nor does it need to know.
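
To sketch the pattern (hypothetical value keys, and the Helm-via-Pulumi wiring is just one way to express it): everything cloud-specific lives in one values dict per provider, and the chart itself never changes.

    from pulumi_kubernetes.helm.v3 import Chart, LocalChartOpts

    # All the leaky, provider-specific bits quarantined in one place
    CLOUD_VALUES = {
        "aws": {"storageClassName": "gp3", "lbScheme": "internet-facing"},
        "gcp": {"storageClassName": "standard-rwo", "lbScheme": "EXTERNAL"},
    }

    app = Chart(
        "myapp",
        LocalChartOpts(
            path="./charts/myapp",       # one chart, shared by every cloud
            values=CLOUD_VALUES["aws"],  # swap per target cluster
        ),
    )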

> in the end you are tied to the pricing and technical specificities of each service.

That's one of the primary driving factors behind our decision to design our application to be portable.


Ah, by cloud independence I mean we can easily switch where the machines with GPUs come up, not the whole stack. We might give up on this flexibility, but so far it seems like it will save us a ton of cash.


If you're simply using EC2 hardware, I agree, but then you might as well go for a lower-level hosting provider, which will be much cheaper.

The point of Cloud Services is to provide all these additional services.

If you don't use those services then the flexibility is relatively trivial to achieve.


I don't use cloud providers because they have their branded value-add services. I use them because they're reliable, automated, and they have APIs. I can't point terraform at whatever random IPMI a traditional hosting provider gives me. The last time I spun up a new dedicated instance at a traditional hosting provider, it wasn't an API call. It was a few emails, an invoice, and a week wait.


Gotcha. That’s exactly the problem I am trying to solve: using just the EC2-style hardware and being able to spin up on smaller clouds like CoreWeave to take advantage of availability and prices.


Being locked into a specific cloud is usually nowhere near the top of threats to a startup.


True; I really mean being able to quickly move the worker machines to a new cloud to take advantage of price differences. So far we have moved them manually, which is painful but saves us a bunch of money.


Do you actually need autoscaling? Or is it something you're worried you might need one day?


Yes. We 100% need to scale up the number of GPU workers and scale them back down based on request queue size, which is bursty. Otherwise we could spend five figures a month on GPUs doing nothing for half the day and then still have unacceptable waits during traffic spikes.
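
The loop itself is dead simple: read queue depth, set desired worker count. A rough sketch with boto3 against an AWS autoscaling group (the queue URL, group name, and per-worker threshold are all invented):

    import time
    import boto3

    sqs = boto3.client("sqs")
    asg = boto3.client("autoscaling")

    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789/inference"  # placeholder
    ASG_NAME = "gpu-workers"  # placeholder
    JOBS_PER_WORKER = 20      # invented threshold

    while True:
        attrs = sqs.get_queue_attributes(
            QueueUrl=QUEUE_URL,
            AttributeNames=["ApproximateNumberOfMessages"],
        )
        backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])

        # Scale proportionally to the backlog, capped at 8 workers,
        # all the way down to zero when the queue is empty
        desired = min(8, -(-backlog // JOBS_PER_WORKER))  # ceiling division
        asg.set_desired_capacity(
            AutoScalingGroupName=ASG_NAME,
            DesiredCapacity=desired,
        )
        time.sleep(60)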


Using K8s to avoid vendor lock-in also implies self-hosting your databases and other supporting software.

In this case, you're just standing up normal servers in the cloud, which is pretty trivial in all of the major clouds' Terraform providers. As long as the OS on the servers is standard (say, CentOS or Ubuntu), your config-as-code stack (think Ansible/Puppet) won't really care which cloud provider you're using.

If they need to scale, they can add another server to their infra-as-code and make sure the config-as-code can handle more than one server. Not really that difficult with a little experience with Terraform and Ansible. There's also room for autoscaling groups, which all of the major cloud providers support as well.


Gotcha, it sounds like Terraform by itself might be enough to handle the autoscaling groups; that’s mentioned a few times here and is something I was looking at with k8s.



