100% agreed that Kubernetes is overkill for many if not most deployments. We constantly see small startups prematurely adopting Kubernetes, which is a costly investment in terms of building up internal knowledge and maintaining the cluster. Gradient (https://gradient.paperspace.com) is a push-to-deploy service built on Kubernetes, but you don't need to know anything about Kubernetes to use it. We feel like this is the right way to leverage the power of Kubernetes unless you're at Netflix or Lyft scale. In Gradient, you just provide a model, an instance type (several affordable GPU options are offered), and a Docker container. Everything else (autoscaling, auth, rolling updates, etc.) is handled automatically. Kubernetes does an amazing job providing the backend for these operations, but data scientists and even devops teams at startups should not be wasting time rolling their own Kubernetes cluster and installing/maintaining an inferencing service on top.
They are just two different solutions that have pros and cons just like any two solutions :) A few that jump out:
- Setup time: Setting up GCP, setting up a certificate, adding a static IP, etc. is not seamless and adds friction
- Autoscaling and rolling updates (no downtime)
- Team management and collaborative environment with usage tracking, permissions, etc.
- Optional integration with a pipelining service for training, tuning, deploying models in a single tool
And a point of clarification: Practically speaking, neither tool is free. Both require a cloud instance, so they will cost roughly the same for the end user (Gradient also supports preemptible instances).