Containerize Go and SQLite with Docker (awstip.com)
122 points by hhaluk on Jan 17, 2022 | 85 comments



If you are using Go (which solves most of your dependency problems) and SQLite (which means you don't need to integrate with an external database via service discovery) why do you need Docker at all?


Perhaps because Docker has a great deployment story. Many of the complexities of deployment are handled for you (writing a systemd unit, managing rollback, etc.).


Yeah, but for Docker to handle the complexities of deployment, you first need to handle the complexities of Docker. So OP's question is valid: for most Go apps, all you have to do is compile a binary and copy it to the server - no Docker or other paraphernalia required. Of course that may not be so simple due to various reasons, but it helps to keep that possibility in mind...


> you first need to handle the complexities of Docker

The complexity of Docker is not that big for a Go deployment though, especially if you have all the other bits for orchestrating your Docker containers (for the rest of your stack) already in place. You mostly just copy the binary into a slim image and you are done.


> You mostly just copy the binary into a slim image and you are done.

You don't need docker for that, just 'tar czf my-layer.tar.gz my-dir'. If you want a manifest file, you can get the digest using `sha256sum my-layer.tar.gz`.


Agree, and most complexities will occur in enterprise environments when the os/hardware is locked down — which can make something like SQLite "hard", as would any cpu/disk-bound container. However, that should be a platform team's job to resolve, not a backend dev's.


> all you have to do is compile a binary and copy it to the server

Docker does this quite well, and solves a bunch of other problems you're likely to have regardless. One really simple example is "how do you copy it to the server?" Do you have ssh keys for your server on development machines? How do you handle the "Oh I'll just remote in and fix this one X"?

It's also _crazily_ easy to get started with, and handle middling amounts of scale. If you're on AWS, Elastic Beanstalk is plug and play. If you're not, DigitalOcean App platform will host it for you, with automated deployments from git for $5/month with basically no configuration needed from you.


I don't get this answer.

Publishing your docker image to some open or private store still needs to happen. Then the host needs to be updated. This is not really that much simpler than "scp your binaries to the server". And has many more moving parts that can fail.

Now, there are better ways to distribute Go projects than scp, for example Heroku-style, or by just abusing its built-in git support.


Yea, Docker is crazy to me. I can never understand people who think it's awesome or something. It's awful.

Of course, if you're doing something in Python/Ruby/Node.js then it's useful, because these languages do not provide a reliable method to compile source code into an independent program/application.

But that does not make it "awesome" or anything. It's just a band-aid. It's an extra layer to hide fundamental design problems.

With Go, the root problem is solved: the language has a compiler that reliably produces statically linked binary executable programs.


Docker isn't a catch-all for containers. Docker has a pretty terrible deployment story at this point, and I foresee that in a couple of years it will probably be considered legacy compared to Podman/K3d/local K8s/etc.


I think people use the word Docker to mean "produce an OCI-compliant image that a CRI runtime can run".


Podman does the same thing, as well as other tools. Using Docker to refer to something that doesn't use Docker is like saying that your Android Phone is an iPhone.


> Docker has a pretty terrible deployment story (...)

This is the very first time I have ever heard such nonsense. In all the companies I've been at, Docker is a renowned problem solver, not only for production deployments but also for local testing environments and deployments.

It even shines as a stand-alone barebones clustering solution with Docker swarm mode.


I’ve honestly not touched docker in years, but all of my prod apps run in docker. So it’d be paradoxical for me to say I don’t need docker, but I really want to.


Docker can be considered a deployment tool. You package your application in an image and run said image. Development and testing of that application do not have to happen in a Docker image.


Nor does deployment. Podman, Rancher, etc. all solve those challenges without relying on Docker, and with Docker changing their licensing I don't think it will be the de facto tool in a couple of years. There will be others that will replace it.


Docker has become synonymous with OCI images, and my comment was exactly in that context (or at least state of mind) - I should have stated that as well.


> So it’d be paradoxical for me to say I don’t need docker, but I really want to.

I find it quite amusing that projects that try to position themselves as Docker alternatives end up basing their presentation on how their project can be used just like Docker, down to Docker's choice of command line interface.


I deploy my app (c# with sqlite, some native dependencies) using custom-made scripts to ubuntu, centos and redhat dedicated servers and it's a major pain to have a separate script for each version of each OS. I'll switch to docker soon so that I have a single target


> If you are using Go (which solves most of your dependency problems) and SQLite (which means you don't need to integrate with an external database via service discovery) why do you need Docker at all?

Dependencies are not really what Docker is about, nor the thing it is designed to solve. If dependencies were the problem people cared about, everyone would just go with a single statically linked executable/fat JAR and no one would ever bother with Docker.

Docker is primarily about containerization, but it's also about ease of packaging and deployment. It's also a deployment format that provides horizontal scaling for free.

Also, the one-database-per-service architecture pattern is quite common, as well as ephemeral databases and local caching, and keep in mind that SQLite also supports in-memory databases.


Okay, I'll bite.

> It's also a deployment format that provides horizontal scaling for free.

Um. Docker does not provide horizontal scaling at all, for that you need orchestration. And those tools are anything but free if your time has any value.


I think what the OP means is that with Docker you can use tools that give horizontal scaling for free (see: Cloud Run etc.)


The claim that horizontal scaling is free with Docker is simply not accurate.

Maybe it will be some day, but these orchestration and deployment tools built on Docker have enormous hidden costs.

IME, Docker takes an enormous amount of focus away from the customer problem and moves it to the "how do I get this mess working" problem.

Now may be the right time in a given business to make that shift in focus, but to claim that it will be "free" is just misleading (IME).


> IME, Docker takes an enormous amount of focus away from the customer problem and moves it to the "how do I get this mess working" problem.

This is the opposite in my experience: Docker lets me focus on the business problem by making deployment easier.


Docker lets me focus on cool problems like "which commands can I use to free up space on this cloud-based container-running VM, given that there are 0 bytes free, and many tools will crash if they can't make a tempfile/dir?".

At least, that's been my experience when maintaining a mess of other people's Docker crap.


> Um. Docker does not provide horizontal scaling at all, for that you need orchestration.

It does. Please take the time to learn about Docker and its Docker swarm feature.
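
For anyone who hasn't seen it, a rough sketch of what that looks like (the image name and replica counts are just illustrative):

    # turn the host into a single-node swarm
    docker swarm init
    # run the app as a replicated service
    docker service create --name web --replicas 2 -p 8080:8080 registry.example.com/myapp:latest
    # horizontal scaling without any external orchestrator
    docker service scale web=5
    # see where the replicas landed
    docker service ps web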


They probably have 100 other apps and deploying this one without docker would make it a snowflake.


Because the deployment environment is or may be a Kubernetes cluster or some other kind of containerized environment. Wrapping up your application in a neat package makes other people's jobs easier - to them, it's a black box container, not a binary they need to install and manage on a server.


It looks great on a CV.


And now he can add front page of HN


I think the only reason I still use docker for everything is for automatic restarts without worrying about another tool for whatever language/framework I'm working with
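
Concretely, that boils down to a restart policy on the run command (image and container names here are illustrative):

    # restart the container after crashes and daemon/host restarts,
    # unless it was explicitly stopped
    docker run -d --restart unless-stopped --name myapp myapp:latest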


Uniformity, perhaps? I for one help manage several workloads that have become quite a lot easier to manage through containers, and I'd rather deal with that than have a few workloads deployed differently from everything else. Maybe containerizing a Go project like this is more work than strictly necessary, but it would be a quick 10-line Dockerfile and pretty much done.


In a production environment one would probably deploy using something like Fargate, Kubernetes or Fly.io.


> In a production environment one would probably deploy using something like Fargate, Kubernetes or Fly.io.

Docker swarm mode is pretty good and terribly easy to get up and running in no time.

I have a few small personal projects hosted on Hetzner on a couple of Docker swarm mode deployments with 100% uptime in the past two years, and all it took to get that infra up and running was installing Docker on a bare Linux node.

The only downside I'm aware of is that inter-node traffic speeds can be relatively low.


I've worked with docker swarm extensively. I've managed it but also automated the deployment and implemented several features to ease deployment using the swarm API.

Swarm pros:

- Easy to setup

- Easy to run

- Relatively easy to debug

Swarm cons:

- Many problems persist for years. Some because of lack of resources, others because the problem is simply too hard.

- The community and automation around swarm is small

- Problems solved by third-party tools, apps, etc. in Kubernetes require in-house workarounds or solutions (e.g. there was an API to perform autoscaling, but we had to write the Python app that reads data from Prometheus and scales the deployment in/out)

What I found with Swarm is that it was extremely resilient. At some point the swarm cluster was running on AWS for ~2 years with minimal maintenance. That would have been impossible even for a managed EKS cluster, for example. There are simply too many things that can go wrong.


Oh. TIL. Sounds very useful!


> C libraries are required to interact with SQLite

Or: modernc.org/sqlite (plus https://github.com/zombiezen/go-sqlite), "an automatically generated translation of the original C source code of SQLite into Go"

as mentioned yesterday: https://news.ycombinator.com/item?id=29959193#29960726
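
A rough sketch of what switching to it looks like, assuming the code does a blank import of modernc.org/sqlite (which registers the database/sql driver name "sqlite"):

    # no C toolchain needed for the build any more
    go get modernc.org/sqlite
    CGO_ENABLED=0 go build -o app .
    # the result is a statically linked binary that can be copied into a scratch image
    file app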


This article is just an example of someone who encountered a problem and didn't do their due diligence in finding the best solution for their use case.

The discussion of building fully static Go binaries for Docker has been rehashed a million times on the internet, and modernc.org/sqlite is at version 3+, so nothing in this should be new information for anyone with an interest in the topic.

I strongly dislike this type of content that just repackages basic existing information and uses it as an excuse to clickbait people onto a spammy blog.


This is really interesting. Copying to a scratch container is powerful and I’ve used it a lot, but occasionally something comes up where I need to use a more fully featured base to support things. One other downside I’ve encountered with scratch is that there's no shell, so docker exec or kubectl exec doesn’t work. Does anyone know a good solution to this problem?


The Kubernetes folks' solution to this is the addition of `kubectl debug` (added as `kubectl alpha debug` in Kube 1.18, graduated to `kubectl debug` in Kube 1.20) as an alternative to `kubectl exec`. It takes an existing Pod and lets you attach a new container with whichever image you like, so that your production images don't need debugging tools.
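
Roughly (pod and container names here are illustrative, and this assumes the cluster has ephemeral containers enabled):

    # attach a throwaway busybox container to the running pod, sharing the
    # process namespace of its "app" container, without touching the image
    kubectl debug -it mypod --image=busybox --target=app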


Also, before `kubectl debug`[0] existed, you could always edit a Deployment, add a sidecar container of `alpine` or `busybox`, and enable process namespace sharing[1] to get some leverage to debug with.

A bunch of other options in the docs as well

[0]: https://kubernetes.io/docs/tasks/debug-application-cluster/d...

[1]: https://kubernetes.io/docs/tasks/configure-pod-container/sha...


That would trigger a restart, no? What if you want to live debug without restarting the scratch container?


Yup it would trigger a restart -- that first link details some other options if restarting the workload is absolutely not an option. You can debug a copy, start a privileged pod, or jump on to the node and actually enter the namespace manually. At the end of the day all these containers are just sandboxed+namespaced processes running on a machine somewhere, so if you can get on that machine with the appropriate permissions then you can get to the process and debug it.

Of course, if you're in a tightly managed environment and can't get on the nodes themselves things get harder, but probably not completely impossible :)


We drop a statically compiled BusyBox binary on images like that as "sh". If we need more we can symlink to it in /tmp at debug time (or just call it directly). It strikes a good balance between slim and debuggable.


Do you do this during build time or debug time?


Been using containers and fighting this problem for a long time... I never thought of this simple solution, thank you for sharing it!


In Kubernetes 1.22+, you can use ephemeral containers: https://kubernetes.io/docs/tasks/debug-application-cluster/d...


An emerging solution I've been investigating is using Nix as a build system for docker. The syntax is fairly lightweight and you can build containers that are scratch + coreutils, or whatever else you decide to put in there.

I was originally turned onto this by a post on the repl.it blog about using Nix this way.

You can also do this with other build systems or using weird docker file hacks by hand.

Here's some idea of what the syntax looks like:

https://github.com/NixOS/nixpkgs/blob/master/pkgs/build-supp...


If you run a single process in a Docker container, have it output logs over STDOUT, and have no meaningful way to interact with it other than shutting it down, you don't have much need for a shell. It can't do anything in the container anyhow. By contrast, if you've basically got a full OS in there, then yeah, a shell is really useful.


I would say most of my Go projects involve what you described (logs to stdout, simple interface) but I often find myself needing a shell to debug my docker build, like ensuring that files have ended up in the correct place, and have the correct contents, within the docker container.




If your stack involves running containers on a Linux machine, you can use `nsenter` to use the debugging tools on the host OS to debug the processes within the container.
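
A sketch of that, assuming root on the host and a plain Docker runtime (the container name is illustrative):

    # find the container's main process as seen from the host
    PID=$(docker inspect --format '{{.State.Pid}}' mycontainer)
    # enter its mount/UTS/IPC/network/PID namespaces using the host's own tools
    sudo nsenter --target "$PID" --mount --uts --ipc --net --pid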


Distroless works well. Still no shell, but comes with “what’s usually missing in scratch”.


Instead of a container, just statically build the golang program with sqlite. Single binary deployment.


The default SQLite package requires gcc, so you need to use one of the alternatives.


In the article they statically link in SQLite; that's the reason the Go program can just be copied to a scratch image and still work.

You only need gcc during the build phase, and that doesn't change just because you run without Docker.

There's nothing preventing you from statically linking C programs and having a single binary to deploy either. You just have to rebuild everything every single time security updates are available. But that's no different from Docker images; they need to be rebuilt constantly as well.
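
For reference, a fully static cgo build looks roughly like this (easiest against musl, e.g. in an alpine builder stage; the output name is illustrative):

    # cgo stays on; the external linker is told to link everything statically
    CGO_ENABLED=1 go build -ldflags '-linkmode external -extldflags "-static"' -o app .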


Testing locally, adding the "-linkmode external -extldflags '-static'" flags doesn't bring down the binary size. The heavy lifting is done by '-s -w'. It is a good idea to drop privileges even if there's a small attack surface.

At $work we use the following template. Sharing in case you find it interesting:

    FROM golang:1.17.1-alpine3.14 as builder
    RUN apk update && apk add ca-certificates curl git make tzdata
    RUN adduser -u 5003 --gecos '' --disabled-password --no-create-home mycompany
    COPY . /app
    WORKDIR /app
    RUN make build
    
    FROM scratch
    COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
    COPY --from=builder /app/bin/app-3 /bin/app-3
    COPY --from=builder /etc/passwd /etc/passwd
    USER mycompany
    CMD ["app-3"]

And a very simple makefile:

    help:
     @echo "Please use 'make <target>' where <target> is one of the following:"
     @echo "  test                 to run unit tests."
     @echo "  build                to build the app as a binary."
     @echo "  build-image          to build the app container."
     @echo "  run                  to run the app with go."
    
    run:
     go run -ldflags="$(govvv -flags -pkg) -w -s" ./cmd/app-3/main.go
    
    test:
     go test -v -coverpkg=./... -coverprofile=profile.cov ./...
     go tool cover -func profile.cov
     rm profile.cov
    
    build:
     CGO_ENABLED=0 go build -mod=readonly -a -ldflags="$(govvv -flags -pkg $(go list ./info)) -w -s -linkmode external -extldflags '-static'" -o ./bin/app-3 ./cmd/app-3/*
    
    build-image:
     docker build -t mycompanytown/app-3:latest .


What do you use the SSL certs in the container for? I've always run Docker behind either a reverse proxy like Nginx or some kind of cloud platform that handles SSL at the perimeter.


Certs are used by the client to perform validation when initiating HTTP calls to APIs.


Add -ldflags '-s -w' to go build to strip DWARF, symbol table and debug info.

See also: -trimpath
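
For example (output name illustrative):

    # -s -w drops the symbol table and DWARF debug info;
    # -trimpath removes local filesystem paths from the binary
    go build -trimpath -ldflags="-s -w" -o app .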


And if you can spare an extra ~150ms startup time:

https://blog.filippo.io/shrink-your-go-binaries-with-this-on...
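
That is, packing the already-built (and ideally already-stripped) binary with upx, roughly; the packed binary unpacks itself on every start, which is where the extra startup time comes from:

    upx --best ./app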


Serious question: Is upx worth the tradeoff in 2022?

Trying to decide if it's worth working into the containers that I maintain.


Probably not for Go in containers since binaries end up compressed anyway as part of the OCI layers. But Go binaries do get quite large so if you're distributing them other ways, it might be useful.


I totally understand removing annotations in the DWARF tables since they're not really needed in a production environment, but why would you want to remove stack traces? They provide invaluable information in a production system, and removing them increases the startup time for a negligible difference.

In reality you'll tell whatever deployment system to deploy your binary/container and just wait until it's done. In such cases, 150ms won't make a difference in the grand scheme of things compared to your CI and everything else in your pipeline.


I must confess to copypasta for this approach. Yes, stack traces are incredibly valuable. The upx compression itself does seem to be a no-brainer with the exception of any situation where that 150ms time actually matters.


That article is about -s -w and goupx.

goupx is not necessary after go 1.6. https://github.com/pwaller/goupx#update-20160310


It's actually about upx, which is still relevant. I didn't notice the goupx bit but it doesn't diminish the value of upx, which was my intention in sharing.


goupx isn't. upx still is.


Doesn't that make crashes much more difficult to debug? Or can you re-attach this information later on if you get a crash report?

I mean sure, in theory you shouldn't get any of those in production, but no software is perfect.


I'm currently being forced to use Docker against my will, with exactly this setup. The CGO overhead every time the container is built is absurd, making development and testing painful. I'll have to try this article out, but it doesn't seem like it solves the problem. Does anyone have advice on how to solve this without too much Docker magic?


Why are you forced? We rarely touch docker any more, even though our prod apps run on K8s.


It was a requirement for the project I was working on (even though I disagree).


I don't see why Docker is needed here.


Ease of deployment?


Seems more difficult. Without Docker I don't have to bother with a new technology, use new commands and learn new syntax.


That is true; learning Docker just to do this would be wasteful.

However, many people already know how to use Docker and have flows around it so it makes sense for them to do this.


No, but you do have to deal with... however you deploy your app right now.

If that works for you, then there's no need to change. But if you ever end up in an environment where multiple different applications are deployed, or if your app needs to be deployed a thousand times, that's where e.g. Docker may come in handy.


There are `-alpine` variants of the golang docker images, so if you're expecting to deploy to scratch or to alpine, the _builder_ step should be using `golang:1.17-alpine` instead.


This makes sense for the scenario: a mock server.

Our backend teams rarely do anything with containers, but all apps deploy to K8s (the platform team manages that).

We deploy SQLite… I don't remember any cgo requirements; however, we're also deploying each SQLite "instance", or its db files, directly to a node, for low data gravity on SSDs. Go interacts with it via database/sql.


If you're using Docker already, why not just use PostgreSQL and run the Go binary without the CGO overhead?


I'm sure you know this already, but sqlite and postgresql are not equivalent, neither in what they offer a user nor in what is required to set them up.


Succinct article, with a most excellent TL;DR.


It's hard to believe anyone is still promoting Docker.

Has anyone actually used it?

If you haven't jumped on the bandwagon yet, don't do it. It's a sucker's game.

I got sucked into the Docker hype some years ago and the demos worked well enough in isolation.

I pushed it on a small team that didn't need it. One of the biggest regrets of my career.

Docker may be useful if an application has enormous reach and needs Google scale, but very few apps do.

1) It is incredibly slow to build

2) It is slow to run

3) It is an unbelievable resource hog

4) It has an entirely new set of problems (file system, networking, etc.) that must be understood to get it working

5) Hard deploy time dependency on Docker's web app?!

6) If you do not believe me on the complexity, just observe how docker arguments always seem to turn into k8s (or other orchestration) arguments. It is hard to reason about, hard to deploy, hard to coordinate.

K8s is a total disaster for a small team working on a small project.

We lost many months by choosing these tools for a startup that didn't (yet) need them, and the ongoing, neverending pile of problems and distractions they generated was shocking.

I still feel bad about it. The company could have run on a couple $400 machines under a desk for years, but instead spent tens of thousands on AWS services it didn't need.

Of course, YMMV


> Docker may be useful if an application has enormous reach and needs Google scale, but very few apps do.

> I got sucked into the Docker hype some years ago and the demos worked well enough in isolation.

> I pushed it on a small team that didn't need it. One of the biggest regrets of my career.

It sounds like (no offence) the issues that containerization solves were missed, and you had no dedicated operations/build engineer on your team who could have explained or fixed these things for you and steered your infra in the right direction.

> I still feel bad about it. The company could have run on a couple $400 machines under a desk for years, but instead spent tens of thousands on AWS services it didn't need.

Does not sound like a Docker issue. If you actually plan to run a money-making business that your livelihood depends on, then running it under your desk is not something you could reliably do even 10-15 years ago.

AWS is a money hog with a lot of things that are a distraction for a small project, and it is easy to assume that their products are necessary to run your app effectively, but most often they are not. There are plenty of cloud providers that are not as expensive and are much easier to manage than AWS.

> K8s is a total disaster for a small team working on a small project.

This is truly a YMMV thing. In my experience, managed k8s for a small team can be a godsend for infra management after some level of complexity is reached, while self-hosted bare-metal k8s can utterly decimate the team. But again, YMMV.

I am currently running several Nomad clusters, each for a different type of workload: CI, HPC, VMs, public-facing services, you name it. Containerization makes it all possible to run (mostly) single-handedly.



