I'm the annoying kind that always has answers to people's questions. Want me to argue for everything liberal and against everything conservative? Easy. For everything conservative, and against everything liberal? Also easy.
I cannot for the life of me understand why people use google products like google isn't going to shut them down whenever they want to. Stadia was the most amazing example.
"Surely, google wouldn't do that. Stadia customers are paying customers."
3 years from launch to shut down announcement. 3 months from shutdown announcement to actually shutting down and losing all your investment in Stadia.
I don't understand how long google can keep getting away with this.
People didn't lose all their investment in Stadia. They lost save files, but not monetary investment - Google refunded all game/software purchases and all Google Store hardware purchases.
People may just be avoiding google stuff where possible, and not really forgetting. A lot of bad stigma there, at least on hackernews from what I've read.
Their registry forwards to a container registry by default. But if it detects that the request is coming from inside AWS or Google Cloud, it forwards requests for blobs to an S3 or GCS bucket near the requester. This saves money on cloud egress charges.
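That routing idea can be sketched in a few lines. This is only an illustration, not their actual implementation: the bucket URLs are hypothetical, and the CIDR blocks here stand in for the IP ranges that AWS and Google actually publish.

```python
import ipaddress

# Hypothetical CIDR blocks standing in for the published AWS/GCP IP ranges.
AWS_RANGES = [ipaddress.ip_network("3.0.0.0/8")]
GCP_RANGES = [ipaddress.ip_network("34.0.0.0/8")]

def blob_location(client_ip: str) -> str:
    """Pick the cheapest blob source for this client: an in-cloud bucket
    when the request originates inside AWS or GCP (avoiding cross-cloud
    egress fees), otherwise the default registry backend."""
    ip = ipaddress.ip_address(client_ip)
    if any(ip in net for net in AWS_RANGES):
        return "https://example-layers.s3.amazonaws.com"        # hypothetical bucket
    if any(ip in net for net in GCP_RANGES):
        return "https://storage.googleapis.com/example-layers"  # hypothetical bucket
    return "https://registry.example.com"                       # default origin

print(blob_location("3.1.2.3"))  # AWS client -> S3 bucket
print(blob_location("8.8.4.4"))  # everyone else -> default registry
```

The real service would return this URL as an HTTP 307 redirect on blob requests, so clients fetch the bytes directly from the nearby bucket.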
I find it a bit sad that the only two options we have are either using central services or having everyone manage their own infrastructure (at the very least their own domain name).
I would've hoped that at this point we would have a truly decentralized solution for this sort of thing. Despite all the blockchain/dapp/web3 hype over many years, none of it offers a practical solution for anything.
We have all the pieces, it seems: torrents and DHT/magnet links work, IPFS works, webs of trust work. And yet we don't seem to manage to work collectively on truly decentralized solutions to the centralization of critical internet infrastructure. Why can't we all work together and share resources so we aren't dependent on the whims of shaky businesses that are constantly at risk of turning on us for profit?
The concept that escapes you has a name: the tragedy of the commons [1]. Yes, it is frustrating and depressing; in the long run a feeling of hopelessness settles in at our inability to share resources. We failed to do so with the bodies of dead dinosaurs, creating authoritarian glorified gas stations as a side effect, and we have failed to do so with more ethereal substances such as compute.
All the cutesy technowords-of-the-day, blockchain/dapp/web3/torrent/magnet links, are just a bandage over a greater point: ever since the atomic bomb, once we became able to destroy the planet, we needed to become a new species, evolve our cone of care. We were unable to do so and hopefully we will be extinct before we destroy the planet, let some other species have their try in a few million years, before the sun runs out.
> We were unable to do so and hopefully we will be extinct before we destroy the planet, let some other species have their try in a few million years, before the sun runs out.
Eh, we can take solace in the fact that we still don't have the capability to destroy the planet. Vastly alter the current environment, cause mass extinctions, and irradiate the planet's surface - sure, we can do those things. But the planet won't care, and life, well, life finds a way.
I find it a bit ironic that you link Wikipedia to make that point. The greatest encyclopedia in the world, a great demonstration of the potential of collectivist projects. I still remember when it was launched and everyone thought it was crazy and impossible and laughed at it.
I would also draw a distinction between web3 and torrent technologies. Torrents work great, and they don't even give users a monetary incentive to seed; people do it anyway. But web3 makes everything transactional and builds everything around individualist monetary incentives, and yet no useful application has (so far) come of it. So perhaps torrents and Wikipedia (and similar projects) work precisely because they don't build everything around the free-market libertarian fever dream.
Sure, Wikipedia is great, like finding moissanites [1] in the mud: great, but you are still in mud.
Perhaps I am too doomy, but every day - and now, with the GPT advances, almost every hour - we see a bridge being built between the information space, the decision-making space, and the 3D space of the physical world, and that bridge being restricted to only certain entities. It makes one wonder: would a Wikipedia even be possible today?
[1] Diamonds are a De Beers invention and a monopolistic violent endeavour, moissanites are cheaper, no artificial scarcity, and better looking https://en.wikipedia.org/wiki/Moissanite
Voluntary donations are the free market libertarian equivalent of involuntary taxation.
Web3 doesn't work because there's not enough value being provided, partly because paying via microtransactions is too unfriendly.
That's a hard problem, and its lack of success is a clear demonstration of the market working as intended and not rewarding something useless.
The socialist equivalent is a government owned web3 which we all have to pay with taxes and we're increasingly close to getting this.
People volunteer on Wikipedia: they write and edit articles, they do content and user moderation, and so on. Everyone can edit Wikipedia, everyone can make a Wikipedia account - there are 42 million registered accounts - and nobody is spending their time on Wikipedia for monetary gain. How does any of that possibly work, and work so incredibly well? According to everything libertarians believe, this should be completely impossible, and yet it works, because libertarian, free-market and neoliberal ideology is WRONG.
What's the incentive for people to run a node in a decentralised network, though? It will always end up being abused and misused, and dealing with that becomes a drain on one's time.
Because frankly, in a corporate network I want to add a rule to the proxy allowing access to this and that container registry and nothing else, for security reasons.
A "just a Docker registry proxy" with torrent support would be a fine enough solution for distribution. But good luck convincing anyone in the ivory tower of security that opening some random ports to the entire internet is a good idea.
The problem here is everyone relying on Docker to foot an expensive bill for free, forever. If it had always been for a fee, incentives would be more aligned and this blowup wouldn't have happened. But as it is, it was a bit inevitable.
Yeah, I kinda hoped someone would start building something that handles payment automatically with a blockchain. Like when you upload a Docker image, it automatically pays a few cents or dollars for that storage, which stays active as long as the money doesn't dry up.
Sadly that ship has sailed, at least until a dollar/government-backed blockchain that allows such things pops up. Which won't happen, I think.
> In their updated policy, it appears they now won't remove any existing images, but projects who don't pay up will not be able to publish any new images
This is not correct. It's the "organization" features which are going away. That is the feature which lets you create teams, add other users to those teams, and grant teams access to push images and access private repositories. Multiple maintainers can still collaborate on publishing new images through use of access tokens which grant access to publish those images. It's kind of a hack, but it works. You would typically use these access tokens with automated CI tools anyway. This will require converting the organization account to a personal user (non-org) account. (Interesting note/disclosure: I was the engineer who first implemented the feature of converting a personal user account into an organization account some time around 2014/2015, but I no longer work there.)
For open source projects which are not part of the Docker Official Images (the "library" images [1]), they announced that such projects can apply to the Docker-Sponsored Open Source Program [2].
I would also heed the warning from the author of this article:
> Self-hosting a registry is not free, and it's more work than it sounds: it's a proper piece of infrastructure, and comes with all the obligations that implies, from monitoring to promptly applying security updates to load & disk-space management. Nobody (let alone tiny projects like these) wants this job.
Having most container images hosted by a handful of centralized registries has its problems, as noted, but so does an alternative scenario where multiple projects which decided to go self-hosted eventually lack the resources to continue doing so for their legacy users. Though, I suppose the nice thing about container images is that you can always pull and push them somewhere else to keep around indefinitely.
The move does show, though, that Docker isn't shy about changing existing terms. So there's merit for some projects to take control of their namespace. It doesn't seem inconceivable that at some point in the future, Docker will say that exchanging tokens in a personal user account to enable "hacky" org type features is a ToS violation.
I've always found it un-Unixy that Docker merely caches layers/images as files named after opaque hashes in the file system. It's also uneconomical in terms of local disk space (the image dir keeps building up) and network usage (images are pulled to every individual host). Why not use clear names and, e.g., BitTorrent? Why tie this to a registry service over IP in the first place?
Docker layers are content-addressable. So the hashes are not entirely opaque. They are a direct result of what’s inside. Two images (mostly) share the same layers? No disk space wasted whatsoever.
Sure you could implement a finer-grained deduplication or transfer mechanism, but I doubt this would scale as well. Many large image layers consist of lots and lots of small files. The overhead would be tremendous.
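The content-addressed sharing described above can be sketched as a toy in-memory store (this models the idea, not Docker's actual storage driver): a layer's identity is the digest of its bytes, so identical layers occupy exactly one slot.

```python
import hashlib

class LayerStore:
    """Toy content-addressed store: a layer's key is derived from its
    bytes, so two images that share a layer share one stored copy."""
    def __init__(self):
        self.blobs = {}

    def put(self, layer_bytes: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(layer_bytes).hexdigest()
        self.blobs.setdefault(digest, layer_bytes)  # no-op if already stored
        return digest

store = LayerStore()
d1 = store.put(b"FROM debian base layer contents")
d2 = store.put(b"FROM debian base layer contents")  # second image, same layer
assert d1 == d2 and len(store.blobs) == 1  # shared layer, stored once
```

A client that already holds a digest can skip both the download and the disk write, which is why per-layer deduplication falls out of the addressing scheme for free.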
The local storage is mostly a solved problem with hard links. Any modern file system (I.e. not NTFS) can have arbitrarily many file paths that refer to the same underlying file, with no more overhead than a normal file system.
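The hard-link point is easy to demonstrate (the file names below are made up for the example): two directory entries end up referring to one inode, so the file's data blocks exist on disk exactly once.

```python
import os
import tempfile

# Two hypothetical layers that happen to contain the same file.
d = tempfile.mkdtemp()
orig = os.path.join(d, "layer-a", "etc-hosts")
os.makedirs(os.path.dirname(orig))
with open(orig, "wb") as f:
    f.write(b"127.0.0.1 localhost\n")

dup = os.path.join(d, "layer-b-etc-hosts")
os.link(orig, dup)  # second path, same underlying file; no data copied

assert os.stat(orig).st_ino == os.stat(dup).st_ino  # same inode
assert os.stat(orig).st_nlink == 2                  # two names, one file
```

Note this requires a POSIX-style filesystem; the deduplication step of deciding *which* files to link is the part the replies below argue about.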
The comment to which you were replying mentioned both the excessive local disk usage and the excessive network transfer, and so your comment appeared to apply to both portions. This is why I started my comment by explicitly restricting it to the case of local disk usage.
For hard links to work you still need to know that the brand new layer you just downloaded is the same as something you already have, i.e. running a deduplication step.
How? Well, the most simple way is compute the digest of the content and look it up, oh wait :thinking:
I’m not sure what point you’re trying to make. Are you assuming that a layer would be transferred in its entirety, even in cases where the majority of the contents are already available locally? The purpose of bringing up hard links was to state that when de-duplication is done at a per-file granularity rather than a per-layer granularity, it doesn’t introduce a runtime overhead.
About time someone realized that re-centralizing back to GitHub’s services isn’t a good idea in the long run as I said before [0] during the recent ‘Dockerpocalypse’ thread.
Perhaps there is some glimmer of hope for self-hosting after all.
It's been a while since I've looked into self-hosted alternatives, do you happen to have any concrete suggestions applicable to solo developers or small teams with limited time and budgets (i.e. not self-hosted GitLab)?
The last time I looked at self-hosted CI/CD, Concourse stood out as one of the more promising options.
As for code and container registry - Gitea? It seems like it has an integrated container registry now, so that's a plus.
If you're just looking for docker, the official docker registry[0] is quite literally just a docker container. The only problem is that authentication is very much geared around being totally private (as in, they recommend you just throw nginx in front of it). Couldn't find much on a "read-only" version of that.
GitLab is an overbloated mess that you can't really justify unless you have organization-style funding/tax write-offs for the server (at which point it's easily the best choice). It expects CI/CD to exist for all projects; it can run without it, but you'll be missing quite a number of features (the main example being that GitLab demands release builds be generated through CI, unless you want to manipulate the API with curl on your dev machine to upload things). It also needs a somewhat beefy server unless you go out of your way to downtune the entire thing (which requires quite a bit of configuration); a $5 VPS will not suffice.
Gitea mostly rocks and in my experience runs on even that 5$ VPS, but it does not ship with any CI/CD by design. They do have a list of external services[1] that can provide CI that can integrate with their software, so you can have CI/CD. Personally I'd recommend Gitea if you're looking to selfhost.
> Gitea mostly rocks and in my experience runs on even that 5$ VPS, but it does not ship with any CI/CD by design. They do have a list of external services[1] that can provide CI that can integrate with their software, so you can have CI/CD. Personally I'd recommend Gitea if you're looking to selfhost.
Currently using Gitea + Sonatype Nexus + Drone CI which has worked nicely for my own needs, after previously running self-hosted GitLab for a few years, but finding the updates to be a bit problematic: https://blog.kronis.dev/articles/goodbye-gitlab-hello-gitea-...
That said, Woodpecker might be a more open CI offering licensing-wise, and it works similarly to Drone.
But personally I wouldn't judge others for picking whatever else they are familiar with and what works for them, even if that choice would be Jenkins or something like that.
It’s just a game of hot potato with platforms that are willing to foot the bill for the bandwidth and storage for some other business gain.
Self-hosting your own private registry is easy and cheap but for a public registry you’re essentially writing a blank check letting randos across the internet pull your image a million times from their CI pipeline and cost you egress fees.
> To dig into this traffic, the easiest option is to use an HTTP-debugging tool (such as HTTP Toolkit) to see the raw interactions, and configure Docker to use this as your HTTP proxy (docs here) and trust the CA certificate (here).
Probably not surprising given that this is the blog of HTTP Toolkit, but instead of debugging the HTTP requests, they could have gotten much of the same information by reading the introduction of the API docs (https://docs.docker.com/registry/spec/api/#overview).
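For reference, the pull flow those docs describe boils down to two requests against Docker Hub's public endpoints: fetch an anonymous bearer token scoped to the repository, then fetch the manifest. The helpers below only construct the URLs (actually issuing the requests needs network access and the `Authorization: Bearer` header):

```python
def token_url(repository: str) -> str:
    """Anonymous pull-token endpoint for Docker Hub."""
    return ("https://auth.docker.io/token"
            "?service=registry.docker.io"
            f"&scope=repository:{repository}:pull")

def manifest_url(repository: str, reference: str) -> str:
    """Registry v2 manifest endpoint (reference may be a tag or a digest)."""
    return f"https://registry-1.docker.io/v2/{repository}/manifests/{reference}"

print(token_url("library/alpine"))
print(manifest_url("library/alpine", "latest"))
# With the token in an Authorization: Bearer header, the second URL
# returns the image manifest listing each layer's sha256 digest.
```

Each layer digest from the manifest is then fetched from the corresponding `/v2/<name>/blobs/<digest>` endpoint, which is exactly the traffic the HTTP-debugging approach makes visible.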
- there are such docs
- you can find them
- they won't lie to you
- they'll be enough to give you a picture of how everything works
It's great that Docker meets these criteria, but in general, tools that let you cut out a few of those steps and inspect how the actual thing runs live are also nice.
Ideally, consider using both: trust the docs, but validate regardless to have more confidence.
Nice! I do something similar with my podcast's RSS feed. I just put BunnyCDN in front of my podcast host's feed, create a custom domain for the CDN, and then distribute my domain instead of the vendor-locked RSS feed the podcast host provides. I've switched podcast hosts, and there's zero impact on listeners.
It allows domain owners to use a "pretty" name for their Terraform services (e.g. terraform.ycombinator.com) while pointing to a different host and path (e.g. an S3 bucket).
Decoupling the "root" of a registry from where its API is implemented is a great layer of indirection I wish existed in the container image registry ecosystem without speciaised server software to perform API-aware redirects.
Been a fan of CapRover for its ability to create a self-hosted private registry (HTTPS) at the click of a button, besides swarm/cluster support, and multiple deployment methods.
I recently set up a private registry for some custom GUI apps - it really is quite a bit of work just to set up and debug some of the less obvious issues (e.g. pushing large images).
I'm using the official registry, docker-registry-ui for auth and a web-based repository browser (which essentially proxies the Docker API requests to the registry while providing web access to list the images), and nginx-proxy-manager in front of all this (needed because the NAS I'm running this on hosts some other stuff as well).
Wholly non-commercial open source projects. Meaning, for example, even if you sell consulting services for your project on the side, you don't qualify. And it's re-evaluated every year.
Did you just discover how registries work or something? Docker Hub is just a public registry; anyone can run their own private registry, and that's what everybody usually does when running their own Kubernetes. Harbor is the usual free/open-source choice, or you can use the simple one Docker provides and run it locally. There are also other public registries like quay.io or GitHub's.
I think you misunderstood the article. This is not about running a private registry. Running a public registry (e.g. for an open-source project) exposes you to bandwidth costs and such.
The solution in the article allows such projects to use their own domain without incurring any of the maintenance burden of running a public registry.
One of the problems for open-source projects currently is that any move (even to another provider) causes a lot of disruption, because all users of the project need to update the address they pull from. This solves that.
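The trick the article describes is essentially a domain that 301-redirects every registry API request to wherever the images currently live. A minimal sketch of the path mapping (the vanity domain and the ghcr.io backend below are placeholder choices, not the article's exact setup):

```python
BACKEND = "https://ghcr.io"  # wherever the images live today; swap freely later

def redirect_target(path: str) -> str:
    """Map a /v2/... request arriving at our own domain onto the backing
    registry. Clients keep pulling from our domain even if the backend
    later moves, so no announcement campaign is needed."""
    if not path.startswith("/v2/"):
        raise ValueError("not a registry API path")
    return BACKEND + path

assert redirect_target("/v2/myorg/myimage/manifests/latest") == \
    "https://ghcr.io/v2/myorg/myimage/manifests/latest"
```

Serving this as an HTTP redirect is cheap because the heavy blob traffic still flows from the backing registry, not from your server.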
>Did you just discover how registries works or something?
That feels a bit snarky. The article opens with that path but concludes it's heavier than they want, then suggests a lighter/thinner option that meets their needs.
The K8s registry proxy redirects from hosting only on GCR to community-owned registries - https://kubernetes.io/blog/2023/03/10/image-registry-redirec...
You can view the code here - https://github.com/kubernetes/registry.k8s.io
But because everyone is already pointing at gcr.io (just like many openfaas users point at docker.io/), they're having to run a huge campaign to announce the new URL - the same would apply to the author's solution here.
I wrote some automation for hosting (not redirects) in arkade with the OSS registry - Get a TLS-enabled Docker registry in 5 minutes - https://blog.alexellis.io/get-a-tls-enabled-docker-registry-...
The registry is also something you can run on a VM if you so wish, and have act as a pull through cache.
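For the pull-through-cache setup, the open-source registry only needs a `proxy` section in its config. This is the documented option for the `registry:2` image (paths and port here are just common defaults; add credentials under `proxy` if the upstream requires auth):

```yaml
# config.yml for a registry instance acting as a Docker Hub mirror
version: 0.1
storage:
  filesystem:
    rootdirectory: /var/lib/registry
proxy:
  remoteurl: https://registry-1.docker.io
http:
  addr: :5000
```

Clients then point at this mirror (e.g. via the Docker daemon's `registry-mirrors` setting), and upstream blobs are fetched once and served locally thereafter.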
Reliability aside, GitHub's container registry is the current next best option - but we have to ask ourselves: what happens when they start charging, or when the outages last longer or come more often than the 1-2 times per week we saw in Q1 2023?