Poor Docker. They created an ecosystem and tooling that are now critical to a large majority of software teams. Many cloud providers built large, lucrative services on top of it, using their existing networks to grab market share (e.g. EKS/Kubernetes over, say, Swarm). Now Amazon can simply cut one of Docker's last remaining legs (monetizing the registry) out from under them, because the registry is just a small loss leader for Amazon.
As consumers if we just keep taking the freest lunch without any care or love for the general OSS ecosystem there will be a ton of adverse effects. This is a pretty F’ed up way to do business. This is how you get people to stop doing nice things like writing open software. It’s even more frustrating and reprehensible that they’re doing this under the guise of supporting the community.
> As consumers if we just keep taking the freest lunch without any care or love for the general OSS ecosystem there will be a ton of adverse effects. This is a pretty F’ed up way to do business.
Open source doesn't care about the way you (/we) do business.
I know it's easy to get this defeatist view about OSS, like, "the linux desktop is losing the war" or "iOS is killing open computing" or whatever, but even if that were what was happening here (it's not), open source persists. There is no war. It's just people writing code to help other people. In fifty years AWS, Apple, Google, whoever, might be dead, but code will still be here.
In this specific case, here's how I view this situation. Docker provides an indispensable public service via Docker Hub. They have had six plus years to monetize it, but it turns out, monetizing public services is nearly impossible. AWS has the scale to step in and take over some of the load, and they already have a monetization strategy. Good on AWS for this; Docker couldn't handle it, so AWS will.
> Open source doesn't care about the way you (/we) do business.
Open source doesn't write code. People write code. People who see their code get commercialized by competitors. They update their priors and next time they are in a position to choose whether or not to open source something, they do so keeping in mind what they have learned.
Are the Postgres developers mad that AWS makes money via RDS? Are the (unaffiliated) Linux developers mad that DigitalOcean makes money via a compute offering? Are the (unaffiliated) Kubernetes developers mad that Microsoft sells their product on Azure?
Unaffiliated being a critical part there, because of course Microsoft, AWS, etc contribute back to these projects. But, many, many of the developers on these projects are unaffiliated, often working at smaller companies/consultancies with no affiliation to the megacorps that make billions on the software.
Yet, I've not once noticed a single instance of the aforementioned drama. What makes Docker, MongoDB, etc. different? They're open source products developed and monetized by singular corporations, not by communities or non-profit cross-functional organizations like The Linux Foundation. It's a big corporation slighting a smaller corporation.
This may reveal a bias in online discourse, especially on HN; we revere corporations, not people. We want to protect venture-backed Docker, but the individual developers working on, say, Postgres? Less deserving. Peter Eisentraut, Bruce Momjian, and Dave Page are core maintainers of Postgres, all working at a company named EnterpriseDB, who sell managed Postgres offerings. Why doesn't anyone complain about AWS screwing over Postgres?
Well, there's something bigger at play: it's actually totally alright if megacorps use Linux, Postgres, whatever in this way. Developers enter these projects knowing it'll happen, and these projects are healthier than ever. That's a beautiful relationship right there! No drama. No one slighted. If these companies give back to open source, that's great, but I don't expect them to (exempting GPL requirements, of course), just like no one expects me to contribute every bit of code I write just because I coded it on Ubuntu using vim and linux and gnome and docker and such.
With Docker (and similar corporations), someone is slighted; it's the company that was also trying to make money on it. And I think that's just mismanaged expectations at every level: the company believing it could build an OSS community of users and builders while still extracting profit, the leaders accepting venture capital, and the users believing a company could single-handedly develop and distribute something as complex as core application containerization or a database.
To be clear: I have a lot of respect for Docker continuing to run Hub for free as long as they did. They're an awesome company. No qualifications around that; I hope they find success in the market. I just don't believe that success will come at the scale and valuation they expect, because they aren't selling what customers want to pay for. AWS is.
Yeah, expectations of monetizing open source need to be managed. Downwards. That's fine, unless you liked seeing open source get funded. Personally, I did, and I think it's a pity to watch those dreams go up in flames. I think there's also a meta-point in that, by creating a system that can't be content with taking advantage and instead must always take maximum advantage, we rob ourselves of something in the process.
Here's to the high water mark of open source. It was good while it lasted.
What high watermark? Was that when MongoDB switched their licensing on v4.0 to force all hosted service providers to open source their entire software stack if they wanted to host it, a requirement they themselves are exempt from with Atlas? Was it in 2017 when Docker locked important pieces of the Docker ecosystem behind the closed-source Enterprise Edition?
It's risky to mistake a golden age of open source for "these companies just had infinite VC funding for a while, before realizing they needed a sustainable business". So they try to strike a balance, which makes everyone unhappy, because the open source advocates say they're turning their back on their promise to the community, while enterprise advocates say they're too focused on hyperscale and not enough on solving actual enterprise problems. Some have survived, some are still in the VC honeymoon phase, many died (RethinkDB, CoreOS, etc.).
These companies, and the products they made/make, do not represent a high watermark of open source because generally speaking their products die with them. There's no community to sustain it when the corporate sponsors disappear. And here's the funny thing: If there were a community, if you built a product so awesome that people love it and develop for it and use it by the millions, you're now Docker. The open source gets you to where you are, and now you just handed all of your competitors not just market validation, but the literal specification on what to build. And what it takes to protect from that would make you MongoDB; kinda open source, definitely not FOSS, lots of closed source components, enterprise, no community. Neither of these are model high water mark open source companies, because one is bad open source, and the other is a bad company (financially).
What's the best kind of open source? Code which was written to solve a problem a company (or person) has in their business (or personal life), then released because more people than just them are having that problem. Golang? Made at Google as a language that was simpler to write and harder to mess up for engineers right out of college. Rust? Improving Firefox reliability. I almost weep at the strong story behind these projects; the technology wasn't created just to sell, it had a purpose that was validated (to some degree). Not technology for the sake of technology, nor technology directly for the sake of money, but technology for problem solving. Docker, MongoDB, CoreOS: none of these companies had problems; they invented problems, or inherited problems from previous jobs, then sold solutions.
Docker should have just remained free for personal use and small companies, say fewer than 50 employees, and charged a lot more for enterprises.
Why aren't more licenses like this? What are the downsides? Traction? I'm thinking of how Adobe made it easy to crack Photoshop, so that students, once they graduate, demand Photoshop and the Adobe Suite for their workflow.
That would have never worked because such a license would be at odds with free and open source licensing and Docker wouldn't have gotten any traction in these communities (that provide the infra backbone of the whole stack).
I get your intention, but monetizing infrastructure and systems code is pretty hard.
If containers caught on, someone would write a replacement. I mean, podman exists. For the majority of use cases, you need only a tiny fraction of Docker's features.
I wonder this too. I started using drone.io for CI early this year, and part of the reason was their no-BS license where anyone making under $1 million could just use it.
Unfortunately they got bought by Harness, which is already slicing it into tiers [1], so it'll turn into the same feature-roulette crap (like GitLab) that I was trying to avoid.
If I had input on the business model of some of these companies I'd say that every effort should be made to give individuals and small developers a competitive advantage if they choose to use your product and a big part of that is getting rid of tiers for technical features.
A really good example of how not to do it is GitLab IMO. Look at the way they reserve container scanning [2] for their top tier. Remind me why only large enterprises need security. The implementation of that feature isn't even good IMO.
Scanning an image at build time is the wrong time to do it. It should be more like Harbor [3] where it's a scheduled scan by the registry and everyone should have access to it. The feature tiering could be done via the deployment, traceability, and monitoring of the containers. As a small developer I can do those things by hand because I know everything that's going on and what's currently deployed. I can track it in a spreadsheet if needed. Large developers need a way for (ex:) a director to log in and get a report for all the vulnerable images running in production.
As is, it's a checkbox feature, so I can see why it makes sense to have it in an enterprise tier. At a certain level it's about checking the box and covering your ass, so if / how it works doesn't even matter at that point. As long as you can claim due diligence that's all that matters, right?
I mean GitLab is making $0 from me right now, so why not just give away everything until a company has 5 developers or $1 million in revenue or something. Make it so appealing that it would be stupid for small developers to use anything besides GitLab because it would be impossible to compete in terms of productivity.
As it is, I hate dealing with the feature tiers so much that I gave up and switched to Gitea with other self-hosted solutions for CI, registry, etc. I want to spend my time writing code, not trying to figure out what features I can use with my current plan.
> I want to spend my time writing code, not trying to figure out what features I can use with my current plan.
And I want to spend my time writing code, not maintaining dozens of self-hosted applications, each with their own kinks and requirements.
What I dislike about comparing features is finding out what's missing and not supported at all, in any tier. Many of these are "you'll know it when you see it" kind of features. For example, they will list hardware 2fa as their feature, but they won't mention you're limited to only one key per account. Some of those are tiny inconveniences, others are deal-breakers. And you keep finding them, but you've already organized your workflow and switching to another vendor might be costly, mostly in time.
Open source is a trade-off. In many cases, it's rocket fuel for early and mid-stage growth in exchange for reduced barriers to entry for competitors. It also points out a real weakness in infrastructure: why is provisioning, configuring and deploying code so difficult and expensive? Shouldn't we have OS level tools that make this trivial at this point? It's pretty amazing that we're taking patterns like load balancer + Shared Cache + Database + Libraries + App Code + config files and assets and making it so difficult to configure them and deploy them. Reducing deployment complexity is still a huge opportunity...
> Reducing deployment complexity is still a huge opportunity...
I agree with that. There's probably even room for another CI system that's highly opinionated and focused on building images. Most of the existing ones kind of suck at building images.
Deploying apps is truly ridiculously complex. This is the problem I really see "no code" addressing. To me, writing the logic of an application is easy compared to the ridiculousness of publishing the code somewhere.
Before Netflix and iTunes/Spotify, people were convinced that nobody would ever pay for movies and music on computers ever again.
Turns out a product more convenient than piracy was worth people paying money for. Now there are 18 streaming services and piracy rates are climbing again.
Ebbs and flows.
Open source software is amazing, but infrastructure development is a substantial part of any business's development costs.
AWS offers higher level abstractions - for a fee.
In our industry people always take that option, until the cloud providers' value proposition is no longer attractive, and open source innovation flourishes again.
AWS makes really good stuff, there is value in that.
> This is how you get people to stop doing nice things like writing open software.
Docker stopped making open software before they limited the pulls. Docker Enterprise for Windows (Windows dockerd, basically), Docker Desktop for Mac, and Docker Desktop for Windows are all closed-source and proprietary.
Docker-the-company hasn’t cared much for open source in a while.
The thing that annoyed me was the telemetry in the Mac installer before you even did an install.
Additionally, they made it hard to run personal registries, which is what's going to happen now anyway. I think they tried to keep themselves in the loop too much, because it's valuable for the business, and monetizing was a long-term goal.
"It turns out this is actually possible, but not using the genuine Docker CE or EE version.
You can either use Red Hat's fork of docker with the '--add-registry' flag or you can build docker from source yourself with registry/config.go modified to use your own hard-coded default registry namespace/index."
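For reference, the Red Hat fork exposed that via daemon options; from memory it looked roughly like the following (exact file and option names recalled from memory, and registry.example.com is a placeholder):

    # /etc/sysconfig/docker on RHEL/CentOS (Red Hat's fork only; not upstream Docker)
    ADD_REGISTRY='--add-registry registry.example.com'
    BLOCK_REGISTRY='--block-registry docker.io'

Upstream Docker never accepted --add-registry, which is rather the point: the default registry stayed hard-wired to Docker Hub.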
Docker Desktop is terrible spyware. I managed to crash it a few times recently and intercepted the reports before they were sent off; it collects a ton of sensitive information about your computer in the crash report it sends, including pcap logs(!).
I’ll never run code from them I didn’t build myself. That company is super shady now.
It's not hard, it's just clunky. You have to put the full URL of your Docker registry every time you reference the image. There's no way to say "just always use this registry instead".
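Concretely (registry.example.com/myteam is a made-up example): Docker Hub images get short names, everything else carries the full registry host everywhere it appears, including FROM lines.

    # Docker Hub: short name works
    docker pull nginx:1.19
    # any other registry: full hostname required, every single time
    docker pull registry.example.com/myteam/nginx:1.19
    # ...and the same inside Dockerfiles
    # FROM registry.example.com/myteam/nginx:1.19

The only daemon-level knob is registry-mirrors, and that only mirrors Docker Hub itself.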
I haven't found it terribly difficult to set up either, just annoying.
Didn't they just stop launching new OSS projects and focus on maintaining what they've already launched? They've been criticized a lot over the years (correctly, I think) for lacking focus. Are we now criticizing them for narrowing down their focus?
It's not ironic if you consider that the point is that software currently paid for by other companies or volunteers may no longer be written if huge companies take all the upside.
Yeah, this is the ambiguity GNU/FSF people go on and on about. “Free” in the interesting sense means “available to modify and use as you like, as long as you don’t prevent other people from doing the same”
The tale of Docker is the tale of a non-success business, and unfortunately in those stories, a lot of problems created a lot of bad fallout. So depending on the lens you view it through, you can find villains all over the place. IMO, one of the biggest problems was that the Docker moat was primarily mindshare. The core tech popularized containers, but it was thin enough that there's now the OCI, many other competing container runtimes, and Docker plays no role at all in many large container usage scenarios (large public cloud operators). Kubernetes and really a lot of other clustering solutions always seemed to outweigh Docker Swarm, which was unfortunate too, because that was probably one of the better monetizable vectors for the company.
If you compare all of this to a company like HashiCorp, which in my opinion is just expertly run from the top, it's a night-and-day tale. You can build great developer businesses, but you have to be careful, and probably still a little lucky. Docker had poor luck, but I think they also had a lot of problems with not fully thinking through the business side of the product.
> This is how you get people to stop doing nice things like writing open software.
I mean... it doesn't look like people show any sign of stopping writing open software. There's more active FOSS than ever before; in large parts of the industry it's the norm.
I agree these are issues to talk about, though "OSS is in trouble" is a pretty weak argument. It's obviously not.
There’s certainly some threshold, but maybe past that you have to adjust your definition of “great”.
I think the technology of Docker just doesn't line up with a venture-scale business. So it's not an execution failure, just wrong from the start. If they had been successful at a huge scale, I think it would have required almost a pivot-level change, using Docker as a springboard. Like offering container hosting or something.
I wonder how they would have done if they just stayed a 50 person company.
Even 50 person seems like a lot. I think people tend to forget that there's a lot of room at the bottom. If you want to give your product away for free you need to accept that there'll be less revenue, unless you can pull off the neat trick of segmenting your market almost perfectly e.g. the only people who use the free version are people who wouldn't have paid you anyway and your marginal cost is zero.
Could Docker as a developer-tools business have been managed with 10-15 people? Quite possibly. Especially if it didn't try to become a generic VM on other platforms.
A registry is not really something that can be monetised easily. The Java ecosystem, for example, has had plenty of open and free registries for a while, such as mvnrepository, Maven Central, and JCenter.
In fact, I would argue they should let go of their registry and focus more effort on products/features that could actually be monetised. Even there, the k8s ecosystem is miles ahead of Docker Swarm.
That's because they're generally extremely easy to replicate. "Repository" generally implies that all of the intelligence is baked into the client, and the server is just some minor variation on an FTP or HTTP server (as Linux repos tend to be, PyPI afaik is largely the same way, I don't think NPM does anything super clever server side).
The flip side of that is something closer to a CDN, where clients are dumb and the server/network is smart. Those are monetizable because the servers have to be featureful, so you can win on features and charge money for those features.
I struggle to think of any really great value add features for a repository. There's vulnerability scanning, but the successes with that seem largely meh, and there are dozens of competitors in that space already.
I think they were hoping to ride the "enterprise" crowd, because no one gets fired for buying the canonical implementation. But then Red Hat acquired Quay, and that was basically the end of that, since Red Hat is enterprise and already has about a half dozen products in the ops space.
Honestly, as soon as they announced the new rate limits, it seemed obvious that this would happen: different platforms will implement workarounds.
I'm sure the same would happen if npmjs or pypi started charging for using their package registry.
There are lots of hobby programmers, or simply companies from economies where they can't afford to pay 5 USD/month. For example, a friend of mine pays a total of $0.95 a month for his ecommerce needs.
I moved off of Docker Hub because of the number of unfixed bugs and lack of features. Builds sporadically failed for years, they took forever to add any kind of MFA at all.
Docker also has been a business since day 1. They didn’t write open source software because it was a “nice thing”, they saw a market for an open source container solution. I’m thus not worried that using non-Docker-operated registries will impact the future prospects of open source.
The problem (okay, "a" problem) here is that it is indeed OSS, and not FOSS, that is being used.
It might be worth re-visiting "Post-Open Source" by Melody Horn (boringcactus) [1] on this. You might want to copy/paste that link instead of clicking. :)
It has been discussed here before [2] and (unsurprisingly, IMHO) gotten quite a bit of blow-back from the HN crowd.
The General Intellect Unit podcast recently had an episode on the piece [3] and I think they made the point clearer (well... also they roll the piece out to 1h 40min) by more openly posing it in a class struggle context. (Side note: please don't post replies attacking this as communist. The podcast has "marxists" in the name, you wouldn't be insulting them, and I'm merely engaging with their content, so if you're gonna try to attack their/my positions, please do the same and don't just throw scary words.)
The bottom-line is that Open Source Software has nothing to do with the freedoms of the developers. It is...
* a corporate entrenchment scheme: enlarging your hiring pool by establishing your in-house solution as the industry standard, giving you full hiring pipelines and stronger negotiation position against current and prospective employees.
* a capitalist commons: by tending to one industry standard, the companies (not their devs, who will still have full work days) save on duplication of effort.
At the same time FOSS (which does focus on the freedoms of the devs to read, modify, learn...) kind of misses the point by being...
* too obvious a trap, and thus ineffective. No sane capitalist company will go anywhere near AGPL code (and we just had an example on HN of someone trying; it wasn't pretty)
* ultimately not even clear on how this will make anyone's lives materially better, even if it were effective. It's not gonna raise your salary, nor give you more paid vacation, health insurance. As a dev, you'll still be producing surplus value (okay, in a hopefully less alienated way) that will then not be paid to you (that's the definition of profit) so from this perspective the "F" in FOSS doesn't really bring much to the table.
The GIU podcast concludes that yeah, working in IT was nice for a while, because it posed some fundamental problems to capital and as a dev you were in a much better negotiation position. However, those times are over. Capital has found a way to make IT labour just as fungible as manual labor, at least to the extent that we should expect those "rock star" benefits to dwindle soon. Up to us to realize that and take action.
Again: relaying analysis here. Not sure I buy all of it. But even if, please keep your replies centered on the content, not the vocabulary it's presented with.
I personally haven’t worked on a software team not using containers in around 8 years. One place was kind of an exception - containerization was on the road map, but not in use in production yet. They’re everywhere. I’ve deployed 3 containerized services in the last 6 weeks - 2 using Go and 1 using Node. They’re kind of the default these days, especially when deploying to AWS/GC.
I assume it must be similar to some degree for a lot of developers here.
I'd want to see some kind of white paper (even a vendor-supported one) indicating that a majority, or even close to it, is using containers.
As far as I can tell, the state of the art for enterprises, which still make up the bulk of software development, is ASP.NET and J2EE, deployed to bare metal or VMWare VMs.
The hacker news bubble may be stronger than realised on this topic.
500GB/mo. presumably per IP address for anonymous pulls. Amazon have absolutely no problem revealing the true cost of egress when it suits them. Also, "500GB/mo.", at launch. Let's not forget those generous free limits can be slashed whenever it suits, and why they might be so generous for launch at all.
Don't be mistaken, actually using this for public images is falling for another "data gravity" marketing gimmick. So they attract enough public images that some folk find one or more of their services absolutely must pull from AWS, pushing their corporate NAT address over the 41GB/day limit. Now your "free" hosting has become a reason for that company to start paying ~$45/month to AWS in egress fees just to use your images, and perhaps with a new ticket in the backlog to consider simply moving the cluster to AWS.
"But it'll always be like that regardless of provider", not necessarily. The reason that Docker registry consumes a ridiculous amount of bandwidth is largely an architectural issue, both in the software and the design of the client's network. Perhaps a better direction is addressing this immense waste of networking rather than turning it into another free marketing opportunity for a near-monopolistic cloud
This entire post is just unnecessary fear-mongering.
>Also, "500GB/mo.", at launch. Let's not forget those generous free limits can be slashed whenever it suits, and why they might be so generous for launch at all.
AWS has, to my knowledge, never increased the cost of any service after launch. They do, however, have a long history of reducing prices.
You also completely neglected to mention this part:
>Simply authenticating with an AWS account increases free data bandwidth up to 5 TB each month when pulling images from the internet. And finally, workloads running in AWS will get unlimited data bandwidth from any region when pulling publicly shared images hosted on AWS.
So yes, AWS gives some incentive to create an AWS account, and an even further incentive to use AWS services. All I can say to that is... duh?
The problem is not that they give incentives. It's why they give incentives. Why do they want market share? It's simple: they're a business, and they want to milk you for money.
And like their forebears, they understand well that the more they lock you into their solutions, the more likely you are to keep paying them money even when you're mad at them for something they inevitably do that you don't agree with.
1. The company I work for is already fully locked into AWS with years of work needed to go anywhere else.
2. Duh, this is an incentive to move to their cloud. Duh duh duh. Basically everything any corporation does is in some way intended to keep you locked into their system.
3. The company that I work for pays Amazon A LOT of money every year. As in, millions a year and almost certainly a month, too. If this allows us to pull images without limits, avoid having a separate billing agreement with DockerHub, and provides faster speeds from the regions where we're deployed in AWS, then this is a compelling offering from Amazon and my company would likely consider it.
>fully locked into AWS with years of work needed to go anywhere else
That’s the thing that I don’t understand... unless you’re treating cloud vendors like a traditional on-premise data center, you’ve already got lock in. And if you do treat them like data centers, it’s going to cost a lot more than just running a data center.
And if you decide to switch from AWS to Azure or GCP or Oracle, what happens to all your engineers who specialize in AWS? Are they going to forget everything they know and go learn something brand new and start from scratch? Or are they going to just quit and find a job working with AWS that they already know? You’re already locked in to every tech decision you make, and if you make the decision to not be locked in, you’re locked in to only vendors who don’t have lock-in which is a form of lock-in itself. Either way you’re gonna be paying a lot more than your competitors who are just focused on running their business.
It’s wishy-washy lack of decision making. No one cares about your tech stack, just pick one and get back to running your business.
You're a funny guy. Nothing is wrong with a business making money. What's wrong with Amazon is that they're going to make extra money through extortionate business practices and you're going to be shit out of luck looking for an alternative after they crush all of their competitors in ways that are not healthy to a functioning market.
The pricing (or lack of pricing, actually, since the only thing mentioned in the post is that people get 5TB/mo of transfer for free and AWS makes no money directly from this) is actually less costly than Docker Hub's new model. Are you telling me that Docker Hub is extortionate, too? Come on.
Also once you have 5TB/mo in container transfer bandwidth, you probably can spare a couple employees to set up your own solution if you can't use AWS. You can't realistically use that much in a smallish company.
> This entire post is just unnecessary fear-mongering.
You perhaps missed the bottom 30% suggesting there may be better options.
Docker's registry protocol treats layers essentially as opaque slabs, as any user will no doubt be aware, where update of a single timestamp or byte of a file could mean the reupload (and download) of a potentially unlimited amount of data. While that architecture has its perks (simplicity, sequential IO patterns suited to the magnetic disks of 2013), there are obvious downsides.
We have plenty of systems just like Docker capable of managing Docker-like file counts without the obscene bandwidth abuse; we just don't call them registries. They have names like Perforce, Subversion, and even (some Microsoft variant of) Git. Eliminating the data gravity angle is impossible, but reducing its potency by several orders of magnitude is not only possible but has already been done repeatedly in many popularly deployed systems.
This is not to suggest replacing the registry with Git, but perhaps some graph-like manifest format combined with a new "fetch-multiple-objects" protocol verb would be all that's required to bring the cost of running a registry down to essentially a non-issue.
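Purely as a hypothetical sketch of that idea (nothing below exists in the v2 registry API): a file-level manifest plus a batch-fetch verb would let a one-byte change re-transfer one small object instead of a whole layer.

    # hypothetical file-level manifest: every file listed by digest
    /usr/bin/app        sha256:aaaa...
    /etc/app/config     sha256:bbbb...
    # hypothetical batch verb: the client asks only for digests it's missing
    POST /v2/myapp/objects/batch
    {"want": ["sha256:bbbb..."]}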
You seem to be all over the place with your issues. Are you mad that AWS has bandwidth limits, or are you mad that AWS is encouraging the use of Docker altogether, or do you just have some kind of vendetta against the boogeyman that is AWS?
Docker registries are how things are done right now. And the changes to Docker Hub that happened this week have a significant impact on the Docker ecosystem. This announcement is about solving that impact now, because developers need a solution now. This doesn't have anything to do with any hypothetical improvements to the Docker ecosystem or Docker data usage that may otherwise be in progress by [insert company here].
At no point in my comment did I say anything about what's allowed to be discussed in this thread. You meanwhile, built a strawman that first started with complaints about the bandwidth. When called on it, you deflected to a different point about Docker architecture. And now, when called on that, you deflect again.
Please reflect on the constructiveness of your own comments. You are the one intentionally deflecting discussion.
For commercial use, $45/month is a low enough cost to be noise for many businesses (even though the bandwidth transfer markup is eye watering). If it’s breaking your budget, you can always pull and then push to another registry (it’s just tarballs after all) or build new images from scratch with your Dockerfiles. Open source projects and similar cost sensitive use cases might consider https://github.com/miguelmota/ipdr
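The pull-and-push dance is only a few commands (my-registry.example.com is a placeholder):

    # copy a public image from Docker Hub into your own registry
    docker pull alpine:3.12
    docker tag alpine:3.12 my-registry.example.com/mirror/alpine:3.12
    docker push my-registry.example.com/mirror/alpine:3.12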
Docker Hub is similar to Youtube in the sense that it’s a whole lot of bandwidth and storage used and someone has to pay for it. Could you run your own? Time is expensive. Don’t unless you absolutely have to.
$45/mo would be the minimum payment, assuming the free limit announced today actually lasted. Let's say they begin crying 12 months from now and knock it down even by 50%, now you're looking at $810 minimum per year per client site. Multiply that up by, say, 1000 sites, and you're starting to look at a "free" service that might completely pay for its own development within a year or two.
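Rough math behind that, assuming the standard ~$0.09/GB egress rate (my assumption; there's no separately published registry price):

    500 GB overage x $0.09/GB            ≈ $45/mo   (today's floor)
    free tier cut 500 GB -> 250 GB:      +250 GB x $0.09 ≈ $22.50/mo
    ($45 + $22.50) x 12                  ≈ $810/yr per site
    x 1000 sites                         ≈ $810k/yr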
That has historically not been the direction AWS prices go, but as I mention in my other comment, it’s trivial to move if AWS tries to put you over a barrel. Object storage is a commodity, and Docker images are objects, so shrug. It would be cool if Cloudflare supported container registry primitives backed by user provided Backblaze B2 bucket config info, for example (maybe you can do this with workers?).
Properly parameterize your build/CI scripts (in this case, your container registry host and how to auth against it) for portability. Portability is your insurance against poor service provider behavior.
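A minimal sketch of that parameterization (variable and host names are just examples):

    # CI script: never hard-code the registry host
    REGISTRY="${REGISTRY:-registry.example.com}"     # override per environment
    IMAGE="$REGISTRY/myteam/myapp:${CI_COMMIT_SHA:-dev}"
    echo "$REGISTRY_PASSWORD" | docker login "$REGISTRY" -u "$REGISTRY_USER" --password-stdin
    docker build -t "$IMAGE" .
    docker push "$IMAGE"

Switching providers then means changing one variable (and credentials) instead of hunting through Dockerfiles and pipelines.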
Sure, but say your engineers cost you $100/hr, and every hour they work beyond 2,000 hrs per year makes them more unhappy.
I think it's worth spending the 8 hours/yr/site, supposing you have relatively few sites. Plus, your engineers for 1hr/month will never achieve the same SLAs that AMZN will.
For smallish numbers of sites, it seems like the AMZN solution is fine. For bigger ones, there is always enterprise pricing :-)
Does AWS raise prices like this? (I know Oracle does - OUCH.)
I've yet to be hit by one of these price increases and have been on AWS since the beginning. Until I migrated, they supported my SimpleDB app FOREVER (I mean, it wasn't even on the website at one point, but the calls still worked!).
A small part of AWS success must be because of this confidence they will stick around, keep a service running.
Unlike other companies, AWS understands how to keep customers and maintain goodwill. They start with moderate prices and then decrease them as they better understand the market. Never increase them. You won't see the same great initial deals from AWS as from other providers (e.g. Google's Kubernetes offering), but you can be certain your deal won't go away.
For what it’s worth I’m not aware of AWS ever slashing a free tier or increasing pricing, though they do so much that I definitely could have missed something. Does anyone know of any examples particularly of them slashing a free tier later?
Everyone is beating him up over his incorrect predictions about AWS pricing changes.
But nobody focuses on the most important point: Somehow AWS can give you a lot of egress traffic for free, something _you_ would pay a lot of money for. This is clearly showing that egress pricing is heavily inflated so every cloud can print a lot of money.
Egress pricing is there for another reason as well. It's meant to make it hard to move out of AWS.
If one looks at the pricing for AWS Snowball [0], their bulk data transfer device, there's no per gigabyte cost to get data into AWS, but $0.03/GB to get data out. I'm pretty sure the bits are just as heavy coming into AWS as they are coming out. :-)
I think it's also a convenient way to get rid of piracy and garbage content.
You can't host MegaFileShare on AWS because it would cost too much and AWS doesn't have to bother about useless legal discussions since your value to them as a customer is really low for this use case.
Not necessarily. When bandwidth is balanced you can get peering agreements, but AWS, being mostly servers, will be very outbound-heavy. Moving data in just improves their ratios and helps them reduce their own costs, so it makes sense to make it free.
In regards to "But it'll always be like that regardless of provider", it's worth noting that distributed file-systems are situated perfectly to provide a torrent-like way of downloading containers. Self-blog just because it's easily at hand: https://blog.bonner.is/docker-registry-for-ipfs/
Why is it so hard to have effective mirrors of images?
Images are basically content-addressed; it should be easy to create an ecosystem where, when you first use an image, you pin a copy on a private cache, instead of shuffling hundreds of GB around to each leaf node of your clusters.
This way cloud providers could still sell you a managed service that implements such a cache, but it would be used transparently instead of having to change your manifests and/or FROM stanzas.
Netflix iirc has implemented a registry that uses IPFS. But talking to the right registry still requires that the image name includes the hostname of the registry.
The docker daemon has a registry-mirrors option; you can already use that to point to https://mirror.gcr.io . But you can use it only to cache the special "docker hub" images.
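For reference, that's a one-line setting in /etc/docker/daemon.json:

    {
      "registry-mirrors": ["https://mirror.gcr.io"]
    }

Restart dockerd afterwards (e.g. systemctl restart docker). And as noted, it only applies to Docker Hub images; anything named with an explicit registry host bypasses the mirror.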
This seems to be a self-inflicted problem. Docker the company possibly wanted to be in the critical path.
Artifactory does precisely this. It's particularly useful since large organizations want to minimize their risk of, for example, a Docker Hub outage preventing an emergency hotfix. In Artifactory you can set up 'remote repositories' which are then overlayed on top of your own repository within Artifactory. Docker provides the `registry-mirrors` setting which then will automatically redirect all pulls to Artifactory rather than Docker Hub. There's no need to change your manifests and/or FROM stanzas.
But it's not transparent. Everywhere you use the image you have to change its name. This means more templating, more complexity, etc.
Assuming it’s similar to Nexus, you can combine multiple registries into a group and you configure the Docker daemon to use your local registry as the (only) mirror.
So if you have ithkuil/project on Docker Hub and donmcronald/project on GCR, you could reference them both with those short names and Artifactory or Nexus would deal with fetching them from the correct remotes.
The main downside is that you introduce the potential for namespace collisions. A second downside is that you become dependent on having that local registry aggregating images from multiple registries. Personally I don’t like it.
1) You can configure docker's daemon.json file on Kubernetes in the same way as every other system running Docker (in fact, you can apply it with a daemonset if you want to).
2) Yes, every other image mandates where you should get it from - Docker Hub's are unique in that regard.
Normally the registry the image is pulled from is configurable in the Helm chart.
AFAIK, the FQDN is used for connecting to the registry, but it's not passed to the registry. So when you pull `ecr.example.com/namespace/project` the daemon connects to `ecr.example.com`, but the registry only sees `namespace/project`.
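In terms of the actual requests, that looks roughly like this (the manifest path is the standard v2 registry API; the hostname is a placeholder):

    docker pull ecr.example.com/namespace/project:latest
    # the daemon connects to the host it parsed out of the name...
    GET https://ecr.example.com/v2/namespace/project/manifests/latest
    # ...so the registry only ever sees "namespace/project",
    # never which hostname the client used to reach it.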
There was an old proposal [1] to pass the FQDN to the registry. That would allow for an auto-magical pull through cache / mirror combo along the lines of what you're asking about.
For now I use a Nexus as a pull through cache, so `hub.example.com/library/image` is an image from Docker Hub where I want it to come through the cache. Anything that's important enough to have a local copy gets pulled/pushed into my private registry. I actually find the distinction between the cache (volatile) and the registry (non-volatile) to be useful.
If I had to take a guess as to why Docker stubbornly clung to the current system it's that it's adhesive. Once you start using an alternate registry for base images, etc. it becomes really tough to swap to another one. The big mistake there is that instead of being the go to destination for public images, everyone is wholesale switching to the registry from their preferred cloud provider.
Exactly. Docker made it sure they were in the critical path.
If you compare this to Debian or Ubuntu packages, the same use case, they have full support for mirrors. When you run a Debian VM in the cloud, it automatically connects to the AWS Debian mirror to pull packages. Everyone wins.
Docker distribution was designed to keep Docker the company in control, not to support mirrors. Naturally it's getting substituted by competitors, because they have no other choice.
I made a Terraform module that mirrors Docker images between two registries.
I built it to deal with GCP infrastructure (private networking and Cloud Run). When we migrated to GitLab, we used it to coordinate external dependencies between Docker->GitLab registries, using GitLab CI and their Terraform support. [Internally built images are just pushed directly during CI.]. We declare the mirror state and Terraform makes it so.
The spec is ambiguous about what to do if a blob with a matching digest is found in the registry but not in the repository namespace <name>. The layer deduplication section of the spec does state that the same blob MUST be shared between repositories. The manifest digest endpoint suffers from a similar ambiguity.
In any case, all I wanted to say is that it would have been quite possible to implement docker registry as a purely content addressed store + a key/value layer that binds human readable names to it. It's almost, but not quite that.
My point is that it would have been trivial to define the registry protocol to just require the digest. Do you see any reason why a pure content addressed registry protocol could have not existed?
No, I absolutely agree with you. It could have been a CAS design, but right now it's not. It would take a V3 of the Registry API -- I commented on that in the DigitalOcean thread: https://news.ycombinator.com/item?id=24983077
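A hypothetical sketch of that "pure CAS plus a thin naming layer" idea, just to make it concrete (neither endpoint exists in this form today):

    # 1. tiny key/value layer: resolve a human-readable name to a digest
    GET /v3/resolve?name=library/nginx:1.19
    -> {"digest": "sha256:..."}
    # 2. content-addressed store: fetch purely by digest, no repository namespace
    GET /v3/blobs/sha256:...

Compare with today's v2 API, where blobs live under /v2/<name>/blobs/<digest> and so always drag the repository name along.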
If I were Microsoft I’d let Docker Hub just have free or extremely cheap bandwidth on Azure. Get good PR, help make sure that people can still easily move their applications to Azure (whether as a container or as part of a big k8s setup). They need it to stick around to make sure people that get going in AWS can move their stuff easier.
> Within weeks, AWS will deliver a new public container registry that will allow developers to share and deploy container images publicly. This new registry will allow developers to store, manage, share, and deploy container images for anyone to discover and download.
> ...
> A new website will allow anyone to browse and search for public container images, view developer provided details, and see pull commands — all without needing to sign in to AWS. AWS-provided public images such as the ECS agent, Amazon CloudWatch agent, and AWS Deep Learning Container images will also be available.
I recall that in the early days, the Docker registry API was designed in a way that made serving from S3 impossible.
It always boggled my mind, because it must have made hosting docker hub super expensive.
In hindsight, the registry should have had a better API, been more suited to third-party mirrors, etc.; then Docker might still own the central registry... Now it seems like eventually they won't even own that.
I had the same thought. There should have been a play where you were the trusted metadata layer (across clouds even), but allowed actual images to come from S3, GCP, Azure directly (ie, serve metadata + hashes / signatures). Obviously AWS and everyone would spin up their own mirrors, lots of bandwidth would be saved (by Docker too) and they would still have a role in the center.
Does this in some way end up in an anti-trust vertical integration situation?
Dockerhub runs on AWS infrastructure to begin with. Is this similar enough to the railroad company undercutting the freight company that uses their railroad tracks?
Except other competing services also run on AWS, and have for years (some depend solely on AWS, like Heroku; others are multi-cloud, like MongoDB Atlas)
There are TONS of companies running on AWS that face competition from AWS. Snowflake just IPO’d. They run on AWS and sell a data warehouse service. AWS also has its own data warehouse called redshift
AWS has a lot of catching up to do when it comes to Docker support. It is so easy to use Docker in Azure. You can create a web app (App Service) directly off of a Docker image. When you make code changes just push the new image and the web site automatically updates. Even more interesting is Azure Container Instances where you run a docker image briefly and pay only for the brief time you run it.
I use AWS lightsail, which is awesome, but would be even more awesome if it supported Docker images out of the box.
It seems to me that Docker created something so absolutely amazing that everyone decided it was essential. Then it became an assumption. And then everyone started taking it for granted. To the point where Docker doesn't even get credit for what it does.
Personally I believe that what society rewards is quite perverse and often random.
Anyway before you jump on this AWS registry at least compare the pricing with Docker Hub.
Look at it another way. Docker is a complete PITA with many severe flaws as a piece of software, but it won because it raised investor money and spent it on stuff that was then given away for free. Basically it funnelled investor money into everyone else's budgets, which is a pretty sweet deal. If they hadn't done that, and if they'd worked on scaling their business in line with revenues, then quite possibly they'd have (A) more competitors and (B) better software. But it's hard to compete with free.
I am using Rancher and it has an option to configure private registries. I cannot use a user/password combo to log in to the registry without using the aws command.
Is there an alternative containerization platform which has a more structured and consistent CLI?
I found the CLI always a bit hard to grasp at the beginning, and still dislike it somewhat.
The general structure is `docker image pull` to pull an image from some registry, or `docker image build` to create one from a Dockerfile, then `docker container create` to turn the image into a container, and `docker container start` to start the container.
But then there are a few other commands which essentially seem to be combinations of these, or accomplish these as a side-effect.
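For example, `docker run` is essentially pull + create + start (plus attach) rolled into one:

    # the long way
    docker image pull nginx:1.19
    docker container create --name web -p 8080:80 nginx:1.19
    docker container start web
    # the shorthand that does all of the above (and attaches, unless you pass -d)
    docker run -d --name web -p 8080:80 nginx:1.19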
I can't put my finger on it, but there's something that feels predatory and maybe zero-sum-ish about this. AWS wins, docker loses. I guess this is considered a "shrewd" business move. I don't like it.
Well... If someone is making money using the free or anonymous plans of Docker Hub, then the simplest solution would be just switching to a paid plan. No AWS "solutions" needed.
I think the real benefit/monopoly pull of this is free unlimited bandwidth from inside the AWS cloud. If you’re making money and switch to a docker hub paid plan, your paying customers using AWS might ask you to host on AWS too so they get trivial and free pulls. And it’ll probably be cheaper for you too. The AWS solution might not be needed but it’ll be hard to avoid if your paying customers are already on AWS and you want to offer them the best experience you can.
I expected our offices would hit them if our devs weren't working from home. That'd be all the devs at a given site pulling images from the same public IP address.
Although in case it's what you're asking: we haven't actually hit the limits yet. They're being progressively enforced. We're just taking steps to avoid being impacted when they do start affecting us.
>Especially in CI pipelines that like to rebuild images from scratch.
If people are doing that at scale on free accounts then I can see why dockerhub feels the need to impose limits on their free offering. Also...this is why we can't have nice things.
We hit them with our CI processes. Actually, I was a bit surprised that it happened because we only do 10-15 builds a day which shouldn't have triggered the throttle. Maybe there are some background checks that are happening in CircleCI that we don't know about or something.
Most CI systems use GET requests to fetch image manifests, in order to see what the registry's most recent image is. These requests are counted towards the limits in Docker's new rules.
Systems which built on top of the GGCR library[0] are switching to using HEAD requests instead[1]. These don't fetch the entire manifest, instead relying on just headers to detect that a change has occurred.
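The difference in practice, against the standard v2 registry API (image name is just an example, and the Bearer token you'd need from auth.docker.io is omitted for brevity):

    # GET: downloads the whole manifest body just to compare digests
    curl -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
         https://registry-1.docker.io/v2/library/nginx/manifests/latest
    # HEAD: returns only headers, including Docker-Content-Digest,
    # which is enough to tell whether the tag has changed
    curl -I -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
         https://registry-1.docker.io/v2/library/nginx/manifests/latest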
I don't see how. They only lose money on public anonymous pulls.
Docker's eventually going away and distros will replace it with a home grown alternative (or systemd). I predict that in 10 years distros will finally move toward a container native model (different from what we have now, but still based on containerization) and host their own registries for officially packaged containers. The same people that mirror their packages will host the registries. You'll add trusted registries the same way you add trusted package repos. In theory, this might actually reduce their overall bandwidth use.
Mindshare as THE go-to solution? They wanted to become Github for their space. While you can use private package registries, I suspect that's a 1% use case right now.