The fact that this is EPEL strongly suggests that it was set up by an AWS user, not by Amazon themselves. EPEL is not used by default in any common AWS AMIs. Perhaps it is an Amazon Linux user who enabled EPEL via Amazon's package, but EPEL is not supported in the most recent version of AL, so Amazon seems to have addressed that issue anyway.
More likely this is a change in a popular AMI or container image that is being used by a lot of different users, e.g. a startup script that unnecessarily pulls or syncs from EPEL.
I suspect there are only a handful of vendors with popular enough AMIs or container images that could account for this, though.
> A user with "five million additional systems" on AWS?
Someone is going to be in for a big surprise when they get their AWS bill this month and realise there’s an infinite-loop bug in their instance spawning script.
A long time ago I knew a guy that uploaded a Counter-Strike patch to his ISP personal hosting and ended up on the official mirror list. Ended up taking down the ISP iirc.
I don't think it's one user; I think it's a ton of them. Want to use Let's Encrypt in your Openshift-on-AWS deployment? certbot's in EPEL, along with a lot of other quality-of-life stuff for log-shipping, monitoring, etc.
Why are they even URLs? The only reasonable suggestion I could find is that it was part of an abandoned or poorly adopted idea to also host the schema at that URL.
Probably the same sort of thought process that led to the convention of Java packages being named com.example.whatever. It identifies the creator and gives you some structure to create a unique identifier.
Lot of half-baked ideas floating around in the early years of the commercial internet, but the Java thing held up better.
Snarky answer: because they're the W3C and are high on their own supply ;-)
But it does allow more flexibility. If you don't want to be tied to a domain name, you can use a URN with a UUID like urn:uuid:4603d9d3-e895-4000-9077-0ab0f2776e1e
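A minimal sketch of that in Python, using only the standard library. The element names `config` and `item` are made up for illustration; the point is only that a `urn:uuid:` string works as a namespace name without any domain involved.

```python
import uuid
import xml.etree.ElementTree as ET


def make_namespace() -> str:
    # uuid4() is random; the .urn property renders it as "urn:uuid:<uuid>".
    return uuid.uuid4().urn


def build_doc(ns: str) -> str:
    # ElementTree spells a namespaced tag as "{namespace}localname".
    # "config" and "item" are hypothetical element names.
    root = ET.Element(f"{{{ns}}}config")
    child = ET.SubElement(root, f"{{{ns}}}item")
    child.text = "value"
    return ET.tostring(root, encoding="unicode")


ns = make_namespace()
xml_text = build_doc(ns)
parsed = ET.fromstring(xml_text)
print(ns)
print(xml_text)
```

In practice you would generate the UUID once and hard-code it, since the namespace only works as an identifier if every document uses the same string.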
In addition to the uniqueness others have mentioned, where do you find the canonical definition of the schema if it doesn’t have a url? So they just cut out the mapping and made them one.
This is nonsensical. Where did you get the XSD from? You could get a proper spec from the same place. Contrast this with JSON, which makes zero attempt. People (generally) prefer that. If you were dead-set on embedding the documentation URL in the namespace, you could at least remove the tripping hazard and use a new "protocol" like `xsd-spec://host/specs/foo/v1.2` or something.
I'm sure some braindead software out there attempts to retrieve namespace URIs, but it would surely be a drop in the bucket compared to traffic for schemas/DTDs (which are intended to be retrieved).
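To illustrate the point that a namespace URI is only a name: a rough sketch with Python's standard `xml.etree.ElementTree`, which compares namespace strings and does not retrieve them. The URI below is invented and uses the reserved `.invalid` TLD, so it could not be fetched even if something tried.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; ".invalid" never resolves in DNS.
NS_URI = "http://ns.example.invalid/foo/v1.2"

DOC = f"""<root xmlns:a="{NS_URI}">
  <a:item>hello</a:item>
</root>"""

# Parsing succeeds without any network access.
root = ET.fromstring(DOC)

# Lookup matches on the exact namespace string.
item = root.find("a:item", {"a": NS_URI})

# A trailing slash makes it a different namespace, so nothing matches.
other = root.find("a:item", {"a": NS_URI + "/"})

print(item.text)
print(other)
```

The second lookup returning nothing is the "tripping hazard" in practice: two strings that would be the same web address are still two different namespaces.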
Though there are probably things AWS could do anyway, this could well be caused by a large customer using a custom AMI, and not because of anything Amazon did or didn't do.
Surely AWS knows how to set up a mirror. It's just a mistake, they'll surely correct it. Also simply blogging about it (which gets amplified by Phoronix, then by HN) is a better strategy for getting their attention than blocking.
> ADDENDUM (2024-05-30T01:08+00:00): Multiple Amazon engineers reached out after I posted this and there is work on identifying what is causing this issue. Thank you to all the people who are burning the midnight oil on this.
You worked in ops, but not in a context where your employer could get shamed by IBM in public on the pages of Phoronix and HN. Call it "cloud scale" ops, I guess.
I've made a point of calling out Digital Ocean in Linux mirroring talks as the gold standard for being a good citizen: they run their own internal mirrors, which are FAST, making it a value-add feature for them as well.
AWS is just being ignorant. If I were in charge of Fedora infrastructure I'd block them and send them instructions on how to set up a mirror.