I think Digitalocean has their own package mirror for their image. AWS is just b...

jcrawfordor · 2024-05-29T20:49:20.000000Z

The fact that this is EPEL strongly suggests that it was set up by an AWS user, not by Amazon themselves. EPEL is not used by default in any common AWS AMIs. Perhaps it is an Amazon Linux user who enabled EPEL via Amazon's package, but it's not supported in the most recent version of AL so Amazon seems to have addressed that issue anyway.

22c · 2024-05-30T00:17:57.000000Z

More likely this is a change in a popular AMI or container image that is being used by a lot of different users, e.g. a startup script that unnecessarily pulls or syncs from EPEL.

I suspect there are only a handful of vendors with popular enough AMIs or container images that could account for this, though.

Havoc · 2024-05-29T21:21:07.000000Z

>that it was set up by an AWS user,

A user with "five million additional systems" on AWS?

Reason077 · 2024-05-29T22:03:32.000000Z

> “A user with "five million additional systems" on AWS?”

Someone is going to be in for a big surprise when they get their AWS bill this month and realise there’s an infinite-loop bug in their instance spawning script.

facialwipe · 2024-05-29T22:30:46.000000Z

It’s clearly either a large contract that would have been negotiated before any instances were spun up, or Amazon themselves.

vesrah · 2024-05-29T21:33:27.000000Z

A long time ago I knew a guy that uploaded a Counter-Strike patch to his ISP personal hosting and ended up on the official mirror list. Ended up taking down the ISP iirc.

stonogo · 2024-05-29T22:20:54.000000Z

I don't think it's one user; I think it's a ton of them. Want to use Let's Encrypt in your OpenShift-on-AWS deployment? certbot's in EPEL, along with a lot of other quality-of-life stuff for log-shipping, monitoring, etc.

mac-chaffee · 2024-05-30T00:23:10.000000Z

Each with a unique public IPv4 address too!

briffle · 2024-05-29T22:27:35.000000Z

That is some massive AI training!

hinkley · 2024-05-29T20:43:54.000000Z

Escalating delays can help with this. Get it to be slow enough that people notice.

XML schemas have had a similar history, tanking w3c.org servers.
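
A minimal sketch of the escalating-delay idea, assuming the server already counts requests per client in some time window. The function name, the free allowance of 10 requests, and the 0.5 s / 60 s bounds are all made up for illustration, not taken from any real mirror setup:

```python
def escalating_delay(requests_in_window: int,
                     free_requests: int = 10,
                     base_delay: float = 0.5,
                     max_delay: float = 60.0) -> float:
    """Return how many seconds to stall a response for one client.

    Hypothetical policy: the first `free_requests` in the window are
    served immediately; each request after that doubles the delay,
    up to `max_delay`.
    """
    excess = requests_in_window - free_requests
    if excess <= 0:
        return 0.0
    # Cap the exponent so a client with millions of requests does not
    # overflow the float multiplication below.
    exponent = min(excess - 1, 30)
    return min(base_delay * (2 ** exponent), max_delay)


if __name__ == "__main__":
    for count in (5, 11, 12, 13, 5_000_000):
        print(count, escalating_delay(count))
```

The server would sleep for the returned time before answering, so a well-behaved client never notices and a runaway fleet slows to a crawl instead of getting a hard block.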

recursive · 2024-05-29T21:40:05.000000Z

They totally deserved it for making namespaces that "just happen to be" URLs. XML is insane.

rodgerd · 2024-05-29T22:16:15.000000Z

Ah I remember the good old days of strict XML parsers that would fail if they didn't have Internet access to pull the schema in.

hinkley · 2024-05-30T16:19:17.000000Z

Which you didn’t figure out for a while and so your CI/CD pipeline and dev code/build/test cycles hammer their servers for months.

Then you prefetch to fix that problem, and now some slow calls you hadn’t gotten to the bottom of suddenly aren’t.

kevindamm · 2024-05-29T21:49:04.000000Z

well, but they're URIs. See, the difference is right there. An identifier not a location. Nobody should ever confuse the two!

/s of course

IshKebab · 2024-05-29T22:09:35.000000Z

Why even are they URLs? The only reasonable suggestion I could find is that it was part of an abandoned or poorly adopted idea to also host the schema at that URL.

TillE · 2024-05-29T22:30:41.000000Z

Probably the same sort of thought process that led to the convention of Java packages being named com.example.whatever. It identifies the creator and gives you some structure to create a unique identifier.

Lot of half-baked ideas floating around in the early years of the commercial internet, but the Java thing held up better.

agwa · 2024-05-29T22:40:35.000000Z

It's a way to ensure global uniqueness. In the end they're just compared byte-for-byte as strings.
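
A quick illustration with Python's standard-library `xml.etree.ElementTree` (the example.com URIs are placeholders). The two namespace URIs differ only in the case of the host, which a web server would treat as the same location, but the parser treats them as two unrelated namespaces and never fetches either one:

```python
import xml.etree.ElementTree as ET

# Same "URL" as far as HTTP is concerned, different strings as far
# as XML namespaces are concerned.
doc_a = '<root xmlns="http://example.com/ns"/>'
doc_b = '<root xmlns="http://EXAMPLE.com/ns"/>'

tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

# ElementTree stores the namespace verbatim as "{uri}localname".
print(tag_a)
print(tag_b)
print(tag_a == tag_b)
```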

IshKebab · 2024-05-30T17:26:06.000000Z

Precisely, so why are they URLs and not something like Java's packages? Or just a URL without the `http://`?

agwa · 2024-05-30T18:37:35.000000Z

Snarky answer: because they're the W3C and are high on their own supply ;-)

But it does allow more flexibility. If you don't want to be tied to a domain name, you can use a URN with a UUID like urn:uuid:4603d9d3-e895-4000-9077-0ab0f2776e1e
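
For what it's worth, a small sketch of that in Python, using only the standard library. The element name `config` is invented for the example; the UUID is the one above:

```python
import uuid
import xml.etree.ElementTree as ET

# Build the URN from the UUID; uuid.UUID.urn prepends "urn:uuid:".
ns = uuid.UUID("4603d9d3-e895-4000-9077-0ab0f2776e1e").urn

# Use the URN as the namespace of a (hypothetical) root element.
root = ET.Element(f"{{{ns}}}config")
xml_text = ET.tostring(root, encoding="unicode")

# Round-trip: the parser hands back the same namespace string,
# with no domain name or network lookup involved.
parsed = ET.fromstring(xml_text)
print(ns)
print(parsed.tag)
```

For a fresh namespace, `uuid.uuid4().urn` gives a new random one in the same form.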

IshKebab · 2024-05-30T20:32:42.000000Z

You could also just do `com.mydomain.uid.4603d....` if you wanted. But yeah I think your snarky answer is probably more than a little true!

hinkley · 2024-05-30T16:20:33.000000Z

In addition to the uniqueness others have mentioned, where do you find the canonical definition of the schema if it doesn’t have a URL? So they just cut out the mapping and made the identifier the URL.

recursive · 2024-05-30T17:25:00.000000Z

This is nonsensical. Where did you get the XSD from? You could get a proper spec from the same place. Contrast this with JSON, which makes zero attempt. People (generally) prefer that. If you were dead-set on embedding the documentation URL in the namespace, you could at least remove the tripping hazard and use a new "protocol" like `xsd-spec://host/specs/foo/v1.2` or something.

agwa · 2024-05-29T22:36:22.000000Z

Schemas != namespaces.

I'm sure some braindead software out there attempts to retrieve namespace URIs, but it would surely be a drop in the bucket compared to traffic for schemas/DTDs (which are intended to be retrieved).

kbolino · 2024-05-29T20:42:25.000000Z

Though there are probably things AWS could do anyway, this could well be caused by a large customer using a custom AMI, and not because of anything Amazon did or didn't do.

ajross · 2024-05-29T21:39:59.000000Z

Surely AWS knows how to set up a mirror. It's just a mistake, they'll surely correct it. Also simply blogging about it (which gets amplified by Phoronix, then by HN) is a better strategy for getting their attention than blocking.

INTPenis · 2024-05-29T22:18:00.000000Z

I worked in ops for 20+ years.

If someone blocks you it becomes an incident, a post mortem and you learn your lesson.

If someone blogs about it, or e-mails you, it gets added to a todo list and might get fixed in a few weeks by a disinterested intern.

ajross · 2024-05-30T22:38:44.000000Z

Per the original blog:

> ADDENDUM (2024-05-30T01:08+00:00): Multiple Amazon engineers reached out after I posted this and there is work on identifying what is causing this issue. Thank you to all the people who are burning the midnight oil on this.

You worked in ops, but not in a context where your employer could get shamed by IBM in public on the pages of Phoronix and HN. Call it "cloud scale" ops, I guess.

papichulo2023 · 2024-05-30T00:09:43.000000Z

Maybe they have a commercial relationship and don't want to harm it because of a bug?

pquki4 · 2024-05-29T23:41:13.000000Z

How do you know "it's just a mistake"?

phirephly · 2024-05-30T16:22:25.000000Z

I've made a point of calling out DigitalOcean in Linux mirroring talks as the gold standard for being a good citizen: they run their own internal mirrors, which are FAST, making it a value-add feature for them as well.

skywhopper · 2024-05-29T22:45:48.000000Z

I doubt Amazon builds the Fedora images. So if they’re pointed to the wrong place, that’s not AWS’s fault.

blitzar · 2024-05-29T21:33:26.000000Z

Nahh, that seems needlessly cruel. They should continue to serve them at 100k speed.