Not the same poster, but the first "D" in "DDoS" is why rate limiting doesn't work - attackers these days usually have a _huge_ pool (tens of thousands) of residential IPv4 addresses to work with.
I work on a "pretty large" site (it was in the Alexa top 10k, back when that was a thing), and we see about 1,500 requests per second. That's well over 10k concurrent users.
Adding 10k requests per second would almost certainly require a human to respond in some fashion.
A limit of one request per second per IP is low enough that we'd be banning home users who open a couple of tabs at once. However, since universities / hospitals / big corporations typically put an entire facility behind a single egress IP, we actually need the thresholds to be more like 100 requests per second to avoid blocking real users.
10k IP addresses making 100 requests per second (1 million req/s) would overwhelm all but the highest-scale systems.
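To make that concrete, here's a minimal sketch of the kind of per-IP token bucket limiter being described (the numbers and names are illustrative, not any particular product): once the per-IP threshold has to sit around 100 req/s to accommodate shared egress IPs, a distributed attacker gets an enormous aggregate budget.

```python
import time
from collections import defaultdict

PER_IP_LIMIT = 100  # req/s: high enough not to block a NAT'd campus behind one egress IP

class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# one bucket per client IP
buckets = defaultdict(lambda: TokenBucket(rate=PER_IP_LIMIT, burst=PER_IP_LIMIT))

def allow_request(client_ip: str) -> bool:
    return buckets[client_ip].allow()

# 10k attacker IPs each staying just under the limit still push
# 10_000 * 100 = 1_000_000 req/s through this check.
```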
We had rate limiting with Istio/Envoy, but under that much traffic Envoy was using 4-8x its normal memory and crashing.
The attacker was using residential proxies and making about 8 requests before cycling to a new IP.
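That rotation pattern is exactly what defeats per-IP counting. A rough, self-contained simulation (the ~8-requests-per-IP pattern is from the comment above; the pool size and fixed-window limiter are made up for illustration) shows that even a much stricter threshold never fires:

```python
from collections import Counter
from itertools import cycle

PER_IP_LIMIT = 10        # requests per IP per window -- far stricter than 100
REQUESTS_PER_IP = 8      # attacker rotates to a fresh proxy after ~8 requests
ATTACKER_IPS = cycle(f"ip-{i}" for i in range(10_000))  # stand-in for a large residential pool

hits = Counter()
blocked = 0
ip = next(ATTACKER_IPS)
for n in range(10_000):              # one second's worth of attack traffic
    if n and n % REQUESTS_PER_IP == 0:
        ip = next(ATTACKER_IPS)      # cycle to a new IP
    hits[ip] += 1
    if hits[ip] > PER_IP_LIMIT:
        blocked += 1

print(f"blocked {blocked} of 10000 requests")  # -> blocked 0 of 10000 requests
```

Every request arrives from an IP whose counter is nearly empty, so the limiter admits essentially all of the traffic while still having to track state for each new IP it sees.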
Challenges work much better: they use cookies or other metadata to establish that a client is trusted, then let its requests pass. That stops bad clients at the first request, but it takes something more sophisticated than a webserver with basic rate limiting.
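For a sense of what that looks like, here's a minimal sketch of the challenge-then-cookie idea (the secret, TTL, and token format are all hypothetical): the client passes a challenge once, gets back an HMAC-signed token in a cookie, and every later request is verified by signature instead of being counted per IP.

```python
import hashlib
import hmac
import time

SECRET = b"rotate-me-regularly"   # hypothetical server-side signing key
TTL = 3600                        # hypothetical token lifetime in seconds

def issue_token(client_id: str) -> str:
    """Called once, after the client has passed the challenge (JS proof, CAPTCHA, etc.)."""
    expires = str(int(time.time()) + TTL)
    payload = f"{client_id}|{expires}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"     # set this as the cookie value

def is_trusted(token: str, client_id: str) -> bool:
    """Cheap per-request check; anyone without a valid token gets the challenge instead."""
    try:
        cid, expires, sig = token.split("|")
    except ValueError:
        return False
    expected = hmac.new(SECRET, f"{cid}|{expires}".encode(), hashlib.sha256).hexdigest()
    return (hmac.compare_digest(sig, expected)
            and cid == client_id
            and int(expires) > time.time())
```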
Why was that not enough to mitigate the DDoS?