If O(n^2) were the best we could do, then how did Google calculate 100 trillion (10^14) digits of pi [1]? That would take a constant times 10^28 operations. Say the constant is 1, every operation is the cheapest kind, and you run it on an A100 GPU, which does roughly 10^15 Int8 operations per second. You'd still need 10^13 GPU-seconds, which you could get from, say, 10,000 GPUs running for a billion seconds (about 30 years). I don't think Google did that.
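A quick sanity check of that arithmetic, under the same assumptions (constant factor of 1, every operation a cheap Int8 op, ~10^15 Int8 ops/s per A100, 10,000 GPUs):

    # back-of-envelope estimate, not a real benchmark
    digits = 1e14                    # 100 trillion digits
    ops = digits ** 2                # O(n^2) -> ~1e28 operations
    ops_per_sec_per_gpu = 1e15       # assumed A100 Int8 throughput
    gpus = 10_000

    seconds = ops / (ops_per_sec_per_gpu * gpus)   # ~1e9 seconds
    years = seconds / (3600 * 24 * 365)            # ~31.7 years
    print(f"{seconds:.1e} s ~= {years:.0f} years")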
[1] https://cloud.google.com/blog/products/compute/calculating-1...