Only insofar as there are diminishing returns on increasing memory in general. If you can map all of your application's instructions into a low-latency block of memory, you're going to see massive benefits over repeatedly swapping portions in and out (which is where RAM latencies come into play).
Memory access typically follows a Pareto distribution with a long tail. Each doubling of the cache size captures accesses further out in that tail, which are less frequent, so the speedup is always smaller than the speedup from the previous size increase. The actual effect will vary by application, but if the data doesn't all fit in cache and access patterns follow that long-tail distribution, it's true that increasing the cache size has diminishing returns. That is the case for almost all applications.
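To put rough numbers on that, here's a minimal sketch rather than a measurement of any real workload. It assumes item popularity falls off as a power law with exponent 1.5 and that the cache always holds the most popular items. Both are illustrative assumptions, as are the names `hit_rates`, `num_items` and `exponent`. With a flatter distribution (exponent close to 1) the gain per doubling levels off instead of shrinking.

```python
from itertools import accumulate


def hit_rates(num_items, exponent, cache_sizes):
    """Hit rate of an idealized cache that always holds the most popular items.

    Item popularity is assumed proportional to rank ** -exponent (a power law).
    """
    weights = [rank ** -exponent for rank in range(1, num_items + 1)]
    cumulative = list(accumulate(weights))  # cumulative[k-1] = weight of top k items
    total = cumulative[-1]
    return [cumulative[size - 1] / total for size in cache_sizes]


cache_sizes = [2 ** j for j in range(11)]  # 1, 2, 4, ..., 1024 items
rates = hit_rates(num_items=100_000, exponent=1.5, cache_sizes=cache_sizes)

# Extra hit rate bought by each successive doubling of the cache.
gains = [after - before for before, after in zip(rates, rates[1:])]

for size, rate, gain in zip(cache_sizes[1:], rates[1:], gains):
    print(f"{size:5d} items: hit rate {rate:.3f} (+{gain:.3f} from doubling)")
```

Under those assumptions every doubling still raises the hit rate, but by less than the doubling before it.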
Sure, but that applies to main memory as well. Ergo, a larger cache will still offer a corresponding benefit over main memory; the returns only diminish relative to the cache itself.