These papers improve time complexity, not actual running time. I'm not sure it ever makes sense to do comparative benchmarking, since you can't test on arbitrarily large inputs (which is what's important when looking at algorithm complexity). For smaller inputs (say, N between 1,000 and 1,000,000), constant factors affect running time significantly. To compensate for them, you'd have to run a benchmark on inputs of size 10^10,000,000.
EDIT: For example, let's say the current best algorithm for some problem takes O(N^2) time, with a small constant (say 100 * N^2 operations). Then someone comes up with an O(N^1.9) algorithm, but with a larger constant (say 1,000,000 * N^1.9 operations). The newer algorithm will in practice be slower for any reasonably sized input you can throw at it, even though it's a major theoretical breakthrough.
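To make those made-up numbers concrete, here's a quick back-of-the-envelope sketch (in Python, using the hypothetical constants from the example above, not figures from any real paper) of where the asymptotically better algorithm would actually start winning:

    # Hypothetical constants from the example above, not from a real algorithm.
    def crossover(c_old, e_old, c_new, e_new):
        # Smallest N where c_new * N^e_new beats c_old * N^e_old.
        return (c_new / c_old) ** (1.0 / (e_old - e_new))

    print(crossover(100, 2.0, 1_000_000, 1.9))  # ~1e40

With these constants the crossover point is around N = 10^40, far beyond anything you could ever benchmark, which is exactly the point.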
As you are talking about reasonably sized inputs, your example constants should also be reasonable. A 10,000x difference in the constant is not reasonable. You wouldn't even get that from going from all data in the L1 cache to random memory accesses with 10 times more instructions.
And "these papers" often do care about running time and will say whether it is possible to implement the algorithm efficiently, if you're lucky they'll even have done an implementation and you can see at what dataset sizes they are faster in practice.
And even if it were slower in practice, research like this is very useful, because it gives other researchers inspiration. Ever since someone thought of looking at these problems as electrical flows in 2010, there have been ever-faster approximations for them, while others will find ways to implement the algorithm efficiently, in case a naive implementation of the proposed algorithm isn't efficient enough.
Dismissing such research without having found any actual extreme inefficiencies in the proposed algorithm is not an attitude suited to anyone who wants to see progress.
And finally, the actual speed increase in the past decades hasn't been from O(N^2) to O(N^1.9) but from O(VE^2) to O(VE), which obviously does have a real impact; even with an unrealistic 10,000x bigger constant it would still be far faster on any graph with more than 10,000 edges. And graphs with billions of edges aren't unusual anymore.
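As a rough sanity check on that claim (assuming, for the sake of argument, the same unrealistic 10,000x constant penalty on the O(VE) bound; the graph shapes below are arbitrary):

    # Rough operation counts for the two max-flow bounds discussed above.
    def ops_old(V, E):            # O(V * E^2)
        return V * E * E

    def ops_new(V, E):            # O(V * E), with an unrealistic 10,000x constant
        return 10_000 * V * E

    for E in (1_000, 10_000, 100_000, 1_000_000_000):
        V = max(2, E // 10)       # arbitrary sparse-ish graph shape for illustration
        print(E, ops_old(V, E) / ops_new(V, E))

The ratio works out to E / 10,000 regardless of V, so the asymptotically better bound wins as soon as the graph has more than 10,000 edges, even with that constant.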
> And even if it were slower in practice, research like this is very useful, because it gives other researchers inspiration. Ever since someone thought of looking at these problems as electrical flows in 2010, there have been ever-faster approximations for them, while others will find ways to implement the algorithm efficiently, in case a naive implementation of the proposed algorithm isn't efficient enough.
> Dismissing such research without having found any actual extreme inefficiencies in the proposed algorithm is not an attitude suited to anyone who wants to see progress.
I never dismissed this research or said it's not useful; on the contrary, it's very interesting. I was just arguing against the usefulness of benchmarking results for such algorithms, when the main improvement is the lower complexity.
EDIT:
> And finally, the actual speed increase in the past decades hasn't been from O(N^2) to O(N^1.9) but from O(VE^2) to O(VE), which obviously does have a real impact; even with an unrealistic 10,000x bigger constant it would still be far faster on any graph with more than 10,000 edges. And graphs with billions of edges aren't unusual anymore.
One concrete example I had in mind is Strassen's algorithm, which reduces matrix multiplication from O(N^3) to O(N^(log2 7)) ≈ O(N^2.807). It doesn't seem to be worth using Strassen over the simple algorithm for relatively small matrices.
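For illustration, here's a minimal sketch of Strassen's recursion (assuming square matrices whose size is a power of two, and a made-up CUTOFF value; the real break-even point is machine-dependent and tuned per implementation). The interesting part is precisely the fallback: below the cutoff, the plain O(N^3) product is faster because of constant factors.

    import numpy as np

    CUTOFF = 128  # illustrative threshold, not a measured break-even point

    def strassen(A, B):
        # Strassen multiplication for square matrices with power-of-two size.
        n = A.shape[0]
        if n <= CUTOFF:
            return A @ B  # constant factors make the plain product faster here
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A11 + A22, B11 + B22)
        M2 = strassen(A21 + A22, B11)
        M3 = strassen(A11, B12 - B22)
        M4 = strassen(A22, B21 - B11)
        M5 = strassen(A11 + A12, B22)
        M6 = strassen(A21 - A11, B11 + B12)
        M7 = strassen(A12 - A22, B21 + B22)
        C = np.empty_like(A)
        C[:h, :h] = M1 + M4 - M5 + M7
        C[:h, h:] = M3 + M5
        C[h:, :h] = M2 + M4
        C[h:, h:] = M1 - M2 + M3 + M6
        return C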
There are other examples of algorithms and data structures that offer O(N log log N) time where the previous algorithm needed O(N log N) to solve the same problem. You need a very large data set to justify using the former, especially if the latter is much simpler to implement.
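The reason is that the asymptotic headroom between those two bounds grows extremely slowly; a quick calculation (ignoring constant factors entirely) shows how little there is to win even at huge input sizes:

    import math

    # Per-element ratio of the two bounds: log N / log log N.
    for n in (10**3, 10**6, 10**9, 10**12):
        ratio = math.log2(n) / math.log2(math.log2(n))
        print(f"N = {n:.0e}: theoretical speedup factor ~ {ratio:.1f}")

Even at N = 10^12 the theoretical gain is only around 7x, which a constant-factor difference of the kind discussed above can easily eat.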
> As you are talking about reasonably sized inputs, your example constants should also be reasonable. A 10,000x difference in the constant is not reasonable. You wouldn't even get that from going from all data in the L1 cache to random memory accesses with 10 times more instructions.
10000 was just an example; I guess it was unrealistic. In practice, I think a 100x slowdown from an algorithm that has bad cache behavior makes some sense. [1] says that an L3 cache hit has a latency of 1-300 cycles, as opposed to an L1 hit that takes only 4 cycles. That's almost a 100x difference, so in theory a cache-friendly algorithm can be that much faster (and that's just the difference between L3 and L1; main memory takes even longer to access).
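A crude way to see this effect (a rough sketch, not a rigorous benchmark; the array size and the exact gap are machine-dependent, and part of the difference is the cost of the gather itself):

    import time
    import numpy as np

    a = np.random.rand(20_000_000)
    idx = np.random.permutation(a.size)   # random visiting order defeats caching/prefetching

    t0 = time.perf_counter(); a.sum();      t1 = time.perf_counter()
    t2 = time.perf_counter(); a[idx].sum(); t3 = time.perf_counter()
    print(f"sequential: {t1 - t0:.3f}s, random order: {t3 - t2:.3f}s")

The gap you'll see here is nowhere near 100x, but it shows the direction: the same amount of work done in a cache-unfriendly order is noticeably slower.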
". I'm not sure it ever makes sense to do comparative benchmarking, since you can't test on arbitrarily large inputs (which is what's important when looking at algorithm complexity)"
Of course it does, if you actually want the algorithm to be used in the real world.
It will get used in the real world in cases where the size of the input data justifies it. The main selling point of a better algorithm (and the main reason users would use it) is usually improved time complexity, not actual running time. Users pick it when the data is large enough to warrant it, after possibly running their own benchmarks.
If you can't even show cases where the size of the input data gives you a better running time, nobody will ever use it.
"The main selling point of a better algorithm (and the main reason users would use it) is usually improved time complexity, not actual running time. "
I'm not sure where you get this. I have literally never seen anyone pick an algorithm whose only benefit is "improved time complexity" when its "actual running time" is worse for the majority of real-world cases.
In particular, there are plenty of algorithms with "improved time complexity" but higher constant factors, and they don't get used.
As a concrete example: most computations of dominators in compilers use theoretically worse algorithms (O(N log N) instead of near-linear inverse-Ackermann bounds) that are faster in practice.
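For context, here's a minimal sketch of the even simpler iterative data-flow formulation of dominators (this is not the specific algorithm any particular compiler uses; it just illustrates why the asymptotically worse but simpler approaches stay attractive for compiler-sized control-flow graphs):

    # cfg maps each node to its list of successors; dom[n] is the set of nodes dominating n.
    def dominators(cfg, entry):
        nodes = set(cfg)
        for succs in cfg.values():
            nodes.update(succs)
        preds = {n: set() for n in nodes}
        for u, succs in cfg.items():
            for v in succs:
                preds[v].add(u)

        dom = {n: set(nodes) for n in nodes}   # start with "everything dominates n"
        dom[entry] = {entry}

        changed = True
        while changed:
            changed = False
            for n in nodes - {entry}:
                incoming = [dom[p] for p in preds[n]]
                new = {n} | (set.intersection(*incoming) if incoming else set())
                if new != dom[n]:
                    dom[n] = new
                    changed = True
        return dom

    # Diamond-shaped CFG: 'a' dominates every node, 'd' is dominated by 'a' and itself.
    print(dominators({'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}, 'a'))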
Also, you can't possibly know when the data is large enough to warrant it if nobody has taken the time to benchmark it.