I agree it's useful although there also exists a more scalable approximate variant of classical k-Means and the new Yinyang k-Means (referenced in the paper), namely Mini-batch k-Means:
Another approach is to transform the dataset into a smaller but representative dataset, called a core-set, and running k-means on that core-set instead which produces a "(1+e
)-approximation for the optimal cluster centers".
Researchers included a link to (C++) source code: http://research.csc.ncsu.edu/nc-caps/yykmeans.tar.bz2