We never get a good sense of how much time was actually saved with that change n...

   We never get a good sense of how much time was actually saved with that change not least because the original function calls "initialise weights" inside every loop, the new function does not.

Good point. Further to that, I would have assumed a library like pandas has fairly well-optimized group and sort operations, so it would not have occurred to me that pandas was the bottleneck. The author does clarify in his footnote, though, that the cost comes less from the processing itself than from creating the more complex pandas objects, which can indeed be a bottleneck:

   [1] Please don't get me wrong. Pandas is pretty fast for a typical dataset but it's not the processing that slows down pandas in my case. It's the creation of Pandas objects itself which can be slow. If your service needs to respond in less than 500ms, then you will feel the effect of each line of Pandas code.
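
On the measurement question, one way to tell the two effects apart would be to time the original with only the weight initialisation hoisted out of the loop, and then compare that against the fully rewritten version. Below is a rough sketch of the first half of that comparison. Everything in it is made up for illustration: `initialise_weights`, `score_original`, `score_hoisted` and the toy data are hypothetical stand-ins, not the article's code, and the timings will vary by machine.

```python
import timeit


def initialise_weights(n):
    # Hypothetical stand-in for the article's weight setup.
    return [1.0 / (i + 1) for i in range(n)]


def score_original(rows, n_weights=50):
    # Mirrors the criticised pattern: weights are rebuilt on every iteration.
    total = 0.0
    for row in rows:
        weights = initialise_weights(n_weights)
        total += sum(w * x for w, x in zip(weights, row))
    return total


def score_hoisted(rows, n_weights=50):
    # Same computation, but the weights are built once, outside the loop.
    weights = initialise_weights(n_weights)
    total = 0.0
    for row in rows:
        total += sum(w * x for w, x in zip(weights, row))
    return total


# Toy input: 200 rows of 50 floats each.
rows = [[float(i + j) for j in range(50)] for i in range(200)]

# Both variants must agree before their timings are worth comparing.
assert score_original(rows) == score_hoisted(rows)

# Take the best of several repeats to reduce noise from other processes.
t_original = min(timeit.repeat(lambda: score_original(rows), number=20, repeat=5))
t_hoisted = min(timeit.repeat(lambda: score_hoisted(rows), number=20, repeat=5))

print(f"init inside loop: {t_original:.4f}s")
print(f"init hoisted:     {t_hoisted:.4f}s")
```

Whatever gap remains between the hoisted variant and the rewritten function is the part that can fairly be attributed to dropping pandas.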