> groupby by two columns is not very common, so DataFrames.jl decided to leave it out from precompilation. For this reason when you run groupby(flights, [:origin, :dest]) native code for such a scenario is not cached. This is indeed a hard design decision for package maintainers. You could add more and more precompilation statements to improve the coverage of cached native code, but it also costs as it would impact: package installation time and package load time.
How feasible is it to let users provide their own precompilation code? If I know that two-column groupbys are important to me (or some other operation that takes tens of seconds), it would be nice to be able to pay the one-time precompilation cost for them in exchange for much better TTFX later.
Thank you. Tim Holy linked to the same in a previous thread (https://news.ycombinator.com/item?id=35885133), but I couldn't understand the context in which I'd want to use it when I looked at it then. Now I understand the idea of Startup packages much better.
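For anyone else who was puzzled by the same thing, here is a minimal sketch of what I understand a Startup package to be, assuming PrecompileTools.jl and its `@setup_workload` / `@compile_workload` macros. The module name `Startup`, the toy `flights` table, and its `delay` column are my own placeholders, not anything DataFrames.jl ships. The package would live in its own directory with a Project.toml listing DataFrames and PrecompileTools as dependencies.

```julia
# src/Startup.jl in a small local package (hypothetical name) that you `dev`
# into your default environment and load with `using Startup`.
module Startup

using PrecompileTools
using DataFrames

@setup_workload begin
    # Setup code runs at precompile time, but is not itself the target of caching.
    # Toy data (an assumption): use the same column types as your real data.
    flights = DataFrame(
        origin = ["JFK", "LGA", "JFK"],
        dest   = ["LAX", "ORD", "LAX"],
        delay  = [5, 10, 15],
    )

    @compile_workload begin
        # Calls made here are compiled and cached while Startup precompiles,
        # including the two-column groupby that DataFrames.jl leaves out.
        gdf = groupby(flights, [:origin, :dest])
        combine(gdf, :delay => sum)
    end
end

end # module
```

Compilation is specific to argument types, so the cached code should only help when the real columns have the same element types as the toy ones. If your CSV reader produces a different string type than `String`, for example, the workload would need to build its columns with that type.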