
> groupby by two columns is not very common, so DataFrames.jl decided to leave it out from precompilation. For this reason when you run groupby(flights, [:origin, :dest]) native code for such a scenario is not cached. This is indeed a hard design decision for package maintainers. You could add more and more precompilation statements to improve the coverage of cached native code, but it also costs as it would impact: package installation time and package load time.

How feasible is it to let users provide their own precompilation code? If I know that two-column groupbys are important to me (or some other operation that takes tens of seconds), it would be nice to be able to pay the one-time precompilation cost for them and get much better TTFX later.




One approach you could try now is creating StartUp packages: https://julialang.github.io/PrecompileTools.jl/stable/#Tutor...
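
For anyone wondering what that looks like in practice, here is a rough sketch of a Startup package module for the two-column groupby case from the quote. It assumes Julia 1.9 or later, and that the package lives in a local `Startup` package that has PrecompileTools and DataFrames as dependencies. The toy table and its `delay` column are made up for illustration; only `flights`, `:origin`, and `:dest` come from the quoted example.

```julia
module Startup

using PrecompileTools
using DataFrames

@setup_workload begin
    # Hypothetical stand-in data: a tiny table with the same column names
    # as the quoted example. The `delay` column is an assumption.
    flights = DataFrame(
        origin = ["JFK", "LGA", "JFK"],
        dest   = ["LAX", "ORD", "LAX"],
        delay  = [5, 10, 15],
    )

    @compile_workload begin
        # Calls made here are compiled while the package precompiles,
        # so their native code can be cached for later sessions.
        gdf = groupby(flights, [:origin, :dest])
        combine(gdf, :delay => sum)
    end
end

end # module
```

You would then load it with `using Startup`, for example from `~/.julia/config/startup.jl`. One caveat: compiled code is specific to the argument types, so the toy columns should probably have the same element types as your real data (String columns here), otherwise part of the work may still be compiled on first use.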


Thank you. Tim Holy linked to the same in a previous thread (https://news.ycombinator.com/item?id=35885133), but I couldn't understand the context in which I'd want to use it when I looked at it then. Now I understand the idea of Startup packages much better.


I really wish this sort of thing was automatic, transparent to the user, and built into the Julia runtime.


There's been a bit of work on this. The hard part is that figuring out which code a user cares about isn't trivial.



