
Yeah, having worked on alternative notebooks before, I'd say one of the big implicit features of Jupyter notebooks is that long-running cells (downloading data, training models) don't get spuriously re-run.

Having an excellent cache might reduce spurious re-running of cells, but I wonder if it would be sufficient.




We've thought briefly about cell-level caching, or at least it's a topic that's come up a couple of times now with our users. Perhaps we could add it as a configuration option, at the granularity of individual cells. Our users have found that `functools.cache` goes a long way.
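
A minimal sketch of that pattern, assuming the expensive step can be pulled into a function with hashable arguments (the function name and URL below are made up for illustration):

    import functools
    import time

    @functools.cache
    def load_dataset(url: str) -> bytes:
        # stand-in for a slow download; runs once per distinct url, so
        # re-executing this cell (or its descendants) reuses the result
        time.sleep(5)
        return b"fake dataset contents"

    data = load_dataset("https://example.com/data.csv")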

We also let users disable cells (and their descendants), which can be useful if you're iterating on a cell that's close to the root of your notebook DAG: https://docs.marimo.io/guides/reactivity.html#disabling-cell...


ipyflow has a %%memoize magic which looks quite similar to %%xetmemo (just without specifying the inputs / outputs explicitly): https://github.com/ipyflow/ipyflow/?tab=readme-ov-file#memoi...

Would be cool if we could come up with a standard that works across notebooks / libraries!
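
For a rough picture of the cell-magic style, a memoized cell might look like this (the body is a made-up illustration; the exact behaviour is whatever the ipyflow README documents):

    %%memoize
    # ipyflow caches the cell's outputs and skips re-execution when the
    # inputs it infers for the cell haven't changed
    import time
    time.sleep(10)  # stand-in for an expensive computation
    result = 42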


Function-level caching is the best match for how I'd use it. Often the reason for bothering to cache is that the underlying process is slow, so some kind of future-with-progress wrapper could also be interesting. One example of how that could be used would be wrapping a file transfer, so the cell can show progress and then, when the result is ready, unwrap the value for use in other cells. Another example would be training in PyTorch: yield progress or stats during the run, then the final run data when complete.
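
A possible sketch of that wrapper, assuming a background thread plus a polled progress attribute (ProgressFuture and slow_transfer are hypothetical names, not an existing API):

    import concurrent.futures
    import time

    class ProgressFuture:
        """Run fn in a background thread; expose .progress and .future."""

        def __init__(self, fn, *args, **kwargs):
            self.progress = 0.0
            self._executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
            self.future = self._executor.submit(fn, self._report, *args, **kwargs)

        def _report(self, fraction):
            # called by the task to publish progress; a cell could poll this
            self.progress = fraction

    def slow_transfer(report, chunks=10):
        for i in range(chunks):
            time.sleep(0.5)  # stand-in for copying one chunk
            report((i + 1) / chunks)
        return "transfer complete"

    task = ProgressFuture(slow_transfer)
    # one cell could render task.progress while the transfer runs; another
    # can block on task.future.result() to unwrap the final value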



