Hacker News

> Cell-by-cell notebooks like Jupyter are great for prototyping, explaining and exploration, but their dependence on a Python server (with often undocumented dependencies) limits their ability to be shared and remixed. Now that browsers support dynamic imports, it has become possible to create a similar workflow entirely in the browser.

Pretty cool. As someone involved in building a machine learning platform which also allows users to leverage Jupyter notebooks, I feel your pain. We had to add many features to make working on non-trivial projects more feasible. For example, near-real-time collaboration so people could pair-program on one notebook.

One other problem you mention is the kernel in the backend. It shows up when a user launches a long-running notebook and then closes their tab or browser: the computation keeps running, but the results never make it back to the notebook. That means users have to write extra code in the notebook to persist side effects and artifacts themselves.

We've added functionality to run notebooks asynchronously to address that problem. It also helps with multiple users competing for limited resources (tragedy of the commons) and with kernels dying. Now users can run their notebooks asynchronously: they're scheduled, run, and saved seamlessly.

Jupyter is a useful tool, but these problems aren't trivial to solve. For example, for near-real-time collaboration, the Jupyter team is revamping their API with Lumino (https://github.com/jupyterlab/lumino), formerly PhosphorJS, whose development stopped when its main contributor retired.




Hey! I use Jupyter notebooks on a daily basis, and I would be really interested in the two features you mentioned: real-time pair programming (à la Google Docs) and resuming running notebooks whose frontend was closed.

Are you building a product that you plan to release? Or are these contributions that are available as open source components?

In any case, I'd like to know more, or to have some information on relevant topics that might be useful to understand and find workarounds for these issues!


Hi...

>and resuming running notebooks whose frontend was closed.

The platform schedules the notebooks, runs them, and saves the results with the output[0]. You can still check the progress, status, and position in the queue of the notebook if you wish. You still get the results once it's done, even if you closed the browser.
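A minimal sketch of that schedule/run/save pattern, for illustration only and not the platform's actual implementation. It assumes a single background worker, and a plain Python callable stands in for executing a notebook. `NotebookQueue`, `submit`, `status`, and `wait` are hypothetical names.

```python
import threading
import uuid


class NotebookQueue:
    """Toy scheduler: jobs run on a background worker, results are kept by job id."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = []  # job ids waiting to run, in FIFO order
        self._jobs = {}     # job id -> run callable, status, result, done event
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, run):
        """Queue a callable (a stand-in for a notebook) and return its job id."""
        job_id = uuid.uuid4().hex
        with self._cond:
            self._jobs[job_id] = {
                "run": run,
                "status": "queued",
                "result": None,
                "done": threading.Event(),
            }
            self._pending.append(job_id)
            self._cond.notify()
        return job_id

    def status(self, job_id):
        """Report status, queue position (None once started), and saved result."""
        with self._cond:
            job = self._jobs[job_id]
            position = self._pending.index(job_id) if job_id in self._pending else None
            return {"status": job["status"], "position": position, "result": job["result"]}

    def wait(self, job_id, timeout=None):
        """Block until the job finishes (or the timeout expires), then report status."""
        self._jobs[job_id]["done"].wait(timeout)
        return self.status(job_id)

    def _worker(self):
        while True:
            with self._cond:
                while not self._pending:
                    self._cond.wait()
                job_id = self._pending.pop(0)
                job = self._jobs[job_id]
                job["status"] = "running"
            # Run outside the lock so status checks are not blocked by long jobs.
            try:
                result, status = job["run"](), "done"
            except Exception as exc:
                result, status = repr(exc), "failed"
            with self._cond:
                job["result"], job["status"] = result, status
            job["done"].set()


if __name__ == "__main__":
    q = NotebookQueue()
    job = q.submit(lambda: sum(range(10)))
    print(q.wait(job, timeout=5))
```

The point of the design is that the result lives with the job record rather than with whoever submitted it, so a client that disconnects can come back later and call `status` with the same job id.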

>Are you building a product that you plan to release? Or are these contributions that are available as open source components?

Well... We're a machine learning shop, present in Algiers, Algeria and Paris, France, that's been building custom turn-key products for large organizations for quite some time. As you know, non-trivial projects with real stakes are a bit more "challenging" than portfolio projects: tracking the rationale of projects, tracking metrics/params/models, sharing access to data, deploying and managing models, collaborating, setting up an environment and keeping things from breaking, or needing a beefy machine.

We started building the platform to allow our teammates to work remotely on projects involving machine learning, as commuting is hard and having contributors in different countries can be challenging. We also wanted to simplify access to a larger talent pool from other countries, which means we had to have a way to work together. We also wanted self-service: our data scientists had to solicit the help of other engineers to either use some language-specific feature or to deploy their notebooks. We wanted to allow them to do so themselves in one click so they don't wait, and other engineers can work on other things. We wanted the results to get to the domain expert as soon as possible, and to allow that domain expert to train models themselves with no coding experience, by exposing parameters that are relevant to their sector, and to deploy a model too. In many cases, a domain expert would chime in to tell us that some variable that was dismissed was actually critical at some point in a decision-making process. So we wanted that, too.

We also wanted developers to interact with the models with HTTP requests, period. We kind of suffered from having too many simultaneous projects with different stacks.

We needed a way to execute projects efficiently and effectively, but we didn't like other products. We also wanted extensibility. As I said, we built whole products, not just the models, so we needed a way to plug functionality in.

We're actively developing this.

>In any case, I'd like to know more, or to have some information on relevant topics that might be useful to understand and find workarounds for these issues!

Sure, shoot us an email at the site below or using the info in my bio.

[0]: https://iko.ai/docs/notebook/#long-running-notebooks



