> JupyterLab adapts easily to multiple workflow needs, letting you move from a Notebook/narrative focus to a script/console one.

I'm not sure I like where that design is going. It's starting to look an awful lot like RStudio and MATLAB, and I moved away from those tools for a reason. My favourite thing about Jupyter is that it is focused on notebooks and narrative. It brought about a revolution of sorts; now we have people blogging and writing papers in Jupyter, and GitHub is full of random useful notebooks.

This design almost seems like a step backwards in that regard.




Context: We use Jupyter heavily (mostly against Spark).

In my experience there is a set of things that "traditional" Jupyter notebooks do really well. Anytime you have a linear flow of steps, the notebook metaphor works really well.

However, if you are doing things approaching traditional development, where you have multiple sources of data, or loops that require debugging, or basically anything that isn't linear in nature it doesn't work so well.

I wouldn't want to lose traditional notebooks, but I'd love to be able to offer people something like this that offers better debugging and some development tool support, rather than jumping to a full desktop IDE.


My experience is in line with yours: debugging loops and functions is a big pain point.

However, I think there's a much better solution to be had here, which is to add more powerful debugging capabilities to Notebook. I think Notebook has potential for new debugging paradigms. Imagine, for example, being able to break anywhere in a cell and get a new 'forked cell' which operates in the context of the code that you just broke into. I think that's the better direction to go, instead of reverting to the very interfaces and paradigms that we moved away from.
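To make the "operate in the context you just broke into" idea a bit more concrete, here is a minimal Python sketch. It is not anything the Notebook provides today: `capture_failure_context` and `running_mean` are hypothetical names, and the approach (copying the local variables of the frame where an exception was raised) is only one assumed way to approximate a forked cell.

```python
import sys


def capture_failure_context(func, *args, **kwargs):
    """Run func; if it raises, return a copy of the locals of the failing frame."""
    try:
        func(*args, **kwargs)
    except Exception:
        tb = sys.exc_info()[2]
        # Walk to the innermost frame, i.e. where the exception was raised.
        while tb.tb_next is not None:
            tb = tb.tb_next
        return dict(tb.tb_frame.f_locals)
    return None  # nothing failed, so there is no context to fork


def running_mean(values):
    # Hypothetical buggy loop: divides by zero on the third element.
    total = 0.0
    for i, v in enumerate(values):
        total += 1.0 / v
    return total / len(values)


context = capture_failure_context(running_mean, [4, 2, 0, 1])

# A "forked cell" could then evaluate expressions against the captured state.
probe = eval("(i, v, total)", {}, context)
```

The captured dictionary is a snapshot, so poking at it cannot resume or alter the original execution; a real forked cell would need kernel support for that.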


This sounds like a promising idea.

As someone who still reaches for a Smalltalk environment when I need to prototype something, ipython notebook is about the closest thing I've ever found to the style of interaction you get with a Smalltalk REPL (aka "workspace"). Smalltalk deliberately blurs the lines between a text editor and REPL, and the debugger takes this a step further - a combined editor and REPL, within a suspended execution context. It could be argued, for example, that coding within the suspended context of a failed unit test, while the code underneath your cursor has this REPL-like liveness, is the non-cargo-cult way of doing TDD.

I'm not trying to claim Smalltalk as the Greatest Thing Ever, but its existence (and its "otherness" - from the point of view of today's conventional style of development) are evidence that there are useful tools to be had, somewhere down a road less travelled.


I learned enough Smalltalk to make prototypes with Seaside. I love the way notebooks work and to make them more Smalltalky would be wonderful not only for programmers but for the field as a whole.


I'm not sure (And by that I don't mean I disagree: I'm genuinely unsure).

To me, the traditional IDEs do work well for debugging and software development.

Notebooks are great for explanatory examples and interactive experiments. I think these are different to the type of software development I do when I use an IDE.

For example, I find notebooks great for rapid iteration of parameters when I'm doing "data science", or indeed most of the feature extraction->modelling->prediction data science pipeline.

What I don't find them good for is developing new algorithms. It isn't clear to me if this is an inherent limitation of the notebook format, or just something where it needs new developments.

(To be clear, I've also used both Zeppelin and Beaker notebooks and don't see any particular advantages. I've also used RStudio, but I don't really know enough R to comment sensibly on that.)


I agree, the notebook could really have much more powerful debugging.

I've actually been working on implementing something to that effect for Python (see http://kitaev.io/xdbg). But it's been really hard to figure out who already has experience with similar workflows, because these tend to be isolated to particular language or devtools communities.

Apparently I need to take a closer look at Smalltalk.


Seeing as you're heavily using Spark, have you had a look at Apache Zeppelin (site: https://zeppelin.apache.org, demo: https://www.youtube.com/watch?v=J6Ei1RMG5Xo)? Seems like a more powerful notebook approach, plus better architecture for using embedded d3.js viz. Also painless templated SQL -> published dashboard looks great for getting data visible early on.


Yes. Not really a fan.

It looks nice, but the installation experience is (was?) terrible (as in - didn't work at all). Note the long gap between the 0.5.6 release (January) and the 0.6.0 release (July)? There were 3 (4?) Spark releases in that time, and that meant that none of the out-of-the-box releases worked for anything except the version of Spark you downloaded with it (and from memory that had problems too).

I got it working and evaluated it in some depth. I'm from a Java background, so I really wanted to like it.

But it turns out that all those features that seem really nice are mostly only nice if you are trying to build applications, not notebooks. Maybe it has improved, and maybe for some usecases it makes sense.


One of the Apache Zeppelin committers here.

Thanks for the honest feedback (although it sounds bitter). We are indeed working hard on improving the situation with the release schedule.

If there is any chance you could elaborate on the installation experience through a JIRA issue [1], that would be very much appreciated.

1. https://issues.apache.org/jira/browse/ZEPPELIN


Jupyter really is just for analysis. I prefer standard IDEs like Jetbrains for application development.


I agree, but would phrase it more like this: Jupyter has succeeded because each of the different major modes of interaction has been decoupled. If you just want to use it in a shell, you don't need to involve the browser at all. If you want a narrative format for sharing, presenting, or converting to slides, you can easily launch that environment.

This feels like a big step backwards for me too. It's effectively like replicating the MATLAB / Octave / PyDev (Eclipse) sort of IDE-with-extras-plus-console that is so, so cripplingly bad, but acting like it's great and new just because it's all in the browser.

If you're a fan of productivity, you shouldn't want to do that kind of stuff in a browser. Heck, I even disable all of the dropdown menus in Emacs because even that is too much of a productivity hindrance / inefficient use of monitor space when I am writing, reading, and thinking about code.

This is one of those things where I feel that it doesn't actually solve practical use cases, doesn't make people more productive, but because there is a big hype engine behind it, it gets adopted and talked about anyway, and eventually becomes the sort of thing that an Office Space kind of manager starts to force you to use ... which really scares me. Stay off my lawn.


Can you elaborate on what you think is bad about the IDE-with-extras-plus-console framework? I do research in quantitative finance and I actually find it really useful.


One of the main things is that it continues to perpetuate primarily mouse-driven interaction with the development environment. Even when tools like this enable Emacs or vi key configurations, the integration just never quite works, and there are environment-specific options you are required to select that come from e.g. drop-down menus, etc. Interacting with UI elements is horrendously unproductive and disruptive to thinking. Putting it in the browser makes this worse, because then you've also got the browser's own key configurations, like tab switching or bookmarking, to worry about.

There is seldom any value in looking directly at code and at a console at the same time. But if you really want that, it's super easy to do it with a window manager like xmonad, or even just arranging shell windows on your desktop so that you can alt-tab between them easily.

You often want to quickly spawn and kill shell tabs, which themselves may or may not be in the same language. For example, I often have a tab in which I'm using IPython, another tab in which it's the working directory so that I can execute things with Python directly, mv/cp files, ls, etc. And then still more tabs in which I have background processes that check for file changes and run my unit tests whenever things change, sometimes a tab for Python 2 and another tab for Python 3. And further, development projects are almost always cross-language, so I tend to also have some tab opened for writing and working with C or Haskell at the same time, and possibly another with a psql shell.
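For what it's worth, the "background process that checks for file changes and runs my unit tests" tab can be approximated in a few lines of standard-library Python. This is only a sketch under assumptions: the function names (`snapshot`, `changed_files`, `watch`) are made up for illustration, it polls modification times rather than using OS file events, and it assumes the tests are discoverable by `python -m unittest discover`.

```python
import os
import subprocess
import sys
import time


def snapshot(root):
    """Map every .py file under root to its last-modified time."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                mtimes[path] = os.stat(path).st_mtime
    return mtimes


def changed_files(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


def watch(root=".", interval=1.0):
    """Poll root forever and rerun the test suite whenever something changes."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if changed_files(previous, current):
            # Assumes tests live under root and follow unittest's naming rules.
            subprocess.run(
                [sys.executable, "-m", "unittest", "discover", "-s", root]
            )
        previous = current
```

Calling `watch(".")` from a spare shell tab keeps it running until interrupted with Ctrl-C.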

Since there are so many necessary tabs just to do even the tiniest things, it means that any and all visual overhead must die. Switching these tabs, even if it is quick, inside of a clunky GUI application like a browser is just too unproductive -- the browser periphery already wastes maybe 5% of the available visual space, and then the overhead for the tab icons, clickable close buttons, etc., wastes another 5% inside of that, and then the width of the tabs is restricted because there's some left panel with directory information or in-memory workspace information (both total wastes of time), and the height is restricted below by some worse-than-plain-shell console; it just makes no sense. It's too much visual clutter and too inefficient to facilitate switching around as much as is necessary.

I think it should be emphasized again that the left-side panel showing either directory structure or the contents of an in-memory working environment is a huge waste of time. If you need to visualize a directory structure, that should just be another buffer, like a code source file, and when you want to view that you just switch to that buffer. There's no benefit to having some of your visual field distracted by it when you're working on other files. And the in-memory information is also generally crappy. It's another thing where if you really need it then it should just be in another buffer and you should quickly go to that buffer and give it all your attention for the short time you need it, then go back. It serves nothing by having it as an ever-present visual distraction. But more than that, relying on inspecting variables that way is a very infantile thing, and I see it a lot with MATLAB programmers. Their form of debugging is not to scientifically inspect and control the execution of the code, and use proper breakpoints and watchpoints to tell them what's going on, but instead to just "run everything" and then go and click to open up a spreadsheet-like view of a matrix variable or something and manually (!) inspect the data. Then they become reliant on this as a crutch and complain when it's no longer there, instead of learning proper ways to write tests and proper debugger usage and let those things automate the problem of zooming in on outlier data, messy data, or bugs.

Anyway, there's plenty more to say, but it's probably long-winded enough.


I generally much prefer GUI tools to command line tools. I've been working as a software developer for 20 years, so it isn't that I don't know the command line.

I find GUIs generally allow for better discoverability, and I much prefer switching browser tabs to switching tmux/screen terminals.

Perhaps it is possible that different people work in different ways? Maybe the things you are thinking of 'wrong' actually just don't work for you, but really do for others?

For example, I know how to use a debugger. I've written my own custom debugging clients, and attached IDEs to live production webservers and debugged code live, so I really do know what I'm doing. But sometimes I prefer outputting data into an Excel spreadsheet because it gives me better context. Sometimes that really is the right tool for the job.

Also, get more, bigger screens. It's made a huge difference for me.


Looks like you're overthinking it a little. It's just a tool.


It's a bad tool. It's important to contemplate what is a good tool and what is a bad tool -- and to avoid using bad tools when you are able.


One of the things about JupyterLab is that it really emphasizes extending and customizing the environment. Think of it as a platform for web-based applications and a reference set of components. We've tried hard to make the underlying platform have good support for power users. We've paid a lot of attention to making a good keyboard shortcut system, for example. I think it makes a lot of sense for someone to write a plugin for the system that registers keyboard shortcuts for doing tab management, maybe in the vein of a tiling window manager. Also, it would take changing only a few lines of code to move the file browser from the left side panel to the main docking area - it's just a widget that is registered on the left side panel rather than the main area.

We also encourage people to theme the environment, and provide themes via plugins. I think theming things with a lot more minimal whitespace would be an interesting project for users like you.

Also, for a power user, I'd definitely suggest running it in the browser application mode, which gets rid of lots of the browser chrome.

Remember - it's an alpha-level project, we're still experimenting a lot, especially on the user experience and UI. JupyterLab was built to be easily extensible and customizable, and we encourage people to experiment with customizing the environment in a way that suits them. And we appreciate feedback as well! Thanks!


The IPython terminal is still there, but you can't expect ma$$$ or Excel users to start working in a vim- or emacs-based workflow. Besides, the browser provides rich output and cross-platform support. The big thing for me is widgets.


The returns to widgets in these kinds of things are soo diminishing. It looks cool the first 5 times, but then your boss says, "Great. Productionize it" and it's a world of shit. The widgets are good for throwaway demos and presentations, but if you are designing some functions for a business API of some kind, and your form of reporting is to show people something with a widget, now they want your internal API to include the widget, which is a dangerous game.

I like super boring static charts, such that I can completely decouple the presentation from whatever business API was used behind the scenes for the data that's displayed. Take a super severe Occam's Razor approach to absolutely avoiding any kind of animation or interactive plot for as long as possible, and only if there is some kind of outrageously severe business case that absolutely demands it (which there almost never is) will I resort to designing a way to productionize the widget aspect of the report visualization as well.

For this reason, I now see widgets like from IPython as a huge minefield. It looks pretty, but it's a big trap, and it doesn't add nearly the value everyone thinks it does.

(Just to be clear, I'm only talking about the non-demoware case, which is the kind of case I have always worked with.)


I use widgets for data exploration, model tuning, and interactive plots. Works great in practice! Have a look at bqplot from Bloomberg used in production.


I avoid widgets for data exploration, which should be written from the start in a well-tested and library-focused sort of way even when it's ad hoc.

Model tuning absolutely should not be done with something like a widget. In fact, I could see how that could easily lead to unreported issues with multiplicity of testing when someone's just sliding around a slider and seeing what looks best, oblivious to the statistical consequences. Model tuning is better handled by having a separate sort of model specification file, in which parameters, data cleaning steps, etc., have to be registered ahead of time before any code is executed whatsoever. That allows full transparency and reproducibility: you can even map model specs to unique IDs and backtrack to which analysts executed the job to fit that model, how many times it was updated, etc. ... whereas someone monkeying around in a notebook with a slider bar, that's absolutely not OK. It would be OK for demoware, but never ever for serious production cases.
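As a rough illustration of mapping model specs to unique IDs, here is a small Python sketch. Everything in it is an assumption made for the example: the spec is a plain dictionary, the ID is a truncated SHA-256 of its canonical JSON form, and `REGISTRY`, `spec_id`, and `register` are hypothetical names rather than part of any real system.

```python
import hashlib
import json

# In-memory stand-in for wherever specs would really be stored.
REGISTRY = {}


def spec_id(spec):
    """Derive a stable ID from the spec's canonical JSON form."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def register(spec, analyst):
    """Record that an analyst ran this exact spec; return the spec's ID."""
    sid = spec_id(spec)
    entry = REGISTRY.setdefault(sid, {"spec": spec, "runs": []})
    entry["runs"].append(analyst)
    return sid


# Hypothetical spec: parameters and cleaning steps declared before any fitting.
spec = {
    "model": "ridge",
    "params": {"alpha": 0.5},
    "cleaning": ["drop_nulls", "winsorize_1pct"],
}

first = register(spec, analyst="alice")
second = register(dict(reversed(list(spec.items()))), analyst="bob")
tweaked = register({**spec, "params": {"alpha": 0.6}}, analyst="alice")
```

Because keys are sorted before hashing, `first` and `second` are the same ID even though the dictionaries were built in a different order, while the tweaked `alpha` gets its own ID, so every nudge of a parameter leaves a separate record.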

Avoiding widgets works great in practice. When I was working in quant finance, this was a reason why we heavily decoupled all data presentation code from all data exploration code.

We also realized that the interactive plots just add nothing 99% of the time and are virtually never worth the headache. Just use static plots until there's a serious use case that truly requires interactivity. Above all, don't use interactivity just because it's the shiny new thing.

bqplot, Bokeh, d3py, etc., these are great engineering projects that just unfortunately don't have pragmatic use cases and are generally adopted out of hype and an obsession for the new more than for pragmatism.

After working with lots of these tools, we just began to realize that e.g. mousing over line charts or maps and being able to click to drill down into data points was simply not helpful. Streaming plots do have some applications when you have to view real-time dashboards, but in these situations it gets abused and used incorrectly, mostly due to bad dashboard ergonomy (cough Bloomberg), and so there actually can be a cost-effectiveness argument in avoiding the streaming dashboard anyway, kind of like cost-effectiveness arguments for reducing alert menus to avoid alert fatigue. The cognitive foibles of the user matter to the design!

In a Tufte sort of sense, it just did not actually aid in perceptual understanding. It's more of a "let's do it cause we can" thing than a "let's do it because it actually offers actionable insight" thing.


Hey, Tufte also criticizes Excel, which is still the most widely used tool for analyzing data and making plots that are not static. Engineers love it. Anyway, I will stop my replies here.


Nah. Excel is terrible. Tufte's right about that one. I'm sure there are engineers who like Excel -- I mean Coldplay keeps selling albums, so?


The notebook is still the primary interface for the final version of your analysis. But during iteration and experimentation, the other parts of JupyterLab are really helpful! Before, all of these were not so nicely integrated.


But from what I can tell, it still doesn't look integrated. It looks like separate tools all glued on the same webpage.

Now I have yet a separate terminal, a separate file manager, a separate Python REPL, ... How is this any better than just keeping those applications open and switching between them?

I really hope this isn't just a poorly-implemented window manager inside a web browser.


For some plugins integration just means visual integration (i.e. window-manager integration). And, to be clear, this one is not poorly-implemented. It's based on a very powerful web-application toolkit called PhosphorJS written by very experienced UI developers.

For other plugins there is very nice integration so that you can have interactions in one plugin update outputs in another and vice-versa.

The ease of creating a connected ecosystem of web-applications connected to your workflow is one of the powerful aspects of the new platform.

What would great integration look like to you?


I would want integration into my existing workflow, not integration that replaces my workflow with something completely different.

For me, this means using emacs-ipython-notebook. This brings the ability to interact with a running IPython Kernel into the text editor that I use daily. It's the best of both worlds: interactive plotting+notebooks+rapid development assistance from IPython, and thanks to Emacs, all the keybindings are what I expect and I'm already familiar with how to extend the environment to suit my needs. EIN has been transformative for me.

Perhaps my needs don't match the needs of others. For most people, I imagine the JupyterLab style of environment that's self-contained in a browser would in fact be a step up from the ad-hoc Notepad++ windows and PuTTY sessions used previously.


The layout is customizable - you keep what is needed.



