fryguy · on Oct 5, 2022

I mean, that's what STIR/SHAKEN do.

fryguy · on Jan 28, 2020

My problem with notebooks is that I feel like the natural mental model for them is a spreadsheet mental model, not a REPL mental model. Under that assumption, changing a calculation in the middle means that all of the cells that depend on that calculation would be updated, but instead you need to go and manually re-run the cells after it that depend on that calculation (or re-run the entire notebook) to see the effect on later things. Keeping track of the internal state of the REPL environment is tricky, and my notebooks have usually just ended up being convenient REPL blocks rather than a useful notebook since that's the workflow it emphasizes.
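To make the spreadsheet mental model concrete, here is a minimal sketch of cells that re-run their dependents when edited. The `ReactiveCells` class and its method names are hypothetical, written for illustration only; this is not how Jupyter or any particular notebook works, and it assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


class ReactiveCells:
    """Toy spreadsheet-style notebook: editing a cell re-runs its dependents."""

    def __init__(self):
        self.funcs = {}   # cell name -> function computing its value
        self.deps = {}    # cell name -> tuple of cell names it reads
        self.values = {}  # cell name -> last computed value

    def define(self, name, func, deps=()):
        """Create or edit a cell, then refresh it and everything downstream."""
        self.funcs[name] = func
        self.deps[name] = tuple(deps)
        self._refresh(name)

    def _downstream(self, name):
        """Return `name` plus every cell that depends on it, directly or not."""
        dirty = {name}
        changed = True
        while changed:
            changed = False
            for cell, deps in self.deps.items():
                if cell not in dirty and dirty.intersection(deps):
                    dirty.add(cell)
                    changed = True
        return dirty

    def _refresh(self, name):
        dirty = self._downstream(name)
        # static_order() yields each cell after the cells it depends on.
        for cell in TopologicalSorter(self.deps).static_order():
            if cell in dirty:
                args = [self.values[d] for d in self.deps[cell]]
                self.values[cell] = self.funcs[cell](*args)


nb = ReactiveCells()
nb.define("raw", lambda: [1, 2, 3])
nb.define("total", lambda raw: sum(raw), deps=["raw"])
nb.define("report", lambda total: f"total={total}", deps=["total"])

# Editing the first cell updates "total" and "report" without manual re-runs.
nb.define("raw", lambda: [10, 20, 30])
print(nb.values["report"])  # total=60
```

Only the edited cell and the cells downstream of it are recomputed, so unrelated expensive cells keep their cached values.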

WorldMaker · on Jan 28, 2020

That's something that I think Observable [1], in my modest usage, seems to do well.

[1] https://observablehq.com/

jacobolus · on Jan 28, 2020

Yep, the real complaint is “dead state”, not out of order execution. Worrying about linear flow per se turns out to be misguided based on lack of imagination for/experience with a better model: reactive re-rendering of dependent cells. Observable entirely solves the dead state problem, in a much more effective way than just guaranteeing linear flow would do.

* * *

More generally, Observable solves or at least ameliorates every item in the linked article’s list of complaints. (In 2020, any survey about modern notebook environments really should be discussing it.)

I found the article quite superficial. More like “water cooler gripes from notebook users we polled” than fundamental problems with or opportunities for notebooks as a cognitive tool. I think you could have learned more or less the same thing from going to whatever online forum Jupyter users write their complaints at and skimming the discussion for a couple weeks.

I guess this might be the best we can hope for from the results of a questionnaire like this. But it seems crazy having an article about notebook UI which makes no mention of spreadsheets, literate programming, Mathematica, REPLs, Bret Victor’s work, etc.

From the title I was hoping for something more thoughtful and insightful.

SiempreViernes · on Jan 28, 2020

You can get a Jupyter extension[1] that allows you to add tags and dependencies and this way construct the dependency graph as you go along. Of course, you have to do it manually and the interface is a bit clunky, but it does what it says.

In practice I think taking care not to accidentally shadow variables is much more important: this dependency business only makes sense once you have a clear idea of what you need and by that point you are mostly done anyway.

[1] https://jupyter-contrib-nbextensions.readthedocs.io/en/lates...

jacobolus · on Jan 28, 2020

I don’t understand what you are trying to say in your second paragraph, but I highly recommend you spend a few weeks playing with http://observablehq.com instead of speculating about the differences.

In practice, I find it to be dramatically better than previous notebook environments for data analysis, exploratory programming / computational research, prototyping, data visualization, and writing/reading interactive documents (blog posts, software library documentation, expository papers ...). It has a lower barrier to starting new projects, a lower-friction flow throughout

I find it better at every stage of my thinking process from blank page up through final code/document, and would recommend it vs. Jupyter or Matlab or Mathematica in every case unless some specific software library is needed which is unavailable in JavaScript. The only other tool I really need is pen and paper, though I also use http://desmos.com/calculator and Photoshop a fair bit.

tel · on Jan 28, 2020

This falls apart when computation is a factor, though. You can't recompute the whole notebook on every commit when there are 30 cells that each take 2-8 seconds to complete.

etangent · on Jan 28, 2020

[flagged]

stevesimmons · on Jan 28, 2020

In Jupyter I approach this by structuring my exploratory analysis in sections, with the minimum of variables reused between sections.

Typically the time-intensive data prep stage is section 1.

The remaining sections are designed essentially like function blocks: data inputs listed in the first cell and data outputs/visualizations towards the end.

Once I decide the exploratory analysis in a section is more-or-less right, I bundle up the code cells into a standalone function, ready for reuse later in my analysis.

Jupyter notebooks can easily get disorganised with out-of-order state. However that is their strength too: exploratory analysis and trying different code approaches is inherently a creative rather than a linear activity.

Sean1708 · on Jan 28, 2020

Maybe I'm missing a joke here, but if that's your workflow then there's absolutely no advantage to notebooks over something like Spyder or even VS Code.

etangent · on Jan 28, 2020

No, that's not the workflow. You work in the notebook as normal but from time to time (say every two hours) rerun the whole thing.

One advantage of this is that it forces you to name your variables such that they don't overwrite each other. Further down the line this enables sophisticated comparisons of states (e.g. dataframes) before and after (something data scientists need).

gdy · on Jan 28, 2020

If you have a few long data loading and preprocessing steps, it's a pain to wait for them to run again, so people try to avoid it.

When something odd begins to happen, they don't immediately consider the possibility that it's not their bug and waste time trying to 'debug' the problem instead of just rerunning the notebook.

playing_colours · on Jan 28, 2020

Would it be a solution to store intermediate computations in an in-memory or on-disk database like Redis or SQLite? Isn't it a matter of a few minutes to run a Docker instance and write simple read/write + serialize Python util functions?
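A minimal sketch of what such util functions might look like, using SQLite from the Python standard library (so no Docker instance is needed for this variant). The `ResultCache` class, the table name, and the cache keys are hypothetical names chosen for illustration; invalidation is manual (change the key), and `pickle` should only be used on cache files you trust.

```python
import pickle
import sqlite3


class ResultCache:
    """Stores pickled intermediate results in a SQLite table keyed by name."""

    def __init__(self, path="notebook_cache.sqlite"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS results (key TEXT PRIMARY KEY, value BLOB)"
        )

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, computing and storing it if absent."""
        row = self.conn.execute(
            "SELECT value FROM results WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return pickle.loads(row[0])
        value = compute()
        self.conn.execute(
            "INSERT OR REPLACE INTO results (key, value) VALUES (?, ?)",
            (key, pickle.dumps(value)),
        )
        self.conn.commit()
        return value


calls = []

def slow_load():
    # Stand-in for an expensive data loading step.
    calls.append(1)
    return {"rows": list(range(5))}

cache = ResultCache(":memory:")  # use a file path to survive kernel restarts
first = cache.get_or_compute("load-v1", slow_load)
second = cache.get_or_compute("load-v1", slow_load)  # served from the cache
```

With a file path instead of `":memory:"`, the expensive step would be skipped after a restart-and-rerun as long as the key is unchanged.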

gdy · on Jan 28, 2020

Surely, it would be a solution, but I don't think for an average data scientist it's a matter of a few minutes.

etangent · on Jan 28, 2020

You don't reload every time you write a line of code. Nobody's insane like that. You reload every two hours or so. This is good enough for all but the most extreme data sets.

graphpapa · on Jan 28, 2020

Well, what if block 1 takes ages but everything after that is dependent (2->3->4 etc.)? Obviously it would be nice to just re-run block two and have those changes cascade.

heinrichhartman · on Jan 28, 2020

I break long-running data imports out into separate notebooks or .py files, and persist the results.

Always restart&re-run for usable results.

tobmlt · on Jan 28, 2020

That’s what I’d always do. On more complex notebooks, though, is it possible that isn’t a solution? I wouldn’t think so but I am happy to be surprised. Then again I use notebooks only at the end of a project to present work in “executable presentation” style. Restart and Rerun All has always been sufficient for me. More generally, I took a look at notebooks, thought, “Why develop with all the extra baggage” and left it at that until ready to experiment with presentation methods for (tight) core ideas.

dkersten · on Jan 28, 2020

Why are you even using notebooks at all then?

etangent · on Jan 28, 2020

> One advantage of this is that it forces you to name your variables such that they don't overwrite each other. Further down the line this enables sophisticated comparisons of states (e.g. dataframes) before and after (something data scientists need)

Also, not sure about you, but I like seeing all of my outputs on a single browser page without having to write any glue code whatsoever.

fryguy · on June 8, 2017

It seems to me to be the exact opposite of this. If all of the data going from server to client reaches JavaScript as JSON, that usually means a JSON serializer is involved, and it should escape the data correctly since you're not generating the JSON by hand. In that case there is no chance for traditional XSS attacks, since the only remaining vector would be manual DOM building by concatenating strings, which you generally don't do in React. Now CSRF attacks I would believe you, but not XSS with React.
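The serializer point can be illustrated with a short sketch on the server side (Python here purely for illustration; the payload string is made up). It only shows that a serializer keeps data as data where hand-concatenated JSON does not; it says nothing about what the client later does with the value in the DOM.

```python
import json

# Attacker-controlled text containing a quote and extra JSON syntax.
comment = 'nice", "admin": true, "x": "'

# Hand-built JSON: the quote in the data ends the string early,
# so the rest of the text is parsed as additional keys.
hand_built = '{"comment": "' + comment + '"}'

# Serializer: quotes are escaped, so the text stays a single string value.
serialized = json.dumps({"comment": comment})

print(json.loads(hand_built))
print(json.loads(serialized))
```

The hand-built document gains an injected `admin` key, while the serialized one round-trips to exactly the original string.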

fryguy · on May 23, 2017

I would say this is more like a local git repository than what you said. When you add passwords to a vault, then it gets saved to your local copy and then synced to the server. When changes are made elsewhere, it downloads the changes and syncs your local repository.

Now "travel mode" simply removes the local git repository. The data still exists in the cloud, but you have to actively go out and log in to their service to retrieve it. Are you "hiding something" because you deleted a local copy of something from your device? There isn't something on your device that is somehow hidden. It's not there.

fryguy · on March 17, 2017

I prefer the Goodreads rating system which compresses 1 and 2:

1 - Did not like it

2 - It was ok

3 - Liked it

4 - Really liked it

5 - It was amazing

The reason is that most books won't end up being 1 or 2, and there really isn't much difference between them. But it sucks giving 2 stars to something that wasn't bad. Perhaps emoji arranged in a non-star-like pattern would be the best.

fryguy · on March 11, 2017

Can you explain how a collision helps? I mean, there are trivial collisions with the truncation that would be used instead. That doesn't mean that bcrypt(f(x)) is any weaker just because there may be some other x' such that f(x) = f(x').
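A rough sketch of the comparison, under loud assumptions: PBKDF2 from the standard library stands in for bcrypt so the example runs without extra packages, SHA-256 plays the role of f, and the salt and passwords are made up. The 72-byte limit is bcrypt's input limit, emulated here by slicing.

```python
import hashlib

BCRYPT_MAX_BYTES = 72  # bcrypt only uses the first 72 bytes of its input


def slow_hash(data: bytes, salt: bytes) -> bytes:
    # Stand-in for bcrypt (assumption): PBKDF2-HMAC-SHA256 from the stdlib.
    return hashlib.pbkdf2_hmac("sha256", data, salt, 10_000)


def truncate(password: bytes) -> bytes:
    # What effectively happens when a long password is fed to bcrypt directly.
    return password[:BCRYPT_MAX_BYTES]


def prehash(password: bytes) -> bytes:
    # f(x): compress the whole password before the slow hash.
    return hashlib.sha256(password).digest()


salt = b"fixed-salt-for-demo"
a = b"x" * 72 + b"first"
b = b"x" * 72 + b"second"

# Truncation: any two passwords sharing the first 72 bytes collide trivially.
truncated_collide = slow_hash(truncate(a), salt) == slow_hash(truncate(b), salt)

# Prehashing: collisions exist in principle, but these two inputs do not collide.
prehashed_collide = slow_hash(prehash(a), salt) == slow_hash(prehash(b), salt)

print(truncated_collide, prehashed_collide)  # True False
```

Either way, an attacker still has to guess an input that produces the stored hash; the existence of some colliding x' does not hand them one.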

fryguy · on March 11, 2017

zxcvbn (https://dl.dropboxusercontent.com/u/209/zxcvbn/test/index.ht...) gives it a score of 4/4 and an entropy of 10^20, meaning it would take centuries to hack at 10B a second. I think this is a slight overstatement of the security, because it's probably more along the lines of 5050 * 2^30, which is closer to 10^12. And this would be a legal password (but banned by the bullshit password rule).
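The time estimates follow from dividing the number of guesses by the guess rate. A quick worked check, assuming the attacker has to exhaust the whole space (the average case would be about half of this) and using the two figures quoted above:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
GUESSES_PER_SECOND = 10e9  # "10B a second"


def seconds_to_exhaust(guesses, rate=GUESSES_PER_SECOND):
    """Worst-case time to try every guess at the given rate."""
    return guesses / rate


years_reported = seconds_to_exhaust(10**20) / SECONDS_PER_YEAR
seconds_revised = seconds_to_exhaust(10**12)

print(round(years_reported))  # 317
print(seconds_revised)        # 100.0
```

So 10^20 guesses is a matter of centuries at that rate, while 10^12 guesses is under two minutes.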

I feel like the solution to everything in this thread is just to use zxcvbn and stop with the insane rules for things. In your two cases: the bank would disallow passwords below some limit while the blog would just show you a warning (in case you were ignorant of hacking enough to know that "aaaaaaaa" wasn't a good password), but let you use your awful password to spare you from having to remember it.
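The bank-versus-blog policy could be sketched roughly like this. The function name, the `site_kind` values, and the default threshold of 3 are assumptions made for illustration; the only thing taken from zxcvbn is that it reports a score from 0 to 4, which is assumed to be computed elsewhere and passed in.

```python
def check_password(score: int, site_kind: str, minimum: int = 3):
    """Decide what to do with a password given its zxcvbn score (0-4).

    Returns an (action, message) pair. Hypothetical policy:
    a "bank" rejects weak passwords, any other site only warns.
    """
    if not 0 <= score <= 4:
        raise ValueError("zxcvbn scores range from 0 to 4")
    if score >= minimum:
        return ("accept", None)
    if site_kind == "bank":
        return ("reject", "Password is too easy to guess; choose another.")
    return ("warn", "This password is easy to guess, but you may keep it.")


print(check_password(0, "bank"))
print(check_password(0, "blog"))
print(check_password(4, "bank"))
```

No composition rules are involved: the only input is the strength estimate, and the site decides how strict to be with it.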

fryguy · on March 10, 2017

I use realtimebitcoin.info

fryguy · on March 10, 2017

How many dots/stars should one display for a password? That's a question that can't be answered by your two valid questions. Are you suggesting that dots/stars shouldn't be displayed for passwords, since you can't ask how many "characters" it is?

slededit · on March 10, 2017

You could divide the length of the string by the length of the '*' character in a monospaced font. It doesn't really make sense for a combining or other invisible character to get its own asterisk.
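Measuring rendered width needs a font, so as a swapped-in approximation here is a sketch that simply skips combining marks and invisible format characters when counting asterisks. The `mask_length` function is hypothetical, and this is not full grapheme segmentation: sequences such as emoji joined with zero-width joiners would still get more than one asterisk.

```python
import unicodedata


def mask_length(text: str) -> int:
    """Count code points that plausibly deserve their own asterisk."""
    count = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining mark, e.g. U+0301 COMBINING ACUTE ACCENT
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue  # other nonspacing/enclosing marks and format characters
        count += 1
    return count


print(mask_length("abc"))        # 3
print(mask_length("e\u0301"))    # 1: "e" plus a combining accent
print(mask_length("a\u200bb"))   # 2: the zero-width space is skipped
```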

toast0 · on March 10, 2017

If you have an entry indicator, it should probably be about the same width as the entered text; or if you're concerned about leaking precise length information for fields that aren't monospaced, you could add a dot each time the rendered text would increase in width.

fryguy · on Feb 28, 2017

Think about it this way. Netflix wanted to separate their DVD and streaming into separate companies, so there would be different websites for each. Consumers complained about that so much that they changed their mind and stayed with one. If Google/YouTube had started with a single product and broken it out like it is now, people I think would rightly complain.

I think the most egregious thing is music, rather than YouTube Red or this service itself. There's Google Play where you can buy music, Google Play Music where you can subscribe like Spotify, and YouTube Red also includes unlimited music on top of the shows on Red but is somehow different than Google Play Music. I guess what I would suggest if I had any say in things is just killing the Google Play Music and Google Play Movie/TV brands, and moving the content to YouTube and maybe integrating YouTube into the Google Play app on phones. YouTube already has movies, so it would just be a matter of making the music and live streaming part of it more visible on the home page. I don't see any links to them on my front page.

harigov · on Feb 28, 2017

I would say - just brand them as completely different services unless they are totally dependent. Just because movies are served on top of YouTube doesn't necessarily mean that they literally have to be part of the YouTube website. Why not a service like "MovieTube" with its own website and brand? It makes YouTube a single focused website intended to find any/all random videos on the internet, and people understand when they hear "MovieTube". With YouTube Red, YouTube TV, YouTube Live, YouTube Movies, etc., it is hard to remember which is what.