Making Hard Things Easy (jvns.ca)
1134 points by hasheddan on Oct 6, 2023 | hide | past | favorite | 197 comments



The part that resonated most with me is "Show things that are normally hidden".

Tools that do this make things clearer almost immediately. Consider the developer tools in a web browser. Do you remember the "dark ages" before such things existed? It was awful because you had to guess instead of seeing what was going on.

Tools like Wireshark that show you every last byte of network packets that it has access to AND parses it to help you see the structure. This isn't just for debugging networking data; it's hugely beneficial in teaching networking concepts because nothing is hidden.

This is also one of my favorite things about open source software. I can view the source to understand what's causing a bug, to fill in knowledge gaps left by the documentation, or just learn more about programming concepts. Nothing is hidden.


The game development equivalent of these is RenderDoc. I was stunned when I first learned of its existence.


I’m about to embark on my first real and truly massive graphics project. Thx for mentioning such a cool tool.


Wireshark is great but it does not show you every byte the network carried. For example it never shows Ethernet preambles, only sometimes shows Ethernet frame checksums, and never shows interpacket gaps (which are a required part of the Ethernet protocol).

So yes it comes close but it just goes to show you, there is always more detail hiding somewhere!


Yes, the toughest "hidden things" problems are pulling together data that is related, but not part of the same system. In this case, Wireshark can only show you what the OS gives to it.

In the article, it was pointed out that DNS caches can be hidden. They're especially hidden when they're upstream and in another computer!


It doesn’t show electrical signals either. That’s a pointless nitpick if you’re developing anything beyond an Ethernet implementation.


My biggest problem with Wireshark is that it can't do anything with HTTPS traffic - which is most of the traffic I'd be interested in. I understand that's kind of the point of HTTPS, and that a MITM proxy with cert replacement is somewhat out of scope for Wireshark, but it still limits the usefulness of the program.


This method is really simple for Chrome, Firefox, and other apps that support TLS key logging: https://wiki.wireshark.org/TLS#using-the-pre-master-secret

You set an environment variable to instruct the app to write a file that Wireshark can use to decrypt its traffic, change a setting in Wireshark to point at that file, and that's it.

You will even be able to see decrypted WebRTC traffic.
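A minimal sketch of the setup (the key-log path is arbitrary, and this assumes the browser build honors SSLKEYLOGFILE):

    # make the browser append TLS session secrets to a file
    export SSLKEYLOGFILE="$HOME/tls-keys.log"
    firefox &    # launched from this shell so it inherits the variable
    # then in Wireshark: Edit > Preferences > Protocols > TLS >
    #   "(Pre)-Master-Secret log filename" -> point it at that file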


it can, just not as easily as MITM proxies: https://wiki.wireshark.org/TLS


Check out Charles Proxy for mac. It’s great.


Are the Ethernet frame checksums even visible to Wireshark, which hooks into the IP layer? Would some of the Ethernet stuff be visible only within the Ethernet card itself, not to the software stack?


Sometimes they are. It depends on how the capture was generated. If you look in the options of Wireshark there is one to detect bad checksums, so clearly there is a way to capture them. Here is one such way: https://stackoverflow.com/questions/22101650/how-can-i-recei...

This can be used to detect partially-bad network cables, because there is no reason you should ever receive a bad FCS.


Wireshark has support for lots of interfaces, it's just a consumer for OS/driver supplied data in this aspect.

See eg this bit about wlan for some of the complexity: https://wiki.wireshark.org/CaptureSetup/WLAN#link-layer-radi...


Wireshark works at the data link layer (L2).


Well...

ethhdr only has 14 bytes: 6 for the destination MAC address, 6 for the source MAC address, and 2 for the EtherType (e.g. IPv4 vs IPv6).

Any bad checksums that Wireshark can detect are predominantly at the transport layer (L4; TCP / UDP).

You'd have to explicitly turn on FCS capture, and I don't know whether you can even do that on, say, Windows.


To be fair, the preamble can reasonably be considered just an electrical signal, given its purpose (synchronization), and it doesn't affect how the network works.


Wireshark will also frequently show the wrong checksums for IP, UDP, and TCP - specifically when computing them is offloaded to the network card.


My dream is to make everything visualizable at runtime. I think all of computing becomes very simple and much less complex if we can do this.

We are already visualizing things in our heads, and any explanation of anything in computing is a diagram. But we have zero diagrams when coding.

Just dynamically instrument all code to send messages to a GUI.


>My dream is to make everything visualizable at runtime.

Check out demos of the old Lisp Machines, [1] is a brief overview demo, [2] links to a timestamp with a view of some simple diagramming, but I’ve seen TI-Symbolics beasts routinely display complex relationships in Lisp code on their massive (for the time) bitmapped screens. The limitation was the end user managing the visualization complexity.

With open source LLVM, Clang, and similar making available abstract syntax trees and even semantic analysis and type checking results, LLMs assisting with decompiling binary blobs, and modern hardware (goggles, graphics cards, and so on), I sometimes wonder how close we can come to reproducing that aspect of the Lisp Machine experience on open source operating systems.

[1] https://youtu.be/o4-YnLpLgtk

[2] https://youtu.be/jACcgLfyiyM?t=43m52s


Clojure does pretty well. See https://github.com/nubank/morse, https://docs.datomic.com/cloud/other-tools/REBL.html, and https://vlaaad.github.io/reveal/.

It's one of the areas that homoiconicity helps: code is data, data is code, so visualization tools can work on both sides.


'"Show things that are normally hidden". Tools that do this make things clearer almost immediately.'

At least with all the DevOps shit going on now, it seems that many of the tools increasingly hide things. And the gurus who had that knowledge and could teach you are now concentrated in those tool companies instead of in your org.


This is one thing I love about Magit for Emacs. The UI is really clever and slick—maybe the best Git frontend I've ever used—but the way you interact with the UI is by toggling flags and options that map directly to the underlying command-line Git arguments. I can seamlessly hop into the command line and feel right at home using it directly.


On top of the UI having a very tight, visible relationship to git CLI commands, there's also a log buffer showing you the exact commands executed and their outputs, and it's only a single $ press away.


Yes, text-based UIs like Magit offer this kind of simplicity without sacrificing CLI speed.


Man, having spent waaaay too long troubleshooting errors where it's not remotely clear what part of the config the program is even consulting makes this hit home.


This is what I like about SQL databases. They frequently have tables that you can query to find system information. This makes it easy to explore the system within the system.

The proc filesystem on Linux is philosophically similar. It allows you to understand processes by working with files.
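For example (a sketch: sqlite3 and Linux /proc shown; other databases expose the same idea via information_schema or pg_catalog):

    # the database describes itself in queryable tables
    sqlite3 app.db "SELECT name, sql FROM sqlite_master WHERE type='table';"

    # /proc does the same for processes: state exposed as readable files
    head -n 5 /proc/self/status   # name, state, etc. of the reading process
    ls /proc/self/fd              # its open file descriptors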


This is also one of my favorite things about open source software. I can view the source to understand what's causing a bug, to fill in knowledge gaps left by the documentation, or just learn more about programming concepts. Nothing is hidden.

On the contrary: you don't need the source, and in some cases it may even be misleading, when you can look directly at the instructions the machine executes. Tools like disassemblers and decompilers are the equivalent of what you describe.


Nothing stops you from doing this with open source software either, though; making something open source is strictly an increase in the information available, and at least to me that seems in the spirit of "show me things that are normally hidden".


Yep. Mozilla developer tools basically launched my career.


https://news.ycombinator.com/item?id=37467078

Is there a way to get Awk to emit a non-terse version of the script passed in? I.e. awk '/test/' -> '{ if ($0 ~ /test/) { print $0 } }'


Sector editor!

(Or, back in the day, looking at the source code, because we ran uncompiled stuff in BASIC and whatever, and that was pretty cool)


That's one of the advantages of using programming languages where source code is distributed (e.g. Python, JavaScript, PHP), not compiled binary artifacts (C/C++, Java). You can see the source. You can even modify it and run the modified version without compilation.

It's also awesome to use Java IDEs that can show both the bytecode of .class files and also perform decompilation.


Julia has to be one of the most likable people in tech! Every time I read one of her articles I feel that same bubbly rush of excitement I got when I was a kid, just starting to unfurl the secrets of reality through my own little experiments. Absolutely lovely.


Yes, it's rare to find someone who possesses deep technological know-how and is also a brilliant teacher and communicator. Andrej Karpathy is another who comes to mind. Fortunately, I've discovered that more people fit this mold recently.


For a second I thought you were talking about Julia the language. :)


Shhh... You'll make me miss undergrad. c:


Very much. I'm usually not a fan of the "omg awesomesauce" style of overexcited blog posts or tutorials (I much prefer the drier, high signal-to-noise, concise, beautiful texts of the Landau & Lifshitz style), but her posts all make me feel that giddy rush of excitement you're talking about :)


I think because it feels authentic, rather than forced.

It’s really easy to bleed one into the other.


She was LOVELY in person as well. I got her to sign my copy of How DNS Works.


> One thing that I sometimes hear is -- a newcomer will say "this is hard", and someone more experienced will say "Oh, yeah, it's impossible to use bash. Nobody knows how to use it."

> But I would say this is factually untrue. How many of you are using bash?

I think the meaning of the statement is not that straightforwardly literal.

I think what it means is, "We don't have strong confidence in our understanding of our bash code or confidence that it will behave as we expect in untested scenarios. If anything out of the ordinary happens, we kind of expect that something will fail and we will learn something new about bash that will make us cringe and/or strike a nearby object hard enough to injure ourselves."

Bash is a complex language, and for most programmers, it is unlike any other language they use. Most companies have a little bit in production somewhere, and most of them don't have a single person who writes enough bash to know it well. I think it's no accident that build tools, CI tools, and cloud orchestration tools are evolving in the direction of minimizing the need for shell scripting.


Personally I think the complexity of tools like bash comes from lack of evolution.

As a thought experiment, why couldn't bash have a better assignment statement available?

In other words, something like:

    set --goodass
    a = string1 + '.' + string2

This would cut through SO much of the shell-quoting nonsense that you have to deal with.

Another tool like "make" would benefit too.

I think 6 months of development on "make" (usable variables, clear ways of manipulating paths and filenames, more usable targets) would be better than 6 months of writing complex makefiles.


I think the question is whether a newcomer will understand the implicit meaning, or if they might interpret it more literally than it's meant. In particular, "for most programmers, it is unlike any other language they use" is not something someone new would necessarily be able to infer, because that sentiment requires enough experience to tell the difference between "uncommon" and "extremely esoteric".


On a related note, most software is over-engineered. I think it's partly because of the centralization of the industry; it pushes everyone towards a small number of tools for the benefit of the small number of people who control them, and so many of these tools end up becoming 'everything tools' that cover more use cases than they should.

Companies want developers to all know the same tools; that way they are easily replaceable across projects and companies and have little bargaining power in the industry. This is why software has a single mainstream trunk and alternative approaches are shunned with no jobs available. The industry is not being allowed to decentralize despite the fact that it naturally 'wants' to.

On the bright side, I think that eventually, some new, far superior non-mainstream approaches are going to materialize and they will erode the mainstream approaches.

Tech is not like math and not even like science; it can support MANY different branches solving any given problem in many different ways.


I agree. In some ways it feels like we have gone backwards in web dev since, say, the early days of ASP.NET and Rails. Back then we had browser wars to keep us busy. But now browsers are broadly compatible, and we have invented all this front-end complexity for web apps that often don't need it.

Stuff like DNS, IP, and HTTPS can't be helped, as they are fundamental things that need backwards compatibility and are somewhat political too.

I feel that learning those things well is a better investment though than learning the frameworks.

… if I keep going I will start talking about innovation tokens!


You can learn both, though. As much as people like to trash-talk it, I think learning from "the bottom up", as long as you remember to follow the 80/20 principle and not go too deep into unnecessary rabbit holes, is still the best approach in terms of long-term ROI on your time. I got a degree in EE because I wanted to be a "true" full stack engineer; last year I finally got a chance to learn React and log a lot of cockpit flight hours setting up Microsoft Azure. It took longer, but I feel I'm on much steadier ground to keep climbing up.


To make hard things easy you have to find the right way to abstract them, so that you hold only some bits of the hard things in your head (plus, maybe, the frequently-used details), and everything else you look up as needed. That's what I do, and that's roughly what TFA says.

The problem is that people don't necessarily bother to form a cognitive compression of a large topic until they really have to. That's because they already carry other large cognitive burdens with them, so they (we!) tend to resist adding new ones. If you can rely on someone else knowing topic X well, you might just do that and not bother getting to know topic X well enough. For those who do know topic X well, the best way to reduce the demand for help is to help others understand a minimal amount of topic X.

> So, bash is a programming language, right? But it's one of the weirdest programming languages that I work with.

Yes, `set -e` is broken. The need to quote everything (default splitting on $IFS) is broken. Globbing should be something one has to explicitly ask for -- sure, on the command line that would be annoying, but in scripts it's a different story; as it stands you have to disable globbing globally, and then globbing where you do want it gets hard. Lots of bad defaults like that.
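To illustrate the splitting and globbing defaults:

    f="my file.txt"
    ls $f         # splits on $IFS: ls gets two args, "my" and "file.txt"
    ls "$f"       # one arg, as intended
    pat="*.txt"
    echo $pat     # globs: prints matching filenames (if any), not the pattern
    echo "$pat"   # prints the literal *.txt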

It's not just Bash, but also Ksh, and really, all the shells with the Bourne shell in their cultural or actual lineage.

As for SQL, yes, lots of people want the order of clauses to be redone. There's no reason it couldn't be -- I think it'd be a relatively small change to existing SQL parsers to allow clauses to come in different orders. But I don't have this particular cognitive problem, and I think it's because I know to look at the table sources first, though I'm not sure.


> Yes, `set -e` is broken

> The need to quote everything (default splitting on $IFS) is broken

> Globbing should be something one has to explicitly ask for

By the way, OSH runs existing shell scripts and ALSO fixes those 3 pitfalls, and more. Just add

    shopt --set ysh:upgrade
to the top of your script, and those 3 things will go away.

If anyone wants to help the project, download a tarball, test our claims, and write a blog post about it :)

Details:

https://www.oilshell.org/release/latest/doc/error-handling.h...

https://www.oilshell.org/release/latest/doc/simple-word-eval...

These docs are comprehensive, but most people don't want that level of detail, so having someone else test it and write something short would help!

For a while I didn't "push" Oils because it still had a Python dependency. But it's now in pure C++, and good news: as of this week, we're beating bash on some compute-bound benchmarks!

(I/O-bound scripts have always been the same speed, and that's most shell scripts.)

(Also, we still need to rename Oil -> YSH in those docs, which will probably cause some confusion for a while - https://www.oilshell.org/blog/2023/03/rename.html )


Slight correction: bin/ysh has a few things that ysh:upgrade doesn't catch

https://lobste.rs/s/6gycoi/making_hard_things_easy#c_sjfxif

Feedback is welcome (especially based on upgrading real scripts)


The problem is that we're still using these ancient shells when we have better ones. Users shouldn't be wasting time memorizing arcana like "set -e". At least we have search engines now...


I'm quite partial to the fish shell myself for this reason.

But I SSH into a lot of embedded systems these days, where you don't exactly have the luxury of installing your own shell all the time. For those times I like to whip out the "minimal safe Bash template" and `sftp` it to the server.

https://betterdev.blog/minimal-safe-bash-script-template/


When the shell script is that long, I reach for Python or Go instead pretty fast =)


Au contraire, bash/sh is pretty much everywhere.

Also, we have ChatGPT now. That helps a lot.


A possibly radical way of fixing the SQL clause order would be to introduce "project" which behaves like "select" but can be put in the right place.
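Something like (purely hypothetical syntax):

    -- hypothetical: projection written last, matching evaluation order
    FROM cats
    WHERE age > 3
    GROUP BY owner
    HAVING count(*) > 1
    PROJECT owner, count(*)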



When explanations include superfluous detail, I find it very confusing. Like Chekhov's gun, I keep trying to fit it into the plot but it doesn't fit.

My super power is a terrible memory. So I have to understand things in order to remember them (aka a cognitive compression). I can't just learn things like normal people.


> As for SQL

We should peel off SQL and get access to the underlying layers.


What if my SQL engine is Presto, Trino [1], or a similar query engine? If it's federating multiple source databases we peel the SQL back and get... SQL? Or you peel the SQL back and get... S3 + Mongo + Hadoop? Junior analysts would work at 1/10th the speed if they had to use those raw.

[1] https://trino.io/


TIL: the shell does not exit if the command that fails is part of any command executed in a && or || list, except the command following the final && or ||.

Reference: https://www.gnu.org/software/bash/manual/bash.html#index-set
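A minimal demonstration:

    #!/usr/bin/env bash
    set -e
    false && echo "not printed"   # fails, but inside a && list: no exit
    echo "still running"          # this line is reached
    false                         # a bare failing command: the shell exits here
    echo "never printed"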


"Fails" is a higher-level concept than the shell is concerned with. Failure conditions and reactions are entirely at the discretion of the programmer and are not built as an assumption into the shell.

The only thing /bin/false does is return 1. Is that a failure? No, that's how it was designed to work and literally what it is for. I have written hundreds of shell scripts and lots of them contain commands which quite normally return non-zero in order to do their job of checking a string for a certain pattern or whatever.

Programs are free to return whatever exit codes they want in any circumstance they want, and common convention is to return 0 upon success and non-zero upon failure. But the only thing that the shell is concerned with is that 0 evaluates to "true" and non-zero evaluates to "false" in the language.

It would be pretty inconvenient if the shell exited any time any program returned non-zero, otherwise if statements and loops would be impossible.

If a script should care about the return code of a particular program it runs, then it should check explicitly and do something about it. As you linked to, there are options you can set to make the shell exit if any command within it returns non-zero, and lots of beginner to intermediate shell script writers will _dogmatically_ insist that they be used for every script. But I have found these to be somewhat hacky and full of weird hard-to-handle edge cases in non-trivial scripts. My opinion is that if you find yourself needing those options in every script you write, maybe you should be writing Makefiles instead.


> It would be pretty inconvenient if the shell exited any time any program returned non-zero, otherwise if statements and loops would be impossible.

In another life I worked as a Jenkins basher, and if I remember correctly I had this problem all the time with some Groovy DSL aborting on any non-zero shell command exit. It was so annoying.


This is because && and || are often used as conditionals:

    [ -e README ] && cat README
avoids an error if the file README doesn't exist, and

    [ -e README ] || echo "You should write a README!"
works the opposite way.

What's more pernicious is that pipelines don't cause the shell to exit (assuming set -e) unless the last command fails:

    grep foo README | sort
does not fail if README doesn't exist, unless you've also used `set -o pipefail`.
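With pipefail, the same pipeline does abort the script:

    set -e -o pipefail
    grep foo README | sort   # if README is missing, grep's non-zero status
                              # now fails the pipeline and set -e exits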


So what happens? Does it just hang on the foo?


`grep` should terminate with empty output on `stdout` and an error message on `stderr`; `sort` will then successfully sort the empty input.


In my opinion, it is one of the biggest flaws in the shell language design, because it means that a function can produce different results independent of its arguments, depending on the context from which you call it. And it even overrides explicitly setting `set -e` within the function.

Some time ago I gave an example: https://news.ycombinator.com/item?id=22213830
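A short version of that pitfall:

    #!/usr/bin/env bash
    set -e
    f() { false; echo "after false"; }
    if f; then echo "f succeeded"; fi   # prints both lines: set -e is
                                        # suspended inside an if condition,
                                        # even within the function body
    f                                   # called bare: exits at 'false'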


There are more arcane things to learn about shell; at some point one just has to shrug. It's a fine tool for getting quick results, but not for writing robust programs.


This is a great description of things that seem like they shouldn't be so difficult but can have many complications. The SQL part seems to double-down on a conceptual failure rather than demystifying it though.

A query's logic is declarative: it defines the output. It's the query plan that has any sense of execution order or procedural nature to it. That's the first thing to learn. Then one can learn the fuzzy areas like dependent subqueries etc. But being able to see the equivalence between a not-exists and an anti-join is what enables understanding and reasoning.

Using an analogy such as a procedural understanding of written queries only kicks the can further down the road; when you're really stuck on something more complicated, you have no way to unravel the white lies.
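For instance, the not-exists/anti-join equivalence (table and column names here are hypothetical):

    -- the same result two ways: NOT EXISTS vs. an anti-join
    SELECT o.id FROM owners o
    WHERE NOT EXISTS (SELECT 1 FROM cats c WHERE c.owner_id = o.id);

    SELECT o.id FROM owners o
    LEFT JOIN cats c ON c.owner_id = o.id
    WHERE c.owner_id IS NULL;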


> The SQL part seems to double-down on a conceptual failure rather than demystifying it though.

She talked about a mental model to help her understand the query (it can be useful), and mentioned that it probably is not how the database actually processes the query.


My point is that there should be two mental models: one for getting the correct results, then another for doing so performantly. Being able to write many different forms that obtain the same correct results is what leads to combined understanding and proficiency.

An example of where muddling these ends up: real questions like "how does the db know what the select terms are when those sources aren't even defined yet?" By 'yet' they mean lexically, but also procedurally.


I suspect that Julia is solely using the first kind of mental model (getting the correct result), and completely ignoring query planning. But even this model has an order to it! Three examples of how this order can manifest, that should all agree with each other:

1. The explanatory diagrams that Julia drew for the talk. These wouldn't make sense if they were in a different order.

2. The order of operations you would perform if you developed a proof of concept SQL implementation that completely ignored performance. In this example the order would be: "cats, filter, group, filter, map, sort". This is exactly the order that Julia's explanation showed.

3. The relational logic expression for this query. There should be a correspondence between this expression and this ordered list of operations, though it's somewhat annoying to state. I think it's that, assuming all the operators in the relational logic expression are binary, if you reverse the order that a subset of the operators are written in, then the operators in the tree occur in the same order as the ordered list of operations. (I don't actually know relational logic, so I'm making a prediction here. This prediction is falsifiable: you can't put the operators in the tree in an arbitrary order.)

(Side note: the order isn't completely fixed. The last two steps --- SELECT and ORDER BY --- could happen in either order.)


If there's an index on the ORDER BY fields, it can even be first.



Topical recent episode about Postgres - but can be extrapolated in many cases to other databases: https://www.se-radio.net/2023/09/se-radio-583-lukas-fittl-on...


Excellent talk. She seems to be a very likable person. She is right that Bash is full of "gotchas" and trivia, and memorizing them all is very hard, but I think it is nice to memorize some trivia. For instance, I tended to forget the order of the arguments to the find command, and I would lose time trying to remember its syntax when in front of a machine with no readily available internet connection. So I committed to learning and memorizing the most common command-line tools and some of their "gotchas". I used Anki for that, and some mnemonics, and the return on investment has been worth it, I think.


I came here to say Anki is my lifeline for grokking difficult things like DNS.

It was in fact on jvns.ca's book recommendation that I got Michael W. Lucas's _Networking for System Administrators_, and strip mined it for Anki cards containing both technical know-how and more than a little sysadmin wisdom.

It might be one of the highest-ROI books I've ever read, considering I now actually remember how to use things like netcat and tcpdump to debug transport-layer issues at a moment's notice.


I maintain a file with commands that I don't use often (e.g. increase volume with ffmpeg, add a border to an image with convert, etc.). I even have a shortcut that'll add the last executed command to this file, and another shortcut to search the file.
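In bash, the pair of shortcuts can be as small as this (a sketch; the function names and the snippets file are made up):

    snip-save() { fc -ln -1 | sed 's/^[[:space:]]*//' >> ~/.snippets; }   # append last command
    snip-find() { grep -i --color=auto "$1" ~/.snippets; }                # search saved commands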


Oh, especially ffmpeg and imagemagick! I have a handful of incantations saved over time.

Just today I saved a new one for trimming borders on video screenshots https://xenodium.com/trimming-video-screenshots to https://github.com/xenodium/dwim-shell-command/blob/main/dwi... (that’s my cheat sheet).

I wrote an Emacs package that works fairly well for saving commands but also making them reusable from its file manager without the need to tweak input or output file paths https://github.com/xenodium/dwim-shell-command

While Emacs isn’t everyone’s cup of tea, I think the same concept can be applied elsewhere. Right click on file(s) from macOS Finder or Windows Explorer and apply any of those saved commands.

Edit: More examples…

- Stitching multiple images: https://xenodium.com/joining-images-from-the-comfort-of-dire...

- Batch apply on file selections: https://xenodium.com/emacs-dwim-shell-command


If you don't mind, it would be awesome to see your cheatsheet. I think this would be a great thing for people to share - like their dotfiles. But maybe they already do and I don't pay much attention to it because I'm lazy - like their dotfiles.


Oh, that's a great idea. I have a doc that I maintain by hand, either via ">>" or editing directly. Time to go and make a shortcut. Do you do any annotation to help with the search?


A large Justfile (https://just.systems/) of random recipes might be a way to make it both executable and searchable (at least on zsh, you can get an autocomplete list of completions from the command line).


Can you think of a single CLI tool that would let me commit a past incantation to a file and retrieve it later? Especially one that syncs well across devices.

The best I can think of is Atuin (https://github.com/atuinsh/atuin) but I wasn't super interested in using it - I kind of want something more lightweight.


I've been using a little set of bash funcs I called `hs` for this for a few years:

https://github.com/mikemccracken/hs

your snippets are stored in a git repo that you can sync around how you like.


Usually no annotations, as I typically search by command name. But sometimes I edit the file to add comments if there are many examples for the same command.


I like `fzf`'s default override of Ctrl+R backwards search for this purpose, along with the fish shell's really good built in autocompletion.

I've been thinking about updating the GIFs in my fzf tutorial to show off fish, but I think I'd rather leave them as-is, just so I don't dilute the pedagogical message.


If you’re already putting them in a file, you might as well put them in a shell script on $PATH: at a certain point I started writing shell scripts and little utilities for relatively infrequently used commands and other tasks (e.g. clone this repo from GitHub to a well-known location and cd to it)


Nobody asked but I'd like to chime in with my method.

I have a lot of aliases. For example, to start my QEMU VM with my development stuff in it, I make an alias for 'qemu-system-x86_64 [...]' with all the switches and devices and files required, called 'startvm'. I have another that takes me to my current project's folder and pulls the repo. And a third, 'newproject', that creates a new folder, creates a small set of folders and empty files with specific names, and finally makes a git repo in it. I am a serial abandoner of projects, so I use this more often than I care to admit.

It's not pretty, but functional; and since I always copy my dotfiles when I change computers, I've kept these small helpers with me for a while now.


Keeping a ~/bin directory with all your personal shell shortcut scripts has been my go-to for years. I tend to make a lot of project-specific shortcut scripts for anything that I want to remember/becomes a common task.


Wait, how do you “cd to it” from within the script? Doesn’t exiting the script take you back to where you were?


You're right that you often can't modify your current environment by invoking a shell script. That's because it's executed in a sub-shell.

For cases where you need to modify your current environment (setting environment variables, changing directories, etc), you need to run the script using the "source" built-in. That will execute the script in the current shell rather than a sub-shell.

So instead of

    ./some-script.sh
you'd run

    source some-script.sh
or use the dot (".") shorthand

    . some-script.sh
In cases where I need to source a script, I generally create an alias or a shell function for it. Otherwise I may forget to source it.


For this I use shell functions in my .zshrc and I wrote a loader to source a bunch of files in a .zsh.d directory.


Rolled your own direnv?


No, this is global: I use direnv too.

It’s more like “roll your own oh-my-zsh”


Pretty much the same, though I usually just keep the file open in a side terminal. I want to use stuff like cheat.sh (ex. curl cheat.sh/grep) but I never remember.


> I tended to forget the order of the arguments of the find command, and I would lose time trying to remember its syntax when I'm in front of a machine with no readily available internet connection.

The man pages are readily available.

The bash man page is huge and hairy, but comprehensive. I've found it pretty valuable to be familiar with the major sections and the visual shape of the text in the man page, so I can page through it quickly to locate the exact info I need. This is often faster than using an internet search engine.


> The man pages are readily available.

True, but I find the man pages not easy or quick to parse.


I'm not a fan of man pages, or any documentation that focuses on textual explanations rather than examples in code (looking at you, aws).

I recently found https://tldr.sh/ and find it more convenient. I ended up writing myself a VS Code extension to have a quick lookup at my fingertips, since at least 60% of the time I am looking at a terminal in VS Code.


Right. I don't think you're supposed to read them top-to-bottom.

Use `/` and search for the things of interest (keywords, arguments, options, etc...). Use n/N to quickly jump forward/back.


find has a particularly bad man page for finding things that way.


AFAIK there is a find replacement with sane defaults: https://github.com/sharkdp/fd , a lot of people I know love it.

However, I already have this in my muscle memory: find <where> -name '<what>' -type f(file)/d(directory)

Works in 90% of situations when searching for some file in the terminal, e.g.: find / -name 'stuff*'

The rest of the time is spent figuring out exec/xargs. :)

And once you master that, swap xargs for GNU parallel. I bet your machine has a ton of cores; don't let them sit idle. ;)
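The usual progression, roughly:

    find . -name '*.log' -exec gzip {} \;            # one gzip per file
    find . -name '*.log' -print0 | xargs -0 gzip     # batched arguments
    find . -name '*.log' -print0 | parallel -0 gzip  # GNU parallel: one job per core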


FWIW it's `-maxdepth 1` not `-depth 1`.

But yeah, you're right.


Dude, flashbacks. Aren’t you supposed to do a trigger warning or something first! ;)


You can grep the man page contents with the following command:

    man <command> | col -b | grep "search_string"


I've used a small bash function for this. Example: to search grep's manual for "lines", I type gm grep "lines":

    gm () { man "$1" | col -b | grep --color=always "$2"; }

I also have something similar for grepping a command's --help output:

    gh () { "$1" --help | grep --color=always "$2"; }

I usually try gh (grep help) first, and if I don't find what I'm looking for, I run gm (grep man).

It appears that I can also use tldr.


For find, I think the Info pages are (even) more comprehensive (info find), and you get more structural navigation.


Comprehensiveness is not a benefit if it's poorly searchable


It might be better to invest in something more general, like better docs/cheatsheets, so you don't depend on the internet but also don't have to memorize bad designs (since find wouldn't be the only one). That could be the bad old man pages converted to a text-editor-friendly format, or something better like tldr, or something like Dash.


We should stop using Bash, and use TypeScript instead.

Bash is terrible.


well this shell is just broken

  > ls -al
  <repl>.ts:4:1 - error TS2304: Cannot find name 'ls'.
  
  4 ls -al
    ~~
  <repl>.ts:4:5 - error TS2304: Cannot find name 'al'.
  
  4 ls -al
        ~~


Easy...

    $`ls -al`
or

    ls('al')
And, transpile shortcut strings on-the-fly to TS code:

    ls({
      all: true,
      oneEntryPerLine: true,
    })


I have no idea why, but I wanted to hate this article. Maybe jvns shows up on HN too often and I was in a bad mood. But this is a great article, and as someone with 20 years of development experience, I'd say it's about as true as any meta-level discussion of programming could be.

The selective vision thing is so true, both for `dig` and for `man` pages. I can't count the number of times I have run `man <cmd>` and just felt overwhelmed by the seemingly endless pages of configuration options and command-line flags. One tip for `man`: use the vim-style search triggered with `/`. For example, if I want to find how to output the line number of each match in grep and I can't remember how, I'll run `man grep`, then type `/line` and hit enter, and it will search for any occurrence of the word "line" in the man page. Next match is just `/<enter>`.

I'm also a bit sad to hear that Strange Loop is now finished? I only found them last year or so and it seemed like so many of the talks were exceptional quality.


> I'm also a bit sad to hear that Strange Loop is now finished? I only found them last year or so and it seemed like so many of the talks were exceptional quality.

You might want to watch Alex Miller's talk, which was uploaded recently:

https://www.youtube.com/watch?v=suv76aL0NrA

And yes, it's sad that it ended!

However, he made a very good case for why it's sometimes good for things to end. If you watch the whole talk, it all makes sense.


Subtitles courtesy of DownSub (poor formatting my own). https://pastebin.com/isDPeBQd



There's also tldr: https://github.com/tldr-pages/tldr

It lets you check the most commonly used options from your terminal, for example "tldr badblocks".


This looks like https://tldr.sh in a browser.


curl cheat.sh/awk # is the intended behavior, and it frequently goes into way more detail than tldr does. Both great tools!


Thanks, didn't know about the curl behavior. Nice!


> Next match is just `/<enter>`.

You can also press `n`


I really disagree strongly with the take on bash. The best solution is not to add tooling on top of bash or memorize its idiosyncrasies. It is to not use bash. That is the only way to escape its pitfalls.


I have yet to find a proper replacement for bash. Especially for scripts.

The two most common alternatives are 1) using one of the newer shells people have created, like Oil shell [0], or 2) using a programming language like Python, JavaScript, or PHP.

The problem with using a newer shell is that you'll have to install the new shell anywhere you want to use the script. Meanwhile bash is ubiquitous. Unless you're the only one maintaining the script, you're requiring others to learn the other shell to maintain the script.

The problem with using another programming language is that they rarely have good ergonomics for doing what bash does: stringing together commands, command input, command output, and files. If you try to do that in another programming language, things suddenly get a lot more complicated, or at least more verbose.

So I still use bash, but I recognize that its strength is in running other commands and dealing with I/O. If I'm doing complicated logic that doesn't involve that, then I'll offload my work to another language. Sometimes that just means calling a Python script from bash, not avoiding bash completely.

If people have found that they work better taking other approaches, then please share them.

[0] https://www.oilshell.org


I'll still write bash, but only if it's very trivial (~ <= 10-ish lines, no super-complex conditionals, etc). For anything else, it's worth stepping up to just about any more-robust language. If you don't like the hoops that something like Python makes you go through for basic shell-like operations, maybe try Perl? For middling-complexity scripts, perl can be a nice win over bash, without all the development overhead of Python. Perl5 is pretty much installed everywhere and universally compatible for anything but the very latest language features.


I usually resort to running inline scripts using python3 or node (using only the standard library) if I need to do something that I can't express easily in bash. That means avoiding stuff that I'd have to look up and wouldn't remember or be able to debug in a month.


I like the idea of running inline scripts. This means that all the logic is in a single file so you don't have multiple files to copy around (e.g. a .sh file and a .py file).

I've done this with awk and jq, but haven't done it with node or python. But why not? It sounds like a good approach for some cases.
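The inline pattern can stay in one file; e.g. with python3 and a heredoc (a sketch):

    input="hello world"
    result=$(python3 - "$input" <<'EOF'
    import sys, json
    print(json.dumps({"arg": sys.argv[1]}))
    EOF
    )
    echo "$result"    # {"arg": "hello world"}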


I love bash for anything that doesn't need to scale. The reason is reliability. I can't tell you how many times I've run into problems with Python or similar in small library-based implementations. If I need to make an HTTP request and I'm using curl in bash and something fails, I can pretty much guarantee the problem is not curl itself: it's battle-tested and deterministic. But the Python script that forgets a try/except or doesn't configure TLS auth properly, or hits some other quirk? Those have eaten hours or days of my time, having to resort to tcpdump to sort out what's really happening. Same with SQL libraries/ORMs vs. the SQL CLI, same with JSON/XML parsing libraries vs. jq, etc.


> The problem with using another programming language is that they rarely have good ergonomics for doing what bash does: stringing together commands, command input, command output, and files. If you try to do that in another programming language, things suddenly get a lot more complicate or at least more verbose.

tclsh is one way around that. Use a real first-class programming language, but in a mode where calling programs is as easy as it is in shell.


I think this is a valid point. bash is an overly complicated tool...so I'll write another tool on top of that (with none of the decades of debugging that bash itself has undergone) to make bash...LESS complex?

The problem is with bash itself.

We tend to undervalue ease of use and overvalue "cleverness."

Case in point: git. Very clever tool. Ease of use: terrible. But Linus wrote it and Linus is clever, so it must be us that's the problem.

We get what we value. Let's value ease of use more.


Yes. I'm a huge fan of shellcheck, and I'm someone who has actually used bash and knows it deeply. But no amount of linters or other tooling on top of bash can fix it.

Best solution is to just stay away.

For real. Just stop. Don't try to be macho. The whole model of the language is fundamentally broken. I mean: stringly typed, global mode switches, one-character flags for fundamental comparison operators, defaulting to ignoring errors at every corner you look, and functions especially. Each such idiosyncrasy on its own would be enough to dismiss a language; bash has them all, plus more.


It has idiosyncrasies because it's not a general-purpose language. Even the things that she mentions happen for good reasons - like the fact that set -e would break the expected behavior of || and &&.

Actually, what language does crash when a function returns false? I mean, some throw exceptions, but isn't "false" a valid thing to return?

I find the same thing with makefiles - people don't understand what they're doing and expect them to work a certain way because they haven't ever thought about build systems very deeply. Recursive assignment in Make catches almost everyone out, e.g.:

    FLAGS=-b
    COMPILE=compile $(FLAGS)
    $(info compile command=$(COMPILE))
    FLAGS=-a

    myfile:
            echo $(COMPILE) $? -o $@

outputs:

    t43562@rhodes:~ make -f t.mk
    compile command=compile -b
    echo compile -a -o myfile
    compile -a -o myfile

Despite this, making all assignments immediate to match other programming languages would take a VERY useful tool away. The more you understand these tools, the more you know where to bother using them and how much effort to put in.
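For reference, GNU make's simply-expanded assignment gives the behavior most people expect:

    FLAGS := -b
    COMPILE := compile $(FLAGS)   # expanded right here, right now
    FLAGS := -a                   # no longer affects COMPILE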


You can get code formatting by indenting it all with 2 spaces.


That's easier said than done, and completely getting rid of bash may often not be worth the time and effort. But in general I agree: anything remotely complex I try to offload into scripts written in less idiosyncratic languages. Having some tools to help avoid mistakes in the last 5% that stays bash is super helpful.


Many have tried, nobody succeeded.

Bash scripts are extremely useful and productive for their niche. You'd have to change a whole lot of things to match that.


a part of me agrees with the aim of what you say: why not start afresh with something that is less prone to accidents? and i do agree with that idea. but i think it is also somewhat impractical to ignore that pretty much every server i've ever interacted with has a default, vanilla, bash installed, and if i know how to use that, it helps me get by.

not to say we can't and shouldn't try to do better.


So, you never use bash?

That's not a very practical hill to die on. But well, you get to decide what fight you engage on.


Indeed, especially now, when you have so many great alternatives; use Python as your shell if you like.

Many bad designs are too entrenched to be fixed with some tooling


If you think this way, you will quickly stop using anything. Most computer things are pretty complex and have really intricate oddities.


No, you won't. There are numerous shells that are better. The path to improvement lies in being able to call drek by its name.


Well said! I'm in manufacturing, not web tech, so the hard things I work with are very different (RS-274 G-code, servo motion systems, and PLC/SCADA network connections are a few examples), but it's interesting to see the overlap in the causes of difficulty, like trivia, gotchas, frequency of use, and visibility. And it's disheartening that the web tech community is so open and friendly and communicative, while controls engineers rarely hang out on forums and rarely share anything even close to this outside of a $2,000 training course.

And not to nitpick (I don't have Julia's email), but when struggling with hard things like DNS I'd hate to have someone bounce off this little roadblock: her link to the demo DNS exploration website points to the .com when it's actually at the .net TLD:

    <a href="https://messwithdns.com">messwithdns.net</a>
                                 ^^^    =/=       ^^^
Looks like someone needs an HTML linter... :)

The correct link - https://messwithdns.net/ - is actually pretty neat!


fixed, thanks!


The point Julia makes about `grep` resonates with me a lot! I have the same issue (call it a problem if you want) with `ps`. There is only one variation of ps I know (`ps aux`), and if I want to change that I have to either look for the options in the man page or google it.


ps is bad in general: the default view is almost never what you want, and lots of minor formatting options make for a complicated man page. But it is especially a shitshow on Linux.

fsf: which ps options should we use (BSD, System V, Solaris, SGI)?

also fsf: well, that's a tricky one... why not all of them?

The Linux ps man page is a wild mess.


Oh, that's a nice one, thanks for mentioning it. I'm personally used to "ps afx", but the "u" does give some quite useful info... and it's compatible with "f"! So I guess I'll add "ps aufx" to my repertoire.


faux is easier to remember


On a tangent, Strange Loop seems to have had some of the best presentations I've seen over the years.

It's really unfortunate that this will be the last one.


What?? What happened


Alex Miller (the organizer) decided to stop. Probably best to let him explain in his own words: https://www.youtube.com/watch?v=suv76aL0NrA


What an incredible talk, too! Thank you so much for sharing! So glad I scrolled down this far haha


These are the use cases where generative AI is great. I want to convert an mp4 to a mov and resize it to 720p with ffmpeg: give me the command. I want a bash script that filters out the third column of a CSV and converts the result to JSON.

These are not hard, but I won’t remember ffmpeg flags because I only use it once per year.


> people share things by sharing a 'best practices (guide)' but I always love to hear the story. Every time someone has a strong opinion like "no one should ever use bash", I'm like "tell me.. what did bash do you to, I need to know!"

So true!


This is a talk turned into a web page done right. It seems like it should be a simple thing to do, but often the results are confusing and hard to read. Not here, well done!


I'd love to know if jvns has a library or something she uses for this. I've seen this kind of slide thingy on https://boringtechnology.club/ too, and I'm really curious!


Another lovely example of this format is https://idlewords.com/talks/.


Shout out to https://explainshell.com/ for being a great resource to understand cli tools and options


Shout out to you for finding this for me


> Everything is in the same order as you write it, except SELECT is fifth.

been using SQL for years and didn't stop to think about that...


I don't think her conclusion is correct in this case (that it's all about chronological order). It's not about what happens first but about levels of abstraction. It doesn't make sense to decide on the number of doors before specifying whether we're building a LEGO car or a skyscraper in Manhattan.

The reasonable approach is to start from high-level concepts and only then deal with the details - without specifying high-level concepts the details have no particular meaning.

As a side note I have to say that I definitely prefer languages with `object.function` rather than `function(object)`, precisely because of this. Another example: `if(foo == 5)`, not `if(5 == foo)`


I both agree and disagree with her sentiments here. Not in a right/wrong sense, but just what I found works for me.

I agree with having helpers to understand how tooling works. Any resource that increases understanding of how to use a tool to its full capability is a productivity benefit. (Hat-tip to anything that unpacks all the `curl` switches.)

Here's where I have disagreement: an old boss of mine once said "don't worry about the tricks of the trade; learn the trade." That's very contextual, but sometimes understanding the core first makes the rest of it easy. And understanding the core takes both effort and increases cognitive load, so I understand why one might not go that route.

So, for me, I always try to keep a balance between trying to back my way into execution via those helpers, and recognizing when I need to take a step back and learn at a bit more core level.

Her down-to-earth approach is refreshing, that's for sure.


> Here's where I have disagreement: an old boss of mine once said "don't worry about the tricks of the trade; learn the trade." That's very contextual, but sometimes understanding the core first makes the rest of it easy.

Seems like you might agree more than you realize! From TFA:

> And much like when debugging a computer program, when you have a bug, you want to understand why the bug is happening if you're gonna fix it.

It sounds a lot like "don't worry about how to fix the bug; learn what the bug is."


I see jvns.ca, I click. I am a simple person. I know that it'll probably be the best thing on HN that day.


Every time a jvns.ca article goes popular on HN, I have to relive the trauma of my Stripe interview with Julia, where I just could not get that simple test to pass despite having years of programming experience. Ugh! She was nice though :)


What was the test?


And that is why I like the book Accelerated C++. Instead of explaining each and every thing, it teaches just the right things, preparing you for further adventures. Other books that come close are Designing Data-Intensive Applications and SICP.


I recently wrote "is" so that I would have a tool to make shell scripting just a little bit easier: https://github.com/oalders/is


Thanks for this. It is similar to how I feel these days; I am currently learning Vue.js and working on 3 different components at the same time (I should probably have focused on a single component, but the boss kept finding things that needed improvements).

I had also never worked with Vue before. At one point I found myself utterly confused: opening up a file to do something and instantly forgetting what I was doing, having to go back and re-read code. I was super tired after a near all-nighter at this point, and felt like I had sleep-deprivation-induced dementia or something. In occasional brief moments, I was totally unable to focus or remember what I was doing. Very interesting feeling, actually.

When you have not worked with something before and you are on a deadline, it is just a pretty awful experience. Vue, however, is supposed to be simple and make things easier for the developer. Well, not if you come from a back-end PHP background with only moderate JavaScript experience, and there is literally no documentation on how things work in the CMS I am currently working with. I have to read existing code and try to replicate things. It is nasty beyond nasty!


I enjoy the content, and I feel pretty bad saying this, but while her constant laughing starts out cute, it ends up being like when I'm sober and talking to someone high who can't stop laughing at everything even when it's not funny. Maybe it's a nervous thing? It's part of what makes her unique, I know, but I enjoyed the parts where she talks somewhat more seriously about something and it doesn't sound like she's hearing a joke I'm not getting.


On a side note, I use GPT-4 to help me create simple Bash scripts. I haven't run into a problem yet, but I double/triple-check they won't do anything strange, and most of the time it works quite well.

I don't understand the script, but this is only for personal use. I wouldn't do the same in a work setting (obviously).

And in this specific example, it's been quite a powerful tool to make many parts of my life easier.


> So, bash is a programming language, right?

No, it's a shell that lets you interact with the OS. All the "classical" UNIX shells were created with interactive use in mind and optimized for it, but they happen to be good enough at batch processing [1] to be conflated with programming languages (mostly these days, like, 40+ years later).

The focus on interactive use is the reason you don't have to

- surround command argument lists with parentheses

- put commas between command arguments

- put semicolons at EOL

- surround string literals with quotes (unless the content interferes with the syntax, like, sigils or IFS chars, and you want to escape it).

- surround variable names in braces (unless you want to use some advanced substitution)

- call commands to manipulate I/O; it's baked in.

And these are just the things that existed 40 years ago. Bash gives you so much more stuff that makes interactive use a bliss, but I'm not going to dump the man page here.

[1]: https://en.wikipedia.org/wiki/Batch_processing


In practice, it's also a programming language :)


Emphasis on "also".

If you optimize a shell feature for general-purpose programming, interactivity will suffer (increased verbosity, usually), and vice versa.

If we try to bring shells closer to real PLs, instead of just using those PLs when it matters, we'll lose some of the things that made shells attractive in the first place.


that was fantastic. i didn't watch the talk, i opened up the transcript and thought to myself i would read a few slides and see if i enjoyed it. that was maybe 20-30 mins ago and i finished it all. really well done, i enjoyed it from start to finish.

i like the way she broke it down. i feel like the author addressed ways to learn and communicate effectively and also grounded it in very concrete terms. i don't want to get too much into psychology or anything which i really am not qualified to talk about but i feel like going through these types of ways of learning and communicating is a continued exercise in ego deflation and in pragmatic problem solving.

i liked the SQL order-of-query-operations thing, the bash and shellcheck thoughts (i will use `-o all` from now on), i am curious to play with the DNS tool (the article links to https://messwithdns.com/ but the actual URL - taken from the text - is https://messwithdns.net/ , FYI), and i enjoyed the part about HTTP (i was hoping she would talk about SPDY and whatnot, i have not even begun to explore such things and am curious what that is all about).

i am going to read the two posts linked the behind-the-scenes on "hello, world".

thanks for the link!

edit: two things this made me think of:

1. the XKCD comic "ten thousand": https://xkcd.com/1053/

2. Mark Russinovich on git: https://twitter.com/markrussinovich/status/15784512452490526...


I think the problem with making hard things easy is that we're conflating tools and problems.

Are these hard problems? Maybe. Some HTTP problems are hard, and some are not. Some SQL problems are hard, and some are not.

Are tools intrinsically difficult? That's an even more complicated question because it is completely context dependent. A better question might be, does this particular tool make it harder to solve this particular problem than it otherwise might be?

And that's where you run into trouble. These tools have been designed to do a LOT -- to solve many different problems, both easy and hard. If you just "show everything", you may make the easy problem much harder to solve.

At the end of the day, some tools make some things easy (or possible!) to do in a way that they otherwise wouldn't. And we should appreciate that. There will always be hard problems, and for the most common ones, there are probably some better tools waiting to be written.


"Tools we use to reduce cognitive load"

The essence of the software literacy revolution

"Share Tools we use to reduce cognitive load"

The essence of the FOSS revolution

Just one line and it was like a road to damascus


shellcheck is a great idea. But you know what’d be even better? Not having to write bash in the first place.

Julia is right: computers are great at remembering trivia. One of the best demonstrations of this is compilers: they take in a high-level abstract description of a program and use a vast repertoire of knowledge about instruction sets to transform it into a program executable on a machine that's much more awful to program for directly, with many more gotchas.

(Here’s where you say “Hey, wait a minute.”)

Why are we writing bash, when we could be writing programs in a higher-level language that compiles to bash? One with exceptions, at the very least.

(Yes, such a language would need a runtime that wraps every common binary and lifts its stdout + stderr + exit code -based semantics into safe, high level semantics. So what? That’s only 10000 “FFI bindings” or so for anything a normal person or sysadmin would want to call. For everything else, just ensure that writing your own FFI binding, inline to a script, is easy.)


Julia Evans is amazing. So likeable, such a great teacher, so much enthusiasm and love for her subject and so much wisdom. An example to us all.


The problem is, people love making easy things harder.


What an awesome, entertaining, and informative talk!! The pure joy of the talk makes the information in it seem way more digestible! Amazing.


I do not agree that education is the way to make hard things easy. Making hard things easy means putting in more work and actually turning hard things into easy things. I understand that maybe no one has the power to change things as big as DNS or HTTPS or Bash or JavaScript now, but everyone can contribute to some little thing.


This was a great read. I especially liked how you can read the presentation below with the slides if you'd rather do that than watch the video. No issues with video, but sometimes I'm in the mood to read, and this was very satisfying to be able to do here.


This is why the Dummies and Idiot's Guide series of books, written in layman's terms, are so popular.

If more people realized how great the chat AIs are for this purpose, they would be even more popular despite the hallucinations.


Is it better to `set -e`, or to check the return codes of processes and decide when to exit?


Better/worse is one of those subjective things that comes with experience. No one can definitively answer such a question in a way that applies to every circumstance.

But as someone who writes shell scripts rarely, yet just often enough to be dangerous... I use `set -xe` at the top of every single shell script I write. The -x flag causes the script to echo every command executed to stdout, which can be very valuable when debugging a shell script running in an external environment (like a CI). I've seen this habit suggested many times and it has served me well.


Mine's `set -eEuo pipefail`.

I'd have to look up exactly what's what, but if you're inclined to `-e`, you probably want the others too. (Off the top of my head: E applies the same thing in functions, u bails out if a variable is used without being defined (instead of treating it as an empty string), and pipefail is similar for when you pipe into something else, like `this_errs | grep something`.)

I don't like `-x` personally, I find it way too verbose; more confusing than helpful.
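For reference, the whole header with what each flag does:

    #!/usr/bin/env bash
    set -eEuo pipefail
    # -e           exit when a command fails (with the usual &&/||/if exceptions)
    # -E           make ERR traps fire inside functions and subshells too
    # -u           treat expansion of an unset variable as an error
    # -o pipefail  a pipeline fails if any stage fails, not just the last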


Yeah, `-x` can be annoying if it's always on.

My suggestion would be to run your script like this `bash -x your_script.sh` when (and only when) you need to debug something.


I wonder if the speaker manages to keep her good personality and optimism when shit hits the fan and customers get angry (it's a sincere question, I'm not being snarky).


I wonder if tools like shellcheck are outperformed by ChatGPT?


Excellent talk. I sent it to a few folks I've been helping to understand things. Very fun and engaging, with a few iconic slides.


"Why was this hard" is a great question to use to become a senior engineer.


After reading the doc, I started to realise how hard things really were in the pre-GPT era.


This is an AMAZING article and you should feel very proud of making it


I expected there to be an abstracted, general, repeatable tldr that can be reapplied as a mental model. I haven't digested the whole thing, but after skimming, I can't identify what it is.


it's kind of more like the beginning of a pattern language for why things seem hard to novices but simple to experts, and ways to mitigate those effects - a different mitigation is appropriate for each pattern


She illustrates several (not one) general techniques that can be applied in different situations.


Screenshots with a summary are a great way to let someone skim a talk before hitting play.


and impossible things difficult (hence no longer impossible)!


Well, I'm going to disagree with her solution to bash. Bash is terrible, end of story. Just `*.txt` globbing alone is an endless pit of traps, badness, and wrong behaviour.

I've not written a bash script in years. I always use Python, and in my professional experience most people automating things nowadays use Python. I only use bash to write a one-liner that invokes the underlying Python script. For the record, I do the same for batch files on Windows, with a simple .bat file invoking the underlying Python script.
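I.e. the entire shell side is a thin launcher (the script name here is illustrative):

    #!/usr/bin/env bash
    # one-liner wrapper: all real logic lives in the Python script beside it
    exec python3 "$(dirname "$0")/automate.py" "$@"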

Trying to improve the bash experience with tools and knowledge is throwing bad time at a problem that has a well-known solution.

Edit: to be clear, the presentation itself is awesome. I'm just disagreeing that bash needs to be better explained. I'm ready to admit that there are existing scripts out there, most of which probably only work in the trivial, normal case and are just one snag away from exploding, and having a known source of help can help. But please, take your bash scripts to the shed and upgrade them to Python.


Warp.dev solves this


A paid LLM solves this?






