Be careful of the examples you use. They stick (thinkst.com)
199 points by mh_ on Aug 21, 2023 | 128 comments



> It’s important to realise this isn’t a customer-side issue; they shouldn’t have to consider the impact of every configuration option we choose to put in front of them. They don’t have the full context and knowledge, and expecting them to be experts in the nitty gritty of Canarytoken discoverability

Yes!

> Going forward, we will show multiple examples of prefixes. A user looking to add a custom domain will see a variety of example zones when they visit the page, and the examples will cycle each time they open the configuration page. We want to convey that they have options in choosing the name, and we show them a variety of sample options. Our hope is that this will prompt customers to pick their own names, and if they do rely on our examples then those are now spread over a large list of examples.

No! They were so close and yet it sounds like they've still missed the point. The issue is that users don't understand the "why" behind the prefix. Just randomizing the prefix that they're shown does nothing to change that.

IMO, a better solution would be: 1. The shortest possible explanation under the field of why you shouldn't use "someprefix". 2. Prevent users from using "someprefix" as the prefix and show them the warning again. By eliminating the default option as an option, you force your users to leave auto-pilot mode and actually consider their choice.
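A minimal sketch of point 2 in Python, assuming the check runs wherever the prefix is submitted. The names `EXAMPLE_PREFIXES` and `validate_prefix` and the warning text are hypothetical, not taken from the product being discussed.

```python
from typing import Optional

# Hypothetical set of prefixes that the UI or docs show as examples.
EXAMPLE_PREFIXES = {"someprefix"}

WARNING = "'{prefix}' is only an example. Please choose a prefix of your own."


def validate_prefix(prefix: str) -> Optional[str]:
    """Return a warning if the prefix is a documented example, else None."""
    normalized = prefix.strip().lower()
    if normalized in EXAMPLE_PREFIXES:
        return WARNING.format(prefix=normalized)
    return None
```

A form handler could show the returned message next to the field and refuse to save until it comes back as `None`.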


> 2. Prevent users from using "someprefix" as the prefix and show them the warning again.

Don't do this - examples in documentation should be valid. Having an example that doesn't work when the user tries it out will just lead to frustration.


I ran a blogging SaaS platform for a while. I had some instructions for configuring a reverse proxy to serve the blog. There were instructions for most major web servers. One step was adding a custom header.

- Go to <link> and get your publisher ID
- Add the following line to your config (replacing xxxx-xxxxx-xxxxx with your code from the previous step):

AddHeader X-Publisher-Id: xxxx-xxxxx-xxxxx;

A bunch of people left the x's in and were confused why it wasn't working. So we made a blog that explained the misconfiguration, and replaced xxxx-xxxxx-xxxxx in the documentation with that blog's ID. We got far fewer support requests after that.


I strongly disagree.

First, I've come across plenty of documentation that has commands that you can't just copy paste into your terminal. As long as the parts a user needs to fill in are clearly marked and explained, I don't see an issue. Especially in a case like this where there isn't a clear "right" answer and what works for one user may not make sense for another.

Second, I feel like there should be some sibling of Hyrum's Law (https://www.hyrumslaw.com/) that says that users will eventually do everything you tell them not to do.

If you don't want users to do something, then you need to protect them from themselves and explicitly prevent it. Just saying "don't do this" and expecting users to listen isn't going to work.


There are plenty of scenarios where that is not desirable. Perhaps most famous is IP addresses: https://www.rfc-editor.org/rfc/rfc5737.html
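As a rough illustration, the three IPv4 blocks that RFC 5737 sets aside for documentation can be checked with Python's standard `ipaddress` module. The function name `is_documentation_address` is made up for this sketch.

```python
import ipaddress

# The three IPv4 blocks that RFC 5737 reserves for documentation.
DOCUMENTATION_NETS = [
    ipaddress.ip_network("192.0.2.0/24"),     # TEST-NET-1
    ipaddress.ip_network("198.51.100.0/24"),  # TEST-NET-2
    ipaddress.ip_network("203.0.113.0/24"),   # TEST-NET-3
]


def is_documentation_address(addr: str) -> bool:
    """True if addr falls inside one of the RFC 5737 documentation blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in DOCUMENTATION_NETS)
```

A docs linter could use a check like this to flag examples that point at real, routable addresses.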


Disagree. Think of any example that includes the configuration of a secret like an API key. Typically you see something like "<API key>" or "xxx-your-api-key-xxx" that signals to the user that they need to input a real value rather than leaving the default, usually coupled with a nearby note describing how/where to get said key.

The example here is of course slightly different, but I think a similar pattern could be applied.


This gets fun when the example is something like instructions for wiping your disk to install a new OS. If you use an invalid disk name, they get errors. If you use a valid disk name, they wipe the disk on the wrong computer.


> Prevent users from using "someprefix" as the prefix and show them the warning again.

Yes. That's why https://example.com exists.


Reminds me of a large company I worked at. I had some documents for developers on how to set up some local environment thing they had to do exactly 1 time and never again. It was just a handful of terminal commands, all starting with the traditional shell notation like:

$ (some command)

Over the course of a year I got periodic complaints that it "wasn't working" and I tried to find issues on my end and couldn't. One particularly vocal dev came to me directly and insisted it was broken, so I went on a shared session with him, it turns out they were pasting the "$" into the terminal causing it to say: "$: command not found."

That was the source of all the complaints, once I removed it, they stopped.


Another trouble source I've had has been fixed by some wikis. When I first ran into one that supported this about a dozen years ago it made my life a lot easier.

And that is having a function that injects the current user's username into the wiki page. Any instructions that require you to log in as you can then be blindly cut and pasted, without people coming and telling me "it doesn't work" when they try to log in as me. Yeah of course that doesn't work, mate.

People reading runbooks aren't fully engaged. They may actually be in a war room. The moral of this story is that you are not entitled to carve out a chunk of your coworker's attention. Those periods should be brief, and tied to an active initiative/epic. After that 'work' is done, everything you wrote should be something that they can operate at 2:00 am.

Because sooner or later, they will.


Is there a reason websites put that $ in front of commands you are fully expected to mindlessly copy and paste? I've seen it happen more and more and it simply baffles me.


I think it is to separate command line input from output -

$ echo "Hello!"

Hello!

The $ denotes that this is a terminal command, and anything that is not preceded by $ will usually be some kind of output. That's certainly how/why I was using it.
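To make the convention concrete, here is a small Python sketch that pulls only the input lines out of a transcript written this way. It assumes every command line starts with "$ " or "# " and that everything else is output; a "# " line that is really a comment would be misread. `extract_commands` is a hypothetical helper name.

```python
from typing import List


def extract_commands(transcript: str) -> List[str]:
    """Return only the command lines from a shell transcript.

    Lines starting with "$ " (normal user) or "# " (root) are treated as
    input; everything else is assumed to be output and is dropped.
    """
    commands = []
    for line in transcript.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(("$ ", "# ")):
            commands.append(stripped[2:])
    return commands
```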


On top of that, I find it useful to know if a command should be run as a normal user ($) and root/administrator (#).


Docker does this properly; they show the character but it's nonselectable so you can still copy paste directly without spending time removing the $. eg https://docs.docker.com/engine/install/ubuntu/


> you are fully expected to mindlessly copy and paste

I think this is the disconnect. Don't mindlessly copy and paste things. Take the time to understand what's going on.


Even when I understand, copypasting is often faster and skipping or removing the $ is slightly annoying.


My favorite are the ones where the command is in a width limited element that needs to scroll so you can't see the full command to evaluate if you want to copy it or not. If the overflow has been properly set to scroll, it's doable, but I've seen the odd site where the overflow was hidden and you had to copy&paste it somewhere else just to see if you wanted to paste it in an actual terminal


I try to do that, but there are a lot of times where I find myself thinking "I don't care, I just want this to work" and blindly copy anything that's not super fishy. Especially after nvidia updates.


“$” for non-root shells and “#” for root shells instead of writing out sudo/su.


This is exactly why. If you are doing linux maintenance, this is really helpful because you can tell if you should be running a specific command as root or user. If you are just giving instructions on what to do for a normal user, omitting it is probably fine. If you are doing something that will require both, it might be good to explain it at the beginning of your instructions.


This goes back to pre-Web, originally to distinguish normal user (`$`) from `root` user (`#`), and sometimes from C-shell or some non-shell interactive line-oriented program.

I used to think it was dumb and annoying.

I partly changed my mind recently, when I found myself copy&pasting lots of quick wiki-like documentation into Markdown fenced code blocks that involved multiple hosts.

Just having the hostname in the prompt reduces the text I have to type in addition to the copy&paste, to explain where this command is happening, such as on the workstation vs. one of the servers.

(But it didn't quite help the other day, when I was documenting some self-hosted LLM procedures, and needed to be copy&pasting examples from multiple Screen terminals on the same host, where the prompt didn't distinguish them.)


> Is there a reason websites put that $ in front of commands you are fully expected to mindlessly copy and paste?

As well as the answers you've already been given (distinguishes input from output, distinguishes root from non-), one answer is contained in your own question - it prevents mindless copy-pasting.


Except it doesn't, removing $ has no positive effect on your mind


I'd claim that I'm technically correct - you _cannot_ mindlessly copy-paste `$-prefixed` code into a terminal and have it execute, because the `$` will result in a syntax error - but you are, indeed, correct to say that it is possible to _then_ mindlessly delete the `$` without thinking about the rest of the content. But it's another prompt, at least!


if you want to be that technical, then start with the fact that you can easily mindlessly copy&paste, it just won't execute, which is technically a separate thing

(another prompt is just an annoyance for no benefit)


As a generalist with many operating systems in use, I like that this tells you what OS the command is for. Especially now that powershell with its many unix-like aliases is getting popular, this isn't always evident.


> powershell with its many unix-like aliases

This choice simply baffles me. If I could just use those aliases and have it under the hood be PS calls- that's great! But that's not what's happening, instead it's a totally different command with a different syntax and different flags. Why make it an alias at all‽


Yeah, alright. As the former "PowerShell guy" for an office of 200+ people when I still preferred WSL and bash, I can definitely accept this answer.


> Is there a reason websites put that $ in front of commands

back in the mid-early days of computers, I'm pretty sure the unix prompt was a $ (and as has been pointed out, a # for the root user). Thus the custom of using it in documentation started because it's literally what people would see.


Because that's how it looks in their terminals, and it's a bit of a challenge to think through how users use something


Yes. $ means regular user, # means root user. Usually.


For a long time I had my prompt some version of:

:$! $CWD $HOSTNAME $;

: (exit status, current directory, hostname, $ or #, and a ;)

the idea being -- you could just cut and paste and it'd mostly just work (except megabozos who like to make directories named ; or whatnot)


I gave what my company calls a “lunch and learn” presentation once of some interesting tools. People liked it and shared my deck around which was cool. But then my quick/dirty examples started showing up in best practice (I loathe that term) decks shared to very large teams with my name at the bottom. A security guy, who I greatly respect, raised some questions and I had to go through the whole story with him and then find all references to my examples and fix them. It was pretty embarrassing.


I did a similar thing as part of a "lunch and learn". NodeJS + Express was super fresh and I did a small example app. When returning the user profile, I just queried the database and returned the entry displaying some properties on the frontend.

The team lead was like "show us the request in the console", and I opened it up and there was the non-encrypted password, createdAt date and basically all the not-needed properties.

I still cringe thinking about it.
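One common guard against that kind of leak is to copy out only an explicit allow-list of fields before responding. The sketch below is in Python rather than the Node/Express of the story, and the field names are assumptions chosen for illustration.

```python
# Fields that are safe to send to the browser; everything else stays server-side.
PUBLIC_FIELDS = ("id", "username", "display_name")


def public_profile(user_row: dict) -> dict:
    """Copy only explicitly allowed fields out of a database row."""
    return {key: user_row[key] for key in PUBLIC_FIELDS if key in user_row}
```

With an allow-list, a column added to the table later stays private until someone deliberately exposes it.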


No need to cringe, this is exactly what demos are. A hack to demonstrate functionality.


> I still cringe thinking about it.

Meh, as a sibling comment points out, it's a demo, it'll happen. And when it happens, it's a teachable moment (assuming you can keep your head together as your demo falls apart before your very eyes): "Ah, so as you can see it's important to set the $DO_NOT_DISPLAY_PLAIN_TEXT_CREDS environment var to 'true', otherwise you get this disaster! Hahaha...ha."

Or if you don't know at the time what's going on, "obviously I'm just getting started on this myself, and need to play with some configuration. Better make sure I do before any of this goes to production! Hahaha...ha."


hah what i did involved a db too, a string based query without sanitization introducing a possible sql injection. The input never came from a user and was sourced from a config file but i still should have known better. I learned that if you put it in a slide, no matter scope/purpose, it better be production quality because people are just going to copy/paste.
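For a slide-sized example that is still safe to copy, a bound parameter costs about one extra line. This is a sketch using Python's built-in `sqlite3`; the `users` table and the `find_user` helper are invented for the illustration.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str):
    """Look up a user with a bound parameter instead of string formatting."""
    # The "?" placeholder makes sqlite3 treat `name` strictly as data.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# Tiny in-memory database so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
```

A classic injection string such as `x' OR '1'='1` is then just a name that matches no rows.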


copy/paste/“improve”

the only thing scarier than seeing your personal “not for prod” code running in prod by someone else’s hand. the “improvements.”


I don’t think that’s so bad. It’s a demo, not an end product.


> It was pretty embarrassing.

Maybe, but you did learn a lot too. A follow-up that explains all those fixes might be even more interesting than original presentation, as far as I'm concerned.


> best practice (I loathe that term)

Me too :( One of my coworkers keeps talking about “best practice” all the time. And he writes the most broken garbage of anyone I’ve met in a long time.

I often feel like quitting, because I do not enjoy working with him. But I like the company I work for. And I don’t want to spend time trying to find a new job at the moment.


I've found that vague reasons like "it's best practice", "that's the way it should be done" & "that's not scrum" are usually thrown about by people who don't actually know why they're doing that, they just learned it and now feel strongly about it but can't actually back it up.


Can we please use `example.com` for an example domain name instead of like `somedomain.com`? A real domain in an example can lead to accounts being created with emails that someone can actually intercept.


There is also an entire TLD, .example so that you can put multiple names in a TLD and distinguish big-corp.example from my-local-store.example and it's clear that those aren't related, they just share a registry the same way as letsencrypt.org and wikipedia.org do


example.com is officially reserved in the spec for this use case. Is the .example TLD reserved as well?


.example, .invalid, .local, .localhost, .onion, and .test are all "special use" domains.

https://en.wikipedia.org/wiki/List_of_Internet_top-level_dom...

"ICANN/IANA has created some Special-Use domain names which are meant for special technical purposes. ICANN/IANA owns all of the Special-Use domain names."


It seems true, according to [0]

      ".test" is recommended for use in testing of current or new DNS related code.
      ".example" is recommended for use in documentation or as examples.
      ".invalid" is intended for use in online construction of domain names that are sure to be invalid and which it is obvious at a glance are invalid.
      The ".localhost" TLD has traditionally been statically defined in host DNS implementations as having an A record pointing to the loop back IP address and is reserved for such use.  Any other use would conflict with widely deployed code which assumes this use.

[0] https://datatracker.ietf.org/doc/html/rfc2606


Yes, here are the reserved TLDs [0]:

    test
    example
    invalid
    localhost
    local
    localdomain
    domain
    lan
    home
    host
    corp
0. https://www.ietf.org/archive/id/draft-chapin-rfc2606bis-00.h...


Be careful, that is a decade plus old expired draft of a proposed update to RFC 2606. The current version of the standard, including 6761 which updates it, does not reserve most of those.

https://datatracker.ietf.org/doc/html/rfc2606#page-2 https://datatracker.ietf.org/doc/html/rfc6761

You are probably safe using names like .lan and .corp but they are not currently protected by standard in the way example is.
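A small Python sketch of the distinction, covering only what RFC 2606 itself reserves: the four TLDs plus example.com, example.net and example.org. It deliberately leaves out names registered later under RFC 6761, and the helper name `is_rfc2606_reserved` is hypothetical.

```python
# Names reserved by RFC 2606 for testing and documentation.
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}
RESERVED_SECOND_LEVEL = {"example.com", "example.net", "example.org"}


def is_rfc2606_reserved(hostname: str) -> bool:
    """True if hostname is, or sits under, a name reserved by RFC 2606."""
    labels = hostname.lower().rstrip(".").split(".")
    if labels[-1] in RESERVED_TLDS:
        return True
    # Otherwise check the last two labels against the reserved second-level names.
    return ".".join(labels[-2:]) in RESERVED_SECOND_LEVEL
```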


Ah, that's a good point.


It is (in RFC 2606), along with .test, .invalid, and .localhost, for similar reasons [0].

[0] https://datatracker.ietf.org/doc/html/rfc2606



Another "can we please" is stop illustrating (or providing as defaults) insecure configuration examples.

A large percentage of people following the examples will use them verbatim.

Had a case recently where someone was setting up a postgres container and the example was something like:

    docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres

You guessed it, "mysecretpassword" ended up being the password. And the service predictably got compromised, because that example was one that was pretty high up in search results for "how to run postgres in docker" and attackers probing for port 5432 will try "mysecretpassword."

Another one is can we please stop posting code examples that are illustrating some language feature or how to do thing X in language Y, where the code has a footnote "this isn't secure, don't do this in production code." That footnote will be ignored. If you're going to bother answering a question, answer it with a proper implementation and not something that is a gaping code injection vulnerability.
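For the Postgres case above, one option is for the docs page or a setup script to generate the password rather than print a fixed one. A minimal Python sketch using the standard `secrets` module follows; `docker_run_example` is a hypothetical helper that only builds the command string and does not run Docker.

```python
import secrets


def docker_run_example(container_name: str = "some-postgres") -> str:
    """Build a docker run line with a freshly generated password."""
    password = secrets.token_urlsafe(24)  # 24 random bytes as URL-safe text
    return (
        f"docker run --name {container_name} "
        f"-e POSTGRES_PASSWORD={password} -d postgres"
    )
```

Anyone who copies the line verbatim then gets a password that nobody else was shown.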


That could be improved if:

1. Software is coded to recognize "stupid example values" and prohibit them with an error or at least throw some pretty obvious warnings.

2. Such software also has a rarely-used but documented option to bypass the above checks.

For example, the PostgreSQL server might refuse to allow any DB user to have a password which begins with "password", returning a "Pick A Real One" error.

However somebody somewhere will always need the "don't do that" option: The more different libraries and services you bring together, the more likely there will be either false-alarms or outright conflicts between different example-blocker schemes.
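The two points can be sketched together in Python: a check that rejects obvious example values, plus an explicit opt-out. This is not how PostgreSQL behaves today; the prefix list, `check_password` and the `allow_example_values` flag are all hypothetical.

```python
class ExampleValueError(ValueError):
    """Raised when a value looks like it was copied from documentation."""


# Hypothetical prefixes that mark a password as a copied example.
EXAMPLE_PASSWORD_PREFIXES = ("password", "mysecretpassword", "changeme")


def check_password(password: str, allow_example_values: bool = False) -> str:
    """Reject obvious example passwords unless the caller opts out."""
    looks_like_example = password.lower().startswith(EXAMPLE_PASSWORD_PREFIXES)
    if looks_like_example and not allow_example_values:
        raise ExampleValueError("Pick A Real One")
    return password
```

Keeping the bypass as a named argument means every use of it shows up in a code search.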


My pet peeve is insecure configuration defaults. The Postgres container by default does not require any authentication for localhost connections, and with containers "localhost" can mean unexpected things.


This is also an opportunity to think about the value of a piece of configuration. If an example configuration value works for 40% of users without modification, should that value even exist? Think Bash's `HISTCONTROL=erasedups`, which shouldn't be necessary to set in the 21st century. Or should it be auto-generated, like Docker's container names?

In the very best case, the defaults are so good that an empty configuration does what most people want. Think ripgrep, …, welp, I can't really think of many good examples. Browsers need extensions, Bash needs a decent prompt, even many pro cameras need to be configured to save raw images by default.


> Think Bash's `HISTCONTROL=erasedups`, which shouldn't be necessary to set in the 21st century

As in that it should be the default? Or only option? I personally do not have (and want to have) that set, and I am hardly alone. Any change in defaults fucks someone over, especially in things like bash, where you ssh into many machines with many different versions of bash...


Omg, I just read the documentation for erasedups and I'm actually shocked anyone would want that feature turned on, to the point where it never even would have occurred to me to implement it in the first place and if the feature worked like that without some way to turn it off I'd have been super angry :(. Maybe we are parsing that sentence wrong and by "shouldn't be necessary to set" the idea is "it is useful in situations where you have limited disk space / memory but in the 21st century no one would need to set this so it might as well no longer be supported"?


That option has nothing to do with limited disk space.

Eliminating the duplicates makes it much easier to search the history for a command that was used long ago instead of having to skip over hundreds of duplicates of some non-interesting command, such as "ls".

It also makes it much more likely that complex commands used a long time ago are still preserved. No matter how large you make your history file, it is much more likely that it will become filled with simple commands that you do not need to recall from history, instead of keeping the complex commands that you hate to retype.

Moreover, while having a command history is useful to avoid retyping some commands, some may be less willing to preserve a history from which it is easy to discover which have been their exact actions while using the computer, though for this it is preferable to also disable the saving of the history file.


    $ cat ~/.inputrc 
    "\e[A": history-search-backward
    "\e[B": history-search-forward
This means that I can type the first couple of letters of a command, and then use the up and down arrow keys to cycle through history to search the rest of it. It makes life simpler for me, and confuses me only on the occasions when I've used `git stash`, so `git status` doesn't appear when I expect it to.


fish shell does this by default, and also autocompletes commands from your history. I highly recommend it.


Ctrl-O is incredibly useful. Go back to an earlier command, press Ctrl-O repeatedly, and you re-run a series of commands. That doesn't work if some of them were deleted as duplicates.


Now I'm confused. Why would you want duplicate entries in your command history? I get that on an overloaded system in the 80s it might have taken a perceptible amount of time to filter out duplicates. But on a modern system you surely want it always enabled?


I think the people who leave it off see erasing duplicates as "rewriting history" -- e.g., if you typed [cmd1], [cmd1], [cmd2], and you erase duplicates, your history no longer reflects what you actually typed. The people who leave it on see the history as more like "a list of interesting things I did at some point", so erasing duplicates means it's easier to find the most interesting things.


Any search of the form "what other interesting commands did I need the last time I did <foobar>?" benefits from being able to see the historical context. Or if I know the particular incantation is one I wrote several times and is very similar to some others then it's easy to search the history for the blocks of popular commands. Or whatever. It's your terminal, but given how fuzzy the interconnects in a mind can be it's hard to know what info might be useful down the road.

To your point about computing power, I think it's only now that it's reasonable to not filter the duplicates. If I want to emulate the dedup behavior it's trivial nowadays to blindly read 1M lines of history and dedup on the fly each time. Using downstream tools doing duplicate filtering/transforming work is similarly very very fast. Disk is cheap enough that I really do want to keep every command I ever write in my history. More expensive compute and disk would make me more likely to turn that feature on, not less.


I don't use bash so it's not applicable to me anyway, but why would I want it on? It's saving a few bytes of disk space, at the cost of ruining my history (the thing that the whole feature is about). At least fish's search feature is so awesome that I never had a problem with duplicate commands causing problems.

And I want my history untouched, because sometimes I forget how I solved something, go back a few months in my shell history and see the sequence of commands that I've used. I think that's a nice thing to have. And if I ever need history without duplicates, it should be easy to deduplicate it with a simple script.
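Such a script could look roughly like this in Python. It keeps the most recent occurrence of each command and leaves the saved history alone; `erase_dups` is just an illustrative name, and reading the actual history file is left out.

```python
def erase_dups(history):
    """Keep only the most recent occurrence of each command, in order."""
    seen = set()
    kept = []
    # Walk from newest to oldest so the latest copy of a command wins.
    for command in reversed(history):
        if command not in seen:
            seen.add(command)
            kept.append(command)
    kept.reverse()
    return kept
```

Running it on a copy gives a deduplicated view for searching while the full sequence stays on disk.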


I use ignoreboth -- it's not about saving disk and memory, it's about making the history more useful for searching. The ignorespace also helps avoid getting credentials saved to disk you don't want saved and is even more useful these days than it was in the past because we're no longer on multiuser systems where you really want to keep credentials out of commandlines entirely.


I have the HIST_IGNORE_ALL_DUPS options set on zsh. I only need a history of what commands I have run, the order isn't important to me. Having the history be a unique set of commands makes it easier to dig through.


How do we combine evolution and development with backwards compatibility? I think it's quite natural that we end up with this conundrum. Like say for example Vim having outdated defaults, because changing them could disrupt existing users. A reboot/fork of it can reset and start fresh but will eventually in its own development hit the same problem.

Are there examples of projects who solve this well?

Ripgrep's author is also very careful about breaking changes - I think that means it will also one day have outdated defaults!


A reasonable compromise would be to allow changing defaults whenever there's a major version bump.

ripgrep seems to have a major version bump every 18 months or so, which seems a bit excessive. OTOH vim went through 9 versions in 30-something years which seems more reasonable. Although I think there was more churn early on. ISTR vim 6 being around for a long time.


Does vim follow semver? (vim predates semver...) So how do you know you're comparing apples-to-apples?

Look at the breaking changes in each ripgrep major release. I don't use major releases as a means of breaking popular workflows. I use major releases even when there are very small breaking changes with minimal impact.


> Does vim follow semver?

I don't think vim follows semver, although a lot of old unix software, and also Free Software, would use major version bumps to indicate compatibility breaks. Semver was in many ways the documentation of what a lot of software was already kind-of doing.

But vim was used as an example by the GP, so re-using it as an example to show what I thought was a sensible major-release schedule (instead of, say, coreutils, or glibc, or glib/gtk, or perl, or Qt, etc...) seemed appropriate.

> I don't use major releases as a means of breaking popular workflows. I use major releases even when there are very small breaking changes with minimal impact.

A breaking change is going to break someone's workflow. (See also, Hyrum's Law, xkcd 1172.)

As a user and developer, my preference is for breaking changes to be put off as long as possible, and then all applied together every few years or so. (Or never :-) That way, I don't normally have to pay that much attention to updates, even if an app gets new features. But when there's a major version bump, I can check the release notes carefully and know to keep an eye out for anything unusual/different.

If I only ever ran one program, it wouldn't be a big deal. But I don't, I run hundreds regularly. If they all have breaking changes once per year, that averages out to me needing to check over two sets of changes that might impact me every week.


Did you do what I asked and look at ripgrep's changelog?[1] The breaking changes are prominently advertised in each major release. Not all breaking changes are the same or have the same impact. Some major releases don't even have any breaking changes. (semver doesn't say to only do a major release when there's a breaking change. You can do a major release without breaking changes.)

I don't think you're correct about "old Unix software" using major version bumps to indicate compatibility breaks. Recent 3.x releases of GNU grep, for example, fucked around with the meaning of \d when using the -P flag. With no changes to the major version number. Did that break your scripts?

The thing about semver is that it tends to make breaking changes much more visible, which is kind of the point. And of course, when you compare it to projects that don't use semver and don't increment the major version for every breaking change, the projects using semver look like they're moving at a much faster pace. It might be true, but you can't conclude it by looking at version numbers when the projects aren't using the same versioning scheme.

[1]: https://github.com/BurntSushi/ripgrep/blob/master/CHANGELOG....


Another example of old timey Unix code just breaking things in minor point releases. See https://abi-laboratory.pro/index.php?view=changelog&l=glibc&... and https://github.com/intel/hyperscan/issues/359.


Open source projects are terrible at this, in general. Any feature ever introduced, no matter how ephemeral or small an audience, can't ever be removed. Which I suspect has contributed to most GNU tools being kitchen sinks rather than doing one thing well.


> In the very best case, the defaults are so good that an empty configuration does what most people want.

More generally, defaults (including, default examples), matter: https://news.ycombinator.com/item?id=25646180

Especially for software which is mod-able, malleable, composable, configurable... defaults have a disproportionate impact on UX, DevEx etc. A reason why TLS 1.3 got rid of a laundry list of options, or why WireGuard is simply a joy to work with (as opposed to IPsec / OpenVPN), or why middleboxes on the Internet are a big hurdle to protocol upgrades, or how NewCloud companies like Cloudflare, Replit, Flyio, and Vercel have devex beyond what the Big 3 can muster up.

> Think ripgrep ... welp, I can't really think of many good examples.

Apple has got this spot on, across decades. Their products "just work", as they say so in their own marketing.

> Browsers need extensions

Now you see why Chrome is defaulting to Manifest v3 ;)


> This is also an opportunity to think about the value of a piece of configuration. If an example configuration value works for 40% of users without modification, should that value even exist?

This sounds completely insane. If a majority of people need something different, they shouldn't be allowed to have it?


I read it slightly differently: those 60% can set one of the other values and the default value no longer exists but is simply the default so the 40% who previously had to set that value now don't.


Oh, absolutely. If you give people an example (and you should), the overwhelming majority will copy the example exactly and then only change what they are forced to change when it doesn’t work otherwise. Therefore, prepare your examples accordingly.


About 5 years ago I made a blog post detailing how to use Traefik/LE with PHP. For about one day, I realized I had my personal email in the template for the warning email for when the Lets Encrypt cert is expiring. I still get emails warning random people that their domain is going to expire.

Prepare your examples accordingly.


This sounds like a great way of getting some petty revenge while writing documentation.


This also applies to code. I come across the situation A LOT in OOP codebases that they have completely superfluous interfaces and/or abstract classes, that are made pointless by the fact that every single implementation extends some example implementation. The most egregious example I ever found was a Minecraft mod which had an interface, implemented by an abstract class, which had another abstract class that extended it, which was implemented by an example class.

Every single mod I could find (sample size of ~1000) just extended the example.


This may be an actual answer to the problem. If you show "someprefix" as an example, then give users a helpful error message if they actually type in "someprefix". Something like "Dear user, someprefix is only an example, please replace this with a name appropriate to your business."


No, you’ll only get “someprefix1”, and about 14 similar variants. What you need to do is either prepare a reasonable prefix valid for all (or at least most) users, or have the documentation to be dynamically prepared for each user, with a dynamically generated appropriate prefix for each user.
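A rough Python sketch of the per-user option, assuming the docs page knows an account identifier. The word list and `suggested_prefix` are invented for illustration. Because the result is derived from the ID it is stable across page loads, but also predictable to anyone who knows the ID, so a random value may be the better choice where guessability matters.

```python
import hashlib

# Hypothetical word list; a real one would be much longer.
WORDS = ["amber", "birch", "cedar", "delta", "ember", "fjord", "grove", "heron"]


def suggested_prefix(account_id: str) -> str:
    """Derive a stable, per-account example prefix from the account ID."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    first = WORDS[digest[0] % len(WORDS)]
    second = WORDS[digest[1] % len(WORDS)]
    # Two more digest bytes as hex make collisions between accounts less likely.
    return f"{first}-{second}-{digest[2:4].hex()}"
```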

Alternatively, you’ll have to teach users what kinds of prefixes would be appropriate, with an example which is obviosly not appropriate for any of your actual user. This will take some length of text to explain, and many users will not read it, and may instead abandon your service.

Relatedly, I’ve always disliked when programs force me to name N number of things without adequately explaining:

• What the names are (Is this some kind of group name? Instance name?)

• How the name will be shown. (Should I prefix the name with the company name myself, or will that always be visible? Will this name be shown together with numerous other names which are all UPPER CASE? Will the name be automatically converted to lower case?)

• If any of these names will be publicly visible.

• If any of the names can be changed later, and how hard it is.

• What characters are allowed (Are spaces, underscores, or dashes allowed? How about Unicode? Emojis? What is the normal naming scheme?)

• How long is the name allowed to be? (Will it be silently truncated at 8 or 16 characters (or grapheme clusters)?)


I'm wondering whether invalid characters in the examples given would be an acceptable solution. For DNS records this could be XML-like tags like <someprefix>.<yourdomain>.<tld>

On one hand, it prevents blind copy-pasting but on the other hand, your example is invalid.


For procedure documentation I often use ${service_name:?} for similar reasons. If the variable isn't set you get a clear error. And it provides an easy way to use the template without modification.

It isn't perfect, common variable names may already be set or have been set in a previous execution of this playbook on a different problem. But it catches common issues while being convenient.
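
In POSIX shells, `${var:?}` aborts with an error when `var` is unset or empty. For docs whose steps are rendered from a script rather than pasted into a shell, a rough Python analogue is `string.Template.substitute`, which raises `KeyError` for a missing placeholder. The step text and the `render_step` helper below are hypothetical:

```python
from string import Template

# Hypothetical procedure step; "service_name" must be supplied by the reader.
STEP = Template("systemctl restart ${service_name}")


def render_step(values: dict) -> str:
    """Fill in the step, failing loudly if a placeholder was not supplied."""
    try:
        return STEP.substitute(values)
    except KeyError as missing:
        # str(missing) is the placeholder name, already wrapped in quotes.
        raise ValueError(f"placeholder {missing} is not set") from None
```

Unlike `:?`, `substitute` accepts an empty string as a value, so an emptiness check would have to be added separately if that matters.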


Yeah I think this is the only good solution, really. Otherwise it's just often unclear which things are required to have a specific value, and which things can be replaced. In this example, I'm sure plenty of people thought it was possible that "someprefix" was required, rather than just an example.


An invalid example could confuse the customer and create unnecessary support calls.


A few years back, I recall reading about some automotive manufacturers who had just copied an example "airbag arming authorization" code/value that appeared in a shared spec document (IIRC) for their vehicles. There was a Metasploit module created (for the Hardware Bridge) that would send CAN bus messages to just check/verify if a particular vehicle uses this insecure arming code. For vehicles using this known code, an attacker with CAN bus access could deploy airbags on an unsuspecting target during vehicle operation. https://www.rapid7.com/blog/post/2017/12/22/metasploit-wrapu...


I've done some commercial software SDKs and this strikes me as the least surprising thing in the world. MOST programmers will copy and paste example code into production applications without really thinking about how well it fits into what they're doing.

The takeaway is similar to the article: think very, VERY hard about your examples and sample code. It doesn't just have to be correct and demonstrate the features, it also needs to be fairly robust so that customers don't hurt themselves with it.


An interesting thought is that the examples in your documentation don't necessarily need to be static and the same for everyone.

For example, if a user is logged in, you can autofill the appropriate accounts/domains/ids/etc to make the example work out of the box; and if some ID needs to be essentially random, then you can make it actually random when you generate the example.
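
As a sketch of the idea (in Python; the snippet text, the fallback domain and `render_example` are all invented for illustration), the docs page could render its example per reader like this:

```python
import secrets
from typing import Optional

# Hypothetical snippet shown in the docs; {domain} and {token} are filled per reader.
SNIPPET = "curl https://{domain}/hooks/{token}"


def render_example(account_domain: Optional[str] = None) -> str:
    """Render the docs snippet for one reader."""
    # Use the logged-in user's domain if known, else an obvious stand-in.
    domain = account_domain or "your-domain.example"
    # A fresh random ID on every render: 8 random bytes as 16 hex characters.
    token = secrets.token_hex(8)
    return SNIPPET.format(domain=domain, token=token)
```

Each page load then produces a snippet that works as pasted, and no two readers end up sharing the same ID.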


They don't have to be static, but making them dynamic might not be worth the cost.

From a simple static page, now you need an API service, most probably connected to a DB or somehow integrated to the rest of your backend. So markdown suddenly isn't enough and you need some server-side logic.

For random strings, you can do this with client-side logic, which in some cases might be easier than server-side logic. But you are still moving from no-logic (static) to somewhere-logic.


SwaggerUI/OpenAPI support this, adding your own users' API credentials to the API call examples, if you care to implement it.

As a developer, it's just a nice touch that I can copy-paste an example code snippet and, since I'm logged in, they can give it to me already with a valid API key.

ymmv

They also allow you to output examples in as many languages/SDKs as you need, too.


This unfortunately encourages users to hard-code secrets since the example snippet they get literally does it.
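
One way around that, sketched in Python: have the generated snippet read the key from the environment instead of embedding it. The variable name `EXAMPLE_API_KEY` and the helper `load_api_key` are assumptions for illustration, not part of SwaggerUI or any real SDK:

```python
import os


def load_api_key(env_var: str = "EXAMPLE_API_KEY") -> str:
    """Read the API key from the environment rather than from the source file."""
    key = os.environ.get(env_var, "")
    if not key:
        # Fail with an actionable message instead of sending an empty key.
        raise RuntimeError(f"Set {env_var} before running this example.")
    return key
```

The docs page can still show the logged-in user's key, but as a separate "export" step, so the code people paste into their repo never contains the secret.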


Why is "use more examples" the solution? If the users are copy pasting the code, why not just generate random strings thereby showing an example and also fulfilling their own requirement of non-identifiable strings?


A random string may look suspicious and the goal of this is to avoid suspicion


Very true. Good examples consume a lot of time. I was bitten a couple of times when the customer nailed me down with "But this example can never occur" and my futile attempt to justify "But it's an example!"

Good examples make documentation worthwhile to read.

Good defaults make an application worthwhile to use.


Wasn't the DMCA takedown of youtube-dl also caused by an example in documentation where they used a youtube link to some big name vevo artist?


The opposite is also true. It happens rarely, but I have been bitten by trying to configure something to be what I would like it to be, only to discover it had to be what was in the documentation for it to work, generally with nothing in the documentation itself to clarify.

Can't think of any examples now though I'm afraid.


> Can't think of any examples now though I'm afraid.

Dodged a meta-bullet there...


Heh. "You can totally put in any value here, as long as it's exactly this one"


"You can have any color car you want. As long as it's black."

-- Henry Ford


Completely off topic aside but I read somewhere recently that the reason for this was that black paint dried much faster than any other colour, which meant less time taking up space in the drying room, which meant more production capacity.

So basically it was black because that kept production costs down.


I don't think that's in opposition to the posted article, it's just a lack of documentation ("with nothing in the documentation itself to clarify").


This isn't quite the point of the article, but we allow people to apply for student discounts for our service, and provide the following example that we ask users to send to us over Intercom:

> Hello, could I please apply for the student discount?

>

> [PLEASE READ AND DELETE THIS – After sending this initial message, please attach a proof of your student status, such as a photo of your valid Student ID so we can process this quicker!]

I don't think any of the countless people that have asked for the discount have ever removed the "PLEASE REMOVE" part, and many don't bother to send the proof until we ask for it either.


I think you should just remove that example entirely then. It barely adds anything and as you said it is causing recurring issues.


I've run into a number of networks in my area (private businesses, a couple municipalities, a couple law enforcement agencies) all using the 192.9.1.0/24 subnet.

There was some overlap in these sites w/ respect to IT service companies involved in their setup. Best as I can guess it came down to one person who floated between the employ of a couple (or three) IT service companies leaving a swath of 192.9.1.0/24 in their wake (or maybe training other technicians during their time at these companies). It seems like this work might have been done pre-RFC 1597 (which is, I think, the first place that what is today's RFC 1918 address space shows up) but I think they were just following examples.

I'd love to know what examples motivated the use of this address space. I find some old Sun docs[0] referencing this address space, and RFC 2328[1] makes reference to it.

[0] http://bitsavers.informatik.uni-stuttgart.de/pdf/sun/sunos/3...

[1] https://www.ietf.org/rfc/rfc2328.txt
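
For anyone wanting to audit their own networks for this, here is a small Python sketch that checks whether a subnet falls inside the three RFC 1918 blocks. The helper name `is_rfc1918` is my own:

```python
import ipaddress

# The three private IPv4 blocks reserved by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(cidr: str) -> bool:
    """True if the given IPv4 network lies entirely inside an RFC 1918 block."""
    net = ipaddress.ip_network(cidr)
    # subnet_of() requires both networks to be the same IP version.
    return net.version == 4 and any(net.subnet_of(b) for b in RFC1918_BLOCKS)
```

With this, `is_rfc1918("192.9.1.0/24")` is False while `is_rfc1918("192.168.1.0/24")` is True.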


Any template strings are ambiguous unless there is more than one example.

For example, let's imagine that there is an instruction saying that in a config file, there should be:

PASSWORD=[password]

Let's say our password is "admin". Then it could be that:

PASSWORD=admin

PASSWORD=[admin]

or even

PASSWORD=[password]

as it is not a place to actually store the password, but to select an authentication method.

Sure, sometimes (but not always!), it is possible to deduce how to fill the pattern.

If the field has some canonical value, go with a sane default e.g. "canary.their-company.com", with a note that any other suffix works instead of "canary". Sensible defaults save us a lot of brainpower (vide https://en.wikipedia.org/wiki/Convention_over_configuration).


In-band signalling often seems to have this issue where it's not 100% clear what is part of the message and what is part of the meta-message.

Edit: I think it's mildly amusing, and further drives the point home, that some people in this thread missed endnote 1, where you say it was not the actual prefix.



> Frankly it’s a reason enterprise software is often so terrible; tons of options you barely understand or know about, and are configured according to tutorials/examples rather than understanding.

This article stresses that it's not a 'customer-side problem', and what they'll do to try to address it on their end.

But is there anything that enterprises can do in order to encourage people not to work blindly from tutorials? What do companies where workers avoid this pitfall look like?


This is such a great story and an important one. I always optimise examples for people copying and pasting, trying to make it as safe and meaningful by default as possible. It doesn't matter why you're copying and pasting - you may not have a lot of skills in this specific area, or you might be in a hurry. If you know what you're doing, you can probably improve the code, but if you use it as-is, it shouldn't come back to bite you!


I find it funny how people would just happily use `someprefix` as the subdomain. Isn't it obvious that it's meant to be replaced with another prefix?


From the article footnotes:

> some-prefix is used in this post to protect our poorly chosen actual-prefix


Maybe, but it's not obvious to me what the "another prefix" should be.


40% !!

I could kind of tell where this article was going from the first paragraph, but I never thought "some-prefix" would be used by 40%. That is such a high number.


On the contrary, for those of us who have experienced The Public it seems rather low.


I think this is also true for trivial / hypothetical examples. I used to work at a global company that would use 'acme' as an example domain, including for emails and such. Because when we started, the domain didn't exist, so test emails would just disappear into the void.

Until the domain was registered and is actively being used.


There used to be a blog where someone registered DoNotReply.com and posted all the replies he got as a result of companies using that as a default reply-to in emails — which often included sensitive information.

It’s not around anymore but here’s a discussion of it when it was up:

https://boards.straightdope.com/t/donotreply-com/442816


This reminds me of the association between tetanus and rusty nails.

Why would rust make the presence of bacteria more likely? Is it a food source? No, tetanus on a rusty nail was just an example used in an article many years ago.

Sadly, I cannot find a source for the idea coming from an article at the moment. :/


> When given an example, a significant number of users default to using that same example in their customisation. The behaviour is consistent across customers and configurations. This surprised us!

This is not surprising to me at all. Maybe the authors have never used an example before?


Well, I’m also the sort of person who wouldn’t. Similarly, I also never copy&paste example code when reading documentation, instead, I immediately jump into writing my own variation. It took me a while to realize that’s not typical.


I'd rather try the defaults, see if it works or what doesn't work, or just to gain some experience for how everything involved works, and then work off of that known baseline. Anything else just feels like randomly throwing shit at the wall to me. Especially if I'm not familiar with the thing I need the example for.


Often examples don't make it clear what is expected. As a consequence you might copy the example value temporarily until your understanding solidifies, but it never does, or doesn't before it gets a dependency on it.

So foo.bar.com or my-subdomain.example.com?


Ironically enough, this posting's title seems to be ignored by people who cite parts of RFC 2606, which states at the beginning:

> Updated by: 6761


How about generating a randomish suggestion like

company-35642.domain.com
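
Something along these lines, as a rough Python sketch (the function name and the five-digit format are just my guess at what such a suggestion could look like):

```python
import secrets


def suggest_hostname(company: str, domain: str) -> str:
    """Build a per-user suggestion such as company-35642.domain.com."""
    # Keep only ASCII letters and digits so the label is a valid hostname part.
    slug = "".join(
        ch for ch in company.lower() if ch.isascii() and ch.isalnum()
    ) or "company"
    # randbelow(90000) yields 0..89999, so the number is always five digits.
    number = 10000 + secrets.randbelow(90000)
    return f"{slug}-{number}.{domain}"
```

Every user who copies the suggestion blindly still ends up with a different name.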


If I had a dime for every time I saw somebody copy and paste "#myExampleWidget" into production code ...


    printf("Hello World")



