I_suck_and_my_tests_are_order_dependent (rubydoc.info)
333 points by devrob on Feb 16, 2023 | 290 comments



Funny. At a previous employer, we had a few services that needed access to some data we hadn’t put behind a service yet. For bureaucratic reasons I won’t bore y’all with, our team wasn’t given the green light to create the needed service to provide access to the data, so a coworker and I created an endpoint on an existing service but we added a requirement that an oddly specific and undocumented HTTP header calling out management must be provided in every request made to the endpoint and then we waited. At some point, someone else needed to use the endpoint and came to us asking why it wasn’t working. We told him about the header. He was annoyed and complained to his manager. The very next week, we got what we wanted: Our new service was up and running! We also got into a bit of trouble over this, but that’s just an extraneous detail — haha.


If you had done nothing that service would not exist.

Ask for forgiveness, not for permission.


A couple of related practices I've picked up over the years: (1) using long names when implementing things that are necessary but discouraged, and (2) putting danger_danger in the names of things that should never be used in production.

The former introduces just a tiny bit of friction, but it's often enough to encourage developers to use the preferred API calls.
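
To make (1) concrete, here's a hypothetical Ruby-flavoured sketch (every name here is made up): the preferred call stays short, while the discouraged escape hatch is deliberately long and self-describing, so reaching for it takes conscious effort.

    class Orders
      def initialize
        @db = { 42 => "widget" }   # stand-in for a real datastore
        @cache = {}
      end

      # Preferred API: goes through the cache.
      def fetch(order_id)
        @cache[order_id] ||= @db.fetch(order_id)
      end

      # Necessary but discouraged: the long name adds just enough friction
      # that callers stop and ask whether they really need to skip the cache.
      def fetch_bypassing_cache_because_i_really_need_a_fresh_read(order_id)
        @db.fetch(order_id)
      end
    end

    orders = Orders.new
    orders.fetch(42)                                                      # the easy, preferred path
    orders.fetch_bypassing_cache_because_i_really_need_a_fresh_read(42)   # hard to type by accident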

The latter I learned almost 20 years ago from Google's "Mustang" search back-end. When running a test instance of Mustang without bringing up a ton of dependency processes, there was a --danger_danger_must_not_fail true command-line flag (or something very similar) to turn a bunch of start-up sanity check aborts into warning log messages. If any code reviewer sees "danger_danger" being added to production command line flags, they'll almost certainly ask the right questions.


The Haskell ecosystem uses a very similar convention: unsafe functions have names starting with ‘unsafe’ (or, in one particularly horrible case, ‘accursedUnutterable’ [0]). These are generally functions which subvert the type system, so the name acts as an alternate warning.

[0] Documented at https://hackage.haskell.org/package/bytestring-0.11.4.0/docs... — incidentally one of the funniest pieces of documentation I’ve ever seen!


That doc link is funny, but it's also very frustrating -- you would think that after all those warnings, there would also be some glimmer of information about what sets the accursedUnutterablePerformIO function apart from unsafePerformIO, or at least why it exists, but apparently not. It's clearly important because most of the core functions in Data.ByteString call it.

It seems like the authors have decided that the trickiest, most magical parts of their codebase are exactly where they don't care about useful documentation. That seems like a bizarre strategy unless they plan to never let anyone else help with maintenance.


“If you aren’t willing to read the source of the function to learn what it does, we don’t want you anywhere near it. No summary short of the function itself can accurately communicate how much of a fragile hairball of edge-cases this function is, and how many things you have to know about to correctly call it.”


This is pretty much correct. For reference, here’s the original documentation (changed in commit 80ff4a3018cd8909abb1d4e0c32f012a523883ec):

    -- | Just like unsafePerformIO, but we inline it. Big performance gains as
    -- it exposes lots of things to further inlining. /Very unsafe/. In
    -- particular, you should do no memory allocation inside an
    -- 'inlinePerformIO' block. On Hugs this is just @unsafePerformIO@.
The commit message notes that ‘We've had a few instances of people being tempted to use it without really understanding the consequences’.


I wonder why inlining the function makes it that much more unsafe.


This is the worst kind of arrogance. All you're doing is reducing the number of people who'll understand what's actually going on, which not only makes it less likely to be used properly; it also makes debugging code that uses it more difficult.

You're deciding for other people what they should and should not know. Another word for this is gatekeeping.

Don't do that. You're not the other person. You can't read their minds, nor can you fathom their reasons. Deliberately hamstringing folks is bad policy, and supremely arrogant.

Just document it properly, lay out the risks, and leave it to the reader to decide on policy.


Oh wow your comment really rubs me the wrong way...

In my opinion, it is properly documented, almost perfectly so, by the authors giving you the full source code.

If you are not willing to read the source code, but instead always blindly rely on documentation to be correct, then in my book that's just laziness.


> If you are not willing to read the source code, but instead always blindly rely on documentation to be correct, then in my book that's just laziness.

Hmm, I agree somewhat with both your comment and the one you're responding to. However, the claim that you should read the source of the libraries/frameworks you're using is interesting to me - it seems sane, but not always entirely viable.

For example, who would be more successful:

A) A Java developer who reads the Spring source code and dives into that verbose Eldritch codebase with years upon years of fixes, design patterns and abstractions, all to learn how the dependency injection works better, probably spending days to weeks in the process?

B) Or perhaps someone who looks up the annotations or configuration they need, or maybe other code snippets on StackOverflow or the docs, or a similar site and solves their problem so they can move on to the next tasks, while being none the wiser otherwise?

Sometimes the answer will be A (deep knowledge can be useful), sometimes it will be B (we all have limited time and energy). However, it feels to me that with sufficiently complex codebases one might just get more and more confused, as opposed to gaining any sort of clarity or understanding, when the framework/library actually abstracts away really complex things.

Just using Java as the example here, as it's one of the more enterprisey languages and Spring as an arguably brownfield maintenance mode framework that's huge, given its long history.


I appreciate that you broke apart the users into two groups. I imagine Group B is much larger than Group A, and that the Haskell maintainers want to emphasize "if you use accursedUnutterablePerformIO, make sure you are in Group A!".

Most Group A people didn't get there fixing surface level stuff. They got there because there was an insidious defect buried deep that required that deep knowledge to understand or fix.


Documentation is not a substitute for reading the code, but rather a complement to it.

Code by itself is a limited medium that can only do so much. Asking someone to rely solely on code to explain a complicated subject is counterproductive (the more complicated the code is, the harder it is to guess whether something in the code is a mistake, or working as intended with some unknown reasoning behind it).

Documentation points out the non-obvious parts, the reasoning behind it, the subtleties, the gotchas (things to watch out for), the best practices, etc.

Documentation is a map to guide the user, not an end-all-be-all about what the code does.

When the code does non-obvious things or has non-obvious reasons, you need both code and documentation.


I would agree in general, but the comment I was replying to is criticizing the following documentation as arrogant:

"If you aren’t willing to read the source of the function to learn what it does, we don’t want you anywhere near it."

And I believe that might be reasonable for a complex function that cannot be exhaustively documented in any form shorter than the source code.


I disagree, but I think this comes down to the wording and tone.

The following would be perfectly justifiable and I'd agree with it wholeheartedly:

"This code is incredibly complicated and we don't have the resources to document it properly. So please DO NOT GO ANYWHERE NEAR THIS CODE unless you've spent weeks/months studying it in depth. And even then, think twice!"

One is a friendly "here there be dragons" style warning that explains some things (such as why it's not documented). The other is a passive-aggressive accusation followed by an "I know better than you, kiddo" style enforcement, which exudes arrogant presumption.


But that phrasing makes an entirely different point. It's not "we don't have the resources to document it properly" — it's more like...

"There is no possible documentation that could help you to not screw up when using this function. We've tried. There used to be docs here — first as doc comments, then hidden within the function (to force you to read the function to find them), then as a special pre-commit lint rule that triggers when people add a new use of the function to the codebase without adding a special annotation to acknowledge it. None of it helped; people still screwed up the usage. Even the people who wrote the function screwed up in new usages of it — every time they tried! — because the number of things you need to remember to get right to use it, exceeds the capacity of any single human being's short-term working memory. The very existence of this function is Ozymandian arrogance. It should be banished from the Earth. The only problem is that people keep reinventing it. We're waiting for a sufficiently-smart linter, so we can teach it to not let people reintroduce any function isomorphic to this to the codebase; then we can truly excise it for good. In the meantime, we leave it undocumented, for a bit of security through obscurity: a thing that you can't find a good way of coming to a seeming understanding of, you're less tempted to use."


I'd argue that both of you are somehow correct. Give me the code, so I can see what it does. And give me documentation (be it correct comments, a dedicated documentation or whatever) to know why it does something.

Code states the obvious, but you cannot read between the lines.


It reminds me of how narrow roads are sometimes used to slow traffic, though I don't know if it's effective here. I'm hesitant to discourage it when the language is so explicit about it being bad; it might be quite effective afaict



It's documented that you need to read the source.


Good documentation tells you a lot more than the source could. Especially when you're talking about horrific side-effects.


Another correct way to use Hungarian Notation, according to (iirc) Joel Spolsky


To me unsafe generally means “memory safety or type safety is disabled”. It doesn’t mean the code can’t run in production. Maybe someone just implemented fast inverse square root and tested it manually.

danger_force_electrify_fence() is a whole different story.


> To me unsafe generally means “memory safety or type safety is disabled”.

And that is exactly what it means. Unsafe functions in Haskell do run in production. They just subvert type safety (and I believe memory safety too, in the case of accursedUnutterablePerformIO).


Even plain unsafePerformIO is memory-unsafe, as it will let you construct a polymorphic IORef, through which you can implement unsafeCoerce.


Hmm… is that memory-unsafe? I thought that came under the heading of ‘subverting the type system’. Then again, I suppose they’re more or less the same thing in a strongly-typed and pure language like Haskell.


Yes, it is very memory unsafe. Subverting the type system is often the main source of memory unsafety in most languages. It typically means you can treat an arbitrary integer as a pointer to any other object.


Rust and C# both use 'unsafe { }' blocks, which can only be enabled by a module-level compiler flag.


There is no such "compiler flag" in Rust, that's only a C# thing.

But also more relevantly here, stuff you'd be calling in those Rust blocks tends to have longer names and often calls out the fact it's specifically not checked, e.g.

  a = b.unchecked_add(c); // Like a C or C++ arithmetic operation if you overflow it's UB. This might be faster. It might not. But if you must have this anyway, that's how

  n = NonZeroU32::new_unchecked(0); // Unlike new() which returns None because duh, zero isn't non-zero, this results in UB.
Not all of them though, for example:

  v = Vec::from_raw_parts(ptr, len, cap); // Make a Vec, the pointer ptr had better actually be pointing at contiguous memory for exactly cap item-sized slots, the first len of which are in fact legitimately values of the appropriate type, if any of this is wrong that's immediately UB.


While sadly it does not require a flag to enable, at least you can turn it off with: `#![forbid(unsafe_code)]`.


For languages that support Unicode identifiers, you could zalgo-ify the name to increase the friction:

|

|

d̜̣͎̱̟̥͓̥͍̙͖̫̓̀͗̌̚a͍̩̫̳͖̒̒́́͐̍́̍̏͋̿̈n̳̯̫̥̞͗̒̆͗̀͒͒͌̒̎g͉̝̠̝̠͉̰̘͉̩͖͙̾͐̽̂̈́͗ë̗̙̳̮̘̟͙͇̣͇́̐͑̈́͂̾́̇̇̀͒ŕ̗̱̜͚̌̔̓̄̄͒́̄͌̚_̱̭͍̣̳̝͓͔͖͓̀̃͒̑̊͛͛̽̑͋̿͒ͅd͉̰̖͎̰̙̘̰͈̫̃̍́̋̆̀̓͆̈́̓̇̾a̭̦̲̫͕̩͔͙͔͖̱̿̑͌̊͌̾̊͂́͂̉ň͖͔̩͓̮̦̭̜̩͉́̀̍́̑g̖̦͙̝̈͆͗̈̄́̉̽͗̄͆́ȅ̠͔͕̜͔̇͗̊̈́r͎̬̦͈͈͐͑̃̄̀͒̉̓̋_̪̝͔͖̠̘̰͛̂́̔̆́̓̅͒t̖̘̟̝͚̭̦̪͕̘͐̎̋̈̍̋̾̈́̇̈́̉̚ḣ̙͇̝̳͓͇̮̫͙͈͍͊̇͌e̳̖̯̥̭̜͓̜̒̉̎̒_̥̱̲̘̞͓̖̣̝̮̮̌͌̅̀͊̅̍̑̿̂̿ͅp̗͖̘̝͈͙̫͖͍̠͑̎̃̍͌̇͋̍͋͗͑͛o͉̬̗̠̝̗͈̟̘̱̬̫͗̽͆̊́̂̚n͚̪̰͉̪̟̈́͛̓͒͌̎̑́̒̃͌͆ͅy̞̗͖̗̾̂̿̿̀̊͋͐̓̓̚_̜̲͖̣͈̞͖͎͆͑͌̾̃͆ͅh̦̘͇͍͓͕͒̎̒̽͂̊̐̍e͎̰̲̥̱͍̗͉̞͔̦̬̊̌͆́_̲͚̮̮̙͇̠̮͖͐́̎͆̿̄̋c̙̩͈͇̩̯̠͇͙̄̄̒̄̓͑͂͗̈͂̊̚o̪̥̲̗͓̭͍̣̮͙͉͇̽͊̿̔m̫̖̘̰̆̊̒̃̔͛͛̚ẻ͓̭͓̗̙̳͖̭͚̐̀̃́̃s͖͈̬͚̜͊̐͗̓̂͑̀͑̔̿͌͒

|

|

(Yes this is actually a valid Unicode identifier.)

https://en.wikipedia.org/wiki/Zalgo_text


Now that I think of it, a more toned-down version could be:

    d͓ͨa͓ͣn͓ͮg͓ͤe͓ͣr͓ͭ_d͓ͨa͓ͣn͓ͮg͓ͤe͓ͣr͓ͭ_f͓ͨo͓ͣo͓ͮb͓ͤa͓ͣr͓ͭ_o͓ͨp͓ͣt͓ͮi͓ͤo͓ͣn͓ͭ


And it's a perfectly valid function name in Ruby: https://replit.com/@brandt1/WatchfulTemptingRecursion#main.r...


Also works in .NET, although Visual Studio really doesn't like the identifiers.


My own example from a decade ago: https://sourceforge.net/p/portableapps/launcher/ci/default/t...: you can’t run PortableApps.com apps from the Program Files directory for a few reasons, but for some reason we decided to allow overriding the guard anyway, but in a clear and verbose way: you have to set an environment variable named IPromiseNotToComplainWhenPortableAppsDontWorkRightInProgramFiles to “I understand that this may not work and that I can not ask for help with any of my apps when operating in this fashion.” (case-sensitive).


At my old job we had some unsafe-ish operations; for most of them you needed to pass --force or --really. Typically you would invoke without it, then it would say "here's what I am gonna do, do this again with --really and I will do it for sure"

There was one tool that had '--yolo', because if you're at that point of an operational crisis, you might as well give it a shot.


I too have written force features. In case you've never seen it: in tool design, prefer git push's --force-with-lease design.

Force With Lease says "I believe X is true, and because of that, force". The tool can check that X is indeed true, and if so force, but if it's not true the human was wrong and they ought to re-consider and obtain a new "lease" before we make the change.

In Git this "lease" is the expected current ref of the remote: if we specify it but it turns out to be wrong, that means the state of the remote system has changed, and we need to re-consider whether our forced change is still appropriate. e.g. While you and Bill were quickly changing colors.js to hackily disable dark mode, turns out Sarah guessed the actual bug, and replaced main.css with a patched version that works fine even in dark mode; if you force-push your change, instead of zapping the broken change it zaps Sarah's fix!

This approach works best where you can actually take some sort of "lock" and avoid clashes at the tool level, but there's some benefit even without that at the human level.


--force-with-lease is great, but I really wish it was --force and the current force was --force-without-checking-if-its-safe or something. The more dangerous one being the shorter one is unfortunate, and I think it's worth the breaking change to swap them (with some sort of config option to gradually do the switch).


An even better design is to have the command write out essentially a script file for what it is about to do, then the --force flag would execute that script rather than start from scratch. That way you avoid having things come up between when you do a dry run and the real deal. This isn’t always applicable but sometimes can be a really nice pattern.


this works well for things that are shell scripts, and we certainly had at least one " check the output and then pipe it into sh". But for the most part that falls away when you have a lot of objects and abstractions and stuff.


I am going to have to steal --yolo. Love it. At a past job it was basically our team slogan (extremely tongue in cheek... has anyone ever said YOLO seriously?) any time we were deploying a change we were at all nervous about. I can't believe we never codified it into our tooling.


It's originally called "Carpe Diem", and it's been a common phrase used and misused for millennia

Of course, no one thinks that when doing something truly dangerous.


Many people have definitely said "you only live once" and meant it seriously.

Yolo became a joke AFTER people did a bunch of stupid shit.

Kind of like "hold my beer".


YOLO was the motto in the popular Drake song “The Motto”…


Was this old job Canva by any chance?


For (1), I've seen it said somewhere that this is a reason why C++ casts are so verbose compared to C casts: "reinterpret_cast<A>(b)" vs "(A)b". Type casting in C++ may be the result of bad design. And while casts definitely have a place, they are ugly, and therefore they should look ugly.


But C++ still lets you use C-style casts, which kind of defeats the point.


Not so much, you could phase out C-style casts over time. C++ compilers could grow options to warn about C-style casts (if they haven't already), shops could outlaw them in coding style guides, and slowly it would go away.

What does kinda defeat the point is that C++ adds a new form of cast that is even less conspicuous than a C-style cast:

  typedef int* int_p;
  int f(char* p)
  {
      int* ip = int_p(p); // basically a reinterpret_cast
      return *ip;
  }
  int main(int argc, char** argv)
  {
      return f(argv[0]);
  }
gcc -Wall compiles this with no warnings.


Ugh, I really need to clean up and submit my patch that adds those warnings. Hopefully I'll have some time this year.


You can blanket ban C-style casts with appropriate warnings and static analysis.


This is the flag required by our script to create and merge pull requests from one environment to another:

MERGE_APP_REPOS_YES_I_ALREADY_MERGED_LIB_REPOS_AND_WAITED_FOR_THEM_TO_FINISH_BUILDING


This seems like a broken process to me.


It's a script run by a human when they want to push the changes in one environment to another. Our various apps consist of 3-10 repos. All the apps are backed by two shared lib repos (one for node lambdas, and one for C# lambdas). Merging each repo manually to do an environment push is cumbersome.

The script creates and merges pull requests on github for the lib repos and app repos - from one environment (branch) to another. First you have to do the lib repos, and if there are any changes wait for the CICD jobs to finish. If there are no changes to the libs, the script indicates that no pull requests were needed.

I have no idea how to get a local bash script to poll for when AWS CICD jobs are done, nor do I really want to spend days on that. So I just added that parameter to make sure I (and anyone following me) runs the lib merges first and waits for them to finish building before merging the app repos. The job without any flag just merges the lib repos.


Next time there is an HN thread about monorepos I think I'll just post a link to this comment.


On small fixes, it comes in handy to be able to separately deploy front end code, API Gateway code, or different back end lambda groups. But other times, like big feature additions, most of the repos have new changes, and I need to push an entire environment.

Part of this is an artifact of using lambdas and serverless.com. Serverless deploys all the lambda code in a repo to each lambda. It doesn't try to parse which code is shared. If I add too many lambdas to a repo, deploys take longer and I run into the AWS lambda code size limit.

But I also don't want a repo for each lambda because then I'd have to automate a ton of stuff that's currently manual, and for our tiny shop I think that would be overkill. We'd almost need a dedicated automation person.

We also have repos for several react front-end web apps, and the AWS API Gateway code that connects the front end to the back end. It's very common that I only deploy the react web-app. I have no idea how I would do that in a mono-repo that had all the front end, API Gateway and serverless.com/lambda code. I guess git tags? But I'd always be worried about screwing those up.


Why not create a draft PR on the app repos and tell the user to merge it when the lib builds are done?


It would be the same result. A human still has to wait for the lib builds to finish and then take action. This way there are two fully automated steps, with one human wait step in between them.


Yeah, but it will be a simpler script though.


Seems like that is intended to never be used by humans and is used by a process that has always done the other thing first (unless you're manually debugging the process itself in which case you'll copypasta the silly long thing).


if_you_use_this_you_will_be_fired is my favourite suffix seen in production.



It's used in many of Facebook/Meta's internal code bases, in fact.


And it has been for a long, long time

https://twitter.com/thefotios/status/689130197891829760


This comes with a huge caveat: You must explain it in enough detail for anyone encountering it to understand why it's dangerous and what they should be doing instead, if at all possible. The documentation must at the very least be where the name is defined. "But why?!" is exactly the wrong thing for a developer in a hurry to be thinking.


my contribution: public_method_that_only_deep_merge_should_use

https://github.com/chef/chef/blob/68dd5f42273f19bc5975c0dc8e...

that was 9 years ago and it was a code smell that things were broken apart incorrectly, and at some point i rewrote it so that wasn't necessary -- but sometimes you just gotta move the ball down the field, even if you don't get a first down.


I have a similar insight regarding long names.

If you strive to make your function names as self-documenting as possible, you then end up with the following situation: either

1) Long-named functions become a signal that code structure is suboptimal and can be refactored to split functionality / responsibilities appropriately, until the function has a reasonable name, or

2) You are working with code that cannot be refactored, in which case the long, cumbersome, but otherwise highly informative name is doing you a favour, as well as possibly forcing you to wrap meaningful interfaces around it to make you less reliant on calling that monstrosity directly as much as possible.

So yes, I always try to make my function names contain as much relevant information as possible.


Yep - we do the same thing: functions which don't (for example) check if a given user should be able to do the given thing use the naming pattern dangerously_verb()
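
A hypothetical sketch of what that looks like in Ruby (the model and methods here are made up, not our real code): the plain method does the authorization check, and the dangerously_ variant skips it and is only meant for callers that have already checked.

    Post = Struct.new(:author)
    User = Struct.new(:name, :admin) do
      def can_delete?(post)
        admin || post.author == name
      end
    end

    class PostService
      class NotAuthorized < StandardError; end

      def initialize(current_user)
        @current_user = current_user
      end

      # Safe entry point: checks authorization, then delegates.
      def delete_post(post)
        raise NotAuthorized unless @current_user.can_delete?(post)
        dangerously_delete_post(post)
      end

      # No check at all; the name warns both callers and reviewers.
      def dangerously_delete_post(post)
        # ... actually remove the record ...
      end
    end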


i wrote "this is dangerous" in a comment the other day and then it failed on me the next day. whoops. i guess i was right


I've noticed a lot of developers that have this kind of code purity complex where their way of doing things is superior even if the actual code Works - which I think is the number one most important thing about code. The code purity complex is a self-limiting decision.

Those developers will be sitting at home fuming, I guess, as I cheerfully build a new feature on VB6 code last changed in 1999 that relies on using "On Error resume next", on an obscene contract rate.

Working only with Good code is a luxury decision, and writing Code That Sucks isn't necessarily bad, if thats the fastest/cheapest way to write code that works. The end user almost as a rule, doesn't care about your code purity. A developer probably couldn't even explain that purity in a way that would make the end user understand its value. This method name indicates a lack of understanding that finding the right balance of speed, cost and quality is a compromise.

Yes, tests that can run in isolation are "better".


If you need a metaphor to understand this perspective: there's a difference between building a bridge that gets some cars over it, and building a bridge that will continue to get cars over it, in thirty years, while accounting for the expected investment of maintenance over those thirty years, and supporting an expanded pedestrian and bike lane added ten years down the line, to the best of the ability of the engineer during the initial planning, while keeping build costs under control.

Yeah; working only with Good code is a luxury. But simultaneously, treating "It Works" as the most important metric to evaluate good engineering practices by is quite literally what led to the East Palestine disaster last week. Being able to abscond from responsibility for the repercussions of your "It Works" solution is also a luxury; or, charging those "obscene contracting rates" you talk about.

Software Engineering is nascent. We don't always know, and even less often have a good shared language to talk about, what is "The Right Decision" in every context. That isn't an excuse to not try.


You have the wrong metaphor, since bridge projects are all more or less the same, but software projects aren't.

I like the shed and skyscraper metaphor better. It makes a clear distinction that projects range from sheds (which can be a single script) all the way up to skyscrapers.

Some projects don't even need to be extended afterwards, like most computer games.


I like the bridge metaphor because they have similar problems. Over 230,000 bridges in the US alone (a third) need repairs or replacing.

https://www.artba.org/2020/04/12/230000-u-s-bridges-need-rep...


It doesn't really work with either metaphor. Somebody building the Golden Gate bridge or skyscraper knows that they're building it and has a clear picture of what it will look like once it's finished.

Software is more like a settlement. It may start out looking like a village and end its life looking like a village, or it may catch a wave and end up looking like a megalopolis - perhaps a clean, well organized one like Singapore, or a disastrously organized one with chronic traffic, corruption and crime like Lagos.

A lot of people out there think that they need to make space for a subway in their village coz it's Definitely Coming One Day while others don't have time to build a subway even while they suffer 7 hour long traffic jams.


Metaphors don't have to apply to every conceivable perspective of looking at the subject. They are there to simplify transfer of knowledge

I'm sure there's some way the shed and skyscraper metaphor fails too


I agree that this is a much better metaphor.

I've often worked with developers who want to build the foundation for a skyscraper just to build a dogshed.


I simply won't take the job if someone demands I do a rush job of programming train braking systems.

It seems like you're reading into what I'm saying a commitment to "not trying" and "[a] metric for evaluating good engineering practices". I'm absolutely not, I'm arguing against the kind of sneering belief in self-superiority that leads some software developers to feel like someone who uses a particular method or way of making working software, "sucks".


There’s a balance here, and it’s important to recognize that often the “but the code works” mantra is used by the lazy or incompetent to justify poorly written code.

I think everyone has been burned by something different, and sometimes we read internet comments that are easy to misinterpret as supporting The Thing We Hate That Burned Us.


While it's true that the end user ultimately cares about whether the code works or not, the argument that code purity is a self-limiting decision ignores the long-term costs of not prioritizing code quality.

Code that "works" in the short term may have unforeseen consequences down the line, such as increased maintenance costs, difficulty scaling, or even security vulnerabilities. It's also important to consider the impact on other developers who may need to work on the same code in the future - writing messy code can create confusion and make it harder to collaborate effectively.

More to it, the idea that working only with "good code" is a luxury decision assumes that writing clean, well-structured code is inherently more time-consuming or costly than writing messy code. A well seasoned engineer should be able to balance both because they know that better architecture will ultimately save time and money in the long run by reducing the need for debugging and maintenance, and making it easier to add new features.


> More to it, the idea that working only with "good code" is a luxury decision assumes that writing clean, well-structured code is inherently more time-consuming or costly than writing messy code.

If I let juniors push code to master only if it adhered perfectly to my vision of "good code" I can assure you that I'd be consuming a lot of time. But, just because they don't use list comprehensions or lambdas or accidentally throw around state or whatever doesn't mean that they "suck".


I think there are contexts where this approach is sufficient.

But the little Dijkstra in me can’t stand for it in most professional situations.

I’m not the kind of programmer that delights in not knowing what I’m doing and fixing the result of my neglect later on.

I also charge obscene rates. Usually to fix piles of code no one understands anymore.

Tests I find are a bare minimum and are not even sufficient in more cases than people realize. It’s a shame though that even this low bar is too much for some.

It’s a good balancing act to know when the enemy of a good plan is a perfect one… and what makes a sufficiently good plan.


"Tests I find are a bare minimum and are not even sufficient in more cases than people realize. It’s a shame though that even this low bar is too much for some."

That's in part because some really bad hipster devs decided they knew better and nixed integration testing...

"Tests" usually means "happy path and/or unit", which nearly always is that ultra low bar.

The amount of testing required to truly validate something of reasonable complexity is so staggeringly high, if your tests run in under an hour you probably didn't test it, or you are building something really, really simple.

True testing is about covering the combinatorial explosion, and that's assuming your problem space is decidable.


> I think there are contexts where this approach is sufficient.

Absolutely!


> Working only with Good code is a luxury decision,

> on an obscene contract rate.

Alternate take- being a contractor who gets to roll off the project in 3 months and therefore doesn't have to support the code they write is also a luxury.

Those of us who are in-house, on the other hand, do have to support the code we write. And writing Code That Sucks is a great way to burden our colleagues and our future selves with technical debt that is expensive to pay down, if it even gets prioritized in the roadmap at all.


I've been in-house and wound up being the last programmer maintaining the massive pile of tech debt. We had ordering issues and couldn't run our tests in parallel due to global state problems. And it wasn't a mistake here or there, it was the accumulation of a decade of issues. I took a few stabs at it, but doing a week or two of work here and there never moved it significantly forwards. The minitest author took a run at fixing a few of the problems and recommended scrapping the tests entirely and doing a full rewrite. There was no way that I had the time to do that (it was literally just me). The answer of course was that the company was effectively dead and it was long past time to quit.


> And it wasn't a mistake here or there, it was the accumulation of a decade of issues.

Nodding my head to this. I feel like this is why it’s so hard to convince certain coworkers to take a hard line on code quality. “The class is already 2000 lines of code long, adding one more method isn’t the end of the world.”

No one sets out to turn a 100-LoC class into a God class in one PR. It’s always a “death of a thousand cuts” situation. That’s why we need rules which strike some people as arbitrary or draconian, but which I see as a useful tool to fight backsliding, like these [1]. You have to draw the line somewhere.

1. https://thoughtbot.com/blog/sandi-metz-rules-for-developers


Yes, but what do you do once you have a mess and you're understaffed, and the horse left the barn a long time ago?


That’s when engineering management needs to make some tough choices, like “No new features on the roadmap until we get our house in order and pay down the tech debt”. Give the staff who are there the time they need to plug the holes in the ship.

And if engineering management doesn’t have a seat at the table, i.e. if product or sales or the C-suite looks at engineering as a bunch of order-takers, maybe that company isn’t meant to be long for this world and it’s time to start polishing our CVs.

I’m an engineer, not a miracle-worker. My help isn’t for people who need it, it’s for people who want it.


> That’s when engineering management needs to make some tough choices, like “No new features on the roadmap until we get our house in order and pay down the tech debt”.

If I had a penny for each time I'd seen this argument made and knocked back...


> If I had a penny for each time I’d seen this argument made and knocked back…

That’s what paragraphs 2 and 3 of my comment are for.


And those really don't change anything about my response. Your solution seems to be "if you're at a bad company, quit", and like I said earlier: "Working only with Good code is a luxury decision".


Software engineering is still a seller's market, even with the recent rash of firings. Most engineers are much more employable than they give themselves credit for.

Saying "Working only with Good code is a luxury decision" is like saying "I can't afford to buy fruits and vegetables, so I'm going to live off of instant ramen for the rest of my life." Like, yes there are people who actually live like that. But it's an incredibly short-sighted way to live, and they'd be better-served by making (drastic) lifestyle changes.

Mandating good code is an investment that costs more in the short term, but provides dividends in the form of faster feature delivery over the long-term.[1] It's a luxury for apps that don't expect to have a "long-term", just like fruits and vegetables are a luxury for people who don't expect to live past 50. But it's a necessity for developers who want their apps to be extensible and responsive to change.

EDIT: looks like we reached the limit on the comment reply chain. I’d agree with your comment below, and I’m not here to defend the naming choice of the method in question. The reason I love the Ruby community is, among other things, because of the whole MINASWAN ethos. And I think the same goal behind the method name could have been accomplished in other ways. I’m here to dispute the idea of code quality as a luxury, that’s all. Sounds like we’ll have to agree to disagree on that point.

1. https://miro.medium.com/max/605/1*RcmxmnYCcKTrCdJpCo7WYg.png


> Mandating good code is an investment that costs more in the short term,

Yes, the system we work in demands short term growth and profits. This is why there is a problem, not because developers "suck".


You clean it up. I've cleaned up projects that were total shit shows and the cleanup paid dividends when the time came to update or change that part of the code.

Even if a project has gone to seed, things rapidly get better when you start to do things the right way again.


Yes, I started to clean it up. If I had stayed there for another 10 years, I might have finished.


I currently work in-house, and am, today, refactoring other people's tech debt!

I don't think some of the code I wrote on that contract will ever be altered again. Pretty sure the VB6 runtime will be impossible to find soon, if it isn't already.


I think I measure tech debt by how easy or hard a piece of code is to change, in response to changing requirements. So yes, I’d agree that if code is never going to be changed again, then the concept of tech debt is less-applicable.

That said, predicting which code will or won’t be changed in the future is a difficult prognostication. It seems safer to assume it will be changed on a long enough time span. Then if it’s not, you can be pleasantly surprised.


I tell people, code architecture is designing in flexibility you will never use. It's a pretty safe bet you won't guess right.


> ... have this kind of code purity complex ...

I think it's fair to describe "tests are order dependent" as technical debt. Taking on a bit of technical debt for better code velocity is an option.

It may be a good option (depending on the situation); but with such tech debt, there are risks like bugs which might not be found until later, or some other impact which could make it much harder to maintain.


That, but it also costs me a lot of time to set up and tear down our database. If I had to do that for every test everything would slow to a crawl.


If you can reuse the test db then it's still not order dependent. If your test framework doesn't support reusing setups, you create some kind of function like getOrCreateDbConnection() that idempotently sets up your test environment.

Order dependent tests happen when you have like a createUserTest and a deleteUserTest and the latter doesn't actually create the user it deletes but requires being called strictly once after createUserTest.

Some of our code at work is like that and it's a pain because it means the tests have to run sequentially (slow), you can't run any single test in isolation and if something breaks it sometimes causes a failure in the wrong test.
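
A minimal minitest sketch of the difference (User here is a made-up in-memory model, not a real library): the first pair of tests only works if they happen to run in the right order, while the last test creates everything it needs and can run alone.

    require "minitest/autorun"

    # Made-up in-memory model, standing in for whatever the real app uses.
    class User
      @records = {}

      def self.create(name)
        id = @records.size + 1
        @records[id] = name
        id
      end

      def self.delete(id)
        @records.delete(id)
      end

      def self.exists?(id)
        @records.key?(id)
      end
    end

    class UserTest < Minitest::Test
      # Order-dependent pair: the delete test leans on state left behind by the
      # create test, so it breaks as soon as minitest randomizes the order.
      def test_create_user
        @@created_id = User.create("alice")
      end

      def test_delete_user
        assert User.delete(@@created_id)   # NameError unless test_create_user ran first
      end

      # Independent version: sets up exactly what it needs, so it can run alone
      # and in any order.
      def test_delete_user_independently
        id = User.create("temp")
        User.delete(id)
        refute User.exists?(id)
      end
    end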


I try hard to mock out the database calls. I know some disagree, but I'm pretty productive and the occasional bug that does slip through because of this is worth the time saved, IMO.


don't tear down the entire db for each test, make utilities that create the base objects you need to operate on and chain them together


Absolutely agree with you here.


Oh my God, this comment shows up on a lot of these threads about software correctness and I can't disagree more.

* Today, modern software is failing left and right. It's a serious problem and it's extremely frustrating. In part, this is because the software industry has abysmal correctness practices. In part, this is because when software engineers write code, correctness is an afterthought.

* Software correctness is not easily measurable, and therefore cannot be approached as a metric; instead it should be approached as a mindset. Software correctness is about designing systems such that bad things cannot happen by definition. When approached this way, seeing instances of unsafe ideas is a sufficient red flag, and they should be removed. This is because the fact that unsafe constructions exist is sufficient evidence that unsafe behavior exists.

* When unsafe code is written, it can be shown that it "works", but this takes effort and engineering time. When things are designed such that they cannot fail, it doesn't take effort and engineering time to make sure they "work". This is why designing from the beginning with correctness in mind is superior than writing crap code and then debugging later.

* You say working with good code is a luxury, but sometimes it's actually the opposite. Sometimes you can write code that works 99.9% of the time and call it a day. This is a luxury. Some of us can't. Sometimes it really is the case that if software fails 0.01% of the time someone will be physically harmed. In cases like this some software engineers scratch their heads, because they know how to fix bugs, but they don't know how to never add bugs in the first place.

Why did I write this comment? In case you wonder:

* You say some people have code purity complex, which you use pejoratively. But in my experience, some people also have code impurity complex such that they think somehow they're better because they're capable of dealing with crap code without complaining. It occurred to me your comment lacks self-awareness in this regard.

* You say "developers will be sitting home fuming", as if this out of some abstract religious principle, when there are concrete, sometimes mathematical reasons why certain code purity concepts exist.

* You say "code purity is a self-limiting decision" without realizing writing code with correctness as an afterthought is often a more significant self-limiting decision because it increases the cost of the software in terms of maintenance and refactor cost.


> But in my experience, some people also have code impurity complex such that they think somehow they're better because they're capable of dealing with crap code without complaining.

You better believe that I complain loud and hard about crap code. I complain about how crap other people's code is, and I have the self-awareness to complain about my code that is crap. What I have a problem with is the attitude some people have that developers "suck" because they write crap code.

For example, some people write method names that aren't descriptive. It happens. I personally don't think they should be forced to write "I suck" 30 times on their school blackboard because of this. What would that actually achieve?

> You say "code purity is a self-limiting decision" without realizing writing code with correctness as an afterthought is often a more significant self-limiting decision because it increases the cost of the software in terms of maintenance and refactor cost.

No, actually, I do realize this. But the reality of writing software is that sometimes the stakeholders don't care about future maintenance and refactor cost. From the VB6 example I used before - the runtime will actually be out of all versions of Windows from 2024. But that's a problem for Another Project. You may think that writing "incorrect" code is a self-limiting decision but I can assure you that I am doing fine and my clients/managers/colleagues have generally enjoyed working with me.


Amen to this comment, I apologize if my previous comment went too far, it seems like we're overall on the same page, at least on some issues.

> For example, some people write method names that aren't descriptive. It happens.

I agree, and I would actually argue names don't matter. Well, they do, we should still pick good enough names, and never pick adversarially confusing names. But ultimately, code should be clear regardless of the name. Take a look at this Haskell talk for better arguments than I can write here: [1] Part of the reason why designing software that by definition cannot include bad behavior is such a powerful concept is because even when we inevitably name things terribly, the harm is contained.

[1] https://www.youtube.com/watch?v=ZSNg6KNzydQ

> But the reality of writing software is that sometimes the stakeholders don't care about future maintenance and refactor cost

Sure, but this is part of the problem. As engineers, it's also our duty to explain to business people the implications of these. If ultimately the business decision is that "this thing won't matter in X years" then there is nothing we can do, but the limitations must be known and clearly communicated. Which, I don't think you disagree, but I didn't think this came out in your original post.


> As engineers, it's also our duty to explain to business people the implications of these.

I agree wholeheartedly! It is always a disappointment for me when I'm unable to adequately explain to people why we need to minimize technical debt, or write good tests, or use linting, etc, but I think many developers also struggle with this, and so I don't think it would be fair to say a developer "sucks" for having to drop those to write code that works in the amount of time they managed to negotiate. But, some of the best developers (technically) I know also have pretty bad communication skills!


If we accept this as true and grant that there was a method that produced code that was in many ways superior: it would take time to write but require less effort to maintain, and on top of that it would have fewer bugs and be more reliable. Wouldn't that give whatever shops adopted these practices an insane competitive advantage?

This seems extremely paradoxical, especially given, in your own words "[t]oday, modern software is failing left and right."

Why isn't cheaper, faster-to-develop software that requires less maintenance and isn't failing left and right able to completely take over?


The parent specifically states that correct unsafe code is more expensive to produce though?

I agree with the parent anyway. Most programming is relatively low stakes (at worst the user restarts the app, or makes a new shopping cart, or sends the message again, etc.), which then means that most programmers are trained to write low-stakes code. IMO this is why "turning it off then on again" and "delete the configuration files" are such common advice.


Seems to contradict this statement:

> When unsafe code is written, it can be shown that it "works", but this takes effort and engineering time. When things are designed such that they cannot fail, it doesn't take effort and engineering time to make sure they "work". This is why designing from the beginning with correctness in mind is superior than writing crap code and then debugging later.


If you don't have a lot of ego about code purity you probably won't mind typing a mildly self-deprecatory thing in your test setup, either.


And I guess similarly if someone does care intimately about test ordering they also ought to care that method names are descriptive. Maybe they should have had to call a method named "i_suck_and_dont_care_about_method_name_quality" to put "i_suck" in a method name? I guess it's hypocrisy all around in that case.


The point of a method name like "I_suck_and_..." is to discourage misuse of the API, or at least provoke conscious thought about whether it's a good choice to make. That the name adds friction is an intended feature, not a bug.

Writing or seeing "I_suck_" is obviously going to cause you to stop and think about it.

I think there's some reasonable contention about whether "I_suck" is professional, though, sure.


There's a difference between "provoke" and "belittle", though.

I don't think I would feel too bad about myself if I had to call a method starting with "i_suck_", but I can understand and sympathize with people for whom having to type in that method name would strike right at their impostor syndrome.

The question is, of course: should we cater to those latter people? The author of this Ruby code clearly thinks not (or didn't think about it at all). I think that we (or at least I) should.


If someone would feel bad for writing i_suck as part of a method call... I don't think they'll last in the real world anyway

What's the upper bound on ego fragility that we ought to consider?


The same could absolutely be said of people who get a little emotional when they are forced to write a method allowing tests to run in order.


I agree

I'm not sure if it's always been there and I just didn't notice it, but it looks like there's been an increase of people who have an ego/spine that's as solid as your average jellyfish

How the heck do these people go about life if even small things like that are sufficient to faze them? Do they just have a breakdown or shut down if anything actually taxing happens?


If the API can be misused I think the snarky code purity response here might be to say the API is flawed, and the underlying mechanism that requires "misuse" ought to be fixed.

Also another problem is if the API can't sufficiently WARN developers when they're doing something dangerous. I think we all know there are already ways of doing this.


> ... I think the snarky code purity response here ...

I think if we can agree that "taking on tech debt can be a good approach" has dignity, we also have to agree that wishing for code without tech debt is also dignified.

:o) Which is perhaps why the "I_suck_and" gets removed from this property.

> If the API can be misused I think ... the API is flawed, and the underlying mechanism that requires "misuse" ought to be fixed.

Which in this case would mean "minitest should only be able to run tests in random order". -- I think having an (inconvenient) option as a concession is a reasonable trade off.


The context here is that Ryan Davis added test order randomization as a default feature to minitest, which was meant to be a mostly-compatible drop-in replacement for the old Test::Unit. Some people complained a lot about this and requested a flag to turn it off. I remember this being hashed out over some time on the ruby-talk mailing list:

https://rubytalk.org/t/minitest-randomization/60387/3

I think the straw that broke the proverbial camel's back was that Rails had order-dependent tests at the time. So Ryan added the flag, with his reluctance reflected by the mildly insulting name and doc comment.
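
For anyone who hasn't seen the flag in the wild: per the minitest docs it's a class-level call that opts a single test class out of the randomized default and back into sorted-by-name ordering. Roughly like this (the test names are made up; the method call is the real one):

    require "minitest/autorun"

    class LegacyTest < Minitest::Test
      # Reluctant escape hatch: run this class's tests in alphabetical order
      # instead of the randomized default.
      i_suck_and_my_tests_are_order_dependent!

      def test_1_creates_the_shared_fixture
        # ...
      end

      def test_2_relies_on_what_test_1_left_behind
        # ...
      end
    end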


> we also have to agree that wishing for code without tech debt is also dignified.

I think then my question is how to achieve this wish given the realities of the production of working code as demanded by business. Is this really a realistic wish? I have yet to work for a company that had code and didn't have technical debt. I have yet to hear of such a company.

> I think having an (inconvenient) option as a concession is a reasonable trade off.

Exactly. Perhaps these specific minitest devs ought to have called their library "i_suck_and_wrote_a_testing_framework_that_allows_writing_tests_that_are_order_dependent_and_also_its_called_minitest". :D


I find it completely unreasonable that you would blame the testing framework here. No, that line does not make sense. It does the opposite of make sense: they made a very reasonable concession and you want to call that sucking? No.

While code that needs this feature really does suck.


Hey! I guess they should have just written code that didn't necessitate the need for this method, in the first place! Oh well, everyone has their flaws! The question is if they need to write "I_Suck" all the time as penitence.


> Hey! I guess they should have just written code that didn't necessitate the need for this method, in the first place!

They did! It's not their code that necessitates it.

The only way you can say they necessitated it is by omission, as in "they didn't write the code to fill this need people have, therefore the need went unmet". But that would be nonsense because seven billion people are guilty of the same crime. No, it's the people that wrote the code under test that necessitated the need. And those people are not the minitest authors.


I think I get what you're saying! It's fine to write code that allows for and enables bad practices when it suits a pragmatic need. We therefore totally agree.


Yeah! But it also helps to discourage those bad practices in the future.

So when the minitest authors added this feature, they enabled the bad practices for a pragmatic need, and also did something to discourage it.

On the other hand, when the people wrote those tests, they weren't writing them that way to suit a pragmatic need. They were just making a mistake.

That's why the former doesn't suck, but the latter does suck. And the writers of those tests are not just enabling bad practices, they are directly performing the bad practices.


I see, so if bad practices also discourage their future usage, things are fine. So if someone writes a bad test and it makes code refactoring harder, things are fine! Perfect!


> So if someone writes a bad test and it makes code refactoring harder, things are fine!

I said it's good to discourage bad practices. Unless you think refactoring is a bad practice, you have understood my comment incorrectly.

But also you can remove the part about discouraging from the previous post, if it's confusing:

Minitest enabled bad practices for a pragmatic reason.

The tests were not following bad practices for a pragmatic reason, they did it by mistake.

Enabling is less bad than doing, and minitest had a reason while the bad tests didn't.

Therefore minitest doesn't suck and those tests do suck.

Edit: fixed a couple typos


I think we should probably go back to what I said at the very start: "I've noticed a lot of developers that have this kind of code purity complex where their way of doing things is superior even if the actual code Works - which I think is the number one most important thing about code."


So that's calling them wrong about whether the tests are bad. Okay, fine, you can have that opinion.

But there's no way to twist that around into "minitest devs are the ones that suck by their own logic".

Either nobody here sucks at coding, or some test-writers suck at coding. But those are the only two options that make sense.


> So that's calling them wrong about whether the tests are bad. Okay, fine, you can have that opinion.

What you're saying, is also just an opinion.


When I say that you can't turn their own standard against them, that is not an opinion. (Except in the "everything is an opinion" sense.) I'm asserting a fact about how words interact.


I'm sorry, if you don't think there are opinions about how words interact we really need to go back to fundamental philosophical basics here.


I do think there can be opinions, but if you stretch that enough then at a certain point you hit "everything is an opinion" territory and then it no longer matters if you call something an "opinion".

When I say your disagreement with them is an "opinion", I am not using that definition. I'm using a much more narrow definition. You have different ideas of what makes code good. Different preferences.

While my argument is not based on my preferences. When I called those tests bad I was just explaining the point. It doesn't actually matter if I think the tests are bad or not. Either way the minitest devs are absolved of their own criticism.

If you would like me to use a different word than "opinion", you can suggest one. The point is, please understand what I mean, and let's not argue about how to define words.


> The point is, please understand what I mean,

If misunderstanding is possible then clearly it's not simple to understand what people mean every time. Maybe if you explained more clearly I'd understand why you feel your opinion is factual and feel what I say is just opinion.


What necessitated implementing the method was that _other people_ were saying that random test order didn't work with their existing test suites (written for Test::Unit). Not only did the minitest author compromise and add this method for people unable or unwilling to fix their test suites, but he even wrote minitest-bisect to help debug ordering issues. I think this is more than enough from someone working as a volunteer on an open source project.


Wishing for code w/out tech debt is naive; not dignified. It's just the lesser of whatever two evils were going on at the time the code was written.


Ryan Davis, the author, is a well known figure in the Ruby community and has never been accused of being too subtle.

This is a little tongue in cheek, but also he means it. He is kind and generous, though won’t shy away from being abrasive to get a point across.


I'd like to think I'm the same! He should definitely have to call the "i_suck_and_dont_care_about_method_name_quality", method :P


Minitest has a number of cheeky little judgments like this [1][2], especially in the docs. That said they support all these things they don't prefer, within reason, and I appreciate that.

Minitest is also full of all sorts of weird peccadillos like being written in the "Seattle.rb style" and autoloading files in all gems that match a certain path [3]. These are not how I'd structure a plugin framework or write code, but it works and the code is easy to understand and hack on. I've spent a lot of time reading the internals of minitest, monkey patching or generally torturing it in ways they probably wouldn't prefer and it's been a trustworthy, if a little bit judge-y, tool for years.

[1] https://github.com/minitest/minitest/blob/master/lib/minites...

[2] https://github.com/minitest/minitest/blob/master/lib/minites...

[3] https://github.com/minitest/minitest/blob/master/lib/minites...


Honestly really enjoy conversational comments, I think documentation that is conversational is amazing.


This is a terrible attitude towards code you support for a number of years. It seems like you just get to dump on the next person, so I guess it works for you, but if one of my people sent a hatchet job in for code review they can expect to get an earful until they do it right.


Knowing that refactoring for developer ergonomics isn't the first priority of every project is important, I think. If I was working for you I'm sure you'd argue with the C-level well enough to give me the time to write tests that run in isolation, which would give you enough time to give me an earful if I didn't!


I think it's good to prevent my test suite from becoming unintentionally stateful. People come and go from projects and you have to keep on top of security updates, so being strict about how things are done makes the project easier to manage in the long run.


> I think it's good to prevent my test suite from becoming unintentionally stateful.

Agreed.


That's the difference between building a shed and building a skyscraper.

And most developers don't understand that sometimes you want to build a shed. In that case, applying the skyscraper building conventions is overkill.

Of course, building a skyscraper like a shed will never get you past the 4th floor without the whole thing collapsing on you.

Also, at the end of a finite project, you can take some shortcuts to get things finished.


> In that case, applying the skyscraper building conventions is overkill.

Of course, which is why applying grandiose judgements to the people who just choose to build sheds is misguided in the extreme.


Yes indeed, I fully agree.

I once insulted an employer with this expression :).

I told them that in the past we were building sheds, and that it all went very well. But now there was a skyscraper project, and they couldn't expect us to build that in the same way. I wanted to make clear that this bigger project needed more structure to support it.

All they heard was "what??? You think we build sheds???" :D.

Nothing wrong with building sheds of course. In fact those are the most fun projects.


Either you mostly worked by yourself or on a small team or you don't realize how much your coworkers hate you.

In my experience, code quality isn't usually about some purity complex. It's a recognition of the fact that this code may have been written once but will have to be read many more times.


Actually the most common reason I get a job is referrals from ex-coworkers.

ed: Also probably worth noting the reason for this may partially be because I don't say their code "sucks".


I work with someone like this. And yes, we hate him.

Unfortunately he’s the CTO. Of a YC-backed company, even!


> even if the actual code Works - which I think is the number one most important thing about code

Ehhhh if the most important thing about your code is that it works, IMHO, that code should be, call it, deep tech code. Exactingly defined requirements, operating contexts, rigorous development processes (thinking of you, NASA checklist). Things like databases and aerospace.

Otherwise the two (IMHO) most important things about code are adaptability and maintainability, not correctness; one way I look at it is that your code can't be more correct than the product design. Give me pure but incorrect code and I can fix it. Give me impure but correct code and I can't adapt it when the needs change.

IMHO (and bias disclaimer: Rubyist here) this is why Rails took over web and Python took over academia / ML - Python is about being correct, Ruby is about being adaptable - so Python is better suited to the well-defined-hard-to-do stuff, and Ruby is better suited to the poorly-defined-easy-to-do.

> explain that purity in a way that would make the end user understand its value

Challenge accepted! :)

Good code breaks down the business and product concepts into good elements, and this has many downstream impacts. You, the end-user, can only use the concepts that we've baked into the product, and the product can only be made out of the concepts that we've baked into the code. The less this is true for something - the more that it's a powerful, flexible tool - the more expertise you need to use it (looking at you, Jira) and the closer it gets to just having you write code yourself. So doing a good job of picking the concepts we use to program a thing impacts how well you, as the end-user, can use those concepts to accomplish your goals and meet your needs.

(And that's aside from stability, new-feature rates, etc).

Edit: Just to be clear! I am very, very glad that people with your mindset exist and are working on things. I rely on the things that you make that Are Correct when I'm making my Nearly Correct things. So thank you :)


> Ehhhh if the most important thing about your code is that it works, IMHO, that code should be, call it, deep tech code.

I think code that is adaptable and maintainable but doesn't work is universally useless, regardless of context. Definitionally useless.

> Challenge accepted! :)

Unfortunately I am a HN denizen and therefore disconnected from the world of end users, who are sometimes convinced by my arguments that refactoring and good testing is important, but in some cases aren't, particularly when they have me on an obscene day rate for a 60 day contract where I'm delivering 3 projects on 3 entirely different codebases I've never seen before (one of which it takes me a day and a half to even compile).

I have a constant desire to write fast, good tests that run in complete isolation and are entirely reflective of how things run in prod. I miss the mark many times, and I have a feeling even the purists do, since I keep having to refactor their code.


> work

Fair. Maybe another way to put what I'm trying to say is that "the code works" isn't the most important thing when, in the problem context, "works" is a fuzzy / non-binary state. Example: The code for each of the blue checks "worked", buuuuut.

> they have me on an obscene day rate for a 60 day

Oh god totes fair. I will fall back on the twin defenses of "users != clients" and "explain != convince to budget for".

> refactor

Do you find that better tested code is more easily refactored?

Kinda where I'm going: I generally wouldn't consider "this needs to be refactored" an inherent mark of missing that mark, as that can also come from changed needs and requirements. But I would expect "hitting the mark" to result in code that's easy to refactor (and that's part of how I define the mark that I, personally, aim for).

Edit: to more clearly define my goal posts; better tested != fully tested. IME too many people don't hold their test code to a good standard; so IMO "better tested" also means that the test code itself is also "hitting the mark" (or close to it).


> Do you find that better tested code is more easily refactored?

100% - I am 100% for really good, fast, compartmentalizable testing that tells you exactly what's broken when it breaks. I am 100% for making people very aware that tests should never require another test to run first in order to work. It's confusing and bad! Good testing makes refactoring painless, and I love that. It's the implication that test order requirements happen because a developer sucks that I think is incorrect, and fundamentally displays a misunderstanding of what "good" software is! Good software, in my view, primarily works. Everything else is secondary to that.

"Works" is really in the eye of the stakeholders, in my view! It is fuzzy! It is possible to convince stakeholders that the problem they have with the code is incorrect, and then the code magically "works" again. It is possible to refactor people's expectations and have them think that code without good testing doesn't work. I find that to be the exception rather than the rule.

These are of course all just my views and others have good points and will disagree.


>IMHO (and bias disclaimer: Rubyist here) this is why Rails took over web and Python took over academia / ML - Python is about being correct, Ruby is about being adaptable - so Python is better suited to the well-defined-hard-to-do stuff, and Ruby is better suited to the poorly-defined-easy-to-do.

I see it the other way round - Rails took over the web because it offered the one true (correct) opinionated way to do web development (plus some great marketing and timing around a wave of Java/PHP disillusionment plus github taking off etc). At the time Python had many competing web approaches some going back to the 90s with no consensus about what 'correct' even was.

I don't see much difference between Python and Ruby on the correct vs adaptable continuum when other languages are taken into account. And often see Rubyists being much more purist in their views of eg Object Orientation etc and regard Python as a dirty collection of pragmatic hacks.


Interestingly, about 3/4 of websites still use PHP. I'm not usually a PHP dev, but my tendency to happily work with any toolset means I've worked professionally on a variety of frameworks in Ruby (Sinatra, Rails, Jekyll), Python (Flask, Django, FastAPI), and others (.NET, Hugo). I can confidently say that some of the most confusing (and magical) code I've read has been Rails code. I don't know though, maybe that was the place I worked, or the version of Rails, or whatever. I'd totally work at a Rails shop again regardless.


I specifically think of each's principles:

https://rubyonrails.org/doctrine

https://peps.python.org/pep-0020/

"No one paradigm" vs "There should be one-- and preferably only one --obvious way to do it."


> The end user almost as a rule, doesn't care about your code purity.

It’s not that they care about the code itself. They care about the 100th feature being delivered at the same speed, and with the same average bug rate, as the first.


A hundred percent, but sometimes we need to deliver the first feature in order for there even to be a company around to deliver the hundredth.


Nobody is talking about process for the first feature. Not in this conversation, not in the real world. That's all you.

There's always a reason why we should do feature n+1 the same way we did feature n. At some point you have to say stop.

You also have to realize that the 'will there be a company around' talk is usually a fiction that gets more work out of the developers. It's a rallying cry, not a statement of fact. Companies run on momentum much of the time. The moment you died isn't the moment the money ran out. It's the moment you didn't avert the money running out. See also 'default dead'. There's a lot of ways to kill a company. Loss of velocity is one of them, and that is dictated most often by tech debt and morale. Eventually the people good at cleaning up messes get tired of only cleaning up messes, and they burn out or leave.


Like I said, there's a bunch of people out there with a "code purity complex where their way of doing things is superior even if the actual code Works".


You’re just repeating a mantra not engaging.

Mostly what I deal with are people whose arguments boil down to “works for me” even when there’s evidence that it doesn’t. Thinking about improvement is hard, and a lot of people find a fictional success to be more fun than an objective one.

We shouldn’t confuse “rewarding things are difficult” with “difficult things are rewarding” or “people who want difficult things are difficult people”.

Most people are gaming the system. While pretending to play an infinite game, they’re mostly worried about getting the next paycheck with as little hassle as possible. Everybody does this to an extent, but many people make enemies of their future selves, which is not healthy. Then they run away from the problems they created, talking a big talk about Progress and bigger and better things when we all know they’re going to make the same mess again and again. Instead of doing the grownup thing and cleaning up their own messes.

They get the same 3 years of experience five times in a row and call it 15 YOE.


> You’re just repeating a mantra not engaging.

That's an interesting perspective because I feel like the things you're saying are not addressing that point: that "code purity" is a self-limiting and often self-congratulatory mantra.


> fastest/cheapest way to write code that works

Pretty much. I do really like good code and I hate touching anything that I have written and know to be bad. But some things need to be done quickly. If I open another project that needs to be green-lit for every single thing...

Also I do a lot of hardware control and some tests really need a defined order. Sure, these strictly aren't "unit" tests, but there are just a very large number of conditions that must be met.

Components that need time to move, components that have a longer cooldown/heating period that you cannot toggle on/off at all times or that generally should be toggled as rarely as possible.

To test that your code is correct, an order to do things might just be sensible and practical. How long a test lasts is also a quality that should not be neglected. Returning everything to the embryonic position after each "unit" is tested would drag everything out.

But I have no problem with the requirement to call a most vulgar function to be allowed to do so. Doesn't hurt to reiterate the ideal.

It should be said that from small engineering offices to large, market-leading companies, it is still quite common that there are no tests at all...


Having tests not be order dependent is good for cases where you only want to run one test in a file. (There’s a thing people say about wanting all tests to run in some small total time. I think it’s ok for smoke tests – that is, tests that connect everything together and show that the magic smoke is not released – but for more serious testing I think one should expect tests to take a while because they are testing a lot)

The theme of the article seems to basically be giving people what they want, though: it’s a tool to take on the technical debt of order-dependent tests if one deems the trade-off worth it. One could imagine a library with no such option where such tests would just fail due to ordering constraints.
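
For anyone who hasn't used it, opting in is a one-line call in the test class. A minimal sketch (the class and assertions here are invented; the flag itself is the minitest method this whole thread is about, which switches the class from randomized to a fixed, sorted test order):

    # Hypothetical example of deliberately taking on the debt.
    require "minitest/autorun"

    class LegacyOrderedTest < Minitest::Test
      i_suck_and_my_tests_are_order_dependent!   # run this class's tests in sorted order

      def test_a_creates_record
        $record = "created"                      # leaks state on purpose
        assert_equal "created", $record
      end

      def test_b_uses_record_from_a
        assert_equal "created", $record          # only passes because test_a ran first
      end
    end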


Most people are not conscious of the effects of friction on their decision process unless you needle them mercilessly to examine their decision process.

Having tests not be order dependent means that the tests are independent, which means if you delete test 3 then test 6 still works. Independence of tests affects your refactoring decisions. Splitting files, combining files, 2/3 conversions, 3/2 conversions, are all a metric pain in the ass if the tests are factored poorly. That means you keep limping along with other tech debt besides the tests.


> Having tests not be order dependent is good for cases where you only want to run one test in a file.

I absolutely understand the value and desire tests that run in isolation. That's super important to me.

> One could imagine a library with no such option where such tests would just fail due to ordering constraints.

One can also imagine a library with purely descriptive method names! And, perhaps that would be an even better library!


I think it’s reasonable to design the interface to guide users towards not making mistakes that will come back to bite them. Maybe this name is too antagonistic but it just seems basically benign to me.


If code quality is a concern then I feel like it's reasonable for me to want a purely descriptive method name and a better warning mechanism.


It's also good for parallelising test running.


> even if the actual code Works - which I think is the number one most important thing about code

The ability of the team to reason about the code, and why it works (or not), is at least as important as whether it "works". Writing code that "works" for some short period of time is often trivial. There are situations when that's necessary or acceptable (solo project, proof of concept, short-runway startup, whatever). But it's a debt decision that must not be taken lightly.

> Working only with Good code is a luxury decision

It's the cheapest decision long term. And that's even before measuring staff retention and other cost driving factors.


> But it's a debt decision that must not be taken lightly.

Agreed.

> It's the cheapest decision long term. And that's even before measuring staff retention and other cost driving factors.

I usually agree. In my example, VB6 is well past EOL, and pretty soon it'll be hard to even run the code.


End users (aka your product managers) care about developer efficiency. Let's use the article in this post as an example. When you have a large code base with lots of unit tests, you want the ability to parallelize tests so it doesn't take an hour to run them all. That wait time will kill developer efficiency. Waiting to see if your code will break the build, and having integration tests take over an hour, will guarantee broken code in the source repository, because devs won't wait an hour to run unit tests locally and will just merge stuff, especially EOD Friday.


> End users(aka your product managers) care about developer efficiency.

Not in my experience, particularly when you have a day to do something and are saying it'll take a day and a half, and not delivering additional functionality as a result of that extra half day.


I had a colleague who had founded a successful company, it got acquired and in the process of extending his very elegant work to handle many more cases it ... got worse. Adding features under time pressure to satisfy customers made it much less clean. After fighting it for a while, he eventually sounded more resigned about it, saying "that's what code that makes money looks like". Not sure if it has to be that way.


> Not sure if it has to be that way.

Neither, but I'm yet to work at a company that both had code and had zero tech debt. I think there's something about capitalism that makes people want to deliver value as soon as possible, and value isn't necessarily dependent on good tests, particularly at the start of a project.


You will always have tech debt, it's a matter of degree. Of course, you have to feel what the best ratio is for your project, which is highly dependent on circumstances, but you definitely want as little tech debt as possible.

But what does capitalism have to do with anything?


Capitalism drives people to deliver "value" as quickly as possible. That's the profit motive, which is the whole point.


The profit motive is about wanting to make money from a project, not about how it's actually achieved.

If delivering value must be done slowly to make a profit, so be it. Notice how alcohol is generally more expensive with age (vintage).


There's many ongoing projects to make whisky faster using ultrasonic agitation! Why? Because if the same value can be delivered faster, people try to. The same is true of software, and testing gets pruned.


I think I’m a bit biased.

When I was the lead engineer for the front-end for our system at work, I wanted correct “better” code.

Now that I’m doing embedded firmware work under incredible hardware constraints, “better” doesn’t mean the same thing anymore.


i did something I'm not sure I'm happy with yet in order to ship a feature in 2 days instead of 6 months. still waiting to see if it blows up. it gives a bunch of not-so-trustworthy users access to a system that's not been thoroughly vetted for security. guess I'll find out if there are any gaps soon.


So, I worked with this one guy who then was promoted to become the product manager. His code "worked" in the way you describe. I.e. it didn't really work, but it worked enough for him to move on to the next glorious moment in his career.

And, yeah, people will recommend him to their employers if those ask. He has a good track record of having "things done". People recommend him because he knows people, not because he's a good programmer.

It's not just that one guy, it's kind of a personality profile that for some reason lacks the drive to make good things. Not in as major a way as psychopaths lacking empathy, but I think about it in the same way: as a personality problem, not any kind of strategic decision. It's normal for humans to want to do their job well, regardless of financial situation or other "practical" concerns. It's a value unto itself.

Back to that one guy. So, in his head, he generated great value for the company by designing external interfaces. He believed that his contribution was in making the interface to the product easy because his theory of how other people behaved was based on his own behavior: he didn't like to work hard to understand whatever he was doing, he just wanted to see a result on the most basic good-path test to declare his work complete.

I really hated working with him because I felt like I had to mop up after a drunken teenage party every time he added more glorious contributions to the code base, and I was happy to see him promoted into management until I realized how much of a bad deal it was. From just being bad code that needed fixing, this became bad design that had, well, no real way of fixing.

Just to give you an example: the product was meant to have HTTP interface to its internal state. Some operations could take a very long time, so there's no way around it -- it had to be asynchronous. The subject of this comment designed the interface in such a way that polling for the state change wasn't in any way tied to the request that asked for the state change. This means that in rare cases, should there be an error while two clients are accessing the API one could see the success response to the request made by the other and stop polling, while the error would've been lost.
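
To make the race concrete, here is a tiny hypothetical sketch (invented names, not the actual product's API) of the difference between a poll that returns one global status and a poll scoped to the request that started the operation:

    # Flawed contract: one global status shared by every client.
    poll_global = -> { { state: "done" } }       # whose operation is "done"? client A may see B's success and stop polling

    # Safer contract: the mutating request returns an id, and polling is keyed to it.
    start_op = -> { { id: "op-42" } }
    poll_op  = ->(id) { { id: id, state: "failed", error: "timeout" } }

    op = start_op.call
    poll_op.call(op[:id])   # status of *this* request; another client's success can't mask the error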

Of course, errors are rare, and races involving errors are exceedingly rare, so even to reproduce such a condition in a test wasn't a trivial task. Also, the whole idea that the "simple and user-friendly" interface design must change because some grumpy dude showed that in some rare cases it doesn't work didn't sit well with the author of the interface.

The problem is... this product is a distributed block storage system. It has to be extremely reliable in order to be successful. I couldn't imagine anyone would deliberately ignore an obvious design error in a system like that, but when that happened, I lost confidence and interest in the product and left the project in a matter of months. Of course, no one is indispensable, but the project lost a senior programmer in a niche field who dedicated many years to working on the project. Realistically, no matter how good my replacement was, they'd have to spend upwards of a year to just bring them up to speed on whatever I was doing. To the best of my knowledge, they never did. Eventually, they stripped the project down to parts, threw away some, repurposed some others, let most of the core team go and rebranded it as something else.

I blame that one guy for the failure of otherwise interesting and potentially useful thing.


> Of course, errors are rare, and races involving errors are exceedingly rare, so even to reproduce such a condition in a test wasn't a trivial task.

Are you assuming he did this maliciously? He very well may have, but why isn't this just a bug that any one of us might write? You seem to feel like you understand the field a lot, and perhaps you do - in that case, perhaps, he didn't?

Maybe he did "suck"! But I honestly can't tell from the information you've given me. It sounds like both you and him would not easily write a test to catch the error.


> Are you assuming he did this maliciously?

As in to sabotage the company he worked for? -- No. Did he know that he's making a worse program? -- Well, in a way... but, not really. He was really convinced of his own brilliance because he thought he designed a good interface, but he didn't realize that his entire approach to design was bad. He just thought himself genius and that's why he believed that he can easily complete a difficult task. The belief that best is the enemy of the good was a contributing factor. A motivation for him to declare himself a genius by producing mediocre results (albeit at great speed).

> It sounds like both you and him would not easily write a test to catch the error.

I didn't need to write a test to catch the error. I could show in the code how the error may happen. Unfortunately, testing as it is practiced today has nothing to say about design errors; it can only aspire to validate the implementation. According to some common practices testers are even forbidden from participating in designing software.


> He was really convinced of his own brilliance because he thought he designed a good interface, but he didn't realize that his entire approach to design was bad.

I see, so you evaluated his design and the skills you have allowed you to see it was bad, when he could not?


You are making it sound as if there was something special about my persona in this situation. Well, there was nothing special about "my skills". Anyone diligently following due process would've come to the same conclusion.

Another problem with this situation is that it went nowhere. The project disintegrated for unrelated reasons. So, there's no proof it would've been actually bad.

But, like I said, this is not an isolated incident. I have a better example of bad design created for the very same reasons (but, by a different person) that did and still does have negative consequences for the product. Bonus: making a test that shows the design is bad is extremely costly (basically, unrealistic).

A little bit of history first. A complex computer cluster management software was created without any plans for how to upgrade it. Traditionally, and very often still, it's distributed to air-gapped systems, so it's not, and will never be, provided as a service. Typically, such software manages anywhere from a couple hundred to tens of thousands, and in some extreme cases hundreds of thousands, of compute nodes, switches, storage nodes etc.

Eventually, the company realized they needed some automated way to perform upgrades. Their first effort was to design a procedure for in-place upgrades. Customers rebelled. Not only does an in-place upgrade guarantee downtime; due to the complex nature of the software, upgrades usually (or, most likely, always) didn't go smoothly and required engaging technical support, with days or even weeks worth of email back-and-forth and potential loss of data.

This is when the company realized it needed a procedure for upgrading "in parallel". And here they came up with an idea of reusing existing code for in-place upgrades: they'd clone the management elements of the cluster, upgrade those, and then transfer the rest of the cluster under new management.

They saw it as a win. I saw it as an absolute disaster. My experience with the in-place upgrade was that it would typically leave behind a lot of shreds of the old system. Unused configuration files, or, sometimes, entire databases with confusing names would be left behind. Patched configuration would often end up having features no longer supported by the new system. If the old system lacked validation for some inputs, then new, improved validation procedures would either skip entirely over old data, or struggle with its indirect usage.

Finally, this management software had a lot of third-party integrations with popular tools. Upgrading it in place often made it impossible to upgrade both at the same time (eg. upgrading a storage engine like Ceph while also using this storage engine to store management configuration). Every such bootstrapping required very convoluted and impossible-to-test code. In many cases the customer would be told to simply remove the integration prior to the upgrade and re-configure it afterwards. Of course, in cases such as the integration with Kubernetes, this meant the customer would lose data (in the form of container state that cannot be reliably serialized and saved somewhere).

To me, the solution was obvious: instead of cloning and upgrading in place, one should install the new system and import the configuration from the old one. This would not allow old stale data to linger, would re-validate old input, and would prompt the user to retroactively fix validation errors. Upgrade and migration of integrations such as Ceph or Kubernetes would still be hard, but not as hard as before, as the same principle would apply to them too.

No matter how much I campaigned for my version of the design, I couldn't change the situation. The person behind the existing design outranks me, and being higher on the management ladder makes them think they know better. They never even tried to estimate the cost the company had paid in providing tech support for a decrepit and poorly built system, or how many development hours had been spent dealing with the bad approach enshrined in bad design, while the same effort could've been applied to development of new features or, at least, to reducing the load on the tech support. I'm sure that to date the company has wasted (or lost...) hundreds of thousands of euros or dollars because of the bad design. Probably more: it's hard to estimate the cost of the lost opportunities.

In order for me to prove myself correct, I'd have to build an alternative upgrade procedure. The original effort for this feature had a team of 3-4 developers working on it almost exclusively for at least half a year. Today, the company supports about four dozen versions of its own product, and the new upgrade system would have to support at least half of that. I will never get funding for such a project, let alone for writing a test that compares the two designs (via their implementations). That'd be many years of work.

----

And the author of this design? -- Well, he's exactly the kind of guy who believes, literally, that the best is the enemy of the good. He works very fast, and often volunteers to help customers directly by adding more untested features to the product. Him being very senior gives him a free pass on code reviews and design documents, which are otherwise expected from everyone else. He sees himself as a hero, someone who day after day saves the company from the devastating consequences of... his earlier bad ideas.


> Anyone diligently following due process would've come to the same conclusion.

> To me, the solution was obvious

> I will never get funding for such a project,

> bad approach enshrined in bad design,

> He works very fast, and often volunteers to help customers directly by adding more untested features to the product.

> He sees himself a hero, someone who day after day saves the company from devastating consequences of... his earlier bad ideas.

What I'm hearing is this guy delivers features to the customers, but your ideas are better, obviously.

If they were obviously better, you'd be able to secure funding to do them. What I think is actually happening is you've been unable to demonstrate the value of your ideas, or convince people they are valuable. The conclusion I am drawing here is that you haven't shown that your way is better - you just believe it is.

My belief is that the attitude that people suck if they write tests that must be ordered, "indicates a lack of understanding that finding the right balance of speed, cost and quality is a compromise." But, many developers believe that their pure code is just always "better".


> What I'm hearing is this guy delivers features to the customers, but your ideas are better, obviously.

Well, then you need to pay more attention... Your interpretation is wrong.

> If they were obviously better, you'd be able to secure funding to do them.

You don't understand the difference between quality and pricing policy. Best food is not the "fast food", even though customers love it and it's cheap. The "fast food" is causing obesity epidemic, diabetes epidemic and a bunch of other health-related issues that, in the grand scheme of things make the savings on the food quality not worth it. But, for some people it's easy to weasel their way out of this problem by pretending that repercussions don't exist, and that they will bear no responsibility for the consequences.

The "deliver features to the customers" guy is the one who internalizes profits and externalizes expenditures. He's a douche canoe. There's nothing positive about people like him. The fact that he gets budget to do idiotic and harmful things is not because they are valuable, but because this creates a condition for a scam that profits those funding him.

In the same way that not everyone is a dietitian, and so cannot be expected to assess the value of the food they eat and has to rely on experts, the customers using our software will not know how well different features of our software are implemented. All they have to go off is a changelog and some promotional materials. There's enough of a time buffer between the customers buying the product and discovering how crappy it is for the product authors to still make profits. And, given the conditions of our market, where the product we make is, basically, without competition, even if the customers discover its true quality, they'll come back begging us to fix things rather than bail on us. Which gives even more freedom to the "deliverers of features to the customers".


> Well, then you need to pay more attention... Your interpretation is wrong.

> You don't understand the difference between quality and pricing policy. Best food is not the "fast food", even though customers love it and it's cheap.

> There's nothing positive about people like him.

This is the exact problem I'm getting at: Individuals don't get to unilaterally declare what good software is, what the worth of people is, what people do or don't understand.

This guy may in fact have a more nuanced and "better" understanding of software and what goes into making it, than you. Prove that he doesn't! If you can, you can take that proof into work and take his job!



Sad. It's meaningless without the shaming.


I sort of agree with the idea: the developer calling this code may not be responsible for the mess that necessitated the call in the first place.

A name like this_code_sucks_and_these_tests_are_order_dependent would be much more appropriate. It'd still be a name that would at least raise some eyebrows during code review, so you'd have to write some kind of explanation of why you're enabling this.


Shaming others is and has been part of the Ruby on Rails culture since its inception.


which commit originally put it in? That guy was a comedian


This is the commit that added it, with the description "users_dont_suck_but_only_we_suck_and_only_our_tests_are_order_dependent!"

https://github.com/rails/rails/blob/6ffb29d24e05abbd9ffe3ea9...

So the "I" refers not to the general user of the function, but to the implementer of the functionality, it seems. Code philology of the day!


You didn't follow the blame further.

First, the flag is provided by minitest. That commit is an implementation of the flag in Rails.

Second, 6ffb29d moved it to prevent Rails's test framework from setting it _by default_. 281f488[1] actually added it to Rails.

minitest/minitest#a4553e2[2] appears to have added the docstring and test case to minitest.

1: https://github.com/rails/rails/commit/281f488fffc176084bf77c...

2: https://github.com/minitest/minitest/commit/a4553e2e127072c9...


did you find all this through _clicking_ in the github UI or did you use a fancier tool on your computer? impressive sleuthing!


All through GitHub.

1. From https://github.com/rails/rails/blob/6ffb29d24e05abbd9ffe3ea9..., click "Blame" on the header bar over the file contents.

2. Scroll down to the line and click on the commit in the left column.

3. Scroll down to the part of the commit that removed the line from its previous location, in activesupport/lib/active_support/test_case.rb.

4. Click the three-dots menu in that file's header bar and select "View file".

5. Click "History" in the header bar of the contributors, above the file contents.

6. I guessed here that it was introduced in commit 281f488 based on its message: "Use the method provided by minitest to make tests order dependent". Clicking through to it confirms that. There's also a comment here that identified the problem which led to, and provided context for, the change in 6ffb29d.

The OP is from minitest's documentation, so to find the introduction in minitest, it's basically the same process.

1. Go to https://github.com/minitest/minitest.

2. Search the repo for the method name. Even just "i_suck" will match the commit.

3. Select the oldest commit in the results. That's a4553e2.


Pytest has an equally deprecating option for a different "use case":

    disable_test_id_escaping_and_forfeit_all_rights_to_community_support = True

https://docs.pytest.org/en/6.2.x/parametrize.html#pytest-mar...


I'd say not equally. The pytest one is basically "we get that your use case might be a little off our beaten path, and we want to allow you to be successful, but we aren't prepared to take on the support burden of it".

The minitest thing is "if you don't do things how we envision with our limited perspective, you are bad at programming".


There is no reason to have order-dependent unit tests. It's good for test tools to do everything they can to keep you from having them.


I think pytest understands that they only need the escaping because their own code is subtly broken in many places.

And look how much it screws up text: https://user-images.githubusercontent.com/1457682/57952631-b...


You're a bad programmer. I'm a good test runner.


This is great. I feel like this will become the next meme.


Definitely. It will probably happen by the time Avatar 2 is released.


You will have to wait 10 months.


I have a method in our Java app, `iKnowWhatImDoingGiveMeTheSession()`, that will only work if you have set the system property `doYouReally=Yes, I really know what I'm doing`

It should never be used in production, but can be helpful to have available when doing certain kinds of investigation.


What you’re describing is a debugger!


All my migration/tweak-production-db scripts run by default in dry-run mode and require `--really` to override it. It's good exercise to write mutating code in as transactional a way as possible.


A debugger doesn't remember the proper steps for me, a written function does.

I absolutely use this function in a debugging context. If it didn't exist, could I get at what I needed? Yes, but it's inconvenient to use reflection to get at private Java fields during a debugging session. Not impossible, just inconvenient.

This function exists as a convenience, but it has to look weird, otherwise its convenience might lure someone into using it to do the wrong thing for a production use case.

Good APIs make the right thing easy and the wrong thing hard.
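
The same pattern carries over to other stacks. A minimal Ruby sketch of the idea (the names and the acknowledgement phrase here are invented, not the actual helper described above):

    # Awkward-on-purpose escape hatch: convenient during debugging, scary-looking in a diff.
    def i_know_what_im_doing_give_me_the_session!
      phrase = "Yes, I really know what I'm doing"
      unless ENV["DO_YOU_REALLY"] == phrase
        raise "Refusing: set DO_YOU_REALLY to the exact acknowledgement phrase first"
      end
      Thread.current[:raw_session]   # hand back the otherwise-private session object
    end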


I wrote an unsafe library at a FAANG and used similar naming conventions.

The init function was named along the lines of “MyClassName_UNSAFE::initUnsafeIUnderstandTheRisks()”

And the library itself was called something like “library-name-unsafe-my-team-name-must-approve-diff”

So anyone trying to use it would have to add a library with that name to their list of dependencies and would come to us asking for a code review, and more than half the time we would redirect them to a safer alternative (valid use cases for the unsafe library were few and far between).



No, mine was in closed-source internal code.


The MySQL command line has a flag called --i-am-a-dummy. I know it's supposed to be funny, but I think it's actually a good idea to run in dummy mode when you can, so the insulting name might be counterproductive. I think most people who have used SQL have run an UPDATE with an insufficiently restrictive WHERE clause, or even forgotten to include the WHERE clause entirely, at least once.


Especially in databases like mysql I often miss opt-in dangerous operations. Dummy mode should be default and opting out of it should be called --i-am-a-dummy.


That flag also has the equivalent, non-insulting alias "--safe-updates", presumably for that very reason.


Make your models thin. Then make your models simple data structures.

Then your tests can build a data structure with just what is necessary for the test. Then your tests don't need to be ordered.

The common problem is that models are fat, and to test them requires building up some very complex scenario. Every step mutates something. So the step order is critically important.

This is absolutely not necessary if your functions are pure. It's also easier if your models are simple data structures, so they require less setup.

edit: added - if you make your models thin, then you need to add "service" modules which encapsulate the business logic. Taking it outside the models makes it more testable with less effort, and it also makes it less dependent on your framework.
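
A minimal Ruby sketch of the shape being described (names invented): the model is just data, the business rule lives in a pure "service" function, and the test builds only what it needs, with no ordered setup:

    Cart = Struct.new(:items, :discount)          # thin model: just data

    module Pricing                                 # "service" module: pure function, no mutation
      def self.total(cart)
        cart.items.sum { |i| i[:price] } - (cart.discount || 0)
      end
    end

    require "minitest/autorun"

    class PricingTest < Minitest::Test
      def test_discount_is_subtracted
        cart = Cart.new([{ price: 100 }, { price: 50 }], 10)
        assert_equal 140, Pricing.total(cart)
      end
    end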


How do you make all functions pure, though? At the end of the day, what makes a computer interesting is that it is stateful.


Oh come on. The edge functions do mutations, but the interior functions are just functions. Somehow mathematics manages to get things done in this manner.


And then you need to cache something to disk, at which point you either plumb a callback all the way into your onion or give up.

Mathematics doesn't manage to get things done in this manner. One of the key distinctions between programming and mathematics is that programs can do things and math just is.


Can’t tell whether you are being ironic or not.


No, I am not. Perhaps I am not communicating my meaning well though...

A model (class) can represent a certain kind of data, and an instance of that model (object) can represent the details of one instance of that kind of data. On this I imagine we can all agree.

OOP attempted to organize systems by bringing the data together with the operations that could be performed on it (class/instance methods). And with the goal of hiding the implementation details, the instance methods would work on Self, mutating the data members of a given object.

There are many problems with this approach, and one is this: You often have business rules which involve greater than one model, such as how a shopping cart checkout will touch a payment record, a cart, product inventory, etc. The tidy models suddenly get wrapped up in a bigger system where some operation needs to carry and pass around multiple objects and push each object to mutate itself in various ways (or worse, just reach in and directly mutate the objects from outside).

So in a real system (complex), you can find yourself at a point where some object has changed, but it's not clear why or where. And adequately writing tests for a system like this is very difficult, because the combinations of possible scenarios are almost beyond reason.

Now imagine an alternate approach where you can still have a data model, but the operations you might perform are separate and do not mutate anything; instead, they take data (some representation of the model, whether a thin OOP object or a simple data structure such as a hash), they perform whatever operation they need to perform, and they return a new representation of that data (where presumably the details are different from the original data that was passed in).

Writing tests for this is very easy now, since the only setup required is whatever is minimally necessary for that small (single responsibility) operation. And it is more observable, because now a series of functions which make up a bigger operation have a history of outputs which can be compared or saved.

I have demonstrated this approach in a real world situation where my team built a greenfield system to replace a legacy system. Both were Rails, and the original used the typical fat model Rails approach as you would learn from reading Rails books and guides. The new replacement had minimally thin models and "service" modules which had the logic. Test coverage in the replacement system was much higher, but lines of code were much lower. Also tests ran multiple times faster in the new system. Code was easier to reason about, and debugging was easier.

The only downside, as such, was that there were now many smaller functions. So choosing good function names took more work and typically resulted in longer names (to express the intent of the function).

There was initial resistance to this approach, but that resistance later became promotion of the approach to other teams.

Ultimately this approach takes most of the business logic out of the Rails framework, which also makes it easier to transition to separate service processes/systems when things grow too big. Really it's not so unlike the intention and benefits of separating data from actions from display as the model-view-controller paradigm does.


Hey, sorry about this but… I meant to reply to that other “this is how it’s done in math” comment, not you. I agree with you in that Rails’ fat models aren’t great, and I think coupling business logic with persistence was a mistake. In general I would say that ActiveRecord is not great. It’s “adequate for some usecases”.

Going full “pure functions all the way” is where I think we’re hitting diminishing returns - the gap between what the language says and what the machine does needs some sort of bridge. In Haskell, the bridge is a whole host of abstractions: Monads, endofunctors etc. That’s… too much stuff. I can’t say it better, I’m afraid.


Disagreed. Any sufficiently complex app will have mutation dependent tests regardless of where that complexity is stored, models/services/controllers?!/views!?

The point is you want to "replay" all mutations in each test, i.e. TestA requires Mut1, Mut2, Mut3... and it just so happens TestB also requires the same mutations.

What this function is trying to say is - Don't make TestB depend on running after TestA because TestA "set up your world" for you and it was "easier/faster/less-work" to depend on that one.

Run Mut1, Mut2, Mut3 for TestB independently. Yes, it doubles the time because you're performing the same mutations, but maybe it makes sense.

The question becomes, should TestA and TestB be part of the same actual test? OR are they sufficiently different that it just makes sense to eat the double setup time.
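
In minitest terms that usually just means putting Mut1..Mut3 into a shared helper (or setup) that every test calls itself, rather than letting one test piggyback on another's leftovers. A small hypothetical sketch:

    require "minitest/autorun"

    class CheckoutTest < Minitest::Test
      def build_world                        # Mut1, Mut2, Mut3 from above
        cart = []
        cart << { sku: "a", price: 5 }       # Mut1
        cart << { sku: "b", price: 7 }       # Mut2
        cart.sort_by! { |i| i[:sku] }        # Mut3
        cart
      end

      def test_total
        cart = build_world
        assert_equal 12, cart.sum { |i| i[:price] }
      end

      def test_item_count
        cart = build_world                   # same mutations, re-run; no dependence on test_total
        assert_equal 2, cart.size
      end
    end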


We run ordered Integration Tests to save time as the set-up is onerous. Is that bad?


not necessarily - the key is that you want to make the ordering dependencies explicit.

the real issue comes from tests that have accidental or implicit order dependencies, such that adding a test to the middle of a file might cause subsequent tests to fail because they expected something from the previous test.

or, refactoring a large file into two smaller files might result in a bunch of test failures, because the tests happened to depend on system state specific to the original larger file.

implicit ordering dependencies also tend to assume single-threaded test execution. a common strategy to speed tests up is to run them in parallel, and that can cause havoc if the tests aren't carefully scoped to act only on resources created by that test (such as a test of user creation that in its teardown phase says "eh, just delete all users with a username starting with 'test_'")


Like everything in tech, it’s a tradeoff. You’re trading the potential for side-effects from one test to influence the behavior of subsequent tests without you knowing for decreased setup time. This might be a good thing or a bad thing—it depends! Engineering is balancing these tradeoffs and mitigating the risks their downsides bring.


It's bad, no doubt. The question is: what are the costs of the alternatives?


I think that your ordered tests are the setup.


I don't know if they exist in this library but in Python I would use subtests:

    def test_ordered_integration(subtests):  # `subtests` fixture from the pytest-subtests plugin
        # setup runs once, shared by the subtests below
        with subtests.test("thing one"):
            assert something
        with subtests.test("thing two"):
            assert something_else
        # etc


Now you'll struggle to run them completely in parallel (in complete isolation, i.e. unique database server and what-have-you). This will probably matter if you want to run them in CI. It's a trade-off though, and itests are a nasty monster to deal with no matter what.


Can you do setup single-threaded (or get back to being single-threaded at the end of setup) and fork before starting each test case?

If you need multiple service processes up and running, can you make them less stateful to get around some of your test limitations?


PNPM has an option called `shamefully-hoist` and it's by far my favorite option of all time: https://pnpm.io/npmrc#shamefully-hoist


Hmm, 11 characters is reasonable for this sort of thing.

Pytest's disable_test_id_escaping_and_forfeit_all_rights_to_community_support is too many characters IMO.


I disagree that ordered tests are bad. See, I don't even have mutable state in my program -- but still have ordered tests for another reason: the ease of debugging. Say, if module A depends on module B, then B should be tested first, only then A, since if you test in the other direction, you might have a hard time figuring out whether A or B is misbehaving.


This doesn't make sense to me. If you know that A depends on B and you run your test suite while B is broken, then both A Suite and B Suite will fail. But the order is irrelevant, you'll know that B (the one A depends on) is broken and that should be the focus of your investigation. Why would there be any confusion if the tests executed in a random order?


I've answered it in the other comment -- because I might not know the entire module dependency graph. For example, if the code is not mine.


If the code isn't yours then you have to hope they ordered the tests correctly or, you know, figure out the dependency order. Which isn't hard:

- Use a static analysis tool that kicks out a dependency diagram. That literally shows the dependency order.

- Look for import/include/using statements. That shows the dependency order.

- Look for inherits/implements part of class definitions. That shows the dependency order.

- Look at the build commands. It'll tell you the dependency order.

- When you see a function, constructor, whatever being used, you have a dependency.

And if the code is yours, and you don't know the dependency order, then fix that. Write it down or something.


Well yes. But this all requires additional movements on my side. On the other hand, if the tests are always ordered such that dependencies are tested first, then I won't have to deal with the approaches you've mentioned.

Ideally, a test framework should enforce the order based on the project ontology. They don't usually do that though because they cannot easily examine the code being tested to extract the necessary dependency information.


Doesn't that only apply if you stop the entire suite upon receiving one failed test?


Even if I can have multiple failing tests, unordered testing might get tricky. For example, as I said, A depends on B, and suppose that both A and B tests are failing. If I (or really anyone else in the team) forget that A depends on B, they might be tempted to debug the A test first, only later figuring out that this is B that needs to be fixed.


What would be best, probably, is for them to run in a random order but have the output always sorted the same way.

