
The Haskell ecosystem uses a very similar convention: unsafe functions have names starting with ‘unsafe’ (or, in one particularly horrible case, ‘accursedUnutterable’ [0]). These are generally functions which subvert the type system, so the name acts as an alternate warning.

[0] Documented at https://hackage.haskell.org/package/bytestring-0.11.4.0/docs... — incidentally one of the funniest pieces of documentation I’ve ever seen!




That doc link is funny, but it's also very frustrating -- you would think that after all those warnings, there would also be some glimmer of information about what sets the accursedUnutterablePerformIO function apart from unsafePerformIO, or at least why it exists, but apparently not. It's clearly important because most of the core functions in Data.ByteString call it.

It seems like the authors have decided that the trickiest, most magical parts of their codebase are exactly where they don't care about useful documentation. That seems like a bizarre strategy unless they plan to never let anyone else help with maintenance.


“If you aren’t willing to read the source of the function to learn what it does, we don’t want you anywhere near it. No summary short of the function itself can accurately communicate how much of a fragile hairball of edge-cases this function is, and how many things you have to know about to correctly call it.”


This is pretty much correct. For reference, here’s the original documentation (changed in commit 80ff4a3018cd8909abb1d4e0c32f012a523883ec):

    -- | Just like unsafePerformIO, but we inline it. Big performance gains as
    -- it exposes lots of things to further inlining. /Very unsafe/. In
    -- particular, you should do no memory allocation inside an
    -- 'inlinePerformIO' block. On Hugs this is just @unsafePerformIO@.

The commit message notes that ‘We've had a few instances of people being tempted to use it without really understanding the consequences’.


I wonder why inlining the function makes it so much more unsafe.


This is the worst kind of arrogance. All you're doing is reducing the number of people who'll understand what's actually going on, which not only makes it less likely to be used properly; it also makes debugging code that uses it more difficult.

You're deciding for other people what they should and should not know. Another word for this is gatekeeping.

Don't do that. You're not the other person. You can't read their minds, nor can you fathom their reasons. Deliberately hamstringing folks is bad policy, and supremely arrogant.

Just document it properly, lay out the risks, and leave it to the reader to decide on policy.


Oh wow your comment really rubs me the wrong way...

In my opinion, it is properly documented, almost perfectly so, by the authors giving you the full source code.

If you are not willing to read the source code, but instead always blindly rely on documentation to be correct, then in my book that's just laziness.


> If you are not willing to read the source code, but instead always blindly rely on documentation to be correct, then in my book that's just laziness.

Hmm, I agree somewhat with both your comment and the one you're responding to. However, the claim that you should read the source of the libraries/frameworks you're using is interesting to me: it seems sane, but not always entirely viable.

For example, who would be more successful:

A) A Java developer who reads the Spring source code and dives into that verbose Eldritch codebase with years upon years of fixes, design patterns and abstractions, all to learn how the dependency injection works better, probably spending days to weeks in the process?

B) Or perhaps someone who looks up the annotations or configuration they need, or maybe other code snippets on StackOverflow or the docs, or a similar site and solves their problem so they can move on to the next tasks, while being none the wiser otherwise?

Sometimes the answer will be A (deep knowledge can be useful), sometimes it will be B (we all have limited time and energy). However, it feels to me that with sufficiently complex codebases one might just get more and more confused, as opposed to gaining any sort of clarity or understanding, when the framework/library actually abstracts away really complex things.

I'm just using Java as the example here, as it's one of the more enterprisey languages, and Spring as an arguably brownfield, maintenance-mode framework that's huge given its long history.


I appreciate that you broke apart the users into two groups. I imagine Group B is much larger than Group A, and that the Haskell maintainers want to emphasize "if you use accursedUnutterablePerformIO, make sure you are in Group A!".

Most Group A people didn't get there fixing surface level stuff. They got there because there was an insidious defect buried deep that required that deep knowledge to understand or fix.


Documentation is not a substitute for reading the code, but rather a complement to it.

Code by itself is a limited medium that can only do so much. Asking someone to rely solely on code to explain a complicated subject is counterproductive (the more complicated the code is, the harder it is to guess whether something in the code is a mistake, or working as intended with some unknown reasoning behind it).

Documentation points out the non-obvious parts, the reasoning behind it, the subtleties, the gotchas (things to watch out for), the best practices, etc.

Documentation is a map to guide the user, not an end-all-be-all about what the code does.

When the code does non-obvious things or has non-obvious reasons, you need both code and documentation.


I would agree in general, but the comment I was replying to is criticizing the following documentation as arrogant:

"If you aren’t willing to read the source of the function to learn what it does, we don’t want you anywhere near it."

And I believe that might be reasonable for a complex function that cannot be exhaustively documented in any form shorter than the source code.


I disagree, but I think this comes down to the wording and tone.

The following would be perfectly justifiable and I'd agree with it wholeheartedly:

"This code is incredibly complicated and we don't have the resources to document it properly. So please DO NOT GO ANYWHERE NEAR THIS CODE unless you've spent weeks/months studying it in depth. And even then, think twice!"

One is a friendly "here there be dragons" style warning that explains some things (such as why it's not documented). The other is a passive-aggressive accusation followed by an "I know better than you, kiddo" style enforcement, which exudes arrogant presumption.


But that phrasing makes an entirely different point. It's not "we don't have the resources to document it properly" — it's more like...

"There is no possible documentation that could help you to not screw up when using this function. We've tried. There used to be docs here — first as doc comments, then hidden within the function (to force you to read the function to find them), then as a special pre-commit lint rule that triggers when people add a new use of the function to the codebase without adding a special annotation to acknowledge it. None of it helped; people still screwed up the usage. Even the people who wrote the function screwed up in new usages of it — every time they tried! — because the number of things you need to remember to get right to use it, exceeds the capacity of any single human being's short-term working memory. The very existence of this function is Ozymandian arrogance. It should be banished from the Earth. The only problem is that people keep reinventing it. We're waiting for a sufficiently-smart linter, so we can teach it to not let people reintroduce any function isomorphic to this to the codebase; then we can truly excise it for good. In the meantime, we leave it undocumented, for a bit of security through obscurity: a thing that you can't find a good way of coming to a seeming understanding of, you're less tempted to use."


I'd argue that both of you are somewhat correct. Give me the code, so I can see what it does. And give me documentation (be it correct comments, dedicated documentation or whatever) to know why it does something.

Code states the obvious, but you cannot read between the lines.


It reminds me of how narrow roads are sometimes used to slow traffic, though I don't know if it's effective here. I'm hesitant to discourage it: when the language is so explicit about it being bad, it might be quite effective, as far as I can tell.



It's documented that you need to read the source.


Good documentation tells you a lot more than the source could. Especially when you're talking about horrific side-effects.


Another correct way to use Hungarian Notation, according to (iirc) Joel Spolsky


To me unsafe generally means “memory safety or type safety is disabled”. It doesn’t mean the code can’t run in production. Maybe someone just implemented fast inverse square root and tested it manually.

danger_force_electrify_fence() is a whole different story.


> To me unsafe generally means “memory safety or type safety is disabled”.

And that is exactly what it means. Unsafe functions in Haskell do run in production. They just subvert type safety (and I believe memory safety too, in the case of accursedUnutterablePerformIO).


Even plain unsafePerformIO is memory-unsafe, as it will let you construct a polymorphic IORef, through which you can implement unsafeCoerce.


Hmm… is that memory-unsafe? I thought that came under the heading of ‘subverting the type system’. Then again, I suppose they’re more or less the same thing in a strongly-typed and pure language like Haskell.


Yes, it is very memory unsafe. Subverting the type system is often the main source of memory unsafety in most languages. It typically means you can treat an arbitrary integer as a pointer to any other object.
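
To make the "integer as a pointer" point concrete, here is a minimal sketch in Rust rather than Haskell, since Rust marks the dangerous step explicitly. The helper name `read_via_integer` is made up for illustration. The casts themselves are safe; only the dereference needs `unsafe`, and it is sound here only because the integer really did come from a live reference.

```rust
/// Hypothetical helper: round-trips a reference through a plain integer.
fn read_via_integer(value: &u32) -> u32 {
    // Pointer to integer: a safe cast, but the type information is gone.
    let addr = value as *const u32 as usize;
    // Integer back to pointer: also a safe cast.
    let ptr = addr as *const u32;
    // SAFETY: `addr` came from a live `&u32`, so `ptr` is valid and aligned.
    // With an arbitrary integer this dereference would be undefined behaviour.
    unsafe { *ptr }
}
```

Calling `read_via_integer(&x)` simply gives back the value of `x`. The same dereference on a made-up address is where the memory unsafety would come in.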


Rust and C# both use 'unsafe { }' blocks, which can only be enabled by a module-level compiler flag.


There is no such "compiler flag" in Rust, that's only a C# thing.

But also more relevantly here, stuff you'd be calling in those Rust blocks tends to have longer names and often calls out the fact it's specifically not checked, e.g.

  a = b.unchecked_add(c); // Like a C or C++ arithmetic operation if you overflow it's UB. This might be faster. It might not. But if you must have this anyway, that's how

  n = NonZeroU32::new_unchecked(0); // Unlike new() which returns None because duh, zero isn't non-zero, this results in UB.

Not all of them though, for example:

  v = Vec::from_raw_parts(ptr, len, cap); // Make a Vec, the pointer ptr had better actually be pointing at contiguous memory for exactly cap item-sized slots, the first len of which are in fact legitimately values of the appropriate type, if any of this is wrong that's immediately UB.
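
For anyone who wants to see how calls like these usually sit inside a safe wrapper, here is a minimal sketch. The helper names `nonzero_or_one` and `rebuild_vec` are hypothetical and not from any library; each `unsafe` block carries a comment stating why the precondition holds.

```rust
use std::mem::ManuallyDrop;
use std::num::NonZeroU32;

/// Hypothetical helper: returns `n` as a NonZeroU32, substituting 1 for 0.
fn nonzero_or_one(n: u32) -> NonZeroU32 {
    let n = if n == 0 { 1 } else { n };
    // SAFETY: `n` was replaced by 1 if it was 0, so it is non-zero here.
    unsafe { NonZeroU32::new_unchecked(n) }
}

/// Hypothetical helper: takes a Vec apart into raw parts and rebuilds it.
fn rebuild_vec(v: Vec<u32>) -> Vec<u32> {
    // ManuallyDrop stops the original Vec from freeing the buffer.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: ptr, len and cap all come from one live Vec<u32> that will
    // never be dropped, so the rebuilt Vec is the buffer's only owner.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

Both functions are safe to call from the outside: the caller cannot violate the preconditions because the wrapper establishes them itself before entering the `unsafe` block.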


While sadly it does not require a flag to enable, at least you can turn it off with: `#![forbid(unsafe_code)]`.
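
A minimal sketch of what that looks like, assuming a made-up module `checked_only`. The lint is applied here as an outer attribute on one module; the crate-wide form quoted above goes at the top of the crate root instead.

```rust
// Module-level form of the lint. At the top of a crate it would be written
// as the inner attribute `#![forbid(unsafe_code)]`.
#[forbid(unsafe_code)]
mod checked_only {
    /// Hypothetical helper: overflow is reported as None instead of being UB.
    pub fn add(b: u32, c: u32) -> Option<u32> {
        // An `unsafe { ... }` block anywhere in this module is a compile
        // error, and `forbid` cannot be relaxed by a nested `allow`.
        b.checked_add(c)
    }
}
```

With this in place, `checked_only::add(u32::MAX, 1)` returns `None` rather than overflowing, and nobody can quietly swap in an unchecked variant inside the module.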



