He's right, but he's disingenuous in saying that random line duplication can't c...

coolsunglasses · on April 16, 2014

Clojure isn't particularly well suited to avoiding problems like this. I've written a lot of Clojure for work and for open source.

We need pure, typed FP langs like Haskell/Agda/Idris.

To boot, I'm having a much more enjoyable and relaxing time in a Haskell REPL than I was in my Clojure REPL.

Somebody I follow on Twitter just said something apropos:

"girl are you Clojure because when I'm with you I have a misplaced sense of [optimism] about my abilities until I enter the real world"

loumf · on April 16, 2014

I don't think clojure (or lisp) was designed to avoid line-duplication errors. It's mostly an accident of it being not very line oriented.

I just randomly picked a function from core to illustrate this, but a lot of Clojure code looks similar:

     (defn filter
       "Returns a lazy sequence of the items in coll for which
       (pred item) returns true. pred must be free of side-effects."
       {:added "1.0"
        :static true}
       ([pred coll]
        (lazy-seq
         (when-let [s (seq coll)]
           (if (chunked-seq? s)
             (let [c (chunk-first s)
                   size (count c)
                   b (chunk-buffer size)]
               (dotimes [i size]
                   (when (pred (.nth c i))
                     (chunk-append b (.nth c i))))
               (chunk-cons (chunk b) (filter pred (chunk-rest s))))
             (let [f (first s) r (rest s)]
               (if (pred f)
                 (cons f (filter pred r))
                 (filter pred r))))))))

There are very few lines of this function that can be duplicated without causing a syntax error because parens will be unbalanced.

I see two:

In a let with more than two bindings, you could repeat the middle ones. In clojure, this is very likely to be idempotent. In this code, it's the

        size (count c)

In any function call with more than two arguments, if the middle ones are put on their own line, they could be repeated, like this in the final 'if'

        (cons f (filter pred r))

In many cases, you will fail the arity check (for example in this case). If not, the function should fail spectacularly if you run it.

So, I think it's accidentally less likely to have problems with bad merges and accidental edits (not designed to have that property)

lgas · on April 17, 2014

Of course it's just as easy to duplicate a form as a line. When I edit Clojure code I use paredit so I'm editing the structure of the code instead of the text. Instead of accidentally "cut this line then paste twice" I could easily do "cut this form then paste it twice". Paredit will make sure I never have bad syntax but now I have the logically equivalent mistake.

yohanatan · on April 16, 2014

Your analysis only holds if closing parentheses are all gathered on the same line as final expressions (which may not be true for some styles).

chc · on April 16, 2014

It is considered unidiomatic in every Lisp I am aware of to orphan parens. To some degree this is a stylistic concern, but it's a much more open-and-shut case than, say, C brace style.

loumf · on April 16, 2014

I'm not thinking this through completely, but it seems resilient to a lot of styles.

Function calls (which is a lot of what Clojure is) start with an open paren that is very likely not closed on the same line (because you are building a tree of subexpressions).

Wherever you put the close (bunched or one per line), if you don't put it on the line with the original open, it will be unbalanced in both spots (meaning the first line and the last line can't be duplicated without causing a syntax error).

yohanatan · on April 16, 2014

True. But consider the form:

    (if (someExpr)
      (doTrueStuff)
    )

Then a duplication of the `doTrueStuff` line would lead to true stuff being done regardless of the truthiness of someExpr (as the third [optional] argument to `if` is the else branch).

This form is not entirely unheard of either. The overtone library, for example, assigns labels to its event handlers like so:

   (defn eventHandler ( ...
      stuff
   ) :: event_handler_label)

Jtsummers · on April 16, 2014

This is actually why I really like `cond` in Common Lisp (and other lisps and languages). You have to make explicit what should happen if your desired expression is true, and the only way to have an `else` clause is `(t ...)` so you have to intentionally create that last wildcard spot.

mercurial · on April 16, 2014

Even in Haskell you can fairly easily get non-reachable code:

  if a == a then
      ...
  else
      -- unreachable

Combine this with the unfortunate habit (which I'm also guilty of) of allowing variables to be called a' and you'll get an easy-to-miss bug.

WolfeReader · on April 16, 2014

Given that types may define their own instances of Eq, the else clause might be reachable.

mercurial · on April 17, 2014

And the content of the first block may be unreachable all the time, depending on the implementation of Eq. But you probably have a number of issues to worry about if your implementation of Eq doesn't work for the identity case.

loqi · on April 25, 2014

let nan = 0.0/0.0 in nan == nan
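The same floating-point behaviour can be checked outside Haskell. Below is a minimal C sketch, assuming IEEE 754 doubles, a `<math.h>` that defines `NAN`, and no `-ffast-math`-style optimizations; the function names are made up for illustration.

```c
#include <assert.h>
#include <math.h>

/* Returns 1 when the "then" branch of `if (a == a)` is taken, 0 otherwise.
   Hypothetical helper, named only for this illustration. */
int takes_then_branch(double a) {
    if (a == a) {
        return 1;
    } else {
        return 0;   /* reachable when a is NaN */
    }
}

/* Usage sketch: both assertions hold under the assumptions stated above. */
void check_nan_example(void) {
    assert(takes_then_branch(1.5) == 1);
    assert(takes_then_branch(NAN) == 0);
}
```

So the "unreachable" else branch is reachable for at least one ordinary value of a built-in type.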

Jtsummers · on April 16, 2014

> but he's disingenuous in saying that random line duplication can't cause catastrophic problems in Eiffel.

I think you're referring to this part of the article:

  With such a modern language design, the Apple bug could not
  have arisen. A duplicated line is either:
    - A keyword such as end, immediately caught as a syntax
      error.
    - An actual instruction such as an assignment, whose
      duplication causes either no effect or an effect limited to
      the particular case covered by the branch, rather than
      catastrophically disrupting all cases, as in the Apple bug.

To be fair, he's not saying that it can't cause catastrophic problems in Eiffel. In fact, he's not strictly talking about Eiffel but about better-constructed languages. He is saying that while it could be catastrophic, it wouldn't be nearly as bad as what happened here, since it would be contained to one branch of execution rather than exposed to all branches of execution.

loumf · on April 16, 2014

I'm just saying that that distinction is no less catastrophic. It really depends on the line in question, but if a line has side-effects and isn't idempotent, it could be infinitely catastrophic to repeat it unless it causes a syntax error.
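To make the idempotence point concrete, here is a rough C sketch. The `account` struct and all function names are hypothetical, invented only to contrast a duplicated assignment with a duplicated update.

```c
#include <assert.h>

/* Hypothetical state, used only for illustration. */
struct account { int balance; int flagged; };

/* Idempotent line: running it twice leaves the same state as running it once. */
void flag_once(struct account *a)  { a->flagged = 1; }
void flag_twice(struct account *a) {
    a->flagged = 1;
    a->flagged = 1;         /* duplicated line, no visible effect */
}

/* Non-idempotent line: each repetition changes the state again. */
void withdraw_once(struct account *a, int amount)  { a->balance -= amount; }
void withdraw_twice(struct account *a, int amount) {
    a->balance -= amount;
    a->balance -= amount;   /* duplicated line, silently doubles the withdrawal */
}

/* Usage sketch: the duplicate only matters for the non-idempotent line. */
void check_duplication_example(void) {
    struct account a = {100, 0};
    struct account b = {100, 0};
    flag_once(&a);
    flag_twice(&b);
    assert(a.flagged == b.flagged);
    withdraw_once(&a, 30);
    withdraw_twice(&b, 30);
    assert(a.balance == 70);
    assert(b.balance == 40);
}
```

Both duplicated versions compile cleanly, so nothing in the syntax flags the second one.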

Even in the gotofail case, not ALL cases were affected, only the checks after the duplicated line. The duplicate would be more catastrophic higher in the function and less so lower down.

BTW, his code sample is NOT what the real bug was -- in the real bug it was more like this

     err = f1(cert); if (err) goto fail;
     err = f2(cert); if (err) goto fail;
     err = f3(cert); if (err)
       goto fail;
       goto fail;
     err = f4(cert); if (err) goto fail;

     fail:
     cleanup();
     return err; // if this returns 0, the cert is good

The issue is that we jumped to fail with err set to 0 (no error), but we skipped the f4 check. The bug is only a problem if f4 would return an error (not ALL cases).
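A compilable sketch of that shape, for anyone who wants to poke at it. The names `f1` to `f4` come from the snippet above, but their bodies are invented here: the first three always succeed and `f4` rejects any non-zero `cert`. The braced variant is only an assumption about how a block-structured style would contain the duplicate, not a claim about the actual fix.

```c
#include <assert.h>

/* Hypothetical checks: return 0 on success, non-zero on failure. */
static int f1(int cert) { (void)cert; return 0; }
static int f2(int cert) { (void)cert; return 0; }
static int f3(int cert) { (void)cert; return 0; }
static int f4(int cert) { return cert == 0 ? 0 : -1; }

/* Unbraced ifs: the duplicated goto is unconditional, so f4 never runs. */
int verify_buggy(int cert) {
    int err;
    err = f1(cert); if (err) goto fail;
    err = f2(cert); if (err) goto fail;
    err = f3(cert); if (err)
        goto fail;
        goto fail;          /* duplicated line, taken with err still 0 */
    err = f4(cert); if (err) goto fail;
fail:
    return err;             /* 0 means "the cert is good" */
}

/* Braced ifs: the same duplicate is dead code inside one branch. */
int verify_braced(int cert) {
    int err;
    err = f1(cert); if (err) { goto fail; }
    err = f2(cert); if (err) { goto fail; }
    err = f3(cert); if (err) {
        goto fail;
        goto fail;          /* duplicated line, never reached */
    }
    err = f4(cert); if (err) { goto fail; }
fail:
    return err;
}

/* Usage sketch: only a cert that f4 would reject exposes the bug. */
void check_gotofail_example(void) {
    assert(verify_buggy(0) == 0);
    assert(verify_braced(0) == 0);
    assert(verify_buggy(1) == 0);    /* bad cert wrongly accepted */
    assert(verify_braced(1) != 0);   /* bad cert rejected */
}
```

Recent GCC and Clang releases may warn about the misleading indentation in `verify_buggy`, which is roughly the kind of compiler-level flagging discussed elsewhere in this thread.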

jerf · on April 16, 2014

It seems to me the line of argument here boils down to "bad code can be bad". There's no language that can prevent code that is simply wrong from doing something wrong, not even the proof languages. Even "pure" code will happily generate the wrong value if you "map (+1) ." twice instead of once.

We should instead discuss the affordances the language has for correct and incorrect code. It is not that C is objectively "wrong" to permit an if statement to take an atomic statement, it is that it affords wrong behavior. And the reason I say it affords wrong behavior is no longer any theoretical argument in any direction, but direct, repeated, consistent practical experience from pretty much everybody who makes serious use of C... that is, it is the reality that trumps all theory.

Roboprog · on April 16, 2014

OK, he oversimplified it a little bit: the error checks had side effects, which determined the return value of the enclosing routine.

His point still stands, I think: the code didn't do what was obviously intended, and should have been flagged by the main compiler/interpreter/parser, rather than a supplemental "lint" type tool.