Making Wrong Code Look Wrong (2005) (joelonsoftware.com)
38 points by blackswan on Feb 8, 2009 | 16 comments



I like the ideas expressed, but I'd rather see them enforced by the language than by coding convention. For example, to simulate type checking you could append a _typename suffix to each of your variable names. Then

i_int += j_filehandle

would stick out as wrong. But luckily many languages support type checking so we don't have to do it in those languages!

I wonder whether it's possible to avoid mixing safe and unsafe strings (for example) using some new language feature (or a new use of an old one)?


You can do that in Haskell easily.

  data UnsafeString = Unsafe String

  htmlencode :: String -> String
  htmlencode = concatMap (\c -> if c == '<' then "&lt;" else [c])  -- minimal escaping, just for illustration

  sanitizeUnsafeStr :: UnsafeString -> String
  sanitizeUnsafeStr (Unsafe cs) = htmlencode cs
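With that in place, mixing the two becomes a compile-time error rather than something you have to spot by eye. A quick sketch (putHtml is a made-up output function, just for illustration):

  -- hypothetical output sink that only accepts sanitized text
  putHtml :: String -> IO ()
  putHtml = putStrLn

  -- putHtml userInput                      -- type error if userInput :: UnsafeString
  -- putHtml (sanitizeUnsafeStr userInput)  -- compiles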


You can do it in any OO language easily as well. For instance, in Python:

  class UnsafeString:
    def __init__(self, s):
      self._str = s
      self._sanitized = None
    def __str__(self):
      return self.sanitize()
    def unsafe(self):
      return self._str
    def sanitize(self):
      # sanitize() here is some module-level escaping function
      if self._sanitized is None:
        self._sanitized = sanitize(self._str)
      return self._sanitized

That way, as long as you wrap all input in the UnsafeString class, you'll have to be explicit if you want the unsafe version and you'll get the safe version by default.


Precisely. Isn't user input handled like this in most web code? It seems the most sensible way to do it.


To further expand on this, we would have a request function returning type IO UnsafeString, and all output functions would accept plain String. By making UnsafeString an abstract type we would make sure its users can't touch its internal representation (I don't yet know if this is possible in Haskell), so the only way to use input as a String would be to ask a private module for a conversion, which would sanitize any dangerous input. Then you only have to look at the input methods to be sure no unsafe input is getting through.
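(For what it's worth, it is possible: you export the type but not its constructor. A rough sketch, where getParam is a made-up stand-in for a real request function:)

  module SafeInput (UnsafeString, getParam, sanitize) where

  newtype UnsafeString = Unsafe String

  -- the Unsafe constructor is not exported, so code outside this
  -- module can only turn an UnsafeString into a String via sanitize
  getParam :: String -> IO UnsafeString
  getParam _name = fmap Unsafe getLine  -- stand-in for reading the request

  sanitize :: UnsafeString -> String
  sanitize (Unsafe cs) = concatMap escape cs
    where escape '<' = "&lt;"
          escape '>' = "&gt;"
          escape '&' = "&amp;"
          escape c   = [c]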


I'm not sure your first example is Apps Hungarian: indeed, in the last segment he says that such an example is a bad variant that is of no real use.

I too initially thought "how great would it be to compile some of these prefixes into a language," but after reading through I think it would be difficult.

The intent is that the prefix describes what the variable is, not what type it is. And so the prefix will vary from project to project.

However, your example of specifying type is inherently possible, and many languages DO do it (not with the prefix - they just check types at compile time). The point Joel is making, I think, is that the Hungarian convention is a way for the programmer to vet the code as he browses it, identifying errors, logic clashes, etc. :)


And indeed a programmer can vet the code as he browses it, using a data type like the one I gave above, just by making sure all user input methods have type UnsafeString and all output methods take type String, with UnsafeString as an abstract data type.


Yes, but you're still stuck on type there... those are things that can easily fail (or be made to fail) at compile time or at runtime.

What about logic errors - which are what this is supposed to help fix?


Well, my first thought would be using macros to ensure you never have 'unsafe' strings anywhere.

But, for some inexplicable reason, Joel says "Don’t use macros to create your own personal programming language."

How is using macros to encapsulate common functionality any different than using functions to encapsulate common functionality? Both lead to less mental overhead, more code reuse, and code that's easier to parse.


Given the overall theme of this rant, I would assume that it's about keeping code close together and easy to understand.

Obviously, macros have almost nothing in common with functions. Macros are simply shorthand for code that will be literally placed at the spot they are called.

So rather than calling functions, with all argument expressions evaluated, and a call stack, and blah blah blah, you are just copy-pasting code without it actually ending up on the screen for a developer to see.


You really think that macros have almost nothing in common with functions? Granted, the semantics are a little different, but in my mind macros and functions are just two different tools that serve the same goal: allowing for code reuse.

(I do understand the difference, for what it's worth. I worked as a professional Common Lisp coder for a while.)


Well, I don't think he's talking about Common Lisp macros.


I assumed you did; I was honestly just trying to formulate my point. That, and I was talking about C++ macros, which is what Joel was referring to as well.


I don't quite buy his argument about why we need to keep things in an unsafe format for a while. If that credit card app can't handle HTML-encoded stuff, how would it handle evil JavaScript?


In general, I find that it's a bad idea to destroy information. Keep all input intact as long as possible, so if you ever do encounter a problem, you don't need to reconstruct the original data. You can always process data into the correct form at runtime.

Of course, there are performance implications to this, but those can be dealt with. Encoding strings right away for performance reasons is definitely a premature optimization.
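Reusing the UnsafeString sketch from upthread, that just means storing the raw value and deferring the encoding to the output path:

  -- keep the raw input around; encode only at the point of output
  renderComment :: UnsafeString -> String
  renderComment u = "<p>" ++ sanitize u ++ "</p>"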


Allowing a user to edit the string later is the most common reason I've encountered.

No one likes to see an input all cluttered with "&quot;", "&amp;", and the like.



