Hacker News new | past | comments | ask | show | jobs | submit login

For me, the strongest overlooked argument for strong static type systems (like Haskell's) is security. With them, you can make the compiler prove that your code is free of certain commonly exploited classes of security problems (e.g., XSS and other injections).

Without them, you're on your own. And, for almost all values of you, that means you are pretty much guaranteed to have serious holes in your software. It's not like you can test these holes away. If you don't realize that you need to escape string X into the language of string Y before combining X with Y, do you think you're going to know to write the test that checks for whether you failed to escape X?

EDIT: minor tweak for clarity.




This is a good use of types; one of my favorites, perhaps.

It's also one I implemented with relative ease in Python. The error is deferred, but other than that the implementation works equally well. Once again the question is whether up-front errors are worth static typing, and once again the answer is a preference.


But deferring the error means the implementation does not work equally well: it sacrifices safety. When a programmer tries to combine strings X and Y that are incompatible, it's because he misunderstands what X or Y actually represents. This misunderstanding is detected when X and Y are brought together, but that doesn't mean it can't also have corrupted upstream code that also uses X or Y, just not together. This upstream code never gets a chance to run if the downstream problem is detected at compile time. But, if you only catch the problem at run time, after the upstream code has already executed, that corrupted upstream code could have launched the missiles. Run time is too late.


Maybe Haskell is the exception but I find it hard to believe that a compiler can do that for you, IMO, the only way to have a decent shot at being safe from XSS is to sanitize your inputs with a whitelist.


The compiler alone cannot do this, but the compiler combined with judicious use of the type system can. This is really the big value-add of Haskell/ML-style type systems; you can fit so many useful constraints into the type system that you can enlist the compiler into proving that you haven't made various common mistakes. If you can formalize an application logic error into a type error, you can use the compiler to enforce a wide variety of application logic error checking at compile time.

A simple example: suppose that every web form input is mapped into an "UnsafeUserInput rawtype" (that is, an algebraic data type parametrized over the raw input type). By simply ensuring that your output code only works on "SanitizedUserInput" types, you enforce that at some point in the code you apply a sanitization function to any unsafe input. Now the compiler is working for you, checking that your overall data flow respects the constraints you've encoded into the type system. In this style of programming, the type system becomes a way of encoding your desired integrity constraints inline with the functions and data types rather than outside the application logic in test suites.


http://blog.moertel.com/articles/2006/10/18/a-type-based-sol... is a very good read on the subject. basically, the compiler can have different types of strings, depending on where they come from, and how they are created. it can then make sure that different types of strings are never mixed without an explicit conversion operation. it is up to the programmer, of course, to see to it that that conversion operation involves sanitising, but the compiler will make sure you don't do

    string a = get_input_from_user();
    string b = sql_lookup(database, a);
    insert_into_output_html(b);
requiring instead

    UserInputString a = get_input_from_user();
    SQLResultString b = sql_lookup(database, ConvertUserInputToSQLQueryInput(a))
    insert_into_output_html(ConvertSQLResultToHtml(b));


Yes, a compiler can do that for you. One example:

http://blog.moertel.com/articles/2006/10/18/a-type-based-sol...

(That was written over six years ago. Today, there are even better solutions that don't just apply static types to strings but to the language fragments within them.)


A compiler could check that all inputs that are possibly from a user are sanitized or explicitly marked as being unclean / clean.


It's important to realize that safety is not a property of a string but a property of the relationship between a string and a use context. Thus all "solutions" that rely upon marking strings as clean work for only one kind of context. If you want a general solution, one that works for all injection problems, you have to be able to encode the full relationship model into the type system.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: