
I think you're really missing the point.

Your goal is to write a whole programming language.

Calculating where each statement starts and ends in order to serve a good error message is misdirected effort. Just use a terminating character, and use the standard one: a semi-colon.




> Just use a terminating character, and use the standard one: a semi-colon.

Significant whitespace is popular as well. You can argue all day about it, but at the moment I think making a language "look like python" is the safer bet for a new language.


Significant whitespace is _also_ redundant, just along different metrics. Badly indented Python gets caught before execution the majority of the time, and in presentation, it surfaces to the programmer quickly.

His argument is essentially against something like Lisp or Forth where you have such a paucity of syntax that many erroneously written statements are syntactically valid and make it to runtime.


Lisp and Forth are the extreme case, which is especially bad, because you get no error message at all.

But the middle ground is languages that can detect a syntax error but can't figure out exactly what it is. You don't get an error message saying "You should have a semi-colon here, want me to insert one?" Instead you get some vague message like "didn't expect this keyword here" a few lines after the missing semi-colon.


It's debatable. I know two languages with significant whitespace, that's Python and Haskell. And Haskell's rules are a pain.

For the language I'm working on, I started with significant whitespace. Then I realized that it would be a pain for things like anonymous callbacks (the kind of thing JS code is littered with) and that it was distracting. When you're starting out, don't get hung up on the syntax; focus on the basics. What matters is what you are going to do with your AST. The lexer and parser are the least important parts, unless you are building a "syntactic skin" over a language (e.g., CoffeeScript). You can always change the syntax later.


There's also F#, CoffeeScript, Yaml, Sass/Stylus, Haml/Jade and several others, just among the popular languages.

And if we mean "significant whitespace" as in "newlines can act as statement and expression terminators" (since we were discussing semicolons as the alternative) we can also add a ton of other languages, like Ruby, JavaScript, Go, Scala, Visual Basic and Lua.


I don't think it's fair to compare markup languages to programming languages, so we're down to F# and Coffeescript. That's not much of a trend toward significant whitespaces.

I agree with newlines vs semicolons (though I've always been told to write JavaScript with semi-colons, and that's how I have encountered it in the wild).


> I don't think it's fair to compare markup languages to programming languages, so we're down to F# and Coffeescript. That's not much of a trend toward significant whitespaces.

I take your point about Haml, though Sass is actually Turing-complete, so I almost feel like it belongs in the list despite being really off-the-wall.

And I was only choosing from reasonably popular languages (so things like Boo are out even though they'd help my numbers). There just aren't that many mainstream programming languages out there. Four in a category seems like a pretty fair number to me. You could just as easily say functional programming isn't a thing if four mainstream languages is considered a paltry showing.


I'm curious why you felt anonymous callbacks were problematic in a significant whitespace language. They seem to work just fine in CoffeeScript, as an example.


I'm not familiar with CoffeeScript. Maybe I should have a look. My thought was that getting the right level of indentation for something like:

    my_func_with_callbacks(arg1, def(x):
        foo()
        bar(),
        5)

would be a pain in terms of defining sensible rules and ensuring they are parsed correctly. So right now, I'm going for something closer to Ruby syntax. But it may just be a symptom of a lack of imagination. I'll remember to investigate CoffeeScript when I revisit the syntax.


Ah. I think the practice is to treat the closure as being ended by any dedent that is lower than its first line. This specific case does come up with JavaScript's setTimeout, which takes a callback as its first argument and the timeout as the second. It usually seems to look like this:

    setTimeout ->
        doSomething()
        doSomethingElse()
    , 1000

I'm not terribly fond of this, personally, but it does solve the problem. I think in a language without a legacy you would just tend to avoid making callback arguments come before non-callback arguments and things would look a lot nicer.
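
For what it's worth, the dedent rule is small enough to sketch. The Python below is only an illustration under some loud assumptions: indentation is spaces only, the closure opener sits on its own line, and `indent_of` and `closure_body` are hypothetical helper names, not anything taken from the CoffeeScript implementation.

```python
def indent_of(line: str) -> int:
    """Count leading spaces (assumption: tabs are never used for indentation)."""
    return len(line) - len(line.lstrip(" "))


def closure_body(lines: list[str], opener_index: int) -> tuple[list[str], int]:
    """Collect the body of the closure opened on lines[opener_index].

    The body ends at the first non-blank line indented less than the
    body's first line. Returns the stripped body lines and the index of
    the line that ended the body (len(lines) if the input ran out).
    """
    opener_indent = indent_of(lines[opener_index])
    body: list[str] = []
    body_indent = None
    i = opener_index + 1
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            i += 1  # blank lines never open or close a block
            continue
        indent = indent_of(line)
        if body_indent is None:
            if indent <= opener_indent:
                break  # nothing indented under the opener: empty body
            body_indent = indent  # the first body line fixes the block's indent
        elif indent < body_indent:
            break  # dedent below the first body line ends the closure
        body.append(line.strip())
        i += 1
    return body, i


source = [
    "setTimeout ->",
    "    doSomething()",
    "    doSomethingElse()",
    ", 1000",
]
body, next_index = closure_body(source, 0)
print(body)                # ['doSomething()', 'doSomethingElse()']
print(source[next_index])  # , 1000
```

The remaining arguments are then parsed starting from the line that caused the dedent, which is why the leading comma ends up at the start of that line.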


> It's debatable. I know two languages with significant whitespace, that's Python and Haskell. And Haskell's rules are a pain.

But it's optional in Haskell's case.


> And Haskell's rules are a pain.

How?


Somehow, I find myself caught doing an incorrect indent regularly. I appreciate Python's ':' marker, which unambiguously tells you "either write the next statement on the same line, or indent". I remember it being criticized, but I think it's actually really good.


> at the moment I think making a language "look like python" is the safer bet for a new language

Why Python? Clojure and Elixir are two new languages with growing communities and they look nothing like Python.


You could also say that newlines always terminate statements unless a continuation character is used. That's a trivial policy that works very well in practice.
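
A minimal sketch of that policy in Python, assuming a backslash as the continuation character and ignoring string literals and comments (a real lexer would have to handle both). The function name `split_statements` is made up for the example.

```python
def split_statements(source: str, continuation: str = "\\") -> list[str]:
    """Split source into statements: a newline ends a statement unless
    the line ends with the continuation character."""
    statements: list[str] = []
    pending: list[str] = []
    for line in source.splitlines():
        stripped = line.rstrip()
        if stripped.endswith(continuation):
            # Drop the continuation character and keep accumulating.
            pending.append(stripped[: -len(continuation)])
            continue
        pending.append(stripped)
        joined = " ".join(part.strip() for part in pending).strip()
        if joined:  # skip blank lines
            statements.append(joined)
        pending = []
    if pending:
        # The input ended while a statement was still being continued.
        raise SyntaxError("continuation character at end of input")
    return statements


print(split_statements("a = 1 + \\\n    2\nb = 3\n"))
# ['a = 1 + 2', 'b = 3']
```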


I don't know how you think that was the point. He doesn't say anything like that. In fact he takes the exact opposite position when dealing with parsing, saying to just write more code. Why would something which literally requires no extra effort (producing useful error messages without semicolons) be misdirected effort, while writing a more complex parser is cool?


I re-read the article and I concluded that we are both right. I direct you to the following statement:

"The principles are rarely orthogonal and frequently conflict."

Producing genuinely useful error messages at all requires a complex parser, IMHO. It's not computationally obvious to look at a line of code and say 'actually you just missed this one character'. Requiring a terminating character gives you at least a sanity check starting point to work with - it alone will catch whole categories of errors.
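
To make the "sanity check starting point" concrete, here is a rough Python sketch for a toy language whose only statement form is `name = number;`. The grammar, the `STATEMENT` pattern and the `check_statements` name are all assumptions made up for this illustration. Because the terminator marks the statement boundaries, each error is reported against the statement that contains it instead of cascading into later ones.

```python
import re

# Toy grammar (an assumption for this sketch): a statement is `name = number`.
STATEMENT = re.compile(r"\s*[A-Za-z_]\w*\s*=\s*\d+\s*")


def check_statements(source: str) -> list[str]:
    """Check each ';'-terminated statement independently and collect errors."""
    errors: list[str] = []
    chunks = source.split(";")
    # Every chunk except the last was followed by a terminator.
    for index, chunk in enumerate(chunks[:-1], start=1):
        if not STATEMENT.fullmatch(chunk):
            errors.append(f"statement {index}: cannot parse {chunk.strip()!r}")
    # Anything after the final ';' is a statement that was never terminated.
    if chunks[-1].strip():
        errors.append(
            f"statement {len(chunks)}: missing ';' after {chunks[-1].strip()!r}"
        )
    return errors


for message in check_statements("x = 1; y = ; z = 3; w = 4"):
    print(message)
# statement 2: cannot parse 'y ='
# statement 4: missing ';' after 'w = 4'
```

This is the crude version of resynchronizing on the terminator after an error. It says nothing about a missing semi-colon in the middle of the input, which is exactly the hard case discussed upthread.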



