Paul Graham's kind of dirty (On Unicode support as an exercise for the programmer) (plasmasturm.org)
17 points by toffer on Jan 30, 2008 | 14 comments



I think the author fails to realize that Arc is open source and free. You want Unicode support? Write it in. PG didn't say that it should be in a library, he just said that he didn't want to work on it. Which is a good reason to open source it.

I forget where I read it, but an article claiming the superiority of open source software said something like

Who is most likely to do a good job on Unicode support?

1. An international who needs it

2. A hacker who is interested in that sort of thing

3. A corporate worker who is told he has to support it

The answer, of course, was basically "anything but 3," and I had to agree. I think the same logic applies to this situation. Do you expect PG to do a good job on Unicode when he couldn't care less about it and feels like it's a waste of time? Of course not. The whole point of open sourcing it is to allow someone who would be interested in such an undertaking to go about doing it.

Also, when he said Arc isn't the kind of language for people who would be upset with lack of Unicode support, I mentally appended "in a development release of a beta language" to the end of that sentence. If PG isn't even guaranteeing a consistent language to build on, how can anyone be upset about lack of Unicode support? Clearly, people are missing the point of the beta release cycle and open source in general.


"You want Unicode support? Write it in."

Sorry but that's like selling someone a car and saying "Oh you want the chassis to be made of metal instead of wood? Well you can change that can't you."

I disagree. If you're building something from the start, you decide on a good character set, and use it. Namely Unicode. It's a lot easier to do that than to try and fix things afterwards.

Look at PHP. Dealing with Unicode in PHP is a major hassle because it wasn't designed from the ground up to support Unicode. It was added in later as a library. So now you have special functions for dealing with Unicode, and all the hassle and verbosity that entails in your code.
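
To make the "special functions" point concrete, here is a minimal sketch in Python rather than PHP (the sample string and variable names are made up for illustration). It shows the mismatch that byte-oriented string handling runs into: the number of UTF-8 bytes differs from the number of characters, and slicing at a byte offset can split a character in two.

```python
# Illustrative only: the sample string and names are assumptions,
# not taken from PHP or Arc.
text = "na\u00efve caf\u00e9"   # "naive cafe" with two accented letters, written as escapes

utf8_bytes = text.encode("utf-8")

char_count = len(text)          # counts code points
byte_count = len(utf8_bytes)    # counts bytes; each accented letter takes two here

# Cutting at a byte offset can land in the middle of a multi-byte character.
cut = utf8_bytes[:3]            # 'n', 'a', and only the first byte of the accented i
try:
    cut.decode("utf-8")
    split_is_valid = True
except UnicodeDecodeError:
    split_is_valid = False

print(char_count, byte_count, split_is_valid)
```

A language whose strings are Unicode from the start only needs the first kind of length; one that adds it later tends to end up carrying both, which is roughly where the extra functions come from.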

In Javascript or Java however, Unicode just works universally.

I'd say choosing a character set is one of the first things you should do as a language designer. That's why I was surprised by the lack of support for anything but ASCII, which just doesn't cut it any more for real-world applications.

Saying that adding Unicode is a couple of days' work for anyone interested sounds like a recipe for absolute disaster. See how PHP turned out.


"Sorry but that's like selling someone a car and saying 'Oh you want the chassis to be made of metal instead of wood? Well you can change that can't you.'"

Except pg isn't selling anything, he's just letting you see and use what's ostensibly made his life easier.

I'd modify your simile to something more accurate, but modifying similes is a black hole.


OK, he's giving us a car with a wooden chassis. Might be a great car, fun to drive. But it will probably need the chassis replacing unless you want to crash.


It's more like he gave you a scalpel, a tool that's great - even indispensable - for a specific set of cutting tasks, but that might leave you disappointed if you chose to use it for cutting down a tree.

Except that scalpels aren't really all that extensible.

See, I told you it was a black hole...


Ignoring the bad simile (which others have pointed out), adding Unicode support for someone who has done it in MzScheme before probably is no more than a couple days' work, if that. The changes can be done in the bowels of the implementation and then submitted as a patch, just like any other open source project. The PHP library problems don't have to occur, because it doesn't have to be in a library.

I agree with you that adding Unicode support on top of a language is a terrible idea. But when the language's authors have said that compatibility in future versions is not an issue, when they open source the language, and when one of them says,

"MzScheme, which the current version of Arc compiles to, has some more advanced plan for dealing with characters. But it would probably have taken me a couple days to figure out how to interact with it, and I don't want to spend even one day dealing with character sets."

to me, that is coming right out and saying, "Hey, I didn't feel like figuring out character sets, but I know MzScheme has a way of doing it. I'd love for you to do what you need to in order to get this working, because it doesn't interest me."

Also, if someone gave me a gift car with a wooden chassis, I would be quite happy with it. I'd drive it while it worked, and when it got to be enough of an annoyance, I'd take the car apart and rebuild it on a steel one. I certainly wouldn't bitch about it.


"Sorry but that's like selling someone a car and saying 'Oh you want the chassis to be made of metal instead of wood? Well you can change that can't you.'"

I agree. You should definitely take Arc back, demand a refund, and refuse to ever pay for it again.

"In Javascript or Java however, Unicode just works universally"

Yes, and what a great language Java is. You should probably keep using it.

I mean, thank god the Java 1.0 guys worried so much about character sets instead of, say, macros or closures or even regular expressions. Otherwise Java might be burdened with a lot of bolt-on special functions that add "hassle and verbosity" to the code.


"I think the author fails to realize that Arc is open source and free. You want Unicode support? Write it in."

I would like to announce the release of my new language, Barc, as in the dog that didn't. My language is free and open-source. You want syntax? Write it in! You want a compiler? Be my guest!

I think failing to include Unicode is a valid decision, but yours is a weak argument -- if it's that trivial, why isn't it already written?


Failing to include Unicode because you don't feel like doing it is a bad argument? So if I give you a gift certificate to go out to eat but fail to pay for your transportation, is that a bad idea, too?

You are the one with the weak argument. PG has offered up a language which is still being developed but already has some utility, and you're comparing that to a hypothetical language with nothing at all?


I agree with the author: Unicode seems like it should be right in the core of a language. But I'm incredibly sympathetic to not wanting to spend even a day on the issue.

I also am incredibly grateful for any open source released (I feel like it shifts political power to individual programmers vs. corporations).

All that being said, it sure would be nice for Paul to spend that day or two of effort, which would seemingly save thousands of community man-hours. But I can't really blame him if he doesn't want to.


What's with all this negativity... it's not like you paid for the code and he owes you something. It's like someone builds a new type of car and gives it to you, and you say it totally sucks because there isn't any paint on it.


I dunno, we're just expecting that this awesome new language he's promoting have basic functionality. Crazy, huh?


boring. don't read.


<3



