The main dealbreaker for me is 8 bytes (16 bytes on x64) per character in string...

jlouis · on May 19, 2013

We have the equivalent of Haskell's (lazy) ByteString in the types binary() and iolist(). We have had that for at least 6 years. Their space efficiency is also on par with Haskell's. If you process large amounts of data as a string(), then you are probably doing something wrong in your architecture.
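To put rough numbers on the space difference, here is a back-of-the-envelope sketch. It is written in Python purely for the arithmetic and is not Erlang code. It assumes a list string costs one cons cell (two machine words) per character, which matches the 8/16 bytes quoted above, and that a binary costs one byte per byte of payload. The small fixed per-binary header is deliberately ignored, and the function names are made up for illustration.

```python
def list_string_bytes(n_chars: int, word_bytes: int) -> int:
    """Memory for a list-based string: one cons cell (2 words) per character."""
    return n_chars * 2 * word_bytes


def binary_bytes(n_bytes: int) -> int:
    """Payload size of a binary; the fixed per-binary header is ignored here."""
    return n_bytes


if __name__ == "__main__":
    n = 1_000_000  # one million single-byte characters, e.g. an ASCII log file
    for label, word in (("32-bit", 4), ("64-bit", 8)):
        as_list = list_string_bytes(n, word)
        as_binary = binary_bytes(n)
        print(f"{label}: list {as_list} bytes, binary {as_binary} bytes, "
              f"ratio {as_list // as_binary}x")
```

Under those assumptions, a megabyte of single-byte text costs about 8 MB as a list on a 32-bit VM and about 16 MB on a 64-bit VM, versus roughly 1 MB as a binary.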

peerst · on May 19, 2013

Well, Erlang has had binaries for quite a while, and many people use them as high-performance strings. You can also use lists for strings, but why would you if you don't like the characteristics of lists?

ibotty · on May 19, 2013

do you need to do much "rich" string processing in erlang? it certainly is not designed for that use case.

you _can_ do quite a bit of string processing in erlang's bytestrings (note: not unicode-aware), see e.g. cowboy (which has nice-to-read source code).

as to why erlang does not have a haskell-like ByteString, i guess it's because haskell has very fast (but potentially unsafe) c-bindings. erlang's ports are too slow to use for something that fine-grained. i'm not sure why no BIFs were introduced to provide something similar, honestly.

atombender · on May 19, 2013

> it certainly is not designed for that use case.

That's a misleading statement, I think; it's like saying Python was not designed for web application development (which is true but also misleading).

It's more correct to say that Erlang was designed without much thought given to string processing. Erlang was designed as a fairly general-purpose language, just one with non-encoding-aware strings; the real problem is that it hasn't caught up, even though Unicode has existed since 1991 and has long since been incorporated into most languages and most software.

Instead of adapting, Erlang seems to be stagnating in one area that users are frequently complaining about. In this day and age, I would argue that string processing is quite important for the things that Erlang can, or should, be used for.

Anecdotally, when I first tried Erlang I tried to create a naive, parallel log processor which read lines and handed them off to a pool of workers. As a newbie I was quickly stumped because Erlang's file I/O is abysmally designed. I eventually gave up on the project, and later I found that Tim Bray, also an Erlang noob, had struggled with the exact same issue [1][2]. You would think that this is something Erlang would excel at, but apparently it's not.

I have been disappointed with Haskell's string support, too -- it's all over the place (String vs. Text types, the weird Data.Encoding module, horrible regexp library, etc.) -- but at least it's fully Unicode-aware and it has a fast ByteString type.

[1] https://www.tbray.org/ongoing/When/200x/2007/09/22/Erlang

[2] https://www.tbray.org/ongoing/When/200x/2007/09/21/Erlang

davidw · on May 19, 2013

I like Erlang a lot, and am currently working on a project that utilizes Chicago Boss, but there's a lot of truth in what you write.

Erlang is somewhat unusual for a language in that it was put to use in some very important, critical systems before it got big. Languages like Ruby, Tcl and Python slowly became popular and changed along the way. Doing an Erlang 2.0 that fixes some of the warts is probably not so easy...

atombender · on May 19, 2013

If you are interested in the history of Erlang, this is fascinating: https://docs.google.com/viewer?url=http%3A%2F%2Fwebcem01.cem...

atombender · on May 19, 2013

That's a good observation. I don't know much about Erlang's history beyond what is on Wikipedia, but I suspect that the number of people who used Erlang before it was open source was relatively small and limited to the technical community within Ericsson. That small, focused user group explains the language's weird and weirdly antiquated style and feature set, but it doesn't quite explain why the language has not evolved to become more modern. (Of course, the fact that something so ingenious could come out of such a small community is also very impressive and wonderful.)

jlouis · on May 19, 2013

Unicode is a big thing in R16 and R17, where more of the system will use the Unicode support we do have. Erlang modules will use UTF-8 encoding and so on.

As for a built-in bstring() type, it is being discussed, but there are other things more important to get into the language, maps for instance. Erlang is rather conservative in the speed with which we add stuff to the language.

Erlang has some of the fastest I/O in a runtime. But it is not easy to use correctly. The same goes for string processing. Erlang can be blindingly fast at that, but you must understand how to make it fast.

yosh · on May 20, 2013

Having recently run into some non-obvious crappy performance characteristics with I/O in Erlang, I'm curious whether you have any pointers to docs or code that show how to use Erlang I/O correctly so it's fast?

digitalzombie · on May 20, 2013

More important things...

Please, please have a record replacement already. There are a couple of proposals for record replacement.

http://www.cs.otago.ac.nz/staffpriv/ok/frames.pdf

lame · on May 19, 2013

You can use binaries. In Elixir, strings are by default binaries containing UTF-8-encoded code points, which are already nicely handled by binary matching and construction.
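For readers unfamiliar with the idea, here is a rough illustration of what "peel one UTF-8 code point off the front of a binary" means at the byte level. It is a Python sketch, not Elixir or Erlang, and `split_first_codepoint` is a hypothetical helper written for this example. It only shows that the lead byte determines how many bytes belong to the first code point and that the rest of the buffer is left untouched.

```python
from typing import Tuple


def split_first_codepoint(data: bytes) -> Tuple[int, bytes]:
    """Return (first code point, remaining bytes) of a UTF-8 byte string."""
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    # The high bits of the lead byte encode the length of the sequence.
    if lead < 0x80:
        width = 1
    elif lead >> 5 == 0b110:
        width = 2
    elif lead >> 4 == 0b1110:
        width = 3
    elif lead >> 3 == 0b11110:
        width = 4
    else:
        raise ValueError("invalid UTF-8 lead byte")
    # Strict decoding rejects truncated or malformed sequences.
    char = data[:width].decode("utf-8")
    return ord(char), data[width:]


if __name__ == "__main__":
    raw = "héllo".encode("utf-8")
    print(len(raw))  # byte length differs from the number of code points
    code_point, rest = split_first_codepoint(raw)
    print(hex(code_point), rest)
```

The point is that a byte-oriented buffer and Unicode awareness are not mutually exclusive: the storage stays compact, and code points are decoded only where the code actually needs them.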