
Yeah, but the beauty of this approach is that it always makes architectural sense to put a given library in a single stage, and in that stage only. That library is enclosed in an "actor", and it communicates with the other stages/threads via an application-specific message (not using data types from the library; you want the loose coupling).

If the library has non-reentrant code, then you can't have more than one thread for the stage -- you can't parallelize it. But since only that one thread ever touches the library, it generally won't lead to correctness problems.

curl is actually a great example, because it has an event loop for parallelism (rather than threads). So you would run a single thread for curl, because you wouldn't want more than one thread anyway.

Say you are writing a web crawler. From the network/curl stage, you just pass off pointers to blocks of memory to parsing threads. Parsing threads will be CPU bound so you will likely want to run instances of that loop in multiple threads, and you will be able to do it with no problem, since they don't depend on the curl library. They just take in blocks of memory and output some data structure to another queue.
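To make the crawler layout concrete, here is a minimal sketch in Python rather than C++. Everything in it is made up for illustration: `network_stage`, `parse_stage`, `run_pipeline`, and the `STOP` sentinel are hypothetical names, the "download" is just a list of pre-made byte strings standing in for curl, and "parsing" is a crude count of `<a ` tags. Note that CPython threads won't actually run CPU-bound parsing in parallel, so this only shows the structure: one thread owns the network library, and the parsers only ever see blocks of bytes and queues.

```python
import queue
import threading

STOP = object()  # sentinel telling a stage to shut down


def network_stage(pages, out_q, n_parsers):
    """Single-threaded stage; the only place a fetch library would live."""
    for url, body in pages:       # stand-in for real downloads
        out_q.put((url, body))    # hands over a reference, not a copy
    for _ in range(n_parsers):    # one STOP per parser thread
        out_q.put(STOP)


def parse_stage(in_q, out_q):
    """Parser loop: takes blocks of bytes, emits a plain data structure."""
    while True:
        item = in_q.get()
        if item is STOP:
            out_q.put(STOP)
            return
        url, body = item
        out_q.put({"url": url, "links": body.count(b"<a ")})


def run_pipeline(pages, n_parsers=3):
    raw_q, parsed_q = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=network_stage,
                                args=(pages, raw_q, n_parsers))]
    threads += [threading.Thread(target=parse_stage, args=(raw_q, parsed_q))
                for _ in range(n_parsers)]
    for t in threads:
        t.start()
    results, stopped = [], 0
    while stopped < n_parsers:    # drain until every parser has signed off
        item = parsed_q.get()
        if item is STOP:
            stopped += 1
        else:
            results.append(item)
    for t in threads:
        t.join()
    return sorted(results, key=lambda r: r["url"])
```

Calling `run_pipeline([("a", b"<a href=x>")])` returns a list with one dict per page. The messages between stages are plain tuples and dicts, not types from the fetch library, which is the loose coupling mentioned above.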

This is also a good way to compose, say, the curl event loop with event loops from other libraries (GUI libraries, perhaps). Hence the relation to SEDA (http://en.wikipedia.org/wiki/Staged_event-driven_architectur...).




I read your discussion and really get all the points, and still I just think "don't make me think." The thing with Erlang is, I don't have to care about threads, pointers, libraries, re-entrant code, thread safety etc. I can just write a couple of actors, who do one thing each (in parallel, because Erlang) and they can talk to each other, share some data if they want. The only thing I need to keep in mind is, who has what data (if I need it).

I can hook up to the Erlang runtime (in local, staging, production, wherever) and talk to these actors. I can ask them "hey, what's your state now," "who are you talking to?" Sure, Erlang is slower in some cases, but for the value you get, I think it is priceless in many cases.


Sure, I wasn't saying not to use Erlang. Just saying that it is very possible to use threads and pointers in a sane (and Erlang-style) way.

I am a big fan of interpreted languages and use them as my default. The situation I was describing was basically the only time I've ever rewritten in C++ for speed! It just doesn't happen that often.

But I do wish the interpreted languages like Python, node.js, and Erlang all had better and more consistent C APIs (more like Lua's). C itself is not that hard, but the C APIs definitely put people off.

Actually I wonder for Erlang, with the web crawler example, would you have to copy entire web pages if you wanted to pass them off from a "network actor" to a "parsing actor"? The threads + pointers solution easily avoids that, while retaining modularity.


Yes, Erlang has a "share nothing" concurrency model. You would need to send the data over to another process. There is one exception, which would come into play here, and that is binary data. Normal Erlang data ("Erlang terms") such as lists, tuples, strings, integers etc. are always copied, but binary data (raw binary data, but often representing string data) is just referenced. If you download a web page, you'd save the whole content as a binary and send it over to the other actor. From your code's point of view it would look just like a copy, because sending a binary is no different from sending any other data type. Under the hood, Erlang would just pass the reference and handle everything for you (reference counting, garbage collection etc.).

Also, when you split, or reference sub binaries, these become just pointers into the original binary data.
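As a rough analogy only (this is Python, not Erlang, and says nothing about how the Erlang VM implements binaries), a `memoryview` slice behaves like the sub binary idea: it is a window onto the original buffer rather than a copy. The helper name `sub_view` is hypothetical.

```python
def sub_view(buf, start, end):
    """Return a zero-copy window onto buf[start:end]."""
    return memoryview(buf)[start:end]


page = bytearray(b"<html><title>Hi</title></html>")
title = sub_view(page, 13, 15)   # covers the two bytes of "Hi"

# The view still points at the original buffer, so an in-place
# change to the page is visible through the view.
page[13] = ord("Y")
still_shared = title.obj is page
```

The difference from Erlang is that Erlang binaries are immutable, so you never observe the sharing; here the mutation is only used to show that no copy was made.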


It wouldn't make sense to have a network actor and a parsing actor. Because these two tasks (downloading the data and parsing it) are not concurrent, you should instead have a crawler actor that downloads -> parses -> stores.
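For comparison, a minimal Python sketch of that one-actor-per-page shape, where each worker runs download -> parse -> store in sequence and the concurrency is across pages. All names are hypothetical (`Store`, `crawl_one`, `crawl_all`), `fetch` is whatever download function you pass in, and the "parse" step is again just a count of `<a ` tags.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Store:
    """Tiny in-memory store; the lock makes saves safe across workers."""

    def __init__(self):
        self._lock = threading.Lock()
        self.records = {}

    def save(self, url, record):
        with self._lock:
            self.records[url] = record


def crawl_one(url, fetch, store):
    """One crawler 'actor': download, then parse, then store."""
    body = fetch(url)                                    # download
    record = {"size": len(body),
              "links": body.count(b"<a ")}               # parse
    store.save(url, record)                              # store


def crawl_all(urls, fetch, workers=4):
    store = Store()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception
        list(pool.map(lambda u: crawl_one(u, fetch, store), urls))
    return store.records
```

With a fake downloader such as `{"a": b"<a href=x>"}.__getitem__`, `crawl_all(["a"], fetch)` returns a dict keyed by URL. The trade-off versus the staged version is that the page never crosses a thread boundary, but the download library is now called from several threads at once.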



