Why we used Pony to write Wallaroo (wallaroolabs.com)
162 points by spooneybarger on Oct 26, 2017 | 84 comments



    > The standard JVM garbage collection strategy is “stop the world.”
This is quite an oversimplification of what is a fundamental property to think about when designing a performance-oriented system.

The default collector in HotSpot does, I think, stop the world when collecting. But it also does multiple small collections between larger major collections. It, by default, optimizes for throughput over latency since most applications care more about overall throughput than low latency.

Even with the default, you can tune the maximum latency if you want to sacrifice throughput:

https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gc...

Not only that, but HotSpot also features a true concurrent collector, which is a command-line flag away.


The JVM concurrent GC has failure conditions. It does as much as possible concurrently but still occasionally has to stop the world.


This.

I specifically came here to write something very similar. Thank you. Call me a JVM fanboy, but when I see such comments, I tend to believe the author didn't want to spend time genuinely understanding the JVM and doing thorough testing.

I don't want to dismiss per-actor heap memory in Pony. That's pretty slick. Pony has been on my radar for a while because it does a lot of things well.


Depends on the sizing, I believe. One problem I see in Java-based databases like Cassandra and Elasticsearch is the JVM being busy doing garbage collection. The major collection kicks in all the time. Probably because of bad config and bad data usage patterns, but it is still a common problem for me. I am all ears for advice.


This was an anti-pattern of the runtime model that Java inherited from Smalltalk. Tuning the GC used to be an arcane art. Programming shops also put a lot of effort into just reducing GC pressure as an optimization. This is why I appreciate Golang's pragmatic approach of not using the GC by avoiding the heap. The main benefit of having a GC is to make initial development faster by reducing the severity and frequency of mistakes. It's really there as a safety net, not as a foolproof end-all be-all. Use tools in a way which plays up their strengths.

It's hard to tune a GC to be super-fast. It's easy to profile and find your biggest memory leaks.


Yeah, there were plenty of GC languages with value types when Java came about. I was a bit disappointed it only had them for primitive types.

Now they are trying to fit them in without breaking backwards compatibility.


How much free space is there after a collection? The garbage collector benefits a lot from having enough extra free space to keep things with medium lifetimes in the first generation.


The issue I observed was that both generations filled up very quickly, so the minor collection had no chance to complete and the JVM skipped straight to a major one. Because the generations were consistently full, the JVM became "locked" doing full GCs.


From Pony documentation:

Simplicity

Simplicity can be sacrificed for performance. It is more important for the interface to be simple than the implementation. The faster the programmer can get stuff done, the better. It’s ok to make things a bit harder on the programmer to improve performance, but it’s more important to make things easier on the programmer than it is to make things easier on the language/runtime.

This is an excellent design decision. Yes, performance sometimes introduces complexity, and I have been in teams where the philosophy is: "It does not matter if it is slower, but any high schooler that knows HTML has to be able to understand it". The purpose of a program is to satisfy the users, not the developers or the managers, and if the next developer needs to study a thing or two before understanding the code that is OK.

Simplicity should be sacrificed for performance.


The purpose of a program is to satisfy the users, not the developers or the managers, and if the next developer needs to study a thing or two before understanding the code that is OK.

Simplicity should be sacrificed for performance.

This is self-contradictory. Programmer resources need to be allocated to pleasing the users. To please users, you need a certain amount of performance. Performance is not simply an end in itself. Programmer resources can be used to please users in other ways.


I was keeping it PG-13


HR is happy to know!


    Simplicity should be sacrificed for performance.
It doesn’t always have to be that way though. Several important classes of optimization lend themselves to simplifying code.

For example, code that effectively makes the same decision three times is both slow and obscures the intent of the code.

And there are ways to compartmentalize optimizations so that people working in the general vicinity don’t have to bother with the ‘clever’ code on a daily basis.


I don't see how it relates.

The idea is if I have to choose between simplicity and performance, I shall choose performance.

Your example does not make sense because you are not choosing simplicity over performance there: the code was already slow.

Compartmentalizing the code to hide the complexity behind a black box does not apply either, because you are not making complex code simpler. Whoever has to maintain your black box is still exposed to the "clever" code.


By compartmentalizing I mean in the realm of Single Responsibility at the level of a function.

A block that scans a table for a match can be moved up and out and then replaced at your leisure with a version that is more sophisticated and faster, and the people using it only have to look at it if there’s a bug. If they don’t have to look at it, then they don’t have to grumble about how it took 10 lines of comments to document four lines of code because you used some uncommonly known property of Logic or Set Theory to eliminate a bunch of duplicate work.

And worst case they can revert the changes until they sort it out, because it’s just one function with the same interface. (Also, they won’t fight you as much about the initial change because it’s easy to revert and self-contained.)
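To make the table-scan point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Row` shape, the function names, and the assumption that the table is sorted by key are invented for illustration and do not come from any real codebase. The simple version and the "clever" version share one signature, so callers only ever see `find_value`.

```python
from typing import Optional, Sequence, Tuple

Row = Tuple[str, int]  # hypothetical (key, value) row, for illustration only


def find_value_scan(table: Sequence[Row], key: str) -> Optional[int]:
    """Obvious version: walk the table until the key matches."""
    for row_key, value in table:
        if row_key == key:
            return value
    return None


def find_value_bisect(table: Sequence[Row], key: str) -> Optional[int]:
    """Faster version with the same interface.

    Assumes (this is the 'clever' part) that the table is sorted by key,
    which allows a binary search instead of a linear scan.
    """
    lo, hi = 0, len(table)
    while lo < hi:
        mid = (lo + hi) // 2
        if table[mid][0] < key:
            lo = mid + 1
        else:
            hi = mid
    # lo is now the leftmost position where key could be, so duplicates
    # resolve to the first occurrence, just like the linear scan.
    if lo < len(table) and table[lo][0] == key:
        return table[lo][1]
    return None


# Callers depend only on this name. Reverting the optimization is a
# one-line change back to find_value_scan.
find_value = find_value_bisect
```

Because both functions take the same arguments and return the same results on a sorted table, swapping one for the other touches a single line, which is what makes the change easy to revert.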


I have been looking over to this language every now and then for a long while now, and this article finally motivated me to actually learn it.

But it seems the documentation is a bit out of date - I cannot even get the "Hello, world"-program to build correctly, neither on Debian Stretch nor on openSUSE Tumbleweed.

It is quite frustrating if you try to learn a new language, and when you make your first step, you step on a nail.

And it is even more frustrating, because Pony looks very interesting, for the reasons the linked article explains. ):

UPDATE: On Debian, telling the pony compiler to --pic does the trick.

UPDATE2: On suse, installing binutils-gold solved the problem.

UPDATE3: Now we're talking! This language is very, very interesting, indeed!


Sorry to hear that. As far as I know the documentation should be up to date, but it's entirely possible you've found an issue.

If you like you can hit up the mailing list (https://groups.io/g/wallaroo) or the IRC channel (#ponylang on freenode, https://webchat.oftc.net/?channels=wallaroo) to see if anybody can help, or if you think you've found a bug you can file an issue on GitHub (https://github.com/ponylang/ponyc/issues).

UPDATE: I'm glad to hear you got it working!


> I'm glad to hear you got it working!

Me, too! This looks pretty exciting!


I'm often lurking on IRC if you need help.


Author here. Small bit of background. I'm VP of Engineering at Wallaroo Labs and a member of the Pony core team (many folks at Wallaroo Labs are now actively involved in the Pony community).

Happy to answer any questions here or if you prefer, via email:

sean@wallaroolabs.com


Have been impressed with Pony the few times I've seen it on HN, thanks for your work on Pony and Wallaroo.

I'm guessing that tooling for Pony still needs some refinement, but as things are now what would you recommend to someone new to Pony (e.g. editor, package manager, build tools, debugging tools)?

Oh and just out of interest, did you evaluate Rust as a language for Wallaroo? I do think Pony has the potential to be stronger in certain areas, but it seems like there's a certain amount of overlap in terms of use cases.


I'm an engineer at Wallaroo Labs and I've been writing Pony for the last two years. We're using pony_stable (which is the primary package manager at the moment) and make for managing builds. I currently use Sublime 3 as my editor and lldb for debugging. I know that others are using emacs and vim as well.

We did consider Rust as there are certainly overlaps in their approaches to safety (though coming from different angles). In the end we thought that Pony would be better for managing concurrency for our particular use cases, which we thought mapped well onto the actor model.


> we thought mapped well onto the actor model.

Out of curiosity, were there a widely accepted actor lib/approach in Rust (I know there are a few like RobotS and others), would that have affected your decision?


It might have. When we were looking at Rust, it was a topic of conversation.


In general, it's not a space where there's a hyper-mature library, but there is some cool stuff happening today: http://cityboundsim.com/devblog/my-full-rustfest-talk-with-n...

That said, I mean, if you want to go full actors, Pony is a great choice.


Oh wow, I used to follow Citybound development a while ago when it was still written in JS. Anselm is an amazingly talented developer and it's great to see him working with Rust!


I'm also an engineer at Wallaroo Labs. A while back I wrote up some information for people who were interested in Pony:

https://gist.github.com/aturley/49b60c98306d90ffc2f981515827...

I've been using Emacs as an editor. We use a combination of make and pony-stable to build projects. For package management we use pony-stable. LLDB is your best bet for debugging, and if you follow the link above there's a link to the pony-lldb project, which is an LLDB extension that makes it easier to work with a few Pony datatypes.


Why did you choose Pony over something like Erlang or Elixir?


I want to preface this with: I love Erlang. We use Elixir for the service that powers metrics display for Wallaroo. When I learned Erlang several years ago, it made me a better programmer.

At the time we made the decision, we were worried about Erlang being able to meet the latency and throughput goals we had. I knew a number of people who worked on Riak at Basho and had a few lengthy discussions about Erlang performance. They felt that we could end up struggling to get the performance we were looking for.

I had never used Erlang for any large scale project and deferred to their wisdom and knowledge.

As it turns out, the approach that we've used to support languages like Python is in process, using C to bind Wallaroo with the other language. This probably would have been more difficult with Erlang as well, but that's hindsight.

Erlang is an awesome language. It has an amazing VM. That we didn't think it was right for us should in no way discourage anyone else from using it.


Is your application CPU bound? Because that's generally where Erlang's performance is lacking. I'm pretty sure you wouldn't run into performance issues in Erlang if you're doing an IO bound task


Wallaroo is a framework for end users writing applications. It is not an application itself. Many of the use cases that people come to us with are very CPU intensive.


Thanks, that makes sense


Just so you know, some of the earliest videos you can find about Pony are at Erlang conferences, probably because it is cited as a big influence (actor language). The two communities seem to be getting along very well.


How do you ingest data with Wallaroo? I see a mention of "...say TCP" and in the github word count there is mention of a (data)source framed-message-protocol but nothing that describes it in the (https://github.com/WallarooLabs/wallaroo/blob/0.2.0/book/cor...) link.

Dropping Python into the mix for a high throughput processing pipeline seems counterproductive. Why isn't there a tutorial in a more strict language like Go or C++, since someone pursuing this kind of high-throughput framework that can't benefit from wider parallelism, will 100% want that performance guarantee?


Ingestion:

Kafka and framed TCP are the two ingestion sources that Wallaroo currently ships with.

Python:

We added a Python API because there was a lot of interest in it. There's more to Wallaroo than just performance and folks were interested in having it available via Python. See https://vimeo.com/234753585 for some more information on the "scale independent" nature of the Wallaroo APIs.

Go:

We're working on that right now actually. Planning to release later this year.

C++:

We had a C++ API (still do), but we aren't currently supporting it. There was limited interest at the time. If folks show interest we would start supporting it again.


I've worked with Scala/Akka, and while I like the actor model, some things turn me off about the stack, mostly the language.

After reading the Pony guiding principles, I was delighted. This looks like an implementation where the ecosystem complexities don’t get in my way of "getting things done"!


May I ask what turned you off about Scala?


It's the JVM's C++. Because it is both OO and functional, it has too many concepts, and some have a poorly thought out balance of mental load against usefulness. Implicits are just the most notorious offender. For me it was a huge productivity killer, because proficiency takes a long time and the tools are poor (IDE support, compile times & build tools).

I like my languages opinionated, as in "there should be one way to do it". Python, C, Go, Clojure, maybe Rust. Designed to be a tool for their inventors' set of problems, not a vehicle to implement research papers. Fast turnaround/feedback cycles.


> Because it is both OO and functional

Hmm, I actually see that as a strength. Need a strong type system, pure FP, and immutable state throughout? Easy. Need to build an OOP library that will be mostly used by Java devs? Easy as well.

Implicits are handled pretty well by Intellij IDEA afaik.

Of course, I haven't had experience using Scala in a large codebase, so I may be on the wrong track!


If it works for you, great! You have a very powerful tool at your disposal. There are many good reasons to like Scala, it is entirely possible that my experience would have been different in another team setup.


Thanks for the writeup, Sean. I've been interested in Pony since I read about it here on HN a while back.

I have a question on this quote

> The standard JVM garbage collection strategy is “stop the world.” That is, when the JVM needs to reclaim unused memory, it needs to pause all other processing so it can safely garbage collect. These pauses are sometimes measured in seconds. That is going to destroy your tail latencies.

That's why applications rarely use the default serial collector.

Does your comparison still hold against the CMS and G1 collectors, which do a much better job of eliminating "stop the world" pauses?


Thanks.

At my previous job, we used the G1 collector and still had issues with "stop the world" type pauses. G1 does a best effort, but I've had applications that still experience rather long pauses (sometimes measured in seconds, usually hundreds of microseconds).

I haven't used either CMS or G1 heavily since I joined Wallaroo Labs a couple years back so, I don't have first hand experience with either recently.

Azul's Zing JVM does a really nice job of concurrent collection and if you can afford it, is a great way to improve the performance of clustered JVM applications.


What benefits does Pony have when compared to Erlang?


From the Pony side:

Biggest strength would be performance. Biggest weakness would be maturity. I think almost every pro/con I can think of can fall into those 2 buckets right now.

I'm a big fan of type systems so Pony having one is a big win for me.

The malleability of Pony and its immaturity helped us in some ways. We were able to treat it as a runtime that we could help mold to fit our needs. That wouldn't have been possible with Erlang.


> Biggest weakness would be maturity.

Has there been any progress towards a preemptive scheduler? Otherwise, I'd think that's the most obvious weakness.


There are pros and cons to both preemptive and cooperative scheduling policies. I wouldn't be comfortable labelling either as a weakness. It really depends on context.

I've heard rumor of someone working on a preemptive scheduler. The current cooperative one is about to have runtime backpressure added to it which is going to be a really nice scheduling win.

https://github.com/ponylang/ponyc/pull/2264


For the kinds of systems that are built with Erlang, which is a niche that Pony seems to want to occupy, I would argue it's inherently better to have preemptive scheduling. A cooperative scheduler will always be an open invitation for bugs relating to CPU hogging. Not having to worry about these things is priceless.


Are Pony's classes akin to Erlang modules or does it take a more enterprise-y perspective on OOP?


Howdy. Erlang's modules are involved in code namespacing for the compiler & runtime (i.e., the 'M' of the 'MFA' triple of Module + Function name + Arity that names a function) as well as being the unit/scope/bound for hot code loading (i.e., you must load or unload an entire module at a time).

I've worked in Erlang-land far longer than I've lived in any OOP-land, so I'm not sure what you mean by enterprise'y OOP. Coincidentally, Pony's rules for "Packages" are something I just smacked my ignorant head against a few hours ago. The subsections of https://tutorial.ponylang.org/packages/ in the tutorial can probably answer at least some of your question: specifically "Package System" and "Use Statement".


I'll take the liberty to speculate and assume your parent comment was asking if Pony classes have the same features as Java classes -- namely class-level attributes, instance-level attributes, different access levels to attributes and methods. Those things.

So, does it?


Pony classes have methods. Pony classes have fields.

Fields and methods can be public or private. Private is akin to Java's package private.

There are no instance variables.


Thank you. And structs / records? I mean, if there are no instance fields, there has to be some kind of constructs that emulate them.


Sorry, I misspoke. There are no class fields. Only instance fields.


I'm not sure which features of Erlang modules you're interested in comparing to Pony classes, but I'll take a stab at trying to answer.

Pony classes are defined using a "class" keyword. Classes have properties and functions, where functions are like methods that you would find in a language like Java. Pony supports structural subtyping via interfaces and nominal subtyping via traits, and it disallows multiple inheritance.

I'm not sure how much that helps, but if you have more specific questions I'd be happy to try to answer them. I'm a little rusty on my Erlang, but hopefully it will come back to me.


Just wanted to say, thanks for the excellent article and the answers to questions here!


Thank you!


I used Pony and bindings to opengl to write the (very) start of a graphics engine. http://www.charlesetc.com/stars-game-7.html

It was surprisingly easy to learn - I found the capabilities system rather intuitive compared to other methods for safely managing data races. Also the syntax is really simple and easy to read!


JVM GC is an interesting topic since recently there was a proposal for the ZGC project in the OpenJDK mailing list. http://mail.openjdk.java.net/pipermail/announce/2017-October...

"ZGC has been designed with the following goals in mind:

- Handle multi-terabyte heaps

- GC pause times not exceeding 10ms

- No more than 15% application throughput reduction compared to using G1"


My understanding (which might be out of date now) is that G1 still doesn't perform as well as CMS, so I'd be more interested in a throughput comparison to that!


I do enjoy reading these "Why we used <x> to build <y>" posts, as they're usually insightful and provide a glimpse into languages or frameworks I'm not familiar with.

On the flip side, I can't help but wonder how likely they would have been to use Pony had none of their core team members been active in the Pony community.


Hope you enjoy this one as well.

We weren't active in the Pony community until we decided to use it for Wallaroo. We felt it was very important to invest in the improvement of Pony. And to do that, we needed to be part of the community.

Pony became our runtime instead of writing one ourselves. That doesn't mean we don't have to work on our runtime. It means we are sharing the burden with others.


Oh wow, well I'm all the more impressed then. It's admirable to try out new tech and venture into the unknown. Kudos!


Interestingly enough, none of us had even written anything significant in Pony when we made the initial tentative decision. After we came to the conclusion that it might be a good fit, we started a test project and a couple of us tried ramping up as fast as we could. It was actually only after we decided it was the right choice that we started to get actively involved in the community.


Very cool, thanks for sharing!


Pony looks really interesting and I look forward to trying it out. I can't help but try to reason about how my work language would handle these problems. Go.

Highly concurrent: check.

Predictable latencies: I think that's a check, the Go GC has had a lot of engineering put into it to provide good pause times. There are some edge cases still.

Data safety: I think go is weakest here in terms of language design. Go does have go build -race which works pretty well for me, but I can see how some wouldn't consider it sufficient. It also wouldn't catch single threaded ownership problems, which can happen.

Easy way to interact with other languages: via CGO and buildmodes you can trivially call go from other languages. Some don't like that the runtime is still shipped and started, but the fact remains that you can cffi go functions from python very easily.

I'll have to check out pony this weekend. I've read up on it a bit but haven't compiled anything.


That's great! We're collecting stories about peoples' first impressions of Pony (https://www.ponylang.org/categories/my-first-pony), so if you're interested in contributing to that please get in touch.

I'm an engineer at Wallaroo Labs and I've been learning Go so that I can add a Go API to Wallaroo. Your analysis looks pretty spot-on based on my experience.

The buildmodes in Go definitely make it easy to call Go from other languages. The trickiest thing I've run into so far with calling Go is that you aren't supposed to hold on to Go pointers outside of Go code, so I'm having to jump through some hoops to hold on to objects between calls.
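One common way to avoid handing raw pointers across a language boundary is a handle table: the foreign side keeps only an opaque integer, and the owning runtime maps that integer back to the live object. The sketch below shows the idea in Python purely for illustration. The `HandleTable` class and its method names are hypothetical, and nothing in the thread says this is the approach actually used for the Wallaroo Go API.

```python
import itertools
import threading
from typing import Any, Dict


class HandleTable:
    """Maps opaque integer handles to objects owned by this runtime."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # Start at 1 so that 0 can serve as an "invalid handle" value.
        self._next_handle = itertools.count(1)
        self._objects: Dict[int, Any] = {}

    def register(self, obj: Any) -> int:
        """Store obj and return an integer the foreign side can keep."""
        with self._lock:
            handle = next(self._next_handle)
            self._objects[handle] = obj
            return handle

    def lookup(self, handle: int) -> Any:
        """Return the object for a handle; raises KeyError if unknown."""
        with self._lock:
            return self._objects[handle]

    def release(self, handle: int) -> None:
        """Drop the reference so the object can be garbage collected."""
        with self._lock:
            del self._objects[handle]
```

The table keeps each object reachable until `release` is called, so the caller on the other side of the boundary becomes responsible for releasing handles it no longer needs.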

I have an RFC for Pony that I need to finish up that will let you call Pony functions from C. I'm planning on getting around to that soon, as I think it will help improve our FFI story.


Really good post. Haven't heard much about Pony but seems like a really interesting language and like something I could use for a particular project.


Thanks.

If you ever want to talk Pony, you can find me lurking in the Pony IRC.


Never heard of Pony before. Sounds like a faster, type-safe Erlang? Sign me up, looks awesome!


Well, there are several major differences. Erlang actors are lightweight green-threaded continuations; Pony uses lockless threading from a thread pool, so you will use all your cores. Pony is much faster on CPU-bound work (on par with C++ with OpenMP); on IO it's about the same, as IO is mostly about avoiding waiting.

Pony uses much less memory, there's no beam overhead, the GC protocol is better, the object overhead is much smaller.

Pony allows zero-copying messages (call-by-ref vs call-by-value), which allows using fast shared memory threading models (besides copying values). All this is compile-time safe via the type-system.

The pony stdlib does not support blocking IO, which is probably a good decision, but you need more overhead adding the wait logic by yourself.

Erlang supports distributed actors, pony not yet. Sylvan and Sebastian are working on it, but it's not easy. See https://www.ponylang.org/media/papers/a_string_of_ponies.pdf

Erlang supports macros via elixir, pony not.

Pony's capability-based type system is far too advanced for a regular user. You need much longer to learn it and come up with compilable designs. This might be frustrating.


That's the general idea. But, Erlang has a wee bit of a head start.


Does Pony have preemptive scheduling? Does Pony have an equivalent of Distributed Erlang?


Pony does not do preemptive scheduling. Pony actors have behaviors, which can be thought of as methods that handle messages that are sent to the actor; these are the unit of scheduling, and a behavior runs from beginning to end without preemption. Multiple behaviors can run at the same time, but only one behavior per actor can be running at any time.
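The scheduling rule described here can be illustrated with a toy model. This is a single-threaded Python sketch, not Pony's runtime: the `Actor`, `Counter`, and `run` names are invented for illustration, and it does not model behaviors of different actors running in parallel on several threads. It only shows the two properties that matter for the preemption question: a behavior is the unit of scheduling, and once started it runs to completion.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class Actor:
    """Private state plus a mailbox of pending behavior invocations."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mailbox: Deque[Tuple[Callable[..., None], tuple]] = deque()

    def send(self, behavior: Callable[..., None], *args) -> None:
        """Asynchronous send: queue the behavior, do not run it now."""
        self.mailbox.append((behavior, args))


class Counter(Actor):
    """Example actor whose only behavior adds to a running total."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.total = 0

    def add(self, amount: int) -> None:
        self.total += amount


def run(actors: List[Actor], log: List[str]) -> None:
    """Round-robin cooperative scheduler.

    Each turn pops exactly one message per actor and runs that behavior
    from start to finish. Nothing interrupts it, so a slow behavior
    delays every other actor handled by this loop.
    """
    while any(actor.mailbox for actor in actors):
        for actor in actors:
            if actor.mailbox:
                behavior, args = actor.mailbox.popleft()
                behavior(*args)  # runs to completion, no preemption
                log.append(actor.name)
```

With two counters, sending `add(1)` and `add(2)` to the first and `add(10)` to the second, the loop interleaves whole behaviors (first actor, second actor, first actor) and never runs two behaviors of the same actor at once.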

There is currently no equivalent to Distributed Erlang, but that's something that the creator of the language has been looking into.


A bit more info. Sebastian Blessing did a thesis on Distributed Pony. You can read it here: https://www.ponylang.org/media/papers/a_string_of_ponies.pdf

At this time, there is no "in the wild" implementation.


Thanks. As an Erlang fan, I'm curious how types and networking are going to overlap.


Preemptive scheduling is one of the advantages of Erlang over other actor implementations like Akka. Please consider implementing it!


Admittedly I'm not familiar with Erlang's preemptive model, but I'll ask anyway :)

When using async based APIs with something like Akka, passing around futures and such, and leveraging multiple actors in thread pools, doesn't that get you to approximately the same place as Erlang?

Sure, the JVM isn't doing the preempting, but the OS is. Is the main downside per-thread resource consumption and context scheduling overhead?


I think there is a lot to be said for the simplicity of the erlang model. It's not just about stopping a thread from hogging the cpu. The OS doesn't have a good idea when to wake threads up, so they are effectively polling needlessly.

I do scala/akka in my dayjob and the amount of time we spend tweaking threadpools (execution contexts) and retesting just to get okay performance is insane. You wear a pretty big cost when you layer on abstractions like that.


As far as I understand, Erlang scheduling is more aware of actor model resource usage patterns and NIFs than a combination of a JVM (or another runtime) + OS. It makes it more reliable and performant in the end (in places where scheduling matters).


This is interesting - you wanted actors, you wanted a pleasant programming language and you wanted to be on the JVM.

I'm curious why you didn't choose Kotlin with Vert.X . Kotlin is a small town language that won only because it was loved by the community...Nobody "pushed" it. And it has won - it's now officially supported by Google.

Vertx is a superb actor model framework with first class Kotlin support. It's extremely high performance (https://www.techempower.com/benchmarks/#section=data-r8&hw=i...) and is pretty popular (http://vertx.io/whos_using/)

From a forward looking perspective, why Pony ... especially if its development has stalled?


There’s also quasar for actors. http://docs.paralleluniverse.co/quasar/


We decided for a number of reasons that the JVM wasn't the right fit for us, so we didn't end up considering Kotlin.

Pony is being actively developed, and has been as long as we've been using it. Since some of our team are now core contributors to the language, we've also invested in improving Pony, which has proven to be very useful to us.


For a moment I thought this was about the Pony ORM (Python) - looks like there are quite a few popular projects named Pony, perhaps we should use "Pony Language" instead of just "Pony" for the sake of disambiguation.


Are there any statistics that compare this to some similar existing software?



