
Java is great for a lot of things. That's why you see large distributed systems projects that power large companies being written in java. See presto at facebook or kafka at linkedin for example.

It's not as good at for loops though. If you need it for hpc you should still be using c++ underneath somewhere for simd instructions and other neat tricks only available in lower level languages. I look at java as a better python or: "frontend to c++ code"




Even with HPC (the Azul Systems folks would object to your claim), you just have to code in a foreign style of not allocating memory. Once you get that style in your tight loops, Java can run very very fast.
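For anyone who hasn't seen that style, here's a rough sketch of the difference, assuming a simple y += a * x kernel. The class and method names are made up for illustration and don't come from any library. The first method works in place on primitive arrays and allocates nothing inside the loop; the second is the more "normal" Java version that builds a new list and boxes every result.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; class and method names are hypothetical.
final class TightLoopSketch {

    // Allocation-free style: works in place on primitive arrays supplied by
    // the caller, so the loop body creates no objects for the GC to track.
    static void axpy(double a, double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < x.length; i++) {
            y[i] += a * x[i];
        }
    }

    // Allocating style, for contrast: builds a new list and boxes each
    // result into a Double, which can put pressure on the GC in hot loops.
    static List<Double> axpyBoxed(double a, List<Double> x, List<Double> y) {
        List<Double> out = new ArrayList<>(x.size());
        for (int i = 0; i < x.size(); i++) {
            out.add(a * x.get(i) + y.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {10.0, 20.0, 30.0};
        axpy(2.0, x, y); // y is updated in place, nothing is allocated here
        System.out.println(java.util.Arrays.toString(y));
    }
}
```

Whether the JIT actually vectorizes the first loop depends on the JVM and the hardware, so treat this as a picture of the coding style rather than a performance claim.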


I've debated with cliff click about this before actually. There was a consensus that the JVM can support certain kinds of operations for that stuff (eg: basic SIMD) with that coding style. It can't do everything though.

It's a lot better to go JNI for 99% of this stuff though.

I'd also love to know when java has first class support for gpus (you know where hpc actually runs?).

I'll continue relying on c++ for that stuff. It's also what nvidia supports first class.

If you have any first hand experience with this stuff I'd also love to hear your thoughts on this as well.

Better yet: Here's a mailing list where these people (including me) sit.

Go ahead and peruse it. I've learned a lot being on here.

I'm STILL of the opinion the JVM can't do most of these things well after having run the numbers: https://groups.google.com/forum/#!forum/mechanical-sympathy

For cpu I'll be sticking with openmp for the long term. Don't even get me started with cpu specific optimizations (eg: different intel generations)


Fair enough, I personally don't have first hand experience with the super tight loop stuff on the JVM (though that seems like it'd be fun!). The JVM has always been fast enough for MY needs, but I realize that's not true for everyone. Thanks for the link, that seems like a good mailing list.


Nah I mean this stuff IS hard. That's why I offered that mailing list. Like I said a lot of people from azul and a lot of the hardcore jvm heads sit on there.

If there's a way to make java faster for that, great. You are right that the jvm DOES support some of the tight loop stuff.

It's harder to do and leads to messy code though. In C++ you can write maintainable code and add pragmas. It's a lot cleaner and with javacpp (https://github.com/bytedeco/javacpp) we just generate the jni bindings.

We found that to be easier in practice. I'm admittedly not THAT familiar with the jvm internals.

I have just tried to get it to work and have done enough reading to know jni will just be cleaner for that stuff.

The JVM is certainly fast enough, just mainly for distributed systems code (eg: databases, message queues)

Honestly that's what makes it appealing for me. I can write fast production systems and tweak it where necessary for the HPC apps.

Java provides a fast and safe baseline for 90% of the stuff I'd want it to do. We tried for about a year to get it to work for lin alg and finally said screw it let's just do c++.


I completely disagree that Java is well suited for distributed systems.

Several such systems are written in Java solely because Java is very popular generally and is the lingua franca of "enterprise" software development, not because Java is particularly well suited to the distributed systems problem space intrinsically.

Also, Kafka is written in Scala, not Java. Though I don't think Scala is particularly any better suited to the problem either.


Tell that to facebook, twitter, linkedin, most big banks? The JVM (whether you like it or not) powers most of the bigger database systems in the world.

New code is still being written for the jvm. Lightbend and pivotal are also companies setting up large systems on the jvm. Seems to work fine for them. What are the alternatives? go?

Your point re: scala. Scala is STILL based on java. There isn't much of a difference for speed here. Akka and its ilk still rely on netty for the underlying transport mechanism (written in java). I'm very much aware of what's written in scala. That includes spark and kafka among other things, which is great. Those STILL rely on java libraries though.

Go is still slower yet: https://www.techempower.com/blog/2016/02/25/framework-benchm...

The other might be what? erlang? Good luck finding developers. The JVM is still the only platform that has not only big data mind share but things like microservices frameworks like lagom and spring boot while ALSO having things like message queues being written for it.


Overwhelmingly databases are not written in Java. Most of them are written in C and C++.

Secondly, I'm not denying that lots of systems are built on Java. But they're built on Java because those places already have super heavy investment in Java ecosystem libraries, tools, and engineering resources. Not because Java or the JVM is somehow intrinsically awesome for building distributed databases or distributed systems things.

In fact it's quite the contrary. It's an epic pain in the ass to keep heap growth and GC latency under control in most of these systems. The former being critical for node stability and the latter being critical for dead-node detection consistency (among other things).
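To make the dead-node detection point concrete, here's a minimal sketch of a timeout-based failure detector. Everything in it (class name, method names, the 1000 ms timeout) is hypothetical and not taken from any real system. Time is passed in explicitly so the example is deterministic. The point is that a long stop-the-world pause on a healthy node looks exactly like a crash to its peers, because no heartbeats go out while the node is paused.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the failure detector of any real system.
final class HeartbeatDetectorSketch {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatDetectorSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a heartbeat from nodeId arrives at time nowMillis.
    void recordHeartbeat(String nodeId, long nowMillis) {
        lastSeen.put(nodeId, nowMillis);
    }

    // A node is suspected dead if it has never been seen, or if its last
    // heartbeat is older than the timeout.
    boolean isSuspectedDead(String nodeId, long nowMillis) {
        Long last = lastSeen.get(nodeId);
        return last == null || nowMillis - last > timeoutMillis;
    }

    public static void main(String[] args) {
        HeartbeatDetectorSketch detector = new HeartbeatDetectorSketch(1000);
        detector.recordHeartbeat("node-a", 0);
        // node-a is healthy but hits a simulated 2.5 s GC pause and sends
        // nothing; its peers cannot tell that apart from a crash.
        System.out.println(detector.isSuspectedDead("node-a", 500));
        System.out.println(detector.isSuspectedDead("node-a", 3000));
    }
}
```

Raising the timeout hides the pauses but slows down detection of real failures, which is why GC latency and detection consistency end up tied together.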

I used to be a distributed database engineer, and I count among my friends many people who still work at Elastic.co, Confluent, MapR, Cloudera, Twitter, and Couchbase. I currently work for a big FinTech company building core distributed transaction processing infrastructure, and I am 100% certain that compelling the use of Java for certain systems development tasks has nothing to do with it as a piece of technology and 100% to do with politics.


A lot of fintech stuff (and google's stuff) is written in c++. MapR's hadoop distribution is also c++.

If you look at a lot of the nosql databases you could list off the top of your head including: presto, cassandra, and hbase, those are all java.

Newer ones like kudu as well as some of the older ones like mysql, postgres etc are def written in c.

I wouldn't say it's complete politics there. Maintainability is a big factor in writing systems code that lasts and that you can rely on. The JVM isn't great for everything, but there are teams at, say, twitter whose sole job is GC management and tuning. I know cloudera and co also have these people on staff. They (as well as us) know the JVM isn't ideal for every use case. Eg: I personally do a lot of GPU stuff. I wouldn't use java code for that (we use JNI/c++ for that).

If I had to argue for the jvm, I'd say with the right tuning it's reliable enough and you can also hire for it. There are trade-offs of security and reliability when you start thinking about OTHER parts of this besides latency and speed. It's harder to screw up java code than c code, and a lot of people are pretty familiar with the internals (eg: off heap, unsafe).

The flink project is a great example of this. They took java and wrote their own memory manager allowing them to keep the java integrations but not deal with the GC. Many jvm based distributed systems have started working around this stuff now.
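As a rough illustration of the off-heap idea (this is NOT Flink's actual memory manager or its API, just a minimal sketch with a made-up class name), the standard `java.nio.ByteBuffer.allocateDirect` call gives you a block of memory outside the Java heap. The GC only tracks the small wrapper object, not the data stored in the block.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch only; a hypothetical fixed-size array of longs whose
// storage lives outside the garbage-collected heap.
final class OffHeapLongArraySketch {
    private final ByteBuffer buffer;
    private final int length;

    OffHeapLongArraySketch(int length) {
        this.length = length;
        // allocateDirect reserves native memory, zero-initialized.
        this.buffer = ByteBuffer
                .allocateDirect(Math.multiplyExact(length, Long.BYTES))
                .order(ByteOrder.nativeOrder());
    }

    // Absolute put/get: index is converted to a byte offset, and no
    // objects are allocated per element.
    void set(int index, long value) {
        buffer.putLong(index * Long.BYTES, value);
    }

    long get(int index) {
        return buffer.getLong(index * Long.BYTES);
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < length; i++) {
            total += get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        OffHeapLongArraySketch values = new OffHeapLongArraySketch(4);
        values.set(0, 5L);
        values.set(3, 7L);
        System.out.println(values.sum());
    }
}
```

A real memory manager would also pool and reuse these blocks and serialize records into them; the sketch only shows where the bytes live.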


I really don't understand what you're arguing here. It's undisputed that several distributed systems are written in Java. What I'm disputing is the idea that Java is used to build these systems because Java is particularly well-suited to building distributed systems.

It isn't. I don't doubt that there are hordes of people whose entire responsibility it is to try to work around Java's rough, eclectic edges when it comes to systems engineering, but the fact that horde of people exists at all is one, among many, symptomatic indicators that Java itself is not intrinsically a particularly good fit for that problem space.

You've inverted the cause and effect.

Java is very popular, and Java is used in a lot of businesses, and some of those businesses have/had needs for different shapes of distributed systems, and those businesses have/had an easier time finding Java engineering resources, and those engineering resources built some distributed systems, and so now some distributed systems are built in Java or otherwise on the JVM.

There are even cases (Storm and Spark come to mind) where the selection of the JVM had seemingly as much to do with where those solutions were trying to position themselves in the larger ecosystem (augment Hadoop or eventually supplant Hadoop, respectively) as it did the technical merits of Java or the JVM itself.

Also, being able to hire for a thing is almost always a political issue. Like your prior comment, "The other might be what? erlang? Good luck finding developers." It's really easy to find Erlang developers. You just have to be willing to pay for them. Additionally, the proportion of Erlang developers who have a lot of experience building distributed, fault-tolerant systems is almost certainly higher than the proportion of Java engineers who can say the same. You don't even bother learning Erlang unless you're building distributed systems of some kind. Java is used for everything. That's just simple selection bias. The hard part about using Erlang is getting management to sign off on using it instead of Java or some Silicon Valley darling language du jour. Anyway, hiring considerations aren't an issue of technical design or implementation merit. They're political considerations.

If you subtract the set of open source distributed systems that were born inside companies with large pre-existing JVM investments already (introducing a different kind of selection-bias), like Hadoop (Yahoo) and Cassandra (Facebook) and Kafka (LinkedIn), and instead look at the ecosystem of distributed systems that were built up as standalone efforts outside any major corporate engineering org, like MapR, Riak, Aerospike, ScyllaDB, RabbitMQ, RethinkDB, etc. you immediately see a much different set of technology choices. Overwhelmingly these "outsider" distributed systems are written in Erlang, C/C++, and Go. The Java cohort is a limited minority in that space.



