135M messages a second between processes in Java (2013) (psy-lob-saw.blogspot.com)
262 points by matteuan on Aug 13, 2016 | 157 comments



The JVM is a treasure just sitting there waiting to be rediscovered.

It really is a shame that there is so much noise and unnecessary complexity around using it.


> The JVM is a treasure just sitting there waiting to be rediscovered.

It's the result of man-centuries (man-millennia?) of work, yes, and as a result is really, really impressive. But imagine if that much work had been put into something fundamentally great, like Smalltalk or Lisp, rather than something fundamentally okay like Java.

> It really is a shame that there is so much noise and unnecessary complexity around using it.

Very true. Java itself is not a terrible language (hence why I wrote 'fundamentally okay' above); it's the arcane, rococo, lunatic levels of ritual which surround it. Every time I look at Java code I'm reminded of some insane fantastic court in which one has to plead with the Minister of Small Affairs, make a sacrifice to the God of Bureaucracy, pay for a Token of Token-Paying, appeal to the good graces of the Wise Undersecretary of Vice-Small Affairs and finally spend a weekend climbing an Escher Staircase before an audience with the King, all just to get a scrap of toilet paper.


(Edited to clarify that I think Java was good in the very beginning, less so in the middle, and is good once again)

Java started off as the Golang of the 90s: small, simple, opinionated, statically-typed, in many ways state-of-the-art, and engineered to protect the programmer from doing bad things (either due to ignorance or will). It came with a batteries-included standard library.

Then, in a quirk of fate, it became very popular. Though its library provided clean abstractions, you had to chain several of them together to actually accomplish anything: this is why it was so verbose. Everyone was writing Java, but no one was sharing Java, yet. Before Apache (Jakarta) Commons, no one bothered to make wrappers for basic stuff... just like how we were writing bad JS before jQuery or underscore came out. We were caught between our own experimentation, Sun's desire to take over the desktop, and companies like IBM and Oracle that wanted to ensure they could sell commercial support for big monolithic Java appservers.

When the 'Design Patterns' book came out, someone should've made a framework; instead we sprinkled AbstractStrategyFactoryBuilders everywhere because that's what the enterprise libraries did, and it read like a bad parody of object-oriented programming. By now, we've learned from that mistake.

Bad APIs have fallen out of favor, Java EE is nearly abandoned, and Spring brought some sanity (while forcing us to ask why we're even thinking about convention over configuration), so Java is much, much nicer now. In the Maven, GitHub, CI world, we can -- environment and company policies permitting -- easily pull in libraries and frameworks that let us actually get stuff done at the correct level of abstraction.


This is a lot different than I remember. As I remember it, Java was touted as a compile-once run everywhere language. Back then, you had many different flavors of UNIX, with different hardware architectures, and in practice, software was deployed (and supported!) directly on customer servers (web services didn't exist). Plus, this was a time when pretty much all code was written in C. Virtual machines with garbage collection were academic research projects.

Java (until recently) was always a sub-par language. It was difficult to work with and behaved in unexpected ways. Academics hated it because there were much better grammars. Engineers hated it because it was slow and rigid compared to C. Gosling designed it for embedded systems, of all things. But, it aimed to solve a very expensive problem, and lots of hype was directed at it.

(Edit: I should have emphasized that the tooling really sucked more than the language. We had fantastic C compilers and IDEs, and Java had immature tools. Worse, you had to manually build Jars and package them appropriately. Ant, Maven, Gradle, etc. didn't exist.)

Also, the GoF authors didn't write Design Patterns as a response to Java. They wrote it as a retrospective on large software systems - what worked and what didn't. The 'patterns' were called such because they were observed in many independent projects, each having been reinvented time and again. The patterns proliferated with Java EE architects, who pushed them as common ways to communicate standard design principles. You have to keep in mind that this was also a big time for 1) RAD, 2) UML, and 3) software factories. Large companies were sold a vision that in a very short period of time all they would need would be an 'enterprise architect' who would 'draw' the software, hit a button, and out would come the complete system.


> Java (until recently) was always a sub-par language. It was difficult to work with and behaved in unexpected ways.

Every language has its idiosyncrasies to be sure, but this doesn't really jibe with my recollection. I remember Java being a much better alternative to C++ in many ways, including (and especially) binary object compatibility, which made it possible to effectively use libraries. Remember how C++ libraries had to either be pre-compiled for your platform (with the same compiler you were using, etc.) OR be shipped as source... and given C++'s blazing fast compilation times cough cough, well....

Anyway, in 2001 we were building Java IVR applications on Windows and shipping the jars to AIX machines and running them with no problem. The "write once, run everywhere" thing certainly wasn't 100% true (especially for desktop apps using Swing) but for many applications it really was the case (or close to it).


I'm sure it worked well in many cases. But, in our case, we were developing for Solaris, AIX, HP-UX, and Windows NT. The VMs would behave differently in different environments, and I remember VMs stack dumping on many occasions which became a major support headache. We used to call it 'build once, run nowhere.'


> But, in our case, we were developing for Solaris, AIX, HP-UX, and Windows NT. The VMs would behave differently in different environments, and I remember VMs stack dumping on many occasions which became a major support headache. We used to call it 'build once, run nowhere.'

Interesting. The only thing I really remember people complaining about a lot was thread scheduling. Java always let you set a thread priority, and a lot of people tried to use that to make things behave in very specific ways, but - if I recall correctly - that behavior was always allowed to be platform dependent and couldn't be relied on to create a consistent cross-platform experience. Outside of that, it seemed like UI issues with Swing (or, I guess, AWT) were the main things people ran into trouble with.

What time frame was all of this for you? Wondering if it was just the very early days of Java before some of the problems were resolved.


Also, Java only provided the lowest common denominator. A piece of software I worked on back then needed to start a long two-hour process, but only if there was more than 2GB of free disk space (which was not a trivial requirement circa 2000). But Java did not have a way to query free disk space at the time, because IIRC it was missing on one of the platforms (and then it wouldn't be "run everywhere"). So instead it would run and, if there was insufficient disk space, just fail.
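For what it's worth, later JDKs closed this particular gap: a rough sketch of the check described above, using java.io.File.getUsableSpace() (added in Java 6; the 2 GB threshold mirrors the requirement above):

```java
import java.io.File;

// Sketch of the precondition the commenter wanted: test free space
// up front instead of letting a two-hour job fail midway.
public class DiskSpaceCheck {
    private static final long REQUIRED = 2L * 1024 * 1024 * 1024; // 2 GB

    public static void main(String[] args) {
        File dir = new File(".");
        // getUsableSpace() returns 0 if the query fails, so the guard
        // degrades to refusing to start rather than crashing later.
        long free = dir.getUsableSpace();
        if (free < REQUIRED) {
            System.out.println("insufficient space: " + free + " bytes free");
        } else {
            System.out.println("ok to start long job");
        }
    }
}
```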

Java was always "write once, debug everywhere"


> Academics hated it because there were much better grammars.

Surely not at my university.

By 1998 it had already replaced C as the language for distributed computing classes.

C with SUN RPC or PVM was just too clunky to keep on using and was used just as an introduction to distributed computing.

Same with compiler design classes.

ECOOP '99 was full of Java-related presentations, with a keynote by Jim Waldo about Jini, something that would come in handy in our modern IoT days.

> We had fantastic C compilers and IDEs. And, Java had immature tools.

On which OS?

Visual Age, JBuilder, Zortech all were quite similar to their C and C++ siblings.

> The patterns proliferated with Java

Nah, they were already quite common in C++ with CORBA and COM/DCOM projects.

EDIT: typo to => too


Sure, universities embraced it in the late 90s because it was easier to teach to undergrads, and CS programs were becoming cash cows. But, the researchers that I knew certainly weren't using it for their own Computer Science research (aside from VMs and heterogeneous environments).

Yeah, Jini was cool! And Tuple spaces, etc. There were some cool things being built on Java, but few people were embracing it as a silver bullet.


I feel you haven't taken a look at Java in the enterprise in a long time.

Spring is considered to be an overloaded, bloated mess, and Java EE has become lean and mean, with plenty of very capable specifications and implementations (JPA, CDI, etc...).

And FYI, the Design Patterns book came out before Java (1994 vs. 1995), so I think you have your timelines confused. The early editions of Design Patterns didn't contain a single line of Java; they were mostly C++ (and some Smalltalk).

Java is twenty years old, and for such an old geezer, it's adapted remarkably well (and between Java 8 and Kotlin picking up momentum, its legacy seems to be well assured).


I was trying to express a lot of ideas in a small space, and I have edited it since, but it's still not as clear as I'd like. You and I agree, but I have struggled to get this point across.

Spring is a bloated mess, but when it first came out it was a lean new framework and a relief from EJBv1 and EJBv2. In response, EJBv3 was much better, but in the meantime Spring became the new normal and grew into some strange swiss-army-chainsaw glue, with now-deprecated awful XML configuration replaced by the horror of not-quite-code-but-compiled-into-the-classfile @Annotations!

The 'new Spring' is Spring Boot, which was a response to this then-obscure framework called Dropwizard, which showed that all you had to do was pick a handful of very good libraries to get work done. Guava, Jersey, Jackson, Jetty, some decent ORM, and you're good to go. Some of these are backed by new Java specs like JAX-RS and JPA that are legitimately pretty good.

I think the biggest problem Java still has in the hearts and minds of people outside the 'enterprise' is not because of Java, but because of the complexities that only manifest in environments where your code isn't 100% greenfield every time.

Dependency Injection isn't something you realize you need until you realize you need it -- and then you realize you need it Really Badly.

Externalized XML config files declaring reusable beans aren't something you think you need until your client asks how they can reconfigure something in the shipped software when they can't just recompile it to what they need.
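The dependency-injection point can be shown without any framework at all. A minimal hand-rolled sketch (all names hypothetical) of constructor injection, which is what Spring's XML bean definitions ultimately wire up for you:

```java
// Minimal DI sketch: the collaborator is passed in rather than
// constructed internally, so a container (or external config) can
// swap implementations without recompiling the consumer.
interface Notifier { void send(String msg); }

class EmailNotifier implements Notifier {
    public void send(String msg) { System.out.println("email: " + msg); }
}

class SmsNotifier implements Notifier {
    public void send(String msg) { System.out.println("sms: " + msg); }
}

class OrderService {
    private final Notifier notifier;
    // The dependency is injected through the constructor.
    OrderService(Notifier notifier) { this.notifier = notifier; }
    void placeOrder() { notifier.send("order placed"); }
}

public class DiDemo {
    public static void main(String[] args) {
        // The wiring decision lives here, outside OrderService.
        new OrderService(new EmailNotifier()).placeOrder(); // prints "email: order placed"
    }
}
```

A DI container moves that last wiring line out of code and into configuration, which is exactly what makes shipped software reconfigurable.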


> the horror of not-quite-code-but-compiled-into-the-classfile @Annotations!

I really don't get the hate for annotations.

They have saved me lots of hours since 2009 and at close to zero cost. What's not to like?


Since they're compiled into the code, it means you can't change them without recompiling, so they're not actually configuration. But they're not actually code either, so they can't be subclassed or inherited, don't support generics, and have other quirks [1]. Furthermore, they are a sort of explicit notation to make you feel like you're getting 'convention' over 'configuration', but if it were truly convention you wouldn't need them in the first place.

For example, to teach my class how to serialize itself into JSON, I can compile com.fasterxml.jackson annotations into it, but by doing so I've made a particular JSON library a dependency of myproject-datatypes. I'd rather my myproject-controller or some other layer of my application take care of how the data looks over JSON. Luckily Jackson supports an entirely different way [2] of teaching it how to map a class, but not all libraries are so nice.

That is not to say annotations don't have their place. They're good declarative structures, but in my opinion they are used in places where they shouldn't be. This blog post [3] from 2009 accurately describes how I feel about annotations.

[1] http://www.cowtowncoder.com/blog/archives/2009/02/entry_216....

[2] https://github.com/FasterXML/jackson-docs/wiki/JacksonMixInA...

[3] http://naildrivin5.com/blog/2009/03/11/java-annotations-java...
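A small, self-contained sketch of what the parent means by annotations being 'not actually code': a hypothetical @JsonField annotation is baked into the compiled class and read back via reflection, so changing its value means recompiling, and the annotation type itself can't be subclassed or carry generics:

```java
import java.lang.annotation.*;
import java.lang.reflect.Method;

// Hypothetical annotation mimicking a JSON library's field mapping.
// Annotation types cannot extend other annotations, and their
// attribute values must be compile-time constants.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface JsonField {
    String name();
}

class User {
    @JsonField(name = "user_name")
    public String getName() { return "alice"; }
}

public class AnnotationDemo {
    public static void main(String[] args) throws Exception {
        Method m = User.class.getMethod("getName");
        JsonField f = m.getAnnotation(JsonField.class);
        // The value is fixed at compile time; changing it means
        // recompiling User, so this is metadata, not configuration.
        System.out.println(f.name()); // prints "user_name"
    }
}
```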


Annotations are data about the code. The fact that you can't change them without recompiling the code is a feature, not a bug.

If you want that kind of flexibility, use external files (e.g. XML) but now the two can get out of sync, which is what is commonly referred to as XML hell.

The rule of thumb is simple: whenever you need to add information about something in your source (class, method, field, package), use an annotation. If you need to add information about something that is not source code (port, host, various sizes, etc...) then use an external file. In particular, if you specify a reference to Java code in your XML, you should use an annotation instead.
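The rule of thumb above can be sketched in a few lines: deployment facts (host, port) come from an external Properties file, while facts about the code stay in annotations. The file contents are inlined here as a string (with hypothetical keys) so the example is self-contained:

```java
import java.io.StringReader;
import java.util.Properties;

// External configuration per the rule of thumb: values that are not
// about the source code (host, port) live in a file that can change
// without recompiling anything.
public class ConfigDemo {
    public static void main(String[] args) throws Exception {
        Properties p = new Properties();
        // In a real deployment this would be loaded from a file or
        // classpath resource; inlined here for the sketch.
        p.load(new StringReader("db.host=localhost\ndb.port=5432\n"));
        System.out.println(p.getProperty("db.host") + ":" + p.getProperty("db.port"));
        // prints "localhost:5432"
    }
}
```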


> to teach my class how to serialize itself into JSON, I can compile com.fasterxml.jackson annotations into it, but by doing so I've made a particular JSON library a dependency of myproject-datatypes

Right, because at the moment, there is no standard cross-library convention for that. One day, there will be, as there is for serializing to XML:

http://docs.oracle.com/javaee/7/api/javax/xml/bind/annotatio...


What about gson?


It's not a standard, just another implementation.


Thanks for taking time! My views below:

> Since they're compiled into the code, it means you can't change them without recompiling, so they're not actually configuration.

XML wasn't really configuration either; I can't remember ever changing XML for deployment. Actually, in many ways the whole build-jar/war process feels a lot like compiling in Javaland, and while I'll happily pick jars or wars apart for troubleshooting, deploying the result in production would break multiple guidelines at most places, I guess (hope).

> Furthermore, they are a sort of explicit notation to make you feel like you're getting 'convention' over 'configuration', but if it were truly convention you wouldn't need them in the first place.

Partially agree but then again every other framework I have seen demands you do something: put x classes in y folder, subclass another class or implement an interface or something.

Compared to magical directory layouts, annotations are less, well, magic - more explicit.

Compared to subclassing it is more flexible (although I remember enjoying Propel with Symfony 1.)

Compared to marker interfaces? Not sure.

That said, once you accept them there are plenty of things you don't have to specify but can override if you want/need to.

Edit: As for your third reference it is out of sync itself:

> The same goes for EJB. I have a class named FooStatelessBean. How about we assume it's a stateless session bean, and it's interface is defined by its public methods? It can then provide FooRemote and FooLocal for me, and I don't need to configure anything or keep three classes in sync

I only use three annotations on such ones: @Named at the class telling the DI framework to pick it up for injection using its name as the name (although I can override it right there and then if I need to.) @Stateless tells it to make a suitable bunch of them and that it is OK to give any of them out to anyone at runtime and @PersistenceContext (at an instance variable) tells me that I want a reference to the ORM layer.

I could easily create a combined annotation for @Named and @Stateless, even naming it @Stateless and just importing my @Stateless instead of the standard (across all Java EE implementations) one.

I guess I could also create a superclass that does all this and subclass it but some day I'll leave the project and another one will have to maintain it.

Could we have picked this out from the name? Yep. Would it be magical? Yep, more so than annotations. Would it tie my hands with regards to naming classes? Yep.
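The combined annotation described above is just a meta-annotation. A sketch with local stand-ins for @Named and @Stateless (hypothetical, so it compiles without an EE container on the classpath; in real Java EE/CDI this pattern is what stereotype annotations formalize):

```java
import java.lang.annotation.*;

// Stand-ins for the Java EE annotations mentioned above.
@Retention(RetentionPolicy.RUNTIME) @interface Named { String value() default ""; }
@Retention(RetentionPolicy.RUNTIME) @interface Stateless { }

// The combined annotation: one annotation that itself carries
// @Named and @Stateless. A CDI-style container discovers the two
// by walking the meta-annotations on @MyStateless.
@Retention(RetentionPolicy.RUNTIME)
@Named @Stateless
@interface MyStateless { }

@MyStateless
class FooBean { }

public class StereotypeDemo {
    public static void main(String[] args) {
        for (Annotation a : FooBean.class.getAnnotations()) {
            System.out.println(a.annotationType().getSimpleName()); // MyStateless
        }
        // A container would look behind the stereotype like this:
        Annotation combined = FooBean.class.getAnnotation(MyStateless.class);
        System.out.println(combined.annotationType()
                .isAnnotationPresent(Stateless.class)); // true
    }
}
```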


> which was a response to this then-obscure framework called Dropwizard

I sat in with an AMA in NYC with Dave Syer. He said it was inspired by a project he was co-opted to which used Rails.

Not the actual architecture of Rails, mind you, but the "just works", out-of-the-box, no-configuration experience.

Disclosure: I work for Pivotal. So do many of Spring's core committers. There's a lot of feedback from Labs and Cloud R&D into Spring, but they're still a largely self-directed team.


Trying to upvote this because it's good info! But, but, I have one better :)

Spring Boot 1.0 GA Released [1] blog post, says, and links to a Spring Issue SPR-9888:

"It's been 18 months since the original request [2] to "improve containerless web application architectures", that gave birth to Spring Boot, was raised."

The body of the original issue [2] says, midway down:

"I think that Spring's web application architecture can be significantly simplified if it were to provided tools and a reference architecture that leveraged the Spring component and configuration model from top to bottom. Embedding and unifying the configuration of those common web container services within a Spring Container bootstrapped from a simple main() method.

Though there are many frameworks and platforms today that no longer require a container I think inspiration can be drawn most from DropWizard (http://dropwizard.codahale.com/).

Another project I've seen inspired by DropWizard but leveraging Spring is HalfPipe (https://github.com/32degrees/halfpipe). Though I don't think HalfPipe goes far enough. I think to truly provide simplification the entire architecture, wherever reasonable, must be embedded within the Spring container rather than without. Though it does have several other interesting ideas."

[1] https://spring.io/blog/2014/04/01/spring-boot-1-0-ga-release...

[2] https://jira.spring.io/browse/SPR-9888


It could absolutely be both. Dave Syer is not the only person who drove Spring Boot. Phil Webb is the main day-to-day driver, from what I can tell. He's definitely been active in soliciting feedback from Labs Pivots.


> The 'new Spring' is Spring Boot, which was a response to this then-obscure framework called Dropwizard

This is correct, but what pisses me off is that the Spring folks clearly copied Dropwizard, then failed to even mention Dropwizard in their various docs, blogs, talks, etc. - which, you might think, is fine. But they DO mention other (less useful) projects as supposed alternatives to Spring Boot, essentially pretending that Dropwizard doesn't even exist when clearly it's the reason Spring Boot exists! Shameful.

Spring hails Boot as the new way of doing things, but it's still by no means a clean break for the rest of the Spring ecosystem where programming by XML (or annotations) is still king and the APIs are still a horrible, overcomplicated mess. There's no undoing that.


Each tech has its use.

Annotations are great for class metadata - JPA or beans binding or servlets and filter paths.

Externalizing binding was the kind of nice thing that almost never quite worked right. XML (or YAML, because sysadmins are people too) for configurable parts is OK, but mixing and matching services through it was never actually needed. The binding API allowed that years before the Spring style of dependency injection brought madness into the enterprise world, and in a far better and cleaner way, using the facilities of the language instead of creating ugly reflective glue that kind of worked.


Agreed, I think you and I have very similar views on the JVM/Java EE/Spring situation.


Spring Boot is very good. I can't imagine writing enterprise Java code without it anymore.


You might want to try out http://bootique.io, it's even better in many aspects


> I feel you haven't taken a look at Java in the enterprise in a long time.
>
> Spring is considered to be an overloaded bloated mess and Java EE has become lean and mean

I agree with those descriptions, but we have very different experiences of Java in the enterprise. In the companies I work with, EE still means older specifications, and Spring, for all its flaws, comes in as a breath of fresh air.


I'd argue much of that time has passed now. I used to write J2EE code. Now we have frameworks like Lagom and Dropwizard which make life exponentially better. I'd suggest taking another look.


> It's the result of man-centuries (man-millennia?) of work, yes, and as a result is really, really impressive. But imagine if that much work had been put into something fundamentally great, like Smalltalk or Lisp, rather than something fundamentally okay like Java.

The eternal irony of this is the current JVM---HotSpot---was a Smalltalk VM that Sun bought. It was an attempt to commercialize some of the work done (also at Sun) on the Self language [1].

[1] http://c2.com/cgi/wiki?HotSpotVm


Well, "an attempt to commercialise" seems to me to trivialise it a bit.

The fact is that the Self research uncovered some pretty interesting things around optimisation... at the time optimisers were pretty much based on the research done in the '70s, when structured languages were still quite new-and-all. The Self work uncovered that those assumptions were really quite bad when it came to OO environments. OO-structured programmes behaved quite differently at runtime. Indeed, one of the most surprising (then) results was that you really couldn't optimise all that well at compile-time, but that you needed continuous runtime analysis and optimisation to do anything really effective. But when you do, then it's hella effective! Remember that we were still in an era when byte-code interpreters were viewed (mostly rightly) as woefully slow and inefficient.

Sun had part-funded the Self research for quite a number of years (>5? if memory serves me) so when they found themselves with a bytecode interpreter that really needed some serious help there was a natural fit, and why should they not have gained some advantage out of some pure research work they had funded?

As I recall it, most of the Self research was done at Stanford. Sun just provided money, and only much later in the project's lifetime, a home in Sun Labs ("Church" side of Sun).

(All this is purely from memory, so I might have misremembered bits of it - I can go upstairs and dig out all those old OOPSLA conference proceedings if it matters.)


If Smalltalk and Lisp are so magical then why didn't they win?


Smalltalk has a strength and a weakness in its 'image' approach. Smalltalk doesn't really have source code and object code, it has an 'image' full of objects which you can run, and which contains all of the standard library, the development environment, and your code. You work in this image, and save changes as you go along.

That makes Smalltalk super easy and convenient to develop in, and very consistent across platforms. But it's a roadblock to publishing code, working as a team, deploying, and the other things you need to do with software that isn't a toy. The Smalltalk environment and community certainly have ways of doing all those things, but they can be clunky, and some of them are quite recent.

Meanwhile, C++ was a bit of a pain to write, but once you'd written it, sharing it was as easy as emailing a file, or checking it into RCS (or that hot new CVS thing), and building a production release was a matter of running 'make' and going for lunch.

Java kept C++'s simple approach. Smalltalkers at the time laughed at how caveman-like the experience was. But it turned out that that simplicity was exactly what you needed to grow virally on the then-nascent web.


I don't get why you're being downvoted. Lisp had a 40-year head start, it had commercial backing in multiple iterations, and it has a grassroots community that persists to this day.

I can't say anything about Smalltalk because I don't know enough about its history, but at this point in time it's pretty sure that unless some sort of major event happens, Lisp won't ever go mainstream.


Because Worse is Better applies here as usual. Java was fast enough and familiar enough and the libraries were good enough and the language safe enough and Sun supported it well enough in the enterprise. Nothing else had the same combination of okayness. I never enjoyed Java but it had my half-grudging respect.


I'd agree with you with the caveat that "worse" in that paper means that something might be less elegant or rigorous on a theoretical level but still the more practical choice.

And now that we have good, modern high level languages with robust but flexible type systems I'd choose Haskell or Ocaml or maybe Swift over Smalltalk or Lisp if I wanted to stick my neck out technically on a project. I consider dynamically typed languages to be significantly "worse" in an absolute sense for building non-trivial software.


> And now that we have good, modern high level languages with robust but flexible type systems I'd choose Haskell or Ocaml or maybe Swift over Smalltalk or Lisp if I wanted to stick my neck out technically on a project. I consider dynamically typed languages to be significantly "worse" in an absolute sense for building non-trivial software.

I don't know if there are any typed Smalltalks, but Common Lisp has optional static typing: you can declare types and the compiler will enforce them. This seems to me the best of both worlds: the freedom to interact with your program dynamically, with the safety of strong static types.


But with Common Lisp, you don't get the safety of strong static types. If the compiler can prove that there's a type error, it will throw an error. That's not the same as throwing a type error when the compiler can't prove that there isn't a type error, as a sound type system would.

Even worse, Common Lisp will use your type annotations to optimize code, which can change the behavior of your program. It's better to think of them as type hints for optimization than to think of them as optional static types.


That's true, because doing so in Lisp is undecidable and because you allow function definitions to change (that's why type checking works better with inline and local functions).

The type of analysis applied to Lisp is adapted to its type system: you have ranges of values, unions, etc. In particular SBCL can exploit some relationships between runtime values and types (I detailed it recently, so I prefer not to repeat myself: https://news.ycombinator.com/item?id=12222404)

As far as I am concerned, it is already quite powerful at detecting the type-related errors that I tend to make. The part that cannot be decided during compilation is left to be checked at runtime, which is strong dynamic safety. Instead of manipulating untyped bits, you have all you need to never corrupt the state of your program. The good news is that if implementors make progress with type inference, you don't have to change the language: more checks can move from runtime to compile time (the hard part would be checking the "satisfies" type ;-).

With some types left to be checked at runtime, you end up in a situation that is no worse than statically typed programs which throw exceptions for other kinds of errors. Sometimes those errors arise because programmers implement dynamic things, like ad-hoc tags to identify objects at runtime (so this is basically a dynamic type system in disguise). Other errors related to safety are tied to temporal issues, deadlocks, permissions, priority, security, etc. You can't always encode them with types. There are formal tools for dealing with them, but at this point, whether you use them to generate safe code in Lisp or OCaml is not very important (and generating Lisp code is straightforward).

> Even worse, Common Lisp will use your type annotations to optimize code, which can change the behavior of your program.

This is a strange concern. The modified behavior belongs to the set of behaviors that would be possible without checking types, not a new, faulty behavior that contradicts what the code says.


No caveat needed: that's exactly how I intended the use of 'worse'. I linked the original essay in another comment below.


> Because Worse is Better applies here as usual.

It's not that simple. They lost for very pragmatic reasons: Smalltalk and Lisp were worse than Java in many ways (even slower than Java's initial implementations, Smalltalk's binary images were undeployable in production, these two languages are dynamically typed, etc...).


[1] https://en.wikipedia.org/wiki/Smalltalk#History, scroll midway down:

"During the late 1980s to mid-1990s, Smalltalk environments -- including support, training and add-ons -- were sold by two competing organizations: ParcPlace Systems and Digitalk, (...) Both firms struggled to take Smalltalk mainstream due to Smalltalk's substantial memory needs, limited run-time performance, and initial lack of supported connectivity to SQL-based relational database servers. (...)"

"In 1995, ParcPlace and Digitalk merged into ParcPlace-Digitalk and then rebranded in 1997 as ObjectShare (...) The merged firm never managed to find an effective response to Java as to market positioning, and by 1997 its owners were looking to sell the business."

There was also Strongtalk, a strongly typed variant of Smalltalk. The guys who wrote Self and the guys who wrote Strongtalk were both eventually acqui-hired by Sun and they were the ones who wrote HotSpot [2][3]

[2] http://www.strongtalk.org/history.html

[3] https://en.wikipedia.org/wiki/Strongtalk


'Worse is Better' is exactly about pragmatic reasons prevailing over idealism. The 'MIT Way' is the stereotyped opposite approach, and LISP is a great example of that: in theory it's the 'right way', but in practice it fails to address some aspect of real world needs somehow.


I understand that but I still dislike the "Worse is Better" characterization, which sounds very condescending to me and which is usually put forward by proponents of the languages that lost, hinting that the reason for the outcome is because people are stupid and picked the worse alternative.


Well, the originator was a 'losing' LISP guy who admired the robustness and practical effects of the other approach:

"However, I believe that worse-is-better, even in its strawman form, has better survival characteristics than the-right-thing, and that the New Jersey approach when used for software is a better approach than the MIT approach."

"There is a final benefit to worse-is-better. Because a New Jersey language and system are not really powerful enough to build complex monolithic software, large systems must be designed to reuse components. Therefore, a tradition of integration springs up."

Richard Gabriel https://www.jwz.org/doc/worse-is-better.html


Looking at how Go has become successful (and Go is basically Java 1.0, or if you prefer, Limbo merged with Oberon-2), it isn't very far from how people pick languages.

At least the amount of insecure C code decreases, as it gets written in a safer language


I remember Smalltalk and Lisp being pretty slow at the time and requiring tons of resources to run. Smalltalk and Lisp had better concepts, but they fell short in delivering practical results.


Java wasn't really light on resources at any point of its history either.


Java's resource requirements fit pretty well within the hardware capability of the time, and grew with the hardware over time. Smalltalk and Lisp were quite ahead of their time, requiring way more resources than the hardware could provide.


Ok which time are we talking about? In 1965, sure. In the times Java was actually around, it hardly had an edge against Common Lisp (can't comment on Smalltalk) implementations of the day by any resource metric. That's two and a half decades already.


Why is nobody mentioning Clojure in this thread? I feel like Lisp kind of did win in the sense that Clojure is a reasonably sane Lisp that's gaining popularity and that lives on the JVM.


Not only is Clojure "a reasonably sane Lisp", it's a practical language that feels designed (because it is) using "power through simplicity", decomposition, decomplection.

This frees up an enormous number of brain cycles to design the upper layers.

It embraces its host (JVM, JavaScript), which enables tremendous reach and code-reuse; at the same time it fixes most WATs and subtler pitfalls (by using immutable, persistent data (structures) by default).


If only the problems of slow startup and unfriendly error messages could be solved - it would be like a dream coming true.


I think the spec library that they're putting together right now might be able to solve the error messages. What I'm waiting for is to be able to hit something like dot in my IDE and see a list of suggestions; I'm not sure if spec can be leveraged to that extent.



Completing the first half of a symbol that I've been typing already works reasonably well with Cursive. What I'm looking for is something that looks at the result of an expression and figures out what other functions can accept that as input. Matching up specced postconditions with preconditions might work, but I'm not sure how well.


> I feel like Lisp kind of did win in the sense that Clojure is a reasonably sane Lisp that's gaining popularity and that lives on the JVM.

'Reasonably sane' is a matter of opinion. I really, really dislike Clojure. I don't want to ban anyone from using it, but it really feels wrong to me. Far better to just use Lisp on the JVM, with a well-defined interface to Java.


Because just like with many other things humans use, the least common denominator wins. Everybody pretty much understands the concepts in Java but has trouble with LISP concepts. Just by interviewing ~40-50 Java programmers (mostly juniors) I can tell that even concepts like recursion are not well known or understood among them. At this level they think they do not need to know these advanced concepts to be successful as a software developer. As a company you want to use a language that everybody can understand so you do not have trouble with hiring, with few exceptions (like Jane Street).


You asked a good question. And I hate to say it but that fact has driven a lot of people up the wall and they couldn't grasp how a language could win widespread adoption.


Java and the JVM are entirely separate beasts and should not be taken together. While most JVM code began as Java code, there are enough non-Java languages hosted on the JVM at this point that it should be looked at through the Java lens only.


>...that it should be looked at through the Java lens only.

I think you missed a "not" there.


I did indeed. Good catch!


You should not confuse or conflate Java the language with Java the virtual machine. They really are quite distinct technologies.


> The JVM is a treasure just sitting there waiting to be rediscovered.

So true. Long, long back, when people were going NodeJS ape-sh*t, I did some benchmarks proving the JVM can blow V8 out of the water any given day (URL if someone is interested: http://blog.creapptives.com/post/9924551244/the-node-redumpt...) and got trolled by NodeJS fanboys a lot.


> I did some benchmarks proving the JVM can blow V8 out of the water any given day (URL if someone is interested: http://blog.creapptives.com/post/9924551244/the-node-redumpt...) and got trolled by NodeJS fanboys a lot.

Anyone who's worked with JVM code knows what an amazing feat of engineering it is, really far more sophisticated than anything comparable. I've also experienced a similar reaction from NodeJS fanboys in the past.

That said, V8 is undergoing heavy development by a group of good people, and I'm sure it will catch up, eventually. The GC is improving all the time, for example.


But nodejs was never really about JVM vs. V8 in the first place.

It was about async io, handling everything in a process and a handful of threads, instead of spawning a new thread for every request (and paying for OS context switch).

Of course in your article you mention Netty. But Nodejs is (still) the only ecosystem where everything is async (event-loop style), vs. Java Netty or Python Twisted, in which you only have a subset of libraries.

Or am I mistaken?
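
To make the contrast concrete: the JVM does have composable async primitives built in. A minimal sketch using CompletableFuture (the `fetch` helper here is made up, standing in for a real non-blocking I/O call):

```java
import java.util.concurrent.CompletableFuture;

public class AsyncSketch {
    // Hypothetical stand-in for a non-blocking I/O call: the work runs
    // on a pool thread, and the caller's thread is never blocked.
    static CompletableFuture<String> fetch(String key) {
        return CompletableFuture.supplyAsync(() -> "value-for-" + key);
    }

    public static void main(String[] args) {
        // Compose work callback-style, as an event-loop system would,
        // instead of dedicating a thread per request.
        String result = fetch("user:42")
                .thenApply(String::toUpperCase)
                .join(); // join() only here so we can print the result
        System.out.println(result);
    }
}
```

That said, having the primitive is not the same as having a whole ecosystem of async libraries, which was the point above.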


Akka, Play and Lagom are part of the Lightbend Reactive Platform -- everything's async from the ground up, built on the JVM and Scala.

The authors put together the Reactive Manifesto: https://www.lightbend.com/blog/reactive-manifesto-20


That manifesto site drives me completely nuts. It reads like somebody is trying really, really, really hard to describe Erlang without actually saying, "Erlang".


> It was about async io, handling everything in a process and a handful of threads, instead of spawning a new thread for every request (and paying for OS context switch)

Given that the data, and the events about the movement of data, come from the kernel, there are still just as many context switches with lightweight threads.

The only real advantage of lightweight threads, as pcwalton pointed out recently in another discussion, is that they can have much smaller stacks, so you can have many more of them before you run out of address space.

The biggest advantage of Node.js, though, is that it lets people who know JavaScript write server-side code with reasonable performance and scalability. If you already know Java, there's no reason to use Node.js. If you only know JavaScript, or JavaScript plus some 'slow' server-side language like Ruby, then Node.js is an easy way to get more performance and scalability.

Go is filling a similar niche. If you know Java or C# or C++, Go is pointless, but if you only know JavaScript, Ruby, Python, etc, then the ease of learning Go makes it a viable option on the server side.


Lightweight threads? Nodejs doesn't use them. There is only a process and a handful of threads which are basically workers.

And I don't buy that the only point of nodejs is that _that it let people who know JavaScript write server-side code with reasonable performance and scalability_.

Writing nodejs code is a completely different beast from frontend javascript. I actually think writing nodejs is much harder to understand and code than e.g. Golang or Java.

For me, personally, the point of Nodejs is neither performance nor the fact that its _async_.

Nodejs has a really good balance of performance, tooling, ecosystem and scalability.

Of course I could be wrong.


Cool article, would be interesting to see these benchmarks updated :)


Now that the CLR is getting more attention because of Microsoft's cross-platform efforts, how comparable is the CLR to the JVM, engineering-wise?

I ask because, as a C# developer, I like C# as a language, but am jealous of Java's extensive libraries and the legendary engineering of the JVM. How likely is it that the CLR can serve as the basis for future Big Data projects like Hadoop or Spark?


The best would be to do as we do.

My employer works with both eco-systems, so we happen to jump between JVM and CLR all the time.

Actually in my case, I am jealous that on the .NET side I get AOT compilers for free (NGEN, .NET Native, MDIL), official unsafe, and painless FFI, instead of having to buy them, while I will need to wait for Java 10+ to use them on JVM-based projects.


For the JVM, look at JNA instead of JNI; that makes the FFI a lot saner.


Unless you're doing something like high-frequency trading, I doubt you'll notice a difference between the CLR and the best JVMs.


For me, as an outsider, the JVM looks a bit like .NET: this huge, extremely complex codebase which does everything and anything, very optimized and comes with great tools.

But to me that is a problem. Even though the JVM could do X or Y better than another platform, just the time spent learning about it feels like a waste of time, as I could already be productive in another language which is less optimized and less complex, but much faster to develop with. It seems just learning to use a Java IDE is a job in itself.

I do mathematical modeling, by the way. If I need speed, I use optimised libraries. So I might not be the target user of Java, although I've often heard good things about it for scientific computing.


>Even though the JVM could do X or Y better than another platform, just the time spent learning about it feels like a waste of time, as I could already be productive in another language which is less optimized and less complex, but much faster to develop with. It seems just learning to use a Java IDE is a job in itself.

Any other comparable language (basically with static types) you would want to use would require an IDE too. Of course you can write C or C++ or Rust, etc. without one, but you'll be as unproductive as writing Java without one (nothing inherently IDE-demanding that Java has compared to these).

Or at least configuring Emacs for the task, which can be a full job in itself too.

Now, if you talk about something like Python or JS etc, maybe. But the JVM != Java, and those languages run there too.

Plus, there's nothing really to "configure" to run the JVM, any more than you need to configure Python. There are GC and other options, but you can ignore that for the same (or even more) time than you could in Python.


> Any other comparable language (basically with static types) you would want to use would require an IDE too. Of course you can write C or C++ or Rust, etc. without one, but you'll be as unproductive as writing Java without one (nothing inherently IDE-demanding that Java has compared to these).

I disagree on that one. I find writing C++ without an IDE to be doable. However, with Java the import statements are not human-memorable.

Same goes for class names and variable names. In Java everybody writes getFoo and setFoo, which just gets annoying. I personally feel that the protobuf lib does it right in their C++ bindings: foo(), set_foo(...).

Same goes for golang. Golang does a great job of having easily memorizable packages, but also allows for more complex git based urls.


> However with java the import statements are not human memorable.

I suppose that depends on how many classes you need to import. I wrote a fair amount of Storm code (JVM/Java-based) and did all of the import statements by hand in vim (I don't think vim would have auto-completed my Java imports from a .clj file even if it had been setup correctly). Didn't seem that hard. I sometimes had to look at API docs for class names, but I'd have needed to do that anyway.

The only time IDE-based autocompletion would have been nice was when I was trying to read sample code. Since Java is so verbose, nobody writes import statements in their sample code. Required more googling than I'd have liked.


I started writing Java with no IDE; and mostly just used * imports; that worked pretty well (and I don't remember it being a headache at all). In IDE land (since, uh, I guess the first I used was called Roaster) automatic imports meant I could always use specific imports, which is a tiny bit better in my mind, but not worth ever doing it manually.


>I disagree on that one. I find writing c++ without an IDE to be doable. However with java the import statements are not human memorable.

Whereas C++ imports are? Except if you code all the relevant functionality by hand, or just rely on a few std things. But then you should compare like to like (e.g. only using the Java analogue to std, utils, collections, etc).

>Same goes for class names and variable names. In java everybody writes getFoo and setFoo. Which just gets annoying

So? Expanding to getFoo/setFoo is something even Vim can handle, no IDE needed even.


> Any other comparable language (basically with static types) you would want to use would require an IDE too.

Go is both statically-typed and perfectly usable without an IDE.

> Or at least configuring Emacs for the task, which can be a full job in itself too.

Modern emacs is pretty easy to set up. Take a look at prelude or spacemacs. You're up & running within a few minutes.


>Go is both statically-typed and perfectly usable without an IDE.

To the same extent that Java is perfectly usable without one.

Besides I find Go intolerable with or without an IDE. Not sure how people do any work without Generics, probably don't care about being DRY or don't smell the badness in e.g. using different math methods per type.

>Modern emacs is pretty easy to set up. Take a look at prelude or spacemacs. You're up & running within a few minutes.

Up and running with ...something. Customizing that, and getting to use it is another story altogether.


Java is great for a lot of things. That's why you see large distributed systems projects that power large companies being written in java. See presto at facebook or kafka at linkedin for example.

It's not as good at for loops though. If you need it for hpc you should still be using c++ underneath somewhere for simd instructions and other neat tricks only available in lower level languages. I look at java as a better python or: "frontend to c++ code"


Even with HPC (the Azul Systems folks would object to your claim), you just have to code in a foreign style of not allocating memory. Once you get that style in your tight loops, Java can run very very fast.
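
A hedged sketch of what that allocation-free style looks like: preallocate everything up front and keep the hot loop free of `new`, boxing, and iterator objects, so the GC has nothing to collect while it runs.

```java
public class HotLoop {
    // Preallocated, reused buffer: nothing is allocated inside the
    // loop, so the collector never has a reason to run mid-computation.
    private static final double[] buf = new double[1024];

    static double sum() {
        double s = 0.0;               // primitive local, no boxing
        for (int i = 0; i < buf.length; i++) {
            s += buf[i];              // plain array access, no iterator object
        }
        return s;
    }

    public static void main(String[] args) {
        for (int i = 0; i < buf.length; i++) buf[i] = 1.0;
        System.out.println(sum());    // 1024 ones sum to 1024.0
    }
}
```

The JIT compiles loops like this down to tight machine code; the "foreign style" is mostly about resisting the idiomatic urge to allocate collections and boxed types per iteration.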


I've debated with Cliff Click about this before, actually. There was a consensus that the JVM can support certain kinds of operations for that stuff (eg: basic SIMD) with that coding style. It can't do everything though.

It's a lot better to go JNI for 99% of this stuff though.

I'd also love to know when Java has first-class support for GPUs (you know, where HPC actually runs).

I'll continue relying on c++ for that stuff. It's also what nvidia supports first class.

If you have any first hand experience with this stuff I'd also love to hear your thoughts on this as well.

Better yet: Here's a mailing list where these people (including me) sit.

Go ahead and peruse it. I've learned a lot being on here.

I'm STILL of the opinion the JVM can't do most of these things well after having run the numbers: https://groups.google.com/forum/#!forum/mechanical-sympathy

For cpu I'll be sticking with openmp for the long term. Don't even get me started with cpu specific optimizations (eg: different intel generations)


Fair enough, I personally don't have first hand experience with the super tight loop stuff on the JVM (though that seems like it'd be fun!). The JVM has always been fast enough for MY needs, but I realize that's not true for everyone. Thanks for the link, that seems like a good mailing list.


Nah I mean this stuff IS hard. That's why I offered that mailing list. Like I said a lot of people from azul and a lot of the hardcore jvm heads sit on there.

If there's a way to make Java faster for that, great. You are right that the JVM DOES support some of the tight loop stuff.

It's harder to do and leads to messy code though. In C++ you can write maintainable code and add pragmas. It's a lot cleaner and with javacpp (https://github.com/bytedeco/javacpp) we just generate the jni bindings.

We found that to be easier in practice. I'm admittedly not THAT familiar with the jvm internals.

I have just tried to get it to work and have done enough reading to know JNI will just be cleaner for that stuff.

The JVM is certainly fast enough, just mainly for distributed systems code (eg: databases, message queues)

Honestly that's what makes it appealing for me. I can write fast production systems and tweak it where necessary for the HPC apps.

Java provides a fast and safe baseline for 90% of the stuff I'd want it to do. We tried for about a year to get it to work for lin alg and finally said screw it let's just do c++.


I completely disagree that Java is well suited for distributed systems.

Several such systems are written in Java solely because Java is very popular generally and is the lingua franca of "enterprise" software development, not because Java is particularly well suited to the distributed systems problem space intrinsically.

Also, Kafka is written in Scala, not Java. Though I don't think Scala is particularly any better suited to the problem either.


Tell that to Facebook, Twitter, LinkedIn, and most big banks. The JVM (whether you like it or not) powers most of the bigger database systems in the world.

New code is still being written for the jvm. Lightbend and pivotal are also companies setting up large systems on the jvm. Seems to work fine for them. What are the alternatives? go?

Your point re: scala. Scala is STILL based on java. There isn't much of a difference for speed here. Akka and its ilk still rely on netty for the underlying transport mechanism (written in java) I'm very much well aware of what's written in scala. That includes spark and kafka among other things which is great. Those STILL rely on java libraries though.

Go is still slower yet: https://www.techempower.com/blog/2016/02/25/framework-benchm...

The other might be what? erlang? Good luck finding developers. The JVM is still the only platform that has not only big data mind share but things like microservices frameworks like lagom and spring boot while ALSO having things like message queues being written for it.


Overwhelmingly databases are not written in Java. Most of them are written in C and C++.

Secondly, I'm not denying that lots of systems are built on Java. But they're built on Java because those places already have super heavy investment in Java ecosystem libraries, tools, and engineering resources. Not because Java or the JVM is somehow intrinsically awesome for building distributed databases or distributed systems things.

In fact it's quite the contrary. It's an epic pain in the ass to keep heap growth and GC latency under control in most of these systems. The former being critical for node stability and the latter being critical for dead-node detection consistency (among other things).

I used to be a distributed database engineer, and I count among my friends many people who still work at Elastic.co, Confluent, MapR, Cloudera, Twitter, and Couchbase. I currently work for a big FinTech company building core distributed transaction processing infrastructure, and I am 100% certain that compelling the use of Java for certain systems development tasks has nothing to do with it as a piece of technology and 100% to do with politics.


A lot of fintech stuff (and google's stuff) is written in c++. MapR's hadoop distribution is also c++.

If you look at a lot of the nosql databases you could list off the top of your head including: presto, cassandra, and hbase, those are all java.

Newer ones like kudu as well as some of the older ones like mysql, postgres etc are def written in c.

I wouldn't say it's entirely politics there. Maintainability is a big factor in writing systems code that lasts and that you can rely on. The JVM isn't great for everything, but there are teams at, say, Twitter whose sole job is GC management and tuning. I know Cloudera and co also have these people on staff. They (as well as us) know the JVM isn't ideal for every use case. E.g. I personally do a lot of GPU stuff; I wouldn't use Java code for that (we use JNI/C++ for that).

If I had to argue for the jvm, I'd say with the right tuning it's reliable enough and you can also hire for it. There's trade offs of security and reliability when you start thinking about OTHER parts of this besides latency and speed. It's harder to screw up java code than c code..and a lot of people are pretty familiar with the internals (eg: off heap, unsafe)

The flink project is a great example of this. They took java and wrote their own memory manager allowing them to keep the java integrations but not deal with the GC. Many jvm based distributed systems have started working around this stuff now.
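
A toy illustration of the off-heap idea those systems use (this is a plain direct ByteBuffer, not Flink's actual memory manager): the payload lives outside the Java heap, so the collector only ever sees the small wrapper object, never the data itself.

```java
import java.nio.ByteBuffer;

public class OffHeap {
    public static void main(String[] args) {
        // allocateDirect places the 64 bytes in native memory, outside
        // the GC-managed heap; only the tiny ByteBuffer object is tracked.
        ByteBuffer buf = ByteBuffer.allocateDirect(64);
        buf.putLong(0, 42L);               // absolute put: no position bookkeeping
        System.out.println(buf.getLong(0)); // read it back
    }
}
```

Systems like Flink and Cassandra build pooled, structured allocators on top of this primitive (or sun.misc.Unsafe) so multi-gigabyte working sets never pressure the garbage collector.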


I really don't understand what you're arguing here. It's undisputed that several distributed systems are written in Java. What I'm disputing is the idea that Java is used to build these systems because Java is particularly well-suited to building distributed systems.

It isn't. I don't doubt that there are hordes of people whose entire responsibility it is to try to work around Java's rough, eclectic edges when it comes to systems engineering, but the fact that horde of people exists at all is one, among many, symptomatic indicators that Java itself is not intrinsically a particularly good fit for that problem space.

You've inverted the cause and effect.

Java is very popular, and Java is used in a lot of businesses, and some of those businesses have/had needs for different shapes of distributed systems, and those businesses have/had an easier time finding Java engineering resources, and those engineering resources built some distributed systems, and so now some distributed systems are built in Java or otherwise on the JVM.

There are even cases (Storm and Spark come to mind) where the selection of the JVM had seemingly as much to do with where those solutions were trying to position themselves in the larger ecosystem (augment Hadoop or eventually supplant Hadoop, respectively) as it did the technical merits of Java or the JVM itself.

Also, being able to hire for a thing is almost always a political issue. Like your prior comment, "The other might be what? erlang? Good luck finding developers." It's really easy to find Erlang developers. You just have to be willing to pay for them. Additionally, the proportion of Erlang developers who have a lot of experience building distributed, fault-tolerant systems is almost certainly higher than that of the proportion of Java engineers who can say the same. You don't even bother learning Erlang unless you're building distributed systems of some kind. Java is used for everything. That's just simple selection-bias. The hard part about using Erlang is getting management to sign off on using it instead of Java or some Silicon Valley darling language du jour. Anyway, hiring considerations aren't an issue of technical design or implementation merit. They're political considerations.

If you subtract the set of open source distributed systems that were born inside companies with large pre-existing JVM investments already (introducing a different kind of selection-bias), like Hadoop (Yahoo) and Cassandra (Facebook) and Kafka (LinkedIn), and instead look at the ecosystem of distributed systems that were built up as standalone efforts outside any major corporate engineering org, like MapR, Riak, Aerospike, ScyllaDB, RabbitMQ, RethinkDB, etc. you immediately see a much different set of technology choices. Overwhelmingly these "outsider" distributed systems are written in Erlang, C/C++, and Go. The Java cohort is a limited minority in that space.


We do scientific computing on the jvm (mainly deep learning). We wrote: https://nd4j.org

which imitates numpy and co on the JVM, with built-in GPU support and linking against BLAS like other frameworks. We also hand-implemented a lot of ufuncs for you in C++. It's mainly oriented towards dense atm though.

Sparse will come at some point but it hasn't been a big feature for us.

It uses: https://github.com/bytedeco/javacpp underneath.

Happy to answer questions!


Yeah, I've actually heard about ND4J and was interested in learning about it. But as I said in my parent post, Java seems (from the outside) as this huge unwieldy language that is hard to get into.

What could I do with Java/ND4J in scientific computing that I couldn't do with other packages, or that is much easier/better done on the JVM?


Yeah and it might be. We're targeting the big data folks with this framework frankly.

Anything that does static typing might be overkill for just running matrix math.

I created nd4j with the intent of having the same primitives and programming model accessible to me for production environments and also affording me the same privileges of optimizing things in c with nice clean abstractions.

scala MIGHT be easier which is why we created nd4s: https://github.com/deeplearning4j/nd4s

The JVM and the developers who use it are mainly people with salaries solving a very specific class of problems, not people publishing papers. The one exception to this I've seen is the NLP community. I'd argue it's just a different audience who needs the things the JVM has, eg: static typing, integrations with libraries, etc.

For large codebases you really can't beat JVM tooling... if you just need a quick REPL and throwaway code? It might be a bit much if you don't already know it.


Java really isn't that bad. I think the biggest things it's missing are a package manager, a better UI/templating language, and a more obvious/documented import and linking API. Outside of that it isn't too bad.

Edit: also, events were annoying in 7. I don't have a particular interest in creating a class just to act as a listener. I believe 8 introduced lambdas, which is nice.
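
For illustration, the before/after (the ClickListener interface and register method here are invented for the example):

```java
public class Listeners {
    // Hypothetical single-method callback interface, like an event listener
    interface ClickListener { void onClick(String id); }

    // Stand-in for a framework's registration call: just fires the event once
    static void register(ClickListener l) { l.onClick("ok-button"); }

    public static void main(String[] args) {
        // Java 7 style: a whole anonymous class for one method
        register(new ClickListener() {
            @Override public void onClick(String id) {
                System.out.println("clicked " + id);
            }
        });
        // Java 8 style: a lambda says the same thing in one line
        register(id -> System.out.println("clicked " + id));
    }
}
```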


Java 8 with lambdas and streams is actually fairly nice to program. It's still Java, but it is much better than it was before Java 8.
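
A small made-up example of the stream style, where pre-8 code would have been a for loop, an if, and a running total:

```java
import java.util.Arrays;

public class StreamsDemo {
    public static void main(String[] args) {
        int total = Arrays.asList("alpha", "beta", "gamma").stream()
                .filter(s -> !s.startsWith("b"))  // drop "beta"
                .mapToInt(String::length)         // "alpha" and "gamma", 5 each
                .sum();
        System.out.println(total);                // 5 + 5 = 10
    }
}
```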

Maven is complicated, but Java 9 with project jigsaw might help things out on the packaging side.

I don't do ui or any templating in java (analytics and simulations) so don't really know what's going on in those areas.


Is Java planning on custom primitives analogous to c# structs, or is copy-by-value semantics still the exclusive province of the core language primitives?


Yes. Value Types, Reified Generics and Generic Specialization (e.g. List<int>) will all be in Java 10.


Project Valhalla is supposedly bringing value types to the JVM.


I'd consider maybe looking at Gradle then. The awesome thing about Maven is it is OS-agnostic and is at least stable.

Most of the jvm libs are stored in maven central.


Gradle is fantastic to work with. Groovy is nice when you need it, but the DSL is just much better to work with than Maven.


Gradle 3, due out soon, will use Kotlin instead of Apache Groovy as its preferred language for writing build scripts and plugins.



Ooooh how exciting! I like that a lot. I can find this out myself, but what do they plan on doing with existing Groovy plugins for gradle pipelines?


...which will make it completely typesafe.


I'm not saying Java is bad or not. All I'm saying is that, from my perspective, why should I bother? It just seems very big and unwieldy and if I'm not going to do these JVM optimizations anyway, why bother?

It's really just an impression.


Forces you to 100% commit to the OOP paradigm. Often much easier to test and cleaner than counterparts. It's used in enterprises for a reason.


> which does everything and anything,

same here. Until I wrote my first Android app, used JPG images in a loop, figured "well, they run out of scope, and everything is a reference, and Java has 'garbage collection', so everything should be fine, right?". Wrong. Ran out of memory. The one time I thought I could rely on the JVM and all the delicious Java-ness, it let me down. I'll stick to C++ where WYSIWYG.


> Wrong. Ran out of memory

The Android VM (DVM) is different from the JVM we are usually talking about. When developing Android apps, you need to understand the life cycle of components to prevent memory leaks. Your issue may come from a poor understanding of the Android platform.


> Your issue may come from your poor understanding of Android platform.

Possibly. But what is my responsibility as a developer? If I learn a standardized language, do I now have to learn the JVM/DVM standards of how it's implemented? What's wrong with a destructor being called when an object goes out of scope and having its resources freed? If I have to now worry about how the VM implements that, learning the language was useless.
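
For what it's worth, Java does have one scope-bound cleanup mechanism: try-with-resources. A minimal sketch (the Resource class is invented for the example):

```java
public class Scoped {
    // AutoCloseable is Java's answer to scope-bound cleanup: close()
    // runs deterministically at the end of the try block, unlike
    // finalizers, which run at the GC's whim (or never).
    static class Resource implements AutoCloseable {
        Resource() { System.out.println("acquired"); }
        @Override public void close() { System.out.println("released"); }
    }

    public static void main(String[] args) {
        try (Resource r = new Resource()) {
            System.out.println("using");
        } // close() is guaranteed here, even if an exception is thrown
    }
}
```

It only covers things you explicitly scope with `try`, though, not every object going out of scope, which is the complaint above.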


Sure you have to.

It is no different than having to learn what gcc, clang, Visual C, aCC, SunPro, xlc, icc ... do to your C code outside of what ANSI C specifies how it should behave.

And even for those parts that the ANSI C specifies how it should behave, those compilers do not generate the 100% same code.

Also one needs to know how the underlying OS and CPU handle that generated code as well.

Especially interesting, regarding understanding of the underlying OS on UNIX platforms, is to learn how much the platform deviates from the POSIX specification in regard to implementation-specific semantics, because POSIX, just like ANSI C, leaves platforms quite some freedom of implementation.

Understanding the target platform is always required.


Are you talking about finalizers? They often are mistaken for destructors, which they are not.

Android has a terrible (terrible!) sdk, in my opinion, but that is not java's fault.


> Ran out of memory

My experience is that almost every "out of memory" situation I've had to deal with as a sysadmin on HPC clusters traced back to a Java program.


You've got that backwards. .NET was Microsoft's attempt to "embrace and extend" java. Early versions of C# were basically copy/paste compatible with Java.


A lot of this thread is in regard to development in Java, but a lot of the noise and "unnecessary complexity" also involves end user applications.

I spent a lot of time supporting Java based desktop applications. I've seen a situation where a single organisation sells two products. And the installation guide basically says "these applications need an absolutely correct JRE version, down to the point release, or they will fail to run. And they will only use the system default, and don't have any concept of bundled Java. And they don't support the same version of Java, so our recommendation is to give users two desktops".

And I know that's not Java's fault, but when a certain monopoly releases every single application with that anti-pattern, you get pretty sick of seeing null pointer exception errors because a user applied a (two year old) Java security update.


Complexity for the sake of complexity has badly tarnished the Java ecosystem's reputation. I feel like the community has countered that phase by creating new, smaller tools, more modular tools, similar to what exist in other ecosystems, but the foothold of bloated, overly complicated crap is still strong in the enterprise world.

A simple rule of thumb is just don't use anything with the word "Spring" in it, and you're well on your way.


I agree. I think a lot of the bad reputation of java comes from the user-visible parts, from slow swing etc.


Most Java software is enterprise and that stuff is famous for horrible UI/UX, regardless of language.


Alternatively: any medium sized high throughput Java app needs somebody with a PhD in garbage collection tuning, because otherwise the app will just grind to a halt regularly :(


Two things:

- I became the JVM whisperer for a fairly large company a whole year out of college. JVM tuning is not that complex, nor is the Java memory model. Shipilev's talk on the JMM[1] is almost everything you could ever need to know, and you can refer to it when you need it.

- That you actually can meaningfully tune the JVM GC is something you can't say about a lot of the alternatives. Like, I love the CLR, but the "self-tuning" nature of the garbage collector has bitten me more than once in the past.

[1] - https://shipilev.net/blog/2014/jmm-pragmatics/
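
For the record, a typical (illustrative, not prescriptive) set of HotSpot-8-era flags one might start from when tuning; `app.jar` is a placeholder, and every value here should be measured against your own workload rather than copied:

```shell
# Fix the heap size so it never resizes, use G1 with a 50ms pause
# target (a goal the collector aims for, not a guarantee), and log
# every collection to a file you can analyse afterwards:
java -Xms4g -Xmx4g \
     -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=50 \
     -XX:+PrintGCDetails -Xloggc:gc.log \
     -jar app.jar
```

Most "tuning" in practice is exactly this loop: set a pause target, turn on GC logging, run a realistic load, and read the log.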


Thanks for that link! Very timely. I have been trying to gain a solid understanding of the java memory model.


It's not really that bad. And on the plus side, VisualVM helps a ton because you can connect to a running JVM and actually see what's happening with the garbage collector (among other things).


I've seen high throughput apps doing 45 second "stop the world" garbage collection cycles because of high disk I/O and enabled GC logging (the process just sat there waiting for some I/O time to flush the GC log file).

So in theory tuning GC isn't thaaaaaat complicated. In practice however I've seen horrible things :(


Sure, I'm not saying pathological cases can't come up. Just that JVM GC tuning isn't usually that big a deal. IOW, it isn't some "black art" that only uber-wizards can comprehend.


This is true of most any environment. When speed is a requirement, people can make things fast.

For most of us, speed is not a requirement.


Or it is but the slow part is the database. I wonder what percentage of business applications are some kind of RDB CRUD system - get some rows, put them in a table, let the user update or delete, send them on to the next system...

When you're doing that kind of application the overhead the language adds is lost in the DB noise floor.


It sure is, 20 years and who only knows how many engineering hours have gone into it. What noise are you referring to?


Kotlin might pull that off


There are also more lightweight alternatives to "XML-heavy traditional frameworks" on the Java side. Check out, for example, Play[1], which makes the "write some code, test in browser, write more, test" cycle possible.

https://www.playframework.com


Java in HFT environment is wonderfully perverse and beautifully abused in so many ways.


Yeah, it's amazing how fast it goes when you give it tons of RAM and turn off GC.


Or you could just use Azul Systems' pauseless JVM and not have to worry about the GC ever.


Is there any reason someone would want to run multiple Java processes communicating at high speed? It seems useless, since the massive overhead of each process makes IPC useless in a high-performance environment.


One nice thing is you can segment GC sensitive tasks from non sensitive tasks.

Another is core affinity.

Another is 1 process that is "native" and another is JVM.


I'm not sure why you think there is massive overhead, can you explain a bit more about that ?


Not the parent but the biggest overheads are going to be startup time (which is easy to work around by just bringing up the procs and keeping them there) and memory overhead. There's some extra memory used for keeping all of the code or other shared resources around but let's assume that's negligible.

Because the JVM never gives memory back to the OS, you have to give it the maximum amount of memory that it could ever want, plus enough to last you between GC runs. Often this means that you have to give it at least twice the memory it actually needs. If you have more than one process, you end up having to multiply that extra "slack" memory by the number of processes that you have, which means that you have all of this wasted memory sitting around all of the time. "Memory is cheap" is a common attitude but in my experience in web stuff it's the only resource that's anywhere close to 100% used on an app server. So things with high memory overhead mean that you need more machines or bigger ones.

Of course, whether that matters depends on the actual numbers of how much memory is needed, how much is "wasted", whether that's actually your bottleneck, etc.


In my experience on the web it's all IO bound.

If you are using multiple JVMs and high performance IPC libraries you have largely already solved the memory bound problem on the JVM with standard techniques like arena allocation.
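For readers who haven't seen the technique: a bump-pointer arena hands out slices of one pre-allocated off-heap block and frees everything at once, so per-message data never becomes GC work. A minimal sketch (names and layout are my own, not from any particular library):

```java
import java.nio.ByteBuffer;

// Minimal bump-pointer arena over off-heap memory. Allocation is just an
// offset increment; the whole arena is reclaimed in one reset() call, so
// the garbage collector never sees the message payloads.
final class Arena {
    private final ByteBuffer buf;

    Arena(int capacity) {
        buf = ByteBuffer.allocateDirect(capacity);
    }

    // Reserve `size` bytes and return a slice viewing just that region.
    ByteBuffer allocate(int size) {
        if (buf.remaining() < size) {
            throw new IllegalStateException("arena exhausted");
        }
        ByteBuffer slice = buf.slice();
        slice.limit(size);
        buf.position(buf.position() + size);
        return slice;
    }

    // "Free" everything in O(1) -- no per-object collection work at all.
    void reset() {
        buf.clear();
    }
}

public class ArenaDemo {
    public static void main(String[] args) {
        Arena arena = new Arena(1024);
        ByteBuffer msg = arena.allocate(16);
        msg.putLong(0, 42L);
        System.out.println("value=" + msg.getLong(0));
        arena.reset(); // all 1024 bytes are reusable again
        System.out.println("reclaimed=" + arena.allocate(1024).remaining());
    }
}
```

The trade-off is that individual allocations can't be freed early; that fits message-passing workloads where everything in a batch dies together.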


Well what if the message passed creates an object?

The GC will have to cleanup a billion every few seconds.


Moving blocks of memory around is something I am expecting java to do quite fast, just because the VM has intrinsic instructions to do that. You could do the same thing in any other language.
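For what it's worth, the intrinsic the parent is referring to is visible from plain Java: System.arraycopy is a HotSpot intrinsic, so the JIT replaces the call with a tuned native memory copy rather than a bounds-checked Java loop. A trivial illustration:

```java
// System.arraycopy is a JIT intrinsic in HotSpot: the compiled code is a
// vectorized block copy, not an element-by-element Java loop.
public class CopyDemo {
    public static void main(String[] args) {
        byte[] src = new byte[1 << 20]; // 1 MiB
        byte[] dst = new byte[1 << 20];
        src[0] = 7;
        src[src.length - 1] = 9;

        System.arraycopy(src, 0, dst, 0, src.length);

        // Both ends arrived intact.
        System.out.println(dst[0] + "," + dst[dst.length - 1]);
    }
}
```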


The thing to remember, though, is that these libraries are type safe, fast & general purpose.

I bemoan the lack of a similar library in Go, for instance, because of the lack of generics.

You can certainly do the same thing in C as well, but I can't easily mix it with code that has automatic memory management.

As for dynamic languages, can you actually do this? I'm not saying its impossible, I've just never seen it.


> these libraries are type safe,

not in this blog post:

    UnsafeAccess.unsafe
    UnsafeDirectByteBuffer.getAddress


Those are wrapped in type safe methods though (all type safety cross process has this step regardless of language/lib).

In nontrivial examples you tend to wrap more than Java longs but the techniques are the same.

EDIT: I realize now that I was bringing a lot of context with me that is not obvious just from the blog post. Martin Thompson and Todd Montgomery, who are mentioned in this post (and Richard Warburton, who isn't), are the main authors of the real-logic libraries, which are extremely well known in JVM performance circles. Aeron in particular is a great IPC lib. Other libraries that are similar and in the same circle are OpenHFT Chronicle and the LMAX Disruptor. For that matter, the author of this blog post, @nitsanw, is pretty well known in that arena.
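To make the "wrapped in type-safe methods" point concrete, here's a toy flyweight in the style these libraries use (the message layout and names are invented for illustration, not taken from Aeron or the blog post): raw offset arithmetic against a shared buffer is hidden behind typed accessors, so callers never touch the unsafe layer directly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Flyweight over a shared buffer: field offsets are fixed by the wire
// layout, and only these typed methods ever do offset arithmetic.
final class TradeMessage {
    private static final int PRICE_OFFSET = 0;    // long, 8 bytes
    private static final int QUANTITY_OFFSET = 8; // int, 4 bytes
    static final int SIZE = 12;

    private final ByteBuffer buffer;
    private final int base;

    TradeMessage(ByteBuffer buffer, int base) {
        this.buffer = buffer.order(ByteOrder.nativeOrder());
        this.base = base;
    }

    void price(long p)   { buffer.putLong(base + PRICE_OFFSET, p); }
    long price()         { return buffer.getLong(base + PRICE_OFFSET); }
    void quantity(int q) { buffer.putInt(base + QUANTITY_OFFSET, q); }
    int quantity()       { return buffer.getInt(base + QUANTITY_OFFSET); }
}

public class FlyweightDemo {
    public static void main(String[] args) {
        // In real IPC this buffer would be shared memory; an ordinary
        // direct buffer keeps the sketch self-contained.
        ByteBuffer shared = ByteBuffer.allocateDirect(TradeMessage.SIZE);
        TradeMessage msg = new TradeMessage(shared, 0);
        msg.price(10150L);
        msg.quantity(200);
        System.out.println(msg.price() + " x " + msg.quantity());
    }
}
```

Note the flyweight itself allocates nothing per message: one wrapper instance can be repositioned over buffer offsets, which is how these libraries avoid the object-per-message GC pressure mentioned upthread.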


Unsafe is usually used for allocating memory directly rather than having it be managed by the runtime. This has nothing to do with type safety.


In my sector it's compulsory to manage yourself the memory (videogames)


With all those memory accesses marked as unsafe, I'm wondering why not just use C++.


Can someone elaborate on this:

"IPC, what's the problem? Inter Process Communication is an old problem and there are many ways to solve it (which I will not discuss here)."

How is IPC an old problem and how was it solved?


It was a big focus of the Mach kernel in the early 90s. It was a microkernel, meaning that they moved as much functionality as possible out of the kernel and into user processes. For example, to access the file system each call (open, read, write) was really a message to a file system server process. So IPC performance mattered a lot.

Here's an example of some heroic attempts to make it fast, getting down to about 50 µs on a 1991 CPU: http://homes.cs.washington.edu/~bershad/Papers/p175-bershad....


Thanks for the link.


I think the two main solutions are pipes and shared memory.
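The shared-memory variant is what the article's queue builds on, and it's accessible from plain Java via memory-mapped files: two JVMs that map the same file see the same bytes. A minimal sketch (using two mappings in one process to stand in for two processes):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Shared memory via FileChannel.map: MAP_SHARED mappings of the same file
// are coherent, so a write through one mapping is visible through the other.
public class SharedMemoryDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("ipc", ".dat");
        file.toFile().deleteOnExit();

        try (FileChannel producer = FileChannel.open(file,
                 StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileChannel consumer = FileChannel.open(file,
                 StandardOpenOption.READ, StandardOpenOption.WRITE)) {

            MappedByteBuffer out = producer.map(FileChannel.MapMode.READ_WRITE, 0, 64);
            MappedByteBuffer in  = consumer.map(FileChannel.MapMode.READ_WRITE, 0, 64);

            out.putLong(0, 135_000_000L);            // producer side writes
            System.out.println("consumer sees: " + in.getLong(0));
        }
    }
}
```

A real cross-process queue additionally needs memory ordering guarantees (volatile/ordered writes to the head and tail counters), which is exactly the hard part the article is about.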


(2013)


Has this metric been surpassed since?


Of course. Since this is a hardware-bound metric, newer, faster hardware can transmit more messages.

A more timeless metric would be something like "30-instruction message-passing on x86"


Rephrased: given the same hardware, has anyone found a technique that results in greater throughput?


Yes, absolutely. I use KDB to get billions-of-transactions-per-second, but then an interpreted language is clearly better than the JVM.

Put grossly, the "technique" isn't well defined; Even things like "message" aren't well defined, and at the end of the day, are we really talking about a new-and-improved memcpy[1]? Or are we talking about something that's less-slow than other things (and if so, what?)

Not saying this isn't interesting, just that I don't know if it's interesting. Something I want to know is: when Java programmers struggle to get to 200k/sec HTTP requests, can I point them at this?

[1]: http://www.codeproject.com/Articles/1110153/Apex-memmove-the...


Is this sort of code purely bound by # CPU instructions? Surely other aspects of hardware could also improve.

