Things Java Programmers can learn from Clojure (lispcast.com)
128 points by goatslacker on March 7, 2013 | hide | past | favorite | 55 comments



Nice article. It does, however, trigger a thought:

> By making values mutable, this magical-changing-at-a-distance is always a possibility.

I agree, very much so. However, I could also simply promise that none of my code, anywhere, will modify that DateTime. On top of that, I'll ask all my colleagues not to mistreat it either.

Now, that's what many of us are doing now, and the entire point of the OP's first section is that that's just asking for trouble. But how is that fundamentally different from promising to never call toRubles() on a DateTime object? That's what you do with dynamic typing. Yo, here's an object. Please only call methods that it has, and I won't tell you which those are, so you'll have to guess the types it may have and then browse to the API docs first (unless it has a method_missing, which you'll have to look at the API docs for as well - but what if it's duck typing and only one of the ducks quacks method_missing?)

Sure, I'm exaggerating. I like dynamic typing. I just have the idea that immutability is for nouns what static typing is for verbs. Just like Java has it half-assed, I feel a bit that Clojure has it half-assed the other way around.

In terms of Steve Yegge's conservative vs liberal discussion[0], it feels like Clojure went all liberal on one end, just to get super-conservative on the other.

[0] https://plus.google.com/110981030061712822816/posts/KaSKeg4v...


One difference is that in dynamic typing, if the object doesn't have the method, the runtime will tell you it can't be done.

If you mutate DateTime when the rest of the code expects it not to change, the runtime will let you, and you'll only notice through a logic error somewhere, that probably won't be obvious until way too late and the error has propagated through your system and screwed a chunk of your data.
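A minimal Java sketch of this failure mode, using java.util.Date (the Meeting class and its names are hypothetical, just for illustration):

```java
import java.util.Date;

// Two parts of a program share one mutable java.util.Date.
class Meeting {
    private final Date start;                    // "final" fixes the reference,
    Meeting(Date start) { this.start = start; }  // not the Date's contents
    Date getStart() { return start; }
}

public class MutationAtADistance {
    public static void main(String[] args) {
        Date d = new Date(0L);             // the epoch
        Meeting m = new Meeting(d);
        // ...far away in the codebase, someone "reuses" the same Date:
        d.setTime(86_400_000L);            // one day later
        // The meeting has silently moved; no error is raised anywhere.
        System.out.println(m.getStart().getTime());  // prints 86400000, not 0
    }
}
```

The runtime happily allows the mutation; the damage only shows up later as wrong data.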

On a conservative/liberal scale, decisions seem less conservative if the stakes are higher.


Yeah, replying to myself. I'm not sure if that's considered weird but ok.

According to this logic, most mainstream languages don't make sense. Notably, Java, C#, C, C++, Clojure are all screwed up. Not sure about other lisps. Ruby, PHP, JavaScript, Python, F#, Scala and Haskell have it right though.

(yeah sure, you can do mutability in Scala and immutability in Java, but the languages and their community lean toward the other, which is what matters here, I suppose)

Given that I've been much of a C# fanboy recently, my nose bleeds.


> it feels like Clojure went all liberal on one end, just to get super-conservative on the other.

In the brief time that I used Clojure, I also thought that the contrast between immutability and dynamic typing was very strange. Ultimately, I think it makes more sense than having the entire language be highly dynamic (or highly static).

I think Clojure's choice is not inconsistency, but that there is a kind of budget for craziness in a language. In Clojure, you can go crazy with dynamic types on the foundation of immutability, transactions, etc. In Haskell, you can go crazy with really awesome static typing tricks that would be utterly unfathomable if the language wasn't very rigid in every other way -- immutable, referentially transparent, side-effect-tracking. These two pack all their craziness into one corner of the language.

On the other hand, other languages -- Python, Javascript, Ruby -- seem to spread the craziness (I am not sure I picked the best word for this!) around, instead of having one super dynamic feature and everything else very inflexible. These languages tend to lack the really big flashy features: macros, typeclasses, etc.

It seems that you have a lot of flexibility as a language designer on how you spend your craziness budget, as long as you don't go too high (and become incomprehensible) or too low (Java?).


Maybe constraints rather than craziness? Constraints tailor a language, make it fitter for one purpose or another. But you have a limited constraint budget - overspend and everything is difficult.

As an old boss of mine (hi Paul Earwicker) used to say - "flexibility is just design decisions you haven't taken yet."


Craziness is not a zero-sum game, but it's hard to convince people who complain about the "expression problem", "modular abstractions", "better DI" (all phrases that capture part of the problem) to read Real World Haskell or Scala in Depth, both demanding books.

jerf did a good writeup, maybe a little overboard on clispscript, but besides that

http://www.jerf.org/iri/post/2908


Clojure's dynamic typing is somewhat different from other languages, though, as it really just has two non-atom types: seqs for collections and maps for data objects. Virtually all collections and objects can be manipulated as one of these two types. So "calling a method" on an inappropriate type is basically passing an inappropriate map to a function. You might want to disallow it, but Clojure allows it for good reason: this makes it possible to have many data access/manipulation functions that work on all objects. So the question is, do you want to forbid passing an argument that doesn't make sense to a function or open the door to lots of useful functions that would work on all objects. Clojure simply chose the latter.
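A rough Java analogue of the style described above: if every "object" is just a map, one function works on all of them. (The `assoc` helper is named after Clojure's function; this Java version is a hypothetical sketch.)

```java
import java.util.HashMap;
import java.util.Map;

public class MapsAsObjects {
    // Non-destructive update that works on ANY map-shaped "object".
    static Map<String, Object> assoc(Map<String, Object> m, String k, Object v) {
        Map<String, Object> copy = new HashMap<>(m);  // copy, don't mutate
        copy.put(k, v);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> person = new HashMap<>();
        person.put("name", "Ada");
        Map<String, Object> renamed = assoc(person, "name", "Grace");
        System.out.println(person.get("name"));   // Ada   (original untouched)
        System.out.println(renamed.get("name"));  // Grace
    }
}
```

Nothing stops you from passing a map that makes no sense to a given function; that's the trade-off the parent describes.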


> this makes it possible to have many data access/manipulation functions that work on all objects.

This is not actually true. Add proper type inference and structural typing, and it's perfectly possible to have all those generic data manipulation functions while also having strong static typing.


But using structural typing (on the JVM, where Clojure lives) is slow, because it requires using reflection, which severely limits its use currently.


I'm talking about structural static typing. At runtime, all the types would be fully determined, so no need for reflection. (Or vcalls for that matter).


How would you write a general function that prints out the values of all fields of any object (without reflection)?



I'm not quite sure, but I don't think this has anything to do with structural typing.


I think of it as each language making functionality trade-offs, in their design, for specific goals. Each language tweaking the recipe to make a language that best fits the programming model they advocate.


Huh?

If I do this:

    (defn to-rubles [x] (* x 1.2))
    (to-rubles (java.util.Date.))
I get an error:

    ClassCastException java.util.Date cannot be cast to java.lang.Number
How is this unreasonable?


The compiler didn't tell you it was broken. You had to run the program first.

Which means it's possible the error wasn't caught till it was in production.

Which means it's possible that a customer discovered a bug that a compiler could have discovered if you had used a language with a compiler capable of compile time static typing.


Which means you really, really, really need a good automated unit test suite. When you switch to a dynamically-typed language, you trade compile-time type guarantees for the ability to ask primitives and objects what type they are at runtime. So, you have to incorporate those questions into your unit tests. Especially when you are doing casting--which of course you should do only when there isn't a better option.

As an aside, static typing without type inference (a la Java) is just a pain. The compiler is just smart enough to enforce typing, but not smart enough to figure out what type anything is unless you explicitly tell it. This puts an extra burden on the programmer.
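A small illustration of that burden: without inference, the full type must be written out even when it is obvious from context.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NoInference {
    public static void main(String[] args) {
        // Pre-Java-7 style: the type is spelled out twice in full.
        Map<String, List<Integer>> scores =
            new HashMap<String, List<Integer>>();
        // Java 7's diamond operator trims only the right-hand side;
        // the left-hand declaration must still be written in full.
        Map<String, List<Integer>> scores2 = new HashMap<>();
        scores.put("alice", Arrays.asList(90, 85));
        scores2.put("bob", Arrays.asList(70));
        System.out.println(scores.get("alice").get(0));  // 90
    }
}
```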

Fortunately Scala has type inference, so that's an option if you want to use a statically typed functional language on the JVM.


> When you switch to a dynamically-typed language, you trade compile-time type guarantees for the ability to ask primitives and objects what type they are at runtime.

This doesn't strike me as true. In a statically-typed language like C# or Java I can ask objects (and in C#, primitives) what they are all day long. Could you expand what you mean a bit?

(I also don't find Java particularly onerous regarding its lack of type inference--I mean, Java is onerous, just not because of that--and I find that I don't really use type inference much in C++ or C# either. But that is more of a question of taste.)


"In a statically-typed language like C# or Java I can ask objects (and in C#, primitives) what they are all day long."

Which shows that Java and C# are a hybrid of static and dynamic features. Using reflection and introspection to invoke behavior at run time is dynamic-language behavior. More strongly typed languages such as Haskell won't allow you to do this, as far as I know.


>More strongly typed languages such as Haskell won't allow you to do this, as far as I know.

Haskell has Data.Typeable. It'll let you reify some types for run-time reflection.


I mean that in situations where you have a possible type error, you can substitute compiler typechecking with your own typechecking by using something like (cond (= (type foo) bar)) or (cond (= (class foo) bar)). And you can decide how you want to handle it, instead of necessarily throwing a type error. Plus, polymorphic functions allow you to handle different types with the same function instead of having to overload it. And the test suite lets you run tests like (is (instance? Bar foo)) to help you catch type errors before you push to production.

So, what I mean is, although giving up the type-checking compiler opens up the possibility of introducing runtime type errors in production, it's far from a given that they are going to happen, because dynamic languages give you plenty of other tools to prevent them.

And Java's type system is onerous because it gives you all of the rigidity of static typing but none of the power of type inference. You really notice the difference when you switch to a language like ML or Scala that allows pattern-matching and polymorphic functions, unlike Java where you have to use overloading.


>Which means it's possible the error wasn't caught till it was in production.

Statically typed languages have plenty of errors that aren't caught until production. If you're really serious about compile-time guarantees, you'll want to use something like Haskell, Agda or Coq (in ascending order of extreme guarantees).

Of course, even formal verification won't protect you from an incorrect specification of your program.


Why these things stay in books and blogs and never make their way into Java web apps:

1. Use immutable values:

Models used in client-server communication need to follow the Java Bean spec, which is more or less the exact opposite of immutability. Service methods that implement business logic are stateless. As objects are not shared across threads, nobody feels the need for immutability. The most popular frameworks, Spring and Hibernate, dictate this architecture.
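A hypothetical sketch of the two shapes side by side: the bean the frameworks expect, next to the immutable value the article recommends (class and field names are made up for illustration):

```java
// What Spring/Hibernate-style frameworks want: a mutable JavaBean.
class PriceBean {
    private double amount;
    public PriceBean() {}                            // no-arg constructor required
    public double getAmount() { return amount; }
    public void setAmount(double a) { amount = a; }  // anyone can change it, anytime
}

// The immutable alternative: final field, no setter, copy-on-write updates.
final class Price {
    private final double amount;
    Price(double amount) { this.amount = amount; }
    double getAmount() { return amount; }
    Price plus(double delta) { return new Price(amount + delta); }
}

public class BeansVsValues {
    public static void main(String[] args) {
        PriceBean b = new PriceBean();
        b.setAmount(10.0);              // mutation from anywhere, silently
        Price p = new Price(10.0);
        Price q = p.plus(2.5);          // p itself is untouched
        System.out.println(p.getAmount() + " " + q.getAmount());  // 10.0 12.5
    }
}
```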

2. Do no work in the constructor

Are constructors still used? All services are wired using dependency injection. Models are either DTOs or Hibernate POJOs - both dumb, and they don't do any work anyway.

3. Program to small interfaces:

Interfaces are dictated by functionality they provide and not size. They can have hundreds of methods. This is how it looks: DocumentService - put every document related method in here, FinancialInstrumentService - put every instrument related method in here

4. Represent computation, not the world

Almost everybody begins OOP with this misconception - objects in the real world directly map to OOP objects. It's maybe a good place to start, but how many grow out of the initial simplistic rule of mapping nouns in requirements to classes? So, you end up with objects that don't really mean anything and don't do much. Naive use of UML diagrams also leads to this. Discovering abstractions is tricky. One needs to really live with the requirements inside out before they present themselves. Who has so much time? Believe me, in a quarterly release developers get only around 3 weeks; the rest is divided into BA, QA, UAT, freeze, and deployment time.

PS: Please don't get me wrong, the OP makes good points. It's sweet, but the reality is different. Maybe Google does this (and they do in Guava, which is one example after another of the good stuff in Effective Java). But there's a big corporate Java world out there that does things differently. They have well-defined, easy, run-of-the-mill patterns where these things don't fit (yet). This was just a peek into it.


1: Just because some ubiquitous libraries have dictated a practice doesn't make it a good idea or a good model. In fact, Hibernate's tracking of mutable objects in the current session is one of the most painful things about it. There's nothing inherent about client-server communication that forces beans on you.

2: That's good, but work in the constructor is by no means gone.

3: That sounds like god objects. "FinancialInstrumentService" is way too much functionality (or poorly named). Getting a current quote, getting historical quotes, doing quant stuff and placing a trade are all completely separate concerns. Opening a document, saving it, printing it - all separate concerns that are better kept in separate interfaces. Now, there's little harm in having an implementation implement several interfaces if the implementations have much in common. Extra bonus: easier to write tests for.

4: Mostly agree, but:

> Believe me in a quarterly release- developers get around only 3 weeks – rest is divided into BA, QA, UAT, freeze, deployment time.

I won't believe you, because I've worked on a team that pushed out releases every two weeks, and got 8-9 days development in for each.


I would go a step farther and say that OpenDoc should be in a separate interface from SaveDoc.


That's what I meant - should have said interface_s_, I see now :)

Actually, opening a document is probably not even a good abstraction. DocProvider is better: You can have FileSystemDocProvider and BlankDocProvider - the latter can't meaningfully be said to "open" a document, but it's interacted with in the same way.


(1) The "Java Bean spec" is not closely followed for precisely this reason. Technically it requires setters for all attributes but in practice many libraries (certainly Spring) permit initialization via the constructor and treat getters as sufficient.

(2) Because of (1), constructors are indeed still used. In fact they're very actively preferred, because IF you use the constructor then all immutable attributes can be declared final, letting the compiler catch unintentionally altered state.

(3) I violently disagree. Vast interfaces are a code smell and would fail code review in every team I've ever worked in. If you can't do this, then odds are the deficiency is not in the principle but in your understanding of the domain.

(4) Sounds like a problem with your process not the principle.


> 2. Do no work in the constructor

> Are constructors still used? All services are wired using dependency injection. Models are either DTOs or Hibernate POJOs - both dumb and anyways don't do any work.

Guice encourages constructor injection.

> 3. Program to small interfaces:

> Interfaces are dictated by functionality they provide and not size. They can have hundreds of methods. This is how it looks: DocumentService - put every document related method in here, FinancialInstrumentService - put every instrument related method in here

Surely, by the time you reach hundreds of methods, you can refactor your service into several different, smaller services.


"Interfaces are dictated by functionality they provide and not size. They can have hundreds of methods. This is how it looks: DocumentService - put every document related method in here, FinancialInstrumentService - put every instrument related method in here"

Then what is the point of having interfaces at all?

There is no way multiple implementations will be created for such interfaces, so just make it a single class and be done with it. Otherwise, you are missing the whole point of an interface.

I guess the exceptions are proxy objects and other such patterns, but Java makes it difficult to implement such patterns without explicitly implementing each method to forward the exact same call to the delegate.
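A hypothetical sketch of that forwarding boilerplate: a proxy must re-declare every method just to delegate it (DocumentService and its methods are made-up names for illustration):

```java
interface DocumentService {
    String load(String id);
    // ...imagine dozens more methods here, each needing its own forwarder.
}

class InMemoryDocumentService implements DocumentService {
    public String load(String id) { return "contents of " + id; }
}

// A logging proxy: one interesting line, plus pure hand-written forwarding.
class LoggingDocumentService implements DocumentService {
    private final DocumentService delegate;
    LoggingDocumentService(DocumentService delegate) { this.delegate = delegate; }

    public String load(String id) {
        System.out.println("load " + id);  // the one line we actually wanted
        return delegate.load(id);          // forwarding, repeated per method
    }
    // ...and the same three lines again for every other method.
}

public class ProxyBoilerplate {
    public static void main(String[] args) {
        DocumentService svc =
            new LoggingDocumentService(new InMemoryDocumentService());
        System.out.println(svc.load("report.txt"));
    }
}
```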


"Models used in client server communication need to follow Java Bean spec which is like the exact antagonistic concept to immutability."

I agree, and believe the wide adoption of Java Beans marked the end of any real application of object oriented principles in common enterprise Java development.

Public access to all state through setters, and no encapsulation of state in your data model. This is utterly antithetical to object oriented programming principles.


Tangential:

OP reminded me of the Google Testing Blog [1]. Misko Hevery used to share lots of great insights there. Awesome yet practical stuff on coding in the Java world, which anybody will greatly benefit from but not many are aware of. I hope more people read his stuff. I wish he had stayed a bit longer before moving on to the AngularJS project - Java needed him more than the script.

Another Google Testing Blog star was James Whittaker, who's now moved on to Microsoft. The Google Testing Blog doesn't seem the same exciting place any more. And to add to that, there's the crazy new design of Blogspot - some stuff keeps happening on the page that's beyond one's control.

[1] http://googletesting.blogspot.com/


If Java programmers would only apply the concept of immutability, the world would be such a better place... Every time I log from within a setter to see who's calling this and when, a little fairy falls from the sky.


I have hacked Haskell just to be able to see what goes on inside a particular method chain.


Do you mean you have trouble with seeing what's going on inside a long function composition? I wrote a cool function to examine it with almost no code changes:

  module TraceCompose where

  import Debug.Trace

  -- | Trace a value, resulting in the value itself
  idTrace :: Show a => a -> a
  idTrace x = trace (show x) x

  traceCompose :: (Show a, Show b, Show c) =>
    (b -> c) -> (a -> b) -> a -> c
  traceCompose f g = h f . h g . h id
    where h f x = idTrace (f $ seq x x)

  -- Replace "normal" function composition using traceCompose.
  test = (+1) . (*10) $ 4
    where (.) = traceCompose
When you run "test", it will print:

4 40 41

(i.e., the outputs of each function at each point in the composition in order of execution.)


We call this trading one hell (direct mutability) for another worse hell (Indirect mutability).


I thought that classes+methods were meant to transform an object from one consistent state to another consistent state. So why is immutability suddenly a big deal? (I.e., if it's a big deal, may it be because encapsulation is lacking? IOW, is immutability a rediscovered "private" keyword, just much more cumbersome to use?)


BTW, immutability has nothing to do with encapsulation. Encapsulation is the hiding, or access restriction, of data structures from "foreign" functions. Those foreign functions may only manipulate the encapsulated data through a prescribed set of functions.

We achieve this in clojure through clojure's lovely typing mechanism which allows you to declare types that have publicly known functions without any data declarations; and private implementations of those functions that know the data they are manipulating.

One last note. C was much more encapsulated than C++, Java, or C# because in C you would declare your functions in a .h file, and your variables in a .c file. No other .c file could see your variables, so they _had_ to use your functions. The public/private keywords were added to C++, and then to Java and C# because the act of putting variable and function declarations in the same source file broke encapsulation, and we needed a way to re-assert it. That reassertion was only partially effective.

The bottom line is that all the C based OO languages are _less_ encapsulated than C.


Because the particular style of OO you're describing has turned out not to work very well. It turns out that often you can't effectively encapsulate state in a class such that no one else but the class itself needs to know about that state.

As the size of a class grows, mutable member variables approach globals. If you wisely break the class into smaller ones, classes will still contain instances of other classes, so to reason about the behaviour of your class you often have to understand the possible state of its sub-instances. This is especially true if multiple classes reference the same instance.


Think about what immutability means. It means that there's no assignment statement. If there's no assignment statement, there can't be any side effects. If there are no side effects, you can't have race conditions or concurrent update problems. You can't have resource leaks. You can't have order dependencies.

Does Clojure guarantee all these things? Not entirely; because Clojure allows you to invoke the java stack, which is decidedly not immutable; and because clojure provides mechanisms for changing state; albeit with significant discipline imposed.

The end result is that clojure is _much_ safer to use in complex and multi-threaded systems, and is likely much, _much_ safer in extreme multi-core applications.


Java's private is class based, not object based, so it's not really private. That is sort of broken.

Clojure actually has truly local mutability, in its "transients" feature.


Re >2. Do no work in the constructor

The author doesn't propose a solution here, so I'm worried he's thrown out the baby with the bath water. Yes, it's true that you should have no File IO in a constructor when it violates the SRP. At the same time (and as the author notes), it's very convenient to have Foo.fromFile(String path) or similar.

Here's what I think the compromise looks like:

   public static OptimizedPage fromFile(String path) throws IOException {
      return OptimizedPage.from(path, FileReader.readFile(path));
   }
Which seems like having your cake and eating it too.

Or to put it another way: that File IO needs to happen either way -- forcing the caller to write:

   OptimizedPage foo = OptimizedPage.from(path, FileReader.readFile(path));
whenever they mean

   OptimizedPage foo = OptimizedPage.fromFile(path);
is a subtle violation of DRY.


So yes - static factory methods - or, if it gets complex enough, a distinct factory object - are the right place to do work that would otherwise need to be done in the constructor. Use package-private constructors to enforce that people don't bypass them.


Those are factory methods because I think

   OptimizedPage.fromFile(pathName);
reads cleaner than

   new OptimizedPage(pathName);
Do you think there's a real difference (besides readability) for preferring a factory method to a constructor in this context?


As a library vendor it gives you more flexibility, particularly if you make the method return an interface rather than the concrete type. Also it can make working with frameworks that call a constructor by reflection easier (you probably know if this is happening though). It's certainly worth separating the "inner" (i.e. the one that takes a FileReader) from the "outer" constructor so that you can call the inner one for testing (perhaps with a mock FileReader), but admittedly you can accomplish that equally well with a public constructor that calls a protected constructor.
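A hypothetical sketch of that flexibility, reusing the thread's OptimizedPage name (the Page interface and method names are made up for illustration):

```java
interface Page {
    String title();
}

class OptimizedPage implements Page {
    private final String title;
    private OptimizedPage(String title) { this.title = title; }  // hidden ctor
    public String title() { return title; }

    // The factory returns the interface, so the vendor can later swap in a
    // different concrete class without breaking any caller.
    static Page from(String title) {
        return new OptimizedPage(title);
    }
}

public class FactoryReturnsInterface {
    public static void main(String[] args) {
        Page p = OptimizedPage.from("Q1 report");  // callers only ever see Page
        System.out.println(p.title());
    }
}
```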

For internal code I doubt it makes a lot of difference - you can always change it if you need it - but I prefer to use static factory methods everywhere for consistency.


I think that the article's defense of immutability is a bit poor. Making a date mutable means that it won't always reference the same point in time: yeah, a birthday will remain constant, but an appointment may not; the same goes for a lot of types. That's why C and C++ give us the "const" keyword: the same object can now be used as mutable or immutable, depending on circumstances.

The other points, especially the second, are spot on, although they have a lot more to do with the Java culture than with the language itself.


The date of an appointment might change, but the date itself does not. Dates are values that refer to a fixed period of time; the meaning of "March 7th 2013" won't change if I move an appointment.

The correct way to handle an appointment is to have a reference to an immutable date. When you want to change the date the appointment is on, you change the reference to a different date, not the date itself.
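A hypothetical Java sketch of that model, using java.time.LocalDate as the immutable date (the Appointment class is made up for illustration): rescheduling swaps which immutable date the appointment references; no date is ever modified in place.

```java
import java.time.LocalDate;

class Appointment {
    private LocalDate date;                      // a mutable *reference*...
    Appointment(LocalDate date) { this.date = date; }
    LocalDate getDate() { return date; }
    void reschedule(LocalDate newDate) { date = newDate; }  // ...to immutable values
}

public class ImmutableDateReference {
    public static void main(String[] args) {
        LocalDate mar7 = LocalDate.of(2013, 3, 7);
        Appointment appt = new Appointment(mar7);
        appt.reschedule(mar7.plusWeeks(1));  // a new date, not a mutated one
        System.out.println(mar7);            // 2013-03-07, unchanged
        System.out.println(appt.getDate());  // 2013-03-14
    }
}
```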


You are thinking about both dates and appointments the wrong way. Dates never change, they are fixed points in time. A date is not an appointment; an appointment has a date associated with it, and many other things. Appointments do not change either: they are canceled and rescheduled (which means making a new appointment).

The problem with C and C++ having the const keyword is that it is overly restrictive and is not flexible enough. What you are trying to express with const is that the value of an object before a computation is equivalent to its value after the computation, under some definition of equivalence. What const actually says is that the object will not change at all. C++ tries to mitigate this with mutable, but this is too rigid: the same things are mutable or immutable regardless of the context.

What we really need is what you see in specification languages: the ability to declare precisely how a value is modified by an operation. Maybe I have an object with a stream member, and I want to say that nothing will be written to the stream without saying that the stream will not flush its buffer for one operation, whereas in another operation I want to ensure that the buffer will not be flushed.


What bothers me just a wee bit in section 1 of his example is that Date is more or less a simple container class for data - one that could be implemented as a HashMap.

It is easy to make classes like Date immutable, but that does not help with a lot of problems. Date is super-easy to test, only state, no logic. Similarly, static methods are easy to test.

The real pain in Java is when a class has both state and logic - and those are precisely the classes you cannot easily make immutable.


Not necessarily - take joda-time's DateTime class as an example, which contains plenty of methods for changing fields in the date, all of which return new DateTime instances to the caller. DateTime itself is immutable.
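The same pattern is visible in java.time, the JDK API modeled on Joda-Time's design: every "mutator" returns a new instance and the original is never touched.

```java
import java.time.LocalDate;

public class ImmutableDates {
    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2013, 3, 7);
        LocalDate later = d.plusDays(1);  // a NEW LocalDate, not a mutation
        System.out.println(d);            // 2013-03-07, unchanged
        System.out.println(later);        // 2013-03-08
    }
}
```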


I would add "Create custom annotations sparingly". It makes Java interop and debugging harder. But, that is just a personal pet peeve.


Good article.

Writing immutable software is hard when the language doesn't provide constructs for it. It will require some discipline. There are some good frameworks/tools on the JVM for this (the whole Akka package is a good place to start).

I would highly recommend Venkat Subramaniam's book for Java programmers: "..Concurrency on JVM.."- http://pragprog.com/book/vspcon/programming-concurrency-on-t...


OK, but these things are not enough. I'm a long-time Java dev, and all of this has been known in c.l.j.p. and on IRC for a very long time. I was doing precisely that, even using the "functional Java" libs when they came out, and doing even more radical things...

And I can tell you that even when doing that switching to Clojure is pure joy.

Because even when doing what TFA talks about, this still doesn't solve lots of very nasty Java issues, like the totally outdated approach to concurrency.

What Java programmers can learn from Clojure is that it's possible to create a language targeting the JVM which cannot deadlock and which can offer a dramatic reduction in the size of the source code.

Sadly you simply cannot apply these to "Java the language": Java is utterly verbose and there's no way to have a "sane" way to deal with concurrency in Java (yes, I've got my Java "Concurrency In Practice" copy since it came out).

The other major thing to learn about Clojure + Datomic is that there's a world out there made of something other than the special kind of hell that Java/C# + ORM ([N]Hibernate) + XML + SQL is. (And before you start whining like cry-babies, Datomic can be backed by SQL DBs.)

Programmers who haven't done so yet should really go watch videos by Rich Hickey; here are three particularly good ones:

"Simple made easy"

"The value of values"

and "The Database as a value"

http://www.infoq.com/presentations/Datomic-Database-Value

Now sure, some will criticize Clojure for being a Lisp-1 and not having real reader macros; others will rightly point out that Clojure's documentation sucks big time and that stacktraces are still a serious issue.

But at least Clojure is showing that there's a saner way than this Java madness.

You have to realize that Clojure is Rich Hickey's fourth attempt or so at a Lisp dialect, and that he had lots of very painful experience working on maddening Real-World [TM] Java codebases.


What do you mean by the totally outdated approach to concurrency? I'm also a longtime Java dev who's embraced these values for a long time, and I generally thought of using ExecutorServices and Callables as a very functional approach to concurrent programming. If you're referring to manually starting threads and using synchronized monitors to ensure thread-safety, then I agree, but java.util.concurrent really doesn't encourage that approach.


Not the author, but I'd say he's referring to two Clojure features that make programming with concurrency so nice, specifically STM (Software Transactional Memory) and immutability by default. STM is a much better (IMO) approach to concurrency than locking, but it's really only viable (as far as I know) in a language like Clojure where values are immutable by default.

The great thing about STM is that it's easily composable, while locking isn't.


It can be done at the page level in an OS if you want. It's not efficient, but then it hasn't been shown to be that efficient in a controlled functional language yet either. Locking still wins on clock cycles, unfortunately.



