Hacker News new | past | comments | ask | show | jobs | submit login
Pattern Matching for Java (java.net)
370 points by steve_barham on April 19, 2017 | hide | past | favorite | 146 comments



IMO this is the biggest thing available to modern languages that Java is missing. I would absolutely love to see this, particularly pattern decomposition. I wonder if you could do it without something analogous to Scala's sealed classes though--you really want your type checker to be able to assert every match has considered every branch (without having to specify a "default" everywhere). That means you need to be able to mark classes as not-dynamically-extendible, so the type checker has the full set of subtypes available.

Edit: Just got to the bottom of the article. Looks like sealed hierarchies is exactly what they explore.


I mean, reading this article is basically a giant advertisement for Scala and its powerful pattern matching.

  functionThatReturnsAnyVal match {
    case i: Int if i > 2 => System.out.println("Greater than 2")
    case i: Int => System.out.println("Less than or equal to 2")
    case s: String => System.out.println("Hello, " + s)
    case _ =>
  }


This still seems a terrible example just to try and avoid naming an object. Isn't the comparison to:

    Object o = functionThatReturnsAnyVal();
    if (o instanceof Integer) {
        Integer i = (Integer) o;
        if (i > 2) {
           System.out.println("Greater than 2.");
        } else {
           System.out.println("Less than or equal to 2.");
        }
     } else if (o instanceof String) {
        String s = (String) s;
        System.out.println("Hello, " + s);
     }
I get that the case syntax is kinda nice, but this particular example just doesn't seem to get there for me. Roughly half the lines, which is good. None of them hard to reason about. Which makes it a wash.

Or is the comparison to something else?


> I get that the case syntax is kinda nice, but this particular example just doesn't seem to get there for me. Roughly half the lines, which is good. None of them hard to reason about. Which makes it a wash.

I suppose it's a matter of opinion, but to me the comparison of these two snippets is nowhere near "a wash". The readability of the former is leaps and bounds ahead of the java version, and it will only be more apparent as the example grows in complexity.


As either example grows in complexity, they both wash down to a check and a call to another method to do the heavy lifting. In that case, the length of the methods will not grow apart from each other heavily.

And let me be clear. My gripe is only with the example. Not the idea. I happen to like pattern matching, but have never used it like those. Usually I use it to tease apart data by parts. (though, to be fair, I have found I use it less than I thought I did. Not sure why...)


In reality my function would rarely exist in the Scala world since having a weakly-typed return like that is cringeworthy to say the least. But it's better to see this in action with Scala options.

  val myOptionalVar: Option[String] = functionThatReturnsOption()

  myOptionalVar match {
    case Some(str) if str.length > 10 => System.out.println("This is a long string.")
    case Some(str) => System.out.println("This is a short string.")
    case None => System.out.println("This code would be fifteen lines of null checking in Java.")
  }


With Java 8 having Optional and Lambdas, this is no longer the case:

  final Optional<String> myOptionalVar = functionThatReturnsOptional();
  
  final String output = myOptionalVar
    .map(str -> str.length() > 10
      ? "This is a long string."
      : "This is a short string.")
    .orElse("This code would be fifteen lines of null checking in Java.");
  
  System.out.println(output);


I think this is less-rare than you think. In my Scala code, I have a lot of matching on weakly-typed values which come from parsing JSON objects (where I also deal with a lot of optional values).


I think the prettier syntax actually really helps write code faster. A similar form of syntactic sugar appeared with lambda expressions. Instead of needing to explicitly subclass and write out the overriding method, you can just create an anonymous function.

It's way less code and it makes the resulting code easier to reason about. The code's intent is more elegantly conveyed.


(String) o;

:)


Fair. :) I was much more worried about if I had the required spaces at the beginning of the line, than if I wrote this correctly. Still a static error, though, that the compiler would catch.


> reading this article is basically a giant advertisement for Scala and its powerful pattern matching.

Javas strength has always been to take the good parts of its competitors after they've checked out in production (and not just "wouldn't this be a great idea ..?") and implement them. It will probably never be ahead of the curve due to this, but at least it is remarkably free of "looks good in theory, useless in reality"-features.


In general, I agree with all but your last point. There are many things in Java that were good ideas at the time of introduction, but turned out horrible (serializable, finalize, etc.)


Fair enough. I think both of your examples exist since JDK 1.0 (or 1.1), though I'd have to look to be sure. I think they've wised up after that.


Type erasure for generics...


Wrong. Type erasure for generics is one of Java's luckiest breaks. At worst, it imposes a tiny inconvenience, but in exchange this is what allowed Java not to bake a variance model into the VM and libraries, and this is precisely what allows languages with different variance models -- like Java, Kotlin and Clojure -- to both share code and actual data objects with no runtime conversions. This is something that can't happen on .NET (see, e.g. how clumsily their Python implementation interacts with generics). This was a huge win for almost no cost.


I completely agree, what a blessing in disguise. Scala wouldn't be possible without it.


It was designed in part by Phil Wadler, so it's not surprising it's a success.


You mean it's a giant advertisement for ML and its powerful pattern matching?


> IMO this is the biggest thing available to modern languages that Java is missing.

It always amuses me to see features of ML, a language from 1973, described as "modern"; e.g. algebraic datatypes, pattern-matching, type inference, parametric polymorphism, etc.

C (from which many popular languages like C++, Java, C#, PHP, etc. are derived) came out in 1972.

I wouldn't say it's a case of being "modern", so much as paying attention to what else has been tried before, rather than sticking with what one already knows (when designing a language).


I didn't use "modern" to describe features, I used it to describe languages. Modern languages have a tendency toward multi-paradigm feature inclusion. So while the features themselves aren't new, combining well-received features from other languages and paradigms is largely a characteristic of modern languages. And newer languages have the upper hand here since it's easier to include a feature from the start than it is to retrofit it into a language.


I wonder what the programming world would look like now if functional programming concepts had broken through in the 80s and 90s, instead of OO. C would no doubt still be C, filling the same minimalistic imperative close-to-iron niche, but what about higher-level languages?


The move towards Interactive Applications has helped OO and scripting languages. There are still plenty of applications for functional programming in the most demanding of environments, for safety and high availability. However, most students are not good enough at math to learn functional programming concepts. Whether that's the teachers' fault or the students' fault or society's fault, I can't say. But if we paid programmers for safety critical systems much more than we pay for good front-end developers, you would see a much higher level of adoption of functional programming languages.


> However, most students are not good enough at math to learn functional programming concepts.

When does this misconception disappear?

You don't need to be good at math to use functional programming. It's nice that there is a correspondence between math and FP, but it's mostly irrelevant when coding. You could as well say you need a FP background when learning math. Both statements are nonsense and usually spread by people who mostly read complicated blogposts instead of writing actual code using FP.

FP just offers a nice bunch of intuitive and predictable ways of processing information.


Take Lisp as a perfect example, I doubt you'd find very many people who think that you need to be strong at maths to learn Lisp. I wonder if this misconception came about because of Haskell's (unfounded) reputation of being a "scary abstract maths language".


> However, most students are not good enough at math to learn functional programming concepts.

I don't agree with this. Many programming concepts outside functional programming can be very hard to grasp, yet we expect students to pick them up, and we expect front-end Web devs to use (some of) them every day.

Some examples off the top of my head:

- Pointers (e.g. the many incompatible meanings of "" in C).

- Classes and instances.

- Implicit state.

- Mutable global state!

- Concurrency (events, continuations, actors, etc.).

- Multithreading. Yes, it's a form of concurrency, but it also offers a unique combination of being a) the default go-to strategy for implementing concurrency, and b) so fundamentally hostile to any attempts at understanding.

- `x = x + 1`. This can be completely baffling when first encountered!

In this context, functional programming doesn't really require anything particularly difficult. Maybe recursion is hard to grasp, but that's not unique to FP; FP just uses it more often, making it harder to avoid. The only thing I can think of that's pretty much a requirement in FP, that we don't already require everyone to grasp, is first-class functions; and they're already pretty widespread in scripting languages.

What about monads? What about denotational semantics? What about cartesian closed categories? What about all that other complicated math stuff? None of it is needed to do FP*! Just grab a small Scheme interpreter, ignore "call/cc" or anything ending in "!", and start coding.

Let's look at, say, the accepted papers from the last OOPSLA (Object-Oriented Programming, Systems, Languages & Applications): http://2016.splashcon.org/track/splash-2016-oopsla#event-ove...

Looking through the titles, I don't know what an "enclave" is, or "conditional future synthesis", or "first-class effect reflection". I have no idea what a "non-linearizable concurrent object" is, or what constitutes a "dependent object type". Does that mean I'm not good enough at math to learn OOP? Not at all; I can just run `python` and start hacking.


Oops, I meant the incompatible meanings of the asterisk in C; HN swallowed it as markup, and emphasised the rest of the comment.


Half the world runs on Excel, a functional programming environment. If your average manager is mathy enough to understand functional programming, the average programmer should be as well.


I retrofitted F# style "discriminated unions" (which are basically sealed heirarchies) into C# by creating a series of generic types `OneOf<T0, ..., Tn>` which can hold exactly one value,

Each type has a `.Match` and `.Switch` methods, in to which you have to pass lambdas to handle each case `.Match(Func<T0, TResult, ..., Func<Tn, TResult>`.

I don't know if this would work in Java, given the generic type erasure, but it might...

1. https://github.com/mcintyre321/OneOf


Why the fixation on pattern matching? Don't get me wrong, I enjoy the benefit it provides, but for OO code multiple dispatch is a more elegant and idiomatic way to solve the "I don't want to implement the visitor pattern" problem.

FYI, I recently found out that C# can actually do multiple dispatch

https://blogs.msdn.microsoft.com/shawnhar/2011/04/05/visitor...


Multiple dispatch is sometimes more elegant, provided you can divorce the method being dispatched to from the argument classes. If you need to do it with a visitor pattern and multiple callbacks, it's much worse. If you need completeness checking, it's worse. If you've got a simple concise algorithm best expressed in a loop, it's worse. If you've got a nested structure that you want to partially destructure, it's worse.

As you've found, overloading is statically typed multiple dispatch; discovering the right overloaded method at runtime is functionally equivalent to multiple dispatch if your overload resolution rules prefer more specific derived types (rather than simply giving up with ambiguity).


would you be able to give me a pseudo code example of things that are easy to do with pattern matching, but not multiple dispatch? I'm intrigued, I always considered them equivalent.


Here's some concrete examples in Rust:

http://pzol.github.io/getting_rusty/posts/20140417_destructu...

It's not just about finding the right bit of code to run; it's also about pulling some fields out of the polymorphic value so they're available as nice short names, without qualification.


I suppose languages from the ML family, and rust, haskell etc are somewhat oriented around nested structures of algebraic types that you want to destructure to pull out little atoms of data out of from some well-defined node. But that's orthogonal to the OO approach, where once all the little atoms of data are sealed up in their protective casing of an object you don't want to deal with them anymore.

C# is a weird beast though - it's pure OO on some level, but then keeps making getters and setters even easier to write. It's easier to write an "object" where every field has a getter and setter, than it is to write an actual constructor for private variables, so that's what people do.


To me it seems pattern matching without possibility to actually enforce that all cases are matched seems to miss the most useful scenario:

What I want to do is create proper sum types. For some reason most OO languages were designed around the idea that being "open for extension" was a good thing. Normally if I make a base class, I want a closed and known set of sub classes. E.g. if I make a payment method then I want to make a Credit or Invoice kind where one requires Credit Card details and the other requires an address. In F# that's a simple sum type. In C# making a sum type is a convoluted mess of making an outer base class and private nested sub classes.

Here a sum type with the two cases plus an exhaustive pattern matching switch would have solved in 10 lines what takes me 300 lines to do in C#. I think the concept of "fixed" or "closed" hierarchies is foreign enough to OO that it probably needs to be a completely separate concept from the normal types. F# shows that it can be implemented as sugar on top of the regular CLR type system. All that's needed is that some keyword such as "enum class" is converted into a sealed hierarchy of plain record classes with no methods on them. The switch statement we already have can then treat these "enum classes" specially by checking that all cases are matched.

A sketch solution for C# could look like

    enum class PaymentMethod
    {
        CreditCard(CrediCardDetails cc),
        Invoice(Address billingAddress),          
    }
    
    // later
    switch (paymentMethod)
    {
      case CreditCard(ccdetails):
         Console.WriteLine($"Creditcard expiring {ccdetails.ExpiryDate}");
      case Invoice(addr):
         Console.WriteLine($"Bill to {addr.Name}");
    }
Writing the above in current C# would require significantly more boilerplate. This isn't the best pattern to use in all situations: but in many cases when the types are merely data carriers it doesn't make sense to put the logic in the class as OO prescribes, so the FP recipe works much better.


In the C# example in the code I posted, all cases are matched. If you add a new animal subclass, that's not "explicitly" handled, it will route to the default (Animal, Animal) method.

It's equivalent of

    _ -> default_function
At the end of a pattern matching statement.

Essentially it's impossible for not all the cases to be matched, as far as I can see.


Yes but making new classes map to default is what a default clause already does (which is basically what a base case in an abstract base class does with normal inheritance).

What I want is to make a new type added mean the code won't compile until it is explicitly handled in all pattern matches that do not have a default clause. Basically what I'm saying is that this:

   bool HandlePayment(PaymentMethod pm) {
      switch(pm) {
         case Invoice(addr):
              SendInvoice(addr); 
              return true;
         case CreditCard(ccdetails)
              return PrcessCreditcard(ccdetails);
         default:
              // What?
      }
   }
Would be a lot better if it didn't require the default, and the compiler could detect the error that occurs when someone adds a new case (BitCoin) to the types of payment method. Inheritance and abstract baseclass for the processing means that you'd do this

   bool HandlePayment(PaymentMethod pm)
   {
      return pm.Process(order);  // sends an invoice, charges credit card etc
   }

   and you simply have 

   abstract class PaymentMethod { abstract bool Process(order); }
   class CreditCard : PaymentMethod {  ... }
   class Invoice : PaymentMethod {  ... }
And this is of course the regular way of writing OO, but I find it unnatural and cumbersome in many cases. It's hard to define concise descriptions of what types are actually available, and you have to write a ton of boilerplate to ensure a closed hierarchy and complete matching in all scenarios where you can't enclose the logic inside each type.


Scala gets this right, in my opinion, with sealed traits and case classes. "enum classes" feel like a much more limited option by comparison. Although I'd take either over the status quo to be sure.


What's the difference between the enum class I outlined and scalas case classes? I don't know Scala but reading the docs on its case classes, their description seem to fit exactly what I tried to describe (immutable types, closed for further inheritance etc).

The example they give for notification = email | sms | voice looks a lot like what I was trying to sketch with the paymentMethod example.

http://docs.scala-lang.org/tutorials/tour/case-classes.html

I didn't intend for my proposal to be more limited than than the Scala case classes at least :). I want exactly that. An FP closed type hierarchy with enforced pattern matching.


Case classes can have methods on them, for one - I can see how that might be added to yours, but it'd probably be very awkward syntactically if you wanted different methods on different cases. To me, it also feels like your enum class is more of a special case. Scala's case classes and sealed traits work together as independent features that come together to provide value. I've used case classes before without using the 'enum-like' property of sealed traits, where I just want to get equality, unapplying, etc... for free.

They also have more configurability - while a case class provides default implementations for things like equality, you can override those (this could be a pro or a con depending on whether you prefer flexibiltiy or performance, I guess - although arguably the flexibility should only have cost if you use it, assuming it's implemented well).

It's definitely not a huge difference, and arguably Scala can suffer from it's philosophy of providing tools that can do something rather than tools for something that means the language often comes across as huge and as having too much stuff in it. I like it, but it's not the C# way right now for sure.


Ah - yes method impl on the cases is certainly a difference. I'd probably choose compatibility with F# sum types if this lands on the CLR though, but I can see how equality and some other things might be useful. Otherwise you are forced to pattern match for all logic which might end up like an inside out object.


I think matching + destructuring solve similar but different problems. Often times, items you don't need to do something based on some unbounded set of things nor do you need an abstraction.

For example, if I am writing a logic system in rust I can either match with one single arm or a nested match implement simple rules like double negation. I don't see how you could easily do this with multiple dispatch like the one you link.

I find myself really wish more languages would implement both the system you link and something like rust's/ML's. Usually it's easier to build multi-dispatch than match syntax though. In rust you can use HashMaps of TypeId -> Box<Any>. I would do a similar thing in C# (though now I am going to use this shiny new feature).


> multiple dispatch

> "I don't want to implement the visitor pattern"

Both multiple dispatch and visitor bring type unsafe dynamism and possible runtime errors. Pattern matching and algebraic types fit well into static type system and are much more safe.


Why does multiple dispatch bring type unsafe dynamism or runtime errors? Or do you just mean C#'s implementation of multiple dispatch?


Yep, there is no way to check the exhaustiveness in the compile time. Say you have an Expression parent class and a list of child classes: say Add, Const, Var, and a corresponding visitor. Add new child class, and the visitor would be non-exhaustive. Multiple dispatch is not very different in that sense.

I think that Alan Key spoke about dynamic nature of OOP, that's one of the examples. Even in the statically typed environment OOP demonstrates its dynamic nature.


I'm lost here. Either people didn't read the article and are just relying on preconceived notions about what stuff can and can't happen with multiple dispatch, or I am missing something blindingly obvious. Anyway, consider this example:

    class Expression {}
    class Add : Expression {}
    class Const : Expression {}
    class Var : Expression {}

    void FSpecialization(Expression e1, Expression e2) { ... }
    void FSpecialization(Const c, Var v) { ... }

    ...

    void F(Expression e1, Expression e2)
    {
        FSpecialization(e1 as dynamic, e2 as dynamic);
    }
How is this not exhausted? Any new sub class of expression will be routed through to the first FSpecialization. Same as it would with a

    _ -> default_f
in algebraic pattern matching.


It's not exhaustive, because you're only dealing with the Const+Var case. You ignore the other 5 cases. Having a default case doesn't suddenly make it exhaustive, it's just an explicit acknowledgement that you can't.

Because what happens if you introduce a new expression, but not realize this means the implementation of F should be updated?

Regarding algebraic pattern matching, you can let Haskell check exhaustiveness and then drop the default case. The compiler will then warn you if you forgot one.

To give you an example, in our codebase we have an enum that is used 3068 times in one solution. A very similar one (you would need domain knowledge to understand the difference) is used 2985 times. Both are not defined in that solution by the way.

It's not out of the question they will receive a new enum member down the line. It would be very useful if the compiler was able to tell you a switch that uses either enum is no longer exhaustive.

If you could exhaustively switch on an enum, you could skip the default: case and have the compiler enforce you covered all cases. (I know, the C# enum==int would make that impossible.)


>How is this not exhausted?

If you would remove FSpecialization(Expression e1, Expression e2) a.k.a wildcard pattern, compiler would not warn you about your dispatching is not exhaustive. And if you'd add another type of expression, all your dispatchers and visitors without wildcard would become non exhaustive, staying valid code from a compiler perspective at the same time.


I'm sorry would you mind going into a bit more depth about that example? It's not obvious to me why the compiler can't figure out the the visitor is no longer exhaustive.


Visitor is just a collection of functions for all type's subtypes. The problem is that visitor knows nothing about class' subtypes, just like the base class knows nothing about it's children, so you have to add visit cases manually for each child class. The only way to detect that visitor is non exhaustive is to apply it to the child class it has no visit function for.


I think both are not exhaustive (or don't have exhaustiveness enforced by the compiler).


> FYI, I recently found out that C# can actually do multiple dispatch

But also good to know, that it can be costly and unsafe. `dynamic` is basically using reflection at run-time, which for most cases is fine, but it's something to keep in mind. It also means that type checking is delayed until run-time instead of compile-time.


A dynamic type is probably just implemented as a tagged bit of data, surely. Which is the exact same way an algebraic datatype would be done. So I dunno why it would be any slower.

I think the safety thing is a non issue as well - what arguments could you pass in to the statically typed function that would cause the use of dynamic to bite you in the ass?


> A dynamic type is probably just implemented as a tagged bit of data, surely

A dynamic "type" is really just `object` whose methods are looked up at run time. It's not tagged data at all. See [1] for more info, specifically the example(s) at the bottom, as the reflection code is what gets executed at run (albeit with caching so that methods aren't looked up all the time).

`dynamic` is a complex feature that enables multiple dispatch as a side-effect, but only because it allows a whole lot more.

[1]: https://visualstudiomagazine.com/Articles/2011/02/01/Underst...

> what arguments could you pass in to the statically typed function that would cause the use of dynamic to bite you in the ass

    public void Output(int value);
    public void Output(Person person);
    ...
    dynamic aValue = "Some String";
    Output(aValue);
Compiles, because the actual resolving of which overload to call is done at runtime, not at compile time. And at runtime, there is no overload that accepts a string.


I am talking about what value you could pass into the multiple dispatch code in the article I linked to. I know you can cause run time type errors using dynamic in general, but in the code I linked it was very well contained and water tight.

Dynamically typed languages usually represent objects as something like a struct with an int tag to represent the type, and a void pointer for its value. The actual reflection going on in the code I posted for multiple dispatch would surely be nothing more than an int comparison - same as what Ocaml probably does.


> I am talking about what value you could pass into the multiple dispatch code in the article I linked to

Ah, I misunderstood the question. Yes, in that case it's fairly type-safe.

> The actual reflection going on in the code I posted for multiple dispatch would surely be nothing more than an int comparison

Is this based on gut reaction or are you talking about optimizations that `dynamic` performs? I ask because my understanding is that `dynamic` is like a really efficient reflection-emitter, that attempts to do at runtime the same thing that the compiler would do at compile time, but slower because it has to look stuff up via reflection.

From Eric Lippert [2]:

> The magic is: the compiler emits code that starts the C# compiler again at runtime. The runtime version of the compiler analyzes the call as though the compile-time types of all the objects had been their actual runtime types, generates an expression tree representing that call, compiles the expression tree, caches the delegate for next time, and runs the delegate.

[2]: http://stackoverflow.com/questions/10330805/c-sharp-multiple...

So maybe `dynamic` at runtime figures out that it can do a "a struct with an int tag to represent the type, and a void pointer for its value", but I wouldn't assume it does and from what I've read on it, I wouldn't think that it does.


Base classes already have to carry their runtime type with them anyway. As do nodes of an algebraic data type. You may well be right that the compiler might not be "sufficiently smart" to see this very limited use of dynamic and realise no special reflection magic is actually required. But considering dynamic is only used to actually pull out the runtime type in the example code, and not to call arbitrary methods on it that may or not be there, I assumed - maybe wrongly - that the runtime hit would be the same as the branching that occurs in Ocaml or Haskell or F# when it comes across a "match" statement.

But to answer your question - yes, entirely based on gut reaction and what a compiler would ideally be able to do given the code posted. So very probably wrong.


> I think the safety thing is a non issue as well - what arguments could you pass in to the statically typed function that would cause the use of dynamic to bite you in the ass?

None if the function is written correctly, but by that logic you don't need type safety at all. The point is that the dynamic technique gives you no warning if you forget to implement one of the cases.


My point was that the dynamic technique in the posted code was written in a very small, contained point where it would be impossible due to the signature of the surrounding argument for it to go wrong.

    void React(Animal me, Animal other) 
    { 
        ReactSpecialization(me as dynamic, other as dynamic); 
    }
Unless you pass in a casted value here, it's fool proof.


The point is that if you forget to write one of the ReactSpecialization methods nothing detects it. Perhaps more importantly, if you add a new type of Animal there's nothing to tell you you need to write a bunch of new ReactSpecializations.


Aha! There's the crux of an argument. It's a valid point actually, that ties into the expression problem.

As an OO apologist I do feel compelled to point out that

    type Animal = Cat | Dog | Mouse

    let react = function (a, b)
    | Cat, Dog -> ...

    ...

    | x, y -> $"{x} is not interested in {y}."
Would suffer from the exact same issue though.


Only if you deliberately add the catch-all case. If you don't and just handle all the cases explicitly, many languages will give you a warning or error when you add a new unhandled variant.


Yeah, it's a valid point.


No fixation. It can just be more convenient sometimes (and less convenient in others).


wow. I havent used C# for at least three years, but that is a really powerful feature.


What's even better is that the DLR consists of some pretty tight code. Historically I was using a ConcurrentDictionary containing Funcs compiled using Expression. I've found the DLR is just as fast (for things that you can express with it), with a much lower overhead. I've started using it in dynamic scenarios wherever possible.

Basically, this is going to be about as fast as you can get with the statically-typed underlying runtime.


> I've found the DLR is just as fast... with a much lower overhead

Can you expand on this? I would not have expected that it would be faster than compiled expressions.


The codegen creates a cache of delegates (created using expression trees), but uses a structure more suited to the problem than a simple ConcurrentDictionary.

[1]: http://geekswithblogs.net/simonc/archive/2012/07/20/inside-t...


Ah, faster than the `ConcurrentDictionary`. That makes sense!


Dynamic calls gets compiled at runtime, so there's an initial hit (as there would be with Expr.Compile) but after that ...


I was originally very happy to hear pattern matching was coming to C#, but I don't think it is actually that useful a feature without discriminated unions/sum types too.


What C# has now is better described as fancy form of Destructuring . Perhaps they'll add actual pattern matching in a later release.

Actually the proposed java version is better than C#'s because it allows a form of exhaustiveness checking which C# doesn't even attempt to provide for.

However they had to add the `sealed` keyword too, which does not currently exist in java


"How we declare the destructuring pattern in the AddNode class, or declare AddNode so that it implicitly acquires one, will be covered in a separate document."

Please use a Scala unapply() and not just constructor parameters. The former gives much more latitude to build patterns.

I have a ton more to say about all of this having written in Scala and Kotlin extensively. I'll just say that it is useful they are already thinking about new variable assignment to parts of matches and NOT concerning themselves w/ "smart casts".


Pattern matching is possibly one of the most under-utilized approaches that can simplify a lot of ugly/complex logic.


On the other hand it "prettifies" bad code style - branching by dynamic casts / type comparisons.


How is that bad code style? That's exactly what virtual method dispatch does.

The authors in the article even address this point, stating that some operations might make sense as instance methods if they are instrinsic to the type hierarchy, but others, especially ad-hoc queries, are extrinsic to the types and best expressed as pattern matching.

Simply calling this "bad code style" sounds a lot like a justification for only having a verbose way to use an alternative to virtual method dispatch.


Enumerating and testing for type equality each derived classes in one place (i.e. manual type switch) is certainly a bad code practice in OO languages. Yes, the better way is to let virtual method dispatch mechanism decide.


Depends. It's useful when you want to have all dispatch code in one place (as opposed to it being scattered across numerous files). Doubly so if your language (like most languages) lacks multimethods. Triply so, if you need to dispatch on built-in types you can't extend, which sometimes happens in e.g. Java.


Keep in mind that it is mostly considered bad code style today (at least in Java) because the current tools we have (if-statements, instanceof checks, like the article lays out) lead to bugprone code. A better facility like pattern matching could eliminate a lot of this.


Pattern matching allows far more than just branching by type comparisons. And I disagree that this is bad code style, this is actually a very common pattern in compilers (e.g. when you iterate a list of instructions, which may be of different types)


I disagree. What's bad about that/what's the alternative?


The problem with pattern matching as I've seen it implemented is that unsafe and safe pattern matching look exactly the same (even the exact same code line can be safe or unsafe depending on context). See the bottom of http://typelevel.org/blog/2014/11/10/why_is_adt_pattern_matc... . For this reason I tend to prefer explicit virtual methods even though they're more cumbersome.


The pattern matching I have in mind is very dynamic in nature and does not concern itself at all with some static notion the data that is being matched. Either there's a match, or there isn't. It's a very binary outcome.


That's not the important/valuable use case for what people normally refer to as "pattern matching".


I've recently found dynamic pattern matching pretty valuable to completely replace REST-like server-side APIs with a data-oriented API that gets pattern matched and dispatched based on the actual values and shapes of the data structure.

It helps almost completely avoid the /get/this /get/that /set/those explosion of API getters and setters that ultimately leads to very complex client logic that makes any notion of consistency at a distance very hard to reason about.


I really don't think you're talking about pattern matching in the usual sense. So it's probably better to use a different term.


vtable of regular dynamic dispatch.


IMO the hierarchy of need for this goes:

1 - case classes / value classes / data classes, whatever you want to call them.

2 - match-and-bind syntax

...

11? - fancy pattern matching

This maybe says more about how much I pay attention to what's upstream in Java, but I found the fact that this is just a hypothetical proposal, in April of 2017, strangely shocking. I guess I figured it had to be on the docket for a future java version already.


This is not just a hypothetical proposal, I believe there is a working prototype and a lot of thought has gone into this. More things are coming, but pattern matching is a low hanging fruit that doesn't have difficult interactions with the type system or VM. Value classes need a lot of language enhancement to be really useful, and those things are likely to land before value classes themselves.


Project Valhalla (featuring value classes) is scheduled for Java 10 — so it'll probably arrive in production around 2020.


I cannot comment on the Java release schedule, but Project Valhalla will be done when it's done, and it is not a small task. Project Amber is intended for smaller changes that may have been investigated as part of Valhalla and can be released earlier.


Maybe I'm misunderstanding your comment, but aren't case classes a fundamental piece of pattern matching? Or are you rather just suggesting the importance of which kinds of pattern matching are the most important?


match { case v: java.util.Date => }

This still works without case class extractors and is useful. That's what I mean by matching and binding.

Extracting/destructuring in the match statement can sometimes be more trouble than it's worth. It's definitely brittle to changes in the case classes. Then there's the question of the (difficult to reason about?) cost of the abstraction.


I think some/most of the "needed" features for Java are already available in some other JVM languages. I'm not too familiar with the JVM ecosystem but Kotlin comes to mind.


I'm currently thinking through the design of a pattern-matching feature for JavaScript, through a superset I'm building called LightScript[0].

I liked that this article laid out the specific options and various shades of gray that can constitute parts of "pattern matching".

I'm curious for thoughts on a syntax like this:

    x = match y:
      case > 3: () => "it's bigger than three"
      case 2: () => "it's strictly equal to two"
      case Integer: (x) => `some int smaller than two: ${x}`
      case String: (s) => `some string: ${s}`
      case Array: ([ first, ...rest ]) => `a list starting with ${first}`
      case Object: ({ a, b }) => `an object with property a: ${a}`
      
This is somewhat more difficult given that even when using Flow or TypeScript, there's relatively little type granularity available at runtime.

Any thoughts?

[0] http://lightscript.org


To me the biggest missing features are not this but:

* heredoc/multiline string/embedded interpolated strings * concise array, map, and object literal initializers. * structural/anonymous types

Simple things like multivalue return become an exercise in boilerplate in Java. As much as I like Immutables.org or @AutoValue, I should be able to return a struct or use destructuring operations at the callsite.

The hack in Java is to use annotation processors to provide nice fluent builder patterns, but really the language should have first class support for this.


> concise array, map, and object literal initializers.

http://openjdk.java.net/jeps/269 Just a few more month.

Still nothing for the other two, but 1/3 is better than nothing.


Those aren't new language features though, mostly just standardizing Guava conventions into the JDK.


Fair enough, but they do the job. I understand the wish of the designers to not change the syntax if a standard library extension does the job for 99% of the cases. Changing the language syntax is a far more elaborate process.


I think it gets real ugly once you need nested data structures, the readability goes down (maps of lists of maps). Also, using overloads means it's limited to small arity, e.g. 10.

Not having first class multiline literals really makes it painful to write something like React or GraphQL in Java.

One of the things I give props to MS for is having the courage to evolve C# and TypeScript at a breakneck pace. Each release of TS, they add features I've been wishing for. They seem to be listening to developer productivity issues and responding.

I under Java's desire to preserve backwards compatibility, but they end up breaking it anyway, and make design decisions based on how they can shoehorn things into the older JVMs, but each new release ends up with some issues. Witness how long it has taken folks to migrate to Java8. If we're going to wait years for major releases, then let's make them well worth the trouble when they finally arrive.


> Not having first class multiline literals really makes it painful to write something like React or GraphQL in Java.

Certainly. And that's one of my biggest pain points with Java, but a different point in your list. One day they will exist ... one day. Hopefully.

> Witness how long it has taken folks to migrate to Java8. If we're going to wait years for major releases, then let's make them well worth the trouble when they finally arrive.

I've used every release since JDK 6 in production one or two month after it was released. Once that hurt me (Rhino/Nashorn), but usually it hasn't been a problem. I think people who wait until the last moment when the older versions go out of support to switch JDKs do themselves a disservice out of fear. And fear is never a good reason.


Would be really great to see this make it's way into the language. Here are some ways I've found to work around its absence: * adding a match() function to a common base type. So if it could have subtypes Foo(x), Bar(y, z), then the signature would look like:

  <T> T match(Function<X, T> foo, BiFunction<Y, Z, T> bar);
* in the cases where I don't have control over the common base type, I've written a builder pattern that constructs a sequence of predicate-function pairs and applies them appropriately. This looks like so:

    new PatternMatchingHelper()
    .append(Foo.class, foo -> doSomething(foo.x))
    .append(Bar.class, bar -> doSomethingElse(bar.y, bar.z))
    .otherwise(x -> fallbackValue())
    .apply(objectOfUnknownType);
The main disadvantage here is that it doesn't work well with generics, because the class objects have raw types.

You could probably extend the second approach to get something close to the arbitrary predicates that pattern matching would provide, but the syntax wouldn't be nearly a clean as having it in the language.


Just use another language if you want pattern matching so badly. Twisting Java to look like Scala doesn't do the language justice.


You act as if what op described must be the entirety of their code base. What if pattern matching is just a small part of a larger problem better solved with Java?


It's not so easy to just choose another language; there might be hundreds of thousands of lines of Java already written. In that case it's better make the best of a bad situation by doing something like this, than not even trying.


Any substantial differences from Scala's implementation? From what I remember of Odersky's book it seems very similar if not identical. (Which is fine - I'm all for Java incorporating the best parts of Scala.)


The syntax looks similar, but there's no mention of an `unapply()`-type thing, which would limit the flexibility of this proposal greatly.


For me the best parts would include:

- Type inference

- Everything is an object (no more primitives)

- no obligatory ';' as a statement separator

- Type-classes

- Scoped import statements

At which point we're actually recreating Scala and we might as well switch :)


Caustic people may argue that Java already has the worst parts of Scala :p


If you like pattern matching, you'll love it's generalization into Prolog unification as explained in eg. [1].

[1]: http://www.amzi.com/AdventureInProlog/a10unif.php


I really really like these suggestions. Especially the first parts where I have found myself writing a lot of instance-of-cast and just marvel at how much clutter it actually adds without adding more description to the code. I do also ponder if there is a limit to how much we should potentially shorten the expressions, not to make the expressions extremely dense it understanding. The last opinion is just taste though, admittedly.


I feel these new language features that get grafted onto Java end up awkward and unpleasant to use. Java is still Java, and it exacts a stiff awkwardness tax for writing code in a style different from how Java OO was envisioned 15-20 years ago. The examples in the linked article look fine, but the difference between pattern matching as "possible direction" and pattern matching as "new language feature" could easily end up like the difference between Orson Welles as Charles Foster Kane and Orson Welles as Falstaff.

I wonder if Oracle could create some excitement around the platform by creating or adopting a new language as an official repla^D^D^D^D^D "new member of the family" to implicitly succeed Java just like C# replaced Visual Basic for most use cases. Java could be kept around as a sop to die-hards like VB.NET was. It's great that the JVM allows a thousand flowers to bloom, but it's not great that the only "official" choice on the JVM is a language that hasn't been able to evolve very far from its 1990s roots.


Oracle's most profitable customers are still using Java 6. That wouldn't change if they were pushing Kotlin.


I don't know Kotlin, but judging by their "Kotlin for Scala Developers" article I don't know if it would be a big enough jump to get people excited, especially since a project like this would take years to come to fruition. I think Scala is a good first try at what the successor language could be, but it has two problems: its reputation as a hard language, and the fact that it partly deserves said reputation. (Maybe if Odersky succeeds in simplifying the concepts underlying Scala as he's trying to now with Dotty, the result will be a potential Java-killer.) Oracle would have to fund a research project to choose an existing candidate and spend years developing it to fit the role it would play in the market. Which sounds very unlike Oracle, so I'm not holding my breath.

EDIT: And to your point, I think it's a bad sign if their most profitable customers think that none of the Java language updates they've released in the last ten years is worth upgrading for. The longer their customers stick with 2006-era Java, the longer they pass up Oracle's current offerings, the less attached they feel to the future of the platform, and the more likely they are to make a big change when they do make a change.


The syntax still looks tedious compared to Ceylon:

    if (is String name) {
        // compiler knows 'name' is String in this block
        print(name);
    }
    else {
        print("some other text");
    }
Why declare a new variable? The Ceylon syntax works great with union types too.


I had the same comment on C#'s implementation:

https://news.ycombinator.com/item?id=12971841#12972691

In C#, the "wanting the supertype" answer makes some sense with explicit interface implementations [0]. Java doesn't have that, and AFAIK there's no way to declare something that's visible on a supertype but not a subtype.

[0] I think it's a bad reason. But the possibility does at least exist.


You can do this in Java using derive4j: https://github.com/derive4j/derive4j.

It generates code for algebraic data types which offer a typesafe interface similar to the `case` statements in languages that natively support pattern matching.


In the same way you could "do" lambdas with single method interfaces in Java pre-8. It was possible, but it was still a pain to do.


It depends. Bare minimum, in scala you have to do:

  sealed trait Either[A, B]
  final case class Left[A, B](a: A) extends Either[A, B]
  final case class Right[A, B](b: B) extends Either[A, B]
and it is still not a real sum type due to exposing subtyping (cf. https://github.com/quasar-analytics/quasar/wiki/Faking-Sum-T...)

with derive4j:

  @Data
  interface Either<A, B> {
    <X> X match(Function<A, X> left, Function<B, X> right);
  }
or, at your preference:

  @Data
  interface Either<A, B> {
    interface Cases<X> {
      X left(A leftValue);
      X right(B rightValue);
    }
    <X> X match(Cases<X> cases);
  }
So it is actually not much boilerplate.

The real drawback of (any) java implementation is lack of TCO.


How about...

    instanceOf x is? {
        Int { System.out.println("It's an Int"); }
        String { System.out.println("Hello, " + x); }
        _default { System.out.println("This type is not explicitly named here."); }
    }

And if you don't want the type, the value of any other method, function, or attribute could be checked by "is?" or whatever syntax token you want there.

Further tests could be saved for within those blocks to save complexity.

What's odd though is that in Java this particular example seems to want multi-method. Testing the type explicitly and acting on it is more akin to duck-typed language programming. You see this pattern pretty often in Perl for example. If I want to write in Perl, I typically reach for Perl.


It's not just the types/instanceof operator though

They're talking about making the Pattern matcher match more complex patterns.


Along with with Value Types [1] Java 10 could be an even more significant improvement than Java 8 was :)

[1] http://cr.openjdk.java.net/~jrose/values/values-0.html


I would love to see this land in Java. Pattern matching is one of my favorite features of Rust.


Jeez, seems like overkill when you get that free in more modern of stacks. Move to Elixir. Pattern matching is done right and it's brilliantly designed. I understand that if you're having to maintain Java applications, but why anybody would stomach a NEW project on that behemoth is beyond me. I'm not just talking about performance, I'm talking about object topology BS, complicated libraries, etc. Java (for me now) seems like it's bolted on to the Web and simply doesn't have it's place as a Web stack. My opinion of course but with the great amount of stacks out there -- I don't get the fascination.


There is some very cool language design work from Chinawat Isradisaikul and Andrew C. Myers at Cornell on adding very powerful matching constructs to a Java-like language.

Paper: http://www.cs.cornell.edu/~chinawat/papers/pldi13-p343-israd... Slides: http://www.cs.cornell.edu/~chinawat/papers/chin-pldi13-slide...


Reading this reminded me of all the discussions that took/are taking place for pattern matching in C#[0]; reading this I had the same reactions that I did for C#: cool stuff. It's also nice to see the languages reach parity in features; I only hope the two language committees pay attention to the research/feedback that the other receives.

[0]: https://github.com/dotnet/roslyn/issues/10153


"Finally, the above code is less optimizable; absent compiler heroics, it will have O(n) time complexity, even though the underlying problem is often O(1)."

Is this addressed with pattern matching?


How does Vavr's (formerly Javaslang) pattern matching hold up? http://www.vavr.io/


I already use vavr, sadly such pattern matching is not exhaustive, which is the most important feature for me.


Java already has simple pattern matching with exceptions (try {throw ...} catch ...). It's surprisingly ergonomic, even though it's just a hack


Shameless plug: I recently wrote a similar piece exploring pattern matching in Javascript http://jarrodkahn.com/posts/6/slug/


Looks quite Swift-like. Seems like a big win all around for pattern matching to become more "main stream".


This is nice. But I think my number one feature would be first-class relationships in object oriented languages.


Am I missing something or can most of the examples just be achieved with method overloading?


Java only has single dynamic dispatch. That means it will use the method of the most specific sub-class, but will not dispatch on the runtime type of the argument, only its compile time type.

So something like:

  for(None n: expreTree.children()) {
    intMath.eval(n);
  }
will not work as expected if eval is overloaded.


Really great proposal. Is this conversation in relation to Java 10 or just pie in the sky?


This is an exploratory document only and does not constitute a plan for any specific feature in any specific version of the Java Language.

I hope it does go forward, though. Really useful feature, and given the constraints of the existing language, the proposed syntax looks pretty liveable-with.


I honestly don't need this feature.

Am I really the only one?

Or do you all just have buckets of source code filled with if instanceof then cast just screaming for this language feature?


I have buckets of code that would be much shorter and more typesafe if it was that way at least.


ITT at least 3 JavaScript libraries to do this.

Also how the heck do you destructure Java objects with private fields without resorting to bean accessors?


good approach taken from scala programming language. makes things simpler and shorter.


It is an old concept used in most functional languages, like ML, haskell or even Prolog via unification.

So not an invention of scala, only that is also lives on the JVM.

http://wiki.c2.com/?PatternMatching


Really?! exprswitch?! Unreadable and long!


That's just a placeholder. Several times in the doc they say that the operator name needs to be explored, specifically to see if overloading 'switch' is feasible.


This isn't a spec, it's a thought experiment. There's no need to criticize syntax at this point.


If you're interested in exploring pattern matching, I've created a Javascript/TypeScript library dedicated to it. [0]

You can see it live! [1]

[0] https://github.com/jiaweihli/rematch

[1] https://runkit.com/jiaweihli/57db70d841de7f1400d64f73




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: