Hacker News new | past | comments | ask | show | jobs | submit login
Decision Tables (hillelwayne.com)
166 points by dailymorn on Oct 18, 2018 | hide | past | favorite | 49 comments



What I find most remarkable about the comments here is how quickly programmers try to jump to code (I'm not passing judgment, just making an observation). The goal of this post is to present a simple yet useful specification tool. Sure, a decision table implemented in Clojure would likely be more readable than one implemented in assembly, but figuring out what the decisions should be would be far, far more costly and significant (in all interesting cases) than implementing, or reading, either. Sometimes a specification is not even implemented in a single program, but in some distributed system, with no direct representation in code, and sometimes a specification specifies the environment the system interacts with, and is not even implemented by a computer at all. Coding is important, but specifying is usually much more so, as it is both more costly and has a higher impact. Developers should learn how to think about specification and use various aids -- like decision tables -- without thinking at the code level, which is often a distraction.


I agree, but what's even better than a specification is an executable specification. If we see a decision table as data (see discussion below) then it can be both -- no chance for the implementation to go out of sync with the spec, and "don't repeat yourself".


Good specifications are often written at a level that generating executable code from them is often undecidable and almost always intractable in practice. What would really be nice -- although we have not been able to achieve even that at scale yet -- is a formal refinement between implementation and specification (i.e. you can't execute the specification, but you can ensure it's in sync). There are, however, various techniques that can do so approximately [1]. But I believe that even a good spec without any formal tie to code (which we certainly can do) gives you 90% of the value of a spec that is formally tied to the code (which we don't know how to do just yet).

[1]: E.g. trace-checking: https://arxiv.org/pdf/1111.2825.pdf


> Good specifications are often written at a level that generating executable code from them is often undecidable and almost always intractable in practice.

What about modern dependently-typed meta languages like Idris and program synthesizers like Nadia Polikarpova's work on Synquid?

Type-Driven Program Synthesis, by Nadia Polikarpova (StrangeLoop 2018) [video] https://www.youtube.com/watch?v=HnOix9TFy1A

MIT Comcom: Synquid program synthesizer, and more command-line tools for the web

http://comcom.csail.mit.edu/comcom/

Idris Software Foundations https://github.com/idris-hackers/software-foundations


We don't know how to scale those [1]. It's very easy to specify a simple sorting function in Idris/Java with JML and prove it correct. It's a whole other matter to specify the global properties of an entire system, and verify them all the way down to the small bits of code. Various code synthesis tools are even more limited, as they tend to rely heavily on fragile automated reasoning tools like SMT solvers, although both are orders of magnitude away from being able to handle real applications that I think that comparing them to one another is a bit beside the point.

The biggest programs ever written and verified with deductive proofs -- be it dependently typed languages or any verification based on deduction, correspond at most to ~10,000 LOC of C, taken years of work by academic experts, and had to be significantly simplified to make proofs easier (i.e., they had to use only the simplest data structures, etc.). In contrast, business applications are 2-3 orders of magnitude bigger.

[1]: And when I say "we don't know" I mean that it is possible that some time in the next 2-5 decades we may be able to do something, but it's also possible that there are fundamental complexity limitations that mean that this particular path will never lead to something similar to what we may imagine it could.


I've been thinking a bit about model driven development. I think what you say is true, but at a bare minimum the specification should be machine readable.

Take the IRC RFC, for instance. A lot of it is amenable to code generation, but machine-parsing the spec is so hard (I know.. I tried, many moons ago). I feel like a big chunk of that spec could be written in such way that a big part of an IRC client or server could be generated. For other parts, a generator could at least create boilerplate to be filled by a human and maybe some automatic tests.


> but at a bare minimum the specification should be machine readable.

I don't think it's the bare minimum, as an informal specification is far better than none at all, but I agree that a formal specification is better -- at least in the more complicated cases -- as it forces precision and also allows the use of tools that could check it in various ways.


> Good specifications are often written at a level that generating executable code from them is often undecidable and almost always intractable in practice [...] (which we don't know how to do just yet).

Sure, but I'm talking about this article about decision tables. We don't have to spec the entire program in an executable way. We can just take part of the logic, encode it as a decision table, and declare that this is the spec. Then we read it in to the program and execute it. Nothing intractable about it! (Or if there is, then no self-consistent spec is possible in the first place.)


Encoding a decision table is the boring, easy, part, even if you do it in assembly. But that's not what the post is about. It's about the hard part, which is figuring out what the decisions should be. In many, many cases (most cases I would say), writing the code is the easiest, least interesting part of developing software.


But I totally agree! We shouldn't even write code for a decision table, because it's generic code which doesn't need to be adapted at all. Just use a decision table implementation off the shelf.

But I don't agree that encoding the decision table is boring and easy -- if the business logic is complex enough to require a spec. The good news is that now the table itself is all we need -- no self-written code, and no separate spec.


I once had an argument about that. A very smart colleague of mine claimed to have specified a very complete natural language tense system in Mercury, sort of hard-coded into the language if I understood him correctly. He claimed that it was way better than what general semanticists had come up with and that the use of logic in semantics is totally superfluous.

My point was that it doesn't matter how good his system is, as long as he doesn't separate declaration from implementation, it will be of almost no use. Nobody will be able to run his Mercury program in 100 years from now. Then again, it's not possible to run my Higher Order Logic with Henkin models and Categorial Grammar out of the box either.

So I'm still not sure who had the better argument. Maybe if researchers could agree on a common programming language, we could indeed dispense with descriptive formalisms?


>Coding is important, but specifying is usually much more so

Lacking code, however, usually leads to important oversights in the specifications. Internet routers had to have a "compatible with Cisco" option because Cisco engineers read a specification, wrote an implementation, had no reference code to test against, and ended up with a widely deployed broken protocol incompatible with the correct version.

Code is critical for specifications because it doesn't suffer the flaws of human interpretation and assumptions. If it does disagree with human interpretations of the specification then it quickly becomes obvious and either the code or specification can be fixed in a revision.


> Code is critical for specifications because it doesn't suffer the flaws of human interpretation and assumptions.

Code is a formal specification at a very particular level. But you can have formal specifications of any level you like that is just as precise and mechanical as code (and usually much clearer).


Programming languages that support pattern matching support decision tables. This is an OCaml port of the FizzBuzz example (fixing a mistake where the result of the modulo is compared with T and F):

    match n mod 3 = 0, n mod 5 = 0 with
    | true, true -> "FizzBuzz"
    | true, false -> "Fizz"
    | false, true -> "Buzz"
    | false, false -> string_of_int n
This a "cleaner" version that takes advantage of wildcards (which appear in the article, but not in the FizzBuzz):

    match n mod 3, n mod 5 with
    | 0, 0 -> "FizzBuzz"
    | 0, _ -> "Fizz"
    | _, 0 -> "Buzz"
    | _, _ -> string_of_int n
Note that the article states, "two rows can’t overlap in what inputs they cover," but in case analyses expressions of PLs with pattern matching, two rows can share a non-empty set of common matches and the higher row will take precedence.


I guess you could do it in anything, even if it doesn't support pattern matching:

(I apologise for this Python abomination):

    print([{ (True, True): 'Fizzbuzz', (True, False): 'Fizz', (False, True): 'Buzz', (False, False): i }[i % 3 == 0, i % 5 == 0] for i in range(1, 101)])


If you're not aware, indenting your code marks it as preformatted, and so won't wrap in HN. Code which is preformatted and not wrapped is really hard to read on mobile, and even on desktop the end of your code is cut off in a scroll box.

It's almost always easier to read if you don't indent it, though HN should probably fix their style sheets so it doesn't matter.

If I'm quoting something (another reason people often indent their comments) I will normally wrap the whole thing in * so that it is italicised.

I don't know a good way to make code stand out, maybe someone else has an idea.


> If I'm quoting something (another reason people often indent their comments) I will normally wrap the whole thing in * so that it is italicised.

Personally I prefer putting a greater than sign in front of each paragraph I am quoting / quoting from.

It’s what we do in e-mail and on most imageboards and in markdown. It’s easy to read and easy to understand that it’s a quote.


The rust version with wildcards looks very similar. Neat. https://play.rust-lang.org/?gist=69e15f3bf9a18359de10e657c4f...


Same thing in swift: https://iswift.org/playground?fOI8XQ&v=3

    for i in 0...100 {
        switch (i % 3, i % 5) {
            case (0, 0): print("FizzBuzz")
            case (0, _): print("Fizz")
            case (_, 0): print("Buzz")
            case (_, _): print("\(i)")
        }
    }


Elixir:

  def fizzbuzz(x) when is_integer(x) do
    case {rem(x, 3), rem(x, 5)} do
      {0, 0} -> "FizzBuzz"
      {0, _} -> "Fizz"
      {_, 0} -> "Buzz"
      {_, _} -> Integer.to_string(x)
    end
  end


Yes, but it would be nicer if the table was stored as a table -- as data -- not as code.


Why? It'd be a mapping from "patterns" (a concept that exists only within the language's compiler/interpreter) to executable closures. How is that any different from the AST of the match-expression in the GP's comment? And what would you do with that mapping, besides using it as an AST to generate code, perhaps after adding/removing some match clauses?

In this case, the most optimal representation of the data is the code.


I would go with a "compromise"; while decision tables should be "code", I think it could be nice if there was some syntatic sugar for making these tables actually tabular. Of course semantically it would be very much the same as a pattern match.


> what would you do with that mapping

Well, I could write another tool to analyse it, eg extract some statistics; I could write another tool to generate it, eg based on data.


Why can't you do either of those things with the AST of the match-expression? And what stops you from leaving the code as code, and extracting that AST by calling into the compiler?

(Maybe I'm being unfair here. I write mostly Elixir, and it's really easy in Elixir—one or two lines—to extract the AST of the body of a function defined in a given file and then work with it. It's just as easy to hold data canonically in its AST representation, and then both analyze it, and generate code from it, at runtime.)


I really liked this article. I feel about the same about this as when I first understood state machines.

Sure, we could write a loop with an if-else-if-else block inside to implement a state machine, but once we understand the formalism properly it might be better to use a generic piece of code to run the state machine, and a data structure mapping (state, input) pairs to actions/outputs.

Similarly, we could write a big if-else-if-else to implement our business logic, but once we recognise that we have a decision table, we are better using a generic piece of code to implement the decision table idea, and storing our logic in a table.

In both cases we are separating logic from data.

Also relevant to decision tables, since they're far more efficient for some situations, is decision trees. Many people are familiar with decision trees as a machine learning method, or as a method for analysing a decision (utility, probabilities, etc.), but they can also be used to express the same logic as a decision table as in this article. Again, a generic piece of code can execute the decision tree, and the tree itself can be stored as data.


    understood state machines.
Of course decision tables and (finite) state machines are essentially the same: each FSM induces a decision table (indexed by current state and current action) and vice versa.


Many things become much simpler once you switch from procedural to declarative programming. The downside is that in many cases you'll need to write an interpreter for the declarations and very often it's more complex than it appears. Of course, if there's a known design of the interpreter (state machine, etc.) things become much simpler.


See also: Racket's 2d syntax library:

  #lang 2d racket
  (require 2d/match)
   
  (define (subtype? a b)
    #2dmatch
    ╔══════════╦══════════╦═══════╦══════════╗
    ║   a  b   ║ 'Integer ║ 'Real ║ 'Complex ║
    ╠══════════╬══════════╩═══════╩══════════╣
    ║ 'Integer ║             #t              ║
    ╠══════════╬══════════╗                  ║
    ║ 'Real    ║          ║                  ║
    ╠══════════╣          ╚═══════╗          ║
    ║ 'Complex ║        #f        ║          ║
    ╚══════════╩══════════════════╩══════════╝)
https://docs.racket-lang.org/2d/index.html


That's pretty cool! I've never seen a programming language that accepts ASCII art as valid code.


there are a few... asciidots for instance. the racket library above actually has (basic) editor support in dr racket though


That's why I like Erlang, it allows for multiple function heads and pattern matching. So these kind of tables are very easy to represent without too much noise:

    f(t, t, t) -> 1;
    f(t, t, f) -> 3;
    f(t, f, t) -> 7;
    f(t, f, f) -> "cucumber";
    f(f, _, _) -> null.
Pretty close to the table, no?

The other example:

    fizzbuzz(N) ->
        case {N rem 3, N rem 5} of
            {0, 0} -> "FizzBuzz";
            {0, _} -> "Fizz";
            {_, 0} -> "Buzz";
            {_, _} -> N
       end.
Here I used a case statement, but it also matches the shape pretty well.

That's valid code, not pseudocode, you can compile it in a module an run it.

Notice how multiple function clauses separated by ; allow much nicer code patches. If a function wasn't working well, instead of adding an if statement inside, you can add another function clause instead.


I personally prefer Elixir but I have to say that the more I use Erlang libraries (like for state machines, trees and graphs) the more I grow fond of it and the more I view both languages as what they really are: a brother and a sister. ^_^


Most, if not all, functional languages support pattern matching: Haskell, OCaml, Clojure, Erlang/Elixir, Purescript, Idris, Elm, etc.


Sure. I just like Erlang in particular and thought it approximated the table format pretty well.


I recommend checking out the material about Subtext[1] for elaborations on tabular programming. Often in practical systems an enumeration of finite state is a tool to "squash" very large decision tables into a more comprehensible pattern. But having the decision table aids in discovery of just how many states you have to deal with.

[1] http://www.subtext-lang.org/


Yup! There are many versions of Subtext. The most relevant to decision tables is perhaps schematic tables: https://vimeo.com/140738254

"a cross between decision tables and data flow graphs"


There is a whole category of systems (rule engines, BRMS) dedicated executing and more importantly managing predicate logic in a way that business user's can understand and maintain. Just one example from the Java space:

https://www.drools.org/


This is a common technique in designing tests of a system; if you can make a table of inputs and expected outputs, you have a roadmap on what to test.

I had a textbook that had a whole chapter about this, I will find the title and edit this post.

EDIT: A Practitioner's Guide to Software Test Design, Lee Copeland, Artech House (January 2004), ISBN-13: 978-1580537919.


Yes. I've been arguing for some time that "smart contracts" should be written as decision tables. Not in some byte-coded language. Decision tables map well to contracts. They can be understood by managers and lawyers. (This may be a negative for some scammers in the cryptocurrency world.)


These are a great way to create shared understanding with your non-tech business partners when requirements get complicated. My team used these a lot in an e-commerce context where checkout had quirky discount rules. e.g. give 10% off if the user has a product from X category in their cart, but not if the product is in Y subcategory or Z blacklist. And it's additive with other promotions of type A, but greatest-wins with promotions of type B.

Truth tables can eliminate a lot of ambiguity compared to a wall of text or dozens of bullet-point requirements.


We use decision tables in Domain Specific Languages that we develop for several sectors. In this paper you can see how we use them in the medical sectors to represent some medical decisions http://voelter.de/data/pub/MPS-in-Safety-1.0.pdf (see the Appendix for screenshots)


Ahh this brings back memories. One of my first projects at a large investment bank was working on a tax engine where the first step was transaction classification using a decision table with dozens of columns and hundreds of rows.

The best part? It was loaded into a running service by parsing a complex Excel spreadsheet which lead to so much fun with debugging.


Traditionally, n%m is true if it's not 0. You might want to add (n%m)==0 in the table.


This is probably very difficult, but wouldn’t it be interesting to write a program that generates the minimal function that matches a given decision table? I imagine with current techniques this would only be possible for very simple functions, but the programming workflow if you had such a tool would be neat. Instead of writing a function and tests, you simply write the test, and grow it until the function looks how you want it to.


For strictly Boolean decision tables (that is to say, truth tables[1]) this is already a solved problem, through a method known as Karnaugh mapping[2]. It can transform any arbitrary truth table into a Boolean algebra expression, which you can then simplify using any Computer Algebra System you'd like.

You could probably expand this method to other forms.

[1]: https://www.wikiwand.com/en/Truth_table [2]: https://www.wikiwand.com/en/Karnaugh_map


Decision tables are like an extension of truth tables [1] then, with the realization that instead of mapping the inputs to boolean outputs of true/false, you can map each output to any value. That’s clever, I like it!

[1]: https://en.wikipedia.org/wiki/Truth_table


A few years ago I wrote a Ruby example if anyone wants to try it: https://github.com/aw/ruby-decision-table


Nice explanation, but the lesson I learned from it is "if you make a mess of your system, you might need a messy system to fix it again". Which might not be what the author intended, but does jive with experiences I've had.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: