Joe Armstrong: Why OO Sucks (cat-v.org)
254 points by it on July 15, 2012 | 256 comments



OO vs. FP is just a matter of whether you focus on the nouns or the verbs. The counterargument to the OP is that surely a function that manipulates "data" is less powerful and abstract than one that manipulates objects.

For example, take an "interface" or abstract data type like Array, consisting of a length() and a get(i) method. (This is really called List in Java and Seq in Scala.) There may even be an associated type, A, such that all items are of type A. This is very powerful because functions written against the Array interface don't depend on the implementation; we can store the data in different ways, calculate it on demand, etc.

The "binding together" Joe is complaining about is binding the implementation of length() and get(i) to the implementation of the data structure, which is surely understandable. The alternative, seen in Lisps and other "verb-oriented" languages, is that there is a global function called "length" which takes an object... er, a value... and desperately tries to figure out how to measure its length properly, perhaps with a giant conditional.

The original OO (Smalltalk) was about message passing rather than abstract data types; just the idea that an object was responsible for responding to certain messages, and that these communication patterns completely characterized the object. This is how we think about modern cloud services, too; it's kind of inevitable. Who would complain that S3's "functions" and "data" are too coupled? Who would ask for a description of S3 in terms of what sequence of calls to car and cdr it makes internally? OO concepts allow a functional description of a system that starts at the top and can stop at any point.

The "everything is an object" philosophy gets a bad rap. It's a big pain in Java, especially, because of how the type system works. Ideally I'd be able to define a type of ints between 1 and 3, an obvious subclass of ints in general, whereas in Java I find myself declaring "class [or enum] IntBetweenOneAndThree" or some nonsense.


You know I think this misses something.

OO vs. FP is just a matter of whether you focus on the nouns or the verbs.

I am working on a gigantic OO system right now. And the OP is correct. OO sucks. OO is about binding together a bunch of crap and getting it to be slightly, only slightly, less crappy. But it can do that, and for that I am grateful.

FP is about constructing something that is completely elegant from the start. If you can do that, it's great and you will be doing far, far better than OO. Not comparable.

The problem is that so far few have been able to construct elegant, uh, cathedrals. And when you've already got a huge, sinking mess, you can't use FP to fix it. Not comparable again.

Neither is better or worse. But they're wildly different. Now, if someone figures out how to not write messes, FP is simply a win. Of course, it happens that I wrote my mess from the ground up, so I'm pessimistic about the mess-avoiding thing. But hey, it might work.

But I think it is important to say that the two philosophies aren't comparable.


The key word there is gigantic, not OO.

All gigantic systems are crappy, no matter what their underlying language/paradigm is. Slightly less crappy is a win.


The problem is that OO thinking tends to inflate systems, spreading code all over the place even though it logically belongs in one place and adding object wrappers to things that don't need it. In my experience taking over Python code written by Java developers, I can usually shrink their OO code and make it more reliable by refactoring it into conceptually equivalent functional code wherever it makes sense and falling back on procedural style where appropriate.


In my experience I'd guess you aren't dealing with a deficiency of OO; after all, Python is an OO language.

I'd bet you are dealing with over-engineering, which is a cultural issue within the Java/J2EE community. And perhaps a lack of closures (I prefer those over list comprehensions, since they are more general), which makes Java needlessly verbose.


Languages are not inherently OO or FP, but they support OO or FP style programming. Python supports procedural programming very well, you'll see lots of "def" and no "class". If you argue that the integers and strings manipulated by a procedural Python program are called "objects" and therefore it is still OO, I shall point you to the C standard which indicates that the integers and strings in a C program are also called "objects".

You can do procedural programming in Java, but you'll have to make all of the functions methods on some dummy class. This is cumbersome, which is the real complaint here. The "everything is a class" mentality is both an issue with the language and an issue with the community, but we tolerate it because they still make useful programs.

Everyone needs a little "re-education" or assimilation in order to switch languages and not write puke-tastic code in languages you don't use every week. A seasoned Java programmer will likely have no trouble writing correct Python code, but you have to wait for a few dozen sleep cycles before the programmer will write idiomatic Python code.


I agree, but I do think that java, in particular, suffers from two distinct problems:

1) A lack of closures, which, as you point out, turns every obviously functional problem into a ridiculous object model.

2) A culture that creates libraries that suffer from over-abstraction, over-engineering and that tend to model a technical aspect of a problem rather than what a non-expert end user of the library would find intuitive.


> Languages are not inherently OO or FP, but they support OO or FP style programming.

I couldn't disagree more. Languages are as they are designed to be. Erlang is FP and Java is OO by design.


If you try to write FP in Python, you're in trouble. (Python has been my main language for several years.)

It does not have "mandatory OOP", but it remains mainly an imperative OO language with some FP goodness. C# is the same; Java is not.


If you try to do your entire program as pure FP then I guess that is true, but it's often possible and beneficial to do a lot of the work in a functional style.

Honestly though, this misses the point. If your language forces you to use an unsuitable paradigm, it's time to use another language if you can.


The solution to gigantic systems is APIs, whether services or objects. Using FP you end up emulating objects, just at a different scale.


"The counterargument to the OP is that surely a function that manipulates "data" is less powerful and abstract than one that manipulates objects."

I don't see how that follows at all. Objects are glorified types. They're a bag of state plus a collection of functions which take that state as an implicit parameter. Polymorphism is "just" a form of delegation, which itself does not require you to glue together data and functionality.

If this were Haskell, you might define a typeclass for a List datatype.

    class List f where
      length :: f a -> Int   -- shadows Prelude.length in this sketch
      get    :: f a -> Int -> a
      ...
Then implementors for Heap and Array would do something like this in their respective packages

    instance List Heap where
      length x = ...
      get x i  = ...

    instance List Array where
      length x = ...
      get x i  = ...
And so on. Functions which want a List add a type restriction:

     addListLengths :: List f => [f a] -> Int
     addListLengths []     = 0
     addListLengths (x:xs) = length x + addListLengths xs
addListLengths doesn't care whether each element is a Heap or an Array, and calling length on a List will do the right thing.

I suspect there are cultural factors at work more than anything else, which does get back to what you say initially: do you prefer to live in the Kingdom of Nouns, or not?

In practice, I don't think there's a List typeclass in Haskell. I suspect people generally just use a bog-standard list. :) If you want a special implementation like a heap, you go find one and use it. I suppose this is one example of a culture of "explicit is better than implicit."

Myself, I write in Java regularly and I just don't generally see a whole lot of value in the List abstraction over ArrayList or LinkedList or whatever. I suppose the reader can see List and take it as a given that it'll behave in some way. That's something.

On the other hand, I suspect the programmers don't think much about a List, either. It's a bit worrisome to contemplate the idea that programmers not only don't know what object they're really dealing with, but shouldn't know. I highly recommend perusing "Building Memory-efficient Java Applications: Practices and Challenges"[1], which really gets into this stuff at a technical level. I don't know the extent to which this happens in FP-land; I'm mainly commenting on Java culture as I have observed it.

[1] http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/mem...


> In practice, I don't think there's a List typeclass in Haskell.

I'm a Haskell n00b, but there is Functor, Foldable and friends, right? That's basically what you're talking about, although the methods are generalized map, fold and so on instead of get and length. It seems like this just supports your point: there's nothing making functions operating on data in an FP language less abstract than objects and methods.
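
For instance, a minimal sketch of one generic length working across structures (assuming Data.Foldable from base and Data.Sequence from the containers package; genLength is my own name):

    import Data.Foldable (Foldable, toList)
    import qualified Data.Sequence as Seq

    -- one generic length, written via toList so it works for any Foldable
    genLength :: Foldable t => t a -> Int
    genLength = length . toList

    main :: IO ()
    main = do
      print (genLength [1, 2, 3])               -- plain list: 3
      print (genLength (Seq.fromList "abcd"))   -- Seq: 4
      print (genLength (Just 'x'))              -- Maybe: 1
The same function covers any structure that can enumerate its elements, without those structures sharing an implementation.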


> OO vs. FP is just a matter of whether you focus on the nouns or the verbs.

Indeed: http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom...


In Haskell I guess it would look like this. I'm still new to Haskell, so hopefully someone can improve it?

    data M = M1 | M2 | M3 deriving (Show, Eq, Ord)

    fromM :: M -> Integer
    fromM M1 = 1
    fromM M2 = 2
    fromM M3 = 3

    instance Num M where
      a + b = fromInteger (fromM a + fromM b)
      a * b = fromInteger (fromM a * fromM b)
      a - b = fromInteger (fromM a - fromM b)
      abs = id        -- all three values are positive already
      signum _ = M1
      fromInteger n = case n `mod` 3 of
                        0 -> M3
                        1 -> M1
                        _ -> M2
Is this what you're looking for? :)


A more generic version allows you to parameterize based on the bound:

    newtype BoundedInt b = BoundedInt Int

    class Bound b where
      boundRange :: t b -> (Int, Int)

    fromBounded :: BoundedInt b -> Int
    fromBounded (BoundedInt x) = x

    toBounded :: Bound b => Int -> BoundedInt b
    toBounded x =
      let result = BoundedInt x
          (minb, maxb) = boundRange result
      in if minb <= x && x <= maxb
         then result
         else error $ "Out of range: " ++ show x

    instance Bound b => Num (BoundedInt b) where
      abs = toBounded . abs . fromBounded
      negate = toBounded . negate . fromBounded
      signum = toBounded . signum . fromBounded
      x + y = toBounded (fromBounded x + fromBounded y)
      x - y = toBounded (fromBounded x - fromBounded y)
      x * y = toBounded (fromBounded x * fromBounded y)
      fromInteger = toBounded . fromInteger
This assumes you want exceptions for overflow. Nowadays, you can put numbers in the type system, obviating the need for a "Bound" class, but I'm not familiar with it yet. The implementation above will also silently overflow given large enough bounds.
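
From what I've seen, the type-level version looks roughly like this with GHC.TypeLits (untested sketch, assuming a recent GHC; the Ranged/mkRanged names are invented):

    {-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}
    import GHC.TypeLits (Nat, KnownNat, natVal)
    import Data.Proxy (Proxy (..))

    -- the bounds live in the type itself; no Bound class needed
    newtype Ranged (lo :: Nat) (hi :: Nat) = Ranged Int deriving Show

    mkRanged :: forall lo hi. (KnownNat lo, KnownNat hi)
             => Int -> Maybe (Ranged lo hi)
    mkRanged x
      | lo' <= toInteger x && toInteger x <= hi' = Just (Ranged x)
      | otherwise                                = Nothing
      where lo' = natVal (Proxy :: Proxy lo)
            hi' = natVal (Proxy :: Proxy hi)
So mkRanged 2 :: Maybe (Ranged 1 3) succeeds, while mkRanged 7 :: Maybe (Ranged 1 3) gives Nothing, with the bounds checked against type-level numbers.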


I'd ask, I suppose, why he wants ints from 1 to 3 in the first place. Why are these semantically important, and what's the intended meaning?

If you just need a set of numbers which only has those three, a helper function with a modulus seems easier. Or an infinite list of [1,2,3]. But it's not clear to me what the use case is, so...


Yes, a Lisp-like language could use a giant conditional for a generic "length" function. But that's not how the source code usually looks, and there are Lisp compilers which optimize it so it's not done as a conditional.

In CLOS, it will look like a separate DEFMETHOD for each type you want to define the function on, and these can be in separate modules. DEFMETHOD will just extend the generic function. And the function itself can be optimized with the normal tricks -- things kind of like vtables -- to keep performance up.

The S3 example is a bit facile since (1) S3 objects are like mud, they're all the same and (2) they map cleanly to the OO paradigm since they encapsulate discrete chunks of external state. At the other end we can throw around examples like BLAS and LAPACK, whose functions are much closer to the "functional ideal" (ignoring mutability, here). For example, if I want to solve a linear equation, how do I express that as a single dispatch method? Does it go on the matrix, or on the vector? Or do I have to create a new object from the two called LinearSystem, just so I can invoke a single method?

We think about things in terms of coupled data, as pure functions, and as dirty generic functions (like the Lisp example). The only real lesson is that people get pissed off at languages that force them to shoehorn everything into one category (like Java).


A series of assertions and "I just don't see it"-s presented as self-evident when they are anything but. No examples of real cases where OO does in fact "suck", ending with the claim that in order to understand the popularity of OO one should "follow the money".

What? I mean, I don't even...


Indeed, it's a very odd piece. No arguments except opinions ("state is bad", "functions and data shouldn't mix" etc.). I fail to see the underlying reasoning for any of his gripes. It just comes off as a frustrated computer scientist / mathematician complaining that his theoretically correct way of making machines do his bidding is not loved by all.



I'd take anything I read at cat-v.org with a grain of salt; they seem to appreciate the art of trolling: http://gofy.cat-v.org


The second link is to gibberish from reddit. Not sure if it's a joke or if it represents the opinions of cat-v.

But the first link is on the money. I would not call it "odd" as that seems to carry a negative connotation. Rather, it's sensible, albeit irreverent, thinking to question dynamic linking (as the Plan 9 people have) and alas the mainstream of software development has a difficult time thinking sensibly and prefers to "follow the herd".

Dynamic linking is anything but a clear "win", to use the silly web lingo of today. It's one of those many engineering trade-offs we continue to live with even though the original rationale for its adoption no longer exists.

Here's the problem as I see it.

You link to a library of n functions where, e.g., n > 10. But your program only uses 1 or 2 of those functions.

There is no accounting method for keeping track of which functions each of those "4000" different binaries uses. How easy is it to tell me what functions each of those "4000" binaries uses, and which libraries they reside in? I have to resort to binutils hackery to extract this info, but it seems like basic information that should be easily available... because it should be used in making decisions.

Why link to a library with, e.g., > 10 functions when your program only uses, e.g., 2 of them?

How many functions in those libraries that your program links to are not used by your program?

No big deal you say. And that's true. Because of dynamic linking.

Dynamic linking to some degree makes us disregard this wasteful "black box" approach to use of library functions.

But what if we started static linking? Then maybe we start to think more about those functions that are in the linked library but are not used. We might even question the whole idea of libraries.

What if we took an accounting of all the unused functions? How much space would they account for?

Why package functions together in libraries? Whatever the original reasons were for adopting this practice, do they still exist today?

How many libraries do you have installed on your system that are only used by one or a few programs? Is there a threshold for how many programs need to use a group of functions before it justifies creating a library to be shared?

If we really want to have functions that are to be shared among many disparate programs, and we have "4000" programs (cf. the number of programs on a system in the 1980's), then it makes less sense to place them in arbitrary libraries (how many people know the location of every C function purely from intuition about library names?) that have to be linked to as a unit. We perpetuate a black box. It makes more sense to have each function available on its own, so each program can select only those functions that it needs.

This is if we were static linking.

I do a lot of static linking because I move programs from system to system. It works very well.

Another thing that I sometimes think about is the sticky bit. We try to achieve the same effect with caching. You call a program for the first time, and it takes some time to load. You call it a second time and hopefully it's all in the cache, so it is much more "responsive". But what if it cannot fit in the cache? What if another program displaces it? Why can't we consciously keep a program in a "cache"? We give control to the OS, and we hope everything works as intended.

Dynamic linking reminds me very much of package management systems, particularly those that build from source. It's extremely difficult to tell what a particular install procedure consists of. You type "make" and what happens next is in many respects a black box. Only so much information can be reliably extracted from the system. For example, if you wanted to know all the possible Makefile variables in the package management system, it's virtually impossible to get a list. You can get most, but not all.

The way these systems work is a lot like shared libraries. The maintainers err on the side of overinclusion of dependencies, some of which may not actually be needed, in order to keep the black box working reliably.


Well, I'd distinguish your "maybe there's a better way" from the article's

> All the purported benefits of dynamic linking [..] are myths while it creates great (and often ignored) problems.

which is tripe.

In my view, the fundamental problem with static linking as we currently know it is that when libraries release bug fixes-- especially security fixes (which I think are the real clincher), but really any of the minor bug fixes that happen all the time in more complex libraries-- you almost certainly want all the programs on the system that use the library to switch to the new code, and by the nature of libraries, there are probably a lot of them. In some cases you could rely on a distro / traditional package manager to provide new binaries for everything, but that would be a lot of wasted bandwidth and even if you don't use proprietary software, you probably want to be able to use some software from outside the distro. So you really need an automatic re-linker of some sort and probably a new binary format that can be re-linked, and some infrastructure to keep track of what binaries exist on the system and what they depend on. Plan 9 never had that, and at that point I think you're solving more fundamental problems than static vs. dynamic linking (which is a good thing) and should take a step back and see what you can do with it, but who knows-- it might be interesting to talk about, but you have to come up with it first. :)

General comments on your post: I think the boundary of a library is not quite arbitrary, because

- libraries tend to be developed independently by different people! It might be nice to develop things in a more unified fashion, but in general I don't think you can avoid people having specific (functional) interests and areas of expertise, and it's nice to have a unit of code that someone can "own".

- random interdependencies tend to be a bad thing; ask Google. Organization is good.

- many libraries have the job of parsing file formats or doing other things where the selection of which functions to invoke generally comes from user-supplied data-- you can't ask for half of FreeType or, dare I say it, WebKit; you need to be able to parse whatever the user throws at you, so it's all or nothing.

Which is not to say there couldn't be improvements.


"So you really need an automatic re-linker of some sort and probably a new binary format that can be re-linked, and some infrastructure to keep track of what binaries exist on the system and what they depend on. Plan 9 never had that."

Plan 9 has all of those things. Namely 7l† and mk††.

† http://plan9.bell-labs.com/magic/man2html/1/2l

†† http://plan9.bell-labs.com/magic/man2html/1/mk


Dynamic linking seems to be a rather straightforward technical argument. I'm not sure what to tell you if you can't or won't parse it.


I've had a conversation about it on HN in the past:

http://news.ycombinator.com/item?id=4112517

I wouldn't mind discussing it again.


when zlib had that double-free bug a while back, how many programs had to be updated because of static linking?

I think the problem is that in most of these cases, people only count one side of the ledger. On the whole, I think static linking causes more security issues than dynamic linking, but dynamic linking causes some other problems that are less well accounted for.


A security hole in a library is "automatically distributed" to every program dynamically linked against it. Yes, the coin has two sides.


I am not saying that dynamic linking is perfect. However, a lot of it comes down to how manageable security is. How many programs do I have to update? How sure am I as the sysadmin that I got them all? This is easier with dynamic linking than with something ubiquitous but often statically linked, like zlib.

Yeah there is a tradeoff. I am not saying there is no downside. I am just saying security-wise, I prefer a single-point-of-correction to a case where I may not know where the weakest link is.


“Data structure and functions should not be bound together” — I can't agree with you more.

However, in Smalltalk (and even Ruby, to some degree) objects are not data structures, they are collections of functions invokable on a 'thing' with an unknown structure. They have an internal structure—potentially immutable—but you never see this, because you only interact with methods on the object.

And in many cases, there is syntactic sugar to make invocation of these methods look like slot access: think of Objective-C’s `@property`, or Python's descriptors, or Ruby's `def method=(value)`.

When people talk about 'object-oriented languages' in such general terms I get frustrated, because there's a lot more nuance to this than simply 'bundles of functions and data structures'. That's a very implementation-led way of looking at it. The reality is that these objects are supposed to represent real-life situations where what something is, or how it behaves, or how it fulfills its contracts is unknown. If NASA's Remote Agent[1] had been implemented in Haskell, OCaml or ML, do you think debugging DS1 would have been as simple as connecting to a REPL and changing some definitions in a living core? I don't think the image-based persistence of Smalltalk and many Lisps would be possible in a purely functional or traditional procedural language.

And what is a data type anyway? It's supposed to represent a mathematical set of possible values. Sure, you can use a simple array to build a b-tree, but don't you want to explicitly state that variable x is a b-tree if that's the case? I was always taught that explicit is better than implicit.

I should probably stop ranting now, it's just that if you’re going to start hating on programming paradigms, at least sound like you've thought your argument through a bit more.

[1]: http://www.flownet.com/gat/jpl-lisp.html


"If NASA's Remote Agent[1] had been implemented in Haskell, OCaml or ML, do you think debugging DS1 would have been as simple as connecting to a REPL and changing some definitions in a living core?"

Erlang in particular is known for rock-solid support for this, and Armstrong never advocated for pure functional languages.

"Sure, you can use a simple array to build a b-tree, but don't you want to explicitly state that variable x is a b-tree if that's the case?"

You can do this in Erlang with atoms.


I'm not exactly sure what you mean here. Last I programmed Ruby, objects had member variables.


The point is, they're all private, all the time.

Let's overlook the fact that ruby allows you to ignore this using stuff like Object#instance_variable_get ;)

Ruby is a strange one because it has a fundamentalist approach to OO, yet it is shot through with FP ideas. Most ruby programmers use the latter heavily. Also, ruby's love of syntactic sugar makes generalisations hard to apply to it, for example the aforementioned all-private member variables combined with the single-line getter/setter macros, and the method invocation sugar. The ruby programmer is often able to have their cake and eat it.


99% of the time I read these articles that say $commonly_used_thing [1] sucks, the arguments are always "it is fundamentally incorrect" or some variant thereof [2], and strawmen [3] abound.

Where these arguments fall short is in addressing the simple fact that highly skilled people produce very neat, well-designed systems that they are pretty happy with from a technical standpoint, and that make money every single day using $commonly_used_thing. If you can't acknowledge that $commonly_used_thing has some good attributes, and that it actually works well for many cases, I don't understand why I should take you seriously.

[1]: Examples of commonly_used_thing: ORMs, OOP, SQL databases, NoSQL databases, operating systems, platforms.

[2]: There are a handful of variants. I think my favorite is the magical phrase "impedance mismatch", which I think in non-buzzword-speak translates to "fuck you, I'm right"

[3]: Most-frequent strawman: the most essentialist, rigidly-formal version of $commonly_used_thing, when in reality, nearly every version of $commonly_used_thing compromises to cope with reality.


Every time someone says $commonly_used_thing sucks, people come out and point out that $commonly_used_thing is being used for $productive_activity.

It is possible to do amazing work with broken tools, or with the wrong tools. That doesn't mean these tools aren't broken or that they couldn't be better matched to the job.

Not that I think OOP is inherently evil. Reading the article, I don't think the author quite understands OOP. One example:

"In an OOPL I have to choose some base object in which I will define the ubiquitous data structure, all other objects that want to use this data structure must inherit this object."

If he really believes this, no wonder he's railing against OOP. That would be horribly broken.


Reading the article, I don't think the author quite understands OOP.

The original author is Joe Armstrong (http://www.sics.se/~joe/), the creator of Erlang.


Your point? Using inheritance the way he's suggesting is a rookie mistake.

Either that, or I'm completely misunderstanding him. He can't be talking about hating composition, though, can he? He uses it in his own Erlang examples!


I don't think Joe ever claimed to be a great programmer. Erlang looks as odd as it does because it was hacked up in Prolog, which few compiler engineers would choose as a starting point.

I think Erlang is interesting precisely because it was created by someone with a little distance.


I take it you have never written anything significant in Erlang?

Try writing something in Erlang and I bet you'll see the real power of pattern matching, supervision trees, and the whole message passing infrastructure. If the people who wrote Erlang weren't good programmers, they certainly got lucky.


You seem to have mistaken my comment for a criticism of Erlang.


Armstrong fan here :)

Jokes aside - the "really great programmers", for me, are people who can make extremely complex things very simple.

For me, Armstrong is somewhere near Peter Norvig. Both made me change my point of view, some complex things became simple (and some simple things became not so simple).

"Erlang was hacked up in Prolog" - it's the same as "Java was hacked in C". First implementation of Erlang was written in Prolog, first implementation of Java was written in C.


No; because C is a fine implementation language. It's low-level enough to be efficient without a lot of moving parts, there's very little magic.

For me, Armstrong got interested in the idea of building reliable systems, went off and did a PhD on the topic, and then created a prototype implementation of the concepts developed therein in Prolog - a declarative language essentially built around combinatorial search with pruning. In other words, he's a fine high-level thinker, and a very competent wielder of Prolog; but I believe I've read elsewhere that it was Mike Williams who rewrote the VM interpreter in C, and all the low-level imperative stuff like GC.


I essentially agree with what you're saying, although in most cases, I would argue against the characterization of the tool as "broken". But I think that's semantics.

If this article was titled "Functional programming offers a number of advantages over OOP when dealing with $productive_activity"[1], my response would have been positive.

[1]: It goes without saying that the article would have to live up to the title. Obviously merely re-titling this article in such a way wouldn't be sufficient.


This seems to be the fashion today. You don't write an article saying "In some cases, outlined below, approach Y can be superior to approach X given conditions A, B and C". You write an article saying "X sucks! Here is how stupid it is - if you want to do N with X, you have to write {this horrible code}. Everybody should abandon X immediately, since it is broken beyond repair, and move to Y, where you can do {cool code snippet}. People use X only because they didn't know about Y or are stupid, you now know about Y, so guess what choice you've left?".


Fair enough. In most cases, it's probably more "suboptimal for the task at hand" at worst.

For some reason, anything like this turns into a crazy holy war. At the risk of invoking another holy war:

"For these people, the iPad is unsuitable for content creation for anyone unless it’s suitable for them." -- John Gruber

s/the iPad/\$tool/; s/content creation/\$activity/;


Yes, I also caught this mistake. Inheritance is heavily used in textbooks, but in real-life systems composition is more prevalent.


The "impedance mismatch" is not a magical phrase. It simply refers to the mixing of orthogonal concepts: like trying to fit square pegs into round holes. Sure, you can do it. But do you really want to? You have to admit that the (very old) article/blog who made that term famous saw the issue coming and was right about the current (very sad) state of affair we're in. It was "The vietnam war of software development" or something like that. The problem when trying to fit square pegs into round holes is that you end up --to use another term that's going to become "magical" to you-- with a "complected" system. The "elephant in the room" if you wish. You probably want to watch "Simple made easy" by Rich Hickey, the author of Clojure.


I for one wish every working programmer would watch and understand this presentation.

A link, for the lazy: http://www.infoq.com/presentations/Simple-Made-Easy


Wow, I am genuinely shocked by the comments in this thread. I didn't realise that so many people held the polar opposite view to me. It's a bit like suddenly finding out that all your friends are racist.

I love object oriented programming. For me it aligns perfectly with the way I think - it allows me to produce a system of interrelated 'things' where each thing (or group of things) has a well-defined role and can hide its internal state and behaviour from other things.

When I see how some code tackles a problem I get an emotional response from how 'clean' it is. Does it smell bad or is it a work of beauty and elegance? If the code feels wrong I get an urge to make it better and for me that process of improvement relies heavily on object-oriented concepts. I get a real buzz from creating a clean, elegant solution to a problem: Trying to do that without object-oriented features would be like trying to write a letter by holding the pen with my teeth. Ugh.


So what is the natural and clean design for an operation that represents the sale of a property? Say we have these objects: the buyer, the seller, the agent, the property and the contract. Which one of these would you prefer?

property.sell(buyer, seller, agent, contract)

seller.sell(property, buyer, agent, contract)

buyer.buy(property, seller, agent, contract)

agent.sell(property, buyer, seller, contract)

contract.sign(property, buyer, seller, agent)

The state of all the objects may be modified, and there are different types of properties, buyers, sellers and agents. So you might want polymorphism along any of those hierarchies.


  contract = agent.makeContract(property, seller, buyer)
  seller.signContract(contract)
  buyer.signContract(contract)
  agent.registerSale(contract)
Nothing complicated here, really. Just follow the "natural" flow and respect each actor's role. Don't try to group all operations into a single one when there are independent actors involved.

Note: I see the "contract.sign()" solution coming back a lot. I will admit this is "pure OO", but to me it doesn't make any sense. A contract here is a data object, not an actor. It shouldn't process anything. How many times do you see contracts sign themselves (or even do anything) in reality?


Your "data object" versus "actor" argument is intersting. Does that mean an object model should mimic certain properties of the real world that aren't even part of the system, like the knowledge that agents can act whereas contracts can not?

There are clearly conflicting goals here. It could be that you need polymorphism along one hierarchy, but that would make the design look very unnatural in the eyes of someone who knows the problem domain.

In my experience, OO models tend to diverge greatly from the real world over time, because after all we're not modelling the world, we're modelling a solution to a problem and we shouldn't fight the tendency for the language of the solution domain to dominate the language of the problem domain.

Sometimes this can be mitigated by having interfaces on different levels of abstraction, but that often leads to bloated and slow systems.


Clearly the goal here is not to simulate an ecosystem where agents, buyers and sellers happily live together.

However, the idea here is to organize your object model exactly as you would organize a group of employees. Each of these employees has a specific job and a specific set of responsibilities - ideally only one. You can describe the solution to your problem as the result of their interaction. These employees are actor objects.

To interact efficiently, they need to exchange information, in the form of data objects. These data objects don't do anything except hold a piece of information - exactly as you would have two employees exchanging notes or emails.

Most of the time these data objects correspond to the language of your problem domain. The actor objects, however, may have nothing to do with any "real world" activity, depending on the job you want them to do in the process. So yes, the object model diverges from reality over time, because we introduce new actors with specific roles that have no "real world" equivalent. But that's just fine.

The important thing is to keep seeing the distribution of responsibilities among your objects and their interaction as the work of a team of independent experts, not as "things doing stuff".


In this case, I don't consider signContract part of the seller's or buyer's role in that signing is more closely aligned with a contract than a person. That is, a person does a lot more things - views house, negotiates, etc. If you put all these methods on person you could end up with hundreds of methods on person and violate SRP. Thus, it seems to make sense to me to have

  contract.acceptSignature(seller);
  contract.acceptSignature(buyer);


To a certain extent it may depend on context (eg code for a conveyancing firm would have a different focus to that of a landlord or mortgage company), but a lot of those instances would already be properties of other instances. In particular, since the contract is the object which ties them all together, I would imagine the best syntax would be closer to:

  contract = new Contract(property, seller, buyer, agent)
  contract.sign()
You would then have contract.sign() call property.change_owner(), agent.register_sale() etc. Different classes of properties, buyers, sellers and agents can then implement those methods however they want.

The Contract never needs to know anything about those other classes, so you create the different types of property etc by subclassing the appropriate base class, and customising the functions that the contract calls.

Sure, OO may not be the right choice for every occasion, but it is just another way of doing things - it does not suck, it makes some things easier and some things harder. Granted, in certain languages it makes some things a lot harder - but just avoid those languages. I think a good programmer would be able to see the benefits of OO and decide whether it was the most appropriate choice for their current project, rather than dismissing the entire concept outright based on a poor understanding or bad experience in some languages.


That could be a useful solution (and the one I would probably choose as well), but what if you primarily need polymorphism along the property type hierarchy because the sale of a home is so different from the sale of a mall?

Also, you get the objection that contracts don't sign themselves. I remember very well that in the early 90s, OO models were promoted as a means for business people to talk to software designers. It never worked out that way.

The real world knows processes. Processes are not some appendage of any of the objects involved. So why not model a process as a function?

sell_property(contract, property, buyer, seller, agent)


Like I said, your implementation would depend on context, but to adapt my example, I'd probably put a function on the Property class to determine which Contract class to use - something like:

  contract = property.create_contract(seller, buyer, agent)
  contract.sign()
However, I think from what you're saying that your sell_property() function would need awareness of all property types, so adding a property type not only requires changes to the part of the code which contains the property data structure, but also to the sell_property() function - and any other function throughout your code base which uses properties. That is of course possible, but OO does make it easier to find where to implement logic specific to different property types.


As I said, I'm assuming that the operation depends on the types of _all_ objects involved. So what you actually want is multimethods.

In an OO system you have to simulate multimethods by chaining the calls through all objects, but that obscures the actual functionality of the operation. OO works well if method resolution depends on exactly one type. That's why I chose an example where it's not clear that one type dominates.

In the absence of multimethods I find that 95% of the time a simple if-else construct does the trick just fine. But I do value OO-style single-type dispatch where it really fits. It is just overused.
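
For what it's worth, here's a sketch of genuine two-type dispatch using a multi-parameter type class in Haskell (invented toy types, just to show the shape of it):

    {-# LANGUAGE MultiParamTypeClasses #-}

    data Home = Home
    data Mall = Mall
    data Person  = Person
    data Company = Company

    -- which implementation runs depends on BOTH argument types,
    -- with no artificial "receiver" singled out
    class Sale prop buyer where
      sell :: prop -> buyer -> String

    instance Sale Home Person where
      sell _ _ = "residential conveyance"

    instance Sale Mall Company where
      sell _ _ = "commercial transaction"
Neither the property nor the buyer "owns" the operation; the pair of types selects the implementation.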


I think I see where you're going with this, but it seems like primarily just a semantic improvement. You have a bunch of objects (collections of state), and an action that's going to modify some or all of them. While I agree that namespacing the operation as an independent function instead of arbitrarily attaching it to one object or another is ideal, I don't see how the operation itself has necessarily been simplified. Is it because now instead of modifying the objects/holding state, we can just process and persist whatever needs processing and persistence and be done with it?


The operation itself isn't necessarily simpler. The system as a whole may become easier to understand, more productive to build and less error prone if design decisions can be justified in a rational way.

After all, the classes and functions we create become the language in which we think about the system. If there is no clear reason why a particular operation belongs to one class more than to three others, it should not belong to any class.

Adding an operation to a class has implications like method resolution and dependence on internal state. We make assumptions based on that and those assumptions should turn out to be true.

For instance, it is entirely clear why List.add(Object) is a method in the List class. It is that list and only that list that is modified. It is only the List class hierarchy along which specializations of that method should be searched for. The operation depends on the internal state of the list and not on the internal state of the Object parameter. So all our assumptions hold, and we will be able to remember to which class this add method belongs.

In my property sale example it's not like that. Any and all class hierarchies could conceivably be used for method resolution. The state of any and all objects could be modified.


The usual answer, as always, is to add another layer of abstraction. Enterprise Java:

    @Resource("ejb/PropertySale")
    PropertySaleManager psm;

    psm.sell(property, buyer, seller, agent, contract);
This style of code is so popular in the code I've seen that I almost think EJB3 is just a conspiracy to allow procedural programming in Java, and to keep people thinking they still do OO to satisfy their egos.


Which one is best: cats, alligators or pencils?

Your question is about how best to model a domain of which you've given the scantest of details. Domain modelling is something you have to do in any programming paradigm, and it is equally obtuse if you don't have proper knowledge of the domain.

Come back with a more detailed description of the domain and then we can start comparing the emergent designs from OO and other paradigms.


The point is, if several objects are involved, all of them get mutated and all of them are in a type hierarchy, you're going to have a hard time justifying which object should be the receiver of the message based on what feels more natural.


Without knowing any more about the system I'd suggest: contract.sign(property, buyer, seller, agent)

It seems that properties, buyers, sellers and agents could exist without 'knowing' about contracts, but a contract at some point needs to reference the other entities. Also, it seems odd that the contract can exist without knowing the details of buyer, seller, etc. I'd prefer: new Contract(property, buyer, seller, agent) and if possible make it immutable.


I would tend towards this solution as well, but does it meet the "looks natural" criteria? Contracts that sign themselves?

I think many classes we invent are nothing more than processes in disguise and we could just as well model them as functions.

A simpler example. Should the BankAccount class have a transferTo(BankAccount other) method to transfer money into another account? What if there are different account types and the exact process depends on the types of both accounts?

Of course it's possible to do it that way, but is it really the cleanest way to imagine this? I don't think so.


A simpler example. Should the BankAccount class have a transferTo(BankAccount other) method to transfer money into another account?

A BankAccount is a data object, so no processing in there. You could have a Banker object (or a MoneyExchanger object, which is a "less real world" solution) which would be the actor object responsible for this task. So the natural message is myBanker.transferMoney(sourceBankAccount, targetBankAccount, amount).

What if there are different account types and the exact process depends on the types of both accounts?

You would need to handle all of the possible cases in your actor object - which is why it is interesting here to have an actor object specialized in just that. There are several techniques you could apply to dispatch the call to the appropriate implementation of transferMoney().

I think many classes we invent are nothing more than processes in disguise and we could just as well model them as functions.

Well, yes, classes do things - at least actor objects do things. Btw actor objects generally have no state, so you can see them as "super functions", with many advantages over basic functions (inheritance, polymorphism etc).


OO purists actually frown upon the kind of stateless Manager objects you suggest. But I can imagine a scenario in which a money transfer would be something very complex that merits its own process class.

I just don't think it makes sense to mandate that kind of heavyweight function class for every operation, or to be forced to subordinate an operation to an arbitrary class. For instance, formatting a date in Java works like this:

  Date date = ...;
  DateFormat df = DateFormat.getDateInstance(DateFormat.LONG);
  String s = df.format(date);
So the date format formats the date. Why is that? Why is the date subordinated to the date format here? It could just as well be the other way around. There may be some implementation-related reason for that, but conceptually it makes no sense at all; it's impossible to guess and hard to remember.

  s = format(date, date_format) 
makes a lot more sense to me.
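
(Incidentally, that's more or less how Haskell's time library does it -- formatting is a plain function taking both the format and the date. A small sketch, assuming the time and old-locale packages:)

    import Data.Time (getCurrentTime)
    import Data.Time.Format (formatTime)
    import System.Locale (defaultTimeLocale)

    main :: IO ()
    main = do
      now <- getCurrentTime
      -- format(date_format, date): no object "owns" the operation
      putStrLn (formatTime defaultTimeLocale "%B %e, %Y" now)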


OO purists actually frown upon the kind of stateless Manager objects you suggest.

Yes, but pure OOP has been proven impractical many times over the last 20 years. What we are looking for here are practical rules that will help us organize our code in an OOP framework.

So the date format formats the date. Why is that? Why is date subordinated to date format here?

Actually the name "DateFormat" is unfortunate. It should have been "DateFormatter", because it is clearly an actor object - while a date is a data object. Suppose you have a fire in your house, and you want help to extinguish it. So you call a fireman, and you basically say to him "here's a fire: do your job". In the present case you have a date, and you want to format it. So you "call" a DateFormat(ter) and you say "here is a date: do your job". That's exactly the same, natural principle of delegation. If you need something to be done, call an expert to do it for you.

Of course it gets quickly tedious having to explicitly call the actor objects for everything. So it might be useful to add "convenience methods" to the data objects, which would simply call the appropriate default actor and pass themselves as arguments to the work operation. So here you would then be able to call date.format(DateFormat.LONG), which means any date would basically become able to format itself. It has good and bad sides, and there is no clear answer (that I know of) to determine in which cases it's okay to do that or not.

s = format(date, date_format) makes a lot more sense to me

That's because you see formatting as an action, and you're thinking action => function. In an OO environment you should rather be seeing formatting as a responsibility, and thinking responsibility => class.


That's because you see formatting as an action, and you're thinking action => function. In an OO environment you should rather be seeing formatting as a responsibility, and thinking responsibility => class.

I understand what you're saying and I understand OO design very well because I used it for decades. I just don't agree. The question we're asking here, I think, is not "how should you behave in an OO environment?", the question is "is OO a good software design principle".

My answer to that is no, and your insistence on dividing classes into data and actor classes tells me that you're well on your way to joining my opinion soon ;-)


PropertySale is a thing. It should be modeled. You'll probably want to store some state with it (when the sale happened, to whom, the price, etc etc).


Yes, a property sale is a thing. It's a process. Why can that thing not be modelled as a function? Property sales are rather complicated processes, so having a class for one could be justified. But OO forces us to do it that way for every tiny operation. It's pure bloat.


It's bloat, until you realise that you need to persist a lot of information about the "thing".

Then you realise that the "thing" isn't actually that little at all.

BTW modern OO has constructs which allow a "Verb"-like structure: Generics. You write logic for a certain class of "things", and your compiler makes sure that what you're doing is actually possible.


You may enjoy learning a language with a decent module system. Abstract APIs are not exclusive to OO. The confusing part about OO is that it conflates modules with data types, which causes even experienced people to botch their designs. Tongue in cheek: as long as you have enough discipline to write OCaml in whatever OO language your project happens to use, you are going to have a good time.


"Trying to do that without object-oriented features would be like trying to write a letter by holding the pen with my teeth"

I think what you're imagining here is Java without the OO bits. And that would be horrible!

What you should imagine is Haskell. That's more like typing a letter instead of hand-writing it, to overstretch your metaphor.

I've found that it's much easier to write much more elegant code in Haskell than it is in any OO language. Instead of writing a program without OO features, you're writing a program with advanced functional features. Rather than just being removed, the OO features are replaced.


I'm also an object luver; I prefer to think about my designs in objects in a very anthropomorphic way. There are plenty of examples of bad and good designs in any language, and no paradigm is a panacea. Bad OOP designs only dominate the industry b/c OOP dominates.

You can build some beautiful systems with a functional programming language, especially if you are into formal mathematical elegance; I've seen some amazing things done with Haskell. But the systems I work with are very intrinsically stateful; objects just work better.


I'm all for proper rants against popular tools to keep people on their toes.

This isn't one of those.

"Objects bind functions and data structures together in indivisible units. I think this is a fundamental error since functions and data structures belong in totally different worlds."

Sure—a class defines a type and operations on that type. What's fundamentally wrong about date.addDays(1) vs. date_add_days(date, 1)? (Let's skip the mutable state argument and assume both versions return a new date.)

There is the problem that sufficiently opaque classes are hard or impossible to extend. That's the class author's fault: this is an avoidable problem in every object-oriented language I've used.

"Functions are understood as black boxes that transform inputs to outputs. If I understand the input and the output then I have understood the function. … Functions are usually 'understood' by observing that they are the things in a computational system whose job is to transfer data structures of type T1 into data structure of type T2."

A constructor is a black box that converts a data structure of type T1 into a data structure of type T2. Objects just also have other black box functions defined on them.

Sure, some objects are stateful, but they don't have to be.

"In an OOPL I have to choose some base object in which I will define the ubiquitous data structure, all other objects that want to use this data structure must inherit this object."

Um, no. This is a job for composition, not inheritance.

"Instead of revealing the state and trying to find ways to minimise (sic) the nuisance of state, they hide it away."

They hide state's implementation, for mutable objects.

  #include <iostream>
  #include <string>
  #include <vector>

  int main() {
    std::vector<std::string> some_list;
    std::cout << "Items: " << some_list.size() << std::endl;
    some_list.push_back("Hello, world!");
    std::cout << "Items: " << some_list.size() << std::endl;
    // Oh no! State, EXPOSED!
  }
Sure, an allocated piece of memory might have grown, or even moved. Why should I care? I still see the state I care about, presented through a hopefully useful abstraction.

This rant seems to somehow miss the points of both object-oriented and functional programming, instead harping on mostly meaningless (or outright wrong) details. Or am I missing something here?


> What's fundamentally wrong about date.addDays(1) vs. date_add_days(date, 1)?

One issue involves cross-cutting concerns. Say I'm creating a JSON API for my system. I use an API framework that lets me return objects that implement a JsonSerializable interface, and the framework calls to_json() to create the actual network response. The built-in container classes (Hash, Array, etc.) implement this interface already, calling to_json() on their elements. I make all my core domain classes implement a JsonSerializable interface with to_json() methods. Now I can return an array of objects and it just works. Very straightforward.

Problem 1: I only need to_json for the API, but now that code is available everywhere. Every time I change my JSON API, it requires recompilation and testing of every component of the system, whether or not they are even related to the API.

Problem 2: Next I go on to add an XML API, HTML representations, database persistence, third party API integration. All those concerns go into my core domain classes, which become quite bloated. I could use decorator classes, but they require me to be aware of whether I'm using a decorated or non-decorated object. They also add a lot of boilerplate. Code becomes both bloated and ugly.

Problem 3: Next I need a new version of the JSON API, to be run concurrently with the old one. The new version is going to serialize things in a different style, so each class needs two versions of to_json(). I could make a to_json2(), but how would the API framework know to call it? I fork the framework. Now I have to merge my changes every time they release a new version of the framework. Code is bloated, ugly, and unmaintainable.

Problem 4: Next I need to return API responses with classes from third party libraries, for which I don't even have the source to fork. Fuck.

The ultimate solution is to wrap everything -- foo_as_json(foo), bar_as_json(bar), foo_as_xml(foo). It's ugly as hell, but it's the only thing that basically works.
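
With typeclasses the wrappers at least get some structure. A hedged sketch (invented names, hand-rolled JSON purely for illustration): the serializer lives outside the domain type, so it can sit in the API layer, cover third-party types, and a concurrent v2 becomes a newtype rather than a fork:

    {-# LANGUAGE FlexibleInstances #-}

    -- serialization defined apart from the domain type
    class ToJson a where
      toJson :: a -> String

    data User = User { name :: String, age :: Int }

    instance ToJson User where
      toJson u = "{\"name\":\"" ++ name u ++ "\",\"age\":" ++ show (age u) ++ "}"

    -- a second API version is a wrapper, not a forked framework
    newtype V2 a = V2 a

    instance ToJson (V2 User) where
      toJson (V2 u) = "{\"fullName\":\"" ++ name u ++ "\"}"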


I owe the author an apology on one point. "Minimise" is a correct spelling, though neither my system's dictionary nor I realized this.


I think minimise vs minimize is just British vs American spelling respectively. So neither is strictly wrong.


There's a certain allure to saying code should be a certain way because of natural properties of computing, or our own feelings on what things are different and similar from what other things.

The reason OO shines is that it allows you to make that distinction at the domain level rather than the code level. You organize your software into business objects, or components, and have these interact with each other. They allow you to separate things out in a way that a new developer can come to the project, understand what the code should be doing, and look for the classes which seem like the objects involved, including the types of data each has and the things it can do.

There are all sorts of nasty things we've invented in OO over the past few years (mixing up inheritance and composition, using way too much state), but it gives us a lot of advantages from an engineering point of view.


Isn't that a bit of a false dilemma though? Can you not have clean interfaces and separation of concerns without OO -- even with something as 'simple' as python namespaces and dicts?


Dicts are objects.


That is an implementation detail due to Python's object support. Dicts are a native type: hashmaps.

Unless you mean "object" as in "a thing"....


I think he means dicts combine data, state and functionality, expose a clean API and hide the details.


This is true to some extent, but a lot of it has to do with the fact that some OOP environments abstract too much.

The tricky part is in defining your interfaces and getting that right. OOP is one option, not the only one, and not something which is strictly an either-or thing.

The object oriented paradigm is a useful tool, but not every problem is a nail, and not every hammer is useful for driving nails.


OO isn't the only way to implement the Actor model.


I think OOP took off because it seems like a great way to model things. The idea is exciting: you have a base kind of car with properties and actions, and a Ferrari is a kind of car, so you can take that car object, give it more horsepower and a different body type, and you have a Ferrari.
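The pitch, sketched in Python with made-up numbers:

    class Car:
        horsepower = 120
        body = "sedan"

        def describe(self):
            return "%s with %d hp" % (self.body, self.horsepower)

    class Ferrari(Car):
        # "A Ferrari is a kind of car": override two properties, keep the rest.
        horsepower = 560
        body = "coupe"

    print(Ferrari().describe())  # coupe with 560 hp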

Businesses like to model things and simulate things. So, in that respect, OOP was probably an easy sell because it's selling an idea of what businesses want, even if it hasn't worked out exactly as they hoped in all cases.


I see a bunch of caveats here.

The basic issue is that OOP is easily hyped and far too easily taken too far. My favorite OOP environments are decidedly un-OOP in specific ways (Moose, for example, has very transparent data structures, which is really useful).

The first big criticism I have is with the idea that "state is the root of all evil." I think the truth is more nuanced than that. State is, in many cases, extremely necessary to track. The problem is that state errors create bugs that are very difficult to track down (you can eventually figure the state out, but how did it get corrupted?). A better approach, I think, is for state to be handled declaratively and with constraints. This is why things like foreign keys, check constraints, etc. in the RDBMS world are so nice. In fact, good db design usually has a lot to do with eliminating possible state errors. Wouldn't it be great if OOP environments gave you that possibility! Well, Moose does to a large extent (another thing I really like about it).
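For instance, here's a check-constraint-style invariant approximated in plain Python (a sketch; Moose's type constraints are more declarative than this, but the idea is similar):

    class Account:
        # All changes to balance funnel through one gatekeeper, so the
        # invariant (non-negative) can't be silently violated elsewhere.
        def __init__(self, balance=0):
            self.balance = balance

        @property
        def balance(self):
            return self._balance

        @balance.setter
        def balance(self, value):
            if value < 0:
                raise ValueError("balance must be non-negative")
            self._balance = value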

A lot of it comes down to the really hard question of "what should be abstracted?" The correct answer is a question, "what makes your API most usable?"

This is why I think it's important to be able to move in and out of the OOP worlds, and why OOP taken too far runs into the problems the author mentions, but that it also doesn't have to.


- though he has no qualms about misleading and deceptive answers.

- "If I understand the input and the output then I have understood the function." A good point, I tend to think of "information hiding" http://en.wikipedia.org/wiki/Information_hiding as applying to state, to enable SOTSOG, but it also applies to pure functions.

- "define all my data types in a single include file" That quote sounds silly, but in practice, I find it much clearer if all the part of a data type are next to each other, uncluttered by methods. It also supports Brooks' observation: "Show me your tables, and I won't usually need your flowcharts; they'll be obvious" ("tables" being datastructures). In Java, I tried this by defining fields in superclasses, methods in subclasses. But having two classes per class was awkward. (I ended up keeping code entirely separate except for very core methods - still not happy with it). But I don't think this is entirely a language problem, it's partly just complexity management is hard.

- this article makes me feel antagonistic, but in fact I never liked OO when taught it; it seemed dogmatic, not actually useful in practice. But I did like the idea of an ADT, where you can package something up (esp. a list, hashtable, etc.) and work at a higher level of abstraction. Subdividing tasks and SOTSOG.


Fundamentally I think OOP is either state or syntactic sugar. If your methods don't modify internal state, then they're basically doing ad hoc type polymorphism on their first argument (this is the way that virtual methods are implemented, by the way), which just makes the '.' syntactic sugar that at the same time limits composability because it demands an inheritance hierarchy.
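Python makes this sugar explicit; a.b() really is just a function call with a distinguished first argument:

    class Greeter:
        def greet(self, name):
            return "hello, " + name

    g = Greeter()
    print(g.greet("world"))            # the sugared call
    print(Greeter.greet(g, "world"))   # the same call, desugared: a plain
                                       # function dispatched on its first arg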

Then it just comes down to whether or not you believe that mutable state is a good design choice. I don't. I think state is the root of all evil. For one, it makes my job as a PL researcher of 1) writing a formal analysis and 2) using the formal analysis to write a compiler, less attractive than blowing my brains out.

Note that you can have objects without OOP. Python has objects. Python is not object-oriented. Same with O'Caml or Racket. I'm not arguing against using state. If you're programming a state machine, you may want to model it with state. That would be a pretty good choice. The problem is that OOP says everything is a state machine. Do you believe that or not?

Either way, I'm putting my best efforts towards state-corralled languages.


I expect this to be a busy thread.

I think the point about state has some traction. For example, Chapter 15 of Effective Java (awesome book on Java, btw) is entitled "Minimize Mutability," so I think this idea is one that has caught on even in fairly traditional OO languages.

As for the other points... I do think sometimes that using OO to model real world objects may not always be wise, esp. if the result is a deep hierarchy, as in the OO 101 example of, say, Boston Terrier < Dog < Mammal < Animal < Thing... And then someone changes Dog and gives your Boston a tail... I dunno. I have only vague intuitions here, but perhaps the "objects as models of reality" might be perfectly suited for reality simulators of some sort that require stateful elements a la Sim City but not generally. Or, perhaps a better example: You could model a chess game as classes of Pieces on a Board, with methods like King.isInCheck(), or Queen.canMoveTo(Square) but this to me seems clumsier than simply having an 8x8 array of enums with the logic living in functions and not inside individual pieces.
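Roughly, the array version (a Python sketch, with the actual rules elided):

    from enum import Enum

    class Piece(Enum):
        EMPTY = 0
        W_KING = 1
        W_QUEEN = 2
        B_KING = 3
        B_QUEEN = 4
        # ... remaining pieces omitted

    def empty_board():
        # The board is dumb data: an 8x8 grid of enums.
        return [[Piece.EMPTY] * 8 for _ in range(8)]

    def find(board, piece):
        # Rules logic lives in plain functions over the grid,
        # not inside the individual pieces.
        return [(r, c) for r in range(8) for c in range(8)
                if board[r][c] == piece]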


The historical win of OO was polymorphism. The competition to OO was procedural code that consist(s|ed) of hardwired procedure calls. Enter polymorphism, which provides a way to abstract over functions, not only over values. Of course, this is nothing new to functional programming where functions are first class citizens, but it's new for procedural programming. Modern OO is about stateless objects, dependency injection and unit testing, aka functional programming.
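A toy Python sketch of that shift (invented names):

    def flat_tax(price):
        return price * 1.2

    # Hardwired procedural call: the callee is fixed at the call site.
    def total_hardwired(prices):
        return sum(flat_tax(p) for p in prices)  # always flat_tax

    # Polymorphic version: abstract over the function itself.
    def total(prices, tax):
        return sum(tax(p) for p in prices)

    print(total([10, 20], flat_tax))     # same behavior as the hardwired one
    print(total([10, 20], lambda p: p))  # but now trivially swappable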


I've been watching some talks online recently by Rich Hickey of Clojure fame, and he's a very interesting and convincing speaker. He basically makes the same argument that Armstrong makes here.

I'm not clear, however, how the pro-FP, anti-OO crowd address the Law of Demeter, which is often summarized as "One dot: good. Two dots: bad." The canonical example where the Law of Demeter serves us well comes from some of the original Demeter papers, which I actually read a long time ago when they were current. This canonical example is that of an object to represent a book.

One of the initial selling points of OO was that if you encapsulate the representation of an object from its interface, this ends up giving you a lot more flexibility. For the case of representing a book, pre-Demeter, a typical OO organization would have been to provide a method to give you chapters of the book as Chapter objects, and from there you could get Section objects, from which you could get Paragraph objects, from which you could get Sentence objects, from which you could extract the words as strings.

The Demeter proponents correctly argued that this OO organization of the Book rather defeats the goal of encapsulation, since with this organization you cannot restructure the internals of the Book object without breaking the API. E.g., if you decide to insert Subsections between Sections and Paragraphs, your API for extracting all the sentences of a book will change, and consequently, much of the client code will have to change.

The Demeter folks argued that instead of having to explicitly navigate to sentences, you should just be able to call a method on the Book object directly to get all the sentences. Without special tools, however, this is hard for the implementers of Book, since now they have to write tons of little delegation methods. I take it that people who are serious about following the Law of Demeter do do this, however. In the original Demeter system, Demeter would do this automatically for you. The problem with the original Demeter system is that few people actually ever used it, and it was rather complicated for Demeter to provide this automatic navigation.

So, back to FP: Rich Hickey argues to forgo damned objects and to just let the data be data. So if I follow Hickey's advice, how am I supposed to represent a book? As a vector of vectors of vectors of vectors of strings? If so, then how do I prevent a change in the representation of the Book from breaking client code? If I had followed the Law of Demeter with OO, then everything would be golden.

Sure, with this naive FP approach, I could also provide a zillion functions to fetch different sub-structures out of the book. E.g., I could have a function to return all the sections in a specified chapter, and another to return all of the sentences in the book. This, however, would end up being little different from the OO approach following the Law of Demeter, with the further downside that if you change the representation of the book, you don't know that you haven't broken the client code, because you have no guarantee that the client code isn't accessing the representation directly.

Please advise.


You can certainly achieve this in Clojure using deftype/defrecord, with methods that operate on those types.

But I think the point he's making here is a good one. Why exactly ARE you modeling a book as a series of chapter, section, paragraph, sentence, word objects? Just because you can? Or because it's actually a useful API to developers (the latter seems somewhat doubtful).

It's very tempting when you're working with an OO system to actually try and model the data as it corresponds to the real world equivalent, but it's rarely a good idea. Design your code around the API you want other developers to use.

In the case of a book, it's unlikely every book will follow that rigid chapter / section / paragraph / sentence format (what about a book that doesn't have chapters at all? Or has pictures? Or is stream of consciousness?). You'll want a more flexible system - I'd imagine it would end up looking like some kind of markup representation. And no surprise, markup is best dealt with as a nested data structure (XML, JSON), not a rigid set of container classes. So even in this trivial example, a bunch of encapsulated classes isn't really an appropriate representation.

This is true more often than not in practice. Rather than building a rigid class hierarchy, having functions that operate on data (and frequently that data _is_ typed) often yields better results.


Caveat. I never saw any mathematical formalization for the Law of Demeter. Contrast this with the Substitution Principle, which is directly defined in terms of logical implication.

With regard to the example presented, there is at least a trivial solution. There is a BookApi which is concerned with providing Book operations, and there are the Book/Section/Paragraph data types. List<Sentence> getAllSentences(Book) is defined by the BookApi. Changing the data types doesn't affect any client of the BookApi, only the implementation of the BookApi.
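In Python-ish pseudocode (invented names), that separation looks like:

    from typing import List, NamedTuple

    # Plain, transparent data types...
    class Paragraph(NamedTuple):
        sentences: List[str]

    class Section(NamedTuple):
        paragraphs: List[Paragraph]

    class Book(NamedTuple):
        sections: List[Section]

    # ...and a separate api function: only this function knows the nesting,
    # so inserting Subsection later changes this module, not the clients.
    def get_all_sentences(book: Book) -> List[str]:
        return [s for sec in book.sections
                  for par in sec.paragraphs
                  for s in par.sentences]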

Conversely, coupling the methods to the actual data types results in inflexible designs. Should a Book know how to display itself in a gui widget? Which widget library? Should a Book also know how to save itself in a database? Which database api?


So, basically what you are saying, as far as I can tell, is that there is absolutely no difference between the OO approach using the Law of Demeter and the FP approach. (Though with an FP approach, you are perhaps more likely to make things immutable, which is a good thing.)

If so, perhaps this isn't surprising, as Demeter was layered upon Scheme.

Regarding writing to databases or to screens, decent OO programmers do not put methods to do this sort of thing into the domain objects. For i/o to a database the typical OO approach is to use a separate DAO class, and for GUI i/o the typical OO approach is to use an MVC framework, or some such.

Also, I'm not sure why anyone would want a mathematical formulation for the Law of Demeter. I think that anyone would be hard pressed to formalize every aspect of good design, as what makes a good design isn't a mathematical property, but more one of cognitive psychology and the art of engineering. The Demeter system did, however, automate implementation of the Law of Demeter based on specifications you would give it, so the Law of Demeter isn't just some touchy/feely thing either.


Since the answer to the rhetorical questions is a resounding no, there is very little meaningful behavior to be attached to the "domain objects". "Domain objects" do not obey the "Law of Demeter" and they are nothing but the data types of OCaml or Haskell. The other side of the coin are abstract apis, which are polymorphic collections of stateless functions, aka modules. Indeed, modern OO is functional programming in disguise.

Regarding Demeter, naming an anecdotally supported observation "Law" is a bit of a stretch ;)


> naming an anecdotally supported observation "Law" is a bit of a stretch ;)

Maybe, but it's a common thing to do. Cf. the Law of Thirds.


Which may be why almost everyone calls it the "rule of thirds". [1]

[1]: http://www.google.com/trends/?q=%22rule+of+thirds%22,+%22law...


With the FP approach you can just write a "show book as a widget" function yourself, without the whole inheritance hodge-podge. In the case of real libraries, a change of data structure probably won't be a problem, as the author would probably supply a function Book -> (Title, Chapters, Sentences, Strings) which would be guaranteed not to change and discourage the use of the actual constructors.


Have you ever tried to read any Demeterized code? It's an idea that needs to die.

If I see book.line(n) and I want to know how it counts footnotes, and Book::line is defined as this.sectionWithLine(n).line(n) and those are defined in terms of Chapter::line and Page::line I'm going to need notepaper. And as I'm flipping among five nearly-identical functions with the same name, I'm going to have a terrible time finding which one has the footnote logic in it.


> Have you ever tried to read any Demeterized code?

What are you suggesting as an alternative?

It should be noted that the original Demeter system would fetch sentences (or whatever) based on a description of the data structure that was provided by the implementer of the data structure alongside its implementation. In this system, you wouldn't have to examine code to answer your question, but you would have to be able to understand some perhaps complex computer-understandable description. That, of course, might be about as fun as reading an XML Schema. Also, since the real Demeter system never caught on, these days we have to do the computer's job manually, which is surely a chore.

Regarding many functions that all have the same name obfuscating things, there's nothing in the Law that requires that, of course, though I can see how it might sometimes happen in reality.


I think the obvious solution is to stop worrying about what's perfect on paper and instead approach things from three very simple questions:

1) What are my use cases? and

2) What should be abstracted in order to create the cleanest interfaces with the most predictable behavior?

3) What should be exposed to ensure that debugging is easiest? The answer here is simple, "everything."

I think if we focus on clean interfaces based on use cases, we get something better.

The problem with the book example is that a book is something that is very hard to optimize both for reading and composition at the same time. If I am writing a book, the functionality I need access to is very different than if I am reading it. For reading, it's probably sufficient to have some methods to access the table of contents, list of tables, list of figures, etc. and the index, and a single method: get_page(int) which returns a page of text, which can then be processed, and if sentences move on to the next page, we just fetch the next page and process it too. For writing, however, you are going to have trees connected to linked lists, and it will be very different. I may want to change a paragraph, or add a new one.
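So the read-optimized surface might be as small as this (a hypothetical Python sketch):

    class BookReader:
        # The whole reading interface: navigation aids plus page fetches.
        # The storage behind it can be anything.
        def __init__(self, pages, toc):
            self._pages = pages  # e.g. a list of page strings
            self._toc = toc      # e.g. {"Chapter 1": 1, ...}

        def table_of_contents(self):
            return dict(self._toc)

        def get_page(self, n):
            return self._pages[n - 1]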

But a book is a bad example, because it is internally structured in the way we think of it in order to be functional. In other words, with a book, the internal structure is the user interface along with helpful guides like the ToC and index. A better approach would be to note that the actual structure of the ToC, LoT, LoF, etc. doesn't need to correspond to the actual paper and so the data structures underneath could be very different, but the overall programming interface would probably mirror the actual paper structure pretty closely.

Now, if you were representing a book binding process, or book repair process programmatically that might be more interesting.


I should have added, not only is it internally structured in order to provide a good user interface, but we've had hundreds of years of perfecting that user interface, so printing and book conventions are remarkably useful to the reader and ignoring them is always at one's own peril.


For a book, I would keep the text itself as something very simple, such as a big old list of lines. A chapter is a range of lines. A section is also a range of lines. Adding subsections is a matter of adding a field to Book and it is (surprise) a range of lines, too.

You could implement this as a byte offset if you wanted, but you get the idea. If the text of the book is data, the rest is metadata. IMO it should live with the data, preferably in as lightweight a fashion as possible. Adding more metadata is a matter of adding fields, but as long as you stick to a basic form of representation, you should be fine.

Your helper methods help connect interrelated pieces of data. If you want all the chapters in a set of pages, you check for which chapters either overlap or are a subset of the page range you got.

If you want to add footnotes to pages, for example, you might define a footnotes field. That field has the text of the footnote and the line that it points to, perhaps with an offset to denote specifically which word it wants. (Maybe this is an argument for using byte offsets in the first place.)
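Concretely, I'm picturing something like this Python sketch (all values made up):

    # The text is a flat list of lines; everything else is lightweight
    # metadata expressed as (start, end) line ranges.
    book = {
        "lines": ["It was the best of times,", "it was the worst of times."],
        "chapters": [(0, 2)],
        "sections": [(0, 1), (1, 2)],
        "footnotes": [{"line": 1, "offset": 0, "text": "A famous opening."}],
    }

    def chapters_overlapping(book, start, end):
        # Which chapters overlap a given range of lines?
        return [(s, e) for (s, e) in book["chapters"]
                if s < end and e > start]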

Is this roughly what you had in mind, or have I completely missed the mark?


> Is this roughly what you had in mind, or have I completely missed the mark?

Maybe, but I'm not sure. If you don't encapsulate the data behind an API, then I suspect that you will eventually be sorry. And if you do encapsulate the data behind an API, then you're doing OO.

You may be right that your design for the representation of the book may be better than the typical naive OO representation, but there's nothing that forces an internal OO representation to be naive, or for an FP one to be sophisticated.

I guess what I'm saying is that I still don't understand how the FP approach is supposed to be different from the OO approach using the Law of Demeter. With the Law of Demeter, methods only return data, not other objects ("two dots: bad"), so it's not as object-filled as a more naive OO approach. And in a case such as your FP representation for a book, where the representation is bound to be rather complicated, it seems that you will surely want to encapsulate it, in which case the FP approach has adopted some OO. I.e., the FP approach and the OO approach have met in the middle and have become the same.


Incidentally I'm about done watching one of Rich Hickey's talks about complexity, the one called Simple Made Easy. Thanks for the recommendation; it has given me a lot to think about. Are there more, or was that the one you were referring to originally?

"You may be right that your design for the representation of the book may be better than the typical naive OO representation, but there's nothing that forces an internal OO representation to be naive, or for an FP one to be sophisticated."

Well, there is, in that OO is inherently more complicated, in the most trivial sense; compare an idiomatic implementation of storing, incrementing, and displaying an integer in Java versus C. At a minimum, for Java, you're going to want an object. For C, you neither need nor want any such thing; a function or two and an integer are sufficient.

If the only thing that exists is a bag of data and code which pulls out that data, you have a much simpler system. I don't think this is controversial; an object is the same thing, in essence, except that due to language implementations and semantics, there are additional layers of complexity that are inherent and inextricable. Hickey goes on about this better than I could, although one of my favorite recent discoveries from C++ is the explicit keyword. The point is that with purity and referential transparency, it is far, far easier to reason about your program. Pure functions are orthogonal, no?

I have heard of the Law of Demeter, but in practice, I'm not sure I see it obeyed that often if ever. APIs tend to return objects, or assume you'll chain method calls, or what have you. It's a fact of life, at least in my experience with Android development as well as vanilla Java. Maybe there's some subtlety that I'm missing out on? Otherwise it seems like a dud, if only because nobody wants to maintain those niggling chains of accessor calls.

I think that's really the crux of the issue: culturally, OO tends to value taxonomy in the form of object hierarchies; polymorphism; inheritance; and the like. All of that adds complexity. No, there's no reason why OO has to rely on mutable state and hierarchies. But the conventional wisdom about best practices is a powerful influence. You could compare theoretical implementations, but honestly, taking a look around would probably give you a better idea of how this theoretical scenario is likely to play out. That's the basis for comparison, here: Hickey is contrasting a simple approach with software development as it happens now, typically, not as it might or ought to happen.

Also, for what it's worth, I don't think encapsulation is irretrievably the province of OO. APIs are not a concept over which OO has exclusive ownership, right? The idea that you might want to sequester state goes hand in hand with concepts of purity and referential transparency. Or the idea that you should use a function which implements an algorithm as an entry-point to a data structure rather than muck about in it yourself (think of a heap).


> Are there more, or was that the one you were referring to originally?

That's the one. Though there are two with that name, I believe. One for Strange Loop and one for a Ruby conference. The two talks are mostly the same, but it's worth watching both. Then watch all other talks by Rich Hickey too, as they are all excellent.

> At a minimum, for Java, you're going to want an object

Java, if you ask me, is a degenerate case. It's a language that pushes consistency at all costs, in the face of any common sense. Most other OO languages don't pressure you into programming this way. I.e., they give you normal functions in addition to methods.

Though when programming in Java, I am personally not averse at all to just putting functions into an empty utility class that exists solely for holding nothing but static methods. Scala actively encourages doing this, as it provides for a type of object called a "package object" that exists for this particular purpose.

> I have heard of the Law of Demeter, but in practice, I'm not sure I see it obeyed that often if ever.

Maybe people in the trenches don't typically obey "best practices", but the OO best practices gurus all seem to push the Law of Demeter these days. I'm sure it's difficult to figure out how to do well. But then again, so is figuring out how to apply Rich Hickey's advice on simplicity!

> nobody wants to maintain those niggling chains of accessor calls.

Nobody wants to be locked into an inflexible API either, but I agree that people are lured by what is easy!

> Hickey is contrasting a simple approach with software development as it happens now, typically, not as it might or ought to happen.

It would be interesting to know how what Hickey would do differs from what the OO best-practices gurus recommend. Indeed, it's clear that in the real world, people don't often follow either piece of advice.

> Also, for what it's worth, I don't think encapsulation is irretrievably the province of OO.

For the most part, it's usually considered to be. APIs surely existed before OO, but they were typically for one program to talk to another, not for communication within a program. Before OO, different parts of programs typically revealed all their data structures, and you were allowed to go to town on them. Though, it's true that for certain things you probably weren't supposed to do that. E.g., as you said, on the malloc heap in C.

I believe that encapsulation as we know it today was invented by the CLU programming language. It was certainly the first language to enforce encapsulation. CLU isn't necessarily what we would consider to be an OO language today, as it didn't have inheritance, but it considered itself an OO language at the time.


Functional programming does not mean you "throw away encapsulation". Most languages still allow you to limit the extent to which a piece of data is exposed. And that is needed for large systems - otherwise you will have to change many, many functions as your data representation changes. This is also true of loose coupling. Seeking to modularize your code is always a good idea - and it is not impossible to achieve in non-OO languages. After all, they also have module systems for exactly that.

The trick is to have a small set of functions which are very generic. And then stack these on top of each other as a mini-language from which you build up the access pattern you desire. After all it is a functional language, so these kinds of constructions should be fairly simple.

As for representation, consider that a book is represented as a tree: chapter -> section -> subsection -> paragraph -> body_text. But nobody said that this was the natural representation in a machine. You could have made a database table - CREATE TABLE book (id SERIAL, chapter INT, section INT, ..., body_text TEXT). Now access is by a small mini-language, namely SQL SELECT statements. Another way is to represent Chapter, Subsection, ... as dimensions in a vector space. The body text is then a point at a coordinate of Chapter x Section x Subsection. This has a KD-tree representation because it has neat spatiality for instance.

We could represent the book as a list: [(chapter section subsection ... body_text) ...]. A list has a formal derivative called a "zipper". This structure allows us to go "forward and back" through the paragraphs of the book - one paragraph at a time. This is the perfect representation for a screen reader. If we need to go larger strides, we can probably consider a zipper of a Rope-structure or a finger tree.
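A list zipper is tiny; for instance, a Python sketch (boundary checks elided):

    # Zipper: (reversed prefix, focus, suffix). Stepping forward or back
    # through the paragraphs is O(1), which suits a screen reader.
    def to_zipper(paragraphs):
        return ([], paragraphs[0], paragraphs[1:])

    def forward(z):
        before, focus, after = z
        return ([focus] + before, after[0], after[1:])

    def back(z):
        before, focus, after = z
        return (before[1:], before[0], [focus] + after)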

Finally, we could use a trick where the book is represented as a function. That is, we simply call book(3, 12, 32, 4) to get Chapter 3, Section 12, Subsection 32 paragraph 4.

What I really don't like about OO is when it is taken as the assumption that physical objects in our world have an isomorphic representation in the machine. This is nearly never the case. Good OO programmers naturally know when to break away from an isomorphic representation - but sadly this is not always what is taught in schools. You know, a Cat is an animal and so is a dog. Hence they are both subclasses of the animal class :)


> Just because it is Functional Programming it does not mean to "throw away encapsulation".

Surely not, as even Clojure has support for OO programming. But Rich Hickey says that 90% of the time that encapsulation is used by OO programmers, it shouldn't have been. Instead, they should have just used a map or a list, etc.

So, does something as complex as representing a book fall into the 90% case or the 10% case?

Or maybe Hickey is really just alluding to the Law of Demeter and saying that methods should return data and not objects. It might very well be that if the Law of Demeter were religiously followed in OO programming, that the use of objects would fall by 90%.


The ML family of languages has structures, signatures and functors for encapsulation. Yet, it has little to do with OO in the traditional Javaesque kind of sense. Encapsulation does not require an object to achieve. Hickey says OO a la carte and splits up the conflated concepts.

The reason he says to use a map or a list is because of another conflation danger. If you have an opaque book - you have a book but not its underlying representation - then if you want to represent multiple of these, you should make the container "exposed". Because that choice is the modular one. All the functions which operate on my container can now be used. For instance (MAP spellcheck books). But I can also easily do (PARALLEL-MAP spellcheck books) right? Or even (MAP-REDUCE spellcheck id books) if I have petabytes of books and a cluster supporting MAP-REDUCE. This is why templates or generics are so powerful. They expose this modular relation.

But encapsulating the list into a "books" class will mean we have to write lots of extra code for each of the above 3 variants. We naturally don't want to do that.


Even more - encapsulation does not always mean OOP.

Erlang has stronger encapsulation than any OOP language - all you have is a server process and the messages thrown at it.


How is that not OO then? OO is not defined by Java. The first OO languages and systems were descended from Lisp or built within Lisp, a functional programming language.

OO without inheritance is still OO. OO with immutability is still OO. Encapsulation is the heart of OO. Polymorphism is also quite important. Inheritance is a distant third property of OO, but it is one common aspect of OO that is highly over-used. Most OO "best practices" gurus discourage rampant use of inheritance these days and say to prefer composition.

Back to the whole point of this thread, Rich Hickey says specifically NOT to use encapsulation (90% of the time). That would then not be OO. No encapsulation, no OO.


This is a pointless argument, since both views are "correct". Even Joe Armstrong says that "You can take two views, you can say that either erlang isn't object oriented... [or] you can say its more object oriented than all the object oriented languages."

But I do not think it's very useful to say that Erlang is object oriented, since it will confuse a lot of people, since it is not anything like the other "object oriented languages".

http://www.infoq.com/interviews/Erlang-Joe-Armstrong http://www.infoq.com/interviews/johnson-armstrong-oop


>>But I do not think it's very useful to say that Erlang is object oriented, since it will confuse a lot of people

I find it useful to instill this kind of confusion in people around me.

"You talk about encapsulation? Look at erlang"

"Polymorphism? Haskel"

"OOP? Javascript, CLOS"

"Static/dynamic typing? Look here - your java code is in fact dynamic, if you're really want to taste real static - look at ML-family or something near"


The standard lisp approach is to define your API first, and write code that meets the API. You define your API in a manner that makes the solution to the problem you're trying to solve obvious. So behind this API, you can change your data representation however you please -- just don't break the API.

As for your specific question, which representation of a book is the simplest solution? That's the one you should be using. Hickey's talk on simplicity really does a good job of explaining this point.


The funny thing is that most of these APIs wind up with what's essentially a "this" pointer at the front of every argument list so you're really doing OO anyway.

I'd like to read a persuasive critique of OO but so far I haven't found one. Time objects are a really bad example because time calculations are exactly the kind of hairy mess you want to hide behind some kind of abstract API.


> The funny thing is that most of these APIs wind up with what's essentially a "this" pointer at the front of every argument list so you're really doing OO anyway.

Do you have much experience with non-OO programming? Or even OOP, for that matter? How can one honestly believe that every instance of passing data into functions is automatically OOP? Did you even consider the fact that this has nothing to do with objects?

> Time objects are a really bad example because time calculations are exactly the kind of hairy mess you want to hide behind some kind of abstract API.

You miss his point here. Time not being an object would not imply that it can't be abstracted into an API. Just see Clojure's clj-time library for an example.


I'm sure that GP is referring to point number 2 of the article -- which explicitly says that time should (or at least could) just be data with no associated methods.

I winced at the naïveté in that section myself. It sure seems like he's arguing that OO programmers muddy the waters when they make something as "simple" as time and dates into an object with methods for storage and retrieval.

http://infiniteundo.com/post/25326999628/falsehoods-programm...


Except you can have many "thises" in a single scope. In OOP you'll write a.b(), in FP you'll write b a or b(a). Having a "this" pointer is not OOP.


What we do in LedgerSMB usually is write our APIs first in SQL, which is decidedly not OO.

Then we write objects that flexibly encapsulate that.

Then we write UI and automation logic around that.

Now, long-run we're trying to find better ways to encapsulate the behavior inside SQL. I don't think we've settled on a method yet but may do so sometime between 1.5 and 2.0 (a year or three).


Would an API be defined as "a set of functions that define the available operations common to that set"? If so, would those functions then delegate to progressively more concrete functions that work against data structures to perform their duties? I suppose it's obvious where I'm going here. I don't see how this is any different from just saying the word "Object."

Don't get me wrong, I'm not trying to come off too sarcastic here. I'm actually quite interested in challenging the OO norm that is obviously too easily accepted right now.


Each Module defines an api that talks about N different ValueTypes. Saying the word Object limits the api to expose exactly one ValueType.


As to coming up with the "simplest" representation, that's almost certainly excellent advice whether doing OO or FP. It should be noted that Hickey has a rather specific notion of "simple" that may be rather different from more typical intuitions. This is why Hickey had to differentiate between "simple" and "easy", where "simple" means not intertwined with other things, NOT what seems like the least amount of work. The least amount of work is what is "easy", not what is "simple".

If one is encapsulating the representation behind an API, this simplicity principle is much less important for the representation than it would otherwise be, however, as any complexity of the representation is fire-walled at the API. And the Law of Demeter is a good principle for helping to keep an API "simple", as functions of the API that follow The Law of Demeter should return only data and not objects.

I'm still not sure how this API business meshes with Hickey's advice, however. He seems to be rather adamant that hiding data behind an API is the wrong thing 90% of the time. I have no feel, however, for knowing if something as complex as a book falls into the 90% case or the 10% case.


Rich Hickey specifically says that his approach makes you more readily adaptable to future requirement changes, rather than less adaptable, as an OO die-hard might claim. I'm not sure I understand how the "get it right to begin with" approach fits into the adaptable-to-future-requirement-changes promise.

I see how following OO, as long as you also follow the Law of Demeter, could live up to its promise of helping you adapt to future requirement changes. It has that unfortunate cost, however of requiring you to sometimes write a zillion little delegation methods upfront, which could be quite a combinatorial chore if you don't have the Demeter system automating this for you.

Also, I'm pretty sure from listening to a number of Rich Hickey's talks, that he is saying very specifically NOT to hide your data behind an API. Encapsulating the data behind an API would just be the same as the OO approach for all intents and purposes. Hickey is saying to let the data be data. I.e., just data.


Instead of giving you an example that anybody actually uses, I'm going to tell you about a cool idea I've been reading about that hasn't gotten much actual use.

The basic idea is to use a generalization of pattern matching. Languages like ML and Haskell support pattern matching, but in rather limited ways. Crucially, patterns are not first-class citizens of the language. (For Haskell, at least, there are some libraries to remedy this, but I don't know how effective they are.)

So how can we generalize pattern matching to help you solve your book problem? Normal patterns allow you to match data types in the form C x₁ x₂... where C is some constructor and x₁ x₂... are either matchable symbols or arbitrary patterns. An example of a pattern would be Chapter (Cons (Section content) rest). We differentiate between the matchable symbols and the constructors on case: lowercase means matchable, uppercase means constructor. This is somewhat limited: you cannot easily write code that is generic over the constructor at the head of the pattern. You could write a function that counts the sections in a chapter, but you could not write a function that counts the sections in anything.

So let's relax the restriction that patterns have to be headed by a constructor. We can now have patterns in the form x y. These are static patterns: you can match data against them without evaluating the pattern. With this, we can imagine writing a function to count sections generically:

    count_sections x = 0 -- If this is some terminal, it cannot be a section
    count_sections (Section content) = 1 + count_sections content
    count_sections (x rest) = count_sections rest
This goes through the entire data type you passed in and counts all the sections it sees. It assumes sections can be nested. This will let you count the sections in a Book or a Chapter or a Series or whatever you want.

So, this is generic over the data you pass in. However, if you wanted a function to count Chapters or Sentences or what have you, you would be forced to write it. This calls for another extension to pattern matching: dynamic patterns. Patterns are now in the form x ŷ where x is a variable and ŷ is a matchable symbol. Constructors are still uppercase, so Section is a constructor and not a variable.

A variable in a pattern can be instantiated with another pattern. So now we can write a truly generic count function:

    count constructor (constructor) = 1
    count constructor (constructor x̂) = 1 + count x̂
    count _ (x̂ ŷ) = count ŷ
So now if you want to count chapters in your book, you would just invoke count Chapter book. If you want to count sections in your chapter? Easy: count Section chapter.

You can also use patterns and constructors for polymorphism by overloading functions on different constructors. One interesting idea is to allow adding cases to functions after their definition. This way you could have an existing toString function and then, when you've defined a book, add a new case to it:

    toString += | Book title content -> "Book: " ++ title 
This way you can have a toString function that works on any type of data.

All my examples are obviously in pseudocode. (And hey, it looks nothing like Python! The whole "runnable pseudocode" mantra annoys me.) I haven't covered all the edge-cases, and I haven't even begun talking about a type system for this mess. Happily, there's somebody who has, and he wrote a book about it (that's where I got all these ideas): Pattern Calculus by Barry Jay[1].

[1]: http://www.amazon.com/Pattern-Calculus-Computing-Functions-S...

I'm also not sure whether this is the best possible approach. However, I think it's certainly very neat. If you like this idea, the book is definitely worth a look.


I started reading this comment and immediately knew you were talking about Pattern Calculus. Awesome to see someone else interested in the idea.

I live a couple of blocks over from UTS and went to a programming language event there. I only found out about pattern calculus after talking to Barry Jay. Turns out UTS students have been working on an implementation, called Bondi:

http://bondi.it.uts.edu.au/

If you're interested in playing around, Eric Torreborre posted some of the UTS lab tutorials online and some of his solutions:

https://github.com/etorreborre/bondi-tutorial


Hmmm, this might be kind of similar to what the Demeter system actually did. With Demeter, you would provide some sort of machine readable descriptions for your data structures. Once you implemented this description, you could ask Demeter to fetch you all sections of a book, or all sentences of it, or what have you. If you were to then change the representation of the book, you wouldn't have to change any of the client code--you'd only have to change the machine-readable description of the data structure. Then when client code asked Demeter to fetch all the sentences, everything would still work fine.

Maybe the best solution is to just resuscitate Demeter! Though apparently nobody actually used it either.


This is the first time I've encountered this and what you wrote packs a lot of ideas in a small space so forgive me if I have misunderstood.

What you write seems like an even more powerful version of the Active Patterns in F#, which are already really powerful. Active Patterns are the closest thing to first-class patterns in a production language*.

Haskell has a similar thing in views. But I think another concept, unique to it and playing better to its strengths while attacking this level of problem, is Generalized Algebraic Datatypes. I mention GADTs because they deal specifically with "This is somewhat limited: you cannot easily write code that is generic over the constructor at the head of the pattern." Like I said, I've never heard of the pattern calculus - is there any problem it solves that could not be solved with GADTs with a similar level of effort?

* certainly biased but I see active pattern use more than I see extractors or views


I'm afraid that I'm not too familiar with either active patterns or GADTs at the moment. However, given the understanding that I do have, I think they are both unrelated to the basic dynamic pattern matching that I was talking about.

As far as I can tell, active patterns only affect the constructor. They let a constructor like Even run an arbitrary function when it's matched. This lets you customize what a constructor actually means, which seems very useful, but is orthogonal to dynamic patterns. Dynamic patterns affect the patterns and not the constructors, where a pattern is the expression that is actually matched. That is, given a function f (Foo a b) the pattern is Foo a b. Active patterns would let me customize how Foo works, but the full pattern (Foo a b) still looks and acts the same.

Given this meaning of "pattern", I think active patterns do not constitute first-class patterns. You can't take a pattern and pass it into a function or make up a pattern of variables that can be any pattern. With first-class patterns, you should be able to take the pattern (Foo a b) and pass it into a function as a pattern. So a function like

    match pattern (pattern x̂) = x
would make sense in a system with first-class patterns. Here, a pattern is an argument to a function and then used to match against the next argument. Hopefully this clarifies the particular thing that dynamic patterns add to the language.

GADTs are also unrelated. For one, GADTs only affect the type of a constructor and not its behavior. I did not talk about types in my post at all--everything I talked about would make sense in a dynamically typed language that had pattern matching. It just so happens that dynamically typed languages tend not to have pattern matching like this, so it's associated with statically typed languages.

Also, GADTs, like active patterns, only affect the constructor. They do not change the actual patterns used to match against values (except by restricting possible types of the matched values).

So really both active patterns and GADTs are somewhat orthogonal to dynamic patterns. When I said code that is generic over the head of the pattern, I meant a function like:

    doFoo (a b) = b
where doFoo will match both Foo x and Bar x and Baz x and extract x in every case. This function is generic over the constructor in the sense that it matches regardless of what the constructor actually is--the constructor becomes just another matchable term.

This sort of function, which just uses static patterns, is already something that a GADT cannot do. Dynamic patterns are even more flexible. My match function above lets you take an arbitrary pattern like (Foo a b) and match (Foo a b x) against some value to get x. You're passing a pattern into a function which then uses that pattern to match against its next argument. This feature has nothing to do with the type of the constructor.

I hope this cleared things up for you. However, it's almost 4 in the morning and I'm not entirely coherent. If you're still confused, or if I made some glaring errors, feel free to email me about this when I'm more awake :P.


Excellent! Thank you, yes, I am clearer; the concept is even more awesome than I thought.

>I think active patterns do not constitute first-class patterns.

I agree, but like I said, they are the closest thing in active use; they extend the kinds of structures one can deconstruct to encapsulated ones while being more flexible in implementation. They fall in the class of techniques which extend matching and make it more flexible; other examples would be multimethods and predicate dispatch. So they're very much orthogonal to the approaches in pattern calculi, but I think in a subspace in terms of matching.

>GADTs are also unrelated. For one, GADTs only affect the type of a constructor

Yes, they are unrelated; that's why I focused on expressivity, and you are right, I misconstrued what you meant by generalizing over constructors. But everything has a cost - runtime, learning, cognitive overhead - and what I was curious about is: how much does the pattern calculus buy you? GADTs are another concept that allows one to elegantly tackle problems where it looks like the pattern calculus would help. I was just looking for an example where something like GADTs fumbles in comparison.

It's clearer to me now that the pattern calculus is definitely more flexible in terms of match semantics but I can't think of any situation in practice where this would give an added advantage.

Also, is this typically done via term rewriting, and is there anything on the decidability of this?


I think the main advantage is what I pointed out before--being able to write functions that work on a wide variety of data types. I think my count example is something that could not be done with GADTs:

    count constructor (constructor) = 1
    count constructor (constructor x̂) = 1 + count x̂
    count _ (x̂ ŷ) = count ŷ
Keep in mind that this is pseudocode, so some of the semantics may not be perfect. The basic idea is this: the first argument count takes is a pattern, like a constructor. The second argument is matched against the dynamic pattern (constructor x̂). This pattern has one matchable variable x̂ and one free variable constructor. This free variable is bound in the previous argument to the function.

So this lets you use the count function to count any sort of sub-pattern in its argument. Let's imagine you have some hierarchy like (Book (Cons (Chapter stuff) (Cons (Chapter stuff) nil))). You could use the count function to count the number of chapters like this:

    count Chapter book
Basically, you're parametrizing your count function on the pattern you want to count. So count can work on any data type and any pattern. If you wanted to count sections, you could:

    count Section book
Now let's pretend that sections have numbers. Instead of (Section content) we have (Section number content). You can now count something like how many sections have the number 5:

    count (Section 5) book
I think you could generalize this even further, but it would be more complicated.

I'm not sure how this is typically implemented, largely because it typically isn't :P. I think there's essentially one implementation in a language called bondi[1] but I don't know how it is implemented.

[1]: http://bondi.it.uts.edu.au/

As for decidability, I'm not sure. The entire pattern calculus--that is, an extension of lambda calculus supporting this sort of pattern matching--is Turing-complete. I don't know if just the pattern matching part is undecidable though.

As I said in my original post, there is a nice book on this called Pattern Calculus. I think you can read it online[2], which is nice because it turns out to be fairly expensive on Amazon.

[2]: http://www.springerlink.com/content/978-3-540-89184-0


Thanks for an excellent write up to the idea. That was very clear.

I am very intrigued and was looking at purchasing that book to learn more - but then I saw the price. I'll have to scrounge a copy from a library somewhere. I thought on-demand printing was going to reduce the cost of books on the long tail!


Oh yeah, it's more expensive than I thought :/. Happily, I was lent it by a friend.

You might have some luck getting it as an ebook from the library. My university's library seems to only have it as an "electronic resource", and it's a pretty big library.

I like the book, but a good part of it is just background. It goes through a bunch of variations on the lambda calculus in building up to the "pattern calculus". If you're already familiar with the basics, you might be better off just reading some papers on it. (Hopefully there are some you can read for free, but I'm not sure.)


Oh hey, it seems that you can also read it online: http://www.springerlink.com/content/978-3-540-89184-0.

I'm not sure if there are any restrictions on this, and the format is annoying, but it might be worth looking into.


TBH, it looks like the only hard bit here is fitting static typing into your system. I believe the dynamic languages with pattern matching facilities (Erlang, Scheme) should already be able to pass any symbol they want for the constructor name.


Yes, they can use any symbol. But I don't think they can match against a constructor (that is, write a pattern like (x y) where x can match any constructor).

Also, I don't think they can have dynamic patterns. That is, you can't take part of a pattern, pass it into another pattern and use the resulting combination of patterns to match against some value.

These two things are the interesting generalization of pattern matching that I was talking about.


Relevant: http://www.daimi.au.dk/~madst/tool/papers/expression.txt

The salient paragraph: "Whether a language can solve the Expression Problem is a salient indicator of its capacity for expression. One can think of cases as rows and functions as columns in a table. In a functional language, the rows are fixed (cases in a datatype declaration) but it is easy to add new columns (functions). In an object-oriented language, the columns are fixed (methods in a class declaration) but it is easy to add new rows (subclasses). We want to make it easy to add either rows or columns."
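A toy Python rendering of that table:

    # FP style: the cases (rows) are fixed; a new function (column) is one
    # definition, but a new case means touching every such function.
    def area(shape):
        if shape["kind"] == "circle":
            return 3.14159 * shape["r"] ** 2
        if shape["kind"] == "square":
            return shape["side"] ** 2
        raise ValueError(shape["kind"])

    # OO style: the functions (columns) are fixed; a new case (row) is one
    # subclass, but a new function means touching every class.
    class Circle:
        def __init__(self, r): self.r = r
        def area(self): return 3.14159 * self.r ** 2

    class Square:
        def __init__(self, side): self.side = side
        def area(self): return self.side ** 2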


Isn't the whole question what you need to do to get a nice, clean interface?

Also wouldn't it be true that the ideal data structure of a book would depend on whether you were optimizing it for reading or for writing?

Wouldn't it be better to just say we should follow whatever creates nice, clean interfaces but allows the developer access to internals as needed for debugging purposes? The problem I have with the principle of least knowledge (the Law of Demeter) is that it is used to make objects opaque. I think that's what causes the biggest backlash.

I mean the whole point is to create clean interfaces, right?

The law of least knowledge is interesting though, provided it is not an excuse for opacity. Why should my program have to know beforehand what parameters a stored procedure takes? Why not discover them at run-time and map in properties or other variables?


Umm... Pseudo-code follows

     module Book = struct
         type t = { sections: Section.t list; ... }
         ....
         (* concat the per-section lists so we return a flat paragraph list *)
         let get_paragraphs t = List.concat_map Section.get_paragraphs t.sections
         .....
     end
So it is not that hard to have both FP and the Law of Demeter.


DOM is a tree. Tree traversal, filter, fold, query, search... are well known.


So, there have been a few replies to this post, but as far as I can tell, you're asking about maintaining invariants ("not breaking the structure"), which nobody has addressed so far. Let's have a look at how this is handled in different types of languages (^ means I have extensive experience with these, so I might have some of those without a ^ wrong - some languages are also tough to categorise, but bear with me):

1. statically-typed OOP: e.g. C#^, Java^, C++^, F#, OCaml, Scala, etc.; specifying the type of a field is a kind of invariant assertion (e.g. name must be a string not a number), and depending on the expressiveness of the type system, these static assertions can become arbitrarily specific. You might still need to do runtime checks in the mutating methods if certain invariants (e.g. length > 0) can't be expressed statically in the language. If internal state can be set to mutable objects (e.g. mutable strings) from outside, you can still break the invariants by mutating the objects after they've been checked. Some type systems (such as Java's generic types) are even notoriously weak, requiring extra care.

2. dynamically-typed OOP: e.g. JavaScript^, Objective-C^(see note), Ruby, Python, Io, etc.; generally, no invariants are maintained by default, and everything happens at runtime. So unless you specifically check invariants in the object's methods, invariants can often easily be broken by passing in data with unexpected structure. It is often trivial to pass in objects that satisfy the invariants, then change them in arbitrary ways to violate them afterwards. Encapsulation is frequently also easy to break by extending classes or prototypes.

3. statically-typed functional or procedural: C^, Haskell, ML, ATS, etc.; just as with the statically-typed OOPLs, you can explicitly encode many invariants in the type system. You can't set a number-typed field in a record to a string value. To maintain "softer" invariants, you will typically have runtime checks in the mutating functions. The languages with a pure functional bent typically encourage concentrating mutable state into a select few places in the code. Those places can be gatekeepers for enforcing invariants.

4. dynamically-typed functional or procedural: Lisps such as Clojure^, Scheme and CL^, Erlang, etc.; where mutable state is allowed or the default, it's just as much a free-for-all as the dynamic OOPLs. You need to make your mutating functions aware of all your invariants, or they're trivially easy to break (see the sketch after this list). In the pure-functional or immutable-by-default languages, again, since mutation is concentrated to a few hotspots (transactions in Clojure, messages in Erlang), those hotspots can easily be made to refuse performing mutations that break invariants.
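Sketch for cases 2 and 4: in a dynamic language, the mutating code itself has to be the gatekeeper (Python, invented invariant):

    class Polygon:
        def __init__(self, points):
            self._points = list(points)
            self._check()

        def add_point(self, p):
            self._points.append(p)
            self._check()  # re-assert the invariant after every mutation

        def _check(self):
            # Invariant: a polygon has at least three vertices.
            if len(self._points) < 3:
                raise ValueError("polygon needs at least 3 points")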

Comparing 1 with 2, and 3 with 4 respectively, it's pretty obvious that the static languages make it easier to maintain invariants. The more expressive the type system, the more you can nail down the permitted structure. The culmination of this is theorem-proving languages. Interestingly, those tend to be functional.

So, if you compare 1 and 3, encapsulation buys you some safety in some cases, but realistically, it's just as easy to make a system that permits violation of invariants as it is in functional languages. Some of the more pure functional languages replace encapsulation with constrained mutability to help maintain invariants.

Comparing 2 and 4, I'd argue the functional languages actually do a better job helping you maintain invariants. The malleability of objects in (2) languages makes them hard to nail down. Immutability, on the other hand, at least provides stability in time, if not in structure.

Note that I'm only evaluating the ability to maintain invariants here, not trying to come up with an overall judgement on languages. In some situations, maintaining invariants is very, very useful. In others, you don't really care that much.

(note on Objective-C): while you declare fields and variables with a specific static type, any object-typed field can point to an object of any run-time type. So the static types are mostly just annotations that affect ARC semantics; they're not actually enforced or safe, and the language basically relies on duck typing. Key-Value Observing in particular makes some pretty deep changes to the run-time types of objects, which wouldn't be possible in e.g. Java. Unlike some of the other dynamic languages, the non-object C types and mostly-fixed memory layout of objects give some level of structural safety. Still, there's nothing stopping you from creating a number class that returns YES for isKindOfClass:[NSString class], thus making it straightforward to fool most invariant checks.


Hmmmm, well I wasn't asking about invariants in the way that this term is typically used. I was asking about adaptability to future changes. The lure of encapsulation has been that it places a firewall (i.e., the "interface" or "signature") between your representation for some information and the client code that needs access to this information, so that you are free to change the representation without changing the client code. It is true that having an unvarying interface is a form of invariant. The problem is that adapting to future needs may involve varying this invariant.

Rich Hickey claims that 90% of the time, however, encapsulation has the opposite of its desired effect, in part because it breaks the client's ability to use generic functions on the encapsulated data.

I know that I have personally seen quite a bit of OO code that did not live up to the promise of adaptability, and so I would like to know more precisely what alternate approach Rich Hickey (and, I presume, other FP advocates) proposes, and how it would address the issue of adaptability, with examples that bear on real-world situations.

Many responses here have been: "You can do encapsulation in FP languages." Sure you can, but that's not the alternate approach! That's the OO approach! I know that you can take an OO approach in functional programming languages, just as you can take a functional approach in OO programming languages, but Hickey and the OP specifically say that encapsulation is extremely over-used, and to think twice and then think twice again before doing it.

The talks I have seen by Rich Hickey are inspiring, but they demand a follow-up talk (or maybe book) that provides the gory details, rather than just more potentially dubious promises.


In most of your statically-typed OOP languages, a type declaration really is just an assertion like "the value bound to 'foo' is always a number". This falls far short of asserting that it has the intended properties. For example, "is a number" is miles away from "is the largest prime number less than n."

For most interesting programs, it is simply not enough to have something which is a number. So you must use some supplementary mechanism to verify all the interesting properties. In practice the supplementary mechanism is not as awkward as writing extended type theory proofs in your programs. Now if this supplementary mechanism is good enough to depend on for all the interesting correctness checks, why isn't it good enough for the basic stuff as well?

If you do make the type system expressive enough to verify everything, it is Turing complete, and many assertions are only verified by calculating them in the type system. But if you write a correct implementation in your Turing-complete type system, it is at least as easy to write a correct implementation in the ordinary way. So why not just write the same code twice, and raise an exception if a result does not get a quorum?

This might be a valid strategy for some cases. But in most cases, some kinds of errors are just much more important to cover than others.

Static vs. dynamic typing is not the issue. The first issue is what kind of system you have for annotating code with appropriate assertions - easier is better, assuming that more assertions are always better.

The second issue is how much of this annotation is mandatory. Push it to the limit and you get a "bondage and discipline" language. Pair it with a type system that is less than extraordinary and you spend a significant share of your time wrestling an awkward type system.

Some people seem to enjoy this activity. But unless you do most of your coding under the influence of heavy sedatives, the odds are that the majority of your errors are not simply passing the wrong primitive type. And unless something is really wrong with you, most of your errors are not due to your inability to put an unbreakable padlock on your own code to limit your own intentional malice. So it is certainly possible to over-emphasize these cases.

But that should be based on some understanding of costs. If a given class of error is infrequent and low-cost, there should be a burden to justify spending 50% more time always handling that class of error.

If the language understands those costs better than the human then that could be helpful (particularly applicable to DSLs). But if it does not then it is better for the human to be able to manage this for themselves.

The real argument seems to be about which classes of assertions should be mandatory, not anything about the internal type accounting. Seems to me that this should be a decision specific to the purpose. If you are teaching programming your requirements might be different from if you are writing missile guidance systems. Moreover, different people might have different tastes!

So I am getting pretty tired of the facile suggestion that static typing (in particular, as implemented in mainstream blub languages) is invariably safer and that this makes it an obvious choice. The only variable at hand is NOT how much you care about maintaining invariants. It is really not that simple. It is also very significant what kinds of invariants you mean to maintain and what amount of cost you are willing to absorb to maintain them.


So I have very limited experience with FP, and a reasonable amount with OO (mostly dynamically typed).

I can really see the benefits of FP, but there are some problems I have trouble modelling with FP.

For instance, if I have a simple 2D rendering engine, I just want to say "add this object to the screen". The object might be a geometric primitive (square, circle, etc.), it might be generated particles, or it might be an image or video or something. The way I deal with this at the moment is to have each of these things implement a Drawable interface with a "draw" method, and add them to the screen with something like:

    screen.addToScene(new Circle(...))
    screen.addToScene(new Square(...))
    screen.addToScene(new ParticleGenerator(...))
    screen.addToScene(new ImageSprite(...))
Then the game loops over each of the Drawables and calls .draw() on them, which is implemented differently by everything that implements Drawable.

How would I model this in FP?

The only solution I can think of at the moment is to have a draw function that does pattern matching on the type of thing and handles it that way. How do people do stuff like this in Scheme or other languages with limited support for pattern matching?
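In Haskell, I imagine it'd look something like the following rough sketch (constructor names made up, putStrLn standing in for actual rendering):

    data Shape = Circle Double      -- radius
               | Square Double      -- side length
               | Sprite FilePath    -- image to blit

    -- One draw function, one equation per kind of thing.
    draw :: Shape -> IO ()
    draw (Circle r) = putStrLn ("circle of radius " ++ show r)
    draw (Square s) = putStrLn ("square of side " ++ show s)
    draw (Sprite p) = putStrLn ("blit " ++ p)

    main :: IO ()
    main = mapM_ draw [Circle 1.0, Square 2.0, Sprite "ship.png"]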

The problem I have with this is that it means every time I want to add a new kind of thing, if it implements many methods in an interface, I have to go to many different files to implement how this new kind of thing works.

Among other things, that's a huge pain for revision control, since if I have 3 coworkers adding new kinds of things that can be drawn, we're all going to have to modify the draw function. In OO, we'd each just be creating a new subclass in its own new isolated file.

As a second question - what should I read to get a good idea of how to sanely model things in OO and FP? I've read a lot of debate about the right way of doing things, but I don't really know where to learn this stuff. The OO class in university was completely useless, since the examples were outrageously contrived and too small to see any real benefits. I'd ideally be looking for one book that explains how to model real problems in OO very clearly, and one book that explains how to model real problems in FP.


I encourage you to read Philip Wadler's essay The Expression Problem, which addressed precisely the dilemma you point out:

http://www.daimi.au.dk/~madst/tool/papers/expression.txt

In brief, if you think of data types as rows, and behaviors as columns, the question is how to extend either the rows or the columns naturally.

In your example, it is easy to add a new row (datatype) – create a Triangle class which implements the Drawable interface. It is difficult to add a new column (behavior) – if you realize all of these datatypes should also have an "extrude to additional dimension" behavior, you're going to have to individually implement that in all of your different classes, across many different files, etc. All of the problems that you note arise when adding a new datatype in the FP strawman.

It's important to recognize that this is indeed a difficult problem, and that addressing it well takes real care.

The c2 wiki has a good distillation:

http://c2.com/cgi/wiki?ExpressionProblem

and this paper (by the Scala people) is very nice:

http://www.scala-lang.org/docu/files/IC_TECH_REPORT_200433.p...

Some approaches that languages take, to varying degrees of success, include typeclasses in Haskell, multimethods in Clojure and elsewhere, the Visitor or Extended Visitor patterns in OO languages, controllable extensions in C#, Ruby, and Scala, etc.
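For a rough taste of the first option, a typeclass version of the drawing example might look like this in Haskell (names invented):

    class Drawable a where
      draw :: a -> String

    newtype Circle = Circle Double
    newtype Square = Square Double

    -- Each instance can live in its own file, next to its type.
    instance Drawable Circle where
      draw (Circle r) = "circle of radius " ++ show r

    instance Drawable Square where
      draw (Square s) = "square of side " ++ show s

A new datatype is a new type plus an instance; a new behavior is a new class plus instances. Neither forces edits to existing definitions, though heterogeneous collections then need existential types or similar tricks.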


Thanks for the sources - will sit down and read them later.

The row, column analogy is pretty apt - it's an interesting way of looking at things.

While it is true that adding a new behaviour in OOP is a pain in the ass, I definitely find myself adding new datatypes far more frequently than adding new behaviours for an interface.

Isn't adding a lot of new behaviours to data typically a sign that the interface is bloated and is now dealing with too many things?


To attach behaviour to data in a functional language, you typically use first-class functions. Your draw function would look for a function in each drawable object and call it.
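Sketched in Haskell (hypothetical names), that approach is just a record holding functions:

    -- A "drawable" is a value that carries its own behaviour with it.
    data Drawable = Drawable { draw :: IO () }

    circle :: Double -> Drawable
    circle r = Drawable { draw = putStrLn ("circle of radius " ++ show r) }

    square :: Double -> Drawable
    square s = Drawable { draw = putStrLn ("square of side " ++ show s) }

    main :: IO ()
    main = mapM_ draw [circle 1.0, square 2.0]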

To your second question: find some great open source projects and read their code.


The problem with open source projects is that great != popular.

There are certainly a lot of very successful projects that have painfully unmaintainable code that still get maintained because so many people are invested.

And the problem with great open source projects is that they'll only show you how they solved their problem right - but seeing how to do it wrong and _why_ it's wrong is typically a more useful process.

I think it's generally easier to avoid doing bad things you understand the problem with than try to write excellent code by blindly following patterns that you assume are good because someone well known uses them.


Excellent points. I'm doing a lot of code reading myself these days so it's nice to be reminded of some of the risks. :)


In the 1990's I saw OO as just another tool which was infinitely better than what I had at the time. I hazard a guess that most devs at the time were still working with procedural languages like C, Cobol, Fortran, Pascal or Basic. OO gave you abstraction and encapsulation, making it a little easier to write better code, that's all.

For me it was never OO v.s. FP, it was just OO v.s. the status quo. If OO was hyped up so some guys could make money from it (as the article suggests), then who was behind it? -- Bjarne Stroustrup, Anders Hejlsberg or James Gosling? I think not.


To me object oriented programming makes a program 'come to life'. In our daily lives we are surrounded by objects: trees, houses, books... to name just a few...

I love the fact that I can reason about these 'natural' concepts in my code. Thinking 'in objects' sparks my creativity and boosts my imagination. It helps me to visualize otherwise very abstract notions.

I love to talk about a 'Book' instead of an Array. With good Object Oriented code, technical concepts and natural ideas seem to come together. To me, the benefits of writing object oriented code have more to do with human-computer symbiosis ( http://en.wikipedia.org/wiki/Smalltalk ) than with pure technical correctness, it just fits my mind.

If you want to appreciate the real beauty of objects, I recommend skipping Java and C++ for a minute and looking at Smalltalk. I just read the Blue Book (Smalltalk-80) and I had tears in my eyes. The elegance and beauty of this language is just stunning.


I find grouping functional code along with the data the code is supposed to work on reasonably intuitive. The OO religionists definitely sold the world a bill of goods on the reuse arguments, and the religious fervor was silly (just like it is with today's functional zealots), but still.

Having an 'x' and hitting '.' and seeing what 'x' knows to do with itself doesn't suck.


From a purely scientific view, OO is a terrible idea because it moves the program further away from the mathematical form and makes it harder (if not impossible) to, say, logically prove the correctness of the program. But from a practical perspective OO is a great idea because it makes many things so much easier.


"But from a practical perspective OO is a great idea because it makes many things so much easier."

This seems true at first, but after having worked with C# a while I'm not so sure about that. OO introduces some weird issues that aren't immediately apparent:

1. Verbosity: when the shortest method call looks like "abcObject.functionXYZ()", code gets huge really fast character-wise. This actually does make it harder to read and debug existing code.

2. Multithreading: multithreading in any programming paradigm is a pain, I'll give you that. But OO exacerbates the problem because of the way each property of an object is essentially a global state within its local scope. It makes it quite tricky to enforce thread safety.

Having said that, I'm not sure that rewriting it in, say, a functional style would make things simpler. Sure, it is easier to prove correctness, but as soon as you introduce IO, everything goes to hell. I guess OO seems to be the worst pattern, except for all the others that have been tried.


I don't think verbosity is an issue with OO any more than it is with any other paradigm. 'abcObject.functionXYZ()' is not any more verbose than 'functionXYZ(abcDataStructure)'. I think it's mostly a question of the community's coding style, and C♯ and Java are notoriously verbose even in their standard library.


I think the advantage of OO is that it transforms the code into something that is closer to the "human" interpretation: modelling the problem like we have it in the real world. Instead of thinking like a computer, you can more easily think as a human and let the computer do the rest. Although of course it comes with trade-offs.


Wouldn't a purely scientific view have something to do with empirical data? You seem to rely on some kind of axiom that it's always better for programs to be in "the mathematical form". This would probably also rely on certain unexamined assumptions about the purpose of programming and the ergonomics of the tools when they are actually wielded by the grubby hands of a population of real people.


What about the benefits of abstraction? I'm not going to introduce hyperbole on how I think this article is overstated; instead I'm going to ask HN members who have more experience with functional programming how they leverage concepts similar to abstraction in Haskell, Clojure, Erlang, etc.


It's funny you should mention that--it seems Haskell usually gets accused of being too abstract. In Haskell, it's actually common to write very abstract code.

In fact, I think that is one reason why Haskell is tricky to pick up at first. OO "abstraction" is never that abstract--you can always tie it down to some concrete thing or idea. An iterable is obviously something you can iterate over, and that's about as abstract as you get.

Haskell, instead, finds its core abstractions in math. It uses ideas from abstract algebra and category theory that are more general than anything I've seen in OO. The core abstractions like the infamous monads, functors, monoids, arrows and so on are not easy to tie down to anything concrete at all.

For example, the standard functions that work on monads (like foldM) can work on a dizzying array of different types: functions, lists, nullables, continuations, errors, parsers, ASTs. Any given function like this does the same thing parametrized by the behavior of the type in question: so foldM always does a fold, but the exact behavior depends on the exact monad being used to do the folding. And there is no obvious relationship between the different behaviors that can be modelled this way.
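To make that concrete, here's a small sketch (standard foldM, invented example values):

    import Control.Monad (foldM)

    safeDiv :: Int -> Int -> Maybe Int
    safeDiv _ 0 = Nothing
    safeDiv a b = Just (a `div` b)

    main :: IO ()
    main = do
      -- In Maybe, the fold is still a fold, but it aborts on failure.
      print (foldM safeDiv 1000 [10, 5])      -- Just 20
      print (foldM safeDiv 1000 [10, 0, 5])   -- Nothing
      -- In the list monad, the same fold explores every branch.
      print (foldM (\acc x -> [acc + x, acc * x]) 1 [2, 3])  -- [6,9,5,6]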

In short: Haskell uses ideas from math to get very abstract code, for better or worse. These mathematical concepts can model a gigantic amount of different things which, intuitively, may not seem related at first.


Abstraction:

   * Haskell typeclasses
   * Clojure Protocols, Multimethods
   * Erlang Processes
Abstraction is not limited to Objects. Objects are just one way of expressing abstractions.


Thank you. This gave me enough keywords to do some good research. Now clearly these methods could even be applied to Ruby and other OO-enabled languages as well.


> What about the benefits of abstraction?

You seem to have a strangely narrow definition of "abstraction". Start here: http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-4.html#...


There was a substantial discussion when this was submitted 3 1/2 years ago: http://news.ycombinator.com/item?id=474919

I do wonder if this discussion repeats all the same points, or if it raises new ones.


What I find so obtrusive about OOP, and what I feel is a massive issue (maybe it has to do with the last sentence in Joe's post), is that OOP is pushed into places it does not belong and causes a lot of impedance issues.

If something doesn't talk OOP, OOP developers want to make it talk OOP -- ORMs and SQL databases, for example.

SQL is tables and sets; most of computing uses sets and tuples. Yet objects need to be serialized and abstracted away and pushed in, almost becoming a data type in themselves.

And I think there are other issues and previous failures like this.


While I dislike OOP, I think that CLU-style ADTs have some merits, especially when the ADT is implemented in a pure functional way.

Just because you don't have explicit schema, doesn't mean that you have no implicit schema.

Likewise just because you don't have explicit objects and classes, doesn't mean that you have no implicit objects and classes.

I code in Erlang, and I treat every gen_server either as a singleton object or as a class (in case I spawn many instances of it).


Wow, CLU! Those were the days!


The whole basis of this argument is that functions and data structures should not be "locked in a cage together". Replace object with "file" in the entire article and you'd have an equally but more obviously ridiculous argument.

Does OOP have problems? Sure. Is Erlang great in some ways? Sure. But this argument is silly.


I think a project along the lines of Todo MVC for general programming languages - https://github.com/addyosmani/todomvc/ - would work really well for illustrating these kind of debates.

Ideally it'd be a reasonably involved problem domain (rather than a todo list) with persistence, networking, and something which requires parallelism/concurrency (I'm sure there are other categories too). This'd expose each language to the types of complexity that expose the really interesting differences - does language X allow a clean API even when we require immutability for parallelism, does language Y impose boilerplate on simple problems, does language Z require unreadable line noise?

I find these debates nearly useless without evidence and code to read.


OO is just one design pattern of many; it should be used when the mental model is a good fit for the problem domain. It does annoy me, though, when languages or frameworks force the use of OO when it's unneeded.

Use a function when you need a function, and a class when you need a class.


My sentiments exactly; it couldn't have been put better than the author did in the opening sentence: When I was first introduced to the idea of OOP I was skeptical but didn't know why - it just felt "wrong".


What I find funny is that all these OO design patterns and best practices are aimed at solving problems that don't exist in FP. Of course I may be over-generalizing, but you get my point.


It would be useful if Armstrong stated which languages and which programming styles he disagreed with.

"OO" is too vague. Not only does it include languages as different as JavaScript, Perl, C++ and Java, but there are wildly different ways of using each of these languages.

For instance it is entirely possible to write purely functional code in JavaScript if you want to.


Heh. The "-deftype second() = 1..60." is a problem, since some minutes have 61 seconds in them.


Author does not understand OO, or argues against strawman OO:

> In an OOPL I have to choose some base object in which I will define the ubiquitous data structure, all other objects that want to use this data structure must inherit this object.


His history is wrong. OO won in large part for a good reason -- it was a way of implementing, if not enforcing, modularity. One that people accepted, unlike LISP.


I like modules that are more like objects. I would like to supply parameters to their constructors and be able to refer to them via names.

Languages without modules, like JavaScript, support that kind of thing.

http://wadler.blogspot.com.au/2009/08/objects-as-modules-in-...


"Reason 2 - It was thought to make code reuse easier. I see no evidence of 1 and 2."

https://rubygems.org/stats


Code reuse works at the library level, not the individual object level. You're not exactly fishing an object out of a worldwide sea of classes there.

Actually, immutability can make code reuse pretty easy, but many OO systems make no effort to encourage that (one reason Python is not my favorite language).


Data structures and functions are very much related and connected. In fact, all data structures are invented to perform certain specific functions. OO is not the answer to all things, but it provides a natural way to fuse data structures and their associated functions together.


In this case, "function" means mathematical function rather than purpose.

Of course, if you want to take a somewhat reductionist approach, you could argue that data structures are functions. After all, in the lambda calculus, all you have are functions, and you can use them to represent numbers, pairs, lists, booleans, conditionals--basically whatever you want.


Conversations around programmatic semantics and languages and the like always come and go over time, and feelings about them ebb and flow. Over umpteen years as a developer, I can say that I've found there's something about every language that will cause one to say "why do I have to think about this like that?" Nothing is perfect, but certainly a lot of languages do a few things quite well.

As such, I would be grateful to hear an argument of why object-oriented programming structures are incorrect. I disagree with the reasons provided by the OP because of the slant toward personal preference. The arguments posted here are specious; I can find holes in each of the points made.

1. Data structure and functions should not be bound together - very true, they should be independent. However, this statement: "Objects bind functions and data structures together in indivisible units." This implies how something is implemented (or rather ALWAYS implemented), and while a tight binding is possible in most OO-supporting languages, it's not requisite. Just because the ability to violate this exists doesn't make it awful; it just makes it incumbent on the programmer to use the right approach in a given situation.

2. Everything has to be an object - in some languages, this is true. However, this causes what problems? For the OP, this is nothing more than semantic ickiness. I won't defend any implementation of things like time and date and other primitives, but the chief complaint here seems to be how that information is accessed and the form it takes. I simply find the "this-is-an-object-so-it-feels-wrong" argument quite lacking.

3. In an OOPL data type definitions are spread out all over the place - this is organizational, but I'm not sure what "find" means in context. I guess it depends on the language being used, but I question why this is an issue for the OP. "In Erlang or C I can define all my data types in a single include file or data dictionary." I can do the same thing in Java or C# or other languages, if I want. For most developers, "finding" data type definitions has more to do with documentation than the actual language.

4. Objects have private state - of course they do, it's the nature of OOP. This statement: "State is the root of all evil. In particular functions with side effects should be avoided." This is unfounded (not the side effects part, which has nothing to do with state.) State, as the OP points out, exists in reality but should be eliminated from programming. Just as the bank example points out, one will want state to be accounted for in cases of deposits and withdrawals from an account. Thinking that state can only be handled in a certain way (which is what this argument suggests) is limiting in evaluation and unimaginative in assessment.

Most of the arguments show personal preference to application development, and with that I totally understand. But these arguments are intended to show why the languages which support OO are conceptually wrong, as if the concepts of the alternative are an accepted truism.


not the side effects part, which has nothing to do with state

State is a side effect. In fact, that's basically the definition of side effect -- I think you need to revisit the basic principles of formal semantics.

State, as the OP points out, exists in reality but should be eliminated from programming

State is the entire point of programming. State shouldn't be eliminated, it should be corralled so effects are traceable and formalizable.

Objects rely on being mutable. An object-oriented language pushes objects everywhere. This pushes mutation everywhere. This is the opposite of corralled.


> State is a side effect. In fact, that's basically the definition of side effect -- I think you need to revisit the basic principles of formal semantics.

You're right, thanks for the correction. My line of thinking was that addressing the issue of side effects in programmatic functions was not the same thing as addressing state -- while not mutually exclusive, they're not the same thing.

> State is the entire point of programming. State shouldn't be eliminated, it should be corralled so effects are traceable and formalizable.

I won't go so far as state being the entire point, but it's simply another way to programmatically solve a problem. It absolutely should not be eliminated, but rather used responsibly -- just as with anything else in a programmatic environment.


I won't go so far as state being the entire point, but it's simply another way to programmatically solve a problem.

Writing to memory is stateful. Writing to a file is stateful. Printing something to the screen is stateful. Seeing the effect of your program is stateful. Would you say that the point of programming is to produce an effect?


That's fine. All your points are perfectly valid, but they have nothing to do with OOP, which is what I was referencing.


Objects rely on being mutable. An object-oriented language pushes objects everywhere. This pushes mutation everywhere. This is the opposite of corralled.

As I said in another comment:

Note that you can have objects without OOP. Python has objects. Python is not object-oriented. Same with OCaml or Racket. I'm not arguing against using state. If you're programming a state machine, you may want to model it with state. That would be a pretty good choice. The problem is that OOP says everything is a state machine.


I don't get it. If people don't like OO, why don't they just not use it? Just use your favorite methodology to get the job done. Why do they have to badmouth it?


Because they want to promote discussion and exchange opinions, and maybe advise others against something that they think is a bad idea. To just silently avoid stuff is a very stifling and unproductive way to handle things.

They do not badmouth OOP. It is not a human being with reputation and feelings. They criticize its usage.


But he is criticizing the design and fundamentals of OOP. Something doesn't have to be human or have a reputation to be badmouthed...


If we fail to point out the harmful effects of excessive OO usage then there is a higher likelihood of encountering it as legacy code written by people who didn't know any better. This can lead to wasted time and decreased job satisfaction.


He is the father of Erlang, so he already did "just use your favorite methodology to get the job done."

"Why do they have to bad mouth it?" Advertising, of the "hey, come check out my oil, it is so much better than your regular oil" variety.


We, the software development professionals, have learned some things about the properties of OO techniques and should share that knowledge.

People use OO as a way to attack good design and defend junk. Here is an example:

> We believe bound method references are harmful because they detract from the simplicity of the Java programming language and the pervasively object-oriented character of the APIs.

From here:

http://java.sun.com/docs/white/delegates.html


I don't know the context of the post, but remember that Erlang is a community effort, not a dictatorship. He has to expose his arguments and motives in order to keep using his favourite methodology.


OOP as a collection of practical principles for structuring code is good.

But mandatory OOP, and OOP as religion, is something unforgivable to me.


I can express my OO ideas in erlang with no problem (objects become processes).

I cannot express my concurrency ideas-- that erlang makes super easy-- in the OO languages.

Maybe Go changes this but I haven't used Go yet.


Agreed. And Joe Armstrong himself used this argument here: http://www.infoq.com/interviews/johnson-armstrong-oop


Agreed that processes in Erlang are even better objects than the objects in OOP.

>>I cannot express my concurrency ideas-- that erlang makes super easy-- in the OO languages.

In fact, the main Erlang ideas have been "stolen" many times. I discover Erlang in Scala (obvious), F# (also obvious), and D (also obvious).

Actors + messaging = "DIY Erlang".

Although I do not know a language that makes code hotswapping a fundamental feature.


The people who originally came up with OOP knew what they were doing. The inspiration was the cell, which hides immense mechanical complexity behind a simpler interface of electrical and chemical signals. When interfaces are simple, it limits the unexpected dependencies that can exist between software modules. Alan Kay wasn't saying, "Go off and write bloated objects" but, "When software must be complex, strive to provide simple APIs."

For example, when you write a SQL query, do you have to micromanage the database in how it's performed? No. That's an example of encapsulation. The implementations are different, but the interface is fairly stable. That's something OOP-like that works very well-- an interface that hides (encapsulates) complexity.

Objects are general and powerful, and that's part of the problem. "Power" isn't always good; GOTO is also extremely powerful, but should be used sparingly. Objects are not specific. Is it a function, or a tangle of methods, or a data object, or a "singleton" module? This isn't clear, and it becomes even less clear in the industrial world where tens of hands pass over code and it turns into mashed potatoes.

There are some good ideas in "object-oriented programming", but it's also an extremely complicated programming model and it's hard to develop OOP code correctly. If you don't know what "open recursion" is and why it's dangerous, you shouldn't be doing OOP.

What happened in the 1990s is that there was an effort (now becoming acknowledged as a failure) to commoditize programming talent-- to make 5 mediocre programmers able to replace a great one, and thereby prevent what we see now (the long-term "threat" of top software engineers outclassing professional managers in social status and compensation). Thus was born a bunch of design-pattern cargo-cult stuff designed to make programming slow, tedious, and limited but easy enough that mediocre people could do it, if they were stacked on top of each other in large enough numbers. Thus were born bloated, horrible codebases that bastardized "object-oriented programming" beyond imagination-- 21st-century spaghetti code.

People should be required to learn the basics of programming first. They should start with immutable data objects, referentially transparent functions. Mutable reference cells can come next as an optimization. Then, it's a good idea to learn type systems through OCaml or Haskell. After that, they can tackle the hybrid OO/FP of Scala. (Once you've learned Scala, there's no reason to use Java unless you need performance.) There are times to use OO and times to use FP, but if you aren't smart or curious or dedicated enough to grok FP, then you'll never actually understand OO either and you have no business trying to use it.

People learn best when they're presented with one new concept at a time, and the problem with OOP is that it presents tens of new concepts at the same time, with no separation. It leads to cargo-cult programming because people start coupling concepts that don't necessarily belong together.


"People should be required to learn the basics of programming first. They should start with immutable data objects"

Immutable data objects and all that stuff are not the basics of programming. They never were. Unless you single-handedly redefine the meaning of "programming".

The whole text you just wrote is full of dumb self-praise and deprecation of others, based on worship of principles that were never proven to produce good software, or any software at all. This receiving upvotes makes me sad.


There's no need to redefine programming. There are essentially two fundamental ways to approach a program: what it does (coming from EE) and what it means (coming from math).

If you're coming from the first perspective, then mutation is fundamental. Registers are always changing in value and everybody likes self-modifying assembly.

However, if you're coming from the second direction--which is at least as fundamental as the first--the basis of computation does not involve mutation. This is where the lambda calculus comes in. It's the abstraction that underlies most of the mathematically well-behaved languages. And in the lambda calculus, as well as a bunch of its popular variations, there is no mutation.

From this second perspective, immutable data is a basic building block of programming. Essentially, all you have are functions that can bind names to arguments and be applied to each other. It's a surprisingly simple and elegant model.
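A tiny illustration, written here as Haskell: Church booleans encode data as nothing but functions.

    -- A boolean is a function that chooses between two alternatives.
    true, false :: a -> a -> a
    true  t _ = t
    false _ f = f

    -- "if" is just function application; no state, no mutation.
    main :: IO ()
    main = putStrLn (true "yes" "no")   -- prints "yes"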

There are some other popular models of a computer program like the SKI calculus. However, it's easy to translate between the lambda calculus and SKI calculus, so for the purpose of this discussion they are effectively the same.

So from a more mathematical standpoint, functional programming is more fundamental than imperative programming. Redefining the meaning of "programming" is not required!


Programming is about writing some code that does useful work. Okay?

When you argue that programming is something else, you're doing just that: redefining.

Because from the first days, programming meant building computational devices or programming programmable devices. And those have changing state, and then they have machine code, which mutates state. It's how it always was, and now some people want to redefine it, but I say no to them. Programming is not CS. Programming is about producing programs. Historically and statistically, programs are written to work by mutating their state.


Turing machines have mutable state. The lambda calculus does not. Yet any algorithm which can be expressed by one can also be expressed by the other. We have proof that state is merely an implementation detail, which happens to be widespread in today's hardware because we figured out how to implement it at scale.

Programmers need to understand the difference between the abstract algorithm they're trying to express and the representation they've chosen (and the ones they didn't) inside the computers they're using, just as astronomers need to distinguish between the stars they study and the measurements coming out of the various telescopes they can choose.


State is an implementation detail to today's hardware

And yesterday's hardware

And the day before that

And pretty long time, frankly - do you see a pattern here? And it does have its deep reasons.

Programming is not about algorithms. It's about making inanimate matter behave. CS might be about algorithms, but CS is not programming.


Your hard-line stance is a little bit too far, IMHO. The inanimate matter that programming is all about manipulating also has no notion of objects or object orientation... or (for the most part) methods, fields, functions, procedures, arrays, modules, encapsulation, interfaces, garbage collection, databases... and so on and so on. All that abstraction exists to make programming easier for humans to grok, and for teams to deliver some "useful work".

To borrow from Robin Milner, who worked on ML: "Types are the leaven of computer programming; they make it digestible." All that abstraction makes programming more digestible, or more palatable. Functional programming is just some more abstraction to make it more digestible. To say that it's a redefinition of programming is just a distraction, since programming has already been redefined.

Also, you're right that CS isn't programming, and viewed in a certain light one could argue CS is basically just a branch of math. But on the other hand, programming without CS really wouldn't be that useful. It's just not a helpful distinction to make in this way.


I would argue that objects, methods, fields, modules, encapsulation, interfaces, garbage collection, databases are not the basics of programming either.

Good things to have, but are not basics of programming. So, teach/learn them later in the cycle.

Basics of programming are variables, branches, arrays, loops and simple I/O. You can go a long way with these. And you can explain those to a six-year-old child.


Recursion is fundamental. Reassigning to a bunch of variables in a loop is a poor imitation of recursion that caught on because fifty years ago we could barely afford call stacks and hadn't invented tail call optimization. The better our hardware and our theory gets, the more of these shortcuts we can and should leave behind, so that we no longer need to try to preserve the illusion that programs run sequentially and state is globally consistent.
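For example (a Haskell sketch), the accumulate-in-a-loop pattern is just a tail call, which a compiler like GHC turns into the same kind of jump a loop would use:

    -- Sum 1..n with an accumulator instead of a mutable loop variable.
    sumTo :: Int -> Int
    sumTo n = go 0 1
      where
        go acc i
          | i > n     = acc
          | otherwise = go (acc + i) (i + 1)   -- tail call, constant stack

    main :: IO ()
    main = print (sumTo 100)   -- 5050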


It's not an illusion, it's in the CPU.

It hasn't happened yet; call me again when it does.


Everything we do in any high level programming languages is an illusion. In fact, even assembly language is an illusion. The only thing that isn't an illusion is building a computer out of nand gates, and the like.

In my education, we STARTED with functional programming, and eventually we built real microcoded computers that we designed ourselves out of nand gates and adder chips. And, yes, I mean REAL chips with lots and lots of wires on a real physical breadboard.

Is a nand gate, the fundamental building block of all computers, functional? You bet your ass it is! State only comes by using this functional component and feedback to make a flip-flop.

Your premises are all wrong: my education started with FUNCTIONAL programming. We then learned how to build OO on top of functional programming, using only closures and a single mutation operation: variable rebinding. We then went on to design and build our own physical computers out of chips and wires, starting with the fundamental functional component: a nand gate.


You do not program nand gates. You program your CPU. Last time I checked, it's not Reduceron II but an AMD64 or ARM CPU. One which is inherently imperative - it's a von Neumann machine. It has registers, shared writable memory and stuff.


First of all, as I have mentioned, I have implemented programs using real nand gates, with my very own hands, using copper wires and microcode.

Furthermore, my CPU is made out of nand gates, so yes I do program nand gates every day.

If you deny this reductionism, then I deny your reductionism to the CPU. I don't program the CPU--I program the Scala compiler. My Scala code doesn't run on an AMD64, or an ARM CPU, it runs on ANYTHING, past, present, and future for which the Scala compiler has been implemented. In the future, my program runs on DNA or on neutronium Tinker Toys.

Traveling into the past, rather than the future, I have written Lisp code and then compiled it to microcode.

P.S. I don't trust the opinion of anyone who hasn't built a computer out of nand gates, as clearly they are out of touch with the hardware. Have you built your own CPU from scratch?


"I" is the last letter of the alphabet.

(lol, it isn't, you got me)

What I see in your comments is the constant stream of I's. I, me, I did, I studied, I think this, I think that. Guess what? Nobody cares that much except for you.

When talking about basics, we should consider the least common denominator of all programmers' knowledge. You can reason that the basics should be different, but what if you are wrong? You are a smart guy who loves Scala, but there are other smart guys who consider Scala a failure, for example. And there are guys smarter than you and me who just do it all in C because they can't be bothered with abstractions, and FP is an abstraction in the end.

People are different, and if you think that you're such a unique yet the most important snowflake - why?

Nobody questions your ability to write successful FP code. So what? What does it prove? Nobody even questions that. But there are people who write FORTH. And there are people who write Prolog. Both groups claim they get much bigger bang for the buck than anyone else. Why do we listen to you and not them?


You keep expressing your opinion. Please tell me how there is no "I" in your opinion. I see two of them: o-p-I-n-I-o-n.

Also, I never said not to listen to any of these other folks. It was you who told me that I didn't learn to program in my first programming class because it started with functional programming as the most fundamental concept. That statement was offensive and false, and I am now demonstrating to you why you are so wrong.

This is not to say that there are not many possible reasonable approaches. I think the one I learned was great. Are there better approaches? Who knows? Certainly not you.


I didn't really mean to offend SICP. I suspect it's overrated a bit, but since I've never actually tried it I'm not qualified.

What I was arguing is that you should not call "immutable data structures" and "referential transparency" basics of programming.


SICP is overrated to whom? For me, it was the most profound and wonderful educational experience of my life. I found it to be profoundly inspirational and educational. I found it to be beautiful, uplifting, moving, rewarding, and just incredible in all sorts of ways that I cannot even begin to express. And it pumped up my joy of learning in a way that survives to this day, decades later. There are not enough words in the English language for me to express how much this course meant to me and how much intellectual joy it provided me.

Of course, YMMV.

As to what I should and should not do, who are you to tell me that?

Mathematically, immutability is more fundamental than mutability, as the mathematical models for computer programs all model mutability in terms of immutability.

Pedagogically, either approach seems to work. Which one is better? For me, the functional approach was FAR superior. The best pedagogical approach is almost always the one that is the most inspirational, and SICP truly inspired me. I found it to be beautiful, whereas I found the more traditional approach to be merely workmanlike.

Having received both kinds of education, I'm qualified to say which worked better for me. Is my experience representative? Well, lots of people who have learned via this approach seem to agree with me. Zillions of people LOVE SICP. On the other hand, zillions of people swear by imperative approach. The best approach then might vary from individual to individual. There may be no one-size-fits-all education. But I suppose that we cannot rule out that future education researchers might show that on average one approach is better than the other. The Logo people claim to have done such research for children, and they settled on the functional approach. But children are not adults, and I'm sure their research did not reach the level of proven fact.

As to which approach results in engineers who produce the highest quality software, I believe that having a deep understanding of the functional approach results in higher quality software, and the best way to acquire such a deep understanding is to start with it from the very beginning. I am not prepared to prove this assertion, but neither are you prepared to prove its contradiction.

As to which approach is closest to the hardware du jour, who cares? That has nothing to do with anything, unless you are writing in assembly language. I will point out, however, that today's hardware comes with multiple cores, and this trend is only likely to increase. Functional approaches to programming extend very naturally to multiple cores, while imperative styles, not so much.


If we were talking about teaching programming to kids, I actually agree with you. It wasn't clear from the top-level comment that started this thread if that's what we were talking about though. I thought that we were talking about intro to CS for freshmen in college. That's where I think it might be valuable to start with functional programming. Note: might -- I really don't know.


"People should be required to learn the basics of programming first. They should start with immutable data objects"

We weren't talking about freshmen in college. And talking about freshmen in college is not so interesting because they agreed to listen to bullshit (information of questionable utility) by joining college.

It's much more interesting to talk about kids or adults because they have the choice.


> And you can explain those to 6 yr old child.

The most popular programming language for teaching children is Logo, a FUNCTIONAL programming language!


I can't comment on this since I've never seen Logo in the wild. I know people who did and they never learned how to program.


Here's Logo in the wild:

   http://ccl.northwestern.edu/netlogo/
It's used to do all sorts of interesting things, including real economic modeling. I.e., I've seen it used by real economists.

When I first saw NetLogo being used by an economist to demonstrate an economic model, I thought it strange that the NetLogo people had re-used the name "Logo", a kid's programming language, for something that is a serious tool used by professional researchers. It took me a while to figure out that it is the same Logo! And those kids, I guess, grew up, and decided to do something real with it.

R is also a functional programming language used to do very real work.


They go a long way for kids.

The inventors of C++, Java, Perl, Python, Erlang are trying to solve problems basic to many other domains. So it depends on what you define as basic. What you seem to define are prerequisites to start learning the basics.


Computer science: the science behind making computers work.

Hacking: getting shit to work in a fast manner/making something faster.

These are two orthogonal philosophies that give you very different perspectives, and they are both labeled as "programming."

So, how about this: the next time you want to deride someone's opinion because they are misusing a term, why don't you clue the rest of the class into what your definition of the term is?


Coming off conceited doesn't make him wrong regarding CS curriculum. He believes that people should have strong underpinnings in functional programming principles before learning oop. I think fp plays off required math classes pretty well.

Consider function composition which is commonly taught in required pre-calc courses.

f(x) = ......

f(g(x)) = ......

I will submit that first-class functions map more directly to function composition than polymorphism does. In turn, this background helps new developers have a stronger understanding of polymorphism and when to use it.
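For example, in a language with first-class functions (Haskell here), composition is a one-liner:

    f, g :: Double -> Double
    f x = x * x
    g x = x + 1

    -- (f . g) is exactly the f(g(x)) from pre-calc.
    main :: IO ()
    main = print ((f . g) 3)   -- (3 + 1)^2 = 16.0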

Whether or not fp principles were ever "proven to produce good software, or any software at all" is beside the point. What's more important is whether or not fp ideas make for better DEVELOPERS. Maybe they do.


Okay, but CS is not programming. After all, programming is not CS.

And by the way, I don't think anybody should first know anything before learning how to program. Programming is like breathing. Do you need to study anything before you breathe? No you don't!

So anybody who argues that you should learn all sorts of not useful things before having the privilege of trying useful things, is a strange person indeed to me.


Programming is not like breathing.


Why?


Does this really need to be said?

One is a semi-automated bodily process necessary for extracting oxygen from the atmosphere, and the other is writing instructions in strictly defined languages to control the resources of computer systems.

Other than a contrived emotional connection you might make between the two, they aren't comparable.


You assume that you live in a physical world, but really you live in an information world.

I think you do, and you will increasingly do so as time passes. So, being able to control information flow becomes as important as controlling air flow. Yes, you don't die by being unable to do so for five minutes. You don't live that much either.


>Immutable data objects and all the stuff are not the basics of programming. They never were. Unless you single-handedly redefine the meaning of "programming".

Immutability was considered to be more fundamental when I took my first programming course at MIT. I received a very good education.

We, of course, eventually moved on to objects with mutable state, but they were definitely considered to be more dangerous and less fundamental.


It means the course wasn't about programming. Maybe it was misnamed or maybe you approximate its real title.

Me, not really interested in what people who never wrote any widely used software consider more what and less what.

Programming is what happens (and always happened) on top of computing devices that have, you know, memory, registers and opcodes. It's just the hard truth. That's what programming is. Writing code that compiles (interprets) as machine code for said computing devices. All other kinds of stuff - not programming. Call it CS, call it whatever.


> It means the course wasn't about programming.

Parent of your comment might have been talking about the Structure and Interpretation of Computer Programs. Likely it needs no introduction, but it is absolutely about programming, and goes right down the rabbit-hole to implementing interpreters, virtual register machines, and a compiler for said machine. Yes, the first couple of chapters are mostly purely functional, but that doesn't detract from the rest, which includes mutable code as well.

Really, I don't know why you're so hostile to this pedagogical approach. Do you have experience with teaching CS and/or programming, and have found using C or assembly or following some kind of Java school curriculum more effective? Have you found teaching functional programming first deficient? Or maybe you're looking at this from the perspective of an employer and have had to deal with hiring new grads who are brain damaged by functional intro CS classes?

I, for one, welcome this kind of experimentation or change of pace on the part of CS educators like Robert Harper at CMU (http://existentialtype.wordpress.com/). It can't be worse than what I had (mediocre, big box state Java school)


Yes, the class was SICP. I can't say enough good things about how it influenced my still-forming mind. I went to MIT because I wanted to study black holes and neutron stars, and after this class, I learned that there was something even cooler than that!


I have experience suggesting that to learn programming, you write programs. The hard way, as Zed Shaw calls it.

And when you have enough experience writing programs with limited success, then you are ready for some theory. You already know how to separate bullshit from useful things, for one.

If you never wrote any programs, you would not be able to make that separation, which is bad for you.


Where do you get the crazy idea that you can't write functional programs? Or that SICP didn't have you write programs? SICP didn't start off with "theory"--it started off with PROGRAMMING. The programming that you did at first, however, just so happened to be recursion-heavy functional programming. The class then taught you how to build your own mutable object system out of the functional primitives, using closures and messages.

I clearly got a far better introduction to PROGRAMMING than you did. E.g., I learned how to add an OO subsystem to a functional programming language in my very first programming class.

SICP was a lab class. It nearly killed me with all the programming I had to do. It was billed as a 15-hour-a-week class, but 40 hours a week was more like it. I lived in the computer lab during that class. The intense programming assignments began on the first day, and continued all the way through to the end.

Is there any other way that you would like to continue to advertise your unsubstantiated bias and ignorance on the issue?


And there you go again, as the OP did. Self-praise, deprecation of others. What happened to humility?

You can write functional programs. You can also write concatenative programs and logic programs. Why don't we suggest that people should be required to learn about beating around the stack and predicates first?

What makes functional programming so special? The hubris of its followers?

I can tell you now what makes imperative programming special. It allows all other kinds of programming to function. And it's the most used kind - by large. So, sorry. Basics are just so big that they won't fit any FP concepts. Loops, variables, procedures, branches, arrays. There you go. Be blessed and do not sin.

P. S. The whole concept of immutability linguistically depends on the default of mutability. Where did that default come from? Math does not have the concept of mutability. It came from imperative programming! So, we are told that people should learn basic things whose definition depends on understanding non-basic things.


> And there you go again, as the OP did. Self-praise, deprecation of others. What happened to humility?

That's surely the pot calling the kettle black! First of all, I didn't praise myself--I praised the education that I was fortunate enough to receive. It was first rate. Furthermore I'm not disparaging you because it's necessarily bad to learn imperative programming first. I'm disparaging your willful ignorance, and your knee-jerk assertion that people who learned to program in a different manner from how you learned, didn't learn to program at all.

Your attitude here is offensive and counter-productive.

As to what makes functional programming "special": it results in code that is easier to reason about, and consequently to maintain and adapt. It also makes it easier to write "reusable" code, which can then be used unmodified in future projects.
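
For a trivial, made-up example:

    # A referentially transparent function: it reads and writes no state
    # outside itself, so it can be tested and reused anywhere, unmodified.
    def total_with_tax(prices, tax_rate):
        return sum(prices) * (1 + tax_rate)

    total_with_tax([10.0, 20.0], 0.05)  # -> 31.5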

On the other hand, FP is not a magic elixir. It is just one of many techniques that a good software engineer should understand. When I was first taught to program as an undergraduate at MIT, however, it was considered the most fundamental technique.


Victor Pelevin, a great writer, noted in his book Empire V: (my translation)

"The main idea that person tries to convey to others is that he has access to much better consumption than they would think. At the same time he tries to tell them that their way of consumption is much less of premium than they were thinking."

And now you're telling me that you had access to a much better education and that my education surely sucked ass. It's like the part of Borat's Kazakhstan anthem about Uzbekistan.

I do not argue that FP results in code that is easier to reason about. I'm just arguing that you should not call FP concepts like immutability and referential transparency the "basics of programming". Programming is big, and "basics" implies something very small: the kind of thing you'd take with you to a desert island.

Demanding that everyone pray to the functional gods before they are allowed to learn any programming, as the OP implied, is a naive and dangerous dream. You're not alone in this; please behave.


> And now you're telling me that you had access to much better education and that my education surely sucked ass.

Your education is obviously lacking because you say uneducated things. It's as simple as that.

For instance, you asserted that if someone's first programming class started off with functional programming, then they didn't learn to program at all. That assertion is offensive, uneducated malarkey. The quality of your education speaks for itself.

As to what people should have to do before learning any programming: no one has implied any such thing.

As to behaving: you first!


"As to what people should have to do before learning any programming: no one has implied any such thing."

"People should be required to learn the basics of programming first. They should start with immutable data objects, referentially transparent functions."

Either you are lying, or "should" means something different in your English.


As to that quote: first, you give it more weight than it was intended to carry. It was someone's opinion on the best way to educate, not a statement about using force to allow or disallow anything.

More importantly, that quote is neither here nor there in the discussion between you and me. I did not utter it, you did not utter it, and the OP did not utter it. It is therefore irrelevant, unless you alert me to whatever particular bee you have in your bonnet.

Additionally, it is almost certainly true that any first class in programming should start with numbers, which are immutable data objects, and mathematical functions, which are referentially transparent.

E.g.,

   # referentially transparent: the same input always yields the same output
   def inc(x): return x + 1
Consequently, there is nothing at all incorrect about that assertion.

In any case, I am done with this conversation, and with you. You have a very rigid, myopic and, yes, ill-educated opinion. I do not wish to converse with you further.


>It means the course wasn't about programming. Maybe it was misnamed or maybe you approximate its real title.

Oops, my mistake. I took you for a serious poster, not a troll.

The class I took at MIT (SICP) and the textbook that was used are world-famous, and it was certainly a programming class.

As to having "never written any widely used software": Among other things, I wrote the software that was used to configure a NASA space telescope (Rossi X-ray Timing Explorer), from which about 1,000 peer-reviewed astronomy papers were published. I've worked on brain surgery software (3D Slicer) and now I work on software used for doing RNA Interference-based cancer research.

But I guess you've made your point: all that is just "Computer Science" noodling, not real programming. So, sure, don't listen to what I say.


"Me, not really interested in what people who never wrote any widely used software consider more what and less what."

This argument is getting really really really old. Anyone with 5 minutes and a working internet connection can find hundreds of successful widely used applications written with a focus on immutability, either at the language level or the design level.


Two points:

The people who write software with a focus on immutability are not the same people who advocate shoving immutability down students' throats. For example, I do write some functional code, but pragmatically.

And even then, the total share of functional code is negligible. Sad but true.


> It means the course wasn't about programming.

You've pretty much blown any credibility you had right there.


"The people who originally came up with OOP knew what they were doing. The inspiration [of OOP] was the cell, which hides immense mechanical complexity behind a simpler interface..."

I think you are making two mistakes:

(1) Assuming that any good API/interface must be object-oriented; and

(2) Assuming that, when using object-oriented programming, good APIs/interfaces come naturally.

Cells are obviously a very successful kind of module. What does that really mean? It's just an analogy. Erlang, the "non-OOP" language under discussion, seems to resemble cells at least as well as "OOP" languages (perhaps more so, because you don't have to synchronously wait for a response for every interaction).

This is part of the reason that people call erlang an OOP language. So clearly there is a major confusion over terms here, because the creator of erlang didn't originally consider it to be OOP.

OK, so let's assume that we're actually talking about something that is as non-OOP as I can imagine: haskell. Let's also assume that haskell doesn't resemble cells as much as OOP does. There are still problems:

* Many interfaces in haskell are very well-designed, so it's hard to argue that OOP has a monopoly on good interface designs

* It's not clear that the problems you mention, such as unexpected dependencies, are actually reduced by using OOP.

* It's not clear that a superficial similarity to cells translates to anything meaningful. If nothing else, the evolution of cells takes place on million-year timescales, and doesn't drive toward any particular outcome other than self-perpetuation; neither of which is a good property when it comes to project management.
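
To bring the earlier asynchrony point down to earth, here is a rough Python sketch of the cell/Erlang style, where each "process" owns a mailbox and senders never block waiting for a reply. (The counter example and all the names are mine.)

    import queue
    import threading

    # An Erlang-style "process": it owns its state, and the only way to
    # affect that state is to send it a message.
    def counter_process(mailbox):
        count = 0
        while True:
            msg = mailbox.get()
            if msg == "stop":
                print("final count:", count)
                return
            count += 1

    mailbox = queue.Queue()
    threading.Thread(target=counter_process, args=(mailbox,)).start()
    mailbox.put("tick")  # fire-and-forget: the sender never waits
    mailbox.put("tick")
    mailbox.put("stop")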


You are projecting those assumptions on the original poster, and to some extent the smart folks that pondered the cell analogy in the first place.

Neither argues that FP can't be built modularly with a wonderful, structured API. OOP, conceptually, is simply one way to accomplish that goal. FP is another.

In both cases the underlying philosophies are sound; it is the humans implementing the solutions who fall well short of the mark.


He opened up the response with an analogy about cells.

For an analogy to have any use at all, it must illustrate a contrast between two relevant ideas.

The article is entitled "Why OO Sucks" and written by the author of Erlang.

I "projected" that the comment to which I replied was defending the non-suckiness of OO by using the cell analogy to contrast with erlang. Are you saying that was a leap?

What you are saying makes sense: there is more than one useful model. But that point should be made in the context of Joe Armstrong's specific criticisms, not by claiming that OOP is as modular as a cell and Erlang is not.


> The people who originally came up with OOP knew what they were doing.

No matter what paradigm anyone comes up with, no matter how awesome and brilliant, we're always going to push it to its limits, then it's going to suck.

It's the same with CPU, network speeds, and highway congestion. If it's so wonderfully useful, we're going to use it.

http://www.scottaaronson.com/blog/?p=418


For that matter, no matter what paradigm anyone comes up with, legions of programmers will think they understand it long before they really do and proceed to make a total hash of it. Eventually, some broken form of the idea will become widely known as the real thing, and we'll be inundated with arguments as to why this warped version of the original idea is evil and needs to be stopped.


"The inspiration was the cell."

That's news to me, although I don't claim to be an OO historian. The original OO language was Simula, which was devised to do discrete-event simulations. Its direct antecedent was work done on Monte Carlo simulations. Smalltalk and C++ were both inspired by Simula.


> but it's also an extremely complicated programming model and it's hard to develop OOP code correctly

This should be a gigantic red flag waving at you.


OO is not always the better choice.


Languages that are "OO" and languages that are not (in this case functions and data structures) are semantically equivalent.

Q.E.D.


It is possible in principle to implement one in terms of the other, but in practice this is not what typically happens.


Sure, it might not be typical, but it does happen in quite a few languages. CLOS is one system that comes to mind.
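
For a taste of that style, here is a rough sketch (mine, via Python's functools.singledispatch rather than actual CLOS, and single dispatch only, where CLOS generic functions dispatch on multiple arguments): behavior attaches to generic functions dispatched on the data, not to methods bound inside a class.

    from dataclasses import dataclass
    from functools import singledispatch

    @dataclass(frozen=True)
    class Circle:
        radius: float

    @dataclass(frozen=True)
    class Square:
        side: float

    # A CLOS-flavored generic function, defined outside any class.
    @singledispatch
    def area(shape):
        raise TypeError(f"no area defined for {type(shape).__name__}")

    @area.register
    def _(shape: Circle):
        return 3.14159 * shape.radius ** 2

    @area.register
    def _(shape: Square):
        return shape.side ** 2

    area(Circle(1.0))  # ~3.14159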

FP vs OO is an interesting discussion but this article hardly does it justice.



