A Programming Idiom You've Never Heard Of (dadgum.com)
239 points by pavel_lishin on Jan 4, 2012 | 154 comments



This idiom is even more useful in functional programming where the last step is an inverse of the first step in the applicative sense rather than in the side-effects sense.

Mathematicians call it conjugation, and it can be usefully thought of as a temporary change of coordinates or change in vantage point. Problems often become simpler, even trivial, when working in the right coordinates.

In practical programming, it is commonplace that the "inverse" that occurs in conjugation idioms is a genuine right-inverse but only partially a left-inverse. This is true for his example of squares and square roots. For another instance, consider

    reverseWords = unwords . reverse . words
The composition words . unwords :: [String] -> [String] is the identity, but unwords . words :: String -> String is the identity only for strings whose words are separated by a single space rather than multiple spaces, tabs, etc. This property can actually be put to good use:

    normalizeWhitespace = unwords . words
If you like, you can think of unwords . words as an orthogonal projection onto the subspace of strings on which unwords is a genuine left-inverse of words (cf. Moore-Penrose pseudoinverses).

That's cool and all, but the effect on reverseWords is that it normalizes interword whitespace in the process of reversing words, whether you like it or not.

Incidentally, it's not too hard to abstract out this pattern with invertible functions:

    data Invertible a b = Invertible (a -> b) (b -> a)

    invertible = Invertible
    apply (Invertible f f') = f
    unapply (Invertible f f') = f'
    inverse (Invertible f f') = Invertible f' f
    compose (Invertible f f') (Invertible g g') = Invertible (f . g) (g' . f')
    conjugate i f = unapply i . f . apply i
    on = flip conjugate
Now you'd have to set up invertibles for all the usual foo/unfoo pairs of functions. Assume that has been done. Our example only needs words/unwords:

    import Prelude hiding (words)
    import qualified Data.String as S
    words = invertible S.words S.unwords
With those preliminaries out of the way, you can then reimplement reverseWords very neatly:

    reverseWords = reverse `on` words
In fact, it's now so concise and self-explanatory that you hardly need the named definition anymore and can just refer to reverse `on` words directly.


One of the interesting properties of the Invertible datatype is that it can be made an instance of the Haskell Category type class, which makes invertible functions fully composable using the (.) operator.


Yes, composable data structure editors (lenses) with an overloaded (.) make my day to day work much more pleasant.


ya know, i was thinking of bracket, but your example is better.

fwiw, the haskell bracket docs contain "captures a common allocate, compute, deallocate idiom in which the deallocation step must occur even in the case of an error during computation. This is similar to try-catch-finally in Java. "


Some guy at Adobe Research gave a conference talk about inverse relations in a GUI context (the classical example being the degrees-to-Fahrenheit relation). I can't locate the URL right now * frustrated * I think it was their C++ guru .. can't remember anything else // ahh self cache miss


Turns out I greatly exaggerated my memories: he was mentioning equivalence relations using C++ concepts, then went into constraint/dependent interactive input.

http://stlab.adobe.com/wiki/images/2/20/2008_07_25_google.pd...

page 18, 38

this was hosted at googletechtalks http://www.youtube.com/watch?v=4moyKUHApq4 (240p)


This idea is incredibly seductive to me, but the trouble I've always had is made completely lucid in your description of "conjugates" that are "only partially left inverse." The issue is that so many operations that it would be nice to be able to use as a conjugation aren't actually reversible.

At best, you have situations where the flawed reversal is acceptable - the Ruby File::open("file"){} example, or the J example for computing magnitude. close(open("file")) isn't exactly as if nothing had happened.

sqrt(a^2) ?= a // only if a >= 0

But, consider parsing and templating: Could you template back exactly the input that you parsed to get an internal representation? Only if every character of input is unambiguous and significant - in other words, almost never.


> If you like, you can think of unwords . words as an orthogonal projection onto the subspace of strings on which unwords is a genuine left-inverse of words (cf. Moore-Penrose pseudoinverses).

It's even more like a reflector in a https://en.wikipedia.org/wiki/Reflective_subcategory.

(To make this analogy precise, you'd have to have some category structure where single-space is initial among all whitespace strings. Which is certainly possible; you just have to persuade yourself that that's a natural thing to do.)


Another nice simple example of the "pseudoinverse idiom" is when you have a "boolean" variable in C, i.e. 0 is false and non-zero is true, and you want to set a particular bit in a bit flag to 0/1 from this boolean value. If the original "boolean" value is foo, you do "!!foo", since !<non-zero> == 0 and !0 == 1, and bit-shift the result. I never made the connection with Moore-Penrose pseudoinverses, but there it is.


Actually (words . unwords) is the identity only for lists of strings containing no whitespace. For example:

   Prelude> let xs = [" a"] in xs == (words . unwords) xs
   False


Thanks for catching that!


Also:

  identity = Invertible id id


A similar pattern that blew my mind when I first saw it was ruby's scoped open().

instead of

  f = open "text.txt"
  print f.read
  f.close
You can write

  open "text.txt" do |f|
    print f.read
  end
The file is automatically closed at the end of the block, and it also handles exceptions and so on. I find it also lines up better semantically with how I think about opening handles to things.

I believe Python also added something similar via the 'with' syntax. Not sure about other languages.


Yeah, Python has "context managers" that let you do the same thing. Here's the Python equivalent:

    with open("text.txt") as inf:
        print(inf.read())
User-defined classes that implement the special methods __enter__() and __exit__() can be used as context managers. More info: http://www.python.org/dev/peps/pep-0343/
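
For illustration, a minimal hand-rolled context manager (ManagedFile is a made-up name; this is just a sketch of the protocol):

    class ManagedFile:
        def __init__(self, path):
            self.path = path

        def __enter__(self):
            # runs on entry; the return value is what "as" binds
            self.f = open(self.path)
            return self.f

        def __exit__(self, exc_type, exc_value, traceback):
            # runs on exit, even if the block raised an exception
            self.f.close()
            return False  # don't suppress exceptions

    with ManagedFile("text.txt") as inf:
        print(inf.read())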

For just file output/input Scheme has the "with-output-to-file" and "with-input-from-file" macros which redirect output and input to a given file. I vaguely remember Common Lisp having a similar thing, but I can't remember the name off hand.


Common Lisp has with-open-file for reading and writing files. It works pretty much the same way as everything else in this thread, so I won't post example code, but you can see some here:

http://www.lispworks.com/documentation/HyperSpec/Body/m_w_op...


I immediately thought of Python's context managers when I read this. What the article appeared to describe, which shouldn't be lost on any programmer, is "do stuff then clean up after yourself".


I often use:

  for line in open("name.txt"):
      print line # Or do something else with it.


Me too, when I'm lazy, but we shouldn't; it happens to work fine on CPython, but on implementations that use garbage collection instead of reference counting (e.g. Jython) it leaks file handles. The "with" statement is the right cross-platform way to handle resources.


I'd imagine it's an implementation bug. Once the iterator finishes, the file should be closed and the file handle freed.


Perhaps in this case - but how about a case where you partially iterate over the file, but then the file object goes out of scope? CPython would close (and free) that immediately, but other implementations probably wouldn't.
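
Concretely, the case in question looks something like this (a hypothetical helper, just for illustration):

    def first_line(path):
        for line in open(path):
            return line  # partial iteration: the loop never finishes
    # CPython: the file's refcount drops to zero on return and it closes here;
    # a tracing GC may keep the handle open until some later collection.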


While editing the answer to masklinn's comment, this case came to me. You are both, of course, right. There is no reliable way to close the file from the for iterator.

:-/


> I'd imagine it's an implementation bug.

No, it's a code bug.

> Once the iterator finishes, the file should be closed and the file handle freed.

There is no reason for that to happen, let alone for that to happen deterministically.


Indeed. In generational garbage collectors, the file object may remain until the collector decides to collect it. But doesn't the file object also raise StopIteration at the end? Shouldn't it at least close itself then?


You can always fp.seek(0) and iterate again. Iteration and file openness are separate concepts, and any API worth its salt (e.g. the python file api) keeps it that way.


If you've opened that file in the right mode, you can conceivably append to it once you reach its end. Even in read only mode, I imagine you may conceivably seek back for further reading?


That wouldn't be the idiomatic "for line in file" loop.


As far as iteration goes, there is no difference between

    for line in open(somepath):
        # stuff
and

    f = open(somepath)
    for line in f:
        # stuff
and the second case would be outright broken if the file just decided to close itself when it stops iterating (as others noted, the developer may now want to append new content, or use file.seek and fly around the file's content).

Not to mention a major issue: `break`. The iterator gets no notification of a break, and the iteration is not terminated at that point (as far as the iterator is concerned, that is), so adding a `break` to your iteration (to stop it early because you've found the data/line you needed in a linear search) would suddenly start leaking file handles where they were collected beforehand... not good.


For that to work as it does in CPython, the file would have to know it has no name reference attached to it and that its scope is limited to the iteration and that it can close the file when the iteration ends. I'd love to have this idiom supported under other implementations - it's beautiful.


> For that to work as it does in CPython, the file would have to know it has no name reference attached to it and that its scope is limited to the iteration and that it can close the file when the iteration ends.

Which will never happen, and really has nothing to do with iteration (or the file object itself).

> I'd love to have this idiom supported under other implementations - it's beautiful.

I disagree on its beauty, but beyond that I can assure you it's never going to be a property of python-the-language.


On the other hand, the Ruby equivalent is

  def in_context(*args)
    setup_context
    yield if block_given?
  ensure
    teardown_context
  end

I guess you could build a special class, but why?


In languages in which you can count on the timely cleanup of objects this can be done with the destructor. In C++, for instance, file handles are typically closed when the file object goes out of scope. Moreover, if an object has a file as a member variable, the object's default destructor will dispose of the file.

IIRC:

    {
        ifstream file("text.txt");
        copy(istream_iterator<string>(file), istream_iterator<string>(),
             ostream_iterator<string>(cout, "\n"));
    }//file automatically closed here
and

    class ThingWithFile {
        ifstream file;
      public:
        ThingWithFile(const char *filename) : file(filename) {}
    };
    void func() {
        ThingWithFile thing("file.name"); //file is opened
    } //file is closed


Deterministic destruction is an excellent language feature. I really wish there were a garbage-collected language that separated destruction from deallocation, because the two are completely orthogonal.

When an object goes out of scope, the destructor should run immediately; then, when the garbage collector wants to actually deallocate destroyed objects, it can go right ahead. No idiocy like in C# with “using” blocks.


If it were possible to efficiently detect "when an object goes out of scope" (actually the question is when the object becomes unreachable) in the general case, then every language would do it. They'd immediately run the destructor and immediately collect the memory, because why not? But it's not possible to efficiently detect when an object becomes unreachable in a general way, or at the very least we don't know how. (I'm not certain if it's actually been proven impossible.)


There's automatic reference counting, which kind of works in some cases. But fails spectacularly in others - a simple doubly-linked list will defeat ARC. In essence, any reasonably complex program that relies on it will probably leak memory, unless the programmer has taken the time to find the spots where the ARC fails and takes care of them manually.

If a reference cycle happens then the only good way to detect that the objects involved have been orphaned is to walk the object graph. Doing that every time an object is deallocated would be horribly inefficient. Also, if you're doing it then there's really no need for the reference counting in the first place, since the object graph walk is perfectly capable of finding all the orphan objects on its own. Much better to just do it periodically, so you can get a whole swath of objects each time you walk the graph.

At which point we've come to garbage collection, and discovered why that's what all the kids are doing these days.


Reference counting is also not free by itself (ignoring its cycle problems). There's overhead to all the incrementing and decrementing. It can add up in modern systems that toss around objects willy-nilly. Reference counting also adds a ton of additional synchronization points in a multi-threaded scenario.


Yup. I don't think it's any coincidence that the only really prominent place where ARC is being used is iOS.

It's a platform where reference counting was already in place, where you can have a reasonable hope that most of the programs are too small and simple to take much of a hit on the overhead, and where memory leaks can't get too far out of hand because the OS reserves the right to kill processes whenever it wants.


> where you can have a reasonable hope that most the programs are too small and simple to take much of a hit on the overhead

What overhead? There's no runtime overhead to ARC compared to manual reference counting.


If I understand the distinction you draw between deallocation and destruction correctly, then in .NET a well-designed object that does not maintain unmanaged resources is effectively destroyed the moment the last reference to it goes out of scope. After all, it's disappeared from the world, never to appear again.

All the garbage collector does is come through later and deallocate the memory it consumed.


Normally, the deallocation of the destroyed object has no visible effect, and you're not expected to do anything with the object after it's destroyed.

So, in the case of C++, the object is destroyed when it goes out of scope, but the physical act of deallocation may be deferred by the standard library as whatever 'smart memory manager' wishes.


To support this you'd need refcounting of each object on top of the gc, which would be a substantial overhead.


Technically you only need refcounting of objects with a destructor. For everything else, destruction is a no-op and you can just let the GC re-use the object's memory. And if the destructor is used pretty much just for side effects (closing files & sockets etc.), these will typically be rare & pose little overhead.

Would suck in a dynamically-typed language, though, because you have no idea whether a reference points to an object with or without a destructor, and so always have to pay the overhead of checking and possibly decrementing the refcount. I don't really see this working without a static type system.

BTW, you're often better off with explicitly-scoped destruction (eg. the Ruby & Python way) than refcounting, for the same reason that boost::shared_ptr can be tricky and Java programs can leak memory. If you don't have clear object ownership & lifetimes, it's very easy to stash a reference to an object in some longer-lived object (a cache, for example) and accidentally hold onto it way longer than anticipated.


> Would suck in a dynamically-typed language, though, because you have no idea whether a reference points to an object with or without a destructor, and so always have to pay the overhead of checking and possibly decrementing the refcount. I don't really see this working without a static type system.

It's worse than that. Any object that contains references to other destructible objects (including transitive references), must have an implicit destructor to decrement the refcounts of the destructible objects it points to. In fact, any object that holds a reference, directly or indirectly, to a generic "Object" class or equivalent must have an implicit destructor. This makes a huge part of the object graph implicitly destructible.


This is exactly what Python does, and while Python isn't the fastest language around, it's not that slow either, when compared to other dynamically typed, interpreted languages.


OTOH, refcounting is one of the main reasons why it's virtually impossible to eliminate the GIL, which is a showstopper for many multi-core usages of Python.


My understanding is that circular references wreak havoc with the concept of 'scope of a single object', and make implicit deterministic destruction unsafe if not impossible:

  a = new A(); b = new B();
  a.b = b; b.a = a;

Whichever you destroy first, the other's destructor will hold a reference to an already-destroyed object.


To get around this you can distinguish between references that imply ownership and those that don't. In C++ pointers and references do not imply ownership, but values do.

It isn't ideal, but I think it's an impossible situation - I think predictable destruction is fundamentally at odds with automatic, transparent memory management when you push the ideas to their limits. The idea of automatic resource cleanup is tied closely to the idea of tight (recursive) resource ownership. Automatic memory management exists mostly to take care of the cases when object ownership and lifetime isn't clear.

Reference-counting smart pointers (used properly) can technically give you deterministic destruction, but it's kinda beside the point. We want our file handles and our connections closed as soon as we're done with them, not some time in the future when someone else's code cleans up its references to our old objects.


Which is where .NET hits a pretty good compromise, IMO. For things where deterministic cleanup really is important (file handles, connections, COM objects, etc.) there's IDisposable to provide for deterministic destruction of that stuff.

Folks do still complain that they can't deterministically destroy managed resources such as instances of String. I suspect those folks have missed the plot. That desire is indeed at odds with the memory manager; what is being requested could not (for the most part - the CLR's large object heap does complicate the story a bit) be provided without wholesale downgrading .NET to a less efficient collection algorithm.

It'd be much wiser to instead learn how to implement performance-critical routines that really would benefit from manual memory management in a language that's better suited to that kind of task, such as C. Whining that a garbage collected run-time uses garbage collection is just being silly, not entirely unlike complaining that your horse isn't very good at swimming.


Objects aren't deallocated in garbage collected languages. The memory they had allocated just doesn't get copied to the new heap -- which is a big difference.


You could argue that objects aren't deallocated in non-GC languages either, the memory they had allocated is just available for other objects to use. Which sounds kinda like "deallocated" to me, and yet it's the exact same thing as the allocation pointer in a stop & copy GC being free to overwrite them.


Generally in other languages the memory is immediately freed and available to be reallocated. This is not true in most garbage collection strategies.


Nor is it particularly desirable. One of the cornerstones of how a modern generational GC performs as well as it does is that blocks of objects are freed up all at once.

If they were deallocated in a piecemeal fashion then:

- Freeing up memory would take longer, because you can't just change a pointer back to the root of the generation.

- Allocating memory would take longer, because it involves searching for a big enough empty slot rather than just throwing the new object on top of the heap.

- More memory is wasted overall, because of heap fragmentation.


That depends on the GC implementation. Maybe you're saying that because most modern collectors tend toward generational collection, and copy between generation heaps.

But copying is not required for GC. The trusty mark-and-sweep algorithm doesn't copy, for example.

See: http://en.wikipedia.org/wiki/Garbage_collection_%28computer_...


In fact, one can claim that mark-and-sweep is one of the few garbage collection algorithms that actually collect garbage.

Many others collect non-garbage. Because there typically is way less non-garbage than garbage, that is what makes them competitive with alloc/free.


That's an interesting point, thanks.


I've always wondered why they don't have it this way. I have posted the same question on various forums but have never heard any real answer.


Well, deterministic deallocation is difficult to combine with non-deterministic garbage collection.

Going out of scope is deterministic, but the object that has gone out of scope might still be referenced from somewhere if the reference has "escaped". Sometimes automatic escape analysis can work when references escape, but not always.

Would you like the object to have its destructor called even though it might still be accessible? What would happen if the program subsequently tried to access the object?

Most languages are naturally uncomfortable with tidying up some resource while it might still be referenced. For this reason, languages prefer to leave it up to the programmer to explicitly manage destruction themselves.

A perfect example is C#'s IDisposable interface. The language obviously "knows" that a disposable object needs cleaning up after it is used - that's what IDisposable means - but it cannot work out when this should happen. So it lets the programmer decide when that actually happens. If a programmer wants destruction based on scope then she should use the using statement. If she wants destruction based on some other condition then that is possible too. The language cannot make this decision.

I hope that answers your question.


The CLR also has different concepts for destruction and deallocation. Not sure what the using block and IDisposable have to do with that.


Yeah!

Objects on the stack are easy to use, logical, and orders of magnitude faster to instantiate than objects on the heap!

(when the underlying runtime uses malloc)


I believe C# has something similar too, "using":

  using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Write, FileShare.None))
  {
      Byte[] info = new UTF8Encoding(true).GetBytes("This is some text in the file.");
      // Add some information to the file.
      fs.Write(info, 0, info.Length);
  }


I was about to post this. Although the C# version requires an IDisposable interface and implies that you're operating on a resource.

Ruby has similar concept when you open a file and pass in a block to operate on the File object.

I'm pretty sure the author is just excited about the language and failed to recognize that this pattern is everywhere in different programming languages.


The author is excited that functions have their inverses by default, I think. An analogous construct in Python would be something like:

  with_under do_something() as f:
    ...
translated to

  f = do_something()
  ...
  f.undo_something()
With do_something automatically matched to undo_something.
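
A sketch of how that automatic matching could be emulated with today's tools (with_under and the registry are made up, not real Python):

    from contextlib import contextmanager

    INVERSES = {}  # registry: setup function -> matching teardown function

    def inverse_of(do):
        def register(undo):
            INVERSES[do] = undo
            return undo
        return register

    @contextmanager
    def with_under(do, *args):
        resource = do(*args)
        try:
            yield resource
        finally:
            INVERSES[do](resource)  # the matched undo runs automatically

So after registering undo_something with @inverse_of(do_something), `with with_under(do_something) as f:` tears down automatically.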


You can do the same thing in C# that you can in Ruby

    new FileEx (SOME_PATH, () => ConsoleEx.ReadLine (), () => ConsoleEx.CurrentLine != "QUIT").WriteAll ();
is equivalent to

    using (var file = new File(SOME_PATH))
    {
        string line = null;

        try
        {
             while((line = Console.ReadLine()) != "QUIT")
             {
                file.WriteLine(line);
             }
        }
        catch (Exception ex)
        {
            throw;
        }
        finally
        {
            file.Close();
        }
    }


For a second I was intrigued and tried to find that FileEx class that I had obviously failed to notice before.

It doesn't seem to be part of the framework though? Homegrown and _you_ expand it to something that is equivalent to the second code snippet?


Yes, it is homegrown[1]. I have quite a few little utilities that use (or abuse) iterators to get something like RAII in C# (I don't like relying on using/IDisposable and I especially don't like requiring it in my public types).

Another one that I use a lot [2]:

    SqlCommandEx sql = "select * from person";

    foreach (dynamic result in sql)
    {
        var full_name = result.first_name + " " + result.last_name;

        DateTime dob = result.dob;
         //do other stuff

    } //connection automatically disposed.
[1] Relevant source: http://pastebin.com/LwHD8MHc

[2] https://github.com/noblethrasher/Prelude/blob/master/SQL.cs


I'm going to assume I missed something, because from what I'm reading:

1. You're avoiding IDisposable for reasons you don't explain and I fail to see

2. You replace it with an idiom which is supposed to do the same thing (except abusing iterators instead of using tools dedicated to that job) and which I don't believe works, as your connection will not (as far as I can see) be disposed of if there's an error in the foreach block. And the code you linked does not seem to use this "pattern" anywhere


Avoiding using/IDisposable is taste thing; I'm not saying that anyone else should do it.

If called from the `foreach` statement, the code in a finally block of an iterator is guaranteed to execute even if the loop exits early due to a `break` or exception. The compiler will also generate a call to Dispose() as well.

Also, it's not really an abuse. When dealing with an unmanaged resource, you're almost always either pulling a stream of data out or pushing a stream of data in. That's precisely the use case of IEnumerable and its dual, IObservable (SqlCommandEx is an example of IEnumerable and FileEx is a loose example of IObservable).

P.S. Now there might be a thus far undiscovered bug in my SqlCommandEx code to which I linked (e.g. the compiler only generates calls to Dispose() on IEnumerator<T> which might not work for dynamic types, I haven't checked) but it's possible to do what I described in principle.


Yes, you can do this, but I usually just end up using the helper functions, e.g.

      System.IO.File.AppendAllText(path, "This is some text in the file.");
Also, why are you writing text to a binary file instead of just writing the text as a text file?


Java 7 has an implementation pattern built in for this now:

    String readFirstLineFromFile(String path) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            return br.readLine();
        }
    }

Anything that implements the AutoCloseable interface gets this behavior for free.


In Perl you can just use scope:

  {
    open my $fh, '<', "text.txt";
    print scalar <$fh>;
  }

  # $fh goes out of scope so file handle is closed


That's a bit different; Python without `with` does the same, but that's just a property of the garbage collector in this case.


Correct. To mimic it exactly is straightforward though:

  sub with_open_file {
      my ($filename, $code) = @_;
      open my $fh, '<', $filename or die $!;
      $code->($_) for <$fh>;
      close $fh;
  }

  with_open_file 'text.txt' => sub {
      my $f = shift;
      print $f;
  };

And to add to my original answer it's not just the effects of GC that are handy but also arbitrary scoping. Here is an example of what you can do with nested dynamic scoping:

    our $rake = "in shed";
    sub where_is_rake { say $rake }

    where_is_rake;  # => in shed

    {
        # take rake out of shed
        local $rake = "rake in hand";
        where_is_rake;  # => rake in hand
    
        {
            # pile up leaves
            local $rake = "using rake";
            where_is_rake;  # => using rake
        }
    
        where_is_rake;  # => rake in hand
    }

    where_is_rake;  # => in shed


C# and VB have the `using` statement:

    using(var db = OpenDbConnection())
    {
       // read from db, connection closed on next curly.
    }


In D, this is done with scope guard (or you can use destructors):

    f = open("test.txt");
    scope(exit) f.close();
    print(f.read());
The scope(exit) causes the following statement to be executed at the exit of the scope, regardless of how it leaves (return, goto, fall through, throw, etc.).

The advantage over try-finally is that the 'finally' code is associated with the open, rather than sitting some arbitrary distance away.


> In D, this is done with scope guard

Which is nowhere near as good, as the closing operation has to be setup explicitly. Go has the same issue with `defer`.


The various tradeoffs between try-finally, RAII, and scope guard are discussed here:

http://www.d-programming-language.org/exception-safe.html

Scope guard has a number of distinct advantages.


Common Lisp has had with-open-file for a long time now. The great thing about with-open-file is that it ensures the file gets closed regardless of what the code in the scope does, using unwind-protect. This implies you can use the same method to create your own with-held-resource macros that work the same way.


Clojure has with-open, to the same effect. I don't think Scala has one natively but the Odersky book covers how to make one using a by-name parameter and an anonymous class.


Interestingly, Python has the "with" statement for this basic genre of operation (executing a block of code while some particular resource is held), although it doesn't require that the third, "undo" step be automatically derived from the first step - it's just a protocol where a 'start' and 'end' method are called on the object representing the resource.

It sure makes it easy to be sure you're closing your file handles/database connections/mutexes, though.


I find it very handy for GUI programming under Python too.

  with wait_cursor(wxwindow):
      ... change cursor to hourglass whilst process runs ...

I have similar context managers for locking controls or other temporary GUI changes.


This idiom is common in mathematics, where it's called a conjugate.

Any time you have an element, y, of a group, the action of pre-composing with any other element, x, and post-composing with x's inverse to get xyx^{-1} is called conjugating, and the result is called a conjugate of y.

http://en.wikipedia.org/wiki/Conjugation_(group_theory)


Constructor / Destructor pairs are how you do this in C++. This is so fundamental to the object system that it's a bit steep to claim that "you've never heard of it". You have. The author's example of "get the rake - use rake - put away rake" maps exactly to [construct - call methods - destruct].

Given the jmp-iness of exception handling in C++, this is the recommended (i.e. the only safe) way to clean up after yourself.


There's even a commonly used name for it: Resource Acquisition is Initialization (RAII). It's been around for many years, and is fundamental to modern C++ style. It's beyond steep to claim we've never heard of it, just because some language has a feature that helps make it easy. Python has "with," Ruby has blocks, many languages have higher-order functions. Not an uncommon concept.


This is just for resource allocation.

The pattern the article talks about is FAR more general -- it's about all reversible processes, even things that have nothing to do with allocation.


> This is just for resource allocation.

It's the same thing, you can represent the pair (operation, inverse) as a "resource". Python calls it "context management" (and context managers), the means are slightly different, the result's the same.

I agree that OP talks about something of a wider scope though, the language features described in these sub-threads are specialized solutions to precise issues, they're not generalized reversal of arbitrary operations.


Sentry classes, which are a generalization of RAII, are used idiomatically in C++ for non-allocation tasks, such as whitespace skipping, logging, and profiling.


Also mapping state to scope e.g. mutex objects that set/clear on ctor/dtor


We can add code to destructors to do general stuff, not only deallocation. The same applies to constructors.

In C++ this works amazingly well for objects on the stack (not on the heap, as there you need to call the destructor yourself).


Is there a programming language where functions have inverses?

How about a computer architecture where every instruction has an inverse, and therefore all functions and even programs are invertible? (Quantum computing, for example.)

http://en.wikipedia.org/wiki/Reversible_computing


Interesting. I'm sure you could do a VM where all the opcodes are reversible. Defining inverses of functions as inverses of opcodes and other functions would then be feasible. Might make a cool project.


Go handles this type of situation nicely using deferred function calls:

"Go's defer statement schedules a function call (the deferred function) to be run immediately before the function executing the defer returns. It's an unusual but effective way to deal with situations such as resources that must be released regardless of which path a function takes to return. The canonical examples are unlocking a mutex or closing a file." http://golang.org/doc/effective_go.html#defer


Seems that everybody is talking about files. :-) Open, operate, close. I think the author is trying to show us something more, mathematically. Like a matrix operation: move to the origin, rotate, move back. But this is a lot harder than just a "resource guard", imo. Any ideas?


Exactly. So far I'm kind of disappointed by the HN debate, it seems like many people only read half the article and are then quick to point out that their own favorite language has a way to release resources when done.

I liked the sum under square better than the read under open example (although the latter may be more relatable). Sum under square squares items, sums them up, and then performs the "inverse-square" operation on the sum. I don't think this is similar to using X or RAII in C++.


Bidirectional programming is somewhat related.

http://www.cis.upenn.edu/~bcpierce/papers/lenses-etapsslides...

e.g. read a file, parse it into an AST, modify AST (i.e. rename a variable), unparse the new AST, close the file.


Awesome slides. Would you happen to have a transcript for it as well? The slides themselves are pretty clear, but there's probably stuff missing.


That's in many cases what a LET with a dynamic binding does in Lisp.

  set origin
  draw
  reset origin

is

    (let ((origin (move 50 50)))
      (draw something))
Similarly, in the Common Lisp object system we use :before, :after and :around methods.

    (defmethod draw :before () (move 50 50))

    (defmethod draw () (draw-something))

    (defmethod draw :after () (move -50 -50))


> Similar in the Common Lisp object system we use :before, :after and :around methods

Ditto for Moose (in Perl)...

  before draw => sub { shift->move( 50, 50 )};

  sub draw { shift->draw_something }

  after draw => sub { shift->move( -50, -50 )};
Also you can package up these method modifiers into a role which not only can be composed into a class but can also be applied just to an object...

  {
    package DrawRole;
    use Moose::Role;
    
    before draw => sub { shift->move( 50, 50   )};
    after  draw => sub { shift->move( -50, -50 )};
  }

  my $d = Draw->new;
  DrawRole->meta->apply( $d );


I think you mean Special Operator UNWIND-PROTECT.


People have posted a lot of examples in a lot of different programming languages. What they show is that all you really need to build this idiom is first class functions (or some approximation thereof). (It also shows that the title is not correct; many, many people are familiar with this idiom.)

When you need to wrap up a common block of code, you use a function. This lets you call the same block in a number of different contexts. When you need to fix the context in place and let the block in the middle vary, you write a function that takes another function as a parameter. An example in JavaScript (might be a little off):

    var withFile = function (path, action) {
        var f;
        try {
            f = File.open(path);
            action(f);
        } finally {
            File.close(f);
        }
    };
    
    withFile('foo.txt', function (f) { ... });
This can be generalized (to some extent and depending on language) so you don't need to write the same structures every time:

    var context = function (prolog, epilog) {
        return function (action, ...args) {
            var o;
            try {
                o = prolog.apply(args);
                action(o);
            } finally {
                epilog(o);
            }
        };
    };
    
    var withFile = context(File.open, File.close);
This example uses JavaScript's hypothetical future "rest" parameters, but there are other ways to manage the same goal.

(Other comments have mentioned the author conflates context management with applying reversible functions. I've only dealt with one of those things here, but similar techniques apply in solving both problems.)


I think that few people are actually aware of this as an idiom. Rather, many people are aware of just a few special cases, but have not considered this in the more general form.

All the examples of various languages providing this for open files also kind of misses it. The point is not whether a particular language supports this for open files, but whether people recognize the idiom and the language supports doing this yourself wherever it fits. For a pattern this incredibly common it's a point well worth making.


Yeah, you may be right that the full generality of the pattern is not appreciated.

For what it's worth, I think most commonly used languages can implement some form of the context function in my example.


> People have posted a lot of examples in a lot of different programming languages. What they show is that all you really need to build this idiom is first class functions (or some approximation thereof).

Or macros. Your `context` is called `unwind-protect` in common-lisp.

There's a difference in that `unwind-protect` does not take an "opening form" (which you call `prolog`), and this makes your `context` implementation broken: you're executing the epilog if the prolog blows up, which is not correct as you have an `o` of `undefined` at this point (and your epilog likely awaits some sort of valid resource).

Side-note, your javascript code also looks incorrect, `apply` takes a context (a `this`) as its first parameter, so you're not applying arguments you're taking the `args` array as the `this` of your prolog. You probably wanted to write `prolog.apply(null, args)`.



What RAII and Ruby's block statement lack here is automatically deriving the reversing functionality.

Few languages have any procedural way to do that.

If Ruby had some way to state "the opposite of opening a file is closing a file" and then use it everywhere, we could do that. The block-syntax open is nearly that, but not quite. You'd need to have no non-block-scope way to use the resource for it to work.

RAII would also give the same benefit, if there were no non-RAII way to declare the resource.


Hmm. I don't agree with this conclusion. Take for instance a file class in C++ which opens the file descriptor in the constructor and closes it in the destructor. Now, for any instance of this class, it is impossible for the open fd to outlive the instance! That is as strong a guarantee as I need! Now, if I leak the instance the fd will leak too, but that is more an issue of manual lifecycle management if you ask me.

This very significant property is what makes me love RAII. I don't count C#'s using statement and/or Ruby's block statement as RAII, since they rely on usage patterns. C++'s automatic destruction of graphs of objects also makes it outshine all other methodologies, and the guaranteed destruction is the key here.


RAII is a usage pattern: You must stack-allocate an object or wrap it in a suitable smart pointer. It's just as easy to do things differently as it is with Ruby's blocks or C#'s using statement. The reason it has a name is because it is a pattern that's talked about so much exactly because following the pattern is so intrinsic to modern C++ programming.

Once you're passing a block to a method like this in Ruby, the execution of the cleanup code is just as guaranteed as it is in C++.


I don't agree completely. As I said, you cannot leak the fd without leaking the memory of the object. This means that in C++ you get the correct behavior as long as you take care of the lifecycle of your objects correctly. You don't need stack allocation or smart pointers as long as you deallocate the object when you are done with it. So the lifetime of the fd is tightly bound to the lifetime of the wrapper instance. If you heap-allocate without deallocating, you have a bug _anyway_ in your C++ program, thus I would like to argue that it is a problem of manual lifecycle management, not a flaw in the RAII implementation.

You do however have a point when you say that you can miss deallocation, and that is just as easy as forgetting to put a variable in a block. In an ideal world I would like to be able to decouple deallocation time from destruction time; as someone mentioned above, they are completely orthogonal issues. If this were possible, one could guarantee the destructor callback but defer deallocation until later. This would probably give leeway to more effective memory allocation patterns while keeping RAII alive.

And another thing about blocks: how do you deal with trees and graphs of objects? That is not trivial when using blocks. You can get around it, but continuing that way would most likely lead you to something like Haskell's monads (the "programmable semicolon" aspect of them).


You'd need to have no non-block-scope way to use the resource for it to work.

You can already do that, just `return unless block_given?`. In principle I don't see why you'd want the non-block version anyway, but maybe there's some use case where that is valid.


IIRC, Common Lisp has an UNWIND-PROTECT operator for that. It works even if exceptions are thrown, and it is the reason Common Lisp (as opposed to Scheme) has no continuations.


Python's context managers have that; define the two sides of the functionality in __enter__()/__exit__(), and voila.

C#'s IDisposable has this, too.

So do other languages. It's really not that unusual a pattern.


It is a common idiom in C++ RAII https://en.wikipedia.org/wiki/Resource_acquisition_is_initia...

Context managers in Python:

  with fly('Seattle'):
      see_sights()
Real-world example from Fabric http://fabfile.org:

  with cd('/to/path'):
      do_stuff()


Python has this with the "with" statement. It's often used for files, but you can write your own using classes with __enter__ and __exit__ methods.

However, you can use the ``contextlib`` module to make really simple ``contextmanager``s with the yield statement.

e.g.:

    from contextlib import contextmanager

    @contextmanager
    def glcontext():
        glBegin()
        try:
            yield
        finally:
            glEnd()  # make sure glEnd runs even if the body raises

(This is an example where someone wanted to do OpenGL in Python.)


The OP seems to be conflating two different things: undoing an action (closing a file, etc), and performing the inverse of a reversible function.

The first is amply covered by the RAII/unwind-protect etc. idioms that most languages do have.

The second is not, as most programming languages don't have the concept. The closest I can think of is the ability to use a predicate in either direction in Prolog (or logical programming in general)


This reminds me of my friend's project for a recent hackathon where he implemented a language that guaranteed every function in it was reversible. It would let you do something like this for any function.

Unfortunately, I don't think he found it particularly useful and gave up on it after the hackathon. Still, it was a cool project.


This sounds fun. Is there a link anywhere?


There are a pair of research languages called Harmony and Boomerang [1] that do this with what they call "lenses". There are also some easy-reading introductory slides [2,3].

[1] http://www.seas.upenn.edu/~harmony/

[2] http://www.cis.upenn.edu/~bcpierce/papers/lenses-etapsslides...

[3] http://www.cis.upenn.edu/~bcpierce/papers/harmful-mfps.pdf


Here's the link: http://stanford.edu/~jacobt/reversible.html

I don't think anything happened to it since the hackathon. Judging by my own experience with hackathons, I wouldn't expect too much in the way of code quality. (Reading some of my old hackathon code still makes me shudder :)).


C#'s 'using' statements allow this kind of pattern for types implementing IDisposable. It looks something like this (might be a little off):

    using (var file = File.Open("foo.txt"))
    {
        ...
    }


This is what I thought of immediately, except that to implement the "end" behavior you have to override the IDisposable shutdown sequence? (Does that make sense?)


LISP has had this for years:

  (with-open-file (output "foo.txt" :direction :output
                          :if-exists :supersede)
    (print-foo output))


That's not a general op/inverse thing like J apparently has, though. Also, which lisp?


That looks like Common Lisp, which also has a general version named unwind-protect: http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec...


Not really. unwind-protect is just Lisp's version of Java's "finally" statement.


Oh you're right, it'd need to take a param with the setup form(s) and somehow look up the corresponding cleanup form(s), sort of like getter and setter code registered with setf.


Terminologically speaking, Scheme is a Lisp. And ELisp is a Lisp. And Autodesk VisualLisp is a Lisp. And Clojure is a Lisp. And Arc is a Lisp.

But Common Lisp is Lisp.


? My lisp has more lispery than your lisp ?


It's the difference between being Lisp and being a member of the Lisp Family (aka "being a Lisp"). It's a fun way to weed out 1-bit people from the population: http://www.xach.com/naggum/articles/3225130472400367@naggum.... Furthermore Common Lisp is a Lisp-2 whereas Scheme is a Lisp-1. (Aesthetically I prefer Scheme.) So really there are several working meanings at play when people say "Lisp", this is frustrating for 1-bit people.

My usual heuristic in using context clues to determine which meaning is meant: if there is no qualifier for a particular family member of Lisp, I assume something is talking about the Lisp family, unless accompanied by a code example, in which case it's probably Common Lisp. When I see someone type "LISP" I usually think "This person doesn't know Lisp", just like how you'll sometimes see people type FORTH or PERL when talking about those languages. But that's not always the case so it's useful to double-check before calling them out, especially if they give code examples. (They may just be old after all. ;) )


Oh for sure, I was just poking a little bit of fun, thanks for the Naggum article though I had not come across that one.

FWIW I tend to think of common-lisp when people say lisp in general, the schemers and clojurites tend to self identify better.


Clojure has it too.

If you want to abstract the pattern out even further, then all you are doing is running a function within a certain context and then getting everything back to normal. Using under in J, scoping it out with ruby, or using with-open in Lisp are all examples of the pattern, just different ways of thinking about it.


I think what you're describing is somewhat different: temporarily changing context is like a let block. So for instance in CL you can rebind standard-output to a different stream using let, but then you have to do the legwork to ensure that everything is cleaned up if something goes wrong, while with-open-file or with-output-to-stream will do part of that work.


SBCL


The language Nickle (http://nickle.org) has a "twixt" control structure:

    twixt (expression1, expression2)
           statement1
If expression1 is true, then statement1 will be executed followed by expression2 in that order. The twixt statement guarantees that expression2 is always executed if expression1 is true, similarly to how "finally" works in java.


> If expression1 is true, then statement1 will be executed followed by expression2 in that order. The twixt statement guarantees that expression2 is always executed if expression1 is true, similarly to how "finally" works in java.

Am I missing something here? How is that different from, say:

    if (expression1) {
        statement1
        expression2
    }

?


It's different in that expression1 is always evaluated whenever control enters the twixt and expression2 is always evaluated whenever it leaves. A simple example is that if statement1 throws an exception, then expression2 is evaluated before the exception is handled. This is how finally works in Java.

More interestingly, Nickle has support for continuations, so if statement1 generates a continuation and makes a non-local exit, then expression2 is evaluated before that exit, and expression1 is evaluated before the continuation is resumed.

In other words, statement1 is always bracketed by expression1/expression2, even in the presence of non-local control flow.


Aha! Thanks for clearing that up.


If there were a way to mark functions and their inverses consistently in a language, then functions with inverses could be used in (semantically) pure functional code, even if they were destructive.

For (stupid) example:

    def reverse(x) as inversible with(reverse): # special case - reverse is its own inverse
        x = x.reversed() # I'm too lazy to code this by hand

    def lastFive(x) as pure: # let's say the language checks that a pure function only uses pure and inversible functions
        return x[:5] under reverse
Here lastFive is a pure function even though reverse is destructive.

This assumes everybody sticks to one definition of reversing something; in this case, functions f and g are inverses of each other when, after invoking f(x), doing only pure or inversible things to x, and then invoking g(x), x is unchanged.

And I don't know if this will be useful for anything - the only point seems to be avoiding making copies while still writing pure functions. But the resulting code, while semantically pure, is not thread-safe. So I don't know if it's worth anything.


You could dispense with the 'you've never heard of'.


IIRC, we use this idiom in SICP while doing symbolic algebra.

For instance, when we have x+y=z and know only x and z, we need to reverse the equation to get y=z-x. The system was built on nodes which would propagate change (a-la events).

IMO, python's with statement with __enter__ and __exit__ looks mostly like this.
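
A minimal sketch of that propagate-both-ways idea (purely illustrative; SICP's actual connector system is event-driven and richer):

    class AdderCell:
        """Maintains x + y = z; given any two values, derives the third."""
        def __init__(self):
            self.x = self.y = self.z = None

        def propagate(self):
            if self.x is not None and self.y is not None:
                self.z = self.x + self.y   # forward: z = x + y
            elif self.z is not None and self.x is not None:
                self.y = self.z - self.x   # reversed: y = z - x
            elif self.z is not None and self.y is not None:
                self.x = self.z - self.y   # reversed: x = z - y

    cell = AdderCell()
    cell.x, cell.z = 2, 5
    cell.propagate()
    print(cell.y)  # 3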


I refuse out of principle to click on links with headlines that are so obviously trying to trick me into going to a site.

Make the headline informative instead, if you have great content people will read and share it anyway.

Better version of this headline:

XXXX: A programming idiom that helps you do YYYY


Other posts have already mentioned that specific macros of this form exist in several Lisps. I've seen more than one macro-writing guide that talks about the with-something macro pattern for managing resources, and I've seen it applied in libraries that provide macros like with-database and with-query-results.

CL and Clojure could easily support a generic with macro requiring that there be an open and close method for the resource in question.


In C# this is "using":

  using (var something = new Something())
  {
      // Do something with it.
  }  // it goes away now

Typically this idiom is the function call itself. The act of sipping coffee is the point of the function, and picking up the cup and putting the cup back down are the whole point of the function call.


This reminds me of Suzuki's idiom, "Leave No Trace," found in the book Zen Mind, Beginner's Mind.


Python has had this idiom for a while now: it's known as a "context manager", also known as the "with" statement. Here's the PEP: http://www.python.org/dev/peps/pep-0343/


The BETA language does this with methods (you can't override a method, but you can add pre-and-post operations that wrap it).

I was never convinced this was a great idiom. BETA didn't become very popular, in any event.


Beta lives on in the Racket class system as (inner).

http://docs.racket-lang.org/reference/createclass.html?q=inn...)


An interesting approach to this is the one used in Go, where the "defer" keyword schedules code to run on function exit. It's simple and doesn't require adding more nesting to your code.


Ruby has "ensure" if you want something to be guaranteed to run on exit - it can be used at the end of methods, or by explicitly wrapping code in begin .. end.


(Assuming it works the same way as Smalltalk's ensure...) a really nice aspect of this is that the resource is held open while you're in the debugger, but cleaned up when you finish debugging.


"Never heard of" is quite a stretch.

During my dark past as a VB programmer (early-mid 90's) I even indented the code for pairs like this.


Oh, c'mon! All the people with `with` and RAII and files, please reread the article. At least the last example.


I was going to argue that point, but then I glanced at the J docs here:

http://www.jsoftware.com/help/dictionary/d202n.htm

Obverse is defined in a table and some special cases, and there's a built in "this is the obverse of that" instruction.

You could easily tie square/sqrt together as a pair in Python and it wouldn't be significantly different.

Under(square, sum, mylist) or whatever.
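
A sketch of that in Python (under and the square/sqrt pairing are assumptions here, not J's actual obverse machinery):

    import math

    def under(f, g, g_inverse):
        # conjugation: apply g, then f, then (a partial) inverse of g
        return lambda x: g_inverse(f(g(x)))

    square = lambda xs: [v * v for v in xs]

    # "sum under square": square the items, sum them, take the square root
    magnitude = under(sum, square, math.sqrt)
    print(magnitude([3, 4]))  # 5.0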


Pretty sure Ruby has this.

File.open(name, mode) receives a block, executes it, and then closes the file.


Ruby has a specific instance of that pattern, but to my knowledge it lacks a way to say, "do some arbitrary operation that has a side-effect, do this, then undo the operation". The function taking the block still has special-case code to open and close files.


You still have to implement the inverse of the arbitrary operation, at some point, somehow.

A programming language can't figure out what the inverse of open() is without someone somewhere telling it that it's close(). The point of the idiom is to have a library figure that out so that users don't have to. Therefore, Ruby supports it fully (where API developers are sane, that is).


Having that built in for every side-effect in the language is pretty amazing, though.

Case by case, the block scoping idiom is a good way to handle it.



