A Programming Idiom You've Never Heard Of (dadgum.com)
239 points by pavel_lishin on Jan 4, 2012 | 154 comments



This idiom is even more useful in functional programming where the last step is an inverse of the first step in the applicative sense rather than in the side-effects sense.

Mathematicians call it conjugation, and it can be usefully thought of as a temporary change of coordinates or change in vantage point. Problems often become simpler, even trivial, when working in the right coordinates.

In practical programming, it is commonplace that the "inverse" that occurs in conjugation idioms is a genuine right-inverse but only partially a left-inverse. This is true for his example of squares and square roots. For another instance, consider

    reverseWords = unwords . reverse . words
The composition words . unwords :: [String] -> [String] is the identity, but unwords . words :: String -> String is the identity only for strings whose words are separated by a single space rather than multiple spaces, tabs, etc. This property can actually be put to good use:

    normalizeWhitespace = unwords . words
If you like, you can think of unwords . words as an orthogonal projection onto the subspace of strings on which unwords is a genuine left-inverse of words (cf. Moore-Penrose pseudoinverses).

That's cool and all, but the effect on reverseWords is that it normalizes interword whitespace in the process of reversing words, whether you like it or not.

Incidentally, it's not too hard to abstract out this pattern with invertible functions:

    data Invertible a b = Invertible (a -> b) (b -> a)

    invertible = Invertible
    apply (Invertible f f') = f
    unapply (Invertible f f') = f'
    inverse (Invertible f f') = Invertible f' f
    compose (Invertible f f') (Invertible g g') = Invertible (f . g) (g' . f')
    conjugate i f = unapply i . f . apply i
    on = flip conjugate
Now you'd have to set up invertibles for all the usual foo/unfoo pairs of functions. Assume that has been done. Our example only needs words/unwords:

    import Prelude hiding (words)
    import qualified Data.String as S
    words = invertible S.words S.unwords
With those preliminaries out of the way, you can then reimplement reverseWords very neatly:

    reverseWords = reverse `on` words
In fact, it's now so concise and self-explanatory that you hardly need the named definition anymore and can just refer to reverse `on` words directly.


One of the interesting properties of the Invertible datatype is that it can be made an instance of the Haskell Category type class, which makes invertible functions fully composable using the (.) operator.


Yes, composable data structure editors (lenses) with an overloaded (.) make my day to day work much more pleasant.


ya know, i was thinking of bracket, but your example is better.

fwiw, the haskell bracket docs contain "captures a common allocate, compute, deallocate idiom in which the deallocation step must occur even in the case of an error during computation. This is similar to try-catch-finally in Java. "


Some guy at Adobe Research gave a conference talk about inverse relations in a GUI context (the classical example being the degrees-to-Fahrenheit relation). I can't locate the URL right now * frustrated * I think it was their C++ guru .. can't remember anything else // ahh self cache miss


Turns out I greatly exaggerated my memories: he was mentioning equivalence relations using C++ concepts, then went into constraint/dependent interactive input.

http://stlab.adobe.com/wiki/images/2/20/2008_07_25_google.pd...

page 18, 38

this was hosted at googletechtalks http://www.youtube.com/watch?v=4moyKUHApq4 (240p)


This idea is incredibly seductive to me, but the trouble I've always had is made completely lucid in your description of "conjugates" that are "only partially left inverse." The issue is that so many operations that it would be nice to be able to use as a conjugation aren't actually reversible.

At best, you have situations where the flawed reversal is acceptable - the Ruby File::open("file"){} example, or the J example for computing magnitude. close(open("file")) isn't exactly as if nothing had happened.

sqrt(a^2) ?= a // only if a >= 0

But, consider parsing and templating: Could you template back exactly the input that you parsed to get an internal representation? Only if every character of input is unambiguous and significant - in other words, almost never.


> If you like, you can think of unwords . words as an orthogonal projection onto the subspace of strings on which unwords is a genuine left-inverse of words (cf. Moore-Penrose pseudoinverses).

It's even more like a reflector in a https://en.wikipedia.org/wiki/Reflective_subcategory.

(To make this analogy precise, you'd have to have some category structure where single-space is initial among all whitespace strings. Which is certainly possible; you just have to persuade yourself that that's a natural thing to do.)


Another nice simple example of the "pseudoinverse idiom" is when you have a "boolean" variable in C, i.e. 0 is false and non-zero is true, and you want to set a particular bit in a bit flag to 0/1 from this boolean value. If the original "boolean" value is foo, you do "!!foo", since !<non-zero> == 0 and !0 == 1, and bit-shift the result. I never made the connection with Moore-Penrose pseudoinverses, but there it is.


Actually (words . unwords) is the identity only for lists of strings containing no whitespace. For example:

   Prelude> let xs = [" a"] in xs == (words . unwords) xs
   False


Thanks for catching that!


Also:

  identity = Invertible id id


A similar pattern that blew my mind when I first saw it was ruby's scoped open().

instead of

  f = open "text.txt"
  print f.read
  f.close
You can write

  open "text.txt" do |f|
    print f.read
  end
The file is automatically closed at the end of the block, and it also handles exceptions and so on. I find it also lines up better semantically with how I think about opening handles to things.

I believe Python also added something similar via the 'with' syntax. Not sure about other languages.


Yeah, Python has "context managers" that let you do the same thing. Here's the Python equivalent:

    with open("text.txt") as inf:
        print(inf.read())
User-defined classes that implement the special methods __enter__() and __exit__() can be used as context managers. More info: http://www.python.org/dev/peps/pep-0343/
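
For illustration, a minimal hand-rolled context manager (ManagedFile is a made-up name; this is just a sketch of the protocol):

    class ManagedFile:
        def __init__(self, path):
            self.path = path

        def __enter__(self):
            # runs on entry; the return value is what "as" binds
            self.f = open(self.path)
            return self.f

        def __exit__(self, exc_type, exc_value, traceback):
            # runs on exit, even if the block raised an exception
            self.f.close()
            return False  # don't suppress exceptions

    with ManagedFile("text.txt") as inf:
        print(inf.read())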

For just file output/input Scheme has the "with-output-to-file" and "with-input-from-file" macros which redirect output and input to a given file. I vaguely remember Common Lisp having a similar thing, but I can't remember the name off hand.


Common Lisp has with-open-file for reading and writing files. It works pretty much the same way as everything else in this thread, so I won't post example code, but you can see some here:

http://www.lispworks.com/documentation/HyperSpec/Body/m_w_op...


I immediately thought of Python's context managers when I read this. What the article appeared to describe, which shouldn't be lost on any programmer, is "do stuff then clean up after yourself".


I often use:

  for line in open("name.txt"):
      print line # Or do something else with it.


Me too, when I'm lazy, but we shouldn't; it happens to work fine on CPython, but on implementations that use garbage collection instead of reference counting (e.g. Jython) it leaks file handles. The "with" statement is the right cross-platform way to handle resources.


I'd imagine it's an implementation bug. Once the iterator finishes, the file should be closed and the file handle freed.


Perhaps in this case - but how about a case where you partially iterate over the file, but then the file object goes out of scope? CPython would close (and free) that immediately, but other implementations probably wouldn't.
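
Concretely, the case in question looks something like this (a hypothetical helper, just for illustration):

    def first_line(path):
        for line in open(path):
            return line  # partial iteration: the loop never finishes
    # CPython: the file's refcount drops to zero on return and it closes here;
    # a tracing GC may keep the handle open until some later collection.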


While editing the answer to masklinn's comment, this case came to me. You are both, of course, right. There is no reliable way to close the file from the for iterator.

:-/


> I'd imagine it's an implementation bug.

No, it's a code bug.

> Once the iterator finishes, the file should be closed and the file handle freed.

There is no reason for that to happen, let alone for that to happen deterministically.


Indeed. In generational garbage collectors, the file object may remain until the collector decides to collect it. But doesn't the file object also raise StopIteration at the end? Shouldn't it at least close itself then?


You can always fp.seek(0) and iterate again. Iteration and file openness are separate concepts, and any API worth its salt (e.g. the python file api) keeps it that way.


If you've opened that file in the right mode, you can conceivably append to it once you reach its end. Even in read only mode, I imagine you may conceivably seek back for further reading?


That wouldn't be the idiomatic "for line in file" loop.


As far as iteration goes, there is no difference between

    for line in open(somepath):
        # stuff
and

    f = open(somepath)
    for line in f:
        # stuff
and the second case would be outright broken if the file just decided to close itself when it stops iterating (as others noted, the developer may now want to append new content, or use file.seek and fly around the file's content).

Not to mention a major issue: `break`. The iterator gets no notification of a break, and the iteration is not terminated at that point (as far as the iterator is concerned, that is), so adding a `break` to your iteration (to stop it early because you've found the data/line you needed in a linear search) would suddenly start leaking file handles where they were collected beforehand... not good.


For that to work as it does in CPython, the file would have to know it has no name reference attached to it and that its scope is limited to the iteration and that it can close the file when the iteration ends. I'd love to have this idiom supported under other implementations - it's beautiful.


> For that to work as it does in CPython, the file would have to know it has no name reference attached to it and that its scope is limited to the iteration and that it can close the file when the iteration ends.

Which will never happen, and really has nothing to do with iteration (or the file object itself).

> I'd love to have this idiom supported under other implementations - it's beautiful.

I disagree on its beauty, but beyond that I can assure you it's never going to be a property of python-the-language.


On the other hand, the Ruby equivalent is

  def in_context(*args)
    setup_context
    yield if block_given?
  ensure
    teardown_context
  end

I guess you could build a special class, but why?


In languages in which you can count on the timely cleanup of objects this can be done with the destructor. In C++, for instance, file handles are typically closed when the file object goes out of scope. Moreover, if an object has a file as a member variable, the object's default destructor will dispose of the file.

IIRC:

    {
        ifstream file("text.txt");
        copy(istream_iterator<string>(file), istream_iterator<string>(),
             ostream_iterator<string>(cout, "\n"));
    }//file automatically closed here
and

    class ThingWithFile {
        ifstream file;
      public:
        ThingWithFile(const char *filename) : file(filename) {}
    };
    void func() {
        ThingWithFile thing("file.name"); //file is opened
    } //file is closed


Deterministic destruction is an excellent language feature. I really wish there were a garbage-collected language that separated destruction from deallocation, because the two are completely orthogonal.

When an object goes out of scope, the destructor should run immediately; then, when the garbage collector wants to actually deallocate destroyed objects, it can go right ahead. No idiocy like in C# with “using” blocks.


If it were possible to efficiently detect "when an object goes out of scope" (actually the question is when the object becomes unreachable) in the general case, then every language would do it. They'd immediately run the destructor and immediately collect the memory, because why not? But it's not possible to efficiently detect when an object becomes unreachable in a general way, or at the very least we don't know how. (I'm not certain if it's actually been proven impossible.)


There's automatic reference counting, which kind of works in some cases. But fails spectacularly in others - a simple doubly-linked list will defeat ARC. In essence, any reasonably complex program that relies on it will probably leak memory, unless the programmer has taken the time to find the spots where the ARC fails and takes care of them manually.

If a reference cycle happens then the only good way to detect that the objects involved have been orphaned is to walk the object graph. Doing that every time an object is deallocated would be horribly inefficient. Also, if you're doing it then there's really no need for the reference counting in the first place, since the object graph walk is perfectly capable of finding all the orphan objects on its own. Much better to just do it periodically, so you can get a whole swath of objects each time you walk the graph.

At which point we've come to garbage collection, and discovered why that's what all the kids are doing these days.


Reference counting is also not free by itself (ignoring its cycle problems). There's overhead to all the incrementing and decrementing. It can add up in modern systems that toss around objects willy-nilly. Reference counting also adds a ton of additional synchronization points in a multi-threaded scenario.


Yup. I don't think it's any coincidence that the only really prominent place where ARC is being used is iOS.

It's a platform where reference counting was already in place, where you can have a reasonable hope that most of the programs are too small and simple to take much of a hit on the overhead, and where memory leaks can't get too far out of hand because the OS reserves the right to kill processes whenever it wants.


> where you can have a reasonable hope that most the programs are too small and simple to take much of a hit on the overhead

What overhead? There's no runtime overhead to ARC compared to manual reference counting.


If I understand the distinction you draw between deallocation and destruction correctly, then in .NET a well-designed object that does not maintain unmanaged resources is effectively destroyed the moment the last reference to it goes out of scope. After all, it's disappeared from the world, never to appear again.

All the garbage collector does is come through later and deallocate the memory it consumed.


Normally, the deallocation of the destroyed object has no visible effect, and you're not expected to do anything with the object after it's destroyed.

So, in the case of C++, the object is destroyed when it goes out of scope, but the physical act of deallocation may be deferred by the standard library as whatever 'smart memory manager' wishes.


To support this you'd need refcounting of each object on top of the gc, which would be a substantial overhead.


Technically you only need refcounting of objects with a destructor. For everything else, destruction is a no-op and you can just let the GC re-use the object's memory. And if the destructor is used pretty much just for side effects (closing files & sockets etc.), these will typically be rare & pose little overhead.

Would suck in a dynamically-typed language, though, because you have no idea whether a reference points to an object with or without a destructor, and so always have to pay the overhead of checking and possibly decrementing the refcount. I don't really see this working without a static type system.

BTW, you're often better off with explicitly-scoped destruction (eg. the Ruby & Python way) than refcounting, for the same reason that boost::shared_ptr can be tricky and Java programs can leak memory. If you don't have clear object ownership & lifetimes, it's very easy to stash a reference to an object in some longer-lived object (a cache, for example) and accidentally hold onto it way longer than anticipated.


> Would suck in a dynamically-typed language, though, because you have no idea whether a reference points to an object with or without a destructor, and so always have to pay the overhead of checking and possibly decrementing the refcount. I don't really see this working without a static type system.

It's worse than that. Any object that contains references to other destructible objects (including transitive references), must have an implicit destructor to decrement the refcounts of the destructible objects it points to. In fact, any object that holds a reference, directly or indirectly, to a generic "Object" class or equivalent must have an implicit destructor. This makes a huge part of the object graph implicitly destructible.


This is exactly what Python does, and while Python isn't the fastest language around, it's not that slow either, when compared to other dynamically typed, interpreted languages.


OTOH, refcounting is one of the main reasons why it's virtually impossible to eliminate the GIL, which is a showstopper for many multi-core usages of Python.


My understanding is that circular references wreak havoc with the concept of 'scope of a single object', and make implicit deterministic destruction unsafe if not impossible:

  a = new A(); b = new B();
  a.b = b; b.a = a;

Whichever you destroy first, the other's destructor will hold a reference to an already-destroyed object.


To get around this you can distinguish between references that imply ownership and those that don't. In C++ pointers and references do not imply ownership, but values do.

It isn't ideal, but I think it's an impossible situation - I think predictable destruction is fundamentally at odds with automatic, transparent memory management when you push the ideas to their limits. The idea of automatic resource cleanup is tied closely to the idea of tight (recursive) resource ownership. Automatic memory management exists mostly to take care of the cases when object ownership and lifetime isn't clear.

Reference-counting smart pointers (used properly) can technically give you deterministic destruction, but it's kinda beside the point. We want our file handles and our connections closed as soon as we're done with them, not some time in the future when someone else's code cleans up its references to our old objects.


Which is where .NET hits a pretty good compromise, IMO. For things where deterministic cleanup really is important (file handles, connections, COM objects, etc.) there's IDisposable to provide for deterministic destruction of that stuff.

Folks do still complain that they can't deterministically destroy managed resources such as instances of String. I suspect those folks have missed the plot. That desire is indeed at odds with the memory manager; what is being requested could not (for the most part - the CLR's large object heap does complicate the story a bit) be provided without wholesale downgrading .NET to a less efficient collection algorithm.

It'd be much wiser to instead learn how to implement performance-critical routines that really would benefit from manual memory management in a language that's better suited to that kind of task, such as C. Whining that a garbage collected run-time uses garbage collection is just being silly, not entirely unlike complaining that your horse isn't very good at swimming.


Objects aren't deallocated in garbage collected languages. The memory they had allocated just doesn't get copied to the new heap -- which is a big difference.


You could argue that objects aren't deallocated in non-GC languages either, the memory they had allocated is just available for other objects to use. Which sounds kinda like "deallocated" to me, and yet it's the exact same thing as the allocation pointer in a stop & copy GC being free to overwrite them.


Generally in other languages the memory is immediately freed and available to be reallocated. This is not true in most garbage collection strategies.


Nor is it particularly desirable. One of the cornerstones of how a modern generational GC performs as well as it does is that blocks of objects are freed up all at once.

If they were deallocated in a piecemeal fashion then:

- Freeing up memory would take longer, because you can't just change a pointer back to the root of the generation.

- Allocating memory would take longer, because it involves searching for a big enough empty slot rather than just throwing the new object on top of the heap.

- More memory is wasted overall, because of heap fragmentation.


That depends on the GC implementation. Maybe you're saying that because most modern collectors tend toward generational collection, and copy between generation heaps.

But copying is not required for GC. The trusty mark-and-sweep algorithm doesn't copy, for example.

See: http://en.wikipedia.org/wiki/Garbage_collection_%28computer_...


In fact, one can claim that mark-and-sweep is one of the few garbage collection algorithms that actually collect garbage.

Many others collect non-garbage. Because there typically is way less non-garbage than garbage, that is what makes them competitive with alloc/free.


That's an interesting point, thanks.


I've always wondered why they don't have it this way. I have posted the same question on various forums but have never heard any real answer.


Well, deterministic deallocation is difficult to combine with non-deterministic garbage collection.

Going out of scope is deterministic, but the object that has gone out of scope might still be referenced from somewhere if the reference has "escaped". Sometimes automatic escape analysis can work when references escape, but not always.

Would you like the object to have its destructor called even though it might still be accessible? What would happen if the program subsequently tried to access the object?

Most languages are naturally uncomfortable with tidying up some resource while it might still be referenced. For this reason, languages prefer to leave it up to the programmer to explicitly manage destruction themselves.

A perfect example is C#'s IDisposable interface. The language obviously "knows" that a disposable object needs cleaning up after it is used - that's what IDisposable means - but it cannot work out when this should happen. So it lets the programmer decide when that actually happens. If a programmer wants destruction based on scope then she should use the using statement. If she wants destruction based on some other condition then that is possible too. The language cannot make this decision.

I hope that answers your question.


The CLR also has different concepts for destruction and deallocation. Not sure what the using block and IDisposable have to do with that.


Yeah!

Objects on the stack are easy to use, logical, and orders of magnitude faster to instantiate than objects on the heap!

(when the underlying runtime uses malloc)


I believe C# has something similar too, "using":

  using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Write, FileShare.None))
  {
      Byte[] info = new UTF8Encoding(true).GetBytes("This is some text in the file.");
      // Add some information to the file.
      fs.Write(info, 0, info.Length);
  }


I was about to post this. Although the C# version requires an IDisposable interface and implies that you're operating on a resource.

Ruby has similar concept when you open a file and pass in a block to operate on the File object.

I'm pretty sure the author is just excited about the language and failed to recognize that this pattern is everywhere in different programming languages.


The author is excited that functions have their inverses by default, I think. An analogous construct in Python would be something like:

  with_under do_something() as f:
    ...
translated to

  f = do_something()
  ...
  f.undo_something()
With do_something automatically matched to undo_something.
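
A sketch of how that automatic matching could be emulated with today's tools (with_under and the registry are made up, not real Python):

    from contextlib import contextmanager

    INVERSES = {}  # registry: setup function -> matching teardown function

    def inverse_of(do):
        def register(undo):
            INVERSES[do] = undo
            return undo
        return register

    @contextmanager
    def with_under(do, *args):
        resource = do(*args)
        try:
            yield resource
        finally:
            INVERSES[do](resource)  # the matched undo runs automatically

So after registering undo_something with @inverse_of(do_something), `with with_under(do_something) as f:` tears down automatically.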


You can do the same thing in C# that you can in Ruby

    new FileEx (SOME_PATH, () => ConsoleEx.ReadLine (), () => ConsoleEx.CurrentLine != "QUIT").WriteAll ();
is equivalent to

    using (var file = new File(SOME_PATH))
    {
        string line = null;

        try
        {
             while((line = Console.ReadLine()) != "QUIT")
             {
                file.WriteLine(line);
             }
        }
        catch (Exception ex)
        {
            throw;
        }
        finally
        {
            file.Close();
        }
    }


For a second I was intrigued and tried to find that FileEx class that I had obviously failed to notice before.

It doesn't seem to be part of the framework though? Homegrown and _you_ expand it to something that is equivalent to the second code snippet?


Yes, it is homegrown[1]. I have quite a few little utilities that use (or abuse) iterators to get something like RAII in C# (I don't like relying on using/IDisposable and I especially don't like requiring it in my public types).

Another one that I use a lot [2]:

    SqlCommandEx sql = "select * from person";

    foreach (dynamic result in sql)
    {
        var full_name = result.first_name + " " + result.last_name;

        DateTime dob = result.dob;
         //do other stuff

    } //connection automatically disposed.
[1] Relevant source: http://pastebin.com/LwHD8MHc

[2] https://github.com/noblethrasher/Prelude/blob/master/SQL.cs


I'm going to assume I missed something, because from what I'm reading:

1. You're avoiding IDisposable for reasons you don't explain and I fail to see

2. You replace it with an idiom which is supposed to do the same thing (except abusing iterators instead of using tools dedicated to that job) and which I don't believe works, as your connection will not (as far as I can see) be disposed of if there's an error in the foreach block. And the code you linked does not seem to use this "pattern" anywhere


Avoiding using/IDisposable is taste thing; I'm not saying that anyone else should do it.

If called from the `foreach` statement, the code in a finally block of an iterator is guaranteed to execute even if the loop exits early due to a `break` or exception. The compiler will also generate a call to Dispose() as well.

Also, it's not really an abuse. When dealing with an unmanaged resource, you're almost always either pulling a stream of data out or pushing a stream of data in. That's precisely the use case of IEnumerable and its dual, IObservable (SqlCommandEx is an example of IEnumerable and FileEx is a loose example of IObservable).

P.S. Now there might be a thus far undiscovered bug in my SqlCommandEx code to which I linked (e.g. the compiler only generates calls to Dispose() on IEnumerator<T> which might not work for dynamic types, I haven't checked) but it's possible to do what I described in principle.


Yes, you can do this, but I usually just end up using the helper functions, e.g.

      System.IO.File.AppendAllText(path, "This is some text in the file.");
Also, why are you writing text to a binary file instead of just writing the text as a text file?


Java 7 has an implementation pattern built in for this now:

    String readFirstLineFromFile(String path) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            return br.readLine();
        }
    }

Anything that implements the AutoCloseable interface gets this behavior for free.


In Perl you can just use scope:

  {
    open my $fh, '<', "text.txt";
    print scalar <$fh>;
  }

  # $fh goes out of scope so file handle is closed


That's a bit different; Python without `with` does the same, but that's just a property of the garbage collector in this case.


Correct. To mimic it exactly is straightforward though:

  sub with_open_file {
      my ($filename, $code) = @_;
      open my $fh, '<', $filename or die $!;
      $code->($_) for <$fh>;
      close $fh;
  }

  with_open_file 'text.txt' => sub {
      my $f = shift;
      print $f;
  };

And to add to my original answer it's not just the effects of GC that are handy but also arbitrary scoping. Here is an example of what you can do with nested dynamic scoping:

    our $rake = "in shed";
    sub where_is_rake { say $rake }

    where_is_rake;  # => in shed

    {
        # take rake out of shed
        local $rake = "rake in hand";
        where_is_rake;  # => rake in hand
    
        {
            # pile up leaves
            local $rake = "using rake";
            where_is_rake;  # => using rake
        }
    
        where_is_rake;  # => rake in hand
    }

    where_is_rake;  # => in shed


C# and VB have the `using` statement:

    using(var db = OpenDbConnection())
    {
       // read from db, connection closed on next curly.
    }


In D, this is done with scope guard (or you can use destructors):

    f = open("test.txt");
    scope(exit) f.close();
    print(f.read());
The scope(exit) causes the following statement to be executed at the exit of the scope, regardless of how it leaves (return, goto, fall through, throw, etc.).

The advantage over try-finally is that the 'finally' code is associated with the open, rather than sitting some arbitrary distance away.


> In D, this is done with scope guard

Which is nowhere near as good, as the closing operation has to be setup explicitly. Go has the same issue with `defer`.


The various tradeoffs between try-finally, RAII, and scope guard are discussed here:

http://www.d-programming-language.org/exception-safe.html

Scope guard has a number of distinct advantages.


Common Lisp has had with-open-file for a long time now. The great thing about with-open-file is that it ensures the file gets closed regardless of what the code in the scope does, using unwind-protect. This implies you can use the same method to create your own with-held-resource macros that work the same way.


Clojure has with-open, to the same effect. I don't think Scala has one natively but the Odersky book covers how to make one using a by-name parameter and an anonymous class.


Interestingly, Python has the "with" statement for this basic genre of operation (executing a block of code while some particular resource is held), although it doesn't require that the third, "undo" step be automatically derived from the first step - it's just a protocol where a 'start' and 'end' method are called on the object representing the resource.

It sure makes it easy to be sure you're closing your file handles/database connections/mutexes, though.


I find it very handy for GUI programming under Python too.

  with wait_cursor(wxwindow):
      ... change cursor to hourglass whilst process runs ...

I have similar context managers for locking controls or other temporary GUI changes.


This idiom is common in mathematics, where it's called a conjugate.

Any time you have an element, y, of a group, the action of pre-composing with any other element, x, and post-composing with x's inverse to get xyx^{-1} is called conjugating, and the result is called a conjugate of y.

http://en.wikipedia.org/wiki/Conjugation_(group_theory)


Constructor / Destructor pairs are how you do this in C++. This is so fundamental to the object system that it's a bit steep to claim that "you've never heard of it". You have. The author's example of "get the rake - use rake - put away rake" maps exactly to [construct - call methods - destruct].

Given the jmp-iness of exception handling in C++, this is the recommended (i.e. the only safe) way to clean up after yourself.


There's even a commonly used name for it: Resource Acquisition is Initialization (RAII). It's been around for many years, and is fundamental to modern C++ style. It's beyond steep to claim we've never heard of it, just because some language has a feature that helps make it easy. Python has "with," Ruby has blocks, many languages have higher-order functions. Not an uncommon concept.


This is just for resource allocation.

The pattern the article talks about is FAR more general -- it's about all reversible processes, even things that have nothing to do with allocation.


> This is just for resource allocation.

It's the same thing, you can represent the pair (operation, inverse) as a "resource". Python calls it "context management" (and context managers), the means are slightly different, the result's the same.

I agree that OP talks about something of a wider scope though, the language features described in these sub-threads are specialized solutions to precise issues, they're not generalized reversal of arbitrary operations.


Sentry classes, which are a generalization of RAII, are used idiomatically in C++ for non-allocation tasks, such as whitespace skipping, logging, and profiling.


Also mapping state to scope e.g. mutex objects that set/clear on ctor/dtor


We can add code to destructors to do general stuff, not only deallocation. The same applies to constructors.

In C++ this works amazingly well for objects on the stack (not on the heap, as there you need to call the destructor yourself).


Is there a programming language where functions have inverses?

How about a computer architecture where every instruction has an inverse, and therefore all functions and even programs are invertible? (Quantum computing, for example.)

http://en.wikipedia.org/wiki/Reversible_computing


Interesting. I'm sure you could do a VM where all the opcodes are reversible. Defining inverses of functions as inverses of opcodes and other functions would then be feasible. Might make a cool project.


Go handles this type of situation nicely using deferred function calls:

"Go's defer statement schedules a function call (the deferred function) to be run immediately before the function executing the defer returns. It's an unusual but effective way to deal with situations such as resources that must be released regardless of which path a function takes to return. The canonical examples are unlocking a mutex or closing a file." http://golang.org/doc/effective_go.html#defer


Seems that everybody is talking about files. :-) Open, operate, close. I think the author is trying to show us something more, mathematically. Like a matrix operation: move to the origin, rotate, move back. But this is a lot harder than just a "resource guard", imo. Any ideas?


Exactly. So far I'm kind of disappointed by the HN debate, it seems like many people only read half the article and are then quick to point out that their own favorite language has a way to release resources when done.

I liked the sum under square better than the read under open example (although the latter may be more relatable). Sum under square squares items, sums them up, and then performs the "inverse-square" operation on the sum. I don't think this is similar to using X or RAII in C++.


Bidirectional programming is somewhat related.

http://www.cis.upenn.edu/~bcpierce/papers/lenses-etapsslides...

e.g. read a file, parse it into an AST, modify AST (i.e. rename a variable), unparse the new AST, close the file.


Awesome slides. Would you happen to have a transcript for it as well? The slides themselves are pretty clear, but there's probably stuff missing.


That's in many cases what a LET with a dynamic binding does in Lisp.

  set origin
  draw
  reset origin

is

    (let ((origin (move 50 50)))
      (draw something))
Similarly, in the Common Lisp object system we use :before, :after and :around methods.

    (defmethod draw :before () (move 50 50))

    (defmethod draw () (draw-something))

    (defmethod draw :after () (move -50 -50))


> Similar in the Common Lisp object system we use :before, :after and :around methods

Ditto for Moose (in Perl)...

  before draw => sub { shift->move( 50, 50 )};

  sub draw { shift->draw_something }

  after draw => sub { shift->move( -50, -50 )};
Also you can package up these method modifiers into a role which not only can be composed into a class but can also be applied just to an object...

  {
    package DrawRole;
    use Moose::Role;
    
    before draw => sub { shift->move( 50, 50   )};
    after  draw => sub { shift->move( -50, -50 )};
  }

  my $d = Draw->new;
  DrawRole->meta->apply( $d );


I think you mean Special Operator UNWIND-PROTECT.


People have posted a lot of examples in a lot of different programming languages. What they show is that all you really need to build this idiom is first class functions (or some approximation thereof). (It also shows that the title is not correct; many, many people are familiar with this idiom.)

When you need to wrap up a common block of code, you use a function. This lets you call the same block in a number of different contexts. When you need to fix the context in place and let the block in the middle vary, you write a function that takes another function as a parameter. An example in JavaScript (might be a little off):

    var withFile = function (path, action) {
        var f;
        try {
            f = File.open(path);
            action(f);
        } finally {
            File.close(f);
        }
    };
    
    withFile('foo.txt', function (f) { ... });
This can be generalized (to some extent and depending on language) so you don't need to write the same structures every time:

    var context = function (prolog, epilog) {
        return function (action, ...args) {
            var o;
            try {
                o = prolog.apply(args);
                action(o);
            } finally {
                epilog(o);
            }
        };
    };
    
    var withFile = context(File.open, File.close);
This example uses JavaScript's hypothetical future "rest" parameters, but there are other ways to manage the same goal.

(Other comments have mentioned the author conflates context management with applying reversible functions. I've only dealt with one of those things here, but similar techniques apply in solving both problems.)


I think that few people are actually aware of this as an idiom. Rather, many people are aware of just a few special cases, but have not considered this in the more general form.

All the examples of various languages providing this for open files also kind of misses it. The point is not whether a particular language supports this for open files, but whether people recognize the idiom and the language supports doing this yourself wherever it fits. For a pattern this incredibly common it's a point well worth making.


Yeah, you may be right that the full generality of the pattern is not appreciated.

For what it's worth, I think most commonly used languages can implement some form of the context function in my example.


> People have posted a lot of examples in a lot of different programming languages. What they show is that all you really need to build this idiom is first class functions (or some approximation thereof).

Or macros. Your `context` is called `unwind-protect` in common-lisp.

There's a difference in that `unwind-protect` does not take an "opening form" (which you call `prolog`), and this makes your `context` implementation broken: you're executing the epilog if the prolog blows up, which is not correct as you have an `o` of `undefined` at this point (and your epilog likely awaits some sort of valid resource).

Side-note, your javascript code also looks incorrect, `apply` takes a context (a `this`) as its first parameter, so you're not applying arguments you're taking the `args` array as the `this` of your prolog. You probably wanted to write `prolog.apply(null, args)`.



What RAII and Ruby's block statement lack here is automatically deriving the reversing functionality.

Few languages have any procedural way to do that.

If Ruby had some way to state "the opposite of opening a file is closing a file" and then use it everywhere, we could do that. The block-syntax open is nearly that, but not quite. You'd need to have no non-block-scope way to use the resource for it to work.

RAII would also give the same benefit, if there were no non-RAII way to declare the resource.


Hmm. I don't agree with this conclusion. Take for instance a file class in C++ which opens the file descriptor in the constructor and closes it in the destructor. Now, for any instance of this class, it is impossible for the open fd to outlive the instance! That is as strong a guarantee as I need! Now, if I leak the instance the fd will leak too, but that is more an issue of manual lifecycle management if you ask me.

This very significant property is what makes me love RAII. I don't count C#'s using statement and/or Ruby's block statement as RAII, since they rely on usage patterns. C++'s automatic destruction of graphs of objects also makes it outshine all other methodologies, and the guaranteed destruction is the key here.


RAII is a usage pattern: You must stack-allocate an object or wrap it in a suitable smart pointer. It's just as easy to do things differently as it is with Ruby's blocks or C#'s using statement. The reason it has a name is because it is a pattern that's talked about so much exactly because following the pattern is so intrinsic to modern C++ programming.

Once you're passing a block to a method like this in Ruby, the execution of the cleanup code is just as guaranteed as it is in C++.


I don't agree completely. As I said, you cannot leak the fd without leaking the memory of the object. This means that in C++ you get the correct behavior as long as you take care of the lifecycle of your objects correctly. You don't need stack allocation or smart pointers as long as you deallocate the object when you are done with it. So the lifetime of the fd is tightly bound to the lifetime of the wrapper instance. If you heap-allocate without deallocating, you have a bug _anyway_ in your C++ program, thus I would like to argue that it is a problem of manual lifecycle management, not a flaw in the RAII implementation.

You do however have a point when you say that you can miss deallocation, and that is just as easy as forgetting to put a variable in a block. In an ideal world I would like to be able to decouple deallocation time from destruction time; as someone mentioned above, they are completely orthogonal issues. If this were possible, one could guarantee the destructor callback but defer deallocation until later. This would probably give leeway to more effective memory allocation patterns while keeping RAII alive.

And another thing about blocks: how do you deal with trees and graphs of objects? That is not trivial when using blocks. You can get around it, but continuing that way would most likely lead you to something like Haskell's monads (the "programmable semicolon" aspect of them).


You'd need to have no non-block-scope way to use the resource for it to work.

You can already do that, just `return unless block_given?`. In principle I don't see why you'd want the non-block version anyway, but maybe there's some use case where that is valid.


IIRC, Common Lisp has an UNWIND-PROTECT operator for that. It works even if exceptions are thrown, and it is the reason Common Lisp (as opposed to Scheme) has no continuations.


Python's context managers have that; define the two sides of the functionality in __enter__()/__exit__(), and voila.

C#'s IDisposable has this, too.

So do other languages. It's really not that unusual a pattern.


It is a common idiom in C++ RAII https://en.wikipedia.org/wiki/Resource_acquisition_is_initia...

Context managers in Python:

  with fly('Seattle'):
      see_sights()
Real-world example from Fabric http://fabfile.org:

  with cd('/to/path'):
      do_stuff()


Python has this with the "with" statement. It's often used for files, but you can write your own using classes with __enter__ and __exit__ methods.

However, you can use the ``contextlib`` module to make really simple ``contextmanager``s with the yield statement.

e.g.:

    from contextlib import contextmanager

    @contextmanager
    def glcontext():
        glBegin()
        try:
            yield
        finally:
            glEnd()  # make sure glEnd runs even if the body raises

(This is an example where someone wanted to do OpenGL in Python.)


The OP seems to be conflating two different things: undoing an action (closing a file, etc), and performing the inverse of a reversible function.

The first is amply covered by the RAII/unwind-protect etc. idioms that most languages do have.

The second is not, as most programming languages don't have the concept. The closest I can think of is the ability to use a predicate in either direction in Prolog (or logical programming in general)


This reminds me of my friend's project for a recent hackathon where he implemented a language that guaranteed every function in it was reversible. It would let you do something like this for any function.

Unfortunately, I don't think he found it particularly useful and gave up on it after the hackathon. Still, it was a cool project.


This sounds fun. Is there a link anywhere?


There are a pair of research languages called Harmony and Boomerang [1] that do this with what they call "lenses". There are also some easy-reading introductory slides [2,3].

[1] http://www.seas.upenn.edu/~harmony/

[2] http://www.cis.upenn.edu/~bcpierce/papers/lenses-etapsslides...

[3] http://www.cis.upenn.edu/~bcpierce/papers/harmful-mfps.pdf


Here's the link: http://stanford.edu/~jacobt/reversible.html

I don't think anything happened to it since the hackathon. Judging by my own experience with hackathons, I wouldn't expect too much in the way of code quality. (Reading some of my old hackathon code still makes me shudder :)).


C#'s 'using' statements allow this kind of pattern for types implementing IDisposable. It looks something like this (might be a little off):

    using (var file = File.Open("foo.txt"))
    {
        ...
    }


This is what I thought of immediately, except that to implement the "end" behavior you have to override the IDisposable shutdown sequence? (Does that make sense?)


LISP has had this for years:

  (with-open-file (output "foo.txt" :direction :output
                          :if-exists :supersede)
    (print-foo output))


That's not a general op/inverse thing like J apparently has, though. Also, which lisp?


That looks like Common Lisp, which also has a general version named unwind-protect: http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec...


Not really. unwind-protect is just Lisp's version of Java's "finally" statement.


Oh you're right, it'd need to take a param with the setup form(s) and somehow look up the corresponding cleanup form(s), sort of like getter and setter code registered with setf.


Terminologically speaking, Scheme is a Lisp. And ELisp is a Lisp. And Autodesk VisualLisp is a Lisp. And Clojure is a Lisp. And Arc is a Lisp.

But Common Lisp is Lisp.


? My lisp has more lispery than your lisp ?


It's the difference between being Lisp and being a member of the Lisp Family (aka "being a Lisp"). It's a fun way to weed out 1-bit people from the population: http://www.xach.com/naggum/articles/3225130472400367@naggum.... Furthermore Common Lisp is a Lisp-2 whereas Scheme is a Lisp-1. (Aesthetically I prefer Scheme.) So really there are several working meanings at play when people say "Lisp", this is frustrating for 1-bit people.

My usual heuristic in using context clues to determine which meaning is meant: if there is no qualifier for a particular family member of Lisp, I assume something is talking about the Lisp family, unless accompanied by a code example, in which case it's probably Common Lisp. When I see someone type "LISP" I usually think "This person doesn't know Lisp", just like how you'll sometimes see people type FORTH or PERL when talking about those languages. But that's not always the case so it's useful to double-check before calling them out, especially if they give code examples. (They may just be old after all. ;) )


Oh for sure, I was just poking a little bit of fun, thanks for the Naggum article though I had not come across that one.

FWIW I tend to think of common-lisp when people say lisp in general, the schemers and clojurites tend to self identify better.


Clojure has it too.

If you want to abstract the pattern out even further, then all you are doing is running a function within a certain context and then getting everything back to normal. Using under in J, scoping it out with ruby, or using with-open in Lisp are all examples of the pattern, just different ways of thinking about it.


I think what you're describing is somewhat different: temporarily changing context is like a let block. So for instance in CL you can rebind standard-output to a different stream using let, but then you have to do the legwork to ensure that everything is cleaned up if something goes wrong, while with-open-file or with-output-to-stream will do part of that work.


SBCL


The language Nickle (http://nickle.org) has a "twixt" control structure:

    twixt (expression1, expression2)
           statement1
If expression1 is true, then statement1 will be executed followed by expression2 in that order. The twixt statement guarantees that expression2 is always executed if expression1 is true, similarly to how "finally" works in java.


> If expression1 is true, then statement1 will be executed followed by expression2 in that order. The twixt statement guarantees that expression2 is always executed if expression1 is true, similarly to how "finally" works in java.

Am I missing something here? How is that different from, say:

    if (expression1) {
        statement1
        expression2
    }

?


It's different in that expression1 is always evaluated whenever control enters the twixt and expression2 is always evaluated whenever it leaves. A simple example is that if statement1 throws an exception, then expression2 is evaluated before the exception is handled. This is how finally works in Java.

More interestingly, Nickle has support for continuations, so if statement1 generates a continuation and makes a non-local exit, then expression2 is evaluated before that exit, and expression1 is evaluated before the continuation is resumed.

In other words, statement1 is always bracketed by expression1/expression2, even in the presence of non-local control flow.


Aha! Thanks for clearing that up.


If there were a way to mark functions and their inverses consistently in a language, then functions with inverses could be used in (semantically) pure functional code, even if they were destructive.

For (stupid) example:

    def reverse(x) as inversible with(reverse): # special case - reverse is its own inverse
        x = x.reversed() # I'm too lazy to code this by hand

    def lastFive(x) as pure: # let's say the language checks that a pure function only uses pure and inversible functions
        return x[:5] under reverse
Here lastFive is a pure function even though reverse is destructive.

This assumes everybody sticks to one definition of reversing something; in this case, functions f and g are inverses of each other when, after invoking f(x), doing only pure or inversible things to x, and then invoking g(x), x is unchanged.

And I don't know if this will be useful for anything - the only point seems to be avoiding making copies while still writing pure functions. But the resulting code, while semantically pure, is not thread-safe. So I don't know if it's worth anything.


You could dispense with the 'you've never heard of'.


IIRC, we use this idiom in SICP while doing symbolic algebra.

For instance, when we have x+y=z and know only x and z, we need to reverse the equation to get y=z-x. The system was built on nodes which would propagate change (a-la events).

IMO, python's with statement with __enter__ and __exit__ looks mostly like this.
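
A minimal sketch of that propagate-both-ways idea (purely illustrative; SICP's actual connector system is event-driven and richer):

    class AdderCell:
        """Maintains x + y = z; given any two values, derives the third."""
        def __init__(self):
            self.x = self.y = self.z = None

        def propagate(self):
            if self.x is not None and self.y is not None:
                self.z = self.x + self.y   # forward: z = x + y
            elif self.z is not None and self.x is not None:
                self.y = self.z - self.x   # reversed: y = z - x
            elif self.z is not None and self.y is not None:
                self.x = self.z - self.y   # reversed: x = z - y

    cell = AdderCell()
    cell.x, cell.z = 2, 5
    cell.propagate()
    print(cell.y)  # 3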


I refuse out of principle to click on links with headlines that are so obviously trying to trick me into going to a site.

Make the headline informative instead, if you have great content people will read and share it anyway.

Better version of this headline:

XXXX: A programming idiom that helps you do YYYY


Other posts have already mentioned that specific macros of this form exist in several Lisps. I've seen more than one macro-writing guide that talks about the with-something macro pattern for managing resources, and I've seen it applied in libraries that provide macros like with-database and with-query-results.

CL and Clojure could easily support a generic with macro requiring that there be an open and close method for the resource in question.


In C# this is "using":

  using (var something = new Something())
  {
      // Do something with it.
  }  // it goes away now

Typically this idiom is the function call itself. The act of sipping coffee is the point of the function, and picking up the cup and putting the cup back down are the whole point of the function call.


This reminds me of Suzuki's idiom, "Leave No Trace," found in the book Zen Mind, Beginner's Mind.


Python has had this idiom for a while now: it's known as a "context manager", also known as the "with" statement. Here's the PEP: http://www.python.org/dev/peps/pep-0343/


The BETA language does this with methods (you can't override a method, but you can add pre-and-post operations that wrap it).

I was never convinced this was a great idiom. BETA didn't become very popular, in any event.


Beta lives on in the Racket class system as (inner).

http://docs.racket-lang.org/reference/createclass.html?q=inn...)


An interesting approach to this is the one used in Go, where the "defer" keyword schedules code to run on function exit. It's simple and doesn't require adding more nesting to your code.


Ruby has "ensure" if you want something to be guaranteed to run on exit - it can be used at the end of methods, or by explicitly wrapping code in begin .. end.


(Assuming it works the same way as Smalltalk's ensure...) a really nice aspect of this is that the resource is held open while you're in the debugger, but cleaned up when you finish debugging.


"Never heard of" is quite a stretch.

During my dark past as a VB programmer (early-mid 90's) I even indented the code for pairs like this.


Oh, c'mon! All the people with `with` and RAII and files, please reread the article. At least the last example.


I was going to argue that point, but then I glanced at the J docs here:

http://www.jsoftware.com/help/dictionary/d202n.htm

Obverse is defined in a table and some special cases, and there's a built in "this is the obverse of that" instruction.

You could easily tie square/sqrt together as a pair in Python and it wouldn't be significantly different.

Under(square, sum, mylist) or whatever.
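
A sketch of that in Python (under and the square/sqrt pairing are assumptions here, not J's actual obverse machinery):

    import math

    def under(f, g, g_inverse):
        # conjugation: apply g, then f, then (a partial) inverse of g
        return lambda x: g_inverse(f(g(x)))

    square = lambda xs: [v * v for v in xs]

    # "sum under square": square the items, sum them, take the square root
    magnitude = under(sum, square, math.sqrt)
    print(magnitude([3, 4]))  # 5.0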


Pretty sure Ruby has this.

File.open(name, mode) receives a block, executes it, and then closes the file.


Ruby has a specific instance of that pattern, but to my knowledge it lacks a way to say, "do some arbitrary operation that has a side-effect, do this, then undo the operation". The function taking the block still has special-case code to open and close files.


You still have to implement the inverse of the arbitrary operation, at some point, somehow.

A programming language can't figure out what the inverse of open() is without someone somewhere telling it that it's close(). The point of the idiom is to have a library figure that out so that users don't have to. Therefore, Ruby supports it fully (where API developers are sane, that is).


Having that built in for every side-effect in the language is pretty amazing, though.

Case by case, the block scoping idiom is a good way to handle it.



