Safely using destructors in Python (2009)

wwright · on Jan 5, 2019

I dislike the argument about context managers being unusable if you have a class that needs to re-use some context across many methods.

There are a lot of ways to do so comfortably. A trick I really like is to use `contextlib.contextmanager`, and have a pure constructor and a separate static factory method that injects appropriate contexts. This also makes the code more loosely coupled in general :-)

    from contextlib import contextmanager

    class Session:
        @classmethod
        @contextmanager
        def create(cls):
            with db.connect(...) as connection:
                yield cls(connection)
        
        def __init__(self, connection: db.Connection):
            self._connection = connection

        def login(self, name, password):
            self._connection.query(...)

    with Session.create() as session:
        ...

chronial · on Jan 5, 2019

I think I prefer this version:

    class Session:
        def __init__(self):
            self._connection = db.connect(...)

        def __enter__(self):
            self._connection.__enter__(self)
            return self

        def __exit__(self, *args):
            return self._connection.__exit__(self)

        def login(self, name, password):
            self._connection.query(...)

    with Session() as session:
        ...

It might need a bit more knowledge about context managers [1], but it feels less magic and more explicit to me – i.e. more pythonic :). But it also contains the assumption that db.connection already returns the connection. If that is not true, a bit more housekeeping is needed:

    class Session:
        def __init__(self):
            self._connection_context = db.connect(...)

        def __enter__(self):
            self._connection = self._connection_context.__enter__(self)
            return self

        def __exit__(self, *args):
            return self._connection_context.__exit__(self)

        def login(self, name, password):
            self._connection.query(...)

[1] https://docs.python.org/3/library/stdtypes.html#typecontextm...

rnnr · on Jan 5, 2019

it's subjective of course, but I find context manager decorators very pythonic, I still remember the 'whoa! that's beautiful' moment when I 1st learned about them.

But most importantly, generators as enclosures are used everywhere today with the whole async-await paradigm.

chronial · on Jan 5, 2019

I absolutely agree – it's the specific usage in the class that I don't like. Having a factory instead of using the constructor, with that IMHO not that obvious interaction with the context manage. A factory that is actually a context manager just seems non-trivial, and given that there is more straightforward solution available, I would avoid it.

hiccuphippo · on Jan 5, 2019

This is just what I needed. Thank you!

gvx · on Jan 5, 2019

To me, insisting on using `__del__` is just a sign of someone trying to write C++ in Python. If cleanup needs to happen, the Pythonic way is to make the lifetime of a resource explicit with a context manager. In the cases where that doesn't make sense, there is usually nothing to explicitly clean up anyway.

I honestly don't remember ever having to write a __del__ method in the decade I've been using Python.

uryga · on Jan 5, 2019

> If cleanup needs to happen, the Pythonic way is to make the lifetime of a resource explicit with a context manager.

in some cases resources are long-lived and not possible to enclose in a context manager. as an example, a while ago i wrote some PyOpenGL cide and wanted to automatically release some temp GPU buffers when no longer needed. however, they had to live for multiple iterations of the app's main loop, so a context manager wouldn't work – __del__ was the best place to release them.

uryga · on Jan 6, 2019

to be fair though, __del__ did cause problems. when the app was exiting, PyOpenGL would release the GL context before my buffers were __del__'ed, so the __del__s tried to free a buffer that stopped existing and crashed. it'd be nice if those methods were called at a predictable time and not "sometime after the object's RC drops to zero, maybe".

nerdponx · on Jan 6, 2019

Can an object safely call del on itself from inside __exit__?

syllogism · on Jan 5, 2019

Context managers change the API a lot though. If the resource cleanup is an implementation detail like freeing a handle to some C data, it's much better to do that in a __del__ method.

wwright · on Jan 5, 2019

If context managers have changed your API, then the context of the resources is no longer an implementation detail; the lifetime of your object and the lifetime of your resources are linked.

ChrisSD · on Jan 5, 2019

> While recommended, `with` isn't always applicable. For example, assume you have an object that encapsulates some sort of a database that has to be committed and closed when the object ends its existence. Now suppose the object should be a member variable of some large and complex class (say, a GUI dialog, or a MVC model class). The parent interacts with the DB object from time to time in different methods, so using `with` isn't practical. What's needed is a functioning destructor.

Isn't the issue with __del__ that you have no idea when or even if it will ever run? This is always a problem with destructors, no? I know I'm slightly too paranoid about resource management but I get nervous with that lack of control.

MereInterest · on Jan 5, 2019

Depends on the language, but you are correct that in GC languages it off nondeterministic when the destructor is called. I've read some articles that make a distinction between C++ destructors, which do occur deterministically, and finalizers in GC languages.

masklinn · on Jan 5, 2019

This is not just a language thing: CPython is reference-counted, so `__del__` is deterministic (in the absence of cycles at which point all bets are off, especially since the cycle breaker bails out then) however pypy or jython are not refcounted, and `__del__` is thus very much non-deterministic.

In fact, this implementation divergence is most of the reason why `with` got added to the language, the reliance on reference counting for deterministic release of resource was ultimately seen as an issue for alternative implementations as they'd face non-memory resource exhaustion issues when trying to run existing software.

ChrisSD · on Jan 5, 2019

Even when destructors are deterministic, they still may run later than you realise or not at all.

For example, the Rust docs list some platform specific examples of when the 'drop trait' (aka destructor) may not run for thread local values[0]. There are also circular references or simply another object holding a reference (directly or indirectly) that isn't clear from the context.

The `with` statement at least seems to clear up a lot of potential ambiguity if nothing else.

[0] https://doc.rust-lang.org/std/thread/struct.LocalKey.html#pl...

chronial · on Jan 5, 2019

This is stated even more clearly here: https://doc.rust-lang.org/std/mem/fn.forget.html#safety

> forget is not marked as unsafe, because Rust's safety guarantees do not include a guarantee that destructors will always run. For example, a program can create a reference cycle using Rc, or call process::exit to exit without running destructors. Thus, allowing mem::forget from safe code does not fundamentally change Rust's safety guarantees.

stubish · on Jan 6, 2019

I consider Python destructors either bugs or unnecessary. The reason, which the article neglects to consider, is that there are always cases where the destructors are not called. If you are relying on destructors running to do required cleanup (eg. removing temporary files), you have a bug. If you are not relying on the destructors running (eg. trace logging, closing database connections), then they are unnecessary.

They are also a source of latent bugs, which can eventually bite you as your code ages. For example, did you know it is unsafe to reference module level attributes from a destructor? Lets say your destructor does some simple cleanup using shutil.rmtree. This only works if the object's destructor is invoked before the shutil module is garbage collected at program termination. It might work now, but change the imports around or delay when the object is cleaned up (say, sticking it in a cache), and your program now spits out NameError tracebacks on termination. Which thankfully are just noise, since exceptions in destructors are ignored. Which itself is another gotcha, since you might actually want your program to exit with a failure code if it is failing.

So yes, avoid temptation and never use Python destructors. Even if you use them for the trivial things they allow or the baroque constructs required to do more complex things safely, the next person working on your code won't.

albertzeyer · on Jan 5, 2019

Be careful with __del__ also in the context of multithreading. __del__ can be called in any thread, where ever the GC currently executes. Some resources however can only be accessed (and also be cleaned up) from a specific thread (some GUI APIs e.g. only from the main thread, or some other things only from the same thread where they were created).

See e.g. also this bug: https://github.com/tensorflow/tensorflow/issues/22770

Which is still not really resolved. Maybe this is actually some of the problems with circular refs, and then Swig is also involved, and also multi-threading.

mlthoughts2018 · on Jan 5, 2019

> “While recommended, with isn't always applicable. For example, assume you have an object that encapsulates some sort of a database that has to be committed and closed when the object ends its existence. Now suppose the object should be a member variable of some large and complex class (say, a GUI dialog, or a MVC model class). The parent interacts with the DB object from time to time in different methods, so using with isn't practical.”

This isn’t exactly true. One way to still use context managers is to mix them with coroutines, even simple generator functions.

For example, for something repeatedly interacting with s database, you may have a generator function with a non-terminating loop, that relies on generator `send` commands to resume the coroutine with a newly injected value. This can be used to repeatedly insert new data or repeatedly query, and then when a “close” sentinel is sent, it breaks the loop and the context manager automatically cleans up the connection. I’ve used this idea with manipulation of TensorFlow graphs too.

This only works well if it’s hidden behind a library function or other type of user friendly calling interface though. If you’re making the user manually deal with the generator function, they are as liable to make mistakes as of they are tasked with manually closing resources or designing a destructor.

icegreentea2 · on Jan 5, 2019

I don't think using the generator solve's the author's specific problem which is to figure out when to send that "close" sentinel event. The author is looking for a way to "know" when an instance of a large and complex class is "done".

mlthoughts2018 · on Jan 5, 2019

This can be put in the object’s del method though, so that when the object holding the connection goes out of scope, it still uses a context manager as the means of cleaning up the resource.

I agree this isn’t easy all the time, but it is a good trick for library writers, since you control del in that case and can design it this way so that del has no complex logic and the “real” mechanism is still just a behind the scenes context manager.

ali_m · on Jan 5, 2019

What would be the advantage over just having your __del__ method do the cleanup directly (rather than using a context manager)? You still suffer from most of the disadvantages of using __del__ - you have very few guarantees about when it will be called (if at all), and it may render your object un-garbage-collectable if there are cyclic references.

mlthoughts2018 · on Jan 5, 2019

In cases of resource cleanup you often (not always) don’t care so much about deterministic destruction, because holding onto the resource is not acting as a bottleneck and doesn’t lead to any performance or resource problem.

In such a case, there’s a big win to avoiding writing custom destructor logic that adds complexity and requires more testing and so on. If you can use something super simple, like a context manager decorator or a simple generator function with a context manager inside, and the destructor logic is basically just “call close()” to trigger the implicit cleanup of the context manager, it’s often worth it.

You aren’t writing much logic to specially process a “close” operation.. you’re just electing to let the thing containing the context manager go out of scope.

But I do agree in a case where, for example, leaving many open connections to a resource would create a resource limit or bottleneck, then ensuring it happens deterministically is probably important.

notacoward · on Jan 5, 2019

I don't think cyclic references are as rare as the author seems to think. Many problems involve some sort of graph, which might or might not have cycles, and the natural representation of such graphs can't just be dismissed as bad design. Weakrefs don't help in that case either. I don't think that invalidates his main argument, but the discussion of cycles seems a bit ill considered.

icegreentea2 · on Jan 5, 2019

Reading the examples given and trying to generalize, I wonder if anyone else has some thoughts on the ideas of more explicitly tracking state and objects in GUI applications. It's common in GUI frameworks to have a backing class for each complex GUI element (it could be a screen, or a panel/frame, or a complex widget). In some frameworks, we control invoking the backing classes (as in, we call the initializer/constructor), and in other frameworks, we tell the framework which class to build an instance of for us.

In nearly all cases, when we're "done" with a GUI object, we just let go of it, and let the framework or language runtime handle it. If the framework is good, it'll give you some sort of "on_gui_element_undraw" event or something you can handle to trigger your "destructor".

This results in some weirdness. For example, if you have a long living socket that is only suppose to be active for say... 3 of 5 screens of an application, now you're counting "on_gui_element_undraw" events to track GUI state to control backend state.

I've sometimes played around with the idea of wrapping these frameworks with -something- (I don't know what...) that somehow gives you like... the "history vector" of the GUI...

I don't really have fully developed thoughts on this (notably, I really only write native GUIs, and only for smallish internal tooling), but has anyone played with frameworks with more powerful state management?

wwright · on Jan 5, 2019

I imagine that this would be related to separating your GUI code from your “business logic”/“effects”.

You can make a rule that code should only ever invoke widget code if it “owns” that widget. Then, have the code “owning” your widget be responsible for both showing and hiding the widget. That code then always knows when to invoke cleanup for related resources.

krylon · on Jan 5, 2019

I spent the past couple of months working with C#, and I like its approach to deterministic resource handling in a GC language. I think it is very similar to what Python does. Basically, one implements an Interface (IDisposable) and there is little syntactic sugar in the form of the using-statement.

This quickly becomes second nature and is a very intuitive way to handle, for example, database transactions.

makecheck · on Jan 5, 2019

The "weakref.ref()" advice is helpful but I would add that "weakref.proxy()" is often what you want instead of "weakref.ref()" because it does not require the real object type to be known.

Also, I wish Python would make exceptions and "with" easier to combine. It feels like there is always this unwelcome extra level of indentation when code may raise exceptions, and I would rather have a "try with" or something to remove that. This keeps me from using context managers for relatively simple steps that might otherwise be convenient, since it looks better to just do the simple steps as part of a try/except/finally that has to be there anyway.

blattimwind · on Jan 5, 2019

Notably Python ignores exceptions thrown in __del__.

sytelus · on Jan 5, 2019

Also Python won’t call __del__ if there are circular references! And also there are no warnings if there re circular references.

chrisseaton · on Jan 5, 2019

> Also Python won’t call __del__ if there are circular references!

Why not?

blattimwind · on Jan 5, 2019

Python uses cycle breaking i.e. it chooses an object on a circular path and simply deletes it, which essentially bricks the entire connected set for reference counting purposes. This means that if you ran __del__ code in this circumstance it could access references to deleted objects, i.e. use after free.

__del__ is simply an unsound part of Python.

chrisseaton · on Jan 5, 2019

Ah I assumed it marked, ran destructors, then swept.

monochromatic · on Jan 5, 2019

> Python doesn't know the order in which it's safe to destroy objects that hold circular references to each other, so as a design decision, it just doesn't call the destructors for such methods!

sytelus · on Jan 5, 2019

To people who don’t like __del__: what do you do to protect resources when people forget to use with statement? Are there any good pattern for using __exit__ with __del__?

wwright · on Jan 5, 2019

There's a really simple and easy solution in the standard library, but like many things in Python, it's not the first solution you're usually shown: `contextlib.contextmanager` [1].

This API cleanly separates the "context manager" part of your code from the "context" part. The result of calling your function is only a context manager, with no side effects. Using `with` on that context manager yields your actual object :-)

Very simple usage:

    from contextlib import contextmanager

    @contextmanager
    def my_context(...):
        print("Started")

        try:
            yield 5
        finally:
            print("Finished")

    # Calling the function alone does nothing
    >>> my_context()
    <contextlib._GeneratorContextManager object at 0x10de07588>
    # Using `with` runs our side effects
    >>> with my_context() as context:
    ...     print(context)
    ...
    Started
    5
    Finished

[1]: https://docs.python.org/3/library/contextlib.html#contextlib...

blattimwind · on Jan 5, 2019

> what do you do to protect resources when people forget to use with statement?

In many cases it is perfectly reasonable to only allow context manager use. And if it isn't possible, offer explicit close() or dispose() interfaces, don't rely on __del__.

emidln · on Jan 5, 2019

It's not my responsibility as a library author to protect users from themselves. If they have leaks because they refuse to use my API then so be it. If you don't want to use with, you're free to call __enter__() and __exit__() yourself in a way you see fit.