Hacker News
Python's Mutable Default Problem (objectmentor.com)
68 points by tswicegood on Feb 20, 2011 | hide | past | favorite | 62 comments



I'm not sure this constitutes any problem other than a lack of understanding of the Python runtime. What the author describes as:

"the mutable default parameter quirk is an ugly corner worth avoiding"

could also be described as:

"a natural outcropping of python's late binding, "names are references" variable model, and closure mechanisms, which provide a consistency to the language that is often crufted up in others"

I do somewhat agree with the author that this particular functionality should be a "use only when needed" feature. I don't think it should be avoided at all costs, though, because there are times when the mutable default saves a lot of code. In fact, in a few cases the code to work around mutable defaults gets into some serious voodoo, because the writer is frequently trying to work around the bigger mutable/immutable object and names-are-references "issues" in Python.

This also reminds me of something I was reading on the front page today about the old 'use the whole language' vs 'simplicity is king' holy war.


"a natural outcropping of python's late binding, "names are references" variable model, and closure mechanisms, which provide a consistency to the language that is often crufted up in others"

hm... that's debatable. The implementation could have just as easily chosen to evaluate the default arguments each time the function is invoked and that decision wouldn't have broken any of the existing mental models of variable binding/closures.
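For reference, a minimal sketch of the behaviour being debated: the default expression is evaluated once, when the def statement executes, not on each call.

```python
def f(stuff=[]):
    # the [] above was created once, when this def was executed
    stuff.append(1)
    return stuff

print(f())  # [1]
print(f())  # [1, 1]  (same list object, shared across calls)
```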


Precisely because Python has late binding you would expect the parameters to be evaluated on each call to the function.

One thing Python lacks is the ability to use preceding arguments in defaults, e.g. you cannot do this:

    def f(a=3, b=a+1):
        return (a + b) / 2
    
    NameError: name 'a' is not defined
Oops.


There is no order for named arguments. You could call the function like this after all:

  f(b=3)


Of course; in that case the default value expression for `b' would not be evaluated.

Common Lisp does this right.


That hides default logic in the signature. Why would you favor that over

  def f(a=3, b=None):
    b = b or a+1 # or use a more explicit version
    return (a + b) / 2


That code is logically convoluted, which is one reason not to love it. Which do you think is the more straightforward statement, just in general?

- "Let B be one greater than A unless otherwise specified."

- "We have no default value for B. If B has a value, then let B be equal to that value. If B does not have a value, then let B be one greater than A."


Because that can be said more succinctly. It is even easier to read as there is less to read, which I realize is mostly subjective.


I'm pretty sure this is the same argument used to make Perl a bad guy.


I disagree, but we've gotten a bit off topic.

I think Python's behaviour is confusing and basically never what anyone actually wants. Regardless of whether or not you can use other params in defaults, the defaults should be evaluated on each call.


"a natural outcropping of python's late binding, "names are references" variable model, and closure mechanisms, which provide a consistency to the language that is often crufted up in others"

My mileage varies.

I'd prefer default parameters to honor referential transparency, whatever hoops the runtime has to jump through to make this happen.


That would make that the only place in which Python has referential transparency, though. It may be a quirky side-effect of consistency, but it is consistent.


Strings are also immutable, and thus have referential transparency. And so do numbers in Python.


Referential transparency is a property of functions, not data (unless you're in a lambda mood and treat them as functions of zero arguments, but in that case you're not in Python so it's not relevant here). Even a function to "concatenate two strings" could be passed an object that overloads the addition operator to cause arbitrary modifications:

    >>> class Evil(object):
            def __init__(self):
                self.evil = 1
            def __add__(self, other):
                result = ("%s" % self.evil) + other
                self.evil += 1
                return result


    >>> def referentially_transparent_concat(a, b):
            return a + b

    >>> e = Evil()
    >>> print referentially_transparent_concat(e, "hi")
    1hi
    >>> print referentially_transparent_concat(e, "hi")
    2hi

You can program in a referentially-transparent style with Python, but you'll have to do it by adding your own restrictions to the code you write. Python will not help you with that.


In your code the devil is in the data-type.


As noted in the comments, DON'T use

  stuff = stuff or []

because if you pass an empty list, you'll get a new one rather than mutating the one you passed.

  stuff = stuff if stuff is not None else []

is wordy, but at least it's correct.
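A quick sketch of the failure mode (names are illustrative):

```python
def append_one(stuff=None):
    stuff = stuff or []  # bug: an empty list passed in is falsy, so it's replaced
    stuff.append(1)
    return stuff

mine = []
result = append_one(mine)
print(result)  # [1]
print(mine)    # []  <- the caller's list was silently never touched
```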


I prefer the usual alternative

  if stuff is None:
      stuff = []


It's sad that such a common and concise idiom as "x or y" is so perniciously, subtly broken in Python, and that there is no satisfyingly concise equivalent.

If I were being cavalier and had an extra wish to burn, I'd request

  x else y
to mean x unless x is None.


Make yourself a function.


Not so helpful in a strict language (since you want y to be evaluated only when x is None)


Yes, that's true. You would need to wrap it in a lambda, but then that looks horrible and you might as well use an if.
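The wrapped version the parent describes might look like this sketch (the helper name `lazy_default` is made up):

```python
def lazy_default(value, factory):
    # evaluate the factory only when no value was supplied
    return factory() if value is None else value

def f(stuff=None):
    stuff = lazy_default(stuff, lambda: [])
    stuff.append(1)
    return stuff

print(f())     # [1]  (fresh list each call)
print(f([5]))  # [5, 1]
```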


So indeed the author of the article is really being too clever here.


And as usual, being clever in Python equates to getting yourself in trouble.

Dammit, people - in Python, "That's clever" is an insult. Learn the philosophy of a language and you'll be much happier using it.


That's exactly why I prefer Python over Ruby: not the language, but the philosophy of the community. The Ruby community seems to adore 'clever', while the Python community explicitly shuns it. I view the latter as being more mature and borne of experience.


Python itself is pretty "clever" compared to many other languages (GC, lack of obvious 1:1 correspondence between code written and code executed, dynamic typing, significant indentation, decorators, generators, etc.). If Python programmers were really opposed to cleverness, they'd be writing straightforward assembly or a very thin veneer over it.


A language that forces me to say "self" every other word doesn't strike me as particularly clever.

Especially since that parameter is not necessary and it breaks standard programmer expectations (if I declare a method with 3 parameters, I should call it with 3, not 2).


Ah, but you are calling it with 3 parameters:

  foo.bar(baz, sputz)
  ^1      ^2   ^3
When you write a class method, it has more information available to it than a similar subroutine; thus, another parameter.

GvR brings up the deeper reasons for explicit self at http://neopythonic.blogspot.com/2008/10/why-explicit-self-ha... .


Yes, this is exactly what I mean by "breaking programmer's expectations". Except maybe Modula, no language works like that at all. The parameters are in the parentheses, period. If you need to pass this, you do so in the declaration and in the invocation. If this is passed implicitly, you don't declare it in the parameters and you don't pass it at the call site.

Python is doing this totally weird stuff that sits in the middle and that makes no logical sense at all.

The fact that Python forces you to declare the "self" parameter is simply due to the fact that it's old, old, old. Nothing wrong with that, but post-rationalizing it by saying it's okay to declare a method with 3 parameters but call it with just 2 is just silly.


Simply because it does not fit your expectations does not mean that it makes "no logical sense at all". As faulty beings, we often have expectations that really are quite far from logical.


I like 'self'. I don't mind it at all.

The error messages related to `self` in method definitions caused me some confusion when starting out with Python. For instance, if you define a method with no arguments, calling it results in "TypeError: your_method() takes no arguments (1 given)".


This is why I love F#.

It lets you define the name of the this parameter in exactly the same way you pass it.

  type Bar(p) =
    member x.Foo(bar) = printf "%s %d" bar p 

  let x = new Bar(10)
  x.Foo("Foo bar")
Output:

  Foo bar 10

And yes, the types of the variables bar and p are inferred from the %s and %d, respectively.


You can rename self to whatever you want in Python by changing it in the function definition.

    class Flip(object):
        foo=4
        def flop(x,num):
            x.bar=x.foo+num


Right, but his point was that the self variable is declared in front of the method name, similarly to how it looks when you call said method.


Python is "clever" so that the Python programmer doesn't have to be.


Actually Python has pretty close correspondence between source code and compiled byte code.


You can also do:

    stuff = stuff is not None and stuff or []

though note that this still replaces a passed-in empty list with a new one, for the same reason as plain `stuff or []`. That said, the `if stuff is not None:` check is the most Pythonic way of doing it.


I didn't realize you could write perl in python.


FWIW, Perl 6 implicitly treats default values as closures, and calls them when no argument is passed that could bind to the optional argument.

That way you get a fresh array each time, and you can even use defaults that depend on previous arguments:

    sub integrate(&integrand, $from, $to, $step = ($to - $from) / 100) { ... }


Pylint (or maybe it was pep8) has told me not to make dicts default arguments when running it against my code, but didn't explain why. Thanks for the post.


You just need to fully understand when Python does evaluation.

    import types
    def function(item, stuff = lambda: []):
        if type(stuff) == types.FunctionType:
            stuff = stuff()
        stuff.append(item)
        print stuff

    function(1)
    # prints '[1]'

    function(2)
    # prints '[2]'

Scala has nicer syntax for this, because a parameter can be declared to be passed by name:

    trait Map[A, B] {
        …
        def getOrElse (key: A, default: ⇒ B): B
        …
    }

the `default` parameter is passed by name (the `⇒ B`), so when you do

    getOrElse(someKey, defaultValue)
the `defaultValue` expression is not evaluated up front; it is only evaluated if the key turns out to be missing.


Yes, it's a wart. You learn the patterns he mentions pretty quickly.


I knew the pattern well from other languages, but would have assumed that it wasn't necessary in Python. So, this is good to note.


"stuff = stuff or []"

This idiom is baked into the perl community. It's kind of funny to see a Pythonista deciding it's a good idea. (You can tell it hurts him too).


It’s not a good idea in Python. There are legitimate reasons to pass falsy objects as function arguments.


Funnily enough, Perl (since version 5.10) actually has a solution to this. The expression

  $foo // "default"
evaluates to "default" if $foo is undefined, and to $foo otherwise (even if it is defined but false).


I've been programming Python on and off for 12 years, and full time for the past 3, and this is the first time I've seen anyone recommend that idiom. So on the whole the Python community does not think it's a good idea.


JavaScript, too:

    stuff = stuff||[]
since JavaScript doesn't even support default argument values.


I keep seeing this "problem" come up, but I don't understand how it's realistic. If you have a function that modifies a parameter as a side effect, why would you have a default value for the parameter?

And since the site's comments seem to be taken over by link spam, is this mention on Hacker News just a clever way to juice the Google rank of said spam?


I saw it bite some people where I work. It's now used for discussion with potential hires.


This 'problem' can actually come in handy when used with a regex callback function.

See if you can determine what this does:

    def cbk(match, nb = [0] ):
        if len(match.group())==len(nb):
            nb[-1] += 1
        elif  len(match.group())>len(nb):
            nb.append(1)
        else:
            nb[:] = nb[0:len(match.group())]
            nb[-1] += 1
        return match.group()+' '+('.'.join(map(str,nb)))
    
    str = re.compile('^(#+)',re.MULTILINE).sub(cbk,str)


The code converts:

  ##
  #
To:

  ## 0.1
  # 1

str is a builtin; don't use it as a variable name, especially if you also use it in its original role.


Correct, & thanks for the tip! But don't worry, I changed the variable name from my copy & paste and just didn't give it a second thought.


The problem is not so much about mutability; it's that default parameter values escape the scope of their function.

Which is very, very messed up (but not the first thing Python messed up).


It's on SO's Python FAQ too. This is the only thing in Python that's really bitten me. I remember it took me about a week to figure this out when I was tearing down my algorithm bit by bit to find out whether my proof was wrong or the code.

http://stackoverflow.com/questions/1132941/least-astonishmen...


You can use this trick for memoization. Example: http://paste.pocoo.org/show/341849/
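The memoization trick looks roughly like this (a sketch, not necessarily the code at the paste link):

```python
def fib(n, _cache={}):
    # the default dict is created once and shared across all calls,
    # so it survives as a cache between invocations
    if n not in _cache:
        _cache[n] = n if n < 2 else fib(n - 1) + fib(n - 2)
    return _cache[n]

print(fib(30))  # 832040
```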


The article is right, though, in stating that later readers of your code may not be expecting this behavior, and it could lead to problems.


Memoization is an implementation detail and shouldn't be part of the argument list.


"stuff = stuff or []"

This would fail to have the expected behaviour here:

  fill_list = []

  stuff = function(info, fill_list)

  print fill_list

Use the `if stuff is None:` paradigm.


Not sure that's a "problem", given that the only other ways to implement function-static variables would be a variable visible to the whole module, or tricks with decorators...

    @statics(blah=[])
    def foo(normal_args, **kwargs):
    # or
    def foo(normal_args, blah):
It messes up the idea of looking at the definition to find the function signature.
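For what it's worth, a decorator like the hypothetical `statics` above could be sketched as follows (function attributes stand in for the static variables; the name is the parent's, not a real library):

```python
def statics(**attrs):
    # attach each keyword argument as an attribute on the decorated function
    def wrap(func):
        for name, value in attrs.items():
            setattr(func, name, value)
        return func
    return wrap

@statics(blah=[])
def foo(item):
    foo.blah.append(item)  # per-function state, declared right above the def
    return foo.blah

print(foo(1))  # [1]
print(foo(2))  # [1, 2]
```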


it's a gotcha but it makes perfect sense once you understand why it works that way - i.e. the difference between evaluating a function definition and calling it.


The Pythonist doth protest too much, methinks.


I think it's clearer and more Pythonic in this case to do

  def function(item, stuff):
      ... blah blah

  def function_default(item):
      function(item, [])

(giving the wrapper its own name, since Python doesn't have overloading)



