Hacker News
Python idioms (colorado.edu)
169 points by preek on June 12, 2011 | 39 comments



  > [Despite the existence of "enumerate", "xrange(maxint)"] can still
  > be useful when you want to include an index along with several
  > other lists, however, e.g. zip(list_1, list_2, indices)
I would just use

  for idx, (elt1, elt2) in enumerate(zip(list_1, list_2))
The following is not necessarily good advice, and he should take his own earlier advice and validate his claims with some profiling.

  > [map is ] much faster, since the loop takes place entirely in
  > the C API and never has to bind loop variables to Python
  > objects.

  > If you find yourself making the same list comprehension
  > repeatedly, make utility functions and use map and/or filter...  
Note that the following "map" snippet is considerably slower than the corresponding list-comprehension snippet. Function calls are expensive in python. Also, his claim that you save time with a "map" by avoiding the loop variable is obviously bogus: you still have to bind the variables in the signature of the function you're mapping, unless the signature is empty.

  met% python -m timeit "[x**3 for x in xrange(10000)]"
  1000 loops, best of 3: 1.27 msec per loop
  met% python -m timeit "map(lambda x: x**3, xrange(10000))"
  1000 loops, best of 3: 1.88 msec per loop


The map and filter functions were almost depreciated in Python 3.

http://www.artima.com/weblogs/viewpost.jsp?thread=98196


FTFY: depreciated --> deprecated


> I would just use

> for idx, (elt1, elt2) in enumerate(zip(list_1, list_2))

or use itertools.count:

    for idx, elt1, elt2 in zip(count(), list_1, list_2)
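
A runnable sketch of both forms, for reference (count has to be imported from itertools; list_1 and list_2 here are just placeholder data):

    from itertools import count

    list_1 = ['a', 'b', 'c']
    list_2 = [10, 20, 30]

    # enumerate over the zipped pairs
    for idx, (elt1, elt2) in enumerate(zip(list_1, list_2)):
        print(idx, elt1, elt2)

    # or zip an index stream alongside the lists
    for idx, elt1, elt2 in zip(count(), list_1, list_2):
        print(idx, elt1, elt2)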


I'm getting entirely different results with Python 3.2:

    % python -m timeit "[ x**3 for x in range(10000) ]"
    100 loops, best of 3: 7.85 msec per loop
    % python -m timeit "map(lambda x: x**3, range(10000))"
    1000000 loops, best of 3: 1.26 usec per loop
By these benchmarks map would be roughly 6,000 times faster than the corresponding list comprehension.


When times are that short you should question whether Python is actually doing anything, or just promising to do it later (remember that in python3 both range and map are lazy).

  $ python3.1 -m timeit "map(lambda x: x**3, range(10000))"
  1000000 loops, best of 3: 1.01 usec per loop
Seems fishy. Adding up the elements forces Python to actually do the cubing on all of them:

  $ python3.1 -m timeit "sum(map(lambda x: x**3, range(10000)))"
  100 loops, best of 3: 13 msec per loop
  $ python3.1 -m timeit "sum([ x**3 for x in range(10000) ])"
  100 loops, best of 3: 10.6 msec per loop
map really is slower, at least for me in python3.1.


  $ python3.2 -m timeit "sum(map(lambda x: x**3, range(10000)))"
  100 loops, best of 3: 7.61 msec per loop

  $ python3.2 -m timeit "sum([ x**3 for x in range(10000) ])"
  100 loops, best of 3: 6.27 msec per loop

  $ python2.7 -m timeit "sum(map(lambda x: x**3, range(10000)))"
  100 loops, best of 3: 2.81 msec per loop

  $ python2.7 -m timeit "sum([ x**3 for x in range(10000) ])"
  100 loops, best of 3: 2.28 msec per loop
I wasn't expecting python3 to be that much slower.


[ldng: put some whitespace before code on hacker news: it indents it and the asterisks don't disappear.]

I infer that ldng has a 64 bit machine. On my 32 bit machine, Python 3.1 is faster than 2.6 for these examples. On a 64 bit machine I get similar results to ldng's, with Python3 being slower. If I wrap long() around the x in the example, Python2 becomes as slow as Python3.

Note that taking the cube of lots of big integers is not typical for many people: it generates very large integers that have to be in Python2's special long type on a 32 bit machine. On a 64 bit machine they stay as normal ints in Python2, which are much faster. Python3 has a single automagic int type, which seems to internally convert to the arbitrary precision type sooner than it has to on 64 bit machines(?).

Examples more typical of my use would wrap float() around the x, or change the example to add up 3x instead of x^3. These examples are all faster in Python3 for me. Faster still is to use numpy (which is now supported in Python3).
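
For comparison, a hedged numpy sketch of the cubing example (assumes numpy is installed; with a float64 array the cubing and the sum both run in C, so Python's int representation is out of the picture):

  import numpy as np

  data = np.arange(10000, dtype=np.float64)
  total = (data ** 3).sum()   # vectorized: no Python-level loop at all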

Summary: the people who would be affected by this regression have both a 64 bit machine, and do a lot of exact integer arithmetic on integers that can be represented in 64 bits, but not 32.


map() returns an iterator in Py3k.

    > python3.2 -m timeit "( x**3 for x in range(10000))"         
    100000 loops, best of 3: 2.67 usec per loop

    > python3.2 -m timeit "[ x**3 for x in range(10000)]"      
    100 loops, best of 3: 9.53 msec per loop

    > python3.2 -m timeit "list(map(lambda x: x**3, range(10000)))"
    100 loops, best of 3: 11.8 msec per loop

    > python3.2 -m timeit "list( x**3 for x in range(10000))"      
    100 loops, best of 3: 10.7 msec per loop
And here's the Python 2.5 naïve/in-memory version recreated in Python 3.2:

    > python3.2 -m timeit "list(map(lambda x: x**3, list(range(10000))))" 
    100 loops, best of 3: 12.1 msec per loop
And Python 2.7 is still the fastest:

    > python -m timeit "map(lambda x: x**3, range(10000))"
    100 loops, best of 3: 4.33 msec per loop

    > python -m timeit "[x**3 for x in xrange(10000)]"                                             
    100 loops, best of 3: 2.32 msec per loop
    
    > python -m timeit "[x**3 for x in range(10000)]" 
    100 loops, best of 3: 2.66 msec per loop
On PyPy (1.5) it's pretty much all the same, which goes to show that this is not a "Python idiom" but rather a CPython implementation detail:

    > pypy -m timeit "[x**3 for x in range(10000)]"
    100 loops, best of 3: 2.73 msec per loop

    > pypy -m timeit "[x**3 for x in xrange(10000)]"
    100 loops, best of 3: 2.62 msec per loop

    > pypy -m timeit "map(lambda x: x**3, xrange(10000))"
    100 loops, best of 3: 2.51 msec per loop
    
    > pypy -m timeit "map(lambda x: x**3, range(10000))" 
    100 loops, best of 3: 2.87 msec per loop


Did you try itertools.imap instead? Vanilla map gives us a list; imap gives us a generator. (Also, I'd try a generator expression as well to see how it compares to a list comprehension).

Also, note that the article is from January 2007; I'd wager that Python has evolved and had a wealth of optimisations since then, especially regarding "common idioms" such as vanilla string concatenation and list comprehensions.
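
If someone wants to rerun that comparison, here's a hedged Python 2 sketch; sum() forces the lazy variants (imap and the generator expression) to actually do the work, since otherwise they just build an iterator:

  import timeit

  setup = "from itertools import imap"
  for stmt in ("sum(map(lambda x: x**3, xrange(10000)))",
               "sum(imap(lambda x: x**3, xrange(10000)))",
               "sum(x**3 for x in xrange(10000))",
               "sum([x**3 for x in xrange(10000)])"):
      print stmt, timeit.timeit(stmt, setup=setup, number=100)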


  > ...the article is from January 2007; I'd wager that Python has
  > evolved and had a wealth of optimisations since then...
The observations I made regarding list-comprehensions vs map have been true since list comprehensions were first introduced.

  > Did you try itertools.imap instead? Vanilla map gives us a list;
  > imap gives us a generator.
Don't see why that would make any difference (except to confuse the benchmark, as the python3 examples in this thread show.)


The rule of thumb I've read is that map is appropriate and (generally) faster when you do not need to create a lambda.
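
For instance (a sketch, not benchmarked here; Python 2 shown, wrap map in list() on Python 3): with a built-in like abs there is no Python-level function object to call per element, which is the case where map tends to do well:

  data = range(10000)
  a = map(abs, data)            # abs is called directly from C
  b = [abs(x) for x in data]    # the comprehension dispatches the call per element
  assert a == b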


That's not strictly correct. It really comes down to whether you're calling a native python function or a C extension. If it's a native python function call, it's going to be relatively slow.


In CPython, LCs are simpler in bytecode than map/filter, which makes them faster. That's all, really.
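
One way to see that for yourself (a sketch; the exact bytecode differs between CPython versions):

  import dis

  # compare the bytecode CPython generates for each form
  dis.dis(compile("[x**3 for x in data]", "<listcomp>", "eval"))
  dis.dis(compile("map(lambda x: x**3, data)", "<map>", "eval"))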


Good stuff, though there are a few outdated idioms:

Lists, or any iterable, can be reverse-sorted with reversed(sorted(my_list)). That'll give you an iterator, though you can call list() on the result if you need it.

"while True" should be used in-place of "while 1". Reads better.

In Python 2, xrange() is preferred over range() when looping - it won't create an in-memory list of integers, and behaves mostly the same as range but for a few edge cases. Python 3 renamed xrange() to range(), and removed the original range() function.
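
A tiny sketch of the reverse-sort point (reversed() hands back an iterator, so wrap it in list() when a list is needed):

  nums = [3, 1, 2]
  print(sorted(nums, reverse=True))       # [3, 2, 1]
  print(list(reversed(sorted(nums))))     # [3, 2, 1], via an iterator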


> reversed(sorted(my_list))

I thought sorted(my_list, reverse=True) would be slightly faster, but it seems not (by a tiny amount). Weird.

> "while True" should be used in-place of "while 1". Reads better.

If you disassemble these statements, you'll see that "while 1" creates fewer instructions:

    def foo():
        while True:
            pass

    def bar():
        while 1:
            pass

    import dis

    dis.dis(foo)
    #       0 SETUP_LOOP              10 (to 13)
    # >>    3 LOAD_GLOBAL              0 (True)
    #       6 POP_JUMP_IF_FALSE       12
    #
    #       9 JUMP_ABSOLUTE            3
    # >>   12 POP_BLOCK
    # >>   13 LOAD_CONST               0 (None)
    #      16 RETURN_VALUE

    dis.dis(bar)
    #       0 SETUP_LOOP               3 (to 6)
    #
    # >>    3 JUMP_ABSOLUTE            3
    # >>    6 LOAD_CONST               0 (None)
    #       9 RETURN_VALUE

Edit: This is in Python 2.7.1. d0mine pointed out that the discrepancy is no longer the case in Python 3. Good to know. :)


There is no difference on Python 3:

  >>> dis.dis(foo)
  2           0 SETUP_LOOP               3 (to 6) 

  3     >>    3 JUMP_ABSOLUTE            3 
        >>    6 LOAD_CONST               0 (None) 
              9 RETURN_VALUE         
  >>> dis.dis(bar)
  2           0 SETUP_LOOP               3 (to 6) 

  3     >>    3 JUMP_ABSOLUTE            3 
        >>    6 LOAD_CONST               0 (None) 
              9 RETURN_VALUE


  ~$ python -mtimeit "'a' + 'b' + 'c' + 'd'"
  10000000 loops, best of 3: 0.026 usec per loop

  ~$ python -mtimeit "''.join(('a','b','c','d'))"
  10000000 loops, best of 3: 0.197 usec per loop


Interesting! ''.join() has the advantage on longer strings though:

$ python -mtimeit "'aaaaaaaaaaaaaaa' + 'bbbbbbbbbbbbbbb' + 'ccccccccccccccc' + 'ddddddddddddddd'"

1000000 loops, best of 3: 0.224 usec per loop

$ python -mtimeit "''.join(('aaaaaaaaaaaaaaa','bbbbbbbbbbbbbbb','ccccccccccccccc','ddddddddddddddd'))"

10000000 loops, best of 3: 0.201 usec per loop


That's definitely interesting.


Always worth checking. Although the relative merits change when doing many joins to build up a single long string.

This recommendation is to go with .join() by default:

  > In performance sensitive parts of the library, the ''.join() form
  > should be used instead. This will ensure that concatenation occurs
  > in linear time across various implementations.

http://www.python.org/dev/peps/pep-0008/

despite the fact it might not be better in CPython.
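
The case PEP 8 has in mind is building one long string out of many pieces in a loop; a sketch of the two patterns, assuming an arbitrary list of pieces:

  pieces = ['x'] * 10000

  s = ''
  for p in pieces:
      s += p                 # may be quadratic; CPython has a special-case optimization

  s2 = ''.join(pieces)       # linear on every implementation
  assert s == s2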


It's not terribly surprising. Things that scale faster tend to have higher setup costs (cf sorting algorithms; insertion sort is the fastest for relatively small n).


I believe the Python runtime has special-case handling for concatenating string literals, and for concatenating strings whose reference count is exactly 1. String concatenation isn't wholly defanged, though.


You're also constructing the tuple of pieces inside the timed statement in the second test.

For a fair test, build the pieces beforehand, then compare concatenating them with + against joining them.


> Calling it from the empty string concatenates the pieces with no separator, which is a Python quirk and rather surprising at first.

This is stated twice, but I can't make any sense of it. How is this even remotely surprising? What else would anyone possibly expect joining with the empty string as the separator to do?


The surprising thing is that .join is a method of the separator, not of the list of pieces.


I'm pretty sure it's a string method so it can take any iterator as an argument.


I think that was the deciding argument, yes.


I think he's referring to backwards nature of string joining in general.


Use function factories to create utility functions. Often, especially if you're using map and filter a lot, you need utility functions that convert other functions or methods to taking a single parameter. In particular, you often want to bind some data to the function once, and then apply it repeatedly to different objects. In the above example, we needed a function that multiplied a particular field of an object by 3, but what we really want is a factory that's able to return for any field name and amount a multiplier function in that family:

  def multiply_by_field(fieldname, multiplier):
    """Returns function that multiplies field "fieldname" by multiplier."""
    # the inner function must not reuse the name "multiplier",
    # or it would shadow the captured argument
    def multiply(x):
        return getattr(x, fieldname) * multiplier
    return multiply

  triple = multiply_by_field('Count', 3)
  quadruple = multiply_by_field('Count', 4)
  halve_sum = multiply_by_field('Sum', 0.5)
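A hedged usage sketch; the Record type and its Count/Sum fields below are made-up stand-ins for whatever objects the article had in mind:

  from collections import namedtuple

  Record = namedtuple('Record', ['Count', 'Sum'])
  rows = [Record(Count=2, Sum=10.0), Record(Count=5, Sum=4.0)]

  print(map(triple, rows))       # [6, 15] on Python 2; wrap in list() on Python 3
  print(map(halve_sum, rows))    # [5.0, 2.0]
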
Other languages (most prominently Haskell, though you can convince most Lisps to do it through the clever use of macros) have built-in support for doing this with whatever function you want, and it's called partial function application. It's a rather useful technique and it saddens me that it's not supported as such in more languages claiming to support functional programming.


There is functools.partial:

  from functools import partial
  def add(x,y): return x+y
  add3 = partial(add, 3)
  add3(2) # returns 5


Ah! I didn't know about that. It'd be nice to have the syntactic sugar à la Haskell, but oh well. Close enough.


You don't need a macro to do that in a lisp. Typically you just use a built in 'partial' function, which just makes a closure that will call the original function. (It's simple, so you can just make your own if you're using a lisp without it built in.)
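
A sketch of that hand-rolled version in Python terms (positional arguments only, for illustration):

  # roughly what functools.partial does, as a plain closure
  def my_partial(fn, *bound):
      def wrapper(*rest):
          return fn(*(bound + rest))
      return wrapper

  add3 = my_partial(lambda x, y: x + y, 3)
  print(add3(2))   # 5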


  > Use if not x instead of if x == 0 or if x == "" or if x == None or
  > if x == False; likewise, if x instead of if x != 0, if x != None, etc.

Be careful with this one. "if not x" isn't necessarily the same as "if x == None". It's easy to forget that "if not x" will be true for values other than None.

Also, use "if x is None" rather than "if x == None". :-)


I read many years ago that '%s%s' % (a,b) was faster than a + b, the reason given being that the formatting is done in C. But after reading this thread and trying it myself, that seems to be false as well. On Py 2.6:

  python -m timeit " '%s%s%s' % ('aaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbb', 'ccccccccccccccccccccccccccccccc') "
  1000000 loops, best of 3: 0.201 usec per loop

  python -m timeit " 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' + 'bbbbbbbbbbbbbbbbbbbbbbbb' + 'ccccccccccccccccccccccccccccccc' "
  10000000 loops, best of 3: 0.136 usec per loop
Perhaps it changed in new versions of Python, but either way, I guess I'll be using the % form less than I used to.


Use whatever is clearest to read, since the benchmarks vary upon circumstance, VM, and VM version.


For what it's worth, most "performance" idioms are anti-idioms on PyPy. Especially these:

  sum = 0
  for d in data:
      sum += d
  product = 1
  for d in data:
      product *= d
These plain loops are much faster than the reduce/map equivalents. The zip/dict example at the end is even more confusing. I'm convinced PyPy would be fastest on the simplest possible code (the first one, marked as "bad").
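
For reference, the reduce-style equivalents being compared against (a sketch; reduce moved to functools in Python 3):

  import operator
  from functools import reduce   # built-in on Python 2, functools on Python 3

  data = range(1, 10)
  total = reduce(operator.add, data, 0)
  product = reduce(operator.mul, data, 1)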


What's the rationale for preferring map/filter over list comprehensions?


I wish every language listed its idioms somewhere.



