Python List Comprehensions Explained Visually (treyhunner.com)
138 points by ingve on Dec 2, 2015 | 78 comments



Maybe it's me, but "Explained Visually" sounds more like it would have some 'visual' aspect to it besides a GIF of code (don't get me wrong, I think most code should be made an image to limit copy/pasting).

A website like VisuAlgo (http://visualgo.net/) is a great example of what explaining something visually should mean.


I agree. I was using "List Comprehensions: Now In Color" as a working title but I thought that was too nonsensical. It was actually the colorized code and not the gifs in particular that made me think "visual".

Sorry for the misleading title!


It's a good document. I know list comprehensions already but still I found it entertaining to read - I think it'll enlighten a lot of people on this topic.


This article needs some more complex examples, e.g. comprehensions with multiple "for" clauses and more complex filtering. Perhaps an example using zip or enumerate.

E.g.

    cartesian_product = [(a,b) for a in [1,2,3] for b in ['a','b','c']]
Could really be nice when explained in color.
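
A minimal sketch of the kinds of examples asked for here, using made-up data (none of these names come from the article): two "for" clauses, then zip, then enumerate with a filter.

```python
# Illustrative data only; these names are assumptions, not from the article.
numbers = [1, 2, 3]
letters = ["a", "b", "c"]

# Two "for" clauses: the first is the outer loop, the second the inner loop.
cartesian_product = [(n, s) for n in numbers for s in letters]

# zip walks both lists in lockstep instead of nesting them.
pairs = [(n, s) for n, s in zip(numbers, letters)]

# enumerate supplies the index; the condition keeps even positions only.
even_positions = [s for i, s in enumerate(letters) if i % 2 == 0]
```

The nested version yields every combination, while the zip version yields one pair per position.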


What's the font you're using in that gif? It looks good.



Ah, cheers :)


> I think most code should be made an image to limit copy/pasting

You think what now?


As others have discussed, the point of making code examples images is solely to force readers to type them out. Don't get me wrong, I copy/paste like the rest of us; however, from a learning perspective, I think forcing students to type out the code gives them practice, akin to how you'd practice your shots for basketball.

I've actually implemented typing exercises that are straight "copy this image" as one of the assignments I give students each week. Now, students have my lecture, the demo done in class, AND a typing exercise before trying to implement something like, say, a collision detection algorithm.


Not OP but I prefer to re-type rather than copy-paste because too often when I copy paste I discover that I forgot to change something.


It seems to me that the cost of retyping code each time is much higher than the cost of forgetting something.

I get the underlying idea, that you "feel" and understand the code more if you type it, but holy hell would that be annoying!


To qualify my statement:

- If I found a boilerplate HTML layout and wanted to insert some stuff into it, I would use copy-paste.

- If I were editing a function to enqueue data, and I wanted to send it to two queues, I would re-type the 2nd enqueue command to avoid blindly copying over an inappropriate parameter, which is the kind of thing a tired mind might gloss over.


Yup! It's supposed to be annoying (well, not 'really').

It's primarily a method to ensure the reader 'learns' by mimicking versus blindly copying. Annoying to a seasoned programmer, but it's a great way for students to get extra practice.


Setosa (http://setosa.io/) hosts some impressive interactive visualizations as well.


Why do you want to limit copy pasting?


He's probably referring to stuff like this:

> Do Not Copy-Paste

> You must type each of these exercises in, manually. If you copy and paste, you might as well not even do them. The point of these exercises is to train your hands, your brain, and your mind in how to read, write, and see code. If you copy-paste, you are cheating yourself out of the effectiveness of the lessons.

(from the intro of "Learn Python The Hard Way" ( http://learnpythonthehardway.org/book/intro.html#do-not-copy... ))


ciupicri has it right. As I mention above, I think of it like practice shots in basketball, or going to the batting cage for baseball. While coding isn't as physically demanding, it definitely takes time and practice to get better at it. Forcing someone to write the code versus copy/paste gives them just a little more practice at the discipline.


I don't know why you would want to force people to do this. There's nothing stopping someone from hand-copying code if you use text, but you're using the wrong format if you encode it as an image. So, as a result, in addition to losing copy+paste (which may be useful for someone who doesn't want to "Learn Python the Hard Way"), you also lose Ctrl + F, and if you're using an animated GIF, you also lose the ability to stop at a specific point or move around.

Overall, I think the right answer is to use the appropriate format and let people sort themselves out.


I agree.


Coming from Ruby to Python...I could not believe how damn hard it was to comprehend the beautifully explained and simple code (just 21 lines!) in Peter Norvig's writeup of how to create a spelling corrector [1], purely because of the list comprehensions (and a few other visual differences, such as partial list notation)

For example:

     inserts = [a + c + b for a, b in splits for c in alphabet]
It's no different (conceptually, at least) than doing:

     inserts = []
     for a, b in splits:
         for c in alphabet:
             inserts.append(a + c + b)
Or in Ruby (OK, maybe this isn't perfect Ruby):

     inserts = splits.reduce([]) do |x, (a, b)|
         alphabet.each{|c| x << a + c + b }
         x
     end
  
The list comprehension was so difficult for me to read that it might as well have been a brand new concept. But I found I got over it quickly after rewriting all of his comprehensions as for-loops, and then trying to convert them back into comprehensions. It was a good reminder that sometimes the barrier is just plain muscle/memory reflex (and probably much more so for beginning programmers...)

[1] http://norvig.com/spell-correct.html


I most frequently use list comprehensions in a nested manner. Whenever I do that, I /ignore/ the normal Python syntax and do this for readability:

    inserts = [ a + c + b
                for a, b in splits
                for c in alphabet ]
It has the effect of showing the nested loops stacked like they would be in code, but without indenting.


For the projects I manage, I would have the dev rewrite that as a proper for loop.

That's way too confusing and hard to parse. List comprehensions are great for simple operations, but once you start nesting, you're just being a jerk to the next dev that has to read your code.

Code should be clear first, elegant second.


> Code should be clear first, elegant second.

I disagree: code should be correct before it is clear. And because it's so easy to mess up a for loop (for me at least), I choose list comprehensions where reasonable.

> you're just being a jerk to the next dev that has to read your code

That's not very charitable to either party. You're assuming that the motive of the author is to be a jerk, and you're assuming that the reader won't understand. If I had a dev on my team who couldn't read a list comprehension, I'd (a) wonder how they were hired, and then (b) teach them.


I've written a fair share of Haskell, and I still get tripped up by list comprehensions in python - even my own. It's not that it's impossible to understand, and when there's only one predicate, I think it's fine:

  for row in [[i*j for i in range(1, 8)] for j in some_list if j % 2 == 0]:
    some_op(row)
vs

  for j in some_list:
    if j % 2 == 0:
      row = [ i * j for i in range(1,8) ]
      some_op(row)
I would compare it to sentences and paragraphs. The former feels like a run-on sentence, while the latter is more obvious, because it has one predicate per line. Also, list comprehensions are a bit like yoda-speak: they introduce the verb before the subject. You have to untangle the order of operations, rather than having the order read top-down and left-right.

Of course, I'm a rubyist, so I'd prefer:

  some_list.select { |j|   j % 2 == 0 }
           .map    { |j|   (1...8).map { |i| i * j } }
           .each   { |row| some_op(row) }
Though definitely need to do something about that second line.

Haskell list comprehensions are a bit easier to parse because they have symbolic delimiters, because Haskell is naturally more terse, and because you can always check the type of the list:

  [ [i*j | i <- [1..7]] | j <- [1..4], j `mod` 2 == 0 ]


Any list comprehension that is

1. Longer than ~80-120 characters

2. Can't be explained in 1 comment line

Should be written as a nested for loop IMO.


Yep. But even more: as any piece of code may have to grow and become more complex, any list comprehension except in throwaway scripts should be rewritten as a for loop.

I think 95% of the list comprehensions I wrote have been refactored into for loops by myself while debugging or extending the code later. So now I just never use them anymore, except for throwaway code.


I don't get this. The list comprehension is much clearer for me. Do you have a mathematical background? If you can easily think about sets I would expect the list comprehension to be far clearer.

List comprehensions are much easier to read just because they're so much shorter: there's less to remember while reading it.

This is an issue one always hits when you try to implement stuff that's more complex. List comprehensions help because they let you talk directly about more complex sets. They raise the level of abstraction. It's hard to get comfortable with but it really helps.


Yes, I happen to have a strong mathematical background.

The difference in our points of view might be that I work on a code base with some ten million lines of code. Anything in there that is not purely flat stupid python, anything clever in fact, will be an annoyance some day. For example, someone started to use a "clever" `template.render(locals())` 5 years ago, and now we have no way to tell where the variables come from.

Inside a list comprehension, how do you print? branch? comment? emit logs? (I know you "can" do that, but then it becomes a mess not worth it). Moreover, if there is an error in your 5-line-long list comprehension, will the traceback tell you where it is?
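
One possible middle ground, sketched with made-up names: keep the comprehension but move the per-item work into a named function, where branching, comments, and logging are ordinary statements. `normalize` and `readings` are hypothetical and only illustrate the idea.

```python
import logging

logger = logging.getLogger(__name__)


def normalize(value):
    """Hypothetical per-item step: branch, log, and comment freely in here."""
    if value < 0:
        # Negative readings are treated as sensor noise in this sketch.
        logger.warning("negative value %r clamped to 0", value)
        return 0
    return value


readings = [3, -1, 4]
cleaned = [normalize(v) for v in readings]
```

An exception raised inside `normalize` then shows up in the traceback as a frame of that function, which may help locate the failing step.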


I understand that list comprehensions are only for simple list construction. So you don't print or branch. Comment is pretty simple though.

My perspective is that in large codebases I have far more trouble remembering function names and what they do than I ever have trouble decoding statements. So the increased code size, long functions and ... that comes from those long list construction statements and loops always seems to take more out of me than difficult list or dictionary comprehensions. Also there are very much fewer places for bugs to hide.

There's also the additional argument that list comprehensions are more efficient because they don't actually construct the list. They generate iterators. It is not a huge difference like in Haskell but it definitely helps.

And there are special cases where you have to use list comprehensions: infinite lists can work as a comprehension, yet the equivalent code to generate them is pages long.
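
A caveat worth checking in a REPL: in Python, square brackets build the whole list immediately, and it is the parenthesized generator expression that is lazy. The sketch below uses made-up names to show both, plus an unbounded source that only works in the lazy form.

```python
from itertools import count, islice

squares_list = [n * n for n in range(5)]  # a real list, built immediately
squares_gen = (n * n for n in range(5))   # a generator, nothing computed yet

# An unbounded source is fine for a generator expression...
even_squares = (n * n for n in count() if n % 2 == 0)
first_three = list(islice(even_squares, 3))
# ...whereas [n * n for n in count()] would never finish.
```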


Huh. You seem to think nobody could possibly find comprehensions as easy to read as loops. That surprises me because I think the reason they got implemented in many programming languages is precisely that many people find them easier to read!


I think he's talking about more complex list comprehensions than just the traditional for-loop + map setup. I haven't been in Python land for a while but I remember some atrocious contraptions of 80+ line comprehensions that could have been infinitely more explicit as a literal for loop.

This is Python after all.


Well, rewriting an 80+ line comprehension as an 80+ line series of nested loops and conditionals wouldn't increase readability in my opinion. What would make it easier to read is to divide the computation into several smaller steps, which could be done either with several comprehensions or several loops.
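
A small sketch of what "several smaller steps" could look like with comprehensions. The records and field names are invented for illustration.

```python
# Hypothetical records; field names are made up for illustration.
orders = [
    {"customer": "ann", "items": [("pen", 2), ("ink", 10)]},
    {"customer": "bob", "items": [("pad", 5)]},
    {"customer": "cy", "items": []},
]

# One step per statement, each with a name that says what it holds.
non_empty = [order for order in orders if order["items"]]
totals = [sum(price for _, price in order["items"]) for order in non_empty]
big_spenders = [
    order["customer"]
    for order, total in zip(non_empty, totals)
    if total > 5
]
```

Each intermediate name can be printed or inspected on its own, which a single large expression does not allow.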


> I /ignore/ the normal python syntax

Is it not normal to do that? I always lay out the nested ones, and even single comprehensions if the conditional is complex.


Yes, that's very common formatting (except for the whitespace immediately inside the square brackets, which violates PEP 8, but that's incidental to the question of line breaks).


I'd do the ruby version as:

    splits.flat_map do |(a, b)|
      alphabet.map{|c| a + c + b}
    end
Which feels pretty close to the list comprehension to my eyes.


Depends on the programmer, I guess. I'm pretty confused by the Ruby version and I can do Ruby in anger. I'd also use maps rather than each if I was writing it.


Cool. As a math guy, I enjoy the similarity between comprehensions and set-builder notation:

https://en.wikipedia.org/wiki/Set-builder_notation


In the opposite direction, having knowledge of Python's list comprehensions (and to some degree, an understanding of lazy evaluation) helped me a ton when set-builder notation was introduced in my first discrete math class.


Agreed! Haskell's list comprehensions look even more like set-builder notation: https://wiki.haskell.org/List_comprehension


Are Python List Comprehensions really hard for people? I feel like my memory of learning them was I saw one in production code I was modifying, intuited what it did, pulled open a REPL and played with it for a bit, and then moved on.


I agree. It really surprised me to see this getting so much traffic. I sometimes get hung up with nested comprehensions when I haven't been using them in a while, but other than that, they are quite mundane.


Is it just me or were there zero visualisations in this post?


It might not be what you were expecting, but there's an animation if you ctrl-f "Here’s the transformation animated:"


:) I did see that one and indeed it wasn't what I was expecting.


That translation is kind of why I don't like Python list comprehensions though. It's often clearer as a for-loop or map/filter, because the different predicates are broken out into separate lines, rather than lumped together in one long line.


It is not much easier to understand the explicit loop:

  doubled_odds = []
  for n in numbers:
      if n % 2 == 1:
          doubled_odds.append(n * 2)
compared to the list comprehension:

  doubled_odds = [n * 2 for n in numbers if n % 2 == 1]
Here's the explicit loop without newlines for comparison:

  for n in numbers: if n % 2 == 1: doubled_odds.append(n * 2)
Notice that the for-loop and the if-statement are in the same order in the loop and in the list comprehension. Only the value expression (n * 2) is out-of-order. OP's jumping around in the gif is often unnecessary.

For people familiar with Python, the one-line list comprehension is as readable as the variant with the explicit loop.

It is not uncommon to use several list comprehensions in a function. If you rewrite them as explicit loops, the line count may increase fourfold: e.g., a simple 3-line function that could be understood at a glance becomes a 12-line function that you have to read line by line to understand. It is easy to keep track of 3 values on 3 lines, but it is not clear whether that holds for 12 lines.


I think the problem here is that nested comprehensions and filter comprehensions are almost always too complex. If you are doing something that requires that, you should break it up and give the pieces names:

    odd_numbers = range(1, limit, 2)
    doubled_odd_numbers = [n * 2 for n in odd_numbers]


A good reason to use list comprehension for simple things though is that you can sometimes quickly change [...] to (...) and get an equivalent generator. With a loop, you'd have to extract it into a separate yielding function.
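
To make the bracket swap concrete, here is a minimal sketch with made-up data. The function name `doubled_odds` is hypothetical; it shows what the loop version needs in order to be lazy.

```python
numbers = [1, 2, 3, 4, 5]

doubled_odds_list = [n * 2 for n in numbers if n % 2 == 1]  # eager list
doubled_odds_lazy = (n * 2 for n in numbers if n % 2 == 1)  # lazy generator


def doubled_odds(values):
    """Loop version: needs its own generator function to be lazy."""
    for n in values:
        if n % 2 == 1:
            yield n * 2
```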


I'm a big fan of just using map/filter, but I think it makes sense to wonder whether such a common and important operation is worth its own syntax. I do think the way nesting works is a bit surprising. It didn't really occur to me until reading an earlier comment that in essence the most nested loop is a `map` and all the others are `flatMap`s. (I think?)
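
A rough check of that reading in Python itself. `flat_map` is a hypothetical helper (Python has no built-in of that name), and the sample `splits` and `alphabet` values are made up.

```python
from itertools import chain


def flat_map(func, iterable):
    """Hypothetical helper: map, then flatten one level."""
    return chain.from_iterable(map(func, iterable))


splits = [("", "ab"), ("a", "b"), ("ab", "")]
alphabet = "xy"

# Nested comprehension: outer loop over splits, inner loop over alphabet.
comprehension = [a + c + b for a, b in splits for c in alphabet]

# Same result: the outer loop becomes flat_map, the innermost a plain map.
functional = list(
    flat_map(lambda ab: map(lambda c: ab[0] + c + ab[1], alphabet), splits)
)
```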


The way to understand how the nesting works in the list comprehension:

  L = []
  for x in X:
      for y in Y:
          L.append(x*y)
[loops] translate to

  L = [x*y for x in X for y in Y]
If you understand the order of the loops in the first code fragment, you should understand the order in the second one: it is the same order.


The problem is that the relation of the body to the loop specifications is reversed, so the inner loop is most proximate to the body in the first, but most distant from the body in the second.

The specification of the inner loop should always be most proximate to the body, not last independent of the location of the body.


As a person who thinks about this stuff in more functional terms, as in mapping over collections, rather than imperatively iterating and appending, it helps me to realize how it translates to map and flatMap rather than how it translates to for loops (which I already realized).


Perhaps a for-loop is more similar to your memory of other code, but a list comprehension is more similar to an English sentence.

    "All names longer than 4 characters in the list of cities?"
    all(len(name) > 4 for name in cities)

    "Sum of x-squared for x from 10 to 20."
    sum(x**2 for x in range(10, 21))


Frankly, comprehensions are one of Python's worst features. Some properly implemented functional syntax would be both more readable and concise. Plus double-nested (I'm not sure what you'd call it) comprehensions seem to be backwards. Rather than [ingredient for ingredient in recipe for recipe in cookbook] it's [ingredient for recipe in cookbook for ingredient in recipe] which makes it sound as if there's only going to be len(recipe) ingredients in your list. Plus it's so verbose and horizontal, half the time if you want descriptive names you're gonna write something so verbose it doesn't even make it that much more readable.
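
For reference, a small sketch with a made-up `cookbook` showing that the working clause order mirrors the nested loops top to bottom, and that the result contains every ingredient of every recipe.

```python
# Made-up data: a cookbook is a list of recipes, a recipe a list of ingredients.
cookbook = [["flour", "water"], ["eggs"], ["salt", "oil", "basil"]]

# Clauses read in the same order as the nested loops below.
ingredients = [ingredient for recipe in cookbook for ingredient in recipe]

spelled_out = []
for recipe in cookbook:
    for ingredient in recipe:
        spelled_out.append(ingredient)
```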


When reading code I find comprehensions easier to comprehend than combinations of map, filter and reduce. That is considering that I played around with some Common-Lisp before getting into Python.

The parts in the comprehension

    [*wanted* *iteration(s?)* *condition?*]
compare well to

    *iteration(s?)*
        *condition?*
            *append wanted*
for normal looping behaviour without changing order of the iterations.

One of my favourite comprehensions is of the sort

    [item for item in items for i in (0, 1)]
of course this can be the same as

    list(items) * 2
but the comprehension is more versatile

    [item for item in items for i in (0, 1) if items[0] != 'a']
For most, once the mental fog clears, comprehensions become natural very quickly, as they are not that far removed from normal loops.
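
A quick check of the doubling idiom on a made-up list. Clause order decides whether each item is repeated in place or the whole list is repeated, and only the second form matches `list(items) * 2` element for element.

```python
items = ["a", "b", "c"]

# Inner loop over (0, 1): each item is emitted twice in a row.
each_twice = [item for item in items for i in (0, 1)]

# Outer loop over (0, 1): the whole list is emitted twice.
whole_twice = [item for i in (0, 1) for item in items]
```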


Haskell, Erlang, and Clojure all have sane lambdas (assuming that's what you mean by "properly implemented functional syntax"), and list comprehensions are useful and idiomatic in those languages. Agreed about sequence names feeling out-of-order in nested comprehensions in Python.


From the Zen of Python: > There should be one-- and preferably only one --obvious way to do it.

If you have something written as a simple for loop, there's a real danger that you're moving in the wrong direction by translating it into a list comprehension.


> Plus double-nested (I'm not sure what you'd call it) comprehensions seem to be backwards.

Why are you even doing this? This isn't an acceptable way to write code.

> Plus it's so verbose and horizontal, half the time if you want descriptive names you're gonna write something so verbose it doesn't even make it that much more readable.

If your list comprehension is more than 80 characters wide, it's not because list comprehensions are bad, it's because you did something too complicated. Pull it into a function and name it so people know what the hell you're doing.


> Sometimes a programming design pattern becomes common enough to warrant its own special syntax. Python’s list comprehensions are a prime example of such a syntactic sugar.

I disagree with this thesis, although the post is otherwise a very nice guide to list comprehensions. Special syntax for specific types indicates a failure of the language to be generic enough.

List comprehensions anoint lists as a special thing. You don't get to play with list comprehensions, unless you're using the type that the language has decided to let you play with. If you decide to use a different type for some reason, you... can't.

map and filter can be part of a typeclass/interface/protocol (in Python these would just be informal), so you can use them on arbitrary types. If you want to switch your list type, it should just work.

I write a lot of Swift, and I'm constantly frustrated that optional chaining (?.) and exceptions are special things. I can't implement ?. for my type (say, a Result). Only the language creators can use it. Somewhat confusingly, ?? is a thing I can implement.

In Python's case, I think that list comprehensions are necessary because the language's support for first-class closures is poor.


> You don't get to play with list comprehensions, unless you're using the type that the language has decided to let you play with. If you decide to use a different type for some reason, you... can't.

You can use generator expressions and pass it to your custom type's constructor.

https://www.python.org/dev/peps/pep-0289/
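
A minimal sketch of that approach. `Bag` is a hypothetical container invented for this example; the only assumption is that its constructor accepts any iterable, as `list` and `set` do.

```python
class Bag:
    """Hypothetical container that accepts any iterable in its constructor."""

    def __init__(self, iterable=()):
        self._items = list(iterable)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


# The generator expression is consumed by the constructor.
squares = Bag(n * n for n in range(4))
```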


> List comprehensions anoint lists as a special thing. You don't get to play with list comprehensions, unless you're using the type that the language has decided to let you play with.

That's one reason why I like Scala's comprehensions; they have the conciseness of list comprehensions, but are more generic.


aka monads ;)


Yeah, it would be neater if they were just "iterable comprehensions" and worked with anything that is iterable.


That's kind of what generator expressions are actually :)

https://www.python.org/dev/peps/pep-0289/

You can pass them into anything that expects an iterable. That's pretty much the same since iterable types will frequently consume an iterable in their constructor.

For example, here's a "tuple comprehension" (really just a generator expression passed to a tuple):

>>> tuple(x for x in [1, 2, 3])

Same thing passed to list and set constructors (which you'd never do because we have the special comprehension syntax for those):

>>> list(x for x in [1, 2, 3])

>>> set(x for x in [1, 2, 3])

I explained these in a very slightly different way in the webinar I linked in the post: https://youtu.be/u-mhKtC1Xh4?t=35m05s


To make it even more clear, you can store the generator in a variable and pass it around:

    >>> x = (a*a for a in range(3))
    >>> type(x)
    <type 'generator'>


>Same thing passed to list and set constructors (which you'd never do because we have the special comprehension syntax for those)

I do this all the time (for set) because it works even in python 2.6. Also dict((key, value) for foo in bar). In fact, I think it was a bad idea to add special comprehension syntax for sets and dicts.
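
For comparison, a short sketch with made-up data showing the constructor forms next to the dedicated dict and set comprehension syntax. Both spellings produce equal results.

```python
words = ["apple", "fig", "kiwi", "fig"]

# Constructor plus generator expression.
lengths_via_dict = dict((word, len(word)) for word in words)
unique_via_set = set(word for word in words)

# Dedicated comprehension syntax for the same results.
lengths_via_comp = {word: len(word) for word in words}
unique_via_comp = {word for word in words}
```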


Excellent. Mental model of what python can do: changed. I sort of thought that was a fancy python 3 feature, but looks like python 2 has it as well.


One basic misstep here for me is a comparison of why list comprehensions are better than loops -- speed, ease of reading, etc. There are a lot of comparisons, but no discussion of WHY list comprehensions.

I work on a project that's written mostly in Python; on occasion, other coworkers have to edit/work with my code and they're not used to Python at all, so I find myself writing for loops instead of list comprehensions for clarity.


I find the utility of list comprehensions is maximized when I'm in a Python shell and dealing with indentation when writing a loop can become obnoxious to edit. I tend to have a Python shell tab open in my terminal throughout the day, and it's nice to bang out a quick loop in a one-liner.

When writing actual code, I tend to stick with regular for loops.


For those confused - the visual part is done in a GIF (i.e. http://treyhunner.com/images/list-comprehension-condition.gi...).

I think animating the transformation from a normal loop to a list comprehension is a great way to show how the syntax translates between the two forms. Very awesome and comprehensive post.


Maybe I'm an FP snob but it would have been nice to mention the analogy to set builder notation (though lists aren't sets - maybe multisets or bags) to indicate the existence of higher abstraction; not just syntactic sugar on the for-loop.

However, from a practical point of view I think it is great introduction for procedural programmers to begin using list comprehensions.


> though lists aren't sets - maybe multisets or bags

Arrays/lists (like maps/hashes, though with a more narrow restriction on the indices) are (or at least are isomorphic to) sets of pairs of (index, element).

They carry information about order which does not exist in multisets/bags.


collections.Counter was designed with multiset in mind.
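
A brief sketch of Counter used as a multiset, fed by a generator expression. The sample strings are arbitrary.

```python
from collections import Counter

# Count each element; order is not part of the multiset's meaning.
letters = Counter(c for c in "banana")

# Multiset intersection keeps the minimum count of each element.
common = letters & Counter("band")
```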


I've been reviewing Python this week and this was a great brush up. Thanks!


Nested comprehensions are kind of a nightmare to read, but basic list comprehensions seem incredibly straightforward to me. So much so that the explanation is more complex.


Not sure how this is visual, but neat post either way!


When I started to learn Python, I was quite lost with this feature, but now I love using it.


Is it really that hard to grasp?



