Personally, I think this is a bit on the "clever" side. Plus, the error message you get isn't as easy to understand as if you used an assert statement. I'd probably just do something like this:
def get_single(l):
    assert l and len(l) == 1
    return l[0]
Then you get the best of both worlds: readability and a concise one-liner.
That fails for sets and other non-list iterables. It would have to be something like:
def get_single(l):
    i = iter(l)
    val = i.next()
    try:
        i.next()  # expected to throw exception for one-element iterable
    except StopIteration:
        return val
    raise AssertionError('More than one object')
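For what it's worth, a sketch of the same idea using the next() builtin (available since 2.6, and unchanged in Python 3); the function name and error messages are placeholders rather than anyone's actual code:
def get_single(iterable):
    it = iter(iterable)
    try:
        val = next(it)  # first element; StopIteration here means the iterable was empty
    except StopIteration:
        raise AssertionError('Expected exactly one object, got none')
    sentinel = object()
    if next(it, sentinel) is not sentinel:  # a second element means there was more than one
        raise AssertionError('More than one object')
    return val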
A property which is exposed by the article's solution too:
>>> x = (lambda: (yield 1))() # generator with one step
>>> y = tuple(x)[0]
>>> y
1
>>> list(x) # exhausted
[]
>>> x = (lambda: (yield 1))()
>>> y, = x
>>> y
1
>>> list(x) # exhausted, too!
[]
Because that's just what you inevitably need to do to fetch a value from a generator. There is no peeking action or some such.
Personally, I'm willing to restrict this to just lists to keep things simple. I rarely, if ever, need a single way to extract one value from a list, a set, and an arbitrary iterable.
Yes, but with most functions there's no choice, because most of the functions in your program do something (a) complex and/or (b) specific to your program. This `get_single` function falls into neither of these categories.
It is certainly okay for some functions to be complex. They may be built from simpler functions, but the one you actually call can still perform a complex task overall.
A function that squares a number is simple, one that computes the standard deviation is definitely more complex.
> A function that squares a number is simple, one that computes the standard deviation is definitely more complex.
I'm talking about the complexity difference between this:
import math

def stddev(pop):
    total = 0
    count = 0
    for x in pop:
        total += x
        count += 1
    mean = total / float(count)
    variance = 0
    for x in pop:
        variance += (x - mean)**2
    return math.sqrt(variance / count)
and this:
import math

def stddev(pop):
    return math.sqrt(variance(pop))

def variance(pop):
    m = mean(pop)
    return sum(square(x - m) for x in pop) / float(len(pop))

def mean(pop):
    return sum(pop) / float(len(pop))

def square(x):
    return x**2
The first is a (mildly) complex function. The latter are all simple functions, and the complex result is constructed by composing simple operations.
Good programmers write functions in the latter style, not the former.
> Good programmers write functions in the latter style, not the former.
Well, that stddev function could be much less verbose:
import math

def stddev(pop):
    mean = sum(pop) / float(len(pop))
    variance = sum((x - mean)**2 for x in pop) / len(pop)
    return math.sqrt(variance)
To me that's easier to read than jumping back and forth between multiple function definitions. Of course, if you need the mean or variance independently then your way is better.
> Well, that stddev function could be much less verbose:
Sure, it could, but I was demonstrating what it looked like without the use of functions. Your example proves my point just as mine does: sum(), like mean() or variance() in my example, is just a simple function, the kind that I'm arguing for. The fact that it's built into Python (rather recently, I note) doesn't change that fact or reduce the impact of the argument. Your example simply goes one step down the path, and mine goes further.
> To me that's easier to read than jumping back and forth between multiple function definitions.
You don't have to jump back and forth between function definitions. Let's say you don't know what the standard deviation is, but you know what the mean is. You can look at the definition of stddev() and see, "Ah, it's clearly the sqrt() of the variance. What's the variance? Ah, it's the average of the squared differences between each element and the mean." You know what mean() does (its name is pretty clear) and you know what sum() and square() do, so you never have to look at those functions. Someone else who knows what the variance is would never have to look that deep. When someone is reading the stddev() in my example, he doesn't have to concern himself with the implementation details of functions he already understands. When someone is reading yours, he has to at least read how the mean is calculated. He can't avoid it--it's right there.
> Of course, if you need the mean or variance independently then your way is better.
You almost certainly will in any case where you're using the standard deviation, but that's just an artifact of the example. Other advantages of using small, simple functions like in my example:
* More reusable (as you noted)
* More easily testable
* More easily comprehensible (as I showed above)
* More easily documented (especially in a language like Python with its docstring support)
* More conceptual abstraction
This is redundant: if a list's length is 1, then it's true in a boolean context.
In a boolean context, the list is true if the length is non-zero. This example and the one the article is about is for the case where you know the list to have exactly one element. Not zero and not more than one.
And the point that was made in the other branch of this thread is that if the author intends to guard against None (who knows why?) then he should say, explicitly, "if L is not None". That's what PEP8 recommends precisely to avoid ambiguities such as these.
There's a better variable naming scheme for generic lists:
Use xs. If you have multiple lists use ys etc.
This has multiple benefits over L in terms of readability and understandability, as single-item variable names can be made to match the list naming scheme:
for x in xs:
    for y in ys:
        do_some_fancy_calculation(x, y)
I wouldn't name a list l in real code. It was just the first thing that popped to mind. :-)
And as was mentioned the "assert l" part is defending against l being None. I suppose I could be more explicit by saying "assert l is not None and len(l) == 1".
There's nothing wrong with plain `assert l`. Sounds like you're just splitting hairs.
He's defending against None because calling __len__ on None results in an exception.
3.14159 also results in an exception but it's far more likely that the object passed was None than that it was a completely different type than the one expected.
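For concreteness, a quick sketch of the two failure modes in a throwaway session (illustrative only):
>>> l = None
>>> assert l and len(l) == 1   # `l and` short-circuits on None, so this is a plain AssertionError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
>>> len(None)                  # without the guard you'd get a TypeError from len() instead
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'NoneType' has no len()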
Excellent. That one belongs in any Python style guide. Though technically it's not a style, it does lead to better readability and reduces the propensity for unforeseen consequences.
The reason it's not in the Python style guide is because it's a symptom of other problems in code. Lists are for holding multiple values of the same type. If you know that a list will always have one and only one value, it's not conceptually a list, it's some other type that's been encoded into a list for some reason, and you should fix that conceptual mismatch rather than papering over the issue with a style idiom.
That sounds logical enough, except for two things:
First: You are often working with someone else's library which for various reasons you cannot change.
Second: It is not uncommon to use a standard method that may well be able to return multiple items, but in your use case it should only return one. Case in point: a database call that returns the result of a query.
What you say sounds right, but it's not. Using some DB APIs you'll get back a list of results, and you'll know from the SQL that it will only have one result.
Another example is if you know there is a single item in a data structure, and use list comprehension to extract it. You'll end up with a list of one item.
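A sketch of that second case, with made-up data and names (not from the article):
# Hypothetical data, just to illustrate the shape of the situation.
users = [('alice', 1), ('bob', 2), ('carol', 3)]
target_id = 2

matches = [name for (name, uid) in users if uid == target_id]
(match,) = matches   # raises ValueError if the id matched zero or several entries
print(match)         # 'bob'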
But in that case, either your code is evidently in a branch where the list will have one item -- as in your example -- in which case the context makes it clear, or there is no such context (say, a nested function call), in which case at some point you should be passing that lone list item into a function which takes the item as an argument.
The real solution is not to arbitrarily encode your types as lists of exactly one item. If you find yourself passing around such lists with enough regularity that you feel the need to develop an idiom for deconstructing them, you're doing something wrong.
More often than not you are receiving a list from some function, not constructing it manually.
As an example: Say you have a GUI widget that can contain many entries, and you call `widget.get_entries()` which returns a list with all the entries. But if you know there must be only one entry, you can do `(entry,) = widget.get_entries()`.
Then it should make a method which returns its known, sole entry. Returning a list of entries from a widget which will always contain only one entry is the logical equivalent to converting a function return value to a string and then expecting the client to convert it back from a string. That is to say, it's necessary in a general case (e.g., a Widget super class) but should not be exposed that way in some specific cases (e.g., your subclass wherein you know there will always be one entry).
In reality, you shouldn't be mucking with the entries of a widget at all; you should tell the widget what to do and it should adjust its entries as necessary. Law of Demeter and all.
No doubt! But since we're on the subject of "how to make code like this better" it seems appropriate to discuss what's really wrong with the code, rather than just what sort of duct tape style guidelines we can use to patch over its flaws :)
This is Python. Subclass it and fix the problem, unless the guts are so opaque that your code would be littered with subclasses and you can't figure out an elegant, general way of fixing it (unlikely).
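Something along these lines, say -- the class and method names here are invented purely to illustrate the subclassing suggestion:
# Hypothetical stand-ins for whatever GUI class the parent comments have in mind.
class Widget(object):
    def __init__(self, entries):
        self._entries = list(entries)

    def get_entries(self):
        return list(self._entries)

class SingleEntryWidget(Widget):
    """A widget that, by construction, always holds exactly one entry."""

    def get_entry(self):
        (entry,) = self.get_entries()   # the unpacking still enforces the "exactly one" invariant
        return entry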
If someone's knowledge of Python is so shallow that they can't handle tuple unpacking, they should learn more Python. It's one of the basic foundations of the language, and it's hardly a difficult concept.
It is a good thing if code is readable by people who only know the language a little, only know similar languages, or have not used the language for years.
It is a good thing, but it isn't the highest good. It doesn't make sense to avoid useful basic features simply because they aren't immediately obvious to a novice in the language.
> If someone's knowledge of Python is so shallow that they can't handle tuple unpacking
Did you know that "tuple unpacking" actually works on arbitrary iterables (in fact, the relevant function in Python's C code is called ``unpack_iterable``)?
Yes. Why is that surprising? f is a generator, generators are iterables, iterables can be unpacked. I've always thought that the tuple referred to by 'tuple unpacking' was the target containing the variables being assigned, rather than the thing being unpacked, because you frequently unpack things other than tuples.
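Not the parent's examples, but a few throwaway illustrations of that generality:
>>> a, b = [1, 2]                    # a list
>>> a, b = {'x': 1, 'y': 2}          # a dict: you get the keys (in whatever order the dict yields them)
>>> a, b = (n * n for n in (2, 3))   # a generator
>>> a, b
(4, 9)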
Your comment is interesting at a number of levels. And by interesting, I mean that I can hardly fathom you wrote it:
> Yes. Why is that surprising?
because you called it "tuple unpacking". Furthermore, because in functional languages where this feature comes from you actually unpack tuples, and pattern-matching of other structures is not called "tuple unpacking".
> I've always thought that the tuple referred to by 'tuple unpacking' was the target containing the variables being assigned
That's what the "tuple" is unpacked into, it makes no sense that the name of the pattern would come from there. At best and stretching it, you'd have "into-a-tuple unpacking". Furthermore, you're not actually unpacking into a tuple (even less so in Python 3, `(1, * a)` isn't a valid expression... anywhere that I know of), you're unpacking into free variables. That's the point of unpacking.
> because you frequently unpack things other than tuples.
The only other thing frequently unpacked is a list, and Python's tuples are immutable lists, it does not stretch the imagination that a read-only feature of tuples would work on their mutable cousin as well. As for other structures, in 6 years of Python I'd say I've seen unpacking of arbitrary iterators thrice at best. You do not frequently unpack dicts or iterators.
> Your comment is interesting at a number of levels. And by interesting, I mean that I can hardly fathom you wrote it
I imagine a great many people will be "interested" in your response, then.
And really, you choose to be uncivil over a name? Arguing about names is one thing; insulting your opponent because he uses a different name than you do is just childish.
Anyway, since I can't seem to resist trollbait:
> because you called it "tuple unpacking".
It surprises you that historical terminology persists even a decade after limitations have been removed?
> Furthermore, because in functional languages where this feature comes from
Algol 60 had this feature under the name of "multiple assignment" I'll bet long before LISP had "destructuring-bind". This feature did not originate in functional languages.
> That's what the "tuple" is unpacked into, it makes no sense that the name of the pattern would come from there.
Prior to Python 1.5, the expression being unpacked had to be a tuple; now it does not. The name is derived from the original functionality. If you'd spent your effort referring to the Python Language Reference instead of telling the world how you can't fathom someone would use a different name than you would for this sort of assignment, you'd know this :)
> The only other thing frequently unpacked is a list, and Python's tuples are immutable lists
Unpacking applies to iterables. That seems obvious and easy, but the feature is commonly called "tuple unpacking" in the Python world despite the fact that more than tuples can be unpacked. I resolve the apparent conflict by thinking of the term as referring to the most common form of the literal on the left side of the statement, the same syntax as a tuple. If that's not the actual origin, fine, you win the pissing match. But where the name came from has zero bearing on the actual behavior of the language feature, and the point remains that the language feature is simple, consistent, and basic enough that if you don't understand it when you see it, you need to learn the language better.
> The only other thing frequently unpacked is a list
Which isn't a tuple. You're the one trying to be pedantic, but you want to ding me for making this distinction?
> it does not stretch the imagination that a read-only feature of tuples would work on their mutable cousin as well.
I agree, it doesn't. I don't see how it stretches the imagination that it works on general iterables, either.
Because packing multiple values on the right hand side always creates a tuple, and because the most common use of the feature – multiple assignment – deals with tuples, the “sequence unpacking” feature of Python is often called “tuple unpacking” instead, even by people who understand that any sequence can be unpacked.
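For example (a trivial sketch, in a 2.x session to match the rest of the thread):
>>> t = 1, 2          # "packing": the bare commas on the right build a tuple
>>> type(t)
<type 'tuple'>
>>> a, b = 1, 2       # multiple assignment: pack into a tuple, then unpack it
>>> a, b = b, a       # the classic swap works the same way
>>> a, b
(2, 1)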
"There should be one-- and preferably only one --obvious way to do it."
A couple of things:
1) The key word is obvious. There are oftentimes less obvious ways to do things that may be better for whatever reason.
2) It's not really reasonable to expect that there can only be one way to do everything.
What it really means is that (for instance) Python only allows one way to denote where a code block begins and ends (via indentation), while Ruby allows you to use curly brackets and begin/end. Nor does Python have an unless statement that is equivalent to "if not".
This is the one way to do it. masklinn has a cousin comment that does a good job of explaining why: http://news.ycombinator.com/item?id=1732575, and others have made some good points questioning whether you'd ever actually want to do this particular "it" in the first place.
Very good idea, I support this and will use it from now on (even though I don't like the (thing,) Python syntax very much), as it's the only way to say that the iterable should only have one element.
Oh, you're very much correct. Let me benchmark that...
There doesn't appear to be any difference in speed between the two. Assigning the result to one variable (totally different to this) is about 6% faster, so this syntax is what I'll use, thank you!
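(Roughly this kind of timeit comparison, for anyone who wants to reproduce it -- a throwaway sketch, not the exact commands used above:)
import timeit

setup = "l = [42]"
for stmt in ("(x,) = l", "x, = l", "x = l[0]"):
    # timeit runs each statement a million times and reports the total seconds
    print("%-10s %.3f" % (stmt, timeit.timeit(stmt, setup=setup)))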
> Assigning the result to one variable (totally different to this) is about 6% faster, so this syntax is what I'll use, thank you!
On your exact machine with your exact version of Python, today.
Please don't let microbenchmarks dictate what code you write. If you need a 6% performance gain in a microbenchmark, you've chosen the wrong language to use. Python is about readability and maintainability, not syntax hacks to make some benchmark slightly faster.
The point of the above post was to say "this is nice syntax, and it isn't ten times slower, so it's good on that front too", not to say "use that because it's 6% faster". I'm never going to write this in a tight loop anyway, and even if I did, I'd benchmark the entire piece of code, not just this line.
I know about the optional parentheses, but omitting them makes the comma very easy to overlook :/ I do like your fake operator idea, but it seems that it would go against the spirit of Python and make things easier to type at the expense of readability...
> so if I was wrong in my original assumption that stuff has exactly one element, Python will shout at me before this will manifest itself as a hard-to-find bug someplace else in the program.
and then later on,
> This method works even when stuff is a set or any other kind of collection. stuff[0] wouldn’t work on a set because set doesn’t support access by index number.
The second argument is basically in favor of duck typing, which is the Pythonic way of writing code, i.e., not caring about the actual type of the object but only whether it responds to the given message.
But what about the first argument? Couldn't you make the case that the pythonic way of handling it is to only care about if the object responds to __getitem__(0)?
Which idiom you use would depend entirely on context, wouldn't it?
> But what about the first argument? Couldn't you make the case that the pythonic way of handling it is to only care about if the object responds to __getitem__(0)?
Mmm no? Here, what he cares about is that it's a single-element collection, he doesn't just want the collection's first item. If he did, foo[0] would do a better job.
So the object responding to __getitem__(0) is not a sufficient condition.
In fact, it's entirely wrong, as sets do _not_ implement __getitem__. Worse, on a dict __getitem__(0) does a very different thing than unpacking does. Unpacking is more ducky than __getitem__ for this case, because it expresses the following: the right-hand side is a single-element iterable. Any iterable will work as long as it only has a single item; it doesn't have to be a sequence, a mapping, or even a collection (generators or callable_iterators will work just as well).
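A few throwaway examples of what that buys you (illustrative only):
>>> (x,) = [7]                    # a list
>>> (x,) = set([7])               # a set: no __getitem__, but unpacking still works
>>> (x,) = {'only_key': 'val'}    # a dict: you get the key, not the value
>>> x
'only_key'
>>> (x,) = [7, 8]                 # and anything but exactly one element fails loudly
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack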
Duck-typing means you don't assume that the object is from a certain type; Instead you assume that it's got the interfaces you need. (Either `.__iter__()` or `.__getitem__(0)` in this case.)
`.__iter__()` is a more general and common interface than `.__getitem__(0)`, so it's preferable to assume that the object implements the former rather than the latter.
But sometimes you do want to assume stuff about your object. In this case we want to assume that the list has exactly one item, and we want our code to break immediately if it isn't true.
Depends on which is more important. If it's an egregious error for there to be more than one element then I'd prefer the unpacking solution. However if ordering is/was involved at some point a set would indicate a possible bug (since they're unordered) so I might lean towards indexing.
They are both decent ideas, not hard-and-fast rules. Let your context be your guide.
My problem is that exact situation almost never comes up. Usually, I want to use the item in the list directly in an expression, not assign it to another variable first. For example:
if 1 == len(lst):
    return lst[0]
else:
    return reduce(fn, lst)
I don't really want to have to do:
if 1 == len(lst):
    (single,) = lst
    return single
else:
    return reduce(fn, lst)
I see what you mean. In this case it's indeed questionable whether this idiom would improve the code. The only improvement I can offer for this code is to fix the Yoda conditions :)
That's very different from the OP's piece of code: you're not trying to assert that the iterable yields a single element; you already know it.
Furthermore, OP's assertion-unpacking will work not just on lists, but also on dicts (will return the only key, not the only value, and the key doesn't have to be ``0``), on sets (which don't implement __getitem__ at all), on arbitrary collections and even on arbitrary iterables (including callable_iterator and generators)
Also, please don't put the constant on the left-hand side of a comparison in Python; it's useless and ugly.
I agree it's cool that it works on lists and sets and dictionary keys. Upvoted, etc. Just noting that there are many situations where it won't replace the [0].
Also, I don't worry about order on simple equality expressions unless someone asks me to do it a certain way. That's a religious argument and a complete waste of time for me.
Python 2.6.5...
Type "help", "copyright", "credits" or "license" for more information.
>>> (a, (b, c, (d, e))) = (1, (2, 3, (4, 5)))
>>> a, b, c, d, e
(1, 2, 3, 4, 5)
>>> (a, (b, c, (d, e))) = (1, (2, 3, (4, 5, 6)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
It isn't quite as flexible as functional languages and it's not as idiomatic as it is in functional languages, but it's not a hack or quirky edge-case either.
Not really. The first two are consistent, but the last one is "get the only item and assert that there is only one item to get." If you don't care about the assertion, use [0], as always. I like the latter, because there's added info for the reader ("this should return a one-item iterable").
It's not really inconsistent, because you aren't performing the same operation in the two cases.
I'm annoyed by the belief that consistency is ipso facto good. Consistency is a tool for attacking a particular type of problem. Consistency for its own sake will often make things worse.
I see this fallacious reasoning all the time in design critiques: this is inconsistent with that. Well, yes, but so what? Why is consistency desirable in this instance?