Honestly, it was mostly the concision that struck me at the time. Looking at the zipmap source now, I think it's more or less where my code tends to end up in terms of comments--some motivation / higher-level comments at the top, and most functions have a comment before them, hopefully with some explanation of any edge/NULL cases as well. Any open-source projects you'd point to as good examples of what looks readable to you? Always trying to improve my own coding habits as well.
That's because rehash returns 0 upon success, and -1 on failure.
To me, that's very confusing, because it reads as though it errors on success. Only when I read rehash's body do I learn that 0 means success and -1 means failure.
I'd prefer one of two methods. First, you could return 1 on success, and keep returning something falsy on failure. Then the code would read:
    if (!rehash(H))
        goto err0;
Which I think is more readable. Perhaps even better would be to use a constant, i.e.,
    if (rehash(H) != REHASH_SUCCESS)
        goto err0;
although that can easily lead to a mess of redundant constants, or a headache managing them.
Just my personal preference. Really nice code, thank you for sharing it.
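To make the two suggestions concrete, here is a minimal sketch of both side by side. Everything in it is made up for illustration: `struct table`, `rehash_bool`, `rehash_named`, `grow_twice`, and the `REHASH_FAILURE` constant are assumptions, not names from the code under discussion (only `REHASH_SUCCESS` is the constant proposed above).

```c
#include <stddef.h>

/* Hypothetical stand-in for the hash table; not the real structure. */
struct table {
	size_t nslots;
};

/* Alternative 1: return 1 on success and 0 (falsy) on failure. */
int
rehash_bool(struct table *H)
{
	if (H == NULL || H->nslots == 0)
		return (0);
	H->nslots *= 2;
	return (1);
}

/* Alternative 2: named result codes (names are illustrative only). */
enum rehash_status { REHASH_SUCCESS, REHASH_FAILURE };

enum rehash_status
rehash_named(struct table *H)
{
	if (H == NULL || H->nslots == 0)
		return (REHASH_FAILURE);
	H->nslots *= 2;
	return (REHASH_SUCCESS);
}

/* Both call sites read as "if it did not succeed, bail out". */
int
grow_twice(struct table *H)
{
	if (!rehash_bool(H))
		goto err0;
	if (rehash_named(H) != REHASH_SUCCESS)
		goto err0;
	return (0);

err0:
	return (-1);
}
```

Note that `grow_twice` itself still has to pick a convention for its own return value; here it uses 0 and -1, which is exactly the kind of mixing that a project-wide rule avoids.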
you could return 1 on success, and keep returning something falsy on failure
I come from an OS background, so to me 0 is success and non-zero is failure. It doesn't really matter which convention a project uses as long as it's consistent, so I documented this in my /STYLE file: "In general, functions should return (int)(-1) or NULL to indicate error."
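For readers who haven't seen that convention in practice, a rough sketch of how it tends to look: integer-returning functions give 0 or -1, pointer-returning functions give NULL on error, and failures unwind through numbered labels. This is not code from the project being discussed; `struct table`, its fields, and `table_create` are assumptions invented for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table; the real structure under discussion differs. */
struct table {
	size_t nslots;
	int *slots;
};

/* Double the slot array.  Returns 0 on success, -1 on failure. */
int
rehash(struct table *H)
{
	int *slots;
	size_t i;

	/* Refuse sizes which would overflow the allocation below. */
	if (H->nslots == 0 || H->nslots > SIZE_MAX / (2 * sizeof(int)))
		return (-1);
	if ((slots = realloc(H->slots, 2 * H->nslots * sizeof(int))) == NULL)
		return (-1);

	/* Zero the newly added half. */
	for (i = H->nslots; i < 2 * H->nslots; i++)
		slots[i] = 0;
	H->slots = slots;
	H->nslots *= 2;
	return (0);
}

/* Create a table and grow it once.  Returns the table, or NULL on failure. */
struct table *
table_create(size_t nslots)
{
	struct table *H;

	if ((H = malloc(sizeof(struct table))) == NULL)
		goto err0;
	H->nslots = nslots;
	if ((H->slots = calloc(nslots, sizeof(int))) == NULL)
		goto err1;
	if (rehash(H))
		goto err2;
	return (H);

err2:
	free(H->slots);
err1:
	free(H);
err0:
	return (NULL);
}
```

The pointer checks assign and compare against NULL explicitly, while the integer check is a bare `if (rehash(H))`, so under a consistent rule the two kinds of call site look different at a glance.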
Both you and Salvatore have very readable code. I'm uncomfortable with having both 'H' and 'h' in the same function, and with using 'l' as a variable, but both of these files are exemplary C. Still, although it may just be personal preference, I give the slight edge to Salvatore on clarity.
I have no real problem with 0 for success, but I'm with 'qeorge' on this. How is someone looking at only this function call supposed to determine if rehash() returns an integer or a pointer? My first guess would have been that you were checking for a non-NULL pointer. The 'goto err' makes it relatively clear after the fact, but not at a glance.
If you want to keep returning -1, I think a better convention would be 'if (rehash(H) < 0)', which makes it more clear that you expect an int and better implies an error condition. But I think even better would be to return KV_SUCCESS or KV_FAILURE (defined however you choose) and check explicitly. This would also let you remove a bunch of lines by getting rid of every comment next to every return statement! :)
I'm uncomfortable with having both 'H' and 'h' in the same function
To be honest, I don't like that either, and normally wouldn't do it. In this case I decided that the benefits of consistency with other code ([H]ash table structure; [h]ash of some data) outweighed the ugly near-collision.
How is someone looking at only this function call supposed to determine if rehash() returns an integer or a pointer?
If it returned a pointer, that line would have been "if ((foo = rehash(H)) == NULL)" -- I don't return pointers which aren't going to be used, and I don't write "if (pointer)". Again, it's a matter of knowing the house style.
I think even better would be to return KV_SUCCESS or KV_FAILURE (defined however you choose) and check explicitly
If it were just a matter of one function, that might be reasonable. But applying that to the entire project I'd have LBS_STORAGE_SUCCESS, LBS_WORKER_SUCCESS, LBS_DISK_SUCCESS, LBS_DISPATCH_SUCCESS, KVLDS_DISPATCH_SUCCESS, KVLDS_SERIALIZE_SUCCESS, BTREE_FIND_SUCCESS, BTREE_MUTATE_SUCCESS, BTREE_SUCCESS, BTREE_CLEANING_SUCCESS, BTREE_NODE_SUCCESS, MUX_DISPATCH_SUCCESS, NETBUF_SUCCESS, PROTO_LBS_SUCCESS, PROTO_KVLDS_SUCCESS, EVENTS_SUCCESS, WIRE_READPACKET_SUCCESS, WIRE_WRITEPACKET_SUCCESS, WIRE_REQUESTQUEUE_SUCCESS, UTIL_SUCCESS, ELASTICQUEUE_SUCCESS, PTRHEAP_SUCCESS, SEQPTRMAP_SUCCESS, and ELASTICARRAY_SUCCESS. Plus all the _FAILUREs.
Much simpler to just say "0 is success, -1 is failure" once.
That's fair, although it seems like in that case TARSNAP_FAILURE and TARSNAP_SUCCESS would then be a fine alternative.
I think the whole question is how much it's OK to have a house style that isn't immediately accessible, and how much that benefits the project. Is it really a benefit if outsiders can grasp the code immediately, or is it actually good to have a hurdle?
Do you have an argument against "rehash(H) < 0" other than the visual distraction? It seems like it would parallel better with a literal comparison to NULL.
TARSNAP_FAILURE and TARSNAP_SUCCESS would then be a fine alternative
Sure, until someone looking at kivaloo or spiped or scrypt asks "what the heck is tarsnap?" ;-)
house style that isn't immediately accessible
I think it's very likely that people working on the kivaloo code will have at least a passing familiarity with UNIX system call conventions, and would find my house style entirely accessible.
Do you have an argument against "rehash(H) < 0" other than the visual distraction?
I would interpret that to indicate that the function has several potential return codes, not just 0 or -1.
There's no such thing as a dumb question, only people too dumb to take every available opportunity to learn. ;-)
In C, "if (foo)" means "if (foo != 0)" if foo has integer type, so that line means "if (rehash(H) != 0)" or (since rehash only returns 0 or -1) equivalently "if (rehash(H) == -1)".
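If it helps to see that equivalence spelled out, here is a small sketch. `rehash_stub` and the `check_*` helpers are made-up names standing in for any function that returns only 0 or -1; the "< 0" form suggested earlier in the thread is included because it agrees with the others under that same assumption.

```c
/* Hypothetical stand-in: returns 0 on success, -1 on failure. */
int
rehash_stub(int fail)
{
	return (fail ? -1 : 0);
}

/* Each helper returns 1 if its test treats the call as a failure. */
int
check_implicit(int fail)
{
	if (rehash_stub(fail))		/* same as "!= 0" for integers */
		return (1);
	return (0);
}

int
check_nonzero(int fail)
{
	return (rehash_stub(fail) != 0);
}

int
check_minus1(int fail)
{
	return (rehash_stub(fail) == -1);
}

int
check_negative(int fail)
{
	return (rehash_stub(fail) < 0);
}
```

All four helpers agree only because the stub never returns anything other than 0 or -1; a function with additional return codes would make the "== -1" and "< 0" forms diverge from the other two.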