
It is a shame that the section on default arguments isn’t a bit longer. Perhaps it’s out of scope to talk about anti-patterns, but in my experience default arguments cause a lot of distress in an otherwise good code base.

Defaults are useful when you are providing a library function for other teams to use. If you’re inside a more private code base and doing work on the implementation of your team’s service then it is wise to avoid default arguments.

The problem is they provide a point after which it seems acceptable to add a flood of more default arguments. This is particularly the case for junior developers who lack confidence to refactor instead of patch. Default arguments go hand in hand with conditional logic and cause functions to bloat into do-everything multi-page monsters without any focus and no tractable flow of logic.

Forgive the contrived example, but what was once this:

  def greet(name):
    print(f"Hello {name}")
ends up becoming this, all because no one would bite the bullet and pick this apart into individual functions:

  def greet(
    name,
    language=None,
    io=None,
    is_ci=False,
    and_return=False,
  ):
    greeting = "Hello"
    if language:
      greeting = translate(greeting)
    message = f"{greeting} {name}"
    fn = print
    flush = False
    if is_ci:
      fn = log
      flush = True
    fn(
      greeting,
      flush=flush,
      io=io if io else stdout,
    )
    if and_return:
      return greeting

The slow rot of more and more defaults makes the function longer and longer. Moreover, each time someone adds a new option it gets harder to justify why they shouldn’t do it when the previous person was allowed.



I think the example is indeed contrived - you said it yourself!

To me it doesn't illustrate the problem with default parameters specifically.

For example it shows that the programmer doesn't know dependency injection and first-class functions. Printing could be passed in as a function param, but then it might actually be sensible to provide a default callable (eg, print), depending on how the greet function is going to be called.
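For illustration, a minimal sketch of the "default callable" idea. The parameter name `emit` is an assumption made up for this example; it is not part of the original `greet`.

```python
def greet(name, emit=print):
    """Build the greeting and hand it to whatever callable was injected."""
    message = f"Hello {name}"
    emit(message)  # defaults to print; any one-argument callable works
    return message


# Default behaviour: writes to stdout.
greet("Ada")

# Injected behaviour: collect messages in a list instead of printing.
collected = []
greet("Grace", emit=collected.append)
```

Here the default is still a default argument, but it replaces the `is_ci` and `io` flags with a single injection point.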

Language seems to be a perfectly sensible thing to have a default on.

Then, and_return... That is, like, super contrived, man... I mean if a programmer doesn't know that a function caller can simply call for side effects and ignore the return value, they likely have much bigger problems than their judgement to use defaults or not.

I empathize with your plight though - you're probably a great programmer, and I think it's very difficult for someone who is good at their craft to come up with genuinely but subtly shitty examples.


You are right — the return thing is very silly. What about this:

  def greet(…, bow=False):
    …
    if bow:
      take_a_bow()
Except imagine take_a_bow() instead as 10 lines of code to perform the post-greeting bow-ceremony. That code takes additional arguments regarding what kind of hand flourish to perform while bowing. The flourish_type has to be an optional argument to greet (because bow is) but inside the bow you have to assert flourish_type is not None because you can’t bow without knowing what flourish to give.
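A rough sketch of that coupling, next to one possible alternative. `take_a_bow` here is a one-line stand-in for the ten-line ceremony, and `greet_with_flags` is a hypothetical name used only to keep the two versions apart.

```python
def take_a_bow(flourish_type):
    """Stand-in for the bow ceremony; it cannot run without a flourish."""
    return f"bows with a {flourish_type} flourish"


def greet_with_flags(name, bow=False, flourish_type=None):
    """The criticised shape: one optional argument drags in another."""
    lines = [f"Hello {name}"]
    if bow:
        # Optional in the signature, but mandatory on this code path.
        assert flourish_type is not None, "flourish_type is required when bow=True"
        lines.append(take_a_bow(flourish_type))
    return lines


def greet(name):
    """Alternative: greet only greets; callers who want a bow call take_a_bow."""
    return f"Hello {name}"


# A call site that wants both simply composes the two functions.
ceremony = [greet("Ada"), take_a_bow("grand")]
```

In the second version `flourish_type` is a required argument of the only function that needs it, so the assert disappears.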

I’ve seen some dark stuff over the years.

Also: Welcome to HN!


  def greet(bow=None):
    ...
    if bow:
      bow()

  def elsewhere(gesture):
    def bow():
      print(gesture)
    greet(bow)
Wrap that.


OMG please leave your dependency injection in your Java project and keep Python clean with its easy to use list of default parameters.


DI is not a pattern for providing defaults. It is a pattern that lets you separate dependency initialisation from your callable's responsibilities. That initialisation is often a complex graph of inner dependencies that you would otherwise need to provide or initialise yourself. It can also control whether you get a single instance of a dependency per call, per request, per thread, or per process. Injecting defaults from a configuration object or a properties/settings file is a very nice feature too.
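To make the scope idea concrete, here is a deliberately tiny sketch. It is not the in-house framework discussed in this thread or any real DI library; `Container`, `register`, and `resolve` are hypothetical names, and only two scopes (per call and process-wide) are shown.

```python
class Container:
    """Illustrative container: maps a name to a factory plus a scope flag."""

    def __init__(self):
        self._factories = {}
        self._singletons = {}

    def register(self, name, factory, singleton=False):
        self._factories[name] = (factory, singleton)

    def resolve(self, name):
        factory, singleton = self._factories[name]
        if not singleton:
            return factory()  # fresh instance on every resolve
        if name not in self._singletons:
            self._singletons[name] = factory()  # built once, then reused
        return self._singletons[name]


container = Container()
container.register("settings", lambda: {"greeting": "Hello"}, singleton=True)
container.register("buffer", list)  # a new list each time


def greet(name, settings, buffer):
    """The callable only uses its dependencies; it never constructs them."""
    buffer.append(f"{settings['greeting']} {name}")
    return buffer


result = greet("Ada", container.resolve("settings"), container.resolve("buffer"))
```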

Also, stop being so religious and defensive when somebody mentions things that are not standard in your language of choice; nobody is forcing you to use them.

I was successfully building very testable and maintainable codebases (~50 kloc) in Python while also using a very small in-house DI framework, and it was subjectively (by me and my colleagues) much better than what we had before, when we followed standard Python patterns and the ways Python frameworks teach you to follow.


What's the point of a DI framework? I never got the point of even thinking about DI explicitly like it's something special. It's so obvious that it hardly deserves to have a name, let alone a framework.


> (by me and my colleagues)

Can I guess you were all Java developers?


No, around 6-7 Python developers including me; we've used it in two different projects. Also it's probably worth mentioning that I'm not the author of that lib: my die-hard anti-Java colleague created it just by reading about DI, consulting with another colleague who had been using Spring more or less since high school, and researching existing DI libraries. I myself did some programming in Java before that, but it was mainly hobbyist gamedev, later commercial Android and light Java backend work using Spring (that's where I first saw it used), intermixed with around 7 years of professional Python backend programming at two different companies. Now, after 10 years, I'm a Java engineer; I've had enough of using dynamic languages to write moderately complex web applications.

I still love using Python for REPL, small scripts or prototyping, and I think having things like mypy is great as it takes away much of the burden without being a huge obstacle in some situations where you really need to use duck typing. Also I'm thankful that it taught me early that the debugger is one of the developer's best friends and that the best documentation is just reading the code.


Thanks yeah I'm just bitter and in a bad mood today. :)


You can use dependency injection in Python while remaining completely pythonic - nothing in the Python I write resembles Java. Dependency injection is not mutually exclusive with using defaults either, so I'm not sure what we're talking about here.


For me, DI is where this happens at the public interface to things. I’m quite happy with subprocess.Popen’s keyword argument salad because they are relatively ergonomic and make good sense.

When that kind of “handy defaults for ya!” programming happens to a function inside a package… that has four call sites… all of which were added by the same team of three people… just refactor your stuff, get functional, and say explicitly what you actually need.



Yeah, pandas seems to flout a lot of conventions about not grouping lots of different control flow into a single function.

But at the same time I wonder how it would look refactored. How many read from csv functions would we be left with?


> How many read from csv functions would we be left with?

It probably couldn't be that, because many build on one another. Some are deprecated and others are clearly incompatible, but out of 50 parameters you likely could imagine calling this with 20 parameters if the environment and the CSV you're ingesting are wonky enough.

I think feasible refactorings would be:

- rationalise currently separate parameters into meatier objects, e.g. there are at least half a dozen parameters which deal with date parsing, a dozen which configure the low-level CSV parsing, etc. that could probably be coalesced into configuration objects

- a builder-type API, but you'd end up at the same result using intermediate steps instead of a function, which is not really useful unless you leverage the first point and each builder step configures a non-trivial amount of the system, so rather than 50 parameters you'd have maybe 10 builder steps, each with 0~10 knobs

- or you'd build the thing as a bunch of composable transformers on top of a base parser

Of note: the latter at least might be undesirable from the Pandas POV, as it would imply layers of recursive Python calls, which might be much slower than whatever Pandas currently does (I've no idea).
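As a rough sketch of the first option (grouping parameters into configuration objects), using only the standard library. `read_table`, `CsvDialect`, and `RowSelection` are hypothetical names for illustration and are not pandas APIs.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class CsvDialect:
    """Hypothetical group for the low-level CSV parsing knobs."""
    delimiter: str = ","
    quotechar: str = '"'


@dataclass(frozen=True)
class RowSelection:
    """Hypothetical group for which rows to keep."""
    skip_rows: int = 0


def read_table(text, dialect=CsvDialect(), rows=RowSelection()):
    """Parse CSV text into a list of rows using two grouped option objects."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=dialect.delimiter,
        quotechar=dialect.quotechar,
    )
    return list(reader)[rows.skip_rows:]


# Only the groups that diverge from the defaults need to be spelled out.
table = read_table(
    "a;b\n1;2\n",
    dialect=CsvDialect(delimiter=";"),
    rows=RowSelection(skip_rows=1),
)
```

The option objects are frozen, so sharing one instance as a default value does not run into the usual mutable-default problem.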


I think that this style (such as it is) comes from R, and scientific computing more generally. I grew up with R and never realised how terrible long argument functions are until relatively recently.


`pyarrow`'s `read_csv` function[0] has just four default arguments (defaulted to None): three option objects and one memory pool option.

```
pyarrow.csv.read_csv(input_file, read_options=None, parse_options=None, convert_options=None, MemoryPool memory_pool=None)
```

You can then pass a `ReadOptions`[1] object if needed.

For example:

```
read_options = csv.ReadOptions(
    column_names=["animals", "n_legs", "entry"], skip_rows=1)
csv.read_csv(io.BytesIO(s.encode()), read_options=read_options)
```

You can see how ReadOptions is written on this link [2]. It's interesting they use a `cdef class` from `Cython` for this.

This doesn't solve all issues (the ReadOptions object and the others will inevitably have a bunch of default arguments) but I do think it's safer and it's easier to have a mental map of the things you need to decide and what's decided for you.

[0] https://arrow.apache.org/docs/python/generated/pyarrow.csv.r...

[1] https://arrow.apache.org/docs/python/generated/pyarrow.csv.R...

[2] https://github.com/apache/arrow/blob/master/python/pyarrow/_...


This is where you could use a builder pattern where you specify everything that diverges from the default using chained method calls.
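A minimal sketch of what such a chained builder could look like. `CsvReaderBuilder` and its methods are hypothetical names, and the parsing is a naive `str.split` that ignores quoting, so this only illustrates the call style.

```python
class CsvReaderBuilder:
    """Hypothetical builder: each method overrides one default and returns self."""

    def __init__(self):
        self._delimiter = ","
        self._skip_rows = 0

    def delimiter(self, value):
        self._delimiter = value
        return self  # returning self is what makes chaining possible

    def skip_rows(self, count):
        self._skip_rows = count
        return self

    def read(self, text):
        # Naive parsing for illustration only: no quoting or escaping.
        lines = text.splitlines()[self._skip_rows:]
        return [line.split(self._delimiter) for line in lines]


rows = CsvReaderBuilder().delimiter(";").skip_rows(1).read("a;b\n1;2")
```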


So you end up at the same point, but now you need additional intermediate structures and infrastructure which do nothing to help. And for Python specifically it's also a pain in the ass to format due to the whitespace sensitivity.


proving the comment's point - this is a library function! exactly the right case for default args


Yet, I've never had an issue using that function!


matplotlib has entered the chat.


You don't suggest any solution. Do you want more function overloading or maybe config objects?

Adding default parameters works well with existing code. It is not bad or lazy just because it is easy.


Imagine two different call sites want to do two different things with the message. One wants to log() it, another wants to print() it. In my example this has been implemented by passing a flag to greet() to tell it what to do.

If greet() gave up responsibility for outputting the message and instead just constructed it, then your code would look like this:

  def site1():
    print(greet("x"))

  def site2():
    log(greet("y"))
And greet would be half as long.


Except wouldn't the ideal in this case be to use default parameters?

  def site(message, port=print):
    port(greet(message))


This bit in particular:

  if and_return:
    return greeting
Is a nice touch. I've definitely seen that pattern in the wild.


I can't conceive where this would be useful, could you expand on this?


I usually see it for things like “return_generator”. Then you need to write an overloaded signature to show that it could return a list or generator depending on that param.

Then it’s even worse when there’s also an “allow_raise” param, where the return type will not be None if allow_raise=True. Now you need to write 4 overloaded signatures to account for the 2 polymorphic params.
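For reference, a sketch of the single-flag case with `typing.overload`. The function `squares` is a made-up example; only the `return_generator` flag name comes from the comment above.

```python
from typing import Iterator, List, Literal, Union, overload


@overload
def squares(n: int, return_generator: Literal[False] = ...) -> List[int]: ...


@overload
def squares(n: int, return_generator: Literal[True]) -> Iterator[int]: ...


def squares(n: int, return_generator: bool = False) -> Union[List[int], Iterator[int]]:
    """Return the squares of 0..n-1 as a list, or lazily if return_generator=True."""
    gen = (i * i for i in range(n))
    return gen if return_generator else list(gen)


eager = squares(3)
lazy = squares(3, return_generator=True)
```

One flag already costs two overloads plus the implementation signature; a second independent flag doubles the overload count, which is where the four signatures come from.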


It's more likely to be listDir(withSizes=False)

I work on a very large codebase like this: most functions return lists of strings/tuples, and most DTOs are dictionaries with string keys instead of classes with methods to retrieve information in different ways. Therefore parameters have been added to return information in more and more ways.


> Moreover, each time someone adds a new option it gets harder to justify why they shouldn’t do it when the previous person was allowed.

I’m not sure languages should be limited in order to avoid problems with the lack of project leadership.


I think the example is a bit grey.

In my opinion, a function should list its dependencies and allow changing them. Having said that, I don't believe the `is_ci` decision should happen in the function. The decision should happen at the entrypoint and it should drive which implementations the code will use for the dependencies.

I would look for the cause of the rot in the function becoming the merge point of multiple contexts, not in default values per se. As for whether default arguments make merging multiple contexts in a single function easier: code reviews might help here.
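A small sketch of making that decision at the entrypoint. The `CI` environment variable, the `choose_emitter` helper, and this `log` function are assumptions for illustration; the thread never defines them.

```python
import os


def log(message):
    """Hypothetical CI logger standing in for the thread's log()."""
    print(f"[ci] {message}", flush=True)


def greet(name, emit):
    """No is_ci flag: greet only knows it was handed something to emit with."""
    emit(f"Hello {name}")


def choose_emitter(environ):
    """Entrypoint decision: pick the implementation once, based on context."""
    return log if environ.get("CI") else print


def main(environ=os.environ):
    # The context is inspected here, and nowhere further down the call chain.
    greet("Ada", choose_emitter(environ))
```

With this shape `greet` stays the same in every context, and only `main` knows that CI exists.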

In any case, very good example.


mmmm yes the entire R ecosystem…

Others also mention pandas

Data science workflows in general love to do this


SRP violation?



