Automate your Python project with Makefile (2021) (antonz.org)
50 points by asicsp on March 27, 2023 | 66 comments



I have been using Makefile for over 10 years in all of my projects, and here are some features I've always found lacking in Makefile:

1. There is no way to display documentation for commands and accepted parameters. Yes, you can write a special task that will display comments, but you have to copy it from project to project.

2. The need to pass named parameters when calling tasks. I want to write `make serve localhost 3000` instead of `make serve bind=localhost port=3000`

3. I've always had the need in different projects to use the same commands, so I had to copy tasks from project to project. I need a central place with commands that I can apply to any project.

4. The ability to write tasks in different languages. In some cases, it's easier to write in Python or TypeScript/Deno.

5. And most importantly, it is difficult to write commands in Makefile that can be used in different environments. For example, I need to run commands on different groups of servers: production and staging. This could look like: `make production TASK1 TASK2` or `make stage TASK1 TASK2`. In other words, the production/stage task sets up the parameters for executing tasks in a specific environment. It might be possible to call commands in this way with Makefile, but it seems too complicated.

As a result, I decided to write my own utility for task automation: https://github.com/devrc-hub/devrc

It solves all of the above problems, has other interesting features, and is also written in Rust.


I think Just (as in Justfile) also solves most of your points, is reasonably widely used, also written in Rust and has integration plugins for it in most editors.


These “run something” targets should be marked as .PHONY so make realizes they will not produce a file. Otherwise a file named like the target will confuse make and turn your day into a sad one.
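To see why a stray file causes trouble, here is a rough Python model of the decision make takes before running a recipe. It is a simplification for illustration only, not make's actual implementation; the function name `should_run` and its parameters are made up for this sketch, and prerequisites are assumed to exist on disk.

```python
import os


def should_run(target, prerequisites=(), phony=False):
    """Simplified, illustrative model of when make runs a target's recipe."""
    if phony:
        # A .PHONY target is never looked up as a file, so its recipe always runs.
        return True
    if not os.path.exists(target):
        # No file named like the target: the recipe runs.
        return True
    target_mtime = os.path.getmtime(target)
    # A file with the target's name exists: the recipe runs only if some
    # prerequisite is newer than that file.
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)
```

With a file called `test` sitting in the project directory, `should_run("test")` is false, which corresponds to make reporting the target as up to date, while `should_run("test", phony=True)` stays true.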


I ran into this back when I used make as a command runner.

After a short detour via just, I've been using a shell script with a big case statement ever since.


Yup. I've found that in most cases a directory of script files ends up better. Or even better a directory of python files since shell kinda sucks.


The main reason I still use Make over other alternatives is that everyone already has it. No need to install something else to bootstrap a project. I put in a `make initialize` that takes care of everything: setting up the virtualenv, using our internal pip mirror, downloading dependencies, etc. It's just such low friction!

Side note: Make is also really popular among self-hosters for ansible infrastructure setup.


Make wasn’t designed to be a task runner. It can be used as one, but it doesn’t make (pun not intended) a very good one, and it doesn’t have the best syntax, either.

Make is an artifact updater. Although it’s not its main focus, Peter Miller’s classical paper ‘Recursive Make Considered Harmful’ does a good job of explaining what it is.

One great benefit of make is that it’s present everywhere, so there’s no additional hassle of installing an extra tool. Depending on the project, this hassle-freeness may or may not outweigh make’s relative discomfort as a task runner.


What things were designed for is often a side show. We want to think that intent of creation matters, for some reason, when reality is full of that never really being the case.

So, on the merits, what makes it a bad task runner between outputs? I agree it is somewhat obtuse, but I'm still mostly crying from trying to get nox and a few other things working. GitHub's workflow syntax is as painful, all told. (Though, it has the very real constraint that it is "per repository" and you need something above it.)


> ...there is no more important proposition for every sort of history than that which we arrive at only with great effort but which we really should reach, -- namely that the origin of the emergence of a thing and its ultimate usefulness, its practical application and incorporation into a system of ends, are toto coelo separate; that anything in existence, having somehow come about, is continually interpreted anew, requisitioned anew, transformed and redirected to a new purpose...


Well first of all if you’re not using any dependency tracking logic anyway it provides 0 advantages over some scripts like make.sh test.sh push.sh lint.sh (where test.sh can simply call make.sh if you want).


It's not true that Python doesn't have anything akin to package.json scripts. Bonus points: all is executed within project's virtualenv.

https://python-poetry.org/docs/cli/#run


While poetry is the way to go today, I've got my eyes on pdm (https://github.com/pdm-project/pdm). pyproject.toml is a PEP standard, so it will be easy to move between tools.


I just wish poetry's dependency resolution was faster. It's painful today.


pdm is already the better choice than poetry IMHO.


How does this compare to taskipy?


is this separate to pip's "requirements.txt" usually?


Very different. requirements.txt only handles dependency packages, and only handles one kind of dependency (no dev/release distinction). pyproject.toml does dev/release deps, it has package metadata, it can configure all your lint/test tooling, and it is extensible with plugins.


and they can't be combined, right? like package.json for example


Yes, it's assuming poetry is used, whereby normally the packages needed by the project are defined in pyproject.toml


I've been seeing a growing use of `just` for cases like this. It works similar in function to make, but it's a great deal cleaner and easier to use.



It also backtracks up the directory tree so you don’t have to cd to run «just test».


I'm considering using it because the justfile can be anywhere in the directory tree above the point where you're running it. (This happens to be useful in certain situations like terraform env/region/stack trees.) "Just" is also a command runner. While (as another person has pointed out) "make" is everywhere, "just" is easy to install.


as someone who maintains projects both using just and make, I have to say just is awesome, and allows things like writing a given task directly in a scripting language. sadly it's not universally available across operating systems and linux distros (ubuntu/debian was the last significant dev ecosystem where it wasn't native, re apt install). so for projects with wider contributor goals, I still default to make, despite its warts (which are many imo), because it tends to work ootb.


Invoke is like a Makefile but in Python: https://www.pyinvoke.org/index.html


Makefiles are great for local task automation. In college I had a friend who started every homework, even the social science and English homework, with a makefile to compile his LaTeX files.


Unrelated to Python, but is anyone aware of a tool like Make that can handle file path wildcards in prerequisites?

For example, let's say I have a rule that consumes *.json to produce some target. I want it to rebuild that target if any *.json files are added or removed (or modified).

As far as I'm aware, Make relies on file system timestamps, so even if it supported wildcards (I'm not sure if it does or not), it wouldn't be able to notice that foo.json has been deleted and a rebuild of the target is needed.

I thought I'd ask here before I go build such a tool (to support incremental rebuilds of a static site).

Edit to add that my particular use case has a non-prescriptive set of directories, nested a few levels deep. I'm actually realizing that that is probably a bigger hurdle for using something like Make (e.g. transform all *.md files, no matter how deep; but copy everything else, like images, verbatim; oh, and also then aggregate all *.md files into an Atom feed). Yes, I know this is asking a lot of something like Make!
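For what it's worth, the "transform every *.md, copy everything else" part is easy to sketch outside of Make. The following is only an illustration of that walk, not an incremental build tool: the function name `plan_site`, the action labels, and the choice of `.html` as the output suffix are all assumptions made for the example.

```python
from pathlib import Path


def plan_site(src_root, out_root):
    """Return (source, destination, action) for every file under src_root."""
    src_root, out_root = Path(src_root), Path(out_root)
    plan = []
    # rglob("*") descends into every subdirectory, however deep.
    for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
        rel = src.relative_to(src_root)
        if src.suffix == ".md":
            # Markdown is rendered; the .html suffix is an assumption.
            plan.append((src, out_root / rel.with_suffix(".html"), "render"))
        else:
            # Everything else (images etc.) is copied verbatim.
            plan.append((src, out_root / rel, "copy"))
    return plan
```

A timestamp or manifest comparison could then be layered on top of the plan to decide which entries actually need work.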


Did you try searching stack overflow? https://stackoverflow.com/a/43402649

If you dedicate a specific directory for all your JSON, then you can depend on that directory and it will do what you're asking for.


I did spend quite a bit of time researching approaches, but I didn't come across this particular idea. I'll give it a try. Thanks!

Note: in my case, I have a directory structure that's a few levels deep, with a non-prescriptive set of directories (one subdirectory per category, with no limit on the set of categories). Maybe Make handles directories better than I realized (I'd always seen it recommended to use a Makefile in each directory--something I want to avoid).


Here's one way to do this.

I have used this method (directory mod-time triggering, let's say) for a simulation-summarizer which analyzes whatever pile of simulation-output-files happen to be in a given directory. If you run a new simulation, the directory changes and the analysis is re-done by running "make".

I used the Gnu make $(wildcard ...) for the template expansion, instead of using shell expansion. This is to take care of the no-file case, so that jsons/*.json will expand to nothing rather than to the literal jsons/*.json (which does not exist).

  $ cat Makefile 
  file.out: jsons
         cat $(wildcard jsons/*.json) /dev/null > file.out
  
  
  $ ls -R
  Makefile  jsons/
  
  ./jsons:
  foo.json
  
  $ make
  cat jsons/foo.json /dev/null > file.out
  
  $ make   # no file mods => no-op
  make: `file.out' is up to date.  
  
  $ touch jsons/bar.json
  
  $ make   # new file => re-make
  cat jsons/bar.json jsons/foo.json /dev/null > file.out
  
  $ make     
  make: `file.out' is up to date.
  
  $ rm jsons/foo.json 
  
  $ make  # deletion => re-make
  cat jsons/bar.json /dev/null > file.out
  
  $ rm jsons/bar.json 
  
  $ make  # nothing there
  cat  /dev/null > file.out
  
  $ make
  make: `file.out' is up to date.
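The reason the session above works is that adding or removing an entry updates the modification time of the directory itself. A small Python sketch of the same check, assuming a typical file system where that holds; `is_stale` and `demo` are hypothetical names, and the timestamps are backdated with `os.utime` so the outcome does not depend on clock granularity.

```python
import os
import tempfile
import time


def is_stale(target, prereq_dir):
    """True if target is missing or older than the directory's own mtime."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(prereq_dir) > os.path.getmtime(target)


def demo():
    with tempfile.TemporaryDirectory() as root:
        jsons = os.path.join(root, "jsons")
        os.mkdir(jsons)
        foo = os.path.join(jsons, "foo.json")
        open(foo, "w").close()
        target = os.path.join(root, "file.out")
        open(target, "w").close()
        # Backdate the directory, and make the target newer than it.
        old = time.time() - 1000
        os.utime(jsons, (old, old))
        os.utime(target, (old + 500, old + 500))
        results = [is_stale(target, jsons)]      # nothing changed yet
        os.remove(foo)                           # deletion touches the directory
        results.append(is_stale(target, jsons))  # directory is now newer
        return results
```

Here `demo()` should return `[False, True]`. One caveat: editing an existing file in place generally does not touch the directory's mtime, so the files themselves still need to be prerequisites if modifications should trigger a rebuild too.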


Definitely avoid makefiles per directory, see Recursive Make Considered Harmful https://accu.org/journals/overload/14/71/miller_2004/

The approach being recommended in the sibling comment to this is quite nice!


Thanks to the folks who replied! Looks like I didn't dig far enough into GNU Make. It's got a lot of the functionality I'm looking for (and I just didn't realize it). Hopefully I don't need to use it, but it even has an "eval" function for dynamically generating its own input.


GNU make at least can do that.


Interesting! I had originally tried to do this with GNU Make (with the most naive approach you can imagine) and it wasn't able to notice that a prereq that matched the wildcard had been deleted (i.e. it thought the target was up to date, when I wanted it to rebuild).

I'll read up on the wildcard function and see if that is what I was looking for:

https://www.gnu.org/software/make/manual/html_node/Wildcard-...

Edit: A sibling comment also pointed out putting the wildcard'd files into a directory to help Make notice deletions. I'll give that a try first.


I am surprised nobody is mentioning go-task https://github.com/go-task/task.

It is a great project. I love the management of dotfiles, the inclusion of other Taskfiles, the namespacing, the fact it is an easily downloadable binary that runs the same on Linux, macOS AND Windows.

It is the superior option.


Another possibility is using the `scripts` in the `pyproject.toml`, as described here: https://python-poetry.org/docs/pyproject/#scripts


that’s generally not correct. entry points are how your package, once installed, is called from the command line; i wouldn’t wire up a docker image build step or twine publish that way.


Those two examples are the exact use-case that defining scripts in pyproject.toml are meant for. Users of my installed package would never need to run `twine publish` or build the project's docker image. That's only really needed by developers who would be working from the full project source including pyproject.toml.


entry points are installed to the interpreter site when you `pip install`: everyone gets them including your users.
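One way to check this on a given interpreter is to list the console-script entry points that installed packages have registered. This is a sketch using the standard library's `importlib.metadata`; the helper name `console_scripts` is made up, and the `hasattr` branch is there because the return type of `entry_points()` differs between Python versions.

```python
from importlib.metadata import entry_points


def console_scripts():
    """Return sorted (name, target) pairs for installed console scripts."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions: selectable entry point collection.
        selected = eps.select(group="console_scripts")
    else:
        # Older Python versions: a plain dict keyed by group name.
        selected = eps.get("console_scripts", [])
    return sorted((ep.name, ep.value) for ep in selected)
```

Anything declared as a script in a package's metadata shows up in this list for every environment the package is installed into, which is the concern being raised here.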


That's specifically for the Poetry package manager, not Python in general.


Neither is make. You’re using an external tool either way. The one that is purpose-built for the task, and doesn’t have a bunch of archaic footguns [1], will probably give a better experience.

1. https://stackoverflow.com/questions/17965806/how-do-i-handle....


pyproject.toml is for python in general. https://peps.python.org/pep-0621/ setup.py is legacy.


The GP comment was referring to the `tool.poetry.scripts` namespace in pyproject.toml.


I don't do much python anymore but recently, I decided to try moving from Makefile files to simple command-line python scripts for this type of thing. The python standard library has pretty much everything you need to replace `make` so there's no need for virtual environments and...

1. They're way more readable and intuitive

2. Cross-platform across Linux, Windows, macOS, even if you have to branch off to platform-specific commands inside the script

3. I'd much rather maintain commands using the python standard library over complex command-line things with `sed`, `awk`, etc
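As a rough idea of what such a script can look like, here is a standard-library-only "clean" task. It is a sketch, not anyone's actual setup: the function name `clean` and the `JUNK_DIRS` list are assumptions, and the list should be adjusted before pointing this at a real project, since it deletes every directory with a matching name.

```python
import shutil
from pathlib import Path

# Hypothetical list of directory names to delete; adjust to the project.
JUNK_DIRS = ("__pycache__", ".pytest_cache", "build", "dist")


def clean(project_root="."):
    """Remove common build/cache directories and return the paths removed."""
    removed = []
    root = Path(project_root)
    for name in JUNK_DIRS:
        # sorted() materializes the matches before anything is deleted.
        for path in sorted(root.rglob(name)):
            if path.is_dir():
                shutil.rmtree(path)
                removed.append(path)
    return removed
```

The same code runs unchanged on Linux, Windows, and macOS, which is the cross-platform point above.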


Makefile quickly gets complicated when you have to pass in flags (--an-example-flag).

e.g.

    dev:
        ./scripts/run-dev.sh $(if $(filter build,$(MAKECMDGOALS)),--build,) $(if $(filter restart,$(MAKECMDGOALS)),--restart,) ...etc

Since the project is in python it would be better to write a python script and start the tasks as subprocesses. This means you can use the python argparser you're already familiar with.

https://docs.python.org/3/library/subprocess.html
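A minimal sketch of that idea, reusing the `./scripts/run-dev.sh` path and the two flags from the Makefile snippet above. The function names and help texts are made up for the example, and the script path is assumed to exist in the project.

```python
import argparse
import subprocess
import sys


def build_command(argv):
    """Translate command-line flags into the command to run."""
    parser = argparse.ArgumentParser(prog="dev")
    parser.add_argument("--build", action="store_true", help="rebuild before starting")
    parser.add_argument("--restart", action="store_true", help="restart running services")
    args = parser.parse_args(argv)

    command = ["./scripts/run-dev.sh"]
    if args.build:
        command.append("--build")
    if args.restart:
        command.append("--restart")
    return command


def main(argv=None):
    command = build_command(sys.argv[1:] if argv is None else argv)
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(command, check=True)
```

Calling `main()` from the bottom of a `dev.py` file would then allow `python dev.py --build --restart`, with `--help` output generated by argparse for free.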


Or in short: using 3 languages (Make, shell interspersed between the Make, Python) is more complicated than using one language.


The `.sh` was not implied by the article. That was my addition but, yes, the fewer languages the better!


I’ve been doing this for decades now. Anything I do has “make deps”, “make serve”, etc. I don’t care that Make “wasn’t made for this”, it is ubiquitous and reliable at it.


Use Makefile if you want to make your cross-platform build environment Linux only. You will have all kinds of issues on Windows, Mac, *BSD, you name it.


You can resolve the differences between BSD and GNU make.

Not sure about Windows. I'd expect that WSL bundles GNU make, no?


Make is in POSIX, so stick to POSIX features and your Makefile is portable across all reasonable systems. For Windows, I guess you use a .bat file.


Is that more of an issue if you shell-out a lot in your Makefile? (I don't use mac or win)


If your make targets don't have dependencies you are probably using the wrong tool for the job.


Pipenv has script hooks as well.

https://pipenv.pypa.io/en/latest/scripts/

It also includes many other package.json features.


I would not recommend make for python projects as make does not play nicely with virtualenvs and the environment variables that power them. You can make it work, but you'll be fighting it constantly.


An easy way to use virtual environments with Make is to create a wrapper script to activate the virtual environment and run a command in its context:

    # venv.sh
    venv="$1"
    shift
    . "$venv/bin/activate"
    "$@"

Then in the Makefile I do:

    VENV := ./venv.sh ./venv

    install-reqs:
        ${VENV} pip install -r requirements.txt


That works but the whole point of the article was to avoid creating extra scripts, and now you have the problem that you need to document the extra required environment setup before running make or you get opaque errors.


What exactly are you doing that hits friction? You shouldn't be using "make" to "activate" your env - that should occur "outside" your makefile. It plays totally nice if you aren't trying to use make to set environmental state.


Are you specifically talking about virtualenv or the concept of venvs in general?

I've used make with conda, for example, for the last five years with no env issues at all, so some clarity or specific problem cases would help.


“never activate venvs in a headless context” is sage advice and not just for make.
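The usual alternative to activation is to call the environment's interpreter by path. A small sketch, assuming the standard `venv` directory layout (`bin/` on POSIX, `Scripts/` on Windows); the helper names `venv_python` and `run_in_venv` are hypothetical.

```python
import os
import subprocess
from pathlib import Path


def venv_python(venv_dir):
    """Path to the interpreter inside a virtual environment."""
    venv_dir = Path(venv_dir)
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def run_in_venv(venv_dir, *args):
    """Run `python <args>` with the venv's interpreter, without activating it."""
    return subprocess.run([str(venv_python(venv_dir)), *args], check=True)
```

For example, `run_in_venv("venv", "-m", "pip", "install", "-r", "requirements.txt")` installs into that environment regardless of which environment, if any, is active in the calling shell.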


How so? I have a Makefile in all of my Python projects. Never had issues.


Personally it's the interactions between the life cycle management of the virtual environments and bulletproofing for use by all team members. For any non-trivial Python project, I end up with at least 2 virtual environments, one "normal" and one for installing unit tests/linting/other static checkers. If someone pulls this project and goes to `make lint`, is the lint virtual environment extant, does it have all the required dependencies, and are those steps idempotent so as not to trigger a setup on every run? And then add on edge cases like, what if someone already had a virtual env active when you went to do the setup, what edge conditions does that introduce? Do you add something to your logic to detect extant virtual environments and deactivate them before trying other actions?

All of these problems are solvable of course, but at what level of effort and for what benefit? And you pay all these costs every time you want to change something, especially if you've got to explain all the hacks to someone new, and you always run the risk that it works for you but not in a clean environment.
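To give a sense of the bookkeeping involved, here is a sketch of the idempotency part only, using the standard `venv` module. It is one possible approach under stated assumptions, not a recommendation: the helper names and the `.installed` stamp file are invented for the example, and it does not address the "another environment is already active" case.

```python
import venv
from pathlib import Path


def ensure_venv(venv_dir, with_pip=True):
    """Create the environment only if missing; return True if it was created."""
    venv_dir = Path(venv_dir)
    # pyvenv.cfg is written by the venv module when an environment is created.
    if (venv_dir / "pyvenv.cfg").exists():
        return False
    venv.create(venv_dir, with_pip=with_pip)
    return True


def needs_install(venv_dir, requirements):
    """True if the requirements file changed since the last recorded install."""
    stamp = Path(venv_dir) / ".installed"  # hypothetical stamp file
    if not stamp.exists():
        return True
    return Path(requirements).stat().st_mtime > stamp.stat().st_mtime


def mark_installed(venv_dir):
    """Record that dependencies were just installed."""
    (Path(venv_dir) / ".installed").touch()
```

A `lint` task would call `ensure_venv`, install only when `needs_install` says so, then call `mark_installed`, so repeated runs do no setup work.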


For my side projects I use a bash script in the style of https://github.com/adriancooney/Taskfile


Don't. Use pydoit.org.


As someone not up to date on the Python world, how does your suggestion compare with something like Poetry?


poetry is specialized in managing venv and packages, with the options to run some commands.

doit is a task runner: you declare a name that groups several commands, and when you call the task with the name, it runs all the commands in order. It can accept bash commands, python functions, declare dependencies on other commands or on files, and cascade the tasks to follow those dependencies automatically.

It's make, but in Python: nice syntax, works on windows/mac/linux, easy things are easy, hard things are possible.
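For the curious, a small `dodo.py` sketch showing a shell command, a Python function as an action, and dependencies on files and on other tasks. The task names, commands, and file paths are placeholders invented for the example; only the dictionary keys (`actions`, `file_dep`, `targets`, `task_dep`) come from doit.

```python
# dodo.py: doit picks up functions whose names start with "task_".


def build_report():
    """A plain Python function used as an action."""
    with open("report.txt", "w") as fh:
        fh.write("ok\n")
    # Returning None counts as success.


def task_lint():
    """Shell command action; re-run when the listed file changes."""
    return {
        "actions": ["python -m flake8 src"],  # placeholder command
        "file_dep": ["src/app.py"],           # placeholder path
    }


def task_test():
    """Runs after lint because of task_dep."""
    return {
        "actions": ["python -m pytest"],
        "task_dep": ["lint"],
    }


def task_report():
    """Python-function action that produces a target file."""
    return {
        "actions": [build_report],
        "targets": ["report.txt"],
        "task_dep": ["test"],
    }
```

Running `doit report` in that directory would then run lint, test, and report in dependency order.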



