How We Deploy Python Code (nylas.com)
250 points by spang on July 9, 2015 | 125 comments



Back when I was doing Python deployments (~2009-2013) I was:

* Downloading any new dependencies to a cached folder on the server (this was before wheels had really taken off)

* Running pip install -r requirements.txt from that cached folder into a new virtual environment for that deployment (`/opt/company/app-name/YYYY-MM-DD-HH-MM-SS`)

* Switching a symlink (`/some/path/app-name`) to point at the latest virtual env.

* Running a graceful restart of Apache.

Fast, zero downtime deployments, multiple times a day, and if anything failed, the build simply didn't go out and I'd try again after fixing the issue. Rollbacks were also very easy (just switch the symlink back and restart Apache again).
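
A minimal sketch of that flow (the paths, app name and cache location are hypothetical):

    # create a fresh, timestamped virtualenv for this deploy
    DEPLOY=/opt/company/app-name/$(date +%Y-%m-%d-%H-%M-%S)
    virtualenv "$DEPLOY"

    # install only from the on-server package cache, never the network
    "$DEPLOY/bin/pip" install --no-index --find-links=/var/cache/company/packages -r requirements.txt

    # flip the symlink, then reload Apache gracefully
    ln -sfn "$DEPLOY" /some/path/app-name
    apachectl graceful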

These days the things I'd definitely change would be:

* Use a local PyPi rather than a per-server cache

* Use wheels wherever possible to avoid re-compilation on the servers.

Things I would consider:

* Packaging (deb / fat-package / docker) to avoid any extra per-machine work, plus easy promotions from one environment to the next.


I built a system that did something very like this at a previous employer. We got really quick, (mostly) atomic deployments which could be rolled back instantly with one command.

Even at the time I thought Docker would be a great solution to the problem, but the organization was vehemently against using modern tech to manage servers and deployments, so I ended up writing that tool in bash instead. Good times.


This is basically how we still do it (with nginx + gunicorn rather than apache). Wheels (and recent setuptools/pip versions that build and cache them automatically) just made the build step significantly faster (lxml, argh).

We're moving to the Docker approach, which is really nice, but it does change the shape of the whole deploy pipeline, so it's going to take some time.


This is exactly what I do.

>Use a local PyPi rather than a per-server cache

I still prefer a per-server cache. A local PyPI is another piece of infrastructure you need to keep alive. You don't have to worry about the uptime of an rsync playbook.


Their reasons for dismissing Docker are rather shallow, considering that it's pretty much the perfect solution to this problem.

Their first reason (not wanting to upgrade a kernel) is terrible considering that they'll eventually be upgrading it anyways.

Their second is slightly better, but it's really not that hard. There are plenty of hosted services for storing Docker images, not to mention that "there's a Dockerfile for that."

Their final reason (not wanting to learn and convert to a new infrastructure paradigm) is the most legitimate, but ultimately misguided. Moving to Docker doesn't have to be an all-or-nothing affair. You don't have to do random shuffling of containers and automated shipping of new images -- there are certainly benefits to going wholesale Docker, but it's by no means required. At the simplest level, you can just treat the Docker container as an app and run it as you normally would, with all your normal systems. (i.e. replace "python example.py" with "docker run example")


> (not wanting to upgrade a kernel) is terrible considering that they'll eventually be upgrading it anyways.

If they're running ubuntu 12.04 LTS they can keep the 3.2 kernel until late 2017. That's 2 more years. And they wrote "did not", so it was likely the situation months ago, not yesterday.

> (not wanting to learn and convert to a new infrastructure paradigm) is the most legitimate, but ultimately misguided

It depends on the amount of stuff they deploy. If they handle everything using Ansible (and from the list it looks like they do), then it's months of work to migrate to something else. They may need the right users / logging / secret management in the app itself, not outside of it.


> If they handle everything using Ansible (and from the list it looks like they do), then it's months of work to migrate to something else.

It's not. It would be months of work if they wanted to convert all their Ansible code to Docker, but that's by no means required.

Docker and Ansible can easily coexist peacefully.


They can. But depending on how you used Ansible before, it may mean a heavy rewrite of your deployment strategy. I'm not saying it will always take that long. But depending on your app, the requirements may be very complex and not fit into the docker idea.

(it always means some extra work for security updates though - now you're updating both the host and images)


    sudo apt-get install linux-generic-lts-quantal
Easy to upgrade.


If you can, it's easy. I wish everyone to work in environments where it's that simple ;)


Kernel interface churn is deliberately kept low. Test it first with additional VMs before throwing up your hands and exclaiming it can't be done; you might be surprised.


Highly recommend FPM for creating packages (deb, rpm, OS X .pkg, tar) from gems, Python modules, and PEAR packages.

https://github.com/jordansissel/fpm
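
A minimal sketch of both modes (the package names are just examples):

    # build a .deb directly from a Python module on PyPI
    fpm -s python -t deb requests

    # or package an arbitrary directory (e.g. a pre-built virtualenv) as a .deb
    fpm -s dir -t deb -n myapp -v 1.0.0 /opt/myapp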


Rather than such a blunt tool as fpm, if you're deploying only Python, look to something like py2deb: https://github.com/paylogic/py2deb


How does this compare with the "dh-virtualenv" tool recommended by the article?

Edit: found https://py2deb.readthedocs.org/en/latest/comparisons.html



fpm is cool - very cool. That said, it won't create proper spec files and friends, so it's harder to maintain the packages properly. They also usually don't get accepted upstream because of that.

That said, for Python files and simple packages it works well enough!


That seems like a neat tool. I wonder if you could combine it with the sandboxing that dh-virtualenv provides to get the best of both worlds?


I've used fpm to make rpm and deb packages that simply include a virtualenv, it works ok.

One of the significant tradeoffs to this approach is you lose the carefully-crafted tree-of-dependencies that the distros favor, so it makes the package pretty much automatically unacceptable to package maintainers.

However, being able to have install instructions that amount to "yum/apt-get install <package>" is pretty great.

I am hoping for an app/container convergence at some point, but we might need to drop the fine-grained dependency dream and have them be more self-contained, like Mac OS X apps.


FPM is written as an in-house solution only. It's not intended for making packages for official distro repositories that third-party users would pick up; they suggest you use the distro-specified methods for those.


We use fpm & something like dh-virtualenv along exactly those lines. It helps us manage a complex mix of native/system-level dependencies (non-python) as well as python packages.

We also incorporate a set of meta packages which means we can have multiple codebase versions installed and switch the "active" one by installing the right version of the meta-package. There's also meta-packages for each service running off the same codebase, which deals with starting/stopping/etc.


We do something similar at Embedly, except instead of dh-virtualenv we have our own homegrown solution. I wish I knew about dh-virtualenv before we created it.

Basically, what it comes down to is a build script that builds a deb with the virtualenv of your project, versioned properly (build number, git tag), along with any other files that need to be installed (think init scripts and some "about" file describing the build). It also should do things like create users for daemons. We also use it to enforce consistent package structure.

We use devpi to host our python libraries (as opposed to applications), reprepro to host our deb packages, standard python tools to build the virtualenv and fpm to package it all up into a deb.

All in all, the bash build script is 177 LoC and is driven by a standard build script we include in every application's repository, defining variables and optionally overriding build steps (if you've used portage...).

The most important thing is that you have a standard way to create Python libraries and applications, to reduce friction in starting new projects and getting them into production quickly.


We fixed that issue at Datadog by using Chef Omnibus:

https://www.datadoghq.com/blog/new-datadog-agent-omnibus-tic...

It's more complicated than the solution proposed by Nylas, but ultimately it gives you full control of the whole environment and ensures that you won't hit ANY dependency issue when shipping your code to weird systems.


At GitLab we use Chef Omnibus too and we love it. More than 100k organizations use GitLab with Omnibus and it has lowered our support effort enormously. https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/REA...


Looks neat. I think there's definitely a difference in the requirements for a deploy method when you can control the underlying systems (internal servers) vs have to make it work with just about any weird config users throw at you (your agent).


http://pythonwheels.com/ solves the problem of building C extensions on installation.
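
A minimal sketch of building wheels once and installing from a local wheelhouse (the path is hypothetical):

    # build wheels (including C extensions) on a build box
    pip wheel -r requirements.txt -w /srv/wheelhouse

    # install on the app servers with no compiler and no PyPI access
    pip install --no-index --find-links=/srv/wheelhouse -r requirements.txt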


Pair this with virtualenvs in separate directories (so that "rollback" is just an ssh mv and a reload for whatever supervisor process) and you get to skip the mess of building system packages.

Also, are there seriously places that don't run their own PyPI mirrors? Places that have people who understand how to integrate platform-specific packages but can't be bothered to deploy one of the several PyPI-in-a-box systems or pay for a hosted PyPI?


> Also, are there seriously places that don't run their own PyPI mirrors? Places that have people who understand how to integrate platform-specific packages but can't be bothered to deploy one of the several PyPI-in-a-box systems or pay for a hosted PyPI?

Yes. I've seen them, and they've been huge shops.


> you get to skip the mess of building system packages.

Only in cases where you don't have wheels depending on external libraries. If you do, you should still package with the right dependency constraints. Otherwise you can install a wheel which does not work (because of a missing .so).


This simply isn't true. You can package .so or .dll (or anything else) with your wheel. An example of doing it with a dll or so is here: http://stackoverflow.com/questions/24071491/how-can-i-make-a...


What you said does not disagree with me. You can have wheels depending on external dynamic libs. Or you can package them together. Both ways have good and bad sides, but if you leave them external, it's very useful to have the right dependencies.


Can you point me to your recommended PyPI-in-a-box system?


I have had good success with http://doc.devpi.net/latest/


That looks good. Might work exceptionally well for my needs. I want to have a PyPI mirror in an environment that doesn't have public internet access.

Running devpi in another environment and syncing the resulting repository should allow me to achieve what I want.


As a first step, what about trying pip's `-f` option, combined with dumping your wheels in a dumb folder served by apache/nginx:

    pip --help
    [...]
    Install Options:
      [...]
      -f, --find-links <url>      If a url or path to an html file, then parse for links to archives. If a local path or file:// url that's a directory, then look for archives in
                                  the directory listing.
EDIT: in the context of deploying some app at work, what's the benefit of a full-blown hosted cheeseshop? Users of these solutions, what value does it add over a simple `pip install -f INTERNAL_PKG_URL pkg_a==1.2.3 pkg_b==0.1.2`? Which features do you frequently use?


I currently use this one: https://localshop.readthedocs.org/en/latest/installing.html

It works. It's Django-based and you can set up S3-backed storage. It also has a docker-compose script.


We migrated off of localshop and onto devpi. Devpi is a much better product and much more actively maintained. localshop was nothing but headaches and constantly breaking.


Author here: I created it to solve an issue I was running into a couple of years ago. I've only recently started using it again myself. I think the development version (not on pypi) is in much better shape with things like multiple repositories and better user management (teams).


Maybe you use more esoteric features. The only thing I've done in the last 18 months is patch a bug that prevented uploads from Python 2.7.4-2.7.10. We just run it under circus with chaussette and front it with an elb.


localshop has significantly improved in the past few months. Before that, I could barely consider it production-ready.


Do I understand this correctly? It only mirrors the packages that are requested from it? So I won't need to download 100GB+ of packages that I am not interested in?


Correct. My team has two main use cases: private packages and guaranteed access to packages we've built with. It's extremely frustrating to come into a codebase after several months or years to find it using a library that no longer seems to exist on the public internet.


Yes, someone should build the one way to ship your app. No reason for everybody to be inventing this stuff over and over again.

Deploys are harder if you have a large codebase to ship. rsync works really well in those cases. It requires a bit of extra infrastructure, but is super fast.


Deployment is hard, system administration is hard. As a software developer you think there should be one way but there can't be one way. There never will be. If you want to make good software really spend some time learning the intricacies of packaging, deployment and sysadmin life. You make a lot of people happy by just knowing the problems, even if you can't apply the correct solution to each problem.

I come from the same island as you, trust me. But the more you learn about this, the more you see how complex it is. You can't even say that one solution is better than the other (like apt vs yum). Each and every one of them has its pros and cons. And more often than not, architectural decisions make it impossible to get both solutions into the same system working together.

rsync is not deploying. It's syncing files. But even if you have a 1:1 copy of your development computer on a server, it still might not work because on that server package xyz is still at version 1.4.3b and not 1.4.3c. Deployment is getting it there AND getting it to work together nicely and maintainably with the other things that run on that computer/vm.


+1 rsync is pretty darn good at any scale -- I'm not sure why the simplest solution possible doesn't beat out docker as a suggestion in this thread.

I've been bundling libs and software into a single virtualenv-like package that I distribute with rsync, and have been for a long time - it solves loads of problems, is easy to bootstrap a new system with, and incremental updates are super fast. Combine that with rsync distribution of your source and a good tool for automating all of it (ansible, salt, chef, puppet, et al) and you have a pretty fool-proof deployment system.

And a rollback is just a git revert and another push away -- no need to keep build artifacts lying around if you believe your build is deterministic.


Rsync is good for simple things. But it will fail with more complicated apps:

- how do you know which version you're running right now?

- how do you deploy to two environments where different deps are needed?

- how do you tell when your included dependencies need security patches?


rsync isn't the complete system - you're going to need git (or another vcs) and some other tools of course.

#1 is git (dump and log the git head on a deploy)

#2 don't do that - keep a single consistent environment

#3 use the system openssl - monitor other software components for security updates -- you need to do this anyway in any of these systems.


> #2 don't do that

I wish everyone to have easy deployments where environments, OS versions and everything else are always consistent. :)

> #3 monitor other software components for security updates -- you need to do this anyway in any of these systems.

Sure. But having multiple virtualenvs means you need to monitor all of them on all deployed hosts. Having everything packaged separately means you can do audits much more easily and without location-specific checks.


They have. Every mainstream OS/Distro has a packaging system and an installer for those packages.

For server-side apps like this, that usually means a Deb or an RPM. These systems handle upgrades, rollbacks, dependencies, etc.

Just because some people decide that writing an RPM specfile or running dh_make is too hard to work out, doesn't mean that the solution doesn't exist.


The fact that we had a weird combination of python and libraries took us towards Docker. And we have never looked back.

For someone trying to build Python deployment packages using deb, rpm, etc., I really recommend Docker.


They specifically called that out in the article with an entire section called "just use docker".


In the same vein as omni, the other reasons are "we don't want to learn a new technology, even though it was made exactly for this purpose" and "we don't want to learn how to install docker-registry."


if you have to choose between a kernel and docker, just choose docker. Python can't get their shit together deployment-wise, and docker is the one true route (tm) to python deployment happiness.

forget virtualenv; forget package dependencies on conflicting versions of libxml; forget coworkers that have 3 different conflicting versions of requests scattered through various services, and goddamnit I just want to run a dev build; forget coworkers that scribble droppings all over the filesystem, and assume certain services will never coexist on the same box

just use docker. It's going to go like this:

step 1: docker

step 2: happy


Ha. Wait until you need to run a build of shared Perl codebase against unit tests in all of the dependent codebases... but some of those codebases compile and run C (or C++) programs... and some of those codebases depend on conflicting versions of GCC!

"If we hit the bullseye, the rest of the dominos will fall like a house of cards... checkmate!" -- Zap Brannigan

> forget coworkers that scribble droppings all over the filesystem, and assume certain services will never coexist

I think this tends to be less of a problem than the desire to have a build artifact that can be reliably deployed to multiple servers, rather than having the "build" process and "deploy" process hopelessly intertwined with each other.


To be fair, what didn't work for Nylas might well not be an issue for others. There's definitely more than one way to skin a cat, especially in the Python world.


Seriously, their #1 reason for not using Docker amounts to "we couldn't be bothered to update our kernel." This reads to me more of an interesting story about how they chose to solve the problem rather than a serious recommendation of how things should be done.


I don't think that's a fair representation of why they don't want to update kernels, and it completely downplays that there were three other points mentioned, all of which added non-trivial overhead to getting stuff done now.

In an ideal world you build a system from the ground up, but rarely is that ever possible and the approach taken to iterate is far more valuable than your requested 'serious' recommendation.


Seriously, your #1 complaint about their reasons not to use Docker amounts to "I couldn't be bothered to read past the first dot point".


We love Docker where I work, but running it in production is a big challenge — there's no "just" using it. We build packages of Rails apps and dependencies quite happily. Sure, you have to make sure all dependencies are packaged too, but that's still easier than a full-on Docker roll-out.

Indeed, we actually use Docker to build packages. Blog post coming soon, maybe.


In a few months Cloud Foundry will natively support launching, placing, managing, wiring, routing, logging and servicing Dockerised application images.

In the meantime you can get a taste with Lattice[0].

[0] http://lattice.cf/


There was a whole paragraph in the article about why Docker didn't work for them.


they had 3 reasons:

one of which was just silly (kernel version -- are you living on that point release forever?)

one of which was valid (necessity to maintain method for distributing docker images), but probably dumb: you only get so many innovation points per company, and innovating on a problem docker just solves means you are supporting your in-house solution ad infinitum

and one of which definitely sounds painful (docker vs extant ansible playbooks)


The 3.2 kernel they mention is the standard kernel of Ubuntu 12.04. I have some docker machines of the same vintage, and it's as simple as installing a backported kernel from the official repos.

This being said, I'm using docker for packaging/deployment of a nodejs app on those machines, and I hate it. I'm about to strip it out and go for .debs. Docker brings a lot of baggage with it, and requires major restructuring of some infrastructure parts. As they say in the article, the changes required to bring docker in just to do packaging are way too heavy. And Docker also sucks for rollbacks, to be honest - their tagging system is downright terrible.

My advice is not to use Docker in a production environment unless you can articulate the specific pain points it will solve for you.


Agreed on the rollback part.

It is easy to pick the silliness of the kernel reason, but Docker is moving fast right now. They are still getting the basic building blocks in place, and the Docker in two years will look nothing like today's.

We use Docker quite a bit today, but it's immature and it shows. With Compose, I feel the basic functionality is finally in place; it needs time to mature.

So I think it's quite wise to wait. You don't need to chase every new technology. If you have a product to ship, focus on that instead and use whatever tools are proven to work.


    sudo apt-get install linux-generic-lts-quantal


We use a devpi server, and just push the new package version, including wheels built for our server environment, for distribution.

On the app end we just build a new virtualenv, and launch. If something fails, we switch back to the old virtualenv. This is managed by a simple fabric script.
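
Roughly, the devpi side of that looks like this (index URL, user, and formats are hypothetical and based on devpi-client usage):

    # point the client at the internal index and authenticate
    devpi use https://devpi.internal/company/prod
    devpi login deployer

    # build and upload an sdist plus a wheel built for the server environment
    devpi upload --formats=sdist,bdist_wheel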


We just commit our dependencies into our project repository in wheel format and install into a virtualenv on prod from that directory, eliminating PyPI. Though I don't know many others that do this. Do you?

Bitbucket and GitHub are reliable enough for how often we deploy that we aren't all that worried about downtime from those services. We could also pull from a dev's machine should the situation be that dire.

We have looked into Docker but that tool has a lot more growing to do before "I" would feel comfortable putting it into production. I would rather ship a packaged VM than Docker at this point; there are too many gotchas that we don't have time to figure out.


You put the wheels into a git repo? That's the saddest thing I've heard today. You know that if you add a file in commit A and remove it in commit B, each and every clone still pulls in that file? It's okay for text files but it's very much not okay for binaries and packages.


    git clone --depth=1 path/to/repo
when doing a clone for a deploy, since you don't need the history

edit: but yes, cloning as a developer will take a long time. But, if it really gets out of hand, I can hand new devs a HDD with the repo on it, and they can just pull recent changes. Not ideal, but pretty workable


We download to a folder on the Docker build server and build Docker containers from this cache.

see here: http://stackoverflow.com/a/29936384/138469


> curl “https://artifacts.nylas.net/sync-engine-3k48dls.deb” -o $temp ; dpkg -i $temp

It's really not hard to deploy a package repository. Either a "proper" one with a tool like `reprepro`, or a stripped one which is basically just .deb files in one directory. There's really no need for curl+dpkg. And a proper repository gives you dependency handling for free.
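
For the stripped-down variant, a minimal sketch (host and paths hypothetical; assumes dpkg-dev on the repo host):

    # on the repo host: regenerate the index whenever a .deb is added
    cd /srv/debs
    dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz

    # on each server: point apt at the flat directory (or sign the repo instead of trusted=yes)
    echo 'deb [trusted=yes] http://repo.internal/debs ./' > /etc/apt/sources.list.d/internal.list
    apt-get update && apt-get install sync-engine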


Yes, that part really surprised me. Barebones repos are useful enough, and there's also some pretty fancy tools out there like http://www.aptly.info/


Could you elaborate on the simple folder?

For example, I found the --instdir option to dpkg, but the package would still have to be downloaded from the other host, unless of course the folder was mounted somehow.


Search for Debian's "trivial archive". It replaces the release/component elements with an explicit path. It's deprecated now, but I believe it still works.


Note that the base path /usr/share/python (that dh-virtualenv ships with) is a bad choice; see https://github.com/spotify/dh-virtualenv/issues/82 for a discussion.

You can set a different base path in debian/rules with export DH_VIRTUALENV_INSTALL_ROOT=/your/path/here
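
A minimal debian/rules along those lines, roughly following the dh-virtualenv docs (the install root shown is just an example):

    #!/usr/bin/make -f
    # ship the virtualenv under a custom root instead of /usr/share/python
    export DH_VIRTUALENV_INSTALL_ROOT=/opt/company

    %:
    	dh $@ --with python-virtualenv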


"Distributing Docker images within a private network also requires a separate service which we would need to configure, test, and maintain." What does this mean? Setting up a private docker registry is trivial at best and having it deploy on remote servers via chef, puppet; hell even fabric should do the job.


It's not necessarily true either; it's not difficult to have your continuous build process build images from the Dockerfile, run tests, swap green and blue, etc...


No No No No! Or maybe?

Do people really do that? Git pull their own projects onto the production servers? I spent a lot of time putting all my code in versioned wheels for deployment, even if I'm the only coder and the only user. Deployment and development are, and should be, two different worlds.


I recently created vdist (https://vdist.readthedocs.org/en/latest/ - https://github.com/objectified/vdist) for doing similar things - the exception being that it uses Docker to actually build the OS package on. vdist uses FPM under the hood, and (currently) lets you build both deb and rpm packages. It also packs up a complete virtualenv, and installs the build-time OS dependencies on the Docker machine it builds on, when needed. The runtime dependencies are made into dependencies of the resulting package.


I've had decent results using a combination of bamboo, maven, conda, and pip. Granted, most of our ecosystem is Java. Tagging a python package along as a maven artifact probably isn't the most natural thing to do otherwise.


Unfortunately, this method seems like it would only work for libraries, or things that can easily be packaged as libraries. It wouldn't work that well for a web application, for example, especially since the typical Django application usually involves multiple services, different settings per machine, etc.


> different settings per machine

/etc/default/mycoolapp.conf

Debian packages have the concept of 'config' files. Files will be automatically overwritten when installing a new version of package FOO, unless they're marked as config files in the .deb manifest. This allows you to have a set of sane defaults, but not to lose customisations when upgrading.
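
For a hand-rolled package, a minimal sketch of marking that file as a conffile (package name and paths hypothetical; a DEBIAN/control file is also required):

    # ship the default config inside the package tree
    mkdir -p pkgroot/etc/default pkgroot/DEBIAN
    cp mycoolapp.conf.defaults pkgroot/etc/default/mycoolapp.conf

    # list it in DEBIAN/conffiles so dpkg preserves local edits on upgrade
    echo /etc/default/mycoolapp.conf > pkgroot/DEBIAN/conffiles

    dpkg-deb --build pkgroot mycoolapp_1.0.0_all.deb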


Just wanted to +1 this. There are literally decades of convention built around patterns where you ship a standard config file which merges in system/user/instance-specific settings from a known location, command-line argument, environment, etc. The Debian world in particular has led the community for decades with the use of debconf to store values such as a hostname or server role which can automatically be re-applied when otherwise unmodified files are modified upstream.

When I used this approach with a Django site years ago using RPM[1], we used the pattern vacri mentioned, or the reverse one where you have an Apache virtualhost file which contains system-specific settings (hostname, SSL certs, log file name, etc.) and simply included the generic settings shipped in the RPM.

In either case the system-specific information can be set by hand (this was a .gov server…), managed with your favorite deployment / config tool, etc. and allows you to use the same signed, bit-for-bit identical package on testing, staging, and production with complete assurance that the only differences were intentional. This was really nice when you wanted to hand things off to a different group rather than having the dev team include the sysadmins.

1. http://chris.improbable.org/2009/10/16/deploying-django-site...


I use a similar method to ship app.embed.ly (emberapp on top of django). You can package whatever you want in a deb. The virtualenv is just a part of it. Config files are managed with a configuration management system (chef in our case). Our django settings.py file just tries to import from /etc/blahblah


Nah, the approach is sound. We did this for a Flask app that used celery, redis, etc and we were happy with it. For the software, use a .deb, for the configuration, use a configuration management tool like Ansible.


Here is the process I use for smallish services -

1. Create a python package using setup.py

2. Upload the resulting .tar.gz file to a central location

3. Download to prod nodes and run pip3 install <packagename>.tar.gz

Rolling back is pretty simple - pip3 uninstall the current version and re-install the old version.
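
A minimal sketch of that flow, with hypothetical names and an S3 bucket standing in for the central location:

    # 1. build the package
    python3 setup.py sdist            # -> dist/mysvc-1.2.0.tar.gz

    # 2. upload to a central location
    aws s3 cp dist/mysvc-1.2.0.tar.gz s3://my-artifacts/mysvc/

    # 3. on the prod node: fetch and install
    aws s3 cp s3://my-artifacts/mysvc/mysvc-1.2.0.tar.gz /tmp/
    pip3 install /tmp/mysvc-1.2.0.tar.gz

    # rollback: uninstall and re-install the previous version
    pip3 uninstall -y mysvc
    pip3 install /tmp/mysvc-1.1.0.tar.gz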

Any gotchas with this process?


If you are using it for small services it's probably fine. But the original article did say that uninstall sometimes doesn't work correctly. Apt is more formal than pip.

So at some point, as you know, you'll need to move on.


You have to do this every time there's a change in the codebase, which is not easy. How do you stick this into CI without the git & pip issues talked about in the post?


I have to do this every time I have to deploy, which is similar to having to create a deb package every time Nylas has to deploy.

There are no git dependencies in the process I describe above.

The pip drawback discussed in the post is PyPI going down. In the process described above there is no PyPI dependency. Storing the .tar.gz package in a central location is similar to Nylas storing their deb package on S3.


Are you using a venv?


Nope.


If you did, it would probably strengthen the isolation of your modules from conflicts or, say, uninstallation errors. Whether that's needed is up to you.


For installs using .deb files, how are db migrations handled? Our deployment system handles running Django migrations by deploying to a new folder/virtualenv, running the migrations, then switching over symlinks.

I vaguely remember .deb files having install scripts; is that what one would use?


Depends on your environment, number of hosts, etc. really. You probably don't want to stick it into the same install script because:

- your app user doesn't need rights to modify the schema

- you need to handle concurrency of schema upgrades (what if two hosts upgrade at the same time?)

- if your migration fails it may leave you in a weird installation state and not restart the service

Ideal solution: deploy code which can cope with both pre-migration and post-migration schema -> upgrade schema -> deploy code with new features.


For e.g. changing the format of a column, that's easy enough, but it's tricky to create that intermediate state at the migration level for every migration. One option is to deploy the migration code without restarting the running services (or to a different box), roll back the code if the migration failed, and restart the services to pick up the new code if it succeeded. This still means not writing migrations that actively break the running version, though - if you're using database reflection, everything will go boom when the schema changes.


Depends on your infra. If it's a single server with the app + Db (or a single app server + single DB server) you could have a postinst script that calls your app/framework's migration system.

If your migration system is smart enough (or you can easily check the migration status from a shell script) you could also do this in a multi-app-server environment too.
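
A minimal postinst sketch for the single-server case, assuming a Django app whose virtualenv is installed at a hypothetical /opt/mycoolapp:

    #!/bin/sh
    # DEBIAN/postinst -- runs after the package files are unpacked
    set -e

    if [ "$1" = "configure" ]; then
        # apply migrations with the packaged interpreter
        /opt/mycoolapp/bin/python /opt/mycoolapp/manage.py migrate --noinput
        # restart the service so it picks up the new code
        service mycoolapp restart || true
    fi

    exit 0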


Weirdly, I am re-starting an old project doing this venv/dpkg approach (http://pyholodeck.mikadosoftware.com). The fact that it's still a painful problem means I am not wasting my time :-)


> Building with dh-virtualenv simply creates a debian package that includes a virtualenv, along with any dependencies listed in the requirements.txt file.

So how is this solving the first issue? If PyPI or the Git server is down, this is exactly like the git & pip option.


You need those things up to build the package, not to install it.


Ah I misunderstood the article. I just package my application source in a tar during deployment. I thought that's what most people do.


Using a native package gives so much more power - you define that the package relies on python, and maybe a mysql or postgres client, redis, whatever it needs, and then just install the one package, and let apt/dpkg handle the dependencies.
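
For example, the binary package's control file might declare something like this (package name and dependency choices are hypothetical):

    Package: mycoolapp
    Version: 1.0.0
    Architecture: amd64
    Maintainer: Ops Team <ops@example.com>
    Depends: python (>= 2.7), libpq5, redis-server
    Description: mycoolapp, shipped with its own virtualenv
     Installing this one package pulls in the runtime dependencies via apt.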

I'm a big fan of using the config-package-dev package from DebAthena to build config packages, which allow for about 99.9% of Debian server setup to be defined in Debian packages.


Great article. I had never heard of dh-virtualenv but will be looking into it.

How has your experience with Ansible been so far? I have dabbled with it but haven't taken the plunge yet. Curious how it has been working out for you all.


Ansible works well for us, although we use it in a somewhat different way than most folks. We previously wrote about our approach here, if you're curious: https://nylas.com/blog/graduating-past-playbooks


Thanks for the article. Well written and a very interesting concept.


Seems this method wouldn't work as well if you have external clients you deploy for. I'd use Docker instead of doing this, just to be in a better position for an internal or external client deployment.


If you took this a step further and set up a debian repo, then you could have your clients use that debian repo.

I'm looking to do something pretty similar, but with RPMs. I found rpmvenv, which seems to work in the same fashion. https://pypi.python.org/pypi/rpmvenv/0.3.1


Exactly this.

If a company wants to use Docker that's their choice, but I don't think it's at all reasonable to insist on or only support that environment as a software vendor. If it works on Debian, give me a .deb or, even better, an apt repo to use.


Why? Saying it runs on Debian X is much easier for your end users to handle than "requires Docker".


conda works pretty well.


Agreed (although biased since I used to work at Continuum.) I am wondering what others think of conda?


We have used Conda for our first Python deployment and the process has been seamless. It provides the same sandboxing concept as virtualenvs and also uses prebuilt binaries for native dependencies so you don't have to build them every time. The only drawback I would say is that we have to install Miniconda on our production servers rather than just deploying a standalone package.


My team has been rolling our own conda packages for (frequent) internal software releases to local servers and have been pretty happy overall pulling down code from a locally managed conda package repo.

With that said, Conda is not a perfect solution. One thing that can be frustrating is that a package can include compiled code (shared objects/dylibs) that may be incompatible with your system. Unfortunately, while you can indicate dependencies on other conda packages, python versions, etc there isn't currently a convenient way to indicate things like GLIBC dependencies.


I use conda as well and found it to be great. I love how it detects and manages dependencies for you when you install a new module.


What was wrong with Python packaging that conda needed to replace it entirely (including virtualenv)? If you mean the installation story for the scipy stack, that is less a matter of Python packaging and more a matter of the scipy stack.


Love it. I don't hear much about it on HN, but I personally really like it.


I like it too, but I had issues when I had to compile packages and other stuff. In particular, its version of qmake can interfere in unpredictable ways.


Here's how I deploy python code:

    cf push some-python-app
So far it's worked pretty well.

Works for Ruby, Java, Node, PHP and Go as well.


That's interesting. I've tried to google the cf CLI, however I wasn't able to find good documentation. Is it possible to install the cf CLI on my server? Or is it a Cloud Foundry-only tool?


The cf CLI tool interacts with a Cloud Foundry installation.

You'd use it for one in your own data centre, or Pivotal Web Services[0], or BlueMix. You point it at an API and login, then off you go.

If you need something more cut-down to play with, Lattice[1] is nifty, but currently doesn't do buildpack magic.

[0] https://run.pivotal.io/

[1] http://lattice.cf/


I see your issue of complexity. Glad I haven't ever reached the point where some good git hooks no longer work.


Does anyone have experience with PEX?


> The state of the art seems to be ”run git pull and pray”

No, the state of the art where I'm handling deployment is "run 'git push' to a test repo where a post-update hook runs a series of tests and if those tests pass it pushes to the production repo where a similar hook does any required additional operation".


Git deployments work great if you're packing an image (AMI, Docker) using, say, Packer. But we only deploy "immutable" images, not code on to existing servers.


> The state of the art seems to be ”run git pull and pray”

Looks like these guys never heard of things like CI.


You had to read further.

> This is the core of how we deploy code at Nylas. Our continuous integration server (Jenkins) runs dh-virtualenv to build the package, and uses Python's wheel cache to avoid re-building dependencies.


Or release branches/tags.



