We regularly update and improve our upgrades and sustainability service offerings at Caktus, and a recent upgrade for a client produced a solution I felt was worth sharing. Our preferred approach to upgrades and sustainability is to make incremental updates to a project over time, keeping both Django and the servers themselves on a long-term support (LTS) version. These are select versions of Django and Ubuntu, for example, with much longer support periods than other releases, which makes them a good fit for applications you'll need to continue maintaining well into the future.
We continue to host large projects for our customers that we began as early as 2010, and we've applied this methodology throughout the past decade to keep systems updated and mitigate risk. One such project currently includes upwards of 40 servers (across several environments), all provisioned automatically with a tool built for that purpose. This client also requires the systems to run on physical hardware that we help manage, rather than in the cloud.
Upgrading Python but Not the OS
Recently we wanted to upgrade the version of Python used for this application, but we weren't yet ready to upgrade the operating system (there's plenty of time left in its support cycle). At the same time, we didn't want to tie ourselves to third-party Ubuntu packages that come with no guarantee of timely updates in the event of a security issue.
So the question arose:
"How could we get a supported version of Python on a supported version of Ubuntu that is not the default Python version included in Ubuntu?"
Using Docker with supervisord
Enter Docker + supervisord. If you Google these two terms, you'll see many how-tos on running supervisord inside a Docker container, which lets you run more than one process in a single container. That approach is generally not recommended, and it's not what I want to discuss in this post.
There is another, seemingly less common way to pair these two: using supervisord on the host OS to run multiple Docker containers, much like you would have used supervisord to run Gunicorn, uWSGI, Celery, and other such processes directly from a Python virtual environment in years past. I should be clear: I don't recommend this approach except on an existing project that already uses supervisord, where you need to further isolate some of the processes it runs from the host operating system. If you're starting from scratch, Kubernetes or docker-compose is probably a better fit. But this is a handy method for gradually moving older, "pre-Docker" projects into the new world with minimal disruption and risk.
Updating the Deployment
After some experimentation, we found that running Docker containers (instead of Python processes directly) via supervisord actually works quite well, with a few changes to our deploy process.
Instead of building a virtual environment on the server and installing requirements directly, you can build a Docker image (locally on the server as well). You could opt to push it to a dedicated registry, but we decided to save that for a later step. Here's a command you may find helpful to build and tag a Docker image:
docker build --pull -t my_project:latest -t my_project:$(git rev-parse --short HEAD) .
This builds a Docker image and tags it both as latest and with a tag equal to the short commit SHA of your Git repo.
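If you'd like to verify the build, listing the images for the repository should show both tags pointing at the same image ID:

docker image ls my_project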
Next, you'll need to populate an env file with the variables necessary for your container to run. Our deployment stack uses Jinja2 templates already, so the template we came up with looks like this:
{# IMPORTANT: This is a Docker env file and cannot include 'export' nor quotes around variable values. #}
DJANGO_SETTINGS_MODULE={{ settings }}
{% for key, val in django_secrets.items() -%}
{{ key }}={{ val }}
{% endfor -%}
{% for key, val in django_env.items() -%}
{{ key }}={{ val }}
{% endfor -%}
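Rendered, the env file ends up looking something like this sketch (the variable names and values below are placeholders, not our client's actual configuration). Note that, per the comment at the top of the template, there are no export statements and no quotes around the values:

DJANGO_SETTINGS_MODULE=my_project.settings.production
SECRET_KEY=not-a-real-secret
DATABASE_URL=postgres://user:password@db-host:5432/my_project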
Finally, you'll want to create a shell script called docker_run.sh (or another name of your choosing) that you can use to call docker run with this env file, for the currently-tagged release, in a way that (mostly) mimics how you might have run a Python process directly in the past:
#!/bin/sh
exec /usr/bin/docker run --init --rm -i \
    --env-file={{ secrets_env_file_path }} \
    --network=host \
    --mount type=bind,source={{ public_dir_path }},target=/public/ \
    ${DOCKER_RUN_ARGS} \
    my_project:{{ current_git_sha }} "$@"
Let's break this down:
- exec ensures that the docker run process takes over the PID of this script.
- --init causes Docker to run the docker-init process, which can help with signal handling inside the container.
- --rm removes the container on exit (especially useful for short-lived commands).
- --env-file simply points docker to the file we created in the prior step.
- --network=host means the container will not have its own networking stack, so processes can listen on ports on the host just like they could previously, and the container also has access to any /etc/hosts customizations. (This is optional and we intend to remove it later, but may be helpful during the first step.)
- --mount mounts a directory (in our case, the one including static files and uploaded media) inside the container. (Again, this is optional, but may help simplify the migration if you haven't yet or cannot move static and uploaded media to an object store like S3.)
- ${DOCKER_RUN_ARGS} (if set) lets you pass extra arguments to the underlying docker run command.
- my_project:{{ current_git_sha }} ensures that we run the same version of the code that we built in the prior step. (Tagging and running containers like this is a Docker best practice, and easier to get into the habit of sooner rather than later.)
- "$@" passes all the arguments given to this script through to the container as the command to run. (Quoting it preserves arguments that contain spaces.)
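With the wrapper in place, one-off commands run through the container much like they ran from the virtual environment before. For example (the paths and commands here are illustrative; adjust them to your own layout):

/path/to/my_project/docker_run.sh python manage.py migrate
DOCKER_RUN_ARGS="-t" /path/to/my_project/docker_run.sh python manage.py shell

The second example uses DOCKER_RUN_ARGS to add a pseudo-TTY (docker run -t) so the interactive shell behaves normally; the -i flag is already set in the script.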
In our case, once we had these three things in place, the upgrade to Docker (and a newer version of Python) simply involved finding any calls to Python/virtualenv processes in our deployment infrastructure, and prefixing them with our docker_run.sh script. For example, here's what our supervisord config for our gunicorn process looks like:
[program:gunicorn]
process_name=%(program_name)s
command=/path/to/my_project/docker_run.sh gunicorn my_project.wsgi:application --bind=0.0.0.0:8000 --workers=2
user=root                       ; will be dropped by container
autostart=true
autorestart=true
stdout_logfile=/path/to/log/%(program_name)s.log
redirect_stderr=true
startsecs=1
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs=60
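The same treatment applies to any other processes supervisord was already managing, such as Celery workers. Here's a sketch of what that might look like; the app name, worker options, and log path are assumptions that will differ per project:

[program:celery]
process_name=%(program_name)s
command=/path/to/my_project/docker_run.sh celery -A my_project worker --loglevel=INFO
user=root                       ; will be dropped by container
autostart=true
autorestart=true
stdout_logfile=/path/to/log/%(program_name)s.log
redirect_stderr=true
startsecs=1
; Celery workers may also need time to finish in-flight tasks at shutdown.
stopwaitsecs=60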
Don’t Start from Scratch
Even if this isn't an exact recipe for your project, I hope it has helped you think of new ways to upgrade and maintain older systems, bringing them gradually into the future rather than making, as Joel Spolsky put it, "the worst strategic mistake that any company can make: ... rewrit[ing] the code from scratch."
If you liked this post and don't have a Dockerfile yet, you may wish to refer to my production-ready Django Dockerfile post.
Feel free to comment below or contact us with any upgrade or sustainability challenges you're encountering. I look forward to your feedback!