Make Butler server docker builds reproducible

Description

Before going to production with the Butler server, we need to ensure that we can reproduce a past build of its Docker image to allow for emergency fixes and backports to the production version of the service.

Currently the Docker build pulls in the latest versions of its dependencies from PyPI at build time, so an unrelated upstream release could break a rebuild of the old code at any time.

The easiest and most obvious approach would be to pin exact versions of the dependencies in requirements.txt. SQuaRE's safir FastAPI template has a bit of code for doing this in its Makefile (https://github.com/lsst/templates/blob/main/project_templates/fastapi_safir_app/example/Makefile).
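
For reference, the template's approach boils down to pip-compile from pip-tools; a minimal sketch of the idea (file names are illustrative, not copied verbatim from that Makefile):

    # Regenerate a fully pinned, hashed requirements file from the loose top-level list
    pip install --upgrade pip-tools
    pip-compile --upgrade --generate-hashes --output-file requirements.txt requirements.in

    # The Docker build then installs exactly those versions
    pip install --require-hashes -r requirements.txt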

However, that gets a bit awkward because the integration testing on Jenkins installs its dependencies from conda-forge, ignoring requirements.txt. I'm not sure whether it's possible, or makes sense, to install the dependencies as a subset of the rubin-env conda environment, but that would ensure that the code that is integration tested is the code that is deployed.
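
If installing on top of rubin-env turned out to be feasible, it might look roughly like this (purely a sketch; rubin-env is the conda-forge metapackage, the environment name is hypothetical):

    # Create the environment from the same metapackage Jenkins uses
    mamba create -y -n butler-server -c conda-forge rubin-env
    mamba activate butler-server

    # Install the server itself without letting pip replace conda-provided packages
    pip install --no-deps -e .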

The current GitHub Actions workflows for running unit tests also use a different set of dependencies (some from conda-forge and some from PyPI). At a minimum, the unit tests and the Docker build should probably use the same versions of the dependencies.

Currently the Docker build uses a Python/Debian base image simply because that is what the safir FastAPI template uses. It may make sense to switch to micromamba instead, to make it easier to share package versions between Jenkins, Docker, and GitHub Actions (https://hub.docker.com/r/mambaorg/micromamba).
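
The usual pattern for that image is roughly as follows (the tag and environment file name are illustrative); reproducibility would then come from pinning exact versions in the environment file, or generating it with a lock tool such as conda-lock:

    FROM mambaorg/micromamba:latest
    COPY --chown=$MAMBA_USER:$MAMBA_USER environment.yml /tmp/environment.yml
    RUN micromamba install -y -n base -f /tmp/environment.yml && \
        micromamba clean --all --yes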


Activity

Tim Jenness 
January 23, 2024 at 10:49 PM

Looks okay. If possible I'd like the test restored for "move".

Maybe also add a worked example outlining the process for dealing with a ticket that needs a change in resources or something.

David Irving 
January 23, 2024 at 6:44 PM
(edited)

Pinned the versions of the Python dependencies for server container builds, so that we can reproduce a build with the same versions if necessary. This will allow us to backport fixes to production builds with less risk of breakage. It uses the same pip-compile tooling used by SQRE repos. I decided not to mess with Conda for the Docker container; we are still using the same Debian-based Python base image as SQRE services.

Because GitHub Actions now runs with a different set of dependencies than the Docker container, we also run the unit tests inside the Docker container to ensure that everything works with the pinned dependencies. To make this work, I modified the unit tests so that they no longer create temporary files in the source tree.
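
Locally that amounts to something like the following (the image name and test command are illustrative, assuming the pinned test dependencies are installed in the image):

    docker build -t butler-server:test .
    docker run --rm butler-server:test pytest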

This change also revealed that the resources[s3] and resources[https] extras were pulling in the test-only packages moto and responses, so I went ahead and fixed that.

Tim Jenness 
November 21, 2023 at 9:15 PM
(edited)

The motivation for moving to conda in the GitHub Action was to give us a newer sqlite than we could get from the Ubuntu images. In theory we could try to go back to pip and apt-get.

Done

Details

Assignee

Reporter

Labels

Reviewers

Tim Jenness

Story Points

RubinTeam

Components

Checklist

Created November 2, 2023 at 9:47 PM
Updated January 24, 2024 at 4:39 PM
Resolved January 24, 2024 at 4:39 PM