feat: Introduce a new Docker environment for development #5355
base: main
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff            @@
##             main    #5355   +/-   ##
=========================================
  Coverage   97.61%   97.61%
=========================================
  Files        1237     1237
  Lines       42957    42971     +14
=========================================
+ Hits        41934    41948     +14
  Misses       1023     1023

☔ View full report in Codecov by Sentry.
I approve the general idea of making life easier for contributors who prefer Docker-based development environments.
That said, I'm not at all happy about the prospect of maintaining:
- A new Dockerfile, completely independent from the current one, making the build toolchain for the backend un-DRY again.
- A new task runner separate from the existing Makefile — confusing to contributors and probably redundant.
- Yet another Docker Compose file (though I appreciate it's probably intended to replace one or more of the existing ones).
Overall, I feel that, at this moment, this PR introduces more new boilerplate aimed at solving one specific use case than I can tolerate.
I suggest we take a step back and align on the initial intent. I'm curious to learn about every roadblock you've encountered when setting up your local dev env, and how throwing Docker into the equation solves them.
If we decide the use case warrants adding new tooling, I will insist on keeping our build toolchain as DRY and as standardised as possible — i.e., introducing a new target stage into the original Dockerfile, integrating Compose with our Make targets, exploring established solutions like devcontainers, devenv.sh, etc.
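For illustration, here is a minimal sketch of what such a target stage could look like. The stage names, base image, and commands are assumptions for the sake of the example, not the repo's actual Dockerfile:

# Hypothetical sketch: reuse the existing dependency layers and add a dev-only stage.
# "build-python" stands in for whatever stage the root Dockerfile already defines.
FROM python:3.12-slim AS build-python
WORKDIR /app
COPY api/pyproject.toml api/poetry.lock ./
RUN pip install -q poetry && \
    poetry config virtualenvs.create false && \
    poetry install --no-root --without dev

# Dev stage layered on top of the shared base: adds dev dependencies and source.
FROM build-python AS development
RUN poetry install --no-root --with dev
COPY api/ .
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

Contributors could then build the dev image with docker build --target development, while production builds keep targeting the existing final stage.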
# NOTE: Why not Alpine? Because Alpine demands installation of OS software
# prior to running app code, which leads to maintenance work and a larger image
# size compared to Debian — considering Debian layers are shared across images.
I don't understand how having to maintain a new Dockerfile is a viable trade-off here. I'd prefer a unified build toolchain over a neater dev container image; I'd rather this file be a new target stage in the existing root Dockerfile.
Agreed! For now this Dockerfile is only experimental. It aligns nicely with the proposal but it would have a long way to go before becoming an actual thing.
# Install Python packages
WORKDIR /tmp
ARG POETRY_VERSION_CONSTRAINT
ARG INSTALL_DEV_DEPENDENCIES
Why would we want to not install dev dependencies in a development container?
Good question! Being able to skip dev dependencies lets us reuse the image setup in other contexts, e.g. production. But then again, this Dockerfile is only experimental.
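For context, a hypothetical illustration of that reuse (the build commands here are examples, not part of this PR): the same image definition can serve both contexts just by flipping the build arg.

# Sketch: one build arg decides whether dev extras end up in the image.
ARG INSTALL_DEV_DEPENDENCIES=true
RUN poetry install --no-root \
    $([ "$INSTALL_DEV_DEPENDENCIES" != "true" ] && echo "--without dev")
# Local development (default):  docker build -f api/Dockerfile .
# Production-style build:       docker build -f api/Dockerfile --build-arg INSTALL_DEV_DEPENDENCIES=false .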
# NOTE: Forcefully delete optional private dependencies for now since
# Poetry needs to fetch them in order to calculate the dependency tree. See
# https://github.com/python-poetry/poetry/issues/4562
# TODO: Improve management of enterprise features
This is, at least in part, being tracked in #4764 (comment).
api/Dockerfile (outdated)
COPY api/pyproject.toml api/poetry.lock ./
RUN \
pip install -q poetry${POETRY_VERSION_CONSTRAINT:-} &&\
poetry config virtualenvs.create false &&\
poetry config cache-dir /var/cache/pypoetry &&\
###
# NOTE: Forcefully delete optional private dependencies for now since
# Poetry needs to fetch them in order to calculate the dependency tree. See
# https://github.com/python-poetry/poetry/issues/4562
# TODO: Improve management of enterprise features
sed -ie '/tool.poetry.group.auth-controller/,+1d' pyproject.toml &&\
sed -ie '/tool.poetry.group.saml/,+1d' pyproject.toml &&\
sed -ie '/tool.poetry.group.ldap/,+1d' pyproject.toml &&\
sed -ie '/tool.poetry.group.workflows/,+1d' pyproject.toml &&\
sed -ie '/tool.poetry.group.licensing/,+1d' pyproject.toml &&\
poetry lock &&\
###
poetry install --no-root $([ "$INSTALL_DEV_DEPENDENCIES" != "true" ] && echo "--without dev") &&\
rm -rf /var/cache/pypoetry &&\
rm -rf /tmp/*
Private dependency problem aside (which, IMO, should not be solved here), why isn't this `make install` with `POETRY_CACHE_DIR` set to a mounted path?
Yeah, I'd rather use #4764 to improve private dependency management. This `sed` trickery only exists to support this PoC.

> why isn't this make install with POETRY_CACHE_DIR set to a mounted path?
Another good question. I can list a few reasons not to use `make install` within the Docker build:
- IMO, `make` is better suited for people interacting with the shell directly. Especially, assuming `make` commands are going to run Docker underneath [if we eventually decide in favor of it], I'd rather not introduce inconsistency by having some commands that are intended to run from within a container instead of from the outside.
- Currently, `make install` will: 1. install pip, 2. install Poetry in a venv, then finally 3. install Python packages. `pip` is already available in the Python Docker image, so there's no need to reinstall it.
- The Poetry install script will set up a venv and install Poetry. A Docker env gets no benefit out of Python virtualenvs, or really anything else this script does besides the final `pip install poetry`.
- Though currently a necessary step to set up the local env, `make install` could be disregarded entirely thanks to `./run` — or Docker Compose, really.
- Finally, `RUN --mount` to improve build efficiency sounds like a great idea! We'd have to explore that (see the sketch below).
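For reference, a rough sketch of that last point using a BuildKit cache mount (the cache path is the one this Dockerfile already configures; everything else is assumed):

# syntax=docker/dockerfile:1
# Sketch: keep Poetry's download cache in a BuildKit cache mount so rebuilds
# reuse previously fetched packages without baking the cache into an image layer.
COPY api/pyproject.toml api/poetry.lock ./
RUN --mount=type=cache,target=/var/cache/pypoetry \
    poetry config cache-dir /var/cache/pypoetry && \
    poetry install --no-root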
docker/new/docker-compose.yml (outdated)
context: ../../
dockerfile: api/Dockerfile
args:
  POETRY_VERSION_CONSTRAINT: ==2.1.2
I'd rather pin to a specific version (see `Makefile`).
api/poetry.lock (outdated)
@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 2.1.1 and should not be changed by hand.
+# This file is automatically @generated by Poetry 2.1.2 and should not be changed by hand.
The `Makefile`'s `POETRY_VERSION` is still `2.1.1`.
Thanks for catching the inconsistency. I tend to upgrade patch versions if they're harmless, but I shouldn't upgrade this yet, not in this PR.
# NOTE: Why not Alpine? Because Alpine demands installation of OS software
# prior to running app code, which leads to maintenance work and a larger image
# size compared to Debian — considering Debian layers are shared across images.
One more thing I don't understand is why we care about the size of the development container at all.
Often when introducing Docker images, people ask me why I don't favor Alpine for the sake of ending up with a smaller final image, so I've made a habit of documenting it. 🤷
Sorry, this might be more than you asked for, but I'd love to align with you and the team on how we see Docker images. In my experience, Debian-based images mostly favor three pillars: 1. storage, 2. build speed, and 3. maintenance.
Storage
While Alpine is famous for its base 5 MB Linux image, Debian adds a few hundred MBs. So how come Debian-based images favor storage?
Because Alpine is so clean, one often needs to install basic dev OS libs and headers before compiling and installing the app and its dependencies. The resulting image then carries extra layers, often big ones, just to allow the app dependencies to be added.
Meanwhile, Debian bundles most common dev libs, rarely demanding `apt-get` work. Because of that alone, the few hundred MBs of Debian, base libs included, are shared across the other Docker images that follow the same principle. Local storage matters more to some than to others; e.g. many Apple Silicon users run on 256 GB of SSD, which makes Docker a pain in the ass. Standards help.
Alpine scenario:
- app1: alpine + specific base libs + app dependencies + app code
- app2: alpine + other base libs + app dependencies + app code
- app3: alpine + essential base libs + app dependencies + app code
Debian scenario:
- app1: debian + app dependencies + app code
- app2: debian + some eventual specific lib + app dependencies + app code
- app3: debian + app dependencies + app code
Build speed
As with storage, build speed also favors Debian-based images, since Docker won't need to spend time installing base libs.
Alpine scenario:
- app1: alpine + install base libs and clear cache + app dependencies + app code
- app2: alpine + install other base libs + app dependencies + app code
- app3: alpine + install and compile essential base libs + app dependencies + app code
Debian scenario:
- app1: debian + app dependencies + app code
- app2: debian + install this lib + app dependencies + app code
- app3: debian + app dependencies + app code
Maintenance
Also similarly to storage, Debian-based images often allow the team to skip the effort of writing and maintaining the code needed to manage base libs in the Docker image.
Alpine scenario:
- app1: alpine + work work base libs + app dependencies + app code
- app2: alpine + work base libs work + app dependencies + app code
- app3: alpine + work keep track of 3rd party base libs more work + app dependencies + app code
Debian scenario:
- app1: debian + app dependencies + app code
- app2: debian + some apt-get + app dependencies + app code
- app3: debian + app dependencies + app code
The above looks a lot prettier with graphics but I hope the text makes sense. Like I said earlier, it comes from my own experience, and I'd love to enrich it with the team's.
BTW, I'm a fan of Alpine, especially for one-off tasks or when using images I don't need to customize, e.g. `postgres`.
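To make the scenarios above concrete, here's a hypothetical side-by-side; the package names are illustrative only, not taken from this repo:

# Alpine scenario: musl means few prebuilt wheels apply, so build deps come first.
FROM python:3.12-alpine AS app-on-alpine
RUN apk add --no-cache build-base libffi-dev postgresql-dev  # extra layer to write and maintain
COPY pyproject.toml poetry.lock ./
RUN pip install poetry && poetry install --no-root

# Debian scenario: glibc lets pip use prebuilt manylinux wheels, so no toolchain step.
FROM python:3.12-slim AS app-on-debian
COPY pyproject.toml poetry.lock ./
RUN pip install poetry && poetry install --no-root

The Debian variant also shares its base layers with any other Debian-based image on the same machine, which is the storage point above.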
@khvn26 I agree with every comment. The shape of this PR right now is more of a foundation for discussion than an actual patch to merge; I've updated the status to Draft accordingly. I've introduced a [tiny] bit of new boilerplate as the shortest possible path to validate the PoC as an end. If we decide to pursue it for the benefit of users, then we'll most definitely have to look into the means. This patch barely scratches the surface of the ultimate intent, which is a combination of long-term benefits.
For now, looking at the local environment, the Make targets could simply wrap `./run`, e.g.:

serve:
    ./run

# Encourage developers to name migrations
MIGRATION_NAME ?= $(shell read -p 'Migration name (snake_case): ' name; echo $$name)
migration:
    ./run python manage.py makemigrations -n $(MIGRATION_NAME)

docs:
    ...

I hope that makes sense! Please let me know your thoughts. I will add to the other comments shortly.
If the team decides this is worth pursuing, I'd love to write an RFC and publish it eventually. Likely it would be a WIP lasting as long as it takes until I learn all the ins and outs of the product, unless it gets help from other folks who'd like to join an adventure when the time is right. For now this PR is really nothing but a PoC to discuss the idea of Dockerizing the product from local to production.
I have added information to docs/ if required so people know about the feature!

Changes
Introduce a new Docker environment, initially aimed at the local environment.
The design is intended to allow developers to run the code as easily as possible.
In the future, this format might contribute to remote environments as well (needs discussion).
How did you test this code?
- `./api/run` — gets the application running in one command.
- `./api/run pytest --sw --pdb api/tests/` — or run any other command.
- `cd api; ./run pytest -x tests/` — current directory is passed to the container.

Ideally, any command should work just by prefixing it with `./api/run`.