As more applications are written in Python, containerizing them has become a routine task for developers. Docker, the popular containerization technology, lets developers package an application into a lightweight, portable container that runs on any system that supports Docker. In this article, we will discuss how to optimize and secure a Python application with Docker by analyzing the sample Dockerfile below:
Repository available here
# Use an official Python runtime as a parent image
FROM python:3.9-slim@sha256:2bac43769ace90ebd3ad83e5392295e25dfc58e58543d3ab326c3330b505283d as build_stage

# Set the working directory to /app
WORKDIR /app

# Copy app.py into the container at /app
COPY app.py /app

# Install pipreqs
RUN pip install pipreqs

# Generate requirements.txt file based on app.py
RUN pipreqs /app

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Cleanup after ourselves
RUN pip uninstall -y pipreqs
RUN pip cache purge

FROM build_stage as run_stage

# Create a new user with a specific user ID
RUN useradd --uid 1001 app_admin

# Switch to the new user
USER app_admin

COPY . /app

# Run app.py when the container launches (path is relative to WORKDIR /app)
CMD ["python", "app.py"]
This Dockerfile achieves three things:
- Uses a minimal image, an explicit tag pinned by digest, and a staged build process to increase security
- Automatically finds and installs dependencies using pipreqs
- Optimizes the build process by separating dependencies from source code
Below is a breakdown of each line of the Dockerfile and how it contributes to a secure, optimized and easy-to-use containerized environment for Python applications.
Stage 1: Build the Requirements
This Dockerfile uses a multi-stage approach.
There are several benefits of writing a multi-stage Dockerfile:
- Smaller image size: Multi-stage builds allow you to build a final image that only contains the necessary files and dependencies for your application to run. By separating the build process from the final runtime image, you can reduce the size of the final image.
- Faster build times: When you use a multi-stage build, Docker can reuse previously built stages if the contents of those stages haven’t changed. This can significantly speed up the build process, especially when building large or complex images.
- Improved security: By separating the build process from the final runtime image, you can reduce the risk of including unnecessary files and dependencies in your final image. This can help improve the security of your application.
- Better organization: Multi-stage builds can help you organize your Dockerfile into logical stages, making it easier to understand and maintain.
Overall, multi-stage builds can help you create more efficient and secure Docker images while also reducing the time it takes to build and deploy your application.
A Minimal Image with Explicit Tags
This line specifies the base image to use for the container. In this case, it uses the official Python 3.9 slim image, which is a lightweight version of Python that includes only the essential packages. It also specifies a digest with the token “@sha256:2bac43769ace90ebd3ad83e5392295e25dfc58e58543d3ab326c3330b505283d”. This practice ensures that every time we rebuild the Docker image for this Python application, the same underlying operating system and library versions are used. This provides a deterministic build.
# Use an official Python runtime as a parent image
FROM python:3.9-slim@sha256:2bac43769ace90ebd3ad83e5392295e25dfc58e58543d3ab326c3330b505283d as build_stage
To find the digest, we have several options:
- Grabbing the Docker base image digest from Docker Hub.
- Downloading the Docker image onto our computer with docker pull python:3.10-slim, which reveals the Docker image digest:
3.10-slim: Pulling from library/python
7d63c13d9b9b: Pull complete
6ad2a11ca37b: Pull complete
1d79bc863ed3: Pull complete
c72b5f03bec8: Pull complete
0c3b0c5ce69b: Pull complete
Digest: sha256:2bac43769ace90ebd3ad83e5392295e25dfc58e58543d3ab326c3330b505283d
Status: Downloaded newer image for python:3.10-slim
docker.io/library/python:3.10-slim
If we already have the Python Docker image on our computer, we can get the digest from the existing image on disk with the command:
docker images --digests | grep python
python 3.10-slim sha256:2bac43769ace90ebd3ad83e5392295e25dfc58e58543d3ab326c3330b505283d
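Once we have the base image digest, we add it to the FROM line of the Dockerfile. The commands above use python:3.10-slim as the example; pull the tag named in your own FROM line, because Docker resolves the image by digest and ignores the tag when both are present.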
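To keep the digest from quietly disappearing in a later edit, a small check can scan a Dockerfile for FROM lines that are not pinned. The following is a minimal sketch and not part of the original repository: the function name unpinned_from_lines is hypothetical, and it assumes stage aliases are declared with "as" on the FROM line that introduces them.

```python
import re

# A sha256 digest suffix: "@sha256:" followed by 64 lowercase hex characters.
DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_from_lines(dockerfile_text):
    """Return the FROM lines that pull an external image without a digest."""
    stage_aliases = set()
    unpinned = []
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line.upper().startswith("FROM "):
            continue
        # Skip flags such as --platform=linux/amd64.
        tokens = [t for t in line.split()[1:] if not t.startswith("--")]
        if not tokens:
            continue
        image = tokens[0]
        # Earlier stages and the empty "scratch" image need no digest.
        is_local = image in stage_aliases or image == "scratch"
        if not is_local and not DIGEST_SUFFIX.search(image):
            unpinned.append(line)
        # Record "FROM image AS alias" so later stages can refer to it.
        if len(tokens) >= 3 and tokens[1].upper() == "AS":
            stage_aliases.add(tokens[2])
    return unpinned
```

Run against the Dockerfile in this article, the function should return an empty list, since the first stage is pinned and the second stage refers to the first by name. A check like this could run in CI before the image is built.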
Setting the Working Directory
# Set the working directory to /app
WORKDIR /app
This line sets the working directory inside the container to /app, which is where the application code will be copied to.
Copying Files with Requirements
# Copy the current directory contents into the container at /app
# COPY . /app
COPY app.py /app
COPY .env /app
These lines copy the app.py file and the .env file from the local machine to the /app directory inside the container.
# Install pipreqs
RUN pip install --no-cache-dir pipreqs
This line installs the pipreqs package, which will be used later to generate a requirements.txt file. The --no-cache-dir flag disables pip's cache, so downloaded packages are not stored in the image.
# Generate requirements.txt file based on app.py
RUN pipreqs /app
This line generates a requirements.txt file based on the Python packages imported in the app.py file.
# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
This line installs the required packages specified in the requirements.txt file.
# Cleanup after ourselves
RUN pip uninstall -y pipreqs
RUN pip cache purge
These lines remove the pipreqs library and delete any remaining pip cache.
Stage 2: Setup Application
In this stage we set up the app to be run in the container.
FROM build_stage as run_stage
This line starts a new stage named run_stage on top of build_stage. Because it builds on the first stage, it inherits the same minimal, digest-pinned base image along with the installed dependencies, which keeps the build deterministic.
Create an Application-Admin User Account
To keep the attack surface of our container small, we create a dedicated user so that the application does not run as root, the default user in a container.
# Create a new user with a specific user ID
RUN useradd --uid 1001 app_admin
# Switch to the new user
USER app_admin
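The application itself can also guard against being started as root, for example when someone overrides the user at run time. The snippet below is a hedged sketch of such a startup check; the helper names is_root and ensure_not_root are hypothetical and do not come from the original repository, and the check only applies on POSIX systems where os.geteuid is available.

```python
import os


def is_root(euid):
    """On Linux, the root account always has user ID 0."""
    return euid == 0


def ensure_not_root():
    """Raise an error if the current process runs with root privileges."""
    if is_root(os.geteuid()):
        raise RuntimeError("refusing to run as root; set USER in the Dockerfile")
```

Calling ensure_not_root() at the top of app.py would make the container exit early if the USER instruction were removed or bypassed. With the Dockerfile above, the process runs with UID 1001 and the check passes.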
Copy Application Files
COPY . /app
This line copies the entire project directory to the /app directory inside the container. If we had done this before installing the dependencies, any change to a file in our project folder would invalidate the cache for that COPY instruction and, subsequently, for every layer after it, including the dependency installation. Copying only app.py first and the rest of the project last avoids this.
The next time Docker checks whether layers can be reused, if it finds no changes to app.py, and therefore to the generated requirements.txt file, it reuses the cached dependency layers and jumps straight to this COPY instruction. This speeds up the build considerably: we no longer wait minutes between builds each time we modify a file other than app.py.
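The effect of instruction order on caching can be illustrated with a toy model. This is a deliberately simplified sketch and not Docker's actual cache algorithm: it assumes each layer's cache key depends only on the previous layer's key, the instruction text, and a string standing in for the files that instruction reads. All function names are hypothetical.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per step; each key also depends on all earlier steps.

    `steps` is a list of (instruction, input_content) pairs, where
    input_content stands in for the files the instruction reads.
    """
    keys = []
    parent = ""
    for instruction, content in steps:
        payload = f"{parent}|{instruction}|{content}".encode()
        parent = hashlib.sha256(payload).hexdigest()
        keys.append(parent)
    return keys


def first_rebuilt_step(old_steps, new_steps):
    """Index of the first step whose key changed, or None if all are cached."""
    pairs = zip(layer_keys(old_steps), layer_keys(new_steps))
    for index, (old_key, new_key) in enumerate(pairs):
        if old_key != new_key:
            return index
    return None


def example_steps(app_py, project):
    """Model the instruction order used in this article's Dockerfile."""
    return [
        ("COPY app.py /app", app_py),
        ("RUN pipreqs /app", ""),
        ("RUN pip install -r requirements.txt", ""),
        ("COPY . /app", project),
    ]
```

In this model, changing only the project contents between two builds leaves the first three keys untouched, so first_rebuilt_step points at the final COPY step. Changing app.py changes the very first key, and every later step is rebuilt with it.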
Setup the Container Command
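# Run app.py when the container launches
CMD ["python", "app.py"]
This line specifies the command to run when the container is started. In this case, it runs the app.py script using the Python interpreter. Because the working directory is /app and the project was copied there, the relative path resolves to /app/app.py.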
Optimizing and securing a Dockerfile is an essential step in deploying a Python application in a containerized environment. By following the tips outlined above, you can create a Dockerfile that is lightweight, fast, and secure.