How to Get Started with Docker on Linux


Applications are being built, shipped and updated at an increasingly fast pace. It’s a trend that has generated interest in solutions to facilitate this complex process. The result has been a flood of new methodologies and tools into the DevOps space. In this article, I will focus on two of these tools, Docker and Docker Compose, and more specifically on using them on Linux to build an API in Flask.

If you prefer working in the Windows environment, we’ve got you covered. Check out our Windows version of this article here.

 

What is DevOps?

The AWS site describes DevOps as “the combination of cultural philosophies, practices, and tools that increases an organization’s ability to deliver applications and services at high velocity”. In other words, DevOps is about merging the Development and Operations silos into one team, so engineers work across the entire application lifecycle. If you build it, you own it and you’re responsible for making sure your application works as expected in all environments.

 

What is Docker?

Docker performs operating system level virtualization (a process often referred to as containerization; hence the term Docker Containers). It was initially developed for Linux, but it is now fully supported on macOS and Windows, as well as all major cloud service providers (including AWS, Azure, Google Cloud). With Docker, you can package your application, and all of the software required to run it, into a single container. You can then run that container in your local development environment all the way to production.

Docker has become one of the darlings of the DevOps community because it enables true independence between applications, environments, infrastructure, and developers.

 

Docker Containers or Virtual Machines?

You may ask yourself: “If containers are just another virtualization strategy, why should I consider them? I am already using a Virtual Machine.”

There are a few outstanding benefits to using containers instead of virtual machines. But before we talk about them, I think it’s important for us to understand the main differences between these virtualization types.

Figure: Containers vs Virtual Machines

Virtual machines (VMs) are an abstraction of physical hardware, turning one server (hardware) into many servers (virtualized). The hypervisor allows multiple VMs to run on a single machine. Each VM includes a full copy of an operating system, one or more apps, necessary binaries, and libraries. VMs are usually slower to boot compared to the same OS installed on a bare metal server.

Containers are an abstraction at the application layer that packages code and dependencies together. Multiple containers can run on the same machine and operating system, sharing the kernel with other containers, each running as isolated processes in the user space.

 

Key Benefits of Containers

  • Containers are faster and lighter to ship as they house the minimum requirements to run your application.
  • Containers can be versioned, shared, and archived.
  • Containers start almost instantly as they carry a much smaller footprint than Virtual Machines.
  • Build configurations are managed with declarative code.
  • Containers can be built and extended on top of pre-existing containers.

 

Composing Containers

Often a single container isn’t enough, for example when your services communicate with one or more databases that you own. You might also have cases in which you need to use a combination of Linux and Windows containers.

This is where Docker Compose comes in. It helps create multiple isolated environments on a single host, which is very handy for development environments where you can have all application dependencies running together as different containers on the same host. Compose only recreates containers that have changed, helping to speed up development. You can also configure the containers’ dependencies so that they are bootstrapped in the correct order.

Now that we understand the concept and ideas behind Docker containers, it’s time to get our hands dirty. I’ll guide you, step by step, through the process of setting up your container environment.

 

Not a Linux developer? Don’t worry!

I will be demoing all of this on a Linux environment, using a Python Flask application and a PostgreSQL container. Don’t worry though, many of the concepts I will go through in this article apply equally to development across all platforms. You can also switch to the Windows version of this article by clicking here.

 

Requirements

Before we can continue any further, you will need to install Docker and Docker Compose. The installation is simple and step by step instructions for your platform can be found on the official Docker website. Once Docker installation is complete, it’s time to install Docker Compose. The process is even simpler than for Docker and the official instructions are available here.

 

Let’s Go!

To verify the installation has completed successfully, run the following commands in the terminal:
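
A minimal sketch of that check, assuming the standard executable names docker and docker-compose (newer installations may ship Compose as the docker compose subcommand instead). The check_tool helper is a name introduced here for illustration; it only reports a missing tool instead of failing with a shell error.

```shell
# Print a tool's version, or say so if the tool is not on PATH.
# check_tool is a hypothetical helper, not part of Docker.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" --version
    else
        echo "$1: not found on PATH"
    fi
}

check_tool docker
check_tool docker-compose
```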

If everything has been set up correctly, the commands will return the versions of the tools installed (the versions in your environment might differ slightly):

 

Docker internals

OK, we’re on our way! Before we get too deep, it’s useful to know a few Docker terms. Knowing these will help you understand how everything is interconnected.

Daemon

The Daemon can be considered the brain of the whole operation. The Daemon is responsible for taking care of the lifecycle of containers and handling things with the Operating System. It does all of the heavy lifting every time a command is executed.

Client

The Client is an HTTP API wrapper that exposes a set of commands interpreted by the Daemon.

Registries

The Registries are responsible for storing images. They can be public, or private, and are available with different providers (Azure has its own container registry). Docker is configured to look for images on Docker Hub by default. To see how they interact with each other, let’s run our first image:

docker run hello-world

The output of this command explains what happened behind the scenes: the client (the docker command-line tool) contacted the daemon to run the hello-world image. Since the image wasn’t available locally, it had to be downloaded from the registry (Docker Hub is set as the default). The daemon then created a container from the image and ran it, sending the generated output to the client and making it appear in your terminal.

If you’re experiencing problems with the example above, e.g. you’re required to run the Docker client using sudo, there are a few post-installation steps you might want to go through, described in the official Docker documentation. In short, you’ll need to add your current user to the docker user group (for example with sudo usermod -aG docker $USER) and then log out and back in, which removes the need for elevated privileges.

 

The Docker client

Now that you know how to run a docker image, let’s look at the basics of image management. You can see what images are currently downloaded using docker images.

Right now, we only have the hello-world image we’ve downloaded in the previous step. Let’s download a Linux image and use it to execute custom commands. The image we’re going to use is Alpine, a lightweight Docker image based on Alpine Linux. We’re going to use docker pull to explicitly download the image from the image registry:

docker pull alpine
docker images

We now have two images at our disposal. Let’s run a command using the new image:

docker run alpine cat /etc/os-release

Printing the contents of the /etc/os-release file on the guest filesystem we can see which version of Alpine it is running. Using docker run creates a new container and runs the command inside the container until completion. If you want to run an interactive command inside the container, you’ll need to pass the -i -t flags to the run command.

docker run -it alpine sh

We now have an interactive shell inside the Docker container. This means that the container is running until we exit the shell. You can verify that by opening another terminal window and running docker ps in it.

You can confirm that it’s the same container by comparing the CONTAINER_ID against the HOSTNAME from the previous command. You can use that value in order to connect to an already running container and you will see that the changes in one shell session are visible in the other.

docker exec -it container_id sh

The changes will not persist between different runs of the same container though.

You can see the running containers with docker container ls or docker ps. By adding the -a flag you can also see the stopped containers. To remove the stopped ones, use docker rm.

docker container ls -a
docker rm container_id/container_name

The containers that stopped running are preserved on disk by default. In some contexts, they might be used to debug an issue after the run completed. To automatically clean them up, the --rm flag can be added to the run command. Additionally, you might have noticed that containers are given names such as boring_neumann in the example above. Unless you pass one explicitly with the --name flag, a random name is generated. If you know the name of the container, you can replace the container id in the commands that require it, so it’s good practice to name your containers.

 

Building your first container

With the basics of the Docker client mastered, it’s time to build a container that will host an API service. It will have a single endpoint: returning Hello, world!. The code is available on GitHub:
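
For readers without the repository at hand, here is a minimal sketch of what such a service could look like. The ./api directory, api.py and requirements.txt match the paths used in the Dockerfile later on; the file contents themselves are an assumption, not the author's exact code.

```shell
mkdir -p api

# Python dependencies installed by pip (assumed: Flask only).
cat > api/requirements.txt <<'EOF'
flask
EOF

# A single endpoint that returns "Hello, world!".
cat > api/api.py <<'EOF'
from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    return "Hello, world!"
EOF
```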

In order to run the service, a couple of steps have to be completed:

  • The Python module dependencies need to be installed.
  • The FLASK_APP shell variable has to point to the python file with the code.
  • flask run can then be invoked to start the service.

Since there isn’t an existing image that does all that, we will need to create our own. All image definitions are contained inside the Dockerfiles, which specify the parent image and a set of instructions that you want to be executed.

The Dockerfile for the service looks as follows:
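
Assembled from the instructions discussed one by one below, in the order they are discussed, the file can be created from the terminal like this. Treat it as a reconstruction from this walkthrough rather than the author's exact file.

```shell
# Write the Dockerfile next to the ./api source directory.
cat > Dockerfile <<'EOF'
FROM python:alpine3.6
EXPOSE 5000
ENV FLASK_ENV development
ENV FLASK_APP /api/api.py
CMD ["flask", "run", "--host=0.0.0.0"]
COPY ./api /api
RUN pip3 install -r /api/requirements.txt
EOF
```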

There’s a lot to take in here so let’s go step by step.

FROM python:alpine3.6

  • Specify the base image in the name:tag format. In this case, it’s the official Python image built on Alpine Linux 3.6.

EXPOSE 5000

  • Have the container listen on port 5000. That does not mean this port will be available for communication outside the container – it has to be published separately. More on that soon.

ENV FLASK_ENV development
ENV FLASK_APP /api/api.py

  • Set environment variables consumed by the code.

CMD ["flask", "run", "--host=0.0.0.0"]

  • The default command executed when the container starts.

COPY ./api /api

  • Copy the source code directory from the host to the image.

RUN pip3 install -r /api/requirements.txt

  • Install the dependencies inside the container.

To build the newly defined image we will use docker build:

docker build -t api:latest .

The -t flag lets you specify the name and tag for the new image, and the trailing dot tells Docker to use the current directory as the build context. Once the image is built, it will appear on the list of your images:

docker images

You can see it’s using the name and tag specified. The order of instructions in the Dockerfile might seem confusing at first. It might appear as if the application is being run before its dependencies are installed. That’s not the case though: the command specified by CMD is not executed until the container is started. Additionally, ordering the instructions in this way takes advantage of the image build cache. Each build step is cached, so if any line in the Dockerfile is changed, only it and the lines following it are re-evaluated. Rebuilding the image without any changes results in all steps being served from the cache.

docker build -t api:latest .

 

Running the service inside the container

While running the container using docker run api does start the Flask service, it won’t work as expected.

That’s because the image is configured to listen on port 5000, but the port hasn’t been forwarded to the host. In order to make the port available on the host, it has to be published:

docker run --rm --name api -p 8082:5000 api

This forwards the host’s port 8082 to the container’s port 5000. You can see the port forwarding configuration of a container using docker port.

docker port container_id/container_name

So far we’ve been starting the container in the foreground and closing the terminal window would stop the container. If you want the container to be running in the background without eating up one of your terminals, you can run it in detached mode, with the -d flag.

docker run --rm --name api -p 8082:5000 -d api
docker ps
docker logs container_id/container_name

docker stop can then be used in order to stop the server and bring the container down.

docker stop container_id/container_name
docker ps

 

Working with multiple containers

Most web services depend on a database, so let’s add one to this project.

docker run --rm --name postgres_db -p 5435:5432 -e POSTGRES_PASSWORD=postgrespassword -d postgres

The default postgres image is downloaded and started in detached mode. The default postgres port is forwarded to 5435 on the host (-p flag) and we set the POSTGRES_PASSWORD environment variable inside the container (-e flag), which is used as the database password. Let’s verify the database is running correctly.

docker logs --tail 20 postgres_db

PGPASSWORD=postgrespassword psql -h localhost -p 5435 -U postgres -c '\l'

With the database running as expected, let’s update the application to connect to the database and define our database models. We’re going to use SQLAlchemy as the ORM and the Flask-SQLAlchemy library to make the integration easier.

The updated web server logs the time of each request and saves it with a numeric ID. Let’s look at the changes in detail:

When starting the server, we configure the database connection. It’s using the password passed earlier to the postgres image and it’s connecting to the postgres_db host, which is the other container.

A simple request logging model – it stores an incremental ID of the request and the time of the request.

When starting the service, create tables on the database for our models.
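
Put together, the three changes described above might look like the sketch below. The password and the postgres_db host come from the docker run command used earlier; the RequestLog model name, the column names, the response text and the psycopg2-binary driver are assumptions made for illustration, since the original code is not reproduced here.

```shell
mkdir -p api

# Dependencies; the PostgreSQL driver (psycopg2-binary) is an assumption.
cat > api/requirements.txt <<'EOF'
flask
flask-sqlalchemy
psycopg2-binary
EOF

cat > api/api.py <<'EOF'
from datetime import datetime

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)

# Database connection: the password passed to the postgres container,
# and the name of that container as the host.
app.config["SQLALCHEMY_DATABASE_URI"] = (
    "postgresql://postgres:postgrespassword@postgres_db:5432/postgres"
)
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
db = SQLAlchemy(app)


class RequestLog(db.Model):
    """One row per request: an incremental ID and the time of the request."""

    id = db.Column(db.Integer, primary_key=True)
    requested_at = db.Column(db.DateTime, default=datetime.utcnow)


# Create the tables for the models when the service starts.
with app.app_context():
    db.create_all()


@app.route("/")
def hello():
    entry = RequestLog()
    db.session.add(entry)
    db.session.commit()
    return "Hello, world! (request id: {})".format(entry.id)
EOF
```

Note that the connection string uses port 5432, the port inside the database container, not the 5435 published on the host.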

Now every incoming request will be saved to the database and the returned message will contain the request’s database ID. The updated source code is available on GitHub. With the updates in place, the API image needs to be rebuilt.

docker build -t api:latest .

Using the build cache, only the last two steps of the Dockerfile had to be executed – the Python source files were copied over and dependencies were installed. By default, containers on Docker’s default network cannot resolve each other by name. In order to allow that, additional configuration is required. As long as the postgres_db container is started first, the API can be started with a link to it, allowing it to resolve the database host. With the link configured, we can see that the communication between the containers is working correctly.

docker run --rm --name api -p 8082:5000 --link postgres_db:postgres_db -d api
curl localhost:8082
docker logs api
docker stop api postgres_db

It might seem that the advantages of using containers are outweighed by the cumbersome setup: you have to start the containers individually, in the correct sequence and explicitly link them in order for them to work together.

That’s where Docker Compose comes in.

 

Meet Docker Compose

Docker Compose is a tool for running multi-container Docker applications. While it requires some additional configuration (in the form of a docker-compose.yaml file containing the definition of the application’s services), multiple containers can then be built and run with a single command. Docker Compose is not a replacement for the Docker command line client, but an abstraction layer on top of it. Our docker-compose.yaml file will contain the definition of the API and the database service.
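
A sketch of such a file, derived from the docker run arguments used so far. The version value "3" and the container_name entries are assumptions; the latter keep the container names used in the earlier commands.

```shell
cat > docker-compose.yaml <<'EOF'
version: "3"
services:
  postgres_db:
    container_name: postgres_db
    image: postgres
    ports:
      - "5435:5432"
    environment:
      POSTGRES_PASSWORD: postgrespassword
  api:
    container_name: api
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8082:5000"
    depends_on:
      - postgres_db
EOF
```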

The version directive specifies which version of the Docker Compose syntax we’re using. It’s important to provide it, as there are non-backwards compatible changes between versions. You can read more about this in the official documentation. In the services section, the containers we will be running are described. The postgres_db definition should look familiar to you, as it contains the arguments that used to be passed to docker run:

docker run --rm -p 5435:5432 -e POSTGRES_PASSWORD=postgrespassword --name postgres_db -d postgres

The advantage of storing them in the docker-compose.yaml file is that you won’t have to remember them when you start a container. The service uses a public image instead of a local Dockerfile. The opposite is the case for the API image, so we need to provide the build context (path to the Dockerfile) and the Dockerfile to build the image from. The configuration also ensures that the API container is started after postgres_db, since the former depends on the latter.

With the container built, and run-time configuration specified in the docker-compose.yaml file, both containers can be built and started with a single command.

docker-compose build
docker-compose up -d
curl localhost:8082
docker-compose down

 

Docker Compose for local development

In the Dockerfile we copied the source code from the host machine to the API container – so any changes made locally are not picked up until the image is rebuilt. To avoid having to rebuild the image every time the application code is updated, it’s possible to mount a local directory inside the container, allowing modifications in one environment to be present in the other.

That change, applied to the docker-compose.yaml file, would work great for development, but it’s not a configuration that would be welcome in production. In production, you want to avoid having the ability to circumvent the release process and edit a production application in situ. Fortunately, there’s no need to entirely duplicate the docker-compose.yaml file for each environment as using docker-compose.override.yaml allows you to compose two files; one as the base and the other overlaying modifications on top of it. In this case, the only modification we want locally is to mount the source code directory inside the container.
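
A sketch of such an override file, assuming the api service name and the ./api to /api mapping that mirrors the COPY instruction in the Dockerfile. The version value is an assumption and should match the one in the base file.

```shell
cat > docker-compose.override.yaml <<'EOF'
version: "3"
services:
  api:
    volumes:
      # Mount the local source directory over the copy baked into the image.
      - ./api:/api
EOF
```

Docker Compose picks up a file with this name automatically when it sits next to docker-compose.yaml, so no extra flags are needed locally.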

You can find the changes on GitHub. When running docker-compose now, the API service will use the configuration from the docker-compose.yaml file, with the values from docker-compose.override.yaml taking precedence over duplicates. Once the API service is started, the Flask dev server will monitor the code for changes, and if any occur, it will restart inside the container. It’s worth pointing out that changes to the compose or override files only take effect once the affected containers are recreated, which docker-compose up -d does for you.

docker-compose up -d
docker logs -f api
curl localhost:8082

 

Where to go from here?

If you’ve enjoyed this experience with Docker, there are plenty of ways you can further your knowledge. The official Docker website is a goldmine of information: you can familiarise yourself with the guides or learn, in depth, about the Docker client commands and config file syntax. Once you’ve quenched your thirst for Docker knowledge, you can move on to exploring the world of container orchestration frameworks, which work on top of container technologies such as Docker and help automate container deployment, scaling and management tasks. The frameworks you might want to read about are:

 

Conclusion

So there you go. Docker is a fast and consistent way to accelerate and automate the shipping of software. It saves developers from having to set up multiple development environments each time they test and deploy code. That time can then be spent developing quality software instead.

Hopefully, this article has sparked your interest in Docker. I would love to hear where you take this new knowledge and what you think about Docker. So feel free to comment below.

This article is based on Rafael Carvalhos’ recent post ‘How to get Started with Docker on Windows’. It has been adapted and expanded for the Linux platform by Jakub Musko.