Docker Notes
Docker helps developers build, share, run, and verify applications anywhere - without tedious environment configuration and management. I want to learn Docker to make replicating my server setup easy. I plan to automate the server setup process and have some documentation on why the server is configured the way it is to make launching new projects easier.
References
- Learning Docker, Optimize the power of Docker to run your applications quickly and easily, Pethuru Raj, Vinod Singh, Jeeva S. Chelladhurai
- Creating a Container Image for Use on Amazon ECS
- Best Practices When Writing Dockerfiles
The inhibiting dependency factor between software and hardware needs to be eliminated by leveraging virtualization, a kind of beneficial abstraction, through an additional layer of indirection. The idea is to run any software on any hardware. This is achieved by creating multiple virtual machines (VMs) out of a single physical server, with each VM having its own operating system (OS). Through this isolation, which is enacted through automated tools and controlled resource sharing, heterogeneous applications are accommodated in a physical machine.
...
A container generally contains an application, and all of the application's libraries, binaries, and other dependencies are stuffed together to be presented as a comprehensive, yet compact, entity for the outside world. Containers are exceptionally lightweight, highly portable, easily and quickly provision-able, and so on. Docker containers achieve native system performance.
Getting Started with Docker
- Virtualization has set the goal of bringing forth IT infrastructure optimization and portability. However, virtualization technology has serious drawbacks, such as performance degradation due to the heavyweight nature of virtual machines (VMs), the lack of application portability, slowness in provisioning IT resources, and so on. The Docker initiative has been specifically designed for making the containerization paradigm easier to grasp and use. Docker enables the containerization process to be accomplished in a risk-free and accelerated fashion.
- Docker is an open source containerization engine, which automates the packaging, shipping, and deployment of software applications that are presented as lightweight, portable, and self-sufficient containers that will run virtually anywhere.
- A Docker container is a software bucket comprising everything necessary to run the software independently. There can be multiple Docker containers on a single machine, and containers are completely isolated from one another as well as from the host machine. A Docker container includes a software component along with all of its dependencies (binaries, libraries, configuration files, scripts, JARs, and so on).
- The Docker solution primarily consists of the following components:
- The Docker engine
- Enables the realization of purpose-specific as well as generic Docker containers.
- The Docker Hub
- A fast-growing repository of the Docker images that can be combined in different ways for producing publicly findable, network accessible, and widely usable containers.
Docker on Linux
- The Docker engine produces, monitors, and manages multiple containers on a single host.
- Future IT systems will have hundreds of application-aware containers, which would innately be capable of facilitating their seamless integration and orchestration for delivering modular applications (business, social, mobile, analytical, and embedded solutions). These contained applications could fluently run on converged, federated, virtualized, shared, dedicated, and automated infrastructures.
Differentiating between Containerization and Virtualization
- In the containerization paradigm, strategically sound optimizations have been accomplished through a few crucial and well-defined rationalizations and the insightful sharing of the compute resources. Some of the innate and hitherto underutilized capabilities of the Linux kernel have been rediscovered.
- The advantages of containerization over virtualization include:
- bare-metal performance
- real-time scalability
- higher availability
- The following table contrasts virtualization with containerization:
| Virtual Machines (VMs) | Containers |
| --- | --- |
| Represents hardware-level virtualization | Represents operating system virtualization |
| Heavyweight | Lightweight |
| Slow provisioning | Real-time provisioning and scalability |
| Limited performance | Native performance |
| Fully isolated and hence more secure | Process-level isolation and hence less secure |
- Containers are isolated from each other only at the process level, and hence they are more exposed to security incursions. Some vital features that are available in VMs, such as SSH, TTY, and other security functionality, are not available in containers.
Installing Docker
$ sudo apt-get update
$ sudo apt-get install -y docker.io
# Creates a soft link so that Docker commands can be run as docker instead of docker.io
$ sudo ln -sf /usr/bin/docker.io /usr/local/bin/docker
# Check whether the installation was successful
$ sudo docker version
Client version: 1.5.0
Client API version: 1.17
Go version (client): go1.4.1
Git commit (client): a8a31ef
OS/Arch (client): linux/amd64
Server version: 1.5.0
Server API version: 1.17
Go version (server): go1.4.1
Git commit (server): a8a31ef
# Learn more about the Docker environment using the docker info subcommand
$ sudo docker -D info
Containers: 0
Images: 0
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 0
Execution Driver: native-0.2
Kernel Version: 3.13.0-45-generic
Operating System: Ubuntu 14.04.1 LTS
CPUs: 4
Total Memory: 3.908 GiB
Name: dockerhost
ID: ZNXR:QQSY:IGKJ:ZLYU:G4P7:AXVC:2KAJ:A3Q5:YCRQ:IJD3:7RON:IJ6Y
Debug mode (server): false
Debug mode (client): true
Fds: 10
Goroutines: 14
EventsListeners: 0
Init Path: /usr/bin/docker
Docker Root Dir: /var/lib/docker
WARNING: No swap limit support
Client-Server Communication
- On Linux installations, Docker usually carries out client-server communication over the Unix socket `/var/run/docker.sock`. Docker also has an IANA-registered port, 2375; however, for security reasons, this port is not enabled by default.
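- As a quick sanity check, the daemon can be queried directly over that socket. This is a minimal sketch assuming the default socket path and a curl build with Unix-socket support (7.40 or newer):
$ sudo curl --unix-socket /var/run/docker.sock http://localhost/version
# Returns a JSON document with roughly the same details as the docker version output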
Downloading the First Docker Image
- Having installed the Docker engine successfully, the next logical step is to download images from the Docker repository. The Docker registry is an application repository, which hosts a range of applications that vary between basic Linux images and advanced applications. The `docker pull` subcommand is used for downloading any number of images from the registry.
$ sudo docker pull busybox
511136ea3c5a: Pull complete
df7546f9f060: Pull complete
ea13149945cb: Pull complete
4986bf8c1536: Pull complete
busybox:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Status: Downloaded newer image for busybox:latest
# Verify that images have been downloaded
$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
busybox latest 4986bf8c1536 12 weeks ago 2.433 MB
Running the First Docker Container
- You can now run your first Docker container:
$ sudo docker run busybox echo "Hello World!"
Hello World!
Handling Docker Containers
Clarifying the Docker Terms
- A Docker image is a collection of all of the files that make up a software application. Each change that is made to the original image is stored in a separate layer. Each Docker image has to originate from a base image according to the various requirements. Additional modules can be attached to the base image for deriving the various images that exhibit the preferred behavior.
- Each image has a unique ID. A Docker image is a read-only template. Docker images are the building components of Docker containers. In general, the base Docker image represents an operating system. Adding additional modules to the base image ultimately yields a container. The easiest way of thinking about a container is as the read-write layer that sits on top of one or more read-only images. When the container is run, the Docker engine not only merges all of the required images together, but it also merges the changes from the read-write layer into the container itself. The changes can be merged by using the `docker commit` subcommand (see the example after this list).
- A Docker layer could represent either read-only images or read-write images.
- The read-write layer is the container layer.
- A Docker Registry is a place where Docker images can be stored in order to be publicly found, accessed, and used by developers worldwide for quickly crafting fresh and composite applications without any risks. A registry is for registering the Docker images, whereas a repository is for storing those registered Docker images in a publicly discoverable and centralized place.
- A Docker Repository is a namespace used for storing a Docker image. Child images are the ones that have their own parent images. The base image does not have any parent image. The images that other images sit on are called parent images because the parent images bear the child images.
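- A minimal sketch of how `docker commit` folds a container's read-write layer into a new image; the container ID and the target repository name below are hypothetical placeholders:
$ sudo docker run -i -t busybox:latest /bin/sh
/ # touch /tmp/hello.txt
/ # exit
$ sudo docker ps -a        # find the ID of the container that just exited
$ sudo docker commit a8f2c3d4e5f6 myrepo/busybox-with-file:v1
$ sudo docker images       # the new image now shows up in the local image list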
Working with Docker Images
- The `docker pull -a <image>` command downloads all variants that are associated with that image. By default, Docker always uses the image that is tagged as `latest`. An image can be tag-qualified by appending the tag to the repository name: `<repository>:<tag>`.
- The `docker pull` subcommand always looks for images at the Docker index. The Docker Hub Registry also provides a platform for third-party developers and providers to share their images for general consumption. Third-party images are prefixed with the user ID of the developer or depositor. You can pull a third-party image by running `docker pull <user ID|repository name>/<repository>`.
- You can search for Docker images by using the `docker search <image name>` command.
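- A short sketch of these pull and search variants; the tag and the third-party repository name are illustrative placeholders:
# Pull a specific tagged variant instead of the default latest tag
$ sudo docker pull ubuntu:14.04
# Pull a third-party image prefixed with the publisher's user ID
$ sudo docker pull someuser/helloworld
# Search the Docker Hub index for images matching a keyword
$ sudo docker search mysql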
Working with an interactive container
- The `docker run` subcommand takes an image as its input, and you can use the `-i` and `-t` flags to launch the container as an interactive session. The `-i` flag is the key driver, which makes the container interactive by grabbing the standard input (`STDIN`) of the container. The `-t` flag allocates a pseudo-TTY or a personal terminal (terminal emulator) and then assigns it to the container.
- The `docker ps` subcommand will list all the running containers and their important properties.
  - This command accepts the `-a` option, which will list all the containers on that Docker host regardless of their status.
- You can run `docker attach <container>` to attach to a running container and launch the container prompt (see the example after this list).
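- A minimal sketch of an interactive session; the image, the shell, and the container ID are illustrative placeholders:
# Launch an interactive shell inside a container
$ sudo docker run -i -t ubuntu:14.04 /bin/bash
root@742718c21816:/# exit
# List every container on the host, including stopped ones
$ sudo docker ps -a
# Attach to a running container by its ID
$ sudo docker attach 742718c21816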
Controlling Docker Containers
- The Docker engine enables you to `start`, `stop`, and `restart` a container with a set of `docker` subcommands, as shown in the example after this list.
- `docker stop <container ID>` stops the container by first sending a `SIGTERM` signal to the main process and, if that fails, by sending a `SIGKILL` signal to the main process.
- `docker start <container ID>` starts a container. This will not automatically attach to the container. You can automatically attach to the container by supplying the `-a` flag to the `start` subcommand or by using the `docker attach <container ID>` subcommand.
- The `docker restart` subcommand will first execute the `stop` subcommand and then the `start` subcommand.
- The `docker pause` subcommand will essentially freeze the execution of all the processes within that container. The `docker unpause` subcommand will unfreeze the execution of all the processes within that container and resume the execution from the point where it was frozen.
- The `docker run` subcommand takes a `--rm` option, which will remove the container as soon as it reaches its stopped state in order to conserve disk space on the Docker host.
- You can also manually remove containers with the `docker rm <container ID>` subcommand.
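- A lifecycle sketch using the subcommands above; the container ID is a placeholder:
$ sudo docker stop 742718c21816      # SIGTERM first, then SIGKILL if the process does not exit
$ sudo docker start 742718c21816     # starts the container again without attaching
$ sudo docker restart 742718c21816   # equivalent to a stop followed by a start
$ sudo docker pause 742718c21816     # freezes every process in the container
$ sudo docker unpause 742718c21816   # resumes execution from where it was frozen
$ sudo docker rm 742718c21816        # removes a stopped container
# Or let Docker remove the container automatically once it stops
$ sudo docker run --rm busybox echo "short-lived container"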
Launching a Container as a daemon
- The `docker run` subcommand supports the `-d` option, which will launch a container in detached mode; that is, it will launch the container as a daemon.
- The `docker logs` subcommand is used for viewing the output generated by the daemon container.
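- A small sketch of a daemon container; the looping command is illustrative and the container ID is a placeholder:
$ sudo docker run -d busybox /bin/sh -c "while true; do date; sleep 5; done"
$ sudo docker logs 742718c21816      # prints the dates emitted so far by the detached container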
Building Images
- Leveraging a `Dockerfile` is the most competent way to build powerful images for the software development community. A `Dockerfile` is a text-based build script that contains special instructions in a sequence for building the right and relevant images from the base images.
# Create the Dockerfile
# The first instruction selects the base image
# The second instruction (CMD) specifies the command to run when a container is launched
$ cat Dockerfile
FROM busybox:latest
CMD echo Hello World!!
# Build an image from the Dockerfile
$ sudo docker build .
Sending build context to Docker daemon 3.072 kB
Sending build context to Docker daemon
Step 0 : FROM busybox:latest
...
Successfully built 0a2abe57c325
$ sudo docker run 0a2abe57c325
Hello World!!
- You can specify an image name and a tag by using the `docker tag` subcommand or by specifying them when building the image.
- Building images with an image name is always recommended as a best practice.
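- A quick sketch of both tagging routes; the repository name and tag are illustrative placeholders:
# Build with a repository name and tag in one step
$ sudo docker build -t localrepo/busyboxplus:0.1 .
# Or tag an existing image ID after the fact
$ sudo docker tag 0a2abe57c325 localrepo/busyboxplus:0.1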
A Quick Overview of the Dockerfile's Syntax
- A `Dockerfile` is made up of instructions, comments, and empty lines:
# Comment
INSTRUCTION arguments
- The instruction line of a `Dockerfile` is made up of two components: it begins with the instruction itself, which is followed by the arguments of the instruction.
- The standard practice is to use UPPERCASE to denote the INSTRUCTION.
- A comment line in a `Dockerfile` must begin with the `#` symbol. A `#` symbol placed after an instruction is considered as an argument.
Dockerfile Instructions
- The `FROM` instruction is the most important instruction, and it is the first valid instruction of a `Dockerfile`. It sets the base image for the build process. The subsequent instructions use this base image and build on top of it. The `docker build` system looks for the image on the Docker host first; if the image is not found, it looks in the Docker Hub Registry. A complete example Dockerfile tying the instructions in this section together appears at the end of this list.
The FROM instruction:
# <image> is the name of the image which will be used as the base image
# <tag>: optional tag qualifier for that image. If any tag qualifier has not been specified, then latest is assumed
FROM <image>[:<tag>]
- Docker allows multiple `FROM` instructions in a single Dockerfile in order to create multiple images, but this is discouraged.
- The `MAINTAINER` instruction is an informational instruction of a Dockerfile that enables authors to set the details of the image. It is recommended that you put it after the `FROM` instruction.
- The `COPY` instruction enables you to copy files from the Docker host to the filesystem of the new image: `COPY <src> ... <dst>`
  - `<src>`: This is the source file or directory in the build context, that is, the directory from where the `docker build` subcommand was invoked.
  - `...`: This indicates that multiple source files can either be specified directly or be specified by wildcards.
  - `<dst>`: This is the destination path in the new image into which the source file or directory will be copied. If multiple files have been specified, then the destination path must be a directory and it must end with a slash (/).
- Using an absolute path is recommended. In the absence of an absolute path, the `COPY` instruction will assume that the destination path starts from the root (/).
- The `ADD` instruction is similar to the `COPY` instruction, but it can also handle TAR files and remote URLs: `ADD <src> ... <dst>`
- The `ENV` instruction sets an environment variable in the new image. An environment variable is a key-value pair, which can be accessed by any script or application. Linux applications use environment variables a lot for their startup configuration: `ENV <key> <value>`
- The `USER` instruction sets the startup user ID or user name in the new image: `USER <UID>|<UName>`
- The `WORKDIR` instruction changes the current working directory from `/` to the path specified by this instruction. The ensuing instructions, such as `RUN`, `CMD`, and `ENTRYPOINT`, will also work on the directory set by the `WORKDIR` instruction: `WORKDIR <dirpath>`
  - `<dirpath>` is the path of the working directory to be set.
- The `VOLUME` instruction creates a directory in the image filesystem, which can later be used for mounting volumes from the Docker host or other containers.
- The `EXPOSE` instruction opens up a container network port for communicating between the container and the external world: `EXPOSE <port>[/<protocol>] [<port>[/<protocol>]...]`
- The `RUN` instruction is the real workhorse during build time, and it can run any command. The general recommendation is to execute multiple commands by using a single `RUN` instruction. This reduces the number of layers in the resulting Docker image because the Docker system inherently creates a layer each time an instruction is called in the Dockerfile.
- The `CMD` instruction can run any command just like the `RUN` instruction, but the `CMD` instruction is executed when the container is launched from the newly created image, while the `RUN` instruction is executed during build time. Only the last `CMD` instruction is executed.
- The `ENTRYPOINT` instruction helps in crafting an image for running an application (entry point) during the complete life cycle of the container, which would have been spun out of the image.
- The `ONBUILD` instruction registers a build instruction to an image, and it is triggered when another image is built by using this image as its base image: `ONBUILD <instruction>`
- The `.dockerignore` file can be used to keep some of the files in the working directory from being sent to the Docker daemon as part of the build context.
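- A complete example Dockerfile tying most of the instructions above together. This is a hedged sketch rather than a production setup: the base image, package, file names, and port are illustrative, it assumes an index.html exists in the build context, and ADD, USER, and ONBUILD are left out for brevity:
# Base the image on Ubuntu 14.04 and record the author
FROM ubuntu:14.04
MAINTAINER Jane Doe <jane@example.com>
# Set an environment variable used by the Apache startup scripts
ENV APACHE_LOG_DIR /var/log/apache2
# Install the web server in a single RUN instruction to keep the layer count down
RUN apt-get update && apt-get install -y apache2 && rm -rf /var/lib/apt/lists/*
# Copy a file from the build context into the image and set the working directory
COPY index.html /var/www/html/
WORKDIR /var/www/html
# Mark the log directory as a mountable volume and expose the HTTP port
VOLUME /var/log/apache2
EXPOSE 80
# Run Apache in the foreground when a container is launched from this image
ENTRYPOINT ["/usr/sbin/apache2ctl"]
CMD ["-D", "FOREGROUND"]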
Publishing Images
- This section focuses on publishing images in a public repository for public discovery and consumption.
- The Docker Hub is a central place used for keeping Docker images, either in a public or a private repository. The Docker Hub provides features such as a repository for Docker images, user authentication, automated image builds, integration with GitHub or Bitbucket, and the management of organizations and groups.
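- A minimal sketch of pushing a locally built image to the Docker Hub; the Docker Hub user ID and repository name are placeholders:
$ sudo docker login
$ sudo docker tag localrepo/busyboxplus:0.1 yourhubid/busyboxplus:0.1
$ sudo docker push yourhubid/busyboxplus:0.1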
Running Services in a Container
- Like any computing node, Docker containers need to be networked in order to be found and accessed by other containers and clients. In a network, any node is identified by its IP address. Docker internally uses Linux networking capabilities to provide network connectivity to containers.
- How Docker selects an IP address for a container:
  - During the installation, Docker creates a virtual interface with the name `docker0` on the Docker host. It also selects a private IP address range and assigns an address from the selected range to the `docker0` virtual interface. This selected IP address is always outside the range of the Docker host's IP addresses in order to avoid an IP address conflict.
  - Later, when we spin up a container, the Docker engine selects an unused IP address from the IP address range selected for the `docker0` virtual interface. Then, the engine assigns this IP address to the freshly spun container.
- You can run `sudo docker inspect <container ID>` to get information about the container, including its network settings, as seen below:
"NetworkSettings": {
"Bridge": "docker0",
"Gateway": "172.17.42.1",
"IPAddress": "172.17.0.12",
"IPPrefixLen": 16,
"PortMapping": null,
"Ports": {}
},
In the traditional paradigm, the server applications are usually launched in the background either as a service or a daemon because the host system is a general-purpose system. However, in the container paradigm, it is imperative to launch an application in the foreground because the images are crafted for a sole purpose.
- Docker achieves network isolation for the containers by the IP address assignment criteria:
- Assign a private IP address to the container, which is not reachable from an external network
- Assign an IP address to the container outside the host's IP network
- The `-p` option of the `docker run` subcommand enables you to bind a container port to a user-specified or auto-generated port of the Docker host. Thus, any communication destined for the IP address and the port of the Docker host is forwarded to the port of the container. The `-p` option supports the following four formats of arguments:
  - `<hostPort>:<containerPort>`
  - `<containerPort>`
  - `<ip>:<hostPort>:<containerPort>`
  - `<ip>::<containerPort>`
  - where `<ip>` is the IP address of the Docker host, `<hostPort>` is the Docker host port number, and `<containerPort>` is the port number of the container.
$ sudo docker run -d -p 80:80 apache2 # Runs the apache2 container and maps port 80 of the Docker host to port 80 of the container
- The Docker engine achieves this seamless connectivity by automatically configuring Network Address Translation (NAT) rules in the Linux `iptables` configuration.
- You can also expose ports by using the `EXPOSE` instruction in a Dockerfile and then passing the `-P` flag to the `docker run` subcommand.
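- A small sketch of auto-generated port binding; the apache2 image name and the container ID are placeholders:
# Publish all ports exposed by the image to auto-generated host ports
$ sudo docker run -d -P apache2
# Show which host port was bound to container port 80
$ sudo docker port 742718c21816 80
0.0.0.0:49153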
Sharing Data with Containers
- A data volume is part of the Docker host filesystem, and it gets mounted inside the container. A data volume can be inscribed in a Docker image by using the `VOLUME` instruction of the Dockerfile, and it can be prescribed during the launch of a container by using the `-v` option of the `docker run` subcommand.
The Docker engine provides a nifty interface to mount (share) the data volume from one container to another. Docker makes this interface available through the `--volumes-from` option of the `docker run` subcommand. The `--volumes-from` option takes a container name or container ID as its input and automatically mounts all the data volumes available on the specified container.
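- A minimal sketch of both options; the host path, container names, and mount points are illustrative placeholders:
# Mount a Docker host directory into a container as a data volume
$ sudo docker run -d -v /srv/appdata:/data --name datastore busybox sleep 3600
# Mount every volume of the datastore container into a second container
$ sudo docker run -i -t --rm --volumes-from datastore busybox ls /data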
Orchestrating Containers
- One of the prominent features of the Docker technology is linking containers. Containers can be linked together to offer complex and business aware services. The linked containers have a kind of source-recipient relationship, wherein the source container gets linked to the recipient container, and the recipient securely receives a variety of information from the source container.
- The Docker engine provides the `--link` option in the `docker run` subcommand to link a source container to a recipient container: `--link <container>:<alias>`, where `<container>` is the name of the source container and `<alias>` is the name seen by the recipient container.
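- A small sketch of linking; the image and container names are illustrative, and with the alias redis the recipient sees environment variables such as REDIS_PORT plus an /etc/hosts entry for the alias:
$ sudo docker run -d --name redis-server redis
$ sudo docker run -i -t --rm --link redis-server:redis busybox env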