Here's Why Docker Images Start With "FROM ubuntu"

In Docker, we run processes, such as Rails or Node.js applications, in isolated environments. Why, then, do Dockerfiles (the files that define Docker images) start with lines like FROM ubuntu? Are containers running full Ubuntu operating systems? If not, why do we specify an entire operating system? For that matter, what does it even mean to "run an operating system"?

These instructions are tailored for macOS users, but work for Linux users as well. Just replace Homebrew with your package manager.

Let's install the process viewing application htop to get an understanding of how containers work. On your Mac, open a terminal and run brew install htop. Once the install completes, run it by typing htop and hitting enter.

In the htop process viewer interface, press t to switch to tree view. Look at the top of the process tree, and you'll see that every process on your Mac descends from one of two top-level processes:

A screenshot of htop for Mac showing that kernel_task and launchd are the two top level processes in tree view

All operating systems have a "kernel", the core computer program of the operating system that controls everything and facilitates access to hardware.

When your computer boots, it loads the "kernel" program into memory and hands over control to it. The kernel is the glue between the hardware and the rest of the operating system. The kernel then starts an "initialization" process, which is launchd on your Mac. By convention, a process name ending in d usually signifies a "daemon," which is a process that runs in the background and doesn't accept user input.

I think of launchd as the program that boots the operating system. It's responsible for setting up networking and for scheduling the jobs and services the operating system needs.

Key concept: All operating systems have some variation of a "kernel" and an "initialization program". On your Mac, the initialization program is launchd. On the server hosting this website, I see /sbin/init as the root process. It's different per operating system, and can also be changed by savvy users.
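
To check which initialization program your own machine uses without opening htop, you can ask for the name of process ID 1, the first user-space process the kernel starts. A minimal sketch, assuming a ps that accepts the POSIX -p and -o options (the macOS and common Linux versions do):

```shell
# Print the command name of process ID 1.
# The trailing "=" after comm suppresses the column header.
ps -p 1 -o comm=
```

On a Mac this should print a path ending in launchd; on Linux it commonly prints init or systemd.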

Ok, so what's going on in a Docker container? To find out, let's run the same htop tool in an Ubuntu Docker container. Make sure you have Docker for Mac installed first.

If you're still running htop in your Mac's terminal, press ctrl-c to exit back to the shell. Then, in the terminal, run:

docker run --rm -it ubuntu bash

A brief explanation of this command:

  • We're running bash in a container
  • Using ubuntu as the base "image"
  • Using the flags -it so our keystrokes get into the container
  • Using --rm so the container will be removed automatically after we stop it

This command will download the ubuntu image if you've never run it before. Subsequent runs use your local image cache.

Now we're in a Bash shell inside an Ubuntu Docker container. Is Ubuntu running in the container? Meaning, is there an initialization program running inside the container? Let's check! In the container shell, install htop:

apt-get update && apt-get install htop

Then run it:

htop

A screenshot of htop running in an ubuntu Docker container showing the only processes visible are bash and htop itself

Is there something that looks like an initializing process here? Nope!

So no, we aren't running a full Ubuntu operating system in the container.
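
To confirm this without htop, we can ask the container directly which process has ID 1. A minimal sketch, assuming the container shell from above is still open (quit htop first) and that /proc is mounted, as it normally is in a Linux container:

```shell
# Run inside the container shell, not in the Mac terminal.

# Name of process ID 1 as seen from inside this container.
cat /proc/1/comm

# Every numeric entry under /proc is one process visible from here.
ls -d /proc/[0-9]*
```

In a container started with docker run --rm -it ubuntu bash, the first command should print bash: the command we asked Docker to run is the container's first process, and no init system sits above it.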

When we run Bash in a container on a Mac, the "container" is actually an isolated environment inside a hidden Linux virtual machine. Docker for Mac starts this VM automatically, and containers run in isolation inside it. The VM does run its own kernel and init process; otherwise it wouldn't be running. We can't see those processes from inside the container, because the container runs in "isolation," which is the point of Docker.
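
One way to see the split between the VM's kernel and the image's files is to ask the container which distribution and which kernel it reports. A small sketch, again assuming the container shell is still open:

```shell
# Run inside the container shell.

# Distribution name, read from a file that ships in the ubuntu image.
grep PRETTY_NAME /etc/os-release

# Kernel name and release, reported by the kernel the container shares with the Linux VM.
uname -sr
```

The distribution line should mention Ubuntu, while the kernel should be Linux with whatever release Docker's VM happens to run. Running uname -sr in a regular Mac terminal reports Darwin instead.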

My mental image is that the whale icon in my Mac's status bar holds a little Linux computer, and containers are ephemeral isolated environments created inside that computer.

So why is it FROM ubuntu?

What does it mean to be FROM ubuntu and why even specify an operating system?

Recall that earlier we made a distinction between the kernel and the operating system. The operating system (in this case, Ubuntu) brings along lots of software and libraries that we want to be repeatable between builds and environments. We've already seen that Ubuntu ships with the bash shell (unlike Alpine Linux, FROM alpine, which by default has only sh, not bash).

We've also seen operating systems have their own package managers (apt-get in our case), which we want to specify per-container to ensure we can install dependencies for the right operating system. Operating systems also have system libraries that our programs might depend on to run.
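
The difference between base images is easy to check from the host. A rough sketch, assuming Docker is running and can pull the ubuntu and alpine images; each command starts a throwaway container:

```shell
# Run in a host terminal (Mac or Linux), not inside a container.

# ubuntu: print where bash and apt-get live.
docker run --rm ubuntu sh -c 'command -v bash; command -v apt-get'

# alpine: print where its package manager (apk) lives, then look for bash.
docker run --rm alpine sh -c 'command -v apk; command -v bash || echo "bash is not installed"'
```

Both containers share the same kernel in Docker's VM. What differs is the set of files each image brings along, including the shell and the package manager.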

Key concept: When we define Dockerfiles, we don't just specify how to run our process. We also specify how to build it, and we often need package manager tools to install the necessary dependencies for Rails/Node/etc. Applications also usually depend on system libraries, such as those for networking. Operating systems happen to have all of this out of the box!
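
To tie this together, here is a sketch of a tiny Dockerfile that repeats the htop experiment from earlier, written to disk from the shell. The directory and image name htop-demo are hypothetical, chosen only for this illustration:

```shell
# Create a directory holding a three-line Dockerfile.
mkdir -p htop-demo
cat > htop-demo/Dockerfile <<'EOF'
# Base image: supplies Ubuntu's files (bash, apt-get, system libraries), not a kernel.
FROM ubuntu
# Build step: use Ubuntu's package manager to install what our process needs.
RUN apt-get update && apt-get install -y htop
# Run step: the single process the container starts.
CMD ["htop"]
EOF
```

With Docker running, docker build -t htop-demo htop-demo followed by docker run --rm -it htop-demo should build the image and drop you straight into htop, this time with htop itself as the container's only process.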