Docker is fantastic, and it's changed the way I build software. There are a ton of Docker tutorials and documentation online, but they generally have a high threshold: they assume you come from a DevOps background, or at least have a good sense of how all the parts fit together. What if you're not sure of the best practices, or even where to start? DevOps is hard, and it doesn't help that the Docker project is under active development. Guidelines even six months old are often well out of date now. Maybe you'd like to start using these tools, but you don't want to back yourself into a corner.
That's the position I found myself in when I started looking for a better way to develop and deploy software. Specifically, for my academic, artistic and research projects: I'm a small one-person shop, and most of my projects are not enormous. I can't afford the hosting bills on a full-blown AWS setup, or even Heroku, when I might get a couple of visitors a month. At the same time, I'd love to make sure my code can scale and deploy as needed... just in case. I also had a sense that Docker could simplify my (development) life, which is incredibly important when juggling multiple projects. Docker helps a lot with the simplification, but not before making things more complicated at first.
This post is the introduction to Docker that I wish I had had when I was starting out. This is not a tutorial (if you just want to dive in and look at code, I've put together a boilerplate repository). Instead, this post focuses on high-level concepts and theory of operation rather than syntax or recipes. I wrote it as a way to organize my own thoughts as I was learning; I hope it gets you thinking more clearly as you google-search through the hundreds of documents and examples online.
What is Docker? It's a bit like a virtual machine in that it allows you to run sandboxed environments, but the technology under the hood is completely different. The engineering here is very cool, but of more immediate interest is what it lets us do: Docker allows us to build, run and tear down complete software environments quickly and easily, without concerning ourselves much with the parent environment. If something runs in a Docker container, it will run on any host that supports Docker. Not "likely" to - it will. No more "works on my machine," no more messing around trying to get everyone "set up with a dev environment." No more (or a lot less) figuring out how to build a load balancer. No more (or a lot less) figuring out intricate internal routing setups to create private relationships between backend services. Yay!
In addition, this self-contained approach allows us to scale production apps in interesting and efficient ways. If we properly compartmentalize our code, we can multiply containers to scale to exactly what we need, no more and no less. All of this lends itself very well to "cloud computing" on something like Digital Ocean, where we can make use of pushbutton deployment to add and remove computational power on an ad-hoc basis.
The first thing that confused me: Docker can be used in many different ways. The same toolset can be used to construct production hosting environments and it also can be used to create development environments. It is not, however, a tool specifically for making pipelines between the two, nor are the choices made for one necessarily good for the other. Docker is general purpose like a knife, and like a knife it doesn't prevent you from doing things that are dangerous or stupid - it's up to you to apply it correctly.
One common use case, for example, is to bake your code into a docker image for deployment. This is exactly what you want in a containerized production swarm, where it is important that replicas really are exact replicas and can come up quickly. In a development context, you probably want more flexibility: the ability to change the code on the fly without rebuilding and deploying the whole container. Same tools, different use case.
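As a sketch of the "baked-in" production style, assuming a Node app with a `server.js` entry point (the filenames and base image here are illustrative, not a prescription):

```dockerfile
# Hypothetical Dockerfile: the app code is copied into the image itself,
# so every replica launched from this image is byte-for-byte identical.
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```

For development you'd more likely skip the `COPY` and bind-mount your working directory into a container instead (e.g. `docker run -v "$(pwd)":/app ...`), so edits on your machine show up in the container immediately without a rebuild.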
You should approach these two broad categories of task differently: development and production. The toolchain we have with Docker can be used for both, but the way in which it is applied is different. What's initially confusing here is that Docker makes it easy to move from one to the other, including using the same syntax and even the same config filenames. This is powerful and you will learn to love it, but keep it in mind as you read through tutorials: the authors don't always tell you what case they're solving for.
Most online tutorials will show you how to get something up and running locally in a single container (usually as a long command line), or perhaps something slightly more complex as a compose file. They might even show you how to deploy in a swarm of one, but they almost always leave the details of architecture as an exercise for the reader.
In most cases you're probably fine - you don't really need ten concurrent database servers for your small project - but it's particularly confusing when the authors don't make it clear whether what they're putting together is good for development versus production (or more importantly: why!). There is also nothing quite as frustrating as finding something that tells you exactly what you think you need, only to have it end with "don't do this for real."
The major lesson here for me was to fall out of love with Docker for a bit - to decompose the problems and figure out where the limits are. Docker will solve a lot of problems, but it won't solve all of them. You may discover things that Docker does much better (like internal private "overlay" networking between containers), but you might also discover things it can't handle at all (like restricting access to a single IP).
The overall approach of Docker is to decompose your server stack into services. The word Docker uses for a single logical unit is the container, which is created from a base image plus various settings. You should think of each major task you want to perform as occupying a single container.
A static website or basic shell environment might need only a single container. For example, you could spin up a "node", "nginx" or "apache" image to complete a task. This is often done for one-off tasks, such as stages in build pipelines. It's also sometimes used for running command-line apps.
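As a minimal sketch of the single-container case, here's one way to serve a static site straight from the official nginx image (the `./site` directory and port 8080 are assumptions):

```shell
# Serve the static files in ./site on port 8080 using the official nginx image.
# --rm removes the container when it stops; no Dockerfile is needed at all.
docker run --rm -p 8080:80 \
  -v "$(pwd)/site":/usr/share/nginx/html:ro \
  nginx:alpine
```

Everything here is configuration passed at run time - the image itself is untouched.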
For most client/server setups, where you want to have something (or multiple somethings) running in support of an app you are building, you need a little more complexity. A typical stack might be something like: a React app served via nginx, talking to a database backend via some kind of API. In addition you probably want to scale, support SSL and perhaps add other services later, so you reverse-proxy access via nginx. Thus you might minimally conceptualize this as a stack of three containers: [postgres] + [react+server] + [nginx].
1. Images/Single Containers
The base unit of docker-ness is the image. A "Dockerfile" describes how to build an image (usually starting from a base image), and a container is a running instance of an image.
It is possible to build your own images, and eventually you will want to, but this is not where you should begin: instead, start by making use of public images, in particular those regularly maintained by the larger projects. You can do this because images can be published to repositories. For most major tasks (setting up a database, launching a webserver) there exist well-documented images that accept enough parameters to get the job done.
If you only want to do one single thing on your development machine (such as launch a static web server or standalone unix image) then you can likely stop here. Use a public image and pass in parameters to accomplish your task. There are images for pretty much every project you can think of.
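To illustrate the "public image plus parameters" approach, here's a sketch of a throwaway local database - the container name, password and database name are placeholders, but the environment variables are the ones the official postgres image documents:

```shell
# Launch a disposable Postgres instance configured entirely through
# parameters - no custom image or Dockerfile required.
docker run -d --name dev-db \
  -e POSTGRES_PASSWORD=devpass \
  -e POSTGRES_DB=myapp \
  -p 5432:5432 \
  postgres:15
```

When you're done: `docker stop dev-db && docker rm dev-db`, and the whole environment is gone.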
2. Docker Compose

Docker is really powerful when you start combining multiple containers together, and docker-compose is the tool you use for this. Docker Compose uses a single file, docker-compose.yml, which allows you to describe the containers you want and the relationships between them. Using this file you can bring up your containers in a specific order, specify how and where they become available to the outside world, where they store their data, and any number of more interesting and complicated things. In addition, it is possible to describe containers in terms of build instructions, which means that you can have multiple images assemble themselves and then deploy in relation to each other. This is especially useful for development, where you can create complex containers on the fly.
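As a sketch, a docker-compose.yml for the three-container stack described earlier might look something like this (the service names, port mappings and the local ./app build context are assumptions, not a prescription):

```yaml
# docker-compose.yml - a development-flavored sketch of [postgres] + [app] + [nginx].
version: "3.8"
services:
  db:
    image: postgres:15          # public image, configured via parameters
    environment:
      POSTGRES_PASSWORD: devpass
    volumes:
      - db-data:/var/lib/postgresql/data   # data survives container teardown
  app:
    build: ./app                # built on the fly - handy for development
    depends_on:
      - db
  proxy:
    image: nginx:alpine         # reverse-proxies the app to the outside world
    ports:
      - "80:80"
    depends_on:
      - app
volumes:
  db-data:
```

A single `docker-compose up` brings the whole stack up in dependency order; only the proxy is exposed to the host, while db and app talk over Compose's internal network.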
If you want a stack of services for local development purposes, then you can probably stop here.
3. Stacks and Swarms
Docker stack does something very similar to docker-compose. In fact, it's confusing because it initially looks exactly the same - you even describe a stack using the same docker-compose.yml file! One of the first things you'll notice, however, is that some of the settings in your docker-compose.yml are ignored when you deploy as a stack. In particular, docker stack doesn't support 'building' anything; it will only work with prebuilt images. This is a huge clue to what stack is for: deployment. Specifically, it's for deployment in a "swarm", which may consist of a single computer but might also be a large and complicated system featuring multiple nodes. In such a case, consistency is important: you need to guarantee every container is the same at pretty much the same time, hence the lack of build support. Swarms also have some other neat features, including docker secrets, a way to securely share information like API tokens across your entire swarm.
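The whole dance on a single-node swarm is only a few commands - a sketch, where the stack name `mystack` and the secret name `api_token` are placeholders:

```shell
# Turn this machine into a one-node swarm.
docker swarm init

# Store a secret in the swarm; services can read it at /run/secrets/api_token.
echo "my-api-token" | docker secret create api_token -

# Deploy using the same compose file - any 'build:' entries are ignored,
# so every service must reference a prebuilt image.
docker stack deploy -c docker-compose.yml mystack

# Check what came up and how many replicas are running.
docker stack services mystack
```

The same file, a different subcommand, and suddenly you're in deployment territory.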
Heuristic: use docker and docker-compose for local work. When you're ready for deploying to the outside world, start thinking about swarms and stacks.
The transition between local development and production or staging is where docker repositories come in handy. Docker repositories (or more generically: container repositories) do what they say on the tin: they store an image ready for use elsewhere.
One major difference from code repositories: images are "snapshots" and can't be partially versioned the way code can. Images can be tagged (including the automatic 'latest' tag), but each is a sealed box including all the code and configuration you need. As a good practice, you should tag each image with a release number or name to ensure all the containers running your image are aligned (something that is especially important when you're running at scale).
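In practice the tag-and-push cycle looks something like this (the registry path and version number are placeholders - substitute your own):

```shell
# Build the image and tag it with an explicit release number.
docker build -t registry.example.com/me/myapp:1.2.0 .

# Push it to the registry so your swarm can pull it.
docker push registry.example.com/me/myapp:1.2.0

# 'latest' is just another tag, not magic; pin your stack file
# to the release tag so every replica runs the same snapshot.
```
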
Back in the early days, when you first ran 'docker run hello-world', you used a repository. That's where the 'hello-world' image came from.
In step 2, when you were using docker-compose, you probably used a repository (or several). That's where images like postgres and nginx came from when you specified 'image:'.
For deployment purposes, you're going to want your app packed into a container so your production cluster can launch it into a swarm without any trouble.
By the time you are ready to deploy a swarm you will have read enough tutorials that there's not much more I can tell you, except some parting advice: Gitlab offers unlimited free private container repositories.
There's a companion repository I've put together to get you started.
Run through this and you will end up with an SSL-secured Docker swarm manager prepped to deploy nginx reverse-proxied apps, which it will retrieve automatically on update via Watchtower. In addition, this setup includes a functional firewall and an IP-restricted Portainer.
Note: You probably do NOT want anything here if you are just trying to use Docker locally on your own machine. This is specifically for getting an SSL-secured swarm set up for hosting, which is something I found sufficiently difficult that I decided to pull it all together. Happy Containering!