Designing REST APIs – lessons learnt

Introduction

We’ve been designing and deploying REST APIs for a while now. Time to document some of the lessons we’ve learnt during that process. They’re not presented in any particular order and they relate to various parts of the development lifecycle.

How to do paging

If a list query (say GET /members?start=0&count=100) could return a total of, say, 15,000 results, you need to indicate this total in the response (so the client can show a paging control).

Our early APIs returned an “envelope” object, containing the list of 100 members plus the total count. This is OK, but it forces the client to implement an extra class for every list query, whereas what the client really expects is an array of members.

This can be achieved by putting the total count in a response header (X-Total-Count is often used) and returning a pure list of members. Note that even if you don’t expect a list to grow long, it’s a good idea to provide paging anyway for consistency.
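
For example, a page request and a (hypothetical) response might look like this – only X-Total-Count and the counts come from the discussion above, the member objects are illustrative:

    GET /members?start=0&count=100

    HTTP/1.1 200 OK
    X-Total-Count: 15000
    Content-Type: application/json

    [ { "id": 1, "name": "Alice" }, { "id": 2, "name": "Bob" }, ... ]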

How to handle long running tasks

Users of your API will expect immediate responses; you should not leave them waiting 20 seconds for an operation to complete (the request is likely to time out on the client side). If the underlying operation might take longer than, say, one second, run it in the background and return a task id, and provide an additional method to get the status of the task via its id (and to cancel it, if possible).
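
As a sketch (the endpoint names and the task id are purely illustrative):

    POST /reports              ->  202 Accepted, body: { "taskId": "a1b2c3" }
    GET /tasks/a1b2c3          ->  { "status": "running" }   (poll until "done")
    DELETE /tasks/a1b2c3       ->  cancels the task, where cancellation is supported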

How to handle versioning of your API

The simplest way to handle versioning is to introduce the version into the path of your API, e.g. GET /items, GET /v2/items. This is a common pattern and users will accept it. However, you shouldn’t mix versions, so documentation and playground for older versions should only be available via a “previous versions” link.

Naming in REST APIs

Naming conventions are an important part of making your API easy to use. A number of de facto naming conventions have emerged which you can and should follow.

  • id
    Objects should, wherever possible, have an id property which identifies them uniquely.
    {
      "id": 5,
      "color": "blue"
    }

  • list vs objects
    Use plural names for lists, singular for objects:
    GET /items?start=20&count=10 will be expected to return a list of up to 10 items starting at item 20.
    GET /item/2 will be expected to return an item with id=2.

Authentication

Since REST APIs should be stateless, every API call should be authenticated separately.

REST APIs can use any authentication method supported by http(s). However, most APIs these days use an API key (a GUID like 7b515391-79e3-4857-92b1-9f7d0f099fcd). The most unobtrusive way to pass the API key to the server is to use a request header (called something like api-key). You can also use a query parameter (e.g. GET /items?api-key=7b515391-79e3-4857-92b1-9f7d0f099fcd). For ease of use (especially from a browser), you should support both styles.
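
For example, with curl (the host name is a placeholder):

    # API key in a request header
    curl -H "api-key: 7b515391-79e3-4857-92b1-9f7d0f099fcd" https://api.example.com/items

    # API key as a query parameter (convenient when testing from a browser)
    curl "https://api.example.com/items?api-key=7b515391-79e3-4857-92b1-9f7d0f099fcd"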

An API alone is not enough

These days, people expect a lot from APIs. As well as a great API, you’ll also need an online portal with documentation, code samples in multiple languages and a sandbox/playground where people can try out your APIs in a production-like manner.

Which tools to use

We use Swagger/OpenAPI to develop the API and then we wire it to a back-end (Java or Python) using OpenAPI Java or Python inflectors [1]. We use Redoc for online documentation of REST APIs. Postman is indispensable for ad hoc testing during development and curl is a useful simple command-line API client. We use HAProxy for load-balancing and Let’s Encrypt for SSL certs. We use Docker for deployment (not just for the server, but also for all infrastructure components like SSL termination, proxying etc).



[1] You’ll often see people generating a REST API from a server implementation. At first glance, this seems easier than the API-First approach we use, but it’s the wrong approach because your API must be pristine and if you generate it from an implementation, you will inevitably bleed some implementation details into the API.

API-First Design

Introduction

We have been applying API-First design principles for several years now and advise our customers to do the same. As the name suggests, API-First design means designing a system around an API rather than adding an API to an already designed system as an afterthought.

Note: when we say API, we mean a REST API using JSON object representation – this is because virtually all APIs use REST these days, since REST is based on http(s), which has emerged as the dominant communication protocol of the internet.

Why API-First design?

The main reasons for designing a system around an API are:

* It’s canonical: the API is the canonical form of a system – i.e. stripping off details of presentation, storage etc, the API represents the system design in its most reduced, pure form. The API should remain valid even as you swap out client and server implementations.

* You can’t do without it: the fragmentation of client technologies (web, mobile etc.) means that we need to support multiple clients, making an API almost unavoidable.

* You’d better start with it: if you don’t make an API the starting point of your design, it’s going to be very hard to achieve a correct API without any bleeding of implementation details.

How to do API-First design?

The starting point of API-First design is that the API is authored and verified independently of any implementation. Crucially, this means creating the API with a set of specialised API design tools such as OpenAPI (which we use), NOT generating it from Java or Javascript classes. This lesson is often hard for developers to learn because they are used to thinking of an API as a way to invoke their code, rather than the other way around (namely that their code serves the API).

What this means in practice is that when your customers request an API, you must first design the correct API and then figure out what changes are required to the back-end code to serve that API. This is more difficult than it sounds – APIs follow standards with respect to statelessness, security, paging etc which may be very hard to serve with existing code. If you give in to the temptation of asking your API users to live with the limitations of your back-end implementations, you will regret it – your customers will be unhappy from the start and you will likely have to live with those limitations for the lifetime of the system, even after you have swapped out the server implementation. Of course you can make v2 of your API, but you will still have to serve v1, because some customers will not be able or willing to upgrade their clients.

Connecting your API to your back-end systems

We said in the previous section that you should not generate the API from the back-end system, but rather design it independently. So you may be wondering how the API gets connected to your back-end system. After you’ve designed your API (we use the OpenAPI Swagger editor to author it), we use an “inflector” to wire up the API to the back-end logic. OpenAPI provides inflectors for various languages, including our two preferred back-end languages, Java and Python. The inflector takes the API definition (OpenAPI uses a yaml or JSON definition) and takes care of invoking the back-end methods defined in it. In practice this means that if you have existing back-end logic, you’ll usually need to modify it or create adapter classes to map between the API and the back-end. That’s the price of API-First, but it’s a price worth paying to achieve a usable and robust API.
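
As a rough illustration, each operation in the API definition carries an operationId, which the inflector uses to route the incoming call to a back-end handler (the path and names below are placeholders):

    paths:
      /items:
        get:
          operationId: listItems    # routed by the inflector to a back-end method
          responses:
            '200':
              description: a list of items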

Follow standards!

Try not to reinvent any wheels. There are emerging de facto standards for designing REST APIs. Follow them. People will generally not thank you for designing new ways to handle naming, authentication, state or paging, for which there are already standards. It’s tempting to think that your business case is unique and therefore deserves a unique API with its own conventions; however, remember that customers will often have experience writing clients for multiple APIs and they will be grateful whenever you follow patterns that they already understand.

For example, a REST GET method named /messages will be expected to return a list of JSON Message objects. That should be the same Message object which is returned by GET /message/{id}. There should not be an arbitrary MessageList class (this type of construct is a typical side-effect of generating an API from an existing language-specific implementation). The required paging metrics, such as the total number of items matched by a query, should be handled by out-of-band variables, such as an X-Total-Count header, thus leaving the content as a simple object list.

However, these are only emerging standards and there’s no single definitive guide. The best way to learn them is by consuming (i.e. writing clients for) several well known REST APIs and by reading best-practice articles (see below for some links).

More reading

Zalando restful API guidelines
Red Hat – thoughts on restful design

git workshop

Introduction

In this post, we’ll cover the basics of using the git source control system, which has emerged as the dominant tool in this space (especially since Microsoft paid $7.5B for github.com).

Audience

This workshop is intended for developers who already know what a source control system is and why you need one.

Platforms

We’ll do this workshop on Ubuntu, but git itself is completely cross-platform, so everything in this workshop is applicable for other platforms such as macOS and Windows. There are various GUIs available, but we’ll concentrate on the command line, since that’s what you should learn first. Many IDEs and editors (like VSCode) have git support built in, but it’s still important to be comfortable with the git command-line.

Installing git

Note: on macOS, git is preinstalled and on Windows you can install it from https://git-scm.com
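
On Ubuntu, the distribution package is all we need for this workshop:

    sudo apt-get update
    sudo apt-get install -y git
    git --version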

Creating a repository

The first thing to note is that, unlike previous generations of source-control systems, git does not require a server – it creates its local repositories inside hidden folders (.git) in your project folder.

We’ll start by creating a “git-workshop” folder which we’ll use to hold our example project.

We’ll create a single index.html to represent our project source files.
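
For example (the page content is just a placeholder):

    mkdir git-workshop
    cd git-workshop
    echo "<html><body>git workshop</body></html>" > index.html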

Now we’ll create a new git repository:
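
From inside the project folder:

    git init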

Now we’ll check the status of the git repository with:
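
That is:

    git status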

This tells us that index.html is untracked – i.e. it hasn’t yet been added to the repository – so let’s add it. We use “git add .” to tell git to add all files in the current directory to the repository.

Adding files to the repository

git calls this “staging” files. When a file has been added to a git repository, it’s considered to be in the staging area.

Now let’s check the status again with git status.

Now we see that the files are “staged”, which means that git knows about them. We still need to commit them to create a version of the file in the repository, so let’s go ahead and do that with the “git commit” command.

Committing versions of your files

Before the first commit, git needs to know who you are (a name and email address to record in the commit history), so let’s configure that:
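
The identity below is obviously a placeholder for your own:

    git config --global user.email "you@example.com"
    git config --global user.name "Your Name"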

Now we’re ready to commit our project files to the repository.
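
For example (the commit message is arbitrary):

    git commit -m "initial version"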

Now git has committed index.html to the repository.

Now run git status once more.

We can see that everything has been committed to the repository. So if we now make a change to index.html, we would expect to have one version of the file in our working directory and another, older, version in the repository. Let’s go ahead and do that.
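
Any edit will do, for example:

    echo "<p>a small change</p>" >> index.html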

Now check again with git status.

We can see that git knows that we’ve modified index.html. git will also show us exactly what has changed if we use git diff.

OK, let’s commit the changed version of index.html to the repository:
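
For example:

    git add index.html
    git commit -m "update index.html"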

Note that we again had to add index.html. This seems a little counter-intuitive, but it’s the way git works – each new version of a file must be added (staged) and then committed.

We can now see the history of our index.html file with git log.

Rolling back changes

One of the reasons why you need git is in order to be able to revert to a previous version of your project when you realise that the changes you’re currently making are going nowhere.

Let’s first make a change to index.html.

In a real project, we may have been working for hours and have made many changes, which we now regret and wish to roll back. There are several ways to do this with git, but the simplest one just gets us the latest committed version back again as follows:
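
The classic form checks the last committed version back out over the working copy:

    git checkout -- index.html

(Newer git versions also offer git restore index.html for the same purpose.)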

Let’s check that our changes have been rolled back.

Looks good!

Removing files

If you want to remove a file from the repository (stop tracking it), you can do that with git rm <file>. Note that by default this also deletes the file from your working directory; use git rm --cached <file> if you only want to remove it from the repository.

Deleting the git repository

You can also delete the whole repository simply by removing the .git folder (“$ rm -rf .git”). If you do that, you’ll need to start over again with git init.

Recap

So we’ve learnt that git creates a local repository in a “.git” subfolder when you type “git init”. We’ve also covered adding and committing files to the repository and reverting to a previously committed version.

Until now, you’ve been working alone with a local repository – this covers your own needs for maintaining versions as you work. However, the real power of version control comes when you are working with other people on a project. In the next section we’ll therefore cover remote repositories.

Remote repositories

git supports multiple levels of repositories. Usually, a project will have local repositories on each developer’s machine and a central, main repository which holds the contributions of all the developers involved in the project.

This central repository may be in the cloud (like github.com) or it may be privately hosted by a company. For the purposes of this workshop, we’ll use github.com as remote repository.

I logged onto my account on github.com and created a repository called “rogerarmstrong/git-workshop”. GitHub helpfully gave me a list of onscreen instructions for setting this up as a remote repository for my project. I needed two steps, “git remote add” and “git push”, as shown below:
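
With the repository name from above, that was:

    git remote add origin https://github.com/rogerarmstrong/git-workshop.git
    git push -u origin master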

So now my project is committed to both the local git repository and to the remote repository at github.com. New members of the project team can now do a “git clone https://github.com/rogerarmstrong/git-workshop.git” to start working on the project. This one line will initialise a local repository and get the latest project files from the central repository in a single step.

Working with the remote repository

We continue committing to the local repository as before and when we are ready to publish our changes to the rest of the project team, we push our changes to the remote repository.

We can double-check that our changes have been pushed to the remote repository by looking at the file online at github.com.

To get changes pushed to the remote repository by other developers, we need to do a “git pull”.

Branching

Branching allows you to commit to a separate line of development from the main (“master”) branch. This means that you can continue to commit your changes, but they won’t affect team members who are working on master or on another branch. Below we create a branch called “experimental”:
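
For example:

    git checkout -b experimental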

If we now do a git status, we’ll see that we are on the branch “experimental”.

If we now make a change to index.html and commit it, it’ll be committed to the “experimental” branch.

We can now switch back and forth between the “experimental” and master branches:
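
For example:

    git checkout master
    git checkout experimental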

We can see all branches with:
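
That’s simply (the current branch is marked with an asterisk):

    git branch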

If we now decide we like the experimental branch and want to merge it into the master:
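
Switch back to master and merge:

    git checkout master
    git merge experimental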

Finally, we can delete the experimental branch:
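
That is:

    git branch -d experimental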

Summary

That’s it for now. We hope you enjoyed our basic introduction to git. We have covered how to create a local git repository, how to add and commit files to it, how to create a remote repository, push changes to it and pull changes from it, and finally how to create branches. git has many more commands to deal with the complexities of real-world projects, such as conflict resolution (during merging of changes from multiple developers) – we’ll leave you to explore these more advanced topics for yourself. Have fun!

VMware Clarity/Angular modal dialogs

The VMware Clarity Design System documentation is a bit vague about how modal dialogs should be handled. The examples presented are not really appropriate for real-world applications, where dialogs need to be reusable components, usually containing forms. So, here’s a more realistic example of how to use modal dialogs in Clarity applications.

The code is available as a stackblitz at https://stackblitz.com/github/rogerarmstrong/clarity-sample-modal.

What we want to achieve is a modal component which can be called from anywhere in the application and which takes a model object as input and returns a modified model as output (i.e. the dialog does nothing with the object except allow the user to edit it – the caller has control over what to do with the edited model object).

The application is simple – it displays a user’s details and an Edit button which opens a dialog allowing the first and last name to be edited.

(Screenshots: click the button to show the dialog; edit the user details; the user details are updated.)

The application consists of two components, “home” and “edit-user-dialog”.

In the home component html, we insert the <edit-user-dialog> tag.

 

In the home component code, we reference the edit-user-dialog via a @ViewChild. When the Edit User button is clicked, we invoke the open method on the modal, passing the user object we wish to edit. We also subscribe to the onOK event from the modal, which delivers the modified user returned from the dialog.
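
A sketch of what the home component might look like (the user model and method names are illustrative – see the stackblitz for the real code):

    // home.component.ts
    import { AfterViewInit, Component, ViewChild } from '@angular/core';
    import { EditUserDialogComponent, User } from './edit-user-dialog.component';

    @Component({
      selector: 'app-home',
      templateUrl: './home.component.html'
    })
    export class HomeComponent implements AfterViewInit {
      user: User = { firstName: 'Jane', lastName: 'Doe' };

      @ViewChild(EditUserDialogComponent)
      editUserDialog: EditUserDialogComponent;

      ngAfterViewInit() {
        // the dialog hands back the edited copy via its onOK event
        this.editUserDialog.onOK.subscribe((edited: User) => this.user = edited);
      }

      onEditUser() {
        // pass a copy so nothing changes in our model until the user clicks OK
        this.editUserDialog.open({ ...this.user });
      }
    }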

The dialog html is simple – a form allowing the properties of the user to be edited.

The dialog component has open and close methods.
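
A sketch of the dialog component (the clr-modal binding and the focus handling are assumptions based on the description; again, the stackblitz has the real code):

    // edit-user-dialog.component.ts
    import { Component, EventEmitter, Output } from '@angular/core';

    export interface User { firstName: string; lastName: string; }

    @Component({
      selector: 'edit-user-dialog',
      templateUrl: './edit-user-dialog.component.html'
    })
    export class EditUserDialogComponent {
      visible = false;     // bound to [(clrModalOpen)] in the template
      user: User;          // the working copy edited by the form

      @Output() onOK = new EventEmitter<User>();

      open(user: User) {
        this.user = user;
        this.visible = true;
        // see the note below: give the modal time to render before setting focus
        setTimeout(() => {
          const el = document.getElementById('firstName');
          if (el) { el.focus(); }
        });
      }

      close(ok: boolean) {
        this.visible = false;
        if (ok) {
          this.onOK.emit(this.user);
        }
      }
    }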

Note: setTimeout is needed to reliably set focus in a modal (over repeated opens) because Angular provides no DOM-ready hook to tell us when focus can be set.

Letsencrypt in 15 minutes

I was looking for a simple way to use Let’s Encrypt to enable https for a web site and I found a Docker image, nmarus/docker-haproxy-certbot, which met my needs.

Remember, Let’s Encrypt represents a complete break from traditional certificate issuers in that:
(a) it’s free.
(b) certificate creation, installation and renewal are fully automated.

These are huge advantages relative to working with the previous certificate issuers and anyone who deploys anything to the internet should immediately take advantage of them. Let’s Encrypt’s audacious goal is to improve the whole internet by getting everyone to use https.

Let’s Encrypt provides a “certbot” which handles the whole lifecycle of the certificates for you. There’s plenty of Let’s Encrypt documentation on how to install the certbot into popular web servers (like apache) or proxy servers (like HAProxy). However, what we are doing below is packaging an HAProxy instance with certbot installed as a Docker container, so you can simply put it in front of one or more web properties you want certificates for. That way, you don’t need to touch your existing server or proxy configuration to use Let’s Encrypt certificates.

In our case, our web site was already a Docker container, so I just had to modify the docker-compose file from:
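
The original file had just the web-site service publishing port 80 itself – something like this (the image name is a placeholder for your own web site image):

    web-site:
      image: my-web-site-image
      ports:
        - "80:80"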

to:
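
And afterwards roughly this – the proxy now owns ports 80/443 and reaches the web site via a docker link (the container-side mount points follow the image’s README, so double-check them there):

    haproxy-certbot:
      image: nmarus/haproxy-certbot
      ports:
        - "80:80"
        - "443:443"
      volumes:
        - ~/data/config:/config
        - ~/data/letsencrypt:/letsencrypt
        - ~/data/certs:/certs
      links:
        - web-site

    web-site:
      image: my-web-site-image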

Here, haproxy-certbot is our certificate-issuing, SSL-terminating transparent proxy, which takes care of all certificate-related activities and then passes plain http requests on to our original service.

I first needed to create the three directories “~/data/config”, “~/data/letsencrypt” and “~/data/certs” on my docker host (which the haproxy-certbot container needs for persistent storage of its proxy configuration file and the certificates).

I then took the example haproxy.cfg file provided (see https://hub.docker.com/r/nmarus/haproxy-certbot), and copied it to the ~/data/config directory and changed the backend “my_http_backend” in the haproxy.cfg file to:
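
Roughly:

    backend my_http_backend
        mode http
        server web-site web-site:80 check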

This means that the proxy server now forwards requests to port 80 (http) on the address “web-site”, which is the address of the web-site container, provided to the proxy container via the docker links instruction.

I brought both containers up with “docker-compose up -d”, and checked that my web-site was still available over http.

At this point, our new proxy is passing through http requests to the backend. But it is also ready to handle the requests which Let’s Encrypt will use to verify that you own the domain and issue you the requested certificate.

I then asked Let’s Encrypt to create a certificate with the command:
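
The image wraps certbot in a helper script; per its README the request looks something like this (domain and email are placeholders – drop --dry-run once a test run succeeds):

    docker exec haproxy-certbot certbot-certonly \
        --domain www.example.com \
        --email admin@example.com \
        --dry-run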

This caused Let’s Encrypt to verify that I really owned the domain by visiting the address I provided and checking that it reaches the container. Let’s Encrypt issued the certificate and the certificate was stored (in the ~/data/certs directory which I provided to the container).

I then refreshed the proxy with:
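
Again via a helper script provided by the image:

    docker exec haproxy-certbot haproxy-refresh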

And I was then immediately able to visit the website with https.

Let’s Encrypt certificates are short-lived (a few months), but the haproxy-certbot container automatically renews them for you before they expire.

On another server I had several different microservices for which I wanted https access, so I configured a second haproxy instance as the back-end rather than a web-site. That way, I had one proxy instance handling SSL termination/certificate administration and another routing requests to the various microservices (based on HAProxy host header rules).

The great thing about this approach is that you don’t have to mess around with your existing http services or proxies, instead you simply put this container in front of them.

Backing up ESXi VMs with ghettoVCB

We’ve been using the free ESXi ghettoVCB backup utility for the last 5 years to back up about 150 VMs daily without a glitch. ghettoVCB snapshots the VM, copies away the files (with a configurable retention period) and then removes the snapshot. The resulting backup is a snapshot of the VM, which means that when you need it, you can run the backup copy of the VM directly with ESXi, without having to restore it first. ghettoVCB is fast and reliable (it copies sparse disks correctly to an NFS backup share, so the resulting backup is as compact as the original VM disks).

ghettoVCB has no deduplication capabilities, so it’s usually not appropriate for offsite backup of VMs. We use borg to archive the ghettoVCB backups offsite. This has the advantage that you get indefinite retention offsite (since borg does very efficient deduplication). You can also mount any borg backup (using its FUSE mounting), so you can run any backed-up VM straight out of the archive.

ghettoVCB is available at https://github.com/lamw/ghettoVCB

Highly recommended!

Borg backup

Borg backup (https://github.com/borgbackup) is an open source backup tool which, in addition to the usual backup features like strong client-side encryption and compression, has several important characteristics which make it particularly suitable for handling large offsite backups (like virtual machine backups):

  1. Deduplication: this ensures that even if the source files move or change names, they will not be re-backed up unnecessarily.
  2. The backup can be moved: borg backups are just directories – this means that you can make the first, full backup locally, copy it to the destination via a USB disk and then continue incremental backups over the network.

We are currently using borg for several offsite backups, including a weekly offsite backup of VMware ghettoVCB local backups (VM image clones), and the resulting backup traffic is less than 10% of the file size of the local backups (this obviously depends on how much change occurs between backups, but 90% deduplication is the average over the 15 VMs we back up offsite).

The steps to do an initial backup of a directory (data_dir) locally, move it to a remote server (server1) and resume incremental backups to that server (via ssh as user1@server1) are as follows:
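
A sketch with placeholder repository paths (borg must be installed on both machines – see the notes below):

    # initial, full backup into a new local repository (e.g. on a USB disk)
    borg init --encryption=repokey /mnt/usb/data_dir.borg
    borg create --stats /mnt/usb/data_dir.borg::initial data_dir

    # physically move the repository directory to server1 (e.g. by carrying the USB disk there),
    # then continue incremental backups over ssh against the moved repository
    borg create --stats user1@server1:/backups/data_dir.borg::{now} data_dir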

Notes:
(1) borg should be installed on both the client machine and the server machine.
(2) When you move the backup (repository), the name of the directory you move it to does not need to match that which it was moved from.

Thanks to Gabriel for recommending borg!

Postgres-BDR + Docker: shared-nothing multimaster-replication databases made easy


To achieve fault tolerance, you need redundant systems. There are two basic approaches to redundancy: active-standby and active-active.

Active-standby

Active-standby means that in the event of failure of the active node, a failover to a standby node is carried out.

Active-active

Active-active means that all nodes are continuously active. In the event of failure of a node, that node simply stops being used and the other nodes assume the full load.

The problem with active-standby

Active-standby has a huge problem in the real world: at the moment a node fails, the chances of a smooth failover are hugely reduced, since whatever caused the failure is quite likely to also affect the system’s ability to fail over. In other words, when things are failing, it’s a bad idea to start trying to switch over to standby nodes.

For this reason, active-active is the preferred approach to achieving robust fault-tolerance.

Node independence and shared-nothing architecture

Furthermore, redundant nodes should be as independent from one another as possible. For this, a shared-nothing architecture is the ideal. That way, when a node fails for any reason, there’s no reason to fear that the remaining nodes will also fail.

For example, nodes should not share a server, a network or a database. It’s relatively easy to distribute redundant nodes across independent servers, slightly harder to distribute them across independent networks and very hard to achieve independent databases.

Postgres-BDR

Postgres-BDR (bidirectional replication) provides shared-nothing database redundancy. Postgres-BDR is a special distribution of Postgres with extensions for replication. It can be relatively complex to install, but Docker makes it easy to use. We’ve deployed it in several recent projects and it’s been great.

Below I’ll provide you with step-by-step instructions to install a two-node pair of Postgres-BDR databases as two Docker containers.

Setting up a Postgres-BDR node pair as Docker containers

Here’s how to set up a pair of BDR databases on a pair of Docker hosts (we’ll call them host1 and host2) to replicate a database called “testdb”.

We’re basing our containers on the Docker image “jgiannuzzi/postgres-bdr”, which is available from the central Docker registry.

Here’s the docker-compose file entries required on each host. Each host additionally has a /data directory which will hold the database files.
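
Something like the following on each host (the service name and POSTGRES_PASSWORD are placeholders; each host mounts its local /data directory):

    database:
      image: jgiannuzzi/postgres-bdr
      ports:
        - "5432:5432"
      volumes:
        - /data:/var/lib/postgresql/data
      environment:
        - POSTGRES_PASSWORD=postgres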

Once the containers are up on the two hosts, connect with pgadmin to host1 and do the following:

Create a database “testdb” and run the following queries against it to initialise replication (run each command separately – not as a batch).

NOTE: make sure you have the testdb database selected before running the queries!
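
A sketch of the node-creation queries for host1 (the node names and connection strings are ours – adjust the DSNs to your host names):

    CREATE EXTENSION btree_gist;
    CREATE EXTENSION bdr;

    SELECT bdr.bdr_group_create(
      local_node_name := 'node1',
      node_external_dsn := 'host=host1 port=5432 dbname=testdb'
    );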

Then connect pgadmin to host2:

Create a database “testdb” and run the following queries against it to initialise replication (run each command separately – not as a batch).
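
And the corresponding join on host2 (again, names and DSNs are placeholders):

    CREATE EXTENSION btree_gist;
    CREATE EXTENSION bdr;

    SELECT bdr.bdr_group_join(
      local_node_name := 'node2',
      node_external_dsn := 'host=host2 port=5432 dbname=testdb',
      join_using_dsn := 'host=host1 port=5432 dbname=testdb'
    );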


You can now test that replication is working correctly by creating a test table in the testdb database on host1 (testdb/schemas/public/Tables/Create) and then checking that it is correctly replicated to host2.


Notes:
(1) Replication occurs on the standard postgres port 5432, so this network port must be available between the hosts.
(2) If you need multiple replicated databases, you need to repeat this process for each database.
(3) Sequences will run in independent ranges (default is 50000 apart), to avoid collision if records are added on both sides.
(4) The most efficient application deployment strategy is to treat one database as the primary and the other as the secondary – this results in the least replication traffic and the lowest potential for conflicts.

Docker workshop

Overview:

We recently carried out a short introductory Docker workshop, starting from scratch, installing Docker and taking it through to the point where a software stack, consisting of several linked containers, is deployed using docker-compose. Here’s what we covered.

Docker concepts:

Docker containers are easy-to-deploy units of software, analogous to the shipping containers used by the transport industry, which simplify the job of shipping diverse goods around the world.

Docker images are the templates for containers. Every Docker container is started from an image. Images are defined by a Dockerfile, which contains instructions for building the image based on an existing image (for instance, a web-server image will be based on an OS image, simply adding a layer of web-server software to it).

A Docker registry is where images are stored. Every machine where Docker is installed has a local image store. Additionally, Docker provides a central registry, Docker Hub, from which images are fetched if they aren’t available locally. And finally, you can host your own private registries.

Starting point:

A freshly installed Ubuntu 16.04 server, called docker-test.

Installing Docker:

We won’t use the standard Docker package available from the Ubuntu repositories because Docker is changing fast – instead we’ll add the Docker apt-repository and install the latest version from there.
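
The steps looked roughly like this at the time (check docs.docker.com for the current procedure):

    sudo apt-get update
    sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
    sudo apt-get update
    sudo apt-get install -y docker-ce
    sudo groupadd docker
    sudo usermod -aG docker $USER

(Log out and back in after the last command so the group change takes effect.)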

The last two commands ensure that we can run Docker commands without sudo.

Check that Docker is correctly installed with:
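
That is:

    docker run hello-world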

If everything is OK, Docker will download the hello-world image from the central Docker registry and run it.

Building a simple website container:

The purpose of Docker is to allow you to build and deploy something like a website without having to worry about details like which OS is to be used, which web-server etc. What we want is to have an image called say “website” which takes some files we give it and publishes them via a web-server.

First create a directory “website” where we will work on creating our container.

Create a text file called “Dockerfile” containing the following:
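
A minimal version might look like this (the content path and config location follow the defaults of the alpine lighttpd package):

    FROM alpine
    RUN apk add --no-cache lighttpd
    COPY index.html /var/www/localhost/htdocs/
    CMD ["lighttpd", "-D", "-f", "/etc/lighttpd/lighttpd.conf"]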

This defines a new image based on the existing image “alpine”, a compact Linux OS image. A web server, lighttpd, is installed; then our website content (index.html) is copied to the web server’s content folder, and finally the web server is started.

Next create a simple index.html as content for the website:
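
For example:

    <html>
      <body>
        <h1>Hello World!</h1>
      </body>
    </html>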

Now build an image from the Dockerfile and call it “website”.
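
That is:

    docker build -t website .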

The “docker images” command will show us the newly built image.

We can now run a container based on our new image as follows:
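
For example:

    docker run -d --name website -p 8088:80 website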

This will bring up the website container based on our “website” image and publishing the web content on port 8088 (the -p parameter maps the host port 8088 to the standard web-server port 80 within the container).

The “docker ps” command will show us the running container.

Point a browser at http://docker-test:8088 and you’ll see our simple “Hello World!” web-page, served by our new container.

Basic Docker commands:
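
A few everyday commands (not an exhaustive list):

    docker ps -a                      # list containers, including stopped ones
    docker images                     # list locally available images
    docker logs <container>           # show a container's output
    docker exec -it <container> sh    # open a shell inside a running container
    docker stop <container>           # stop a running container
    docker rm <container>             # remove a stopped container
    docker rmi <image>                # remove a local image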

Persistent data (stateful containers):

When a container is rebuilt from an image, it loses any changes which were made to it since it was last built. In order to preserve state (for instance, a database container will usually need to preserve its database contents, even if the container is rebuilt), this state must be maintained by the host and provided to the container by means of “volumes”.

To illustrate this, we’ll create a database container using the postgresql database server. First we create a data directory on the docker host, which will maintain the persistent state of the database.

Then we create a database container based on a standard “postgres” image from the main Docker registry. We pass the -v parameter instructing docker to map the host directory (~/data)  to the container path /var/lib/postgresql/data (where postgresql stores its database contents).
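
For example (the password is a placeholder, matching the credentials used below):

    mkdir ~/data

    docker run -d --name database \
      -e POSTGRES_PASSWORD=postgres \
      -p 5432:5432 \
      -v ~/data:/var/lib/postgresql/data \
      postgres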

If you now look in the host directory ~/data, you will see that postgresql has created a set of database files there. Note: you’ll need to use sudo to list the files because postgresql has modified the file permissions.

Now connect to the new database server (docker-test:5432, user: postgres, password: postgres) with a postgres client (e.g. pgadmin) and create a database called “test”.

To demonstrate that the host is maintaining state for the container, we’ll now recreate the container and image from scratch.
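
For example:

    docker stop database
    docker rm database
    docker rmi postgres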

Then we’ll run the container again (which will again fetch the image from the main Docker registry because we removed it from the local Docker registry with the “docker rmi” command).

Reconnect with postgres client – our new “test” database is still there even though we rebuilt the container (because we specified a persistent volume).

docker-compose – deploying a whole stack

The philosophy of a container is that it’s supposed to do just one thing well – this reduces complexity and increases reusability. So you shouldn’t use a single container to deploy several components. For instance, if you have a web application stack which consists of a database, a REST server, a client web application and a proxy server, then this stack should be deployed as four containers.

For this workshop, we’ll deploy a database and a REST server as a stack, using docker-compose to deploy the stack in a single operation.

We first need to install docker-compose (it’s an add-on tool for Docker).
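
One simple option (the Docker documentation describes several):

    sudo pip install docker-compose
    docker-compose --version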

Now we’ll create a small REST server in python and deploy it in a container.

Enter the following python code into test.py:
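
The original code isn’t shown here, but the behaviour described (a /hello endpoint on port 5000) suggests a small Flask app along these lines:

    from flask import Flask

    app = Flask(__name__)

    @app.route('/hello')
    def hello():
        return 'Hello from the rest server!'

    if __name__ == '__main__':
        # listen on all interfaces so the container port mapping works
        app.run(host='0.0.0.0', port=5000)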

Our sample REST server will respond to a GET request to /hello with “Hello from the rest server!”

Next we’ll create a Dockerfile for the REST server image.

And enter and save the following content into Dockerfile:
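
A minimal sketch (the base image and paths are our choices):

    FROM python:3
    RUN pip install flask
    COPY test.py /test.py
    CMD ["python", "/test.py"]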

Now we can build and run the container.
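
For example:

    docker build -t rest .
    docker run -d --name rest -p 5000:5000 rest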

Point a browser at http://docker-test:5000/hello and the REST server should return “Hello from the rest server!”

So now we have a REST server running as a container. The next step is to hook up the rest server to the database, so instead of always returning a fixed text string, it can do a more real-world task of returning the result of a query against the database.

Using the postgres client, connect to the database and create a database “test”, with a table “test” and one int column “test”. Insert a few values which our REST server will sum.
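
For example (any two values summing to 20 will do, to match the result quoted below):

    -- in the "test" database (created earlier via pgadmin):
    CREATE TABLE test (test int);
    INSERT INTO test VALUES (15), (5);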

Now we’ll update the code of our REST server to sum the values from the test table.
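
Continuing the Flask sketch from above, now using psycopg2 (the connection parameters match the database container we started earlier):

    from flask import Flask
    import psycopg2

    app = Flask(__name__)

    @app.route('/hello')
    def hello():
        # 'docker-test' is still hard-coded here; it becomes 'database' in the compose step below
        conn = psycopg2.connect(host='docker-test', dbname='test',
                                user='postgres', password='postgres')
        cur = conn.cursor()
        cur.execute('SELECT SUM(test) FROM test')
        total = cur.fetchone()[0]
        conn.close()
        return str(total)

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)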

We’ll need to update the REST server image to install the psycopg2 database library:
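
A sketch of the updated Dockerfile:

    FROM python:3
    RUN pip install flask psycopg2-binary
    COPY test.py /test.py
    CMD ["python", "/test.py"]

(Rebuild the image with docker build -t rest . and restart the container as before.)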

Point a browser at http://docker-test:5000/hello and the REST server should now return “20” (the sum of the two values in the database).

So we now have two containers, one of which uses the other.

However, we are still starting both containers separately and in a specific order. We also currently have ports 5000 (REST server http) and 5432 (postgres) open, and we have a hard-coded reference to “docker-test” in the REST server code. We could of course pass the database server address in as a command-line argument or an environment variable, but docker-compose provides a better way: it links the containers, so that the database is started first, a private network is created between the two containers, and the address of the database server on that network is made available to the REST container, which can then communicate privately with its database container.

Let’s update our REST server code to reference the database as “database” instead of the host name “docker-test”. We’ll then use docker-compose to ensure that the host name “database” points to the database container.

Now we’ll create a docker-compose.yml file to link the REST server to the database server, by means of the host-name “database” (defined by the name of the service in the docker-compose.yml file). We no longer need to expose the postgresql port (5432), since docker-compose will provide a private network between the containers, allowing the REST server to access port 5432 inside the database container without it being exposed to the host.
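
A sketch of what such a docker-compose.yml might look like (version 2 format; “rest” is the image we built above):

    version: "2"
    services:
      database:
        image: postgres
        environment:
          - POSTGRES_PASSWORD=postgres
        volumes:
          - ~/data:/var/lib/postgresql/data
      rest:
        image: rest
        ports:
          - "5000:5000"
        depends_on:
          - database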

The depends_on instruction ensures that the REST server container is started only after the database server container has been started.

Remove the current “rest” and “database” containers, rebuild the REST server image (with the updated test.py) and bring them back up as a stack with docker-compose:
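
For example:

    docker rm -f rest database
    docker build -t rest .
    docker-compose up -d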

Check it again by browsing to http://docker-test:5000/hello (the REST server should still return “20”).

If you run the “docker ps” command, you’ll see that the containers are no longer called “rest” and “database”, but that docker-compose has constructed names based on the service names in the docker-compose.yml file.

You’ll also notice that the database container is no longer publishing port 5432 – it’s only available on the private network which docker-compose creates between the containers. This means that only port 5000 – the port published by the REST server – is now exposed to the outside world.

Note that we’re doing everything (building the image and deploying the container) on a single machine – in the real world, the docker images are built during the development cycle on a developer workstation (or a continuous integration server like jenkins) and pushed to a remote registry (like docker hub or a private registry). During deployment, the images are pulled from the remote registry and the containers started on the production server.

Docker swarm

Docker swarm is a very interesting extension to Docker which allows you to deploy your software stacks across a cluster of worker machines. It’s easy to set up – one Docker host creates the swarm and becomes the swarm manager, and other hosts join the swarm. Using an extension to the docker-compose.yml file, software stacks can be deployed across the cluster with replicas for fault tolerance and load balancing. We didn’t have time to cover Docker swarm in this workshop, but we’ll cover it soon.

Contact us!

We’ve already gathered a lot of experience using Docker to help our customers efficiently deploy their software stacks. If you’re interested in having us help your organisation get up and running with containers, get in touch with us at info@armstrongconsulting.com.