All posts by roger

Let’s Encrypt in 15 minutes

I was looking for a simple way to use Let’s Encrypt to enable https for a web site, and I found a Docker image, nmarus/docker-haproxy-certbot, which met my needs.

Remember, Let’s Encrypt represents a complete break from traditional certificate issuers in that:
(a) it’s free.
(b) certificate creation, installation and renewal are fully automated.

These are huge advantages over traditional certificate issuers, and anyone who deploys anything to the internet should take advantage of them immediately. Let’s Encrypt’s audacious goal is to improve the whole internet by getting everyone to use https.

Let’s Encrypt provides a client called certbot which handles the whole lifecycle of the certificates for you. There’s plenty of Let’s Encrypt documentation on how to use certbot with popular web servers (like Apache) or proxy servers (like HAProxy). However, what we are doing below is packaging an HAProxy instance with certbot installed as a Docker container, so you can simply put it in front of one or more web properties you want certificates for. That way, you don’t need to touch your existing server or proxy configuration to use Let’s Encrypt certificates.

In our case, our web site was already a Docker container, so I just had to modify the docker-compose file from:

version: "2"
services:
 web-site:
  image: web-site
  restart: always
  ports:
   - 80:80

to:

version: "2"
services:
 web-site:
  image: web-site
  restart: always

 haproxy-certbot:
  image: nmarus/haproxy-certbot
  container_name: haproxy-certbot
  restart: always
  ports:
   - 80:80
   - 443:443
  links:
   - web-site
  cap_add:
   - NET_ADMIN
  volumes:
   - ~/data/config:/config
   - ~/data/letsencrypt:/etc/letsencrypt
   - ~/data/certs:/usr/local/etc/haproxy/certs.d

Here, haproxy-certbot is our certificate-issuing, SSL-terminating transparent proxy, which takes care of all certificate-related activities and then passes http requests on to our original service.

I first needed to create the three directories “~/data/config”, “~/data/letsencrypt” and “~/data/certs” on my docker host (which the haproxy-certbot container needs for persistent storage of its proxy configuration file and the certificates).
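That is:

mkdir -p ~/data/config ~/data/letsencrypt ~/data/certs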

I then took the example haproxy.cfg file provided (see https://hub.docker.com/r/nmarus/haproxy-certbot), copied it to the ~/data/config directory and changed the backend “my_http_backend” to:

backend my_http_backend
  mode http
  balance leastconn
  option tcp-check
  option log-health-checks
  server web-site web-site:80 check port 80

This means that the proxy server now forwards requests to port 80 (http) on the address “web-site”, which is the address of the web-site container, provided to the proxy container via the docker links instruction.

I brought both containers up with “docker-compose up -d”, and checked that my web-site was still available over http.

At this point, our new proxy is passing http requests through to the backend. It is also ready to handle the challenge requests which Let’s Encrypt will use to verify that you control the domain before issuing the requested certificate.

I then asked Let’s Encrypt to create a certificate with the command:

docker exec haproxy-certbot certbot-certonly --domain <hostname> --email <your email address>

This caused Let’s Encrypt to verify that I really controlled the domain, by making an http request to the hostname I provided and checking that it reached the container. Let’s Encrypt then issued the certificate, which was stored in the ~/data/certs directory I provided to the container.

I then refreshed the proxy with:

docker exec haproxy-certbot haproxy-refresh

And I was then immediately able to visit the website with https.

Let’s Encrypt certificates are short-lived (90 days), but the haproxy-certbot container automatically renews them for you before they expire.
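If you want to check that renewal works ahead of the expiry date, a renewal check can be triggered by hand the same way as the other commands above (a sketch, assuming the image’s certbot-renew helper, which its Docker Hub page lists alongside certbot-certonly and haproxy-refresh):

# manually trigger a renewal check (certificates are only
# replaced when they are close to expiry), then reload haproxy
docker exec haproxy-certbot certbot-renew
docker exec haproxy-certbot haproxy-refresh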

On another server I had several different microservices for which I wanted https access, so I configured a second HAProxy instance as the backend rather than a web site. That way, I had one proxy instance handling SSL termination and certificate administration, and another routing requests to the various microservices based on HAProxy host header rules.
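In haproxy.cfg terms, the host-header routing in that second instance looks something like this (a sketch – the hostnames and backend names are invented for illustration):

frontend http-in
  bind *:80
  acl host_app1 hdr(host) -i app1.example.com
  acl host_app2 hdr(host) -i app2.example.com
  use_backend app1_backend if host_app1
  use_backend app2_backend if host_app2

backend app1_backend
  mode http
  server app1 app1:80 check port 80

backend app2_backend
  mode http
  server app2 app2:80 check port 80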

The great thing about this approach is that you don’t have to mess around with your existing http services or proxies, instead you simply put this container in front of them.

Backing up ESXi VMs with ghettoVCB

We’ve been using the free ESXi ghettoVCB backup utility for the last 5 years to back up about 150 VMs daily without a glitch. ghettoVCB snapshots the VM, copies away the files (with a configurable retention period) and then removes the snapshot. The resulting backup is a complete copy of the VM, which means that when you need it, you can run the backup copy directly with ESXi, without having to restore it first. ghettoVCB is fast and reliable (it copies sparse disks correctly to an NFS backup share, so the resulting backup is as compact as the original VM disks).
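Running it is a one-liner on the ESXi host – a sketch, assuming a VM list file and a global config file as described in the ghettoVCB documentation (both file names here are examples):

# back up the VMs named in vms_to_backup, using the settings
# (NFS backup share, retention etc.) from ghettoVCB.conf
./ghettoVCB.sh -f vms_to_backup -g ghettoVCB.conf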

ghettoVCB has no deduplication capabilities, so it’s usually not appropriate for offsite backup of VMs. We use borg to archive the ghettoVCB backups offsite. This has the advantage that you get indefinite retention offsite (since borg does very efficient deduplication). You can also mount any borg backup (using its FUSE mounting), so you can run any backed-up VM straight out of the archive.
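Mounting an archive looks something like this (the repository path, archive name and mount point are examples):

# mount a borg archive via FUSE, run the VM image straight
# from the mount point, then unmount when finished
$ borg mount /backups/borg_repo::vm-backup-2017-10-01 /mnt/borg
$ borg umount /mnt/borg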

ghettoVCB is available at https://github.com/lamw/ghettoVCB

Highly recommended!

Borg backup

Borg backup (https://github.com/borgbackup) is an open source backup tool which, in addition to the usual backup features like strong client-side encryption and compression, has several important characteristics which make it particularly suitable for handling large offsite backups (like virtual machine backups):

  1. Deduplication: this ensures that even if the source files move or change names, they will not be re-backed up unnecessarily.
  2. The backup can be moved. Borg backups are just directories – this means that you can make the first, full backup locally, copy it to the destination via a USB disk and then continue incremental backups over the network.

We are currently using borg for several offsite backups, including a weekly offsite backup of VMware ghettoVCB local backups (VM image clones), and the resulting backup traffic is less than 10% of the file size of the local backups (this obviously depends on how much change is occurring between backups, but 90% deduplication is the average over the 15 VMs we back up offsite).

The steps to do an initial backup of the directory (data_dir) locally, move it to a remote server (server1) and resume incremental backups to that server (via ssh as user1@server1) are as follows:

# initialise a local borg backup repository
# in directory borg_repo_dir
$ mkdir borg_repo_dir
$ borg init borg_repo_dir

# make an initial backup
# to the local repository
$ borg create --stats --progress borg_repo_dir::backup1 data_dir

# copy borg_repo_dir to server1 (e.g. via a USB disk),
# renaming it new_borg_repo_dir, then make further (incremental)
# backups via ssh – note that archive names within a repository
# must be unique, hence backup2
$ borg create --stats --progress user1@server1:new_borg_repo_dir::backup2 data_dir

Notes:
(1) borg should be installed on both the client machine and the server machine.
(2) When you move the backup (repository), the name of the destination directory does not need to match the original name.

Thanks to Gabriel for recommending borg!

Docker workshop

Overview:

We recently carried out a short introductory Docker workshop, starting from scratch: installing Docker and taking it through to the point where a software stack, consisting of several linked containers, is deployed using docker-compose. Here’s what we covered.

Docker concepts:

Docker containers are easy-to-deploy units of software, analogous to the shipping containers used by the transport industry, which simplify the job of shipping diverse goods around the world.

Docker images are the templates for the containers. Every Docker container is started from an image. Images are defined by a Dockerfile which contains instructions for building the image, based on an existing image (for instance, a web-server image will be based on an OS image, simply adding a layer of web-server software to it).

A Docker registry is where images are stored. Every machine where Docker is installed has a local registry. Additionally, Docker provides a central registry (from which images are fetched if they aren’t available locally). And finally, you can host your own private registries.

Starting point:

A freshly installed Ubuntu 16.04 server, called docker-test.

Installing Docker:

We won’t use the standard Docker package available from the Ubuntu repositories because Docker is changing fast – instead we’ll add the Docker apt-repository and install the latest version from there.

$ ssh administrator@docker-test

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

$ sudo apt-get update

$ sudo apt-get install -y docker-ce

$ sudo usermod -aG docker ${USER}

$ su - ${USER}

The last two commands ensure that we can run Docker commands without sudo.

Check that Docker is correctly installed with:

$ docker run hello-world

If everything is OK, Docker will download the hello-world image from the central Docker registry and run it.

Building a simple website container:

The purpose of Docker is to allow you to build and deploy something like a website without having to worry about details like which OS is used, which web server, etc. What we want is an image called, say, “website”, which takes some files we give it and publishes them via a web server.

First create a directory “website” where we will work on creating our container.

$ mkdir website
$ cd website

Create a text file called “Dockerfile” containing the following:

FROM alpine

RUN apk update \
 && apk add lighttpd \
 && rm -rf /var/cache/apk/*

COPY ./index.html /var/www/localhost/htdocs

CMD ["lighttpd", "-D", "-f", "/etc/lighttpd/lighttpd.conf"]

This defines a new image, based on the existing image “alpine”, a compact Linux OS image. A web server, lighttpd, is installed, our website content (index.html) is copied to the web server’s content folder, and finally the web server is started.

Next create a simple index.html as content for the website:

<html>
<body>
Hello World!
</body>
</html>

Now build an image from the Dockerfile and call it “website”.

$ docker build . -t website

The “docker images” command will show us the newly built image.

$ docker images
REPOSITORY    TAG       IMAGE ID       CREATED          SIZE
website       latest    23853e62e631   34 seconds ago   11.4MB
hello-world   latest    725dcfab7d63   3 days ago       1.84kB
alpine        latest    053cde6e8953   3 days ago       3.97MB

We can now run a container based on our new image as follows:

$ docker run --name website -d -p 8088:80 website

This will bring up the website container based on our “website” image and publish the web content on port 8088 (the -p parameter maps the host port 8088 to the standard web-server port 80 within the container).

The “docker ps” command will show us the running container:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                  NAMES
9f5612aa7b81        website           "lighttpd -D -f /e..."   29 seconds ago      Up 29 seconds       0.0.0.0:8088->80/tcp   website

Point a browser at http://docker-test:8088 and you’ll see our simple “Hello World!” web-page, served by our new container.

Basic Docker commands:

# list running containers
$ docker ps

# list all containers (including stopped ones)
$ docker ps -a
 
# list images (in the local docker registry)
$ docker images

# create a container from an image
$ docker run -d --name <container name> <image> 

# view the logs (stdout) of a container
$ docker logs <container name>

# get shell access inside a container
$ docker exec -it <container name> /bin/sh

# start a container
$ docker start <container name>

# stop a container
$ docker stop <container name>

Persistent data (stateful containers):

When a container is recreated from an image, it loses any changes made to it since it was created. In order to preserve state (for instance, a database container will usually need to preserve its database contents, even if the container is recreated), this state must be maintained by the host and provided to the container by means of “volumes”.

To illustrate this, we’ll create a database container using the postgresql database server. First we create a data directory on the docker host, which will maintain the persistent state of the database.

$ cd
$ mkdir data

Then we create a database container based on a standard “postgres” image from the main Docker registry. We pass the -v parameter, instructing Docker to map the host directory (~/data) to the container path /var/lib/postgresql/data (where postgresql stores its database contents).

$ docker run -d --name database -p 5432:5432 -v ~/data:/var/lib/postgresql/data postgres

If you now look in the host directory ~/data, you will see that postgresql has created a set of database files there. Note: you’ll need to use sudo to list the files because postgresql has modified the file permissions.

$ sudo ls data

Now connect to the new database server (docker-test:5432, user: postgres, password: postgres) with a postgres client (e.g. pgadmin) and create a database called “test”.
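If you don’t have a GUI client handy, the same step works from any machine with the psql command-line client installed:

$ psql -h docker-test -U postgres -c 'create database test;'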

To demonstrate that the host is maintaining state for the container, we’ll now recreate the container and image from scratch.

$ docker rm -f database

$ docker rmi postgres

Then we’ll run the container again (which will again fetch the image from the main Docker registry because we removed it from the local Docker registry with the “docker rmi” command).

$ docker run -d --name database -p 5432:5432 -v ~/data:/var/lib/postgresql/data postgres

Reconnect with the postgres client – our new “test” database is still there even though we recreated the container (because we specified a persistent volume).

docker-compose – deploying a whole stack

The philosophy of a container is that it’s supposed to do just one thing well – this reduces complexity and increases reusability. So you shouldn’t use a single container to deploy several components. For instance, if you have a web application stack which consists of a database, a REST server, a client web application and a proxy server, then this stack should be deployed as four containers.

For this workshop, we’ll deploy a database and a REST server as a stack, using docker-compose to deploy the stack in a single operation.

We first need to install docker-compose (it’s an add-on tool for Docker).

$ sudo apt-get install docker-compose

Now we’ll create a small REST server in python and deploy it in a container.

$ cd
$ mkdir rest
$ cd rest
$ nano test.py

Enter the following python code into test.py:

from flask import Flask
app = Flask(__name__)

@app.route("/hello")
def hello():
    return "Hello from the rest server!"

app.run(debug=True,host='0.0.0.0')

Our sample REST server will respond to a GET request to /hello with “Hello from the rest server!”

Next we’ll create a Dockerfile for the REST server image

$ nano Dockerfile

And enter and save the following content into Dockerfile:

FROM python

RUN pip install flask

COPY test.py .

CMD python test.py

Now we can build and run the container.

# build the image (and call it "rest")
$ docker build -t rest .

# run the container from the image (call the container "rest" as well) 
$ docker run -d --name rest -p 5000:5000 rest

Point a browser at http://docker-test:5000/hello and the REST server should return “Hello from the rest server!”
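Or check it from the command line with curl:

$ curl http://docker-test:5000/hello
Hello from the rest server!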

So now we have a REST server running as a container. The next step is to hook the REST server up to the database so that, instead of always returning a fixed text string, it can perform the more realistic task of returning the result of a query against the database.

Using the postgres client, connect to the “test” database we created earlier and create a table “test” with a single int column “test”. Then insert a few values which our REST server will sum:

create table test (test int);
insert into test values(5);
insert into test values(15);
select sum(test) from test;

Now we’ll update the code of our REST server to sum the values from the test table.

$ nano test.py
import psycopg2
from flask import Flask

app = Flask(__name__)

@app.route("/hello")
def hello():
   conn = psycopg2.connect("host='docker-test' dbname='test' user='postgres' password='postgres'")
   cursor = conn.cursor()
   cursor.execute("SELECT sum(test) FROM test")
   sum = cursor.fetchone()[0]
   conn.close()
   return str(sum)

app.run(debug=True,host="0.0.0.0")

We’ll need to update the REST server image to install the psycopg2 database library:

$ nano Dockerfile
FROM python

RUN pip install flask psycopg2

COPY test.py .

CMD python test.py

# remove the container based on the current image, then rebuild the image
$ docker rm -f rest
$ docker build . -t rest
$ docker run -d --name rest -p 5000:5000 rest

Point a browser at http://docker-test:5000/hello and the REST server should now return “20” (the sum of the two values in the database).

So we now have two containers, one of which uses the other.

However, we are still starting both containers separately and in a specific order. We also currently have ports 5000 (REST server http) and 5432 (postgres) open, and we have a hard-coded reference to “docker-test” in the REST server code. We could of course pass the database server’s address in as a command-line argument or an environment variable, but docker-compose provides a better way: it links the containers, so that the database is started first, a private network is created between the two containers, and the address of the database server on that network is made available to the REST container, which can then communicate privately with its database container.

Let’s update our REST server code to reference the database as “database” instead of the host name “docker-test”. We’ll then use docker-compose to ensure that the host name “database” points to the database container.

$ nano test.py
import psycopg2
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
   conn = psycopg2.connect("host='database' dbname='test' user='postgres' password='postgres'")
   cursor = conn.cursor()
   cursor.execute("SELECT sum(test) FROM test")
   sum = cursor.fetchone()[0]
   conn.close()
   return str(sum)

app.run(debug=True,host="0.0.0.0")

Now we’ll create a docker-compose.yml file to link the REST server to the database server, by means of the host-name “database” (defined by the name of the service in the docker-compose.yml file). We no longer need to expose the postgresql port (5432), since docker-compose will provide a private network between the containers, allowing the REST server to access port 5432 inside the database container without it being exposed to the host.

The depends_on instruction ensures that the REST server container is started only after the database server container has been started.

$ nano docker-compose.yml
version: '2'

services:
 database:
  image: postgres
  restart: always
  volumes:
   - ~/data:/var/lib/postgresql/data

 rest:
  image: rest
  restart: always
  ports:
   - 5000:5000
  links:
   - database
  depends_on:
   - database

Remove the current “rest” and “database” containers, rebuild the rest server image (with the updated test.py) and bring them back up as a stack with docker-compose

$ docker rm -f rest
$ docker rm -f database
$ docker build . -t rest
$ docker-compose up -d

Check it again by browsing to http://docker-test:5000/hello (the REST server should still return “20”).

If you run the “docker ps” command, you’ll see that the containers are no longer called “rest” and “database”, but that docker-compose has constructed names based on the service names in the docker-compose.yml file.
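With the version 2 compose file format, the container names are derived from the project name (by default the directory name, here “rest”) and the service name, so you’ll typically see something like:

$ docker ps --format "{{.Names}}"
rest_rest_1
rest_database_1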

You’ll also notice that the database container is no longer publishing port 5432 – it’s only available on the private network which docker-compose creates between the containers. This means that only port 5000 – the port published by the REST server – is now exposed to the outside world.

Note that we’re doing everything (building the image and deploying the container) on a single machine – in the real world, the Docker images are built during the development cycle on a developer workstation (or a continuous-integration server like Jenkins) and pushed to a remote registry (like Docker Hub or a private registry). During deployment, the images are pulled from the remote registry and the containers are started on the production server.
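That workflow is just a tag/push/pull cycle. As a sketch (registry.example.com stands in for your registry):

# on the build machine: tag the image for the remote registry and push it
$ docker tag rest registry.example.com/rest:1.0
$ docker push registry.example.com/rest:1.0

# on the production server: pull the image and run it
$ docker pull registry.example.com/rest:1.0
$ docker run -d --name rest -p 5000:5000 registry.example.com/rest:1.0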

Docker swarm

Docker swarm is a very interesting extension to Docker which allows you to deploy your software stacks across a cluster of worker machines. It’s easy to set up – one Docker host creates the swarm and becomes the swarm manager, and other hosts join the swarm. Using an extension to the docker-compose.yml file, software stacks can be deployed across the cluster with replicas for fault tolerance and load balancing. We didn’t have time to cover Docker swarm in this workshop, but we’ll cover it soon.
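For the curious, the basic setup is only a few commands (a sketch – the join token and manager address are printed by swarm init, and stack deploy expects the version 3 compose file format):

# on the manager node: create the swarm
$ docker swarm init

# on each worker node: join the swarm
$ docker swarm join --token <token> <manager-ip>:2377

# deploy a stack across the swarm
$ docker stack deploy -c docker-compose.yml mystack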

Contact us!

We’ve already gathered a lot of experience using Docker to help our customers efficiently deploy their software stacks. If you’re interested in having us help your organisation get up and running with containers, get in touch with us at info@armstrongconsulting.com.

Clustered cron jobs

We’ve recently been running REST APIs on active-active server pairs (Docker containers running on pairs of VMs on separate hosts), with postgres-BDR (multi-master bidirectional replication) for fault-tolerant storage and a pair of fault-tolerant HAProxy instances for incoming request routing. This is a robust setup which provides zero downtime during rolling updates, hardware maintenance or failures.

However, clustering scheduled jobs (i.e. ensuring that each scheduled job executes exactly once across the cluster) becomes a problem in this configuration. Multi-master replicated databases avoid a single point of failure, but they are not suitable for use with database-based clustered schedulers like Quartz, so we needed to consider other options. There are complex clustered job schedulers available, but we wanted to keep it simple and use Linux crond for scheduling. We finally settled on using keepalived to maintain a single master across the cluster and scheduling the cron jobs identically on all servers, using a small if_master.sh script to ensure that a crontab entry only runs on the server which is currently the keepalived master. The crontab entries then look like:

0 * * * * ./if_master.sh "./command.sh"

if_master.sh checks whether the server is currently the keepalived master (by looking for the floating IP address):

#!/bin/bash
# run the given command only if this server currently
# holds the keepalived floating IP (i.e. is the master)
FLOATING_IP_ADDRESS=x.x.x.x
if ip addr show | grep -q "$FLOATING_IP_ADDRESS"
then
	eval "$1"
fi

It executes the command only if the server is currently the master (the same crontabs run on the other servers, but do nothing there, since those servers are not the keepalived master).

This was the simplest solution we could find which allows us to keep using cron and configure crontabs on all servers identically.
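For reference, a minimal keepalived configuration for the floating IP might look like the sketch below (the interface name, router id and priority are examples; the floating IP is the same x.x.x.x checked by if_master.sh):

# /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
    state BACKUP              # let the priority-based election pick the master
    interface eth0            # interface carrying the floating IP
    virtual_router_id 51
    priority 100              # give each server a different priority
    virtual_ipaddress {
        x.x.x.x               # the floating IP checked by if_master.sh
    }
}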


Finally, perfect wifi coverage at home

We live in an old house, on three levels. It’s always been a challenge to achieve consistent wifi coverage throughout the house. We neglected to install ethernet cabling when we renovated and have been struggling with wifi issues ever since. We tried power-line networking (Devolo, TP-Link) and, although it worked most of the time, it provided very inconsistent performance and it was impossible to figure out why. We then reverted to a central wireless router and range extenders (Apple, TP-Link). Coverage was pretty bad in many parts of the house. Last weekend, we installed the AmpliFi HD mesh networking system from Ubiquiti and we finally have the full performance of our internet provider (40-60Mbps LTE, depending on the time of day) from any and all devices, anywhere in the house.


Replacing VMWare with KVM

I’ve been using an i5 Intel NUC at home as a home server.

I initially installed ESX on the NUC and ran an Ubuntu VM with iptables, DNS, DHCP etc. However, I wanted to put the firewall between the home network and the LTE router, so I needed two network interfaces. The NUC only has one, so I thought I’d use VLANs to split the network. That turned out to be pretty complicated to manage, so I ended up buying a USB3 ethernet adapter (AX88179) for the NUC instead.

Getting that to work with ESX was a pain (I tried pass-through, but couldn’t get it to work reliably), so in the end I replaced ESX on the NUC with KVM. Worked great – the USB ethernet adapter installed out of the box with Ubuntu and the whole configuration only took an hour to set up.

As with ESX, I still need an extra VM on my MacBook (Ubuntu desktop under Fusion) to run virt-manager to manage KVM (instead of the Windows VM where I used to run the vmware client to manage the ESX installation).

I used to back up the VMs with ghettoVCB to a Synology via nfs (which worked completely reliably). Now I’ll use LVM snapshots and file copy to back up the KVM VMs to the same Synology.
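A minimal sketch of that snapshot-and-copy backup (the volume group, LV names, snapshot size and mount point are examples):

# snapshot the VM's logical volume while the VM keeps running
sudo lvcreate --size 5G --snapshot --name vm1-snap /dev/vg0/vm1

# copy the frozen snapshot to the NFS-mounted Synology share
sudo dd if=/dev/vg0/vm1-snap of=/mnt/synology/vm1.img bs=4M

# drop the snapshot once the copy is done
sudo lvremove -y /dev/vg0/vm1-snap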

Powerline networking with mixed Devolo and TP-LINK

We’ve been using Devolo powerline networking at home for years – we have an old house with wifi-proof walls. We’d been having plenty of problems with it, possibly due to overheating of the wifi plugs. After replacing the dLAN 500 adapters with the more expensive Devolo dLAN Pro adapters without any improvement, I finally decided to try TP-LINK AV500s instead.

The TP-LINKs are much cheaper (base plug plus two WLAN repeaters for < EUR 100). They are also compatible with the Devolo dLAN 500s, so I’m still using a few of the Devolos (for a printer and in the cellar). Unfortunately, they don’t perform better than the Devolos – I still get about 30Mbit throughput (measured with iperf between two devices connected by ethernet to the TP-LINKs). Should’ve installed CAT5 when wiring the house :(.

Addendum: the Devolos did not play so well with the TP-LINK network, so I ended up replacing them with some additional TP-LINKs. For the time being, the network seems to run reliably.