Backing up ESXi VMs with ghettoVCB

We’ve been using the free ESXi backup utility ghettoVCB for the last 5 years to back up about 150 VMs daily without a glitch. ghettoVCB snapshots the VM, copies the files away (with a configurable retention period) and then removes the snapshot. The resulting backup is a complete copy of the VM as of the snapshot, which means that when you need it, you can run the backup copy directly with ESXi without having to restore it first. ghettoVCB is fast and reliable (it copies sparse disks correctly to an NFS backup share, so the resulting backup is as compact as the original VM disks).
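As a rough illustration of the retention and disk-format settings mentioned above, a ghettoVCB configuration file might look like the following. The variable names are the ones we recall from the ghettoVCB README, so check them against the version you install; the datastore path and rotation count are placeholder values, not the ones we use.

```shell
# ghettoVCB.conf (sketch; all values are illustrative assumptions)
VM_BACKUP_VOLUME=/vmfs/volumes/nfs-backup/vm-backups   # hypothetical NFS datastore path
DISK_BACKUP_FORMAT=thin          # keep sparse disks compact on the backup share
VM_BACKUP_ROTATION_COUNT=3       # number of backups retained per VM
POWER_VM_DOWN_BEFORE_BACKUP=0    # back up running VMs via a snapshot

# Typical invocation on the ESXi host, where vms_to_backup lists one VM name per line:
#   ./ghettoVCB.sh -f vms_to_backup -g ghettoVCB.conf
```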

ghettoVCB has no deduplication capabilities, so it’s usually not appropriate for offsite backup of VMs. We use borg to archive the ghettoVCB backups offsite. This has the advantage that you get indefinite retention offsite (since borg deduplicates very efficiently). You can also mount any borg archive (using its FUSE support), so you can run any backed-up VM straight out of the archive.
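Mounting an archive could be wrapped up roughly as follows. This is a minimal sketch: the repository location, archive name and mount point are hypothetical, and borg's FUSE support must be installed on the machine doing the mounting.

```shell
#!/bin/bash
# Hypothetical repository location; adjust to your environment.
REPO="user1@server1:new_borg_repo_dir"

# mount_archive ARCHIVE MOUNT_POINT
# Mounts one archive via FUSE so the VM files inside it can be browsed or copied.
mount_archive() {
    local archive="$1" mount_point="$2"
    borg mount "${REPO}::${archive}" "$mount_point"
}

# unmount_archive MOUNT_POINT
# Releases the FUSE mount again.
unmount_archive() {
    borg umount "$1"
}

# Usage:
#   mount_archive backup1 /mnt/borg
#   ls /mnt/borg
#   unmount_archive /mnt/borg
```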

ghettoVCB is available at https://github.com/lamw/ghettoVCB

Highly recommended!

Borg backup

Borg backup (https://github.com/borgbackup) is an open source backup tool which, in addition to the usual backup features like strong client-side encryption and compression, has several important characteristics which make it particularly suitable for handling large offsite backups (like virtual machine backups):

  1. Deduplication: even if the source files move or change names, they will not be backed up again unnecessarily.
  2. The backup can be moved. Borg repositories are just directories – this means that you can make the first, full backup locally, copy it to the destination via a USB disk and then continue incremental backups over the network.

We are currently using borg for several offsite backups, including a weekly offsite backup of local VMware ghettoVCB backups (VM image clones). The resulting backup traffic is less than 10% of the size of the local backups. This obviously depends on how much changes between backups, but 90% deduplication is the average over the 15 VMs we back up offsite.

The steps to make an initial backup of a directory (data_dir) locally, move it to a remote server (server1) and resume incremental backups to that server (via ssh as user1@server1) are as follows:

# initialise a local borg backup repository
# in directory borg_repo_dir
# (borg 1.1 and later require an explicit --encryption mode;
# repokey is one possible choice)
$ mkdir borg_repo_dir
$ borg init --encryption=repokey borg_repo_dir

# make an initial backup
# to the local repository
$ borg create --stats --progress borg_repo_dir::backup1 data_dir

# copy borg_repo_dir to server1 via USB disk
# make further (incremental) backups via ssh
# (archive names must be unique within a repository, hence backup2)
$ borg create --stats --progress user1@server1:new_borg_repo_dir::backup2 data_dir

Notes:
(1) borg should be installed on both the client machine and the server machine.
(2) When you move the backup (repository), the name of the directory you move it to does not need to match that which it was moved from.
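Because every archive in a repository needs its own name, recurring backups are easier with a generated name. The following is a minimal sketch of a wrapper that derives the archive name from the current date; the `backup-` prefix and the function names are our own invention for illustration, and the repository path is the one from the example above.

```shell
#!/bin/bash
# Sketch of a recurring backup wrapper; REPO and the archive prefix are assumptions.
REPO="user1@server1:new_borg_repo_dir"

# Build a unique archive name such as "backup-2024-01-31" from today's date.
archive_name() {
    printf 'backup-%s' "$(date +%Y-%m-%d)"
}

# run_backup DATA_DIR
# Creates a new archive of DATA_DIR in the remote repository.
run_backup() {
    local data_dir="$1"
    borg create --stats "${REPO}::$(archive_name)" "$data_dir"
}

# Usage (for example from a weekly cron job):
#   run_backup data_dir
```

A date-based name only allows one archive per day, which is enough for a weekly schedule.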

Thanks to Gabriel for recommending borg!

Clustered cron jobs

We’ve recently been running REST APIs on active-active server pairs (docker containers running on pairs of VMs on separate hosts) with Postgres-BDR (multi-master bidirectional replication) for fault-tolerant storage and a pair of fault-tolerant HAProxy instances for incoming request routing. This is a robust setup which provides zero downtime during rolling updates, hardware maintenance or hardware failure.

However, clustering scheduled jobs (i.e. ensuring that scheduled jobs execute exactly once) becomes a problem in this configuration. Multi-master replicated databases avoid a single point of failure but they are not suitable for use with database-based clustered schedulers like Quartz, so we needed to consider other options. There are complex clustered job schedulers, but we wanted to keep it simple and use Linux crond for scheduling. We finally settled on using keepalived to maintain a single master across the cluster and schedule the cron jobs identically on all servers, using a small if_master.sh script to ensure that a crontab entry only runs if the server is currently the keepalived master. The crontab entries then look like:

0 * * * * ./if_master.sh "./command.sh"

if_master.sh checks whether the server is currently the keepalived master (by looking for the floating IP address) and only executes the command if it is:

#!/bin/bash
FLOATING_IP_ADDRESS=x.x.x.x
# -q suppresses grep's output (cron would otherwise mail it); -F matches the
# address literally, and the trailing "/" stops x.x.x.1 from matching x.x.x.10
if ip addr show | grep -qF "inet $FLOATING_IP_ADDRESS/"
then
	eval "$1"
fi

The same crontabs execute on the other servers, but do nothing if the server is not the keepalived master.

This was the simplest solution we could find which allows us to keep using cron and configure crontabs on all servers identically.
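One detail worth getting right is that the address check should match the whole floating IP address rather than a substring of a longer one. A possible way to do that, sketched here with a hypothetical helper named holds_address, is to parse the `ip addr show` output with awk and compare the address field exactly:

```shell
#!/bin/bash
# holds_address ADDRESS
# Reads `ip addr show` output on stdin and succeeds only if an IPv4 address
# exactly equal to ADDRESS is configured on some interface.
holds_address() {
    awk -v ip="$1" '
        $1 == "inet" { split($2, parts, "/"); if (parts[1] == ip) found = 1 }
        END { exit !found }
    '
}

# Usage inside if_master.sh (sketch):
#   if ip addr show | holds_address "$FLOATING_IP_ADDRESS"; then eval "$1"; fi
```

Keeping the check in a function that reads from stdin also means it can be tried out against saved `ip addr show` output without touching a live cluster.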

 

Finally, perfect wifi coverage at home

We live in an old house, on three levels. It’s always been a challenge to achieve consistent wifi coverage throughout the house. We neglected to install ethernet cabling when we renovated and have been struggling with wifi issues ever since. We tried power-line networking (Devolo, TP-Link) and, although it worked most of the time, its performance was very inconsistent and it was impossible to figure out why. We then reverted to a central wireless router and range extenders (Apple, TP-Link), but coverage was pretty bad in many parts of the house. Last weekend, we installed the AmpliFi HD mesh networking system from Ubiquiti and we finally have the full performance of our internet provider (40-60Mbps LTE, depending on the time of day) from any or all devices, anywhere in the house.

Replacing VMware with KVM

I’ve been using an i5 Intel NUC at home as a home server.

I initially installed ESX on the NUC and ran an Ubuntu VM with iptables, DNS, DHCP etc. However, I wanted to put the firewall between the home network and the LTE router, so I needed two network interfaces. The NUC only has one, so I thought I’d use VLANs to split the network. That turned out to be pretty complicated to manage, so I ended up buying a USB3 ethernet adapter (AX88179) for the NUC instead.

Getting that to work with ESX was a pain (I tried pass-through, but couldn’t get it to work reliably), so in the end I replaced ESX on the NUC with KVM. Worked great – the USB ethernet adapter installed out of the box with Ubuntu and the whole configuration only took an hour to set up.

As with ESX, I still need an extra VM on my MacBook (Ubuntu desktop under Fusion) to run virt-manager to manage KVM (instead of the Windows VM where I used to run the VMware client to manage the ESX installation).

I used to back up the VMs with ghettoVCB to a Synology via NFS (which worked completely reliably). Now I’ll use LVM snapshots and file copy to back up the KVM VMs to the same Synology.
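The snapshot-and-copy approach could look roughly like the sketch below. It assumes each VM disk is an LVM logical volume; the volume group name, snapshot size, target directory and function name are all hypothetical. Note that a snapshot of a running VM is only crash-consistent.

```shell
#!/bin/bash
# Sketch only: the volume group, snapshot size and target directory are assumptions.
VG="vg0"
SNAP_SIZE="5G"
BACKUP_DIR="/mnt/synology/kvm"

# backup_vm_disk LV_NAME
# Snapshots the logical volume, copies the snapshot to BACKUP_DIR and then
# removes the snapshot again, returning the status of the copy.
backup_vm_disk() {
    local lv="$1" snap="$1-snap"
    lvcreate --snapshot --size "$SNAP_SIZE" --name "$snap" "/dev/$VG/$lv" || return 1
    # conv=sparse (GNU dd) skips runs of zeros so the image file stays compact.
    dd if="/dev/$VG/$snap" of="$BACKUP_DIR/$lv.img" bs=4M conv=sparse
    local status=$?
    lvremove -f "/dev/$VG/$snap"
    return "$status"
}

# Usage (as root, with the Synology share mounted at BACKUP_DIR):
#   backup_vm_disk vm1
```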

Powerline networking with mixed Devolo and TP-LINK

We’ve been using Devolo powerline networking at home for years – we have an old house with wifi-proof walls. We’d been having plenty of problems with the adapters, possibly due to overheating of the wifi plugs. After replacing the dLAN 500 adapters with the more expensive Devolo dLAN Pro adapters without any improvement, I finally decided to try TP-LINK AV500s instead.

The TP-LINKs are much cheaper (base plug plus two WLAN repeaters for < EUR 100). They are also compatible with the Devolo dLAN 500s, so I’m still using a few of the Devolos (for a printer and in the cellar). Unfortunately they don’t perform better than the Devolos – I still get about 30Mbit/s throughput (measured with iperf between two devices connected by ethernet to the TP-LINKs). Should’ve installed CAT5 when wiring the house :(.

Addendum: the Devolos did not play so well with the TP-LINK network, so I ended up replacing them with some additional TP-LINKs. For the time being, the network seems to run reliably.