All posts by administrator

Postgres-BDR + Docker: shared-nothing databases made easy

Postgres-BDR + 

To achieve fault tolerance, you need redundant systems. There are two basic approaches to redundancy, active-standby or active-active.

Active-standby

Active-standby means that in the event of failure of the active node, a failover to a standby node is carried out.

Active-active

Active-active means that all nodes are continuously active. In the event of failure of a node, that node simply stops being used and the other nodes assume the full load.

The problem with active-standby

Active-standy has a huge problem in the real world – at the time when a node fails, the chances of failover occurring smoothly are hugely reduced since that the problem that caused the failure is quite likely to affect the system’s ability to failover smoothly – in other words, when things are failing, its a bad idea to start trying to switch over to standby nodes.

For this reason, active-active is the preferred approach to achieving robust fault-tolerance.

Node independence and shared-nothing architecture

Furthermore, redundant nodes should be as independent from one another as possible. For this, a shared-nothing architecture is the ideal. That way, when a node fails for any reason, there’s no reason to fear that the remaining nodes will also fail.

An example of this is the nodes should not share a server, a network or a database. Its relatively easy to distribute redundant  nodes across independent servers, slightly harder to distribute across independent networks and very hard to achieve independent databases.

Postgres-BDR

Postgres-BDR (bidirectional replication) provides shared-nothing database redundancy. Postgres-BDR is a special distribution of Postgres with extensions for replication. It would be relatively complex to install, but Docker makes it easy to use. We’ve deployed it in several recent projects and its been great.

Below I’ll provide you with step-by-step instructions to install a two node pair of Postgres-BDR databases as two Docker containers.

Setting up a Postgres-BDR node pair as Docker containers

Here’s how to setup a pair of BDR databases on a pair of docker hosts (we’ll call them host1, host2) to replicate a database called “testdb”.

We’re basing out containers on the Docker image “jgiannuzzi/postgres-bdr” which is available from the central Docker registry.

Here’s the docker-compose file entries required on each host. Each host additionally has a /data directory which will hold the database files.

version: "3"

services:
 database:
   image: jgiannuzzi/postgres-bdr
   restart: always
   ports:
     - 5432:5432
   volumes:
     - /data:/var/lib/postgresql/data
   environment:
     POSTGRES_PASSWORD: <xxxxxxx>

Once the containers are up on the two hosts, connect with pgadmin to host1  and do the following:

Create a database “testdb” and run the following queries against it to initialise replication.

NOTE: make sure you have the dbtest database selected before running the queries!

CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE EXTENSION IF NOT EXISTS bdr;

SELECT bdr.bdr_group_create(
 local_node_name := 'host1',
 node_external_dsn := 'host=host1 port=5432 dbname=signals password=<xxxxxxx>'
);

Then connect pgadmin to host2:

Create a database “dbtest” and run the following queries against it to initialise replication.

CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;

SELECT bdr.bdr_group_join(
 local_node_name := 'host2',
 node_external_dsn := 'host=host2 port=5432 dbname=dbtest password=<xxxxxxx>',
 join_using_dsn := 'host=host1 port=5432 dbname= dbtest password=<xxxxxxx>'
);

========================================

You can now test that replication is working correctly by creating a test table in the dbtest database on host1 (dbtest/schemas/public/Tables/Create) and then checking that it is correctly replicated to host2.

========================================

Notes:
(1) Replication occurs on the standard postgres port 5432, so this network port must be available between the hosts.
(2) If you need multiple replicated databases, you need to repeat this process for each database.
(3) Sequences will run in independent ranges (default is 50000 apart), to avoid collision if records are added on both sides.
(4) The most efficient application deployment strategy is to treat one database as the primary and the other as the secondary – this results in the least replication traffic and the lowest potential for conflicts.