Postgres-BDR is an excellent multi-master replication clustering extension for PostgreSQL (https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/). We often use it for fault tolerant systems – its asynchronous and uncomplicated to set up (at least if you use it as a docker container).
However, if a node of a Postgres-BDR cluster gets corrupted (by a hardware failure for instance) and Postgres cannot repair it automatically, you need to remove the node and readd it. Unfortunately, this is not well-handled by Postgres-BDR (maybe it’ll be easier in a future release, but in the version we’re using, it requires some tricky manual intervention).
I’ll document a working procedure (found here) using two docker containers based on the jgiannuzzi/postgres-bdr image.
As a first step, we’ll boot up two Postgres-BDR containers with the following docker-compose.yml file on a docker host called “docker-test”:
version: "2.2"
services:
postgres1:
image: jgiannuzzi/postgres-bdr
ports:
- 7001:5432
volumes:
- postgres1-data:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD=postgres1
restart: always
postgres2:
image: jgiannuzzi/postgres-bdr
ports:
- 7002:5432
volumes:
- postgres2-data:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD=postgres2
restart: always
volumes:
postgres1-data:
postgres2-data:
We create a database “test” on each instance and then setup replication as follows:
(note: execute the SQL statements individually).
on postgres1:
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;
SELECT bdr.bdr_group_create(
local_node_name := 'postgres1',
node_external_dsn := 'host=docker-test port=7001 dbname=test password=postgres1'
);
on postgres2:
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;
SELECT bdr.bdr_group_join(
local_node_name := 'postgres2',
node_external_dsn := 'host=docker-test port=7002 dbname=test password=postgres2',
join_using_dsn := 'host=docker-test port=7001 dbname=test password=postgres1'
);
We can check that replication is working correctly b creating some tables in the test database on one instance and then checking that they are replicated to the other instance and vice versa.
Then we destroy the postgres1 instance:
docker ps
docker rm -f -v 238336_postgres1_1_b83117a70a12
docker volume ls
docker volume rm 238336_postgres1-data
Step 1: recreate a fresh instance postgres1
docker-compose up -d
Step 2: recreate and empty database “test” on postgres1
Step 3: execute the following on postgres2
BEGIN;
SET LOCAL bdr.skip_ddl_locking = on;
SET LOCAL bdr.permit_unsafe_ddl_commands = on;
SET LOCAL bdr.skip_ddl_replication = on;
SECURITY LABEL FOR bdr ON DATABASE test IS NULL;
DELETE FROM bdr.bdr_connections;
DELETE FROM bdr.bdr_nodes;
SELECT bdr.bdr_connections_changed();
COMMIT;
DROP EXTENSION bdr;
Step 4: restart postgres2
docker-compose down && docker-compose up -d
Step 5: reestablish replication (starting with postgres2 and replicating initially to postgres1)
on postgres2
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;
SELECT bdr.bdr_group_create(
local_node_name := 'postgres2',
node_external_dsn := 'host=docker-test port=7002 dbname=test password=postgres2'
);
on postgres1:
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;
SELECT bdr.bdr_group_join(
local_node_name := 'postgres1',
node_external_dsn := 'host=docker-test port=7001 dbname=test password=postgres1',
join_using_dsn := 'host=docker-test port=7002 dbname=test password=postgres2'
);
Step 6: check that replication is working again.