Repairing a broken Postgres-BDR cluster

Postgres-BDR is an excellent multi-master replication clustering extension for PostgreSQL (https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/). We often use it for fault tolerant systems – its asynchronous and uncomplicated to set up (at least if you use it as a docker container).

However, if a node of a Postgres-BDR cluster gets corrupted (by a hardware failure for instance) and Postgres cannot repair it automatically, you need to remove the node and readd it. Unfortunately, this is not well-handled by Postgres-BDR (maybe it’ll be easier in a future release, but in the version we’re using, it requires some tricky manual intervention).

I’ll document a working procedure (found here) using two docker containers based on the jgiannuzzi/postgres-bdr image.

As a first step, we’ll boot up two Postgres-BDR containers with the following docker-compose.yml file on a docker host called “docker-test”:

version: "2.2"
services:

    postgres1:
      image: jgiannuzzi/postgres-bdr

      ports:
        - 7001:5432

      volumes:
        - postgres1-data:/var/lib/postgresql/data

      environment:
        - POSTGRES_PASSWORD=postgres1

      restart: always

    postgres2:
      image: jgiannuzzi/postgres-bdr

      ports:
        - 7002:5432

      volumes:
        - postgres2-data:/var/lib/postgresql/data

      environment:
        - POSTGRES_PASSWORD=postgres2

      restart: always

volumes:
  postgres1-data:
  postgres2-data:

We create a database “test” on each instance and then setup replication as follows:
(note: execute the SQL statements individually).

on postgres1:

CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;

SELECT bdr.bdr_group_create(
  local_node_name := 'postgres1',
  node_external_dsn := 'host=docker-test port=7001 dbname=test password=postgres1'
);

on postgres2:

CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;

SELECT bdr.bdr_group_join(
  local_node_name := 'postgres2',
  node_external_dsn := 'host=docker-test port=7002 dbname=test password=postgres2',
  join_using_dsn := 'host=docker-test port=7001 dbname=test password=postgres1'
);

We can check that replication is working correctly b creating some tables in the test database on one instance and then checking that they are replicated to the other instance and vice versa.

Then we destroy the postgres1 instance:

docker ps
docker rm -f -v 238336_postgres1_1_b83117a70a12
docker volume ls
docker volume rm 238336_postgres1-data

Step 1: recreate a fresh instance postgres1

docker-compose up -d

Step 2: recreate and empty database “test” on postgres1

Step 3: execute the following on postgres2

BEGIN;
SET LOCAL bdr.skip_ddl_locking = on;
SET LOCAL bdr.permit_unsafe_ddl_commands = on;
SET LOCAL bdr.skip_ddl_replication = on;
SECURITY LABEL FOR bdr ON DATABASE test IS NULL;
DELETE FROM bdr.bdr_connections;
DELETE FROM bdr.bdr_nodes;
SELECT bdr.bdr_connections_changed();
COMMIT;

DROP EXTENSION bdr;

Step 4: restart postgres2

docker-compose down && docker-compose up -d

Step 5: reestablish replication (starting with postgres2 and replicating initially to postgres1)

on postgres2

CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;

SELECT bdr.bdr_group_create(
  local_node_name := 'postgres2',
  node_external_dsn := 'host=docker-test port=7002 dbname=test password=postgres2'
);

on postgres1:

CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION IF NOT EXISTS bdr;

SELECT bdr.bdr_group_join(
  local_node_name := 'postgres1',
  node_external_dsn := 'host=docker-test port=7001 dbname=test password=postgres1',
  join_using_dsn := 'host=docker-test port=7002 dbname=test password=postgres2'
);

Step 6: check that replication is working again.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.