We have been happily using Postgres-BDR for years as a multi-master database for fault-tolerant applications – it’s been incredibly robust for us. The advantage of a multi-master database is that it provides shared-nothing redundancy – you can have multiple application nodes, running geographically separated from one another and there’s no single point of failure. Postgres-BDR replication is asynchronous which makes it truly loosely-coupled, and they provide serviceable workarounds for many of the issues which asynchronous replication brings up (sequences, unique constraints). Most importantly, despite every conceivable network connectivity issue and node adding, removing and accidental destruction, we have never lost any data or had any downtime over a period of 5 years on those applications using Postgres-BDR.
However, the licensing of Postgres-BDR has changed and if we want continue to use it on the old license (i.e. for free), we’re stuck on an old version of Postgres (9.4).
So we’re looking for an alternative. Could CockroachDB (a clustered, fairly Postgres-compatible database) be a suitable replacement for Postgres-BDR?
TLDR answer: not exactly. The fact that CockroachDB uses synchronous replication and requires a minimum of 3 nodes and quorum of 2 means that its not a drop-in replacement for Postgres-BDR.
CockroachDB is a clustered database which provides near-compatibility to Postgres. A CockroachDB cluster uses synchronous replication between the nodes (and a 2 out of 3 quorum). This strategy has the advantage that it works like a normal database (i.e. no distributed sequences or conflict resolution or other weirdnesses arising from asynchronous replication), but the disadvantage that the performance of writes is subject to network performance (because they need to be replicated synchronously) and that the network topology is potentially more complex, since the database only remains functional if a quorum is available.
CockroachDB also provides horizontal scaling across nodes, which is certainly interesting for production applications, but not the topic of this post.
Postgres-BDR can be effectively used in a simple 2-node situation (primary and secondary servers). Both Postgres-BDR nodes continue to function even if they lose contact with one another (at the risk of running out of distributed sequence range if the nodes lose contact for too long or eventual conflict resolution issues when the node reestablish contact – but both these problems can be relatively easily avoided in the database model design). CockroachDB on the other hand requires 3 nodes and a quorum of 2. In practical terms, this means that you need 3 networks to connect the nodes and 2 out of those 3 networks have to be working at all times, otherwise the database won’t be usable. A workaround for this is to have two nodes on the primary site and a third on the secondary site, so that the nodes on the primary site will retain the quorum in the event that the network connection between the primary and secondary site goes down. In any case, the application may have to be aware of the CockroachDB database cluster status so that it can shut down the application node when the database node loses its contact to the cluster.
- Test 1: CockroachDB write performance vs Postgres-BDR
- Test 2: How well does CockroachDB handle one or more offline nodes etc.
- Test 3: Adding and removing nodes in production
I deployed 3 CockroachDB nodes in Docker (3 is the minimum recommended number of nodes), following directions from here.
Here’s the docker-compose.yml for the CockroachDB cluster.
version: "3.8" services: cockroachdb_node_1: image: cockroachdb/cockroach:v19.2.4 container_name: cockroachdb_node_1 ports: - "8001:8080" - "20001:26257" volumes: - cockroach_data_node_1:/cockroach/cockroach-data command: start --insecure --join=cockroachdb_node_1,cockroachdb_node_2,cockroachdb_node_3 networks: - cockroachdb_network cockroachdb_node_2: image: cockroachdb/cockroach:v19.2.4 container_name: cockroachdb_node_2 ports: - "8002:8080" - "20002:26257" volumes: - cockroach_data_node_2:/cockroach/cockroach-data command: start --insecure --join=cockroachdb_node_1,cockroachdb_node_2,cockroachdb_node_3 networks: - cockroachdb_network cockroachdb_node_3: image: cockroachdb/cockroach:v19.2.4 container_name: cockroachdb_node_3 ports: - "8003:8080" - "20003:26257" volumes: - cockroach_data_node_3:/cockroach/cockroach-data command: start --insecure --join=cockroachdb_node_1,cockroachdb_node_2,cockroachdb_node_3 networks: - cockroachdb_network cockroachdb_runner: image: cockroachdb/cockroach:v19.2.4 restart: on-failure depends_on: - cockroachdb_node_1 - cockroachdb_node_2 - cockroachdb_node_3 volumes: - ./data/sql-init.d:/docker-entrypoint-initdb.d - ./data/init-data.sh:/usr/local/bin/init-data.sh:ro command: - "shell" - "/usr/local/bin/init-data.sh" - "cockroachdb_node_1:26257" networks: - cockroachdb_network networks: cockroachdb_network: volumes: cockroach_data_node_1: cockroach_data_node_2: cockroach_data_node_3:
I did the same with postgres-bdr, following directions from here.
Here’s the docker-compose.yml for the Postgres-BDR cluster:
version: "3.8" services: node-1: image: jgiannuzzi/postgres-bdr container_name: bdr_node_1 volumes: - bdr_node_1_data:/var/lib/postgresql/data ports: - 6001:5432 environment: - POSTGRES_PASSWORD=postgres restart: always node-2: image: jgiannuzzi/postgres-bdr container_name: bdr_node_2 volumes: - bdr_node_2_data:/var/lib/postgresql/data ports: - 6002:5432 environment: - POSTGRES_PASSWORD=postgres restart: always node-3: image: jgiannuzzi/postgres-bdr container_name: bdr_node_3 volumes: - bdr_node_3_data:/var/lib/postgresql/data ports: - 6003:5432 environment: - POSTGRES_PASSWORD=postgres restart: always volumes: bdr_node_1_data: bdr_node_2_data: bdr_node_3_data:
Test 1: insert 10000 rows
I did 10000 single row inserts, each containing a JSON column (about 50MB of data total).
CockroachDB took 7ms per row insert. Postgres-BDR was twice as fast, with 3.5ms per row. This performance difference is presumably mainly due to the synchronous replication of CockroachDB versus the asynchronous replication of Postgres-BDR.
Test 2a: take node 3 down, add 10000 rows and bring node 3 back online
Both databases successfully resync’d the node within a few seconds after it came back online.
(1) Postgres-BDR did not allow DDL operations while a node was offline, CockroachDB did
Test 2b: repeat Test 1 with 2 out of 3 nodes down
CockroachDB refused to function as soon as it lost its quorum of 2 nodes. It recovered correctly as soon as the offline nodes were brought online. Postgres-BDR worked fine with only one node active and resynced the other nodes when they came back online.
This behaviour is by design in both cases – if you want 2 nodes to be able to fail in CockroachDB, you need more than 3.
Test 3: adding and removing nodes
Adding a node was easy – I just added a node 4 to the docker-compose and brought it up. It automatically joined the cluster.
To remove the node I executed “/cockroach/cockroach.sh quit –decommission –insecure” from within node 4.
rogersmac:cockroachdb roger$ docker exec -it cockroachdb_node_4 bash root@2d1e9d46f9eb:/cockroach# /cockroach/cockroach.sh quit --decommission --insecure id | is_live | replicas | is_decommissioning | is_draining +----+---------+----------+--------------------+-------------+ 4 | true | 112 | true | false (1 row) ............ id | is_live | replicas | is_decommissioning | is_draining +----+---------+----------+--------------------+-------------+ 4 | true | 0 | true | false (1 row) No more data reported on target nodes. Please verify cluster health before removing the nodes.
CockroachDB appears to be an excellent product, but it’s not a drop-in replacement for Postgres-BDR. CockroachDB’s synchronous replication has advantages and disadvantages compared to the asynchronous replication of Postgres-BDR and requires a very different system design.