
I don't know that this article tackled the real elephant in the room for bi-di replication. What happens when the same insert happens on both primaries at once: who wins?


They will each apply the insert locally. Then they will attempt to replicate that insert to each other. Each will attempt to apply the replicated conflicting insert, which will cause an error and halt replication for both nodes.

If you update the same data on both nodes, this is a recipe for almost certain disaster. Postgres is not a distributed database and this doesn't make it one.
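
To make that failure mode concrete, here is a rough sketch of a PG 16-style two-node setup (the table, publication, subscription, and host names are all invented for illustration):

    -- Both nodes: the same table definition
    CREATE TABLE t (id bigint PRIMARY KEY, payload text);

    -- Node A publishes its local changes:
    CREATE PUBLICATION pub_a FOR TABLE t;

    -- Node A also subscribes to node B. "origin = none" (new in PG 16) asks the
    -- publisher to send only locally-originated changes, which is what keeps
    -- rows from looping back and forth between the two nodes.
    CREATE SUBSCRIPTION sub_from_b
        CONNECTION 'host=node-b dbname=app user=repl'
        PUBLICATION pub_b
        WITH (copy_data = false, origin = none);

    -- Node B is the mirror image: pub_b, plus sub_from_a pointing at node A.

    -- Now run this on both nodes at roughly the same time:
    INSERT INTO t (id, payload) VALUES (42, 'hello');
    -- Each local insert succeeds. Each apply worker then tries to apply the
    -- other node's row, hits a duplicate-key error on t's primary key, and
    -- replication stalls (or the subscription is disabled, if it was created
    -- with disable_on_error = true).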


Yeah, read the docs here [1] and it really does not seem like this would be a good system to work with. To make something fault tolerant you have to have a side process running that watches for server errors so you can resolve these conflicts when they arise.

pglogical gave you the option to determine the winner of a conflicting write (which I think is preferable). The inbuilt postgres stuff seems like it'd be rife with potential errors.

[1] https://www.postgresql.org/docs/16/logical-replication-confl...
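
For what it's worth, the manual recovery loop those docs describe looks roughly like this (a sketch; the subscription name is invented and the LSN is a placeholder you would take from the apply worker's error in the server log):

    -- Spot subscriptions whose apply worker is erroring:
    SELECT subname, apply_error_count, sync_error_count
    FROM pg_stat_subscription_stats;

    -- Then resolve by hand: disable, fix or remove the conflicting local row
    -- (or skip the offending remote transaction), and re-enable.
    ALTER SUBSCRIPTION sub_from_b DISABLE;
    ALTER SUBSCRIPTION sub_from_b SKIP (lsn = '0/14C0378');
    ALTER SUBSCRIPTION sub_from_b ENABLE;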


I assume you just go very basic. All updates to table X go to database Y. Don’t have all the clients slamming different databases with updates to the same stuff.

Not ideal but would still help with some scale stuff.


Or an analytics system: spray events at whatever server. Because each row is a unique record separate from the others, you get no conflicts.

Benefit - if your current system with a single primary for writes is suffering and your data and use-case fits this pattern, you can increase your write throughput without scaling up.


This. But if you use integer surrogate primary keys and seed one server with even ones and one server with odd ones, both incrementing by two, and put some uniqueness constraint on your natural key, you won't have this issue of insert collisions and halted replication, just normal SQL errors. (IIRC; this is non-Postgres advice from an earlier life, but it should work.)
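
A rough Postgres-flavoured sketch of that scheme (table and column names made up): identity columns can be given a per-node start and increment.

    -- Node A: odd surrogate keys
    CREATE TABLE reservations (
        id    bigint GENERATED BY DEFAULT AS IDENTITY
                  (START WITH 1 INCREMENT BY 2) PRIMARY KEY,
        guest text NOT NULL UNIQUE      -- uniqueness constraint on the natural key
    );

    -- Node B: identical table, except the identity starts at 2:
    --   id bigint GENERATED BY DEFAULT AS IDENTITY (START WITH 2 INCREMENT BY 2) PRIMARY KEY

    -- Replicated rows arrive with explicit id values, so the subscriber's identity
    -- sequence is never consulted; each node only generates its own half of the
    -- key space, so surrogate keys can't collide, and duplicate natural keys
    -- submitted to the same node fail up front with an ordinary unique-violation error.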


Yeah, that's not what this is for. This is more for "I have one DC in US and one in EU, and so on, with completely different customers, but if I need to I can access the data from any region anywhere".


Everyone in the thread, myself included, was thinking about failover and scaling applications, but your post is the right answer.


Right: Is the idea here that you use UUIDs for inserted rows, hence avoiding duplicate inserts?

But what happens if multiple updates or deletes target the same existing row?


I would assume the intended use-case is a "pseudo-sharded" cluster.

In an actually-sharded cluster, some scheme (e.g. a hash ring) directs writes to particular shards, and so any given row only lives on one shard.

In a pseudo-sharded cluster, you still direct writes to different "shards", but all the pseudo-shards also mirror each other's data, so every pseudo-shard actually has a complete dataset. Each row is still owned by exactly one shard, though; when updating that row, you must update it through that shard. All the data a shard does not own should be treated as a read-only/immutable cache of the other shards' data.

Personally, though, if I were setting something like this up, rather than bi-di logical replication on a single table, I'd just partition the table with one partition per shard, and then have the "owning" shard be the publisher for that partition and the rest of the shards be subscribers for that partition. Same effect, much less implementation complexity and much less risk.
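
A sketch of that layout with invented names, assuming two shards: each node owns one list partition, publishes only that partition, and subscribes to everyone else's.

    -- On every node: the same partitioned table
    CREATE TABLE events (
        shard_id int    NOT NULL,
        event_id bigint NOT NULL,
        payload  jsonb,
        PRIMARY KEY (shard_id, event_id)   -- the partition key must be in the PK
    ) PARTITION BY LIST (shard_id);

    CREATE TABLE events_shard_a PARTITION OF events FOR VALUES IN (1);
    CREATE TABLE events_shard_b PARTITION OF events FOR VALUES IN (2);

    -- Node A only ever writes rows with shard_id = 1 and publishes just its partition:
    CREATE PUBLICATION pub_shard_a FOR TABLE events_shard_a;

    -- Node B (and any other node) subscribes to it:
    --   CREATE SUBSCRIPTION sub_shard_a
    --       CONNECTION 'host=node-a dbname=app user=repl'
    --       PUBLICATION pub_shard_a;
    -- No two nodes ever write to the same partition, so there is nothing to conflict.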


That sounds like a smart model for thinking about this, thanks.


Maybe a column with a server id could work? Can you have a schema differing only in the default value for a server id column or would it be replicated as "insert default value"?


No need... here's a different twist on that: make the integer primary key generation such that one server has odd IDs and the other even, with each incrementing by two. I did this for many years on Sybase and SQL Server for active-active failover.


In the MySQL world, bidirectional replication is common, but with at most one of the two being writable (the replication user can still write even if the secondary is read-only). Maybe they meant it to be similar and not truly multi-master? It does seem like a glaring omission given they talk about having two primaries.
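
That is, something like this on the passive member (standard MySQL settings, nothing specific to the article):

    -- Secondary: reject ordinary client writes; replication threads still apply changes
    SET GLOBAL read_only = ON;
    SET GLOBAL super_read_only = ON;   -- also blocks clients with SUPER/admin privileges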


This sounds like circular replication that was possible back in MySQL 3.x using asynchronous replication, and generally not recommended for primary-primary setups.

I can't speak for others, but to my mind Galera or Group Replication are the "modern" solutions to multi-primary MySQL replication. Both are considered virtually synchronous, and thus actively prevent conflicting queries from succeeding.


Maybe the way to use this is to have two nodes, only one of which receives traffic at any one time. With an active-active setup like that, the failover process can be swift.

However, this is kind of a tough thing to do right -- how do really know that only one is receiving traffic? -- and I wonder if there are many cases where it is a compelling alternative to an ordinary failover setup.


I had pure active-active running at one job. We had our app layer centralize all its database access through a single set of generic SQL wrapper functions, and had those functions implement failure and hang handling with simple exponential-backoff retry logic against the local DB before failing over to the more remote one, with bidirectional (non-Postgres) replication.

You can indeed get into split-brain situations and have to make hard choices as per the CAP theorem. And the replication will break sooner or later, so you have to detect that and learn how to recover safely, manually at least. Don't do it just for funsies. But it's definitely doable.


You use shard-compatible IDs, like snowflake IDs.
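
For example, a minimal snowflake-style generator sketched in PL/pgSQL (simplified: no per-millisecond sequence reset or clock-skew handling, and all names are invented). Giving each server its own node_id keeps the generated keys disjoint:

    CREATE SEQUENCE snowflake_seq;

    CREATE OR REPLACE FUNCTION next_snowflake(node_id int) RETURNS bigint AS $$
    DECLARE
        epoch_ms bigint := 1288834974657;  -- arbitrary custom epoch
        now_ms   bigint := (extract(epoch FROM clock_timestamp()) * 1000)::bigint;
        seq      bigint := nextval('snowflake_seq') % 4096;   -- 12-bit sequence
    BEGIN
        -- 41 bits of timestamp | 10 bits of node id | 12 bits of sequence
        RETURN ((now_ms - epoch_ms) << 22) | ((node_id % 1024)::bigint << 12) | seq;
    END;
    $$ LANGUAGE plpgsql;

    -- Node A defaults its key to next_snowflake(1), node B to next_snowflake(2).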


I assumed the above poster was referring to business rules - i.e. you can only have one person per table. DB 1 seats person A, DB 2 seats person B, and when they replicate there's a constraint that fails.

I’m not sure how ids like that would resolve this.



