There are a bunch of pitfalls.

For one, listen/notify in Postgres wakes up all listeners on each notify. You will want a single stream of indexing updates, yet you will also want it to be fault-tolerant; if one indexer isn't running, another must take over its work.
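Because every listener wakes on each NOTIFY and the same row may be notified many times in a burst, a listener typically coalesces payloads before indexing. A minimal sketch, assuming psycopg2 and a hypothetical channel name `index_updates` (not from the original post):

```python
def coalesce_payloads(payloads):
    """Collapse a burst of NOTIFY payloads (row IDs) into a unique set,
    so a row notified many times in quick succession is indexed once."""
    return {p for p in payloads if p}

if __name__ == "__main__":
    # Hypothetical listener loop; connection string and channel name
    # are assumptions for illustration.
    import select
    import psycopg2
    conn = psycopg2.connect("dbname=app")
    conn.autocommit = True  # NOTIFY delivery requires no open transaction
    cur = conn.cursor()
    cur.execute("LISTEN index_updates;")
    while True:
        # Wait up to 5s for the socket to become readable.
        if select.select([conn], [], [], 5) == ([], [], []):
            continue  # timeout, no notifications
        conn.poll()
        ids = coalesce_payloads(n.payload for n in conn.notifies)
        conn.notifies.clear()
        # ... reindex the rows in `ids` ...
```

Note this still loses notifications delivered while the listener is down, which is why the checkpoint-based designs below matter.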

Listen/notify also doesn't work well if you want to distinguish between batch updates (large imports, for example) and realtime ones.

Thirdly, if you miss events (for example because an indexer was down), you will want syncing to catch up from where it left off.

The better design is to maintain state about indexing. If each table has a sequential ID and an "updated_at" column, you can exploit those to build deltas. Another option is to maintain a "changes" table that simply tracks the ID of each modified row, with a sequence number per change. This can be super useful for other purposes (auditing, debugging, support), and makes it easy to index from an earlier point in time.
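The "changes" table approach can be sketched as a trigger that appends one row per modification, plus an indexer that resumes from its last checkpoint. All table, column, and function names here are hypothetical assumptions, not from the original comment:

```python
# Hypothetical DDL: a global change log fed by a per-table trigger.
CHANGES_DDL = """
CREATE TABLE changes (
    seq        bigserial PRIMARY KEY,   -- monotonically assigned sequence number
    table_name text NOT NULL,
    row_id     bigint NOT NULL,
    changed_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO changes (table_name, row_id)
    VALUES (TG_TABLE_NAME, COALESCE(NEW.id, OLD.id));
    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER documents_changes
AFTER INSERT OR UPDATE OR DELETE ON documents
FOR EACH ROW EXECUTE PROCEDURE log_change();
"""

def rows_to_reindex(changes, last_seq):
    """Given change rows as (seq, row_id) tuples and the last sequence
    number this indexer processed, return (new_checkpoint, row ids to
    reindex). Deduplicating means a row modified many times since the
    checkpoint is fetched and indexed only once."""
    pending = [(s, r) for s, r in changes if s > last_seq]
    if not pending:
        return last_seq, set()
    return max(s for s, _ in pending), {r for _, r in pending}
```

Storing `last_seq` per indexer also gives you the fault-tolerance property above: a replacement indexer just resumes from the stored checkpoint.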

Such a system will also let you more easily recover from hard failures in ES where entire shards (and their replicas) are lost (I have had it happen with 1.x; it should be much rarer in 2.x).

Keeping one data store (ES) in sync with another (Postgres, or anything else) is an interesting problem in general.



> If each table has a sequential ID and an "updated_at" column, you can exploit those to build deltas.

It's hard because concurrent transactions can commit sequential IDs or timestamps out of order: there can be a lag between the moment a sequential ID or timestamp is generated and the moment the row is committed, so a scan for values greater than the last one seen can permanently skip rows that commit late.
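The race can be shown with a toy simulation (invented timings, purely illustrative): a transaction takes a low ID but commits slowly, while a scanner records a high-water mark from a faster transaction in between.

```python
def visible_ids(events, at_time):
    """Row IDs whose transactions have committed by `at_time`.
    `events` is a list of (row_id, commit_time) pairs."""
    return sorted(i for i, commit in events if commit <= at_time)

# T1 takes id=1 but commits at t=5; T2 takes id=2 and commits at t=2.
events = [(1, 5), (2, 2)]

# The indexer scans at t=3: it sees only id=2 and records last_seen=2.
seen_at_t3 = visible_ids(events, at_time=3)
last_seen = max(seen_at_t3)

# The next scan at t=6 asks for ids > last_seen, so id=1 (committed at
# t=5, below the high-water mark) is never picked up.
missed = [i for i in visible_ids(events, at_time=6)
          if i <= last_seen and i not in seen_at_t3]
```

Here `missed` ends up containing row 1, the late-committing transaction. Overlapping scan windows reduce the problem but cannot fully eliminate it.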

> Another option is to maintain a "changes" table that simply tracks the ID of each modified row, with a sequence number per change.

Yes.


Good point about sequence IDs.

Another, more recent method that's bound to be more foolproof is to track the actual transaction log. Postgres exposes an API for "logical decoding" of said log. The downside is that it's fairly new, Postgres-specific, and I also suspect it puts more burden on the client, which has to interpret the entries and determine if/how they should be indexed. But in theory it gives you a completely data-loss-proof, application-independent way of streaming changes from Postgres.
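To illustrate the "client has to interpret the entries" burden: Postgres ships a `test_decoding` output plugin whose text format a consumer must parse (production setups would more likely use a structured plugin such as wal2json). A hedged parsing sketch, with the regexes being my own assumption about the format:

```python
import re

# Matches one row line from the built-in test_decoding plugin, e.g.
#   table public.users: UPDATE: id[integer]:42 name[text]:'bob'
LINE_RE = re.compile(r"^table (\S+): (INSERT|UPDATE|DELETE): (.*)$")
COL_RE = re.compile(r"(\w+)\[[^\]]+\]:('(?:[^']|'')*'|\S+)")

def parse_change(line):
    """Parse one test_decoding line into (table, op, {column: raw_value}).
    Returns None for non-row lines such as BEGIN/COMMIT markers."""
    m = LINE_RE.match(line)
    if not m:
        return None
    table, op, cols = m.groups()
    values = {name: raw.strip("'") for name, raw in COL_RE.findall(cols)}
    return table, op, values
```

The indexer would then map each (table, op, values) tuple onto an ES index/delete operation, which is exactly the interpretation work the comment above alludes to.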


I agree, using PostgreSQL logical decoding is a good solution.

MySQL can do something similar using a replication stream (https://github.com/siddontang/go-mysql-elasticsearch) and MongoDB by tailing the oplog.



