I realized I’ve left out a major part from mysequence ofprevious feed
posts: a justification for why we should bother with a separate
So let’s give it a shot!
The fundamental problem is:
ids are assigned
order, but become visible to other threads in commit
To see how this causes a problem, consider the interactions and visibilities between three transactions to the same database:
t0: TRX0: BEGIN; INSERT INTO kv (ns, k, v) VALUES ("-", "k0", "v0"); COMMIT; t1: TRX1: BEGIN; INSERT INTO kv (ns, k, v) VALUES ("-", "k1", "v1"); t2: TRX2: BEGIN; INSERT INTO kv (ns, k, v) VALUES ("-", "k2", "v2"); t3: TRX0: SELECT MAX(id) FROM kv; t4: TRX2: COMMIT; t5: TRX0: SELECT MAX(id) FROM kv; t6: TRX1: COMMIT; t7: TRX0: SELECT MAX(id) FROM kv;
Here, we have two transactions that both insert a new
The database has
to assign an
value to each of those records, because we might be creating other associations to those records in our application code.
But other threads —
in this case — shouldn’t be able to see those records until we
, and so indeed the
Next, we commit
This was the third
insert the database saw, so we expect
, and indeed, it is, as verified at
Finally, we commit
Consumers will now be able to see an
record — hooray!
Except: if we applied a feed-style consumption based on
, we have introduced the possibility of skipping records.
To see how this happens, consider a feed-style fetch w/
We will run a query like
SELECT * FROM kv WHERE id > :cursor ORDER BY id LIMIT :limit
KV(id=1, ns=-, k=k0, v=v0) KV(id=3, ns=-, k=k2, v=v2)
Note that there is no
record committed yet, so this response is correct!
The problem comes from subsequent fetches: In particular, we will record
as the cursor to use for the next fetch.
This will prevent the subsequent fetch (with
) from ever observing the
This is pretty bad, and that’s why we pay the penalty of storing a separately assigned
After all the updates, I figure it’d be helpful to give a reference for the current schema we’ve built up over all the blog posts mentioned above:
CREATE TABLE `kv` ( `id` bigint(22) NOT NULL AUTO_INCREMENT, `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, `updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, `feed_sync_id` bigint(22) DEFAULT NULL, `shard` int(11) DEFAULT '0', `ns` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '', `k` varchar(255) COLLATE utf8_bin NOT NULL, `v` longblob NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `u_ns_k` (`ns`,`k`), UNIQUE KEY `u_fsi` (`feed_sync_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;