I suspect many apps would benefit from splitting the commit completion notificat...

koolba · on March 19, 2025

IIRC, that doesn’t work in practice because once you have a synchronous commit requested, all existing async commits that preceded it must be guaranteed to be fsynced for that sync commit to be fsynced. Which kind of makes sense if you think about the later work potentially reading the result of the earlier one.

The high throughput gains of async commit shine when the totality of the workload is async.

rzwitserloot · on March 19, 2025

I think you mean: If anybody anywhere decided to invoke a wait() until the state "my COMMIT; has been durably committed", then that means all previous commits must also be in durably committed state before we can continue.

Sure. But this doesn't affect other code that is merely wait()ing for 'logically committed'. Not in the past, and usefully, not in the future either. The 'logical system' can be far ahead of the 'durable' system, it doesn't have to wait.

Where it would go wrong is if code A needs to wait around for code B to do a thing (such as provide a value), and B's code for some reason contains a wait() on durable. Possibly because B was written before the split in commit behaviour was around, and for backwards compatibility reasons, for them COMMIT; means: "wait for durable commit".

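Given that `synchronous_commit` can be turned off per transaction (`SET LOCAL synchronous_commit = off;` inside the transaction; a plain `SET` applies to the rest of the session), I'm kinda inspired here. I can think of a few places where we commit for various logical reasons but I don't need the fsync guarantee.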

paulddraper · on March 19, 2025

But....

Only the sync committer needs to wait for those. (Nothing changes.)

The async committers can go on their merry way.
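
To make the point concrete, here is a toy model of a write-ahead log. It is only a sketch: `ToyWal`, its fields, and its methods are made-up names for illustration, not PostgreSQL internals. It assumes a flush always makes a prefix of the log durable, so a sync commit pays for every record appended before it, while async committers return without flushing.

```python
from dataclasses import dataclass


@dataclass
class ToyWal:
    """Toy log: records get increasing LSNs; a flush makes a prefix durable."""

    next_lsn: int = 1
    flushed_lsn: int = 0  # every record with lsn <= flushed_lsn is durable
    flush_count: int = 0  # number of simulated fsync calls

    def commit(self, synchronous: bool) -> int:
        """Append a commit record and return its LSN.

        An async commit returns right after the append (logically committed).
        A sync commit also flushes up to its own LSN, which necessarily
        covers every record appended before it.
        """
        lsn = self.next_lsn
        self.next_lsn += 1
        if synchronous:
            self.flush_to(lsn)
        return lsn

    def flush_to(self, lsn: int) -> None:
        # Flushing is prefix-based: nothing to do if lsn is already durable.
        if lsn > self.flushed_lsn:
            self.flushed_lsn = lsn
            self.flush_count += 1

    def is_durable(self, lsn: int) -> bool:
        return lsn <= self.flushed_lsn


wal = ToyWal()
a1 = wal.commit(synchronous=False)  # returns immediately, not yet durable
a2 = wal.commit(synchronous=False)
before = (wal.is_durable(a1), wal.flush_count)
s1 = wal.commit(synchronous=True)   # the only caller that waits for a flush
a3 = wal.commit(synchronous=False)  # later async commit, again no waiting
```

In this model the single flush triggered by `s1` makes `a1` and `a2` durable as a side effect, but neither of them (nor `a3`) ever waited on it.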

x0x0 · on March 19, 2025

Right, but there are common use cases (audit, log, analytics tables) where you can tolerate a small risk of loss and get some hefty speedups.

ddorian43 · on March 19, 2025

It probably costs more to send two notifications to the user than to always fsync every write on an SSD.

shayonj · on March 19, 2025

+1 - I have found it's better to instead use it on paths that you know are idempotent and recoverable. Like expiring data on a cron and so on.

cryptonector · on March 19, 2025

The network is faster than the storage.

rubiquity · on March 19, 2025

No, it isn't, and it hasn't been for quite some time, and it will probably remain that way for the rest of our lives.

cryptonector · on March 20, 2025

It still doesn't matter. All you need is enough CPU cycles to send the additional completion notice.

sgarland · on March 20, 2025

With nearly all DBaaS, and _especially_ Aurora, as TFA mentions, the storage is also on the network, so it's moot.

ants_a · on March 20, 2025

There is a paper exploring this concept: https://cs.uwaterloo.ca/~kdaudjee/ED.pdf

UI-wise it does not make sense to have this distinction, as the window to get durability is a small fraction of a second. But for concurrent modifications, the reduction in lock duration can mean an order of magnitude more throughput.
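
A back-of-the-envelope sketch of the lock-duration argument, using assumed numbers (0.1 ms of in-memory work and 1 ms for a log flush; these are illustrative, not measurements and not figures from the paper). If every transaction updates the same hot row, they serialize on its lock, so throughput on that row is roughly the inverse of how long the lock is held.

```python
def max_serial_tps(lock_hold_seconds: float) -> float:
    """Upper bound on transactions/second when all of them need the same lock."""
    return 1.0 / lock_hold_seconds


WORK_S = 0.0001   # assumed: 0.1 ms of actual work while holding the lock
FLUSH_S = 0.001   # assumed: 1 ms to flush the commit record to stable storage

# Lock held until the commit is durable (work + flush).
held_through_flush = max_serial_tps(WORK_S + FLUSH_S)

# Lock released at logical commit, before the flush completes.
released_before_flush = max_serial_tps(WORK_S)

speedup = released_before_flush / held_through_flush
print(f"{held_through_flush:.0f} tps vs {released_before_flush:.0f} tps, "
      f"about {speedup:.0f}x")
```

With these assumptions the hot row goes from roughly 900 to 10,000 transactions per second. The ratio depends entirely on how flush latency compares with the work done under the lock.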