In the MySQL world, I think it is pretty common to run a replicated server and keep a copy of the binlogs, which are the replication messages the primary sends to the secondary to keep it in sync. That supposedly lets you reconstruct every state the DB has been in. It would depend on implementation details I haven't checked, though: it doesn't follow automatically from replication. But I know of some sites that do it that way.
There is also the concept of "purely functional data structures", used for example by Happstack. Those are data structures where you never mutate anything; instead you do updates by allocating new nodes and pointers, so an update takes O(log n) operations, since you have to reallocate the path to the root of a B-tree-like structure instead of just overwriting a cell. In principle that should still not be too bad if the amount of data isn't huge. For whatever reason, though, Happstack's performance is apparently disappointing even when you take its design into account.
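To make the path-copying idea concrete, here's a minimal Haskell sketch with a toy binary search tree (real systems use B-tree-like structures with much higher fan-out, but the structural sharing works the same way):

    data Tree a = Leaf | Node (Tree a) a (Tree a)

    insert :: Ord a => a -> Tree a -> Tree a
    insert x Leaf = Node Leaf x Leaf
    insert x t@(Node l v r)
      | x < v     = Node (insert x l) v r  -- rebuild left path; right subtree shared
      | x > v     = Node l v (insert x r)  -- rebuild right path; left subtree shared
      | otherwise = t                      -- value already present: reuse the whole tree

    toList :: Tree a -> [a]
    toList Leaf = []
    toList (Node l v r) = toList l ++ [v] ++ toList r

    main :: IO ()
    main = do
      let v1 = foldr insert Leaf [5, 3, 8 :: Int]
          v2 = insert 4 v1  -- v1 is untouched; v2 shares everything off the path
      print (toList v1)  -- [3,5,8]
      print (toList v2)  -- [3,4,5,8]

Only the O(log n) nodes along the insertion path are allocated; every old version remains a valid, readable snapshot for free.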
This is precisely what https://github.com/sirixdb/sirix does. A resource in a database is stored in a huge persistent structure of index pages.
The main index is a trie, which indexes revision numbers. The leaf nodes of this trie are "RevisionRootPages". Under each RevisionRootPage, another trie indexes the main data. Data is addressed through dense, unique, and stable 64-bit int nodeKeys. The user-defined secondary indexes are currently also stored as further tries under a RevisionRootPage.
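For illustration, here's roughly how a dense 64-bit nodeKey can be resolved through such a trie by slicing the key into per-level page offsets. The 10-bit-per-level fan-out below is an assumption for the sketch, not SirixDB's actual constant:

    import Data.Bits (shiftR, (.&.))
    import Data.Word (Word64)

    -- Illustrative parameter: each trie level consumes 10 bits of the
    -- nodeKey, giving a fan-out of 1024 references per inner page.
    bitsPerLevel :: Int
    bitsPerLevel = 10

    -- Resolve a nodeKey to a path of page offsets, from the topmost
    -- inner page down to the page that holds the node.
    pagePath :: Int -> Word64 -> [Int]
    pagePath levels key =
      [ fromIntegral ((key `shiftR` (l * bitsPerLevel)) .&. 0x3FF)
      | l <- reverse [0 .. levels - 1] ]

    main :: IO ()
    main = print (pagePath 3 1234567)  -- [1,181,647]

Because the keys are dense and stable, this lookup is pure arithmetic on the key; no comparisons or rebalancing are needed.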
The last layer of inner pages in a trie adds references to a predefined maximum number of data page fragments. The copy-on-write architecture does not simply copy whole data pages; what gets copied depends on the versioning algorithm. The default is a sliding snapshot algorithm, which copies changed/inserted/deleted nodes plus the nodes that fall out of a predefined window (the window is usually small, as the page fragments have to be read from random locations in parallel to reconstruct a full page). This reduces the amount of data stored for each new revision. The inner pages of the trie (as well as the data pages) are not page-aligned, so they might be small, and they are compressed before being written to persistent storage.
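A rough Haskell sketch of how a full page gets reconstructed from sliding-snapshot fragments; the Slot/Fragment types are made up for illustration, and the point is just that the newest fragment containing a node wins and the window bounds how far back you have to read:

    import qualified Data.Map.Strict as M

    -- Hypothetical types for the sketch: a "page" maps node slots to
    -- payloads, and each revision persists only a fragment of it.
    type Slot = Int
    type Fragment = M.Map Slot String

    -- Merge fragments listed newest-first: for each slot, the newest
    -- fragment containing it wins. The sliding window guarantees every
    -- slot appears within the last few fragments, so only those need
    -- to be fetched (in SirixDB, in parallel from random locations).
    reconstruct :: [Fragment] -> Fragment
    reconstruct = foldl M.union M.empty  -- M.union is left-biased: newest wins

    main :: IO ()
    main = do
      let r3 = M.fromList [(1, "a''")]            -- rev 3: slot 1 changed
          r2 = M.fromList [(2, "b'"), (3, "c")]   -- rev 2: slot 2 changed,
                                                  --   slot 3 re-copied as it
                                                  --   left the window
      print (reconstruct [r3, r2])  -- full page without reading rev 1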
Currently, it offers a single read-write transaction per resource, plus read-only transactions that don't take any locks.
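The general shape of that scheme (not SirixDB's actual API, just a sketch of the idea): readers atomically read the current immutable revision root and never block, while the single writer is serialized and publishes a new root on commit:

    import Control.Concurrent.MVar
    import Data.IORef

    data Store s = Store { currentRoot :: IORef s, writeLock :: MVar () }

    beginRead :: Store s -> IO s
    beginRead store = readIORef (currentRoot store)  -- pins one revision, lock-free

    commitWrite :: Store s -> (s -> s) -> IO ()
    commitWrite store f = withMVar (writeLock store) $ \_ -> do
      old <- readIORef (currentRoot store)
      writeIORef (currentRoot store) $! f old  -- readers see either old or new root

    main :: IO ()
    main = do
      lock <- newMVar ()
      root <- newIORef (0 :: Int)
      let store = Store root lock
      commitWrite store (+ 1)
      beginRead store >>= print  -- 1

Since every revision is immutable once written, a reader that has pinned a root can keep traversing it safely no matter what the writer does afterwards.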
Those logs, though, live at a lower level of abstraction than the DB itself (using them requires stepping outside the database proper).
What the parent is referring to is the actual database schema itself: it's essentially a giant versioned log.
Presumably there is some sort of ‘key frame’, however, or you’d need to go all the way back however many years to start rebuilding the object’s current state?
> Presumably there is some sort of ‘key frame’, however, or you’d need to go all the way back however many years to start rebuilding the object’s current state?
Correct. My criterion for making the snapshot feature a priority will be log recovery taking longer than 5 minutes. Based on current figures, I cannot see that happening anytime soon.
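For anyone unfamiliar with the pattern, the snapshot/‘key frame’ idea is just: persist the folded state at some point, then recovery only replays events logged after it. A toy sketch (the Event type and balance state are invented for illustration):

    data Event = Deposited Int | Withdrew Int

    apply :: Int -> Event -> Int
    apply bal (Deposited n) = bal + n
    apply bal (Withdrew n)  = bal - n

    -- Recovery folds the log tail over the latest snapshot instead of
    -- the initial state, so replay time is bounded by the number of
    -- events written since that snapshot.
    recover :: Int -> [Event] -> Int
    recover = foldl apply

    main :: IO ()
    main = do
      let snapshot    = 100                           -- state persisted earlier
          eventsAfter = [Deposited 50, Withdrew 30]   -- only the tail is replayed
      print (recover snapshot eventsAfter)            -- 120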