Are you at all worried about the size of the audit table? Keeping a JSON copy of both the old and the new data is going to add up pretty quickly for fast-changing tables.
I've built audit systems in the past which track just the new values, with the idea being that you can replay the audit log for a particular record from the beginning to reconstruct its state. I'm not convinced that's a better approach than yours here though, just interested in your thinking.
> Are you at all worried about the size of the audit table? Keeping a JSON copy of both the old and the new data is going to add up pretty quickly for fast-changing tables.
That's definitely a valid concern if you're auditing a high-frequency insert/update table, or have your entire database under audit. If you do have those needs, a system that logs outside of Postgres (like pgAudit) would be a better fit.
In my experience, most startups and mid-size enterprises sprinkle in auditing around the sensitive parts of the DB, like `account` or `access_control` tables, where write load isn't much of a concern.
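For illustration, scoping auditing to just those tables is a matter of attaching a row trigger to each of them individually. A minimal sketch, where `audit.insert_update_delete_trigger` is an assumed function name rather than the exact API from the post:

```sql
-- Only the sensitive table is audited; nothing else pays the trigger cost.
-- The trigger function name below is illustrative.
create trigger audit_account
    after insert or update or delete
    on public.account
    for each row
    execute function audit.insert_update_delete_trigger();
```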
> I've built audit systems in the past which track just the new values
Yeah, that solution works great! The one place it breaks down is when you apply auditing to an existing table and need to be able to recover records as they were when auditing was enabled (the initial state is never snapshotted)
We toyed with another approach to avoid having to track old_record: first check whether the `record_id` already exists in the audit table, and insert a row with the OLD values and a `SNAPSHOT` `operation` if it does not.
Even though that query/check was operating on an index, the performance overhead was higher than we were comfortable with.
Storage is pretty cheap these days so we opted to optimize for write throughput + reduced memory usage rather than any considerations wrt disk.
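For context, the rejected approach looked roughly like this. A sketch only, with assumed names (`audit.record_version`, `record_id`, `op`, `record`) and a plain `id` primary key, not the real implementation:

```sql
-- Sketch of the rejected check-then-snapshot trigger. Before logging an UPDATE or
-- DELETE, look up whether the row has ever been audited; if not, snapshot OLD first.
create or replace function audit.audit_with_snapshot()
    returns trigger
    language plpgsql
as $$
begin
    if tg_op in ('UPDATE', 'DELETE') and not exists (
        select 1
        from audit.record_version
        where record_id = old.id          -- assumes a plain "id" primary key
    ) then
        insert into audit.record_version (record_id, op, record)
        values (old.id, 'SNAPSHOT', to_jsonb(old));
    end if;

    -- Normal audit row for the current operation.
    insert into audit.record_version (record_id, op, record)
    values (coalesce(new.id, old.id), tg_op, to_jsonb(coalesce(new, old)));

    return coalesce(new, old);
end;
$$;
```

That `not exists` probe is the extra per-write index lookup whose overhead is mentioned above.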
> We toyed with another approach to avoid having to track old_record: first check whether the `record_id` already exists in the audit table, and insert a row with the OLD values and a `SNAPSHOT` `operation` if it does not.
> Even though that query/check was operating on an index, the performance overhead was higher than we were comfortable with.
Was this done inside the trigger function? Couldn't you just insert the SNAPSHOT values once, before attaching the trigger? It seems wasteful to me to track both NEW and OLD for every subsequent operation just to ensure the initial value is covered.
I've also used an update trigger to track only changed columns, inspired by this post[0], which wraps a comparison function in a new subtract operator for JSONB, similar to the built-in subtract operator for HSTORE that removes any key/value pairs that haven't changed.
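For reference, a minimal version of that JSONB subtract operator; a sketch in the spirit of the linked post, not a verbatim copy:

```sql
-- Keep only the key/value pairs in "l" that are missing from, or different in, "r".
create or replace function jsonb_minus(l jsonb, r jsonb)
    returns jsonb
    language sql
    immutable
as $$
    select coalesce(jsonb_object_agg(key, value), '{}'::jsonb)
    from jsonb_each(l)
    where r -> key is distinct from value;
$$;

-- Mirror the HSTORE subtract operator for jsonb.
create operator - (
    leftarg  = jsonb,
    rightarg = jsonb,
    function = jsonb_minus
);
```

Inside the update trigger, `to_jsonb(NEW) - to_jsonb(OLD)` then yields just the changed columns.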
Building a bit off Simon's question above: I'm curious if any thought was given to using JSON over JSONB. The trade-off seems like it could favor plain JSON's very fast ingest, since JSONB has to do some conversion on write, and any indexes on the JSONB columns could dramatically slow down throughput.
> I'm curious if any thought was given to using JSON over JSONB
For sure! There is a conversion penalty, paid incrementally at write time. In exchange, jsonb gets you reduced storage size and dramatically faster column value extraction with `->` or `->>` if you ever have to filter the audit table by something other than `record_id`.
It is a tradeoff though, and depending on the specifics of your use case, JSON could be a better choice.
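For example, this is the kind of access pattern where jsonb earns that write-time penalty (assuming the payload column is called `record`):

```sql
-- Filter the audit history by a field inside the payload, not just record_id.
select *
from audit.record_version
where record ->> 'email' = 'jane@example.com';

-- Unlike json, jsonb can also be GIN-indexed over the whole payload.
create index record_version_record_gin
    on audit.record_version
    using gin (record);
```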
First of all, this looks super awesome, and at least an order of magnitude better than our home-grown solution (grown over 25+ years on a DB that is now MariaDB).
How well does this work when you audit multiple tables with foreign key relationships between them?
If the user references a customer and you want to find out which customer the user belonged to at time X (but the customer might have been deleted in the meantime), you have to trace through the history of multiple tables to find answers. More so if you have n2m relationships...
It sounds doable, but not trivial to get right for more complicated models.
---
Update: to clarify, I want to be able to present a somewhat human-readable history of an object without doing much table-specific logic.
If an author can publish many books, you typically have an n2m mapping table, like book_authors. If you want to add newly published books to the "author" history view, you have to work out that you need to include parts of the history of the book_authors table. I don't think you can do that through a join though? Trying to think my way through this...
Maybe this can be dealt with more easily if you can assume each table has a non-composite primary key and include that primary key in the history table, so you can do joins on the history more directly?
All the data would be available, and joining it would be possible with some jsonb gymnastics, but it's optimized more for compliance-style auditing than for rapid lookups with complex queries.
For that scenario, replicating the audit data to an OLAP system may be a better fit.
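To give a flavor of the gymnastics, something along these lines would answer the author/books question; every name here (`audit.record_version`, `table_name`, `ts`, `op`, `record`) is an assumption for the sketch:

```sql
-- List the books linked to one author over time by joining the audit history of the
-- book_authors mapping table against the audit history of book.
select
    ba.ts,
    ba.op,                                   -- INSERT/DELETE of the mapping row
    b.record ->> 'title' as book_title
from audit.record_version ba
join lateral (
    select record
    from audit.record_version
    where table_name = 'book'
      and record ->> 'id' = ba.record ->> 'book_id'
      and ts <= ba.ts                        -- the book as it looked at that moment
    order by ts desc
    limit 1
) b on true
where ba.table_name = 'book_authors'
  and ba.record ->> 'author_id' = '42';
```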
Really nice post! A possible addition could be showing the diff as its own field. This seems like a sane approach: https://stackoverflow.com/a/36043269
But how could you go about not storing the diff but still being able to select it?
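Something like this might work, assuming each audit row keeps both an `old_record` and a `record` jsonb payload (names assumed):

```sql
-- Build the per-column diff at query time instead of persisting it.
select
    v.id,
    (
        select jsonb_object_agg(
                   n.key,
                   jsonb_build_object('old', v.old_record -> n.key, 'new', n.value)
               )
        from jsonb_each(v.record) as n
        where v.old_record -> n.key is distinct from n.value
    ) as diff
from audit.record_version v
where v.old_record is not null;
```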
This is great! I've done something similar many times, but the ability to do it on arbitrary tables is really nice.
If you have the HSTORE extension, you can also use it to get a diff of the old and new: `HSTORE(OLD) - HSTORE(NEW)` and vice versa. Very nice to easily have the deltas.
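Roughly, inside an update trigger (the `audit_delta` table and its columns here are made up):

```sql
-- Requires: create extension if not exists hstore;
-- Persist only the columns that actually changed, using hstore subtraction.
create or replace function log_changed_columns()
    returns trigger
    language plpgsql
as $$
begin
    insert into audit_delta (table_name, changed, previous)
    values (
        tg_table_name,
        hstore(new) - hstore(old),   -- new values of the changed columns
        hstore(old) - hstore(new)    -- their old values
    );
    return new;
end;
$$;

create trigger account_delta
    after update on account
    for each row
    execute function log_changed_columns();
```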
I'm not completely clear on the reason for both record_id and old_record_id - is this so that you can capture instances where one of the columns that makes up the primary key for the row is itself modified for some reason?
Being able to see when the primary key changes using `record_id` and `old_record_id` is a perk, but we also wanted to make sure a snapshot of the existing data was available when auditing is enabled on a table that already has data in it. See comment ^ for more color
Long story short: it is avoidable if you're okay with paying the penalty of querying the audit table by `record_id` before every write. We decided to use more disk and keep the higher write throughput
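Concretely, a history lookup then looks something like this (assuming the audit table also carries a timestamp column, here called `ts`):

```sql
-- Full history of one row, including the version where its primary key changed.
select ts, op, old_record, record
from audit.record_version
where record_id = $1
   or old_record_id = $1
order by ts;
```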
happy to answer any questions