With TimescaleDB compression, 1000 rows of uncompressed data are compressed into column segments, moved to external TOAST pages, and then pointers to these column segments are stored in the table's "row" (along with other statistics, including some common aggregates).
So while the query processor might still be "row-by-row", each "row" it processes actually corresponds to a column segment for which parallelization/vectorization is possible. And because these column segments are TOASTed, the row itself holds just pointers, and you only need to read in the compressed column segments for the columns you are actually SELECTing.
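For anyone wanting to poke at this: you opt a hypertable into compression and tell TimescaleDB how to segment and order the rows that get packed into those per-column segments. A minimal sketch, assuming a hypothetical metrics(time, device_id, value) hypertable (names made up, option spellings may vary slightly by TimescaleDB version):

    -- Opt the hypertable into native compression; segmentby/orderby control
    -- how rows are grouped into the per-column segments described above.
    ALTER TABLE metrics SET (
      timescaledb.compress,
      timescaledb.compress_segmentby = 'device_id',
      timescaledb.compress_orderby   = 'time DESC'
    );

    -- Either compress older chunks by hand...
    SELECT compress_chunk(c)
    FROM show_chunks('metrics', older_than => INTERVAL '7 days') AS c;

    -- ...or let a background policy do it.
    SELECT add_compression_policy('metrics', INTERVAL '7 days');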
Anyway, you might have known this already; just wanted to clarify. Thanks for the discussion!
Yeah, very interesting. I was wondering how Timescale pushed Postgres more towards columnar storage without rewriting a bunch of Postgres itself.
My understanding of TOAST is that it is itself just a bunch of rows in a TOAST table that split the compressed "row" (or in this case "1000 rows of 1 column") across as many rows as required to store the data while remaining within the Postgres page size limit (normally 8 kB).
With the often-quoted Postgres per-row overhead of ~23 bytes, which you would have to pay for each TOAST row as well, does this not add up and eat into your storage efficiency? Or does compression work so well that the ~23 bytes × (N + 1) rows (1 row pointing to TOAST + N TOAST rows) required to store the "row" don't matter?
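I suppose you could measure this directly, something along these lines (chunk and hypertable names made up; the stats function is from TimescaleDB 2.x and its exact columns may differ by version), comparing a chunk's heap size against its TOAST table and looking at Timescale's own before/after numbers:

    -- Size of a chunk's heap vs. its TOAST table (replace with a real chunk name).
    SELECT c.relname,
           pg_size_pretty(pg_relation_size(c.oid)) AS heap_size,
           pg_size_pretty(pg_relation_size(t.oid)) AS toast_size
    FROM pg_class c
    JOIN pg_class t ON t.oid = c.reltoastrelid
    WHERE c.relname = '_hyper_1_1_chunk';

    -- Overall before/after compression stats for a hypertable.
    SELECT * FROM hypertable_compression_stats('metrics');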
Does Timescale do its own compression algorithm too? I see that in PG 14 TOAST column compression can be LZ4 instead of the out-of-the-box pglz, which apparently has a few problems; I see mentions on the mailing list of significant possible optimizations. When dealing with EBS-style storage, where read latencies can be multiple milliseconds, compression is always going to be a win, but it's an easy optimization either way, I'd think.
Timescale implements its own compression algorithms. It includes several, and automatically chooses the algorithm based on each column's data type:
- Gorilla compression for floats
- Delta-of-delta + Simple-8b with run-length encoding compression for timestamps and other integer-like types
- Whole-row dictionary compression for columns with a few repeating values (+ LZ compression on top)
- LZ-based array compression for all other types
This means that even within the same table, different columns will be compressed using different algorithms based on their type (or inferred entropy).
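You can get an intuition for why delta-of-delta works so well on timestamps with a quick plain-Postgres query (just an illustration, not Timescale's actual code path): for regularly spaced samples the first delta is constant and the second delta is almost always zero, which then run-length encodes to nearly nothing.

    -- Delta-of-delta on evenly spaced timestamps, illustrated with window functions.
    WITH samples AS (
      SELECT ts
      FROM generate_series('2024-01-01 00:00:00+00'::timestamptz,
                           '2024-01-01 00:01:00+00'::timestamptz,
                           '10 seconds') AS ts
    ), deltas AS (
      SELECT ts, ts - lag(ts) OVER (ORDER BY ts) AS delta
      FROM samples
    )
    SELECT ts,
           delta,                                           -- constant 10s
           delta - lag(delta) OVER (ORDER BY ts) AS delta2  -- almost always zero
    FROM deltas;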