What did you choose over Influx?

thelittlenag · on July 12, 2023

GP here.

First time: we chose Timescale over Influx (and a few other competitors). I really liked Timescale. This was ~2.5 years ago and obviously Timescale was much less mature than today.

We also sent data to AWS Timestream. We're heavy AWS users and got to try out the product. I found it ok, but it was expensive even for us (Disney+).

Second time: team migrated from Influx to Clickhouse for server and network metrics. Services had been relying on that data being correct and timely in order to route traffic, and well...there were issues. The simplest solution was simply to replace Influx with a product more suited to handling high volumes of metrics...yeah, ironic.

willvarfar · on July 10, 2023

(not GP)

I've been bitten by the old Influx and had to migrate to something we could trust... (Influx basically tacitly admitted that the original architecture was pretty poo and they've since swapped out the engine (twice?), but it smells a bit like mongodb trying to reinvent itself and distance itself from all the early web-scale claims, so I'm kinda skeptical).

So I've rolled our own with MySQL+tokudb, but that's not a good choice for a new system as tokudb is disappearing. When I tried to migrate to MyRocks we discovered the newer kids like MyRocks don't really work nearly so well for specifically this kind of use-case.

Something I haven't personally tried, but have heard rave reviews of, is Timescale. It's a special storage engine for Postgres and it has a lot of nice features like auto-maintained rollups. And they have lots of deep technical blogposts that I find myself agreeing with, so it must be good! :D

RedShift1 · on July 10, 2023

I use Timescale and I can recommend it. The reason why I'm still using influx too (1.7) is because it's unmatched in its data storage efficiency and query performance. You can get close with Timescale, but its main power is having the query power of PostgreSQL, if you have room for the extra hardware resources it requires.

willvarfar · on July 10, 2023

(My memory is that we kept on influx including 1.7, but it was a while ago now so memory might be fuzzy)

I guess influx perf and efficiency really depend a lot on your data shape then :)

Our experience was that performance dropped off a cliff if you had too much data, too much tagset cardinality, or else your query was too broad. And when it failed, it lost data.

In fact, it lost data generally. When we were replacing it we dual-ran an ACID DB version (which, with tokudb, was fast enough to keep up (although we didn't index every tag column)). So we did a diff and discovered just small random holes in the influx data that we'd never noticed before.
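For illustration, a dual-run diff like the one described above could be sketched roughly as follows in Python. The `(series, timestamp)` point layout, the sample data and the name `find_holes` are assumptions made up for this example, not what was actually run.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical point identity: (series key, unix timestamp in seconds).
Point = Tuple[str, int]


def find_holes(reference: Iterable[Point], candidate: Iterable[Point]) -> List[Point]:
    """Return points present in the reference store but missing from the candidate."""
    seen: Set[Point] = set(candidate)
    return sorted(p for p in set(reference) if p not in seen)


# Made-up sample data: the ACID copy has one point the other store lacks.
acid_points = [("cpu,host=a", 0), ("cpu,host=a", 10), ("cpu,host=a", 20), ("cpu,host=b", 0)]
influx_points = [("cpu,host=a", 0), ("cpu,host=a", 20), ("cpu,host=b", 0)]

holes = find_holes(acid_points, influx_points)
print(holes)
```

On real volumes you would probably compare per time window rather than load both sides into memory at once.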

We had other considerations when we went mysql, as in, we were already using it. If shopping for a standalone solution to start a new project on, I'm thinking Timescale is the go-to these days?

physicles · on July 10, 2023

I’ve been self-hosting InfluxDB in the hundreds of GB range for several years. I wouldn’t say I’m super happy with it, but… let’s say we’ve reached an understanding, the software and I. We’re on the latest patch of 1.8 and content to stay there.

I agree with GP about storage efficiency, which is superb. Query performance is good as long as a single query doesn’t deal with more than ~dozens of series. And $deity help you if you want to do hourly roll-ups of all series for a short time range, as RAM usage is wildly unpredictable. Storage is optimized for long reads of a single series, not for short reads of many series (but in fairness, you have to choose one or the other, that’s just the physics of the thing).
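To make "hourly roll-ups of all series" concrete, here is a minimal Python sketch of that aggregation. The sample layout `(series, timestamp, value)` and the function name `hourly_rollup` are assumptions for illustration only; this is not how InfluxDB implements it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical sample layout: (series key, unix timestamp in seconds, value).
Sample = Tuple[str, int, float]


def hourly_rollup(samples: Iterable[Sample]) -> Dict[Tuple[str, int], float]:
    """Average every series into one value per (series, hour bucket)."""
    acc: Dict[Tuple[str, int], List[float]] = defaultdict(lambda: [0.0, 0.0])
    for series, ts, value in samples:
        bucket = ts - ts % 3600  # start of the hour containing ts
        entry = acc[(series, bucket)]
        entry[0] += value  # running sum
        entry[1] += 1      # running count
    return {key: total / count for key, (total, count) in acc.items()}


rollup = hourly_rollup([
    ("a", 0, 1.0),
    ("a", 1800, 3.0),
    ("a", 3600, 5.0),
    ("b", 10, 2.0),
])
print(rollup)
```

In this sketch the working state is one accumulator per (series, hour) pair, so it grows with the number of series touched, not with how long each series is.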

If I were starting from scratch, I’d probably pick Timescale. Or maybe DuckDB… I wonder if it would work for our use case.

tetha · on July 10, 2023

> And $deity help you if you want to do hourly roll-ups of all series for a short time range, as RAM usage is wildly unpredictable.

I think I went properly mad while trying to troubleshoot this. The same query sometimes pulls 5GB, sometimes 20GB, sometimes 50GB and sometimes OOMs at 200GB memory pulled beyond base load of the system. And there's no query planner, no execution log, no metrics to help you. And most documentation or threads about it can be summarized as "well sucks to be you, eh? Maybe less data would be an option I guess"

We don't do that anymore and just roll up a very small select number of metrics.

And yeah, we committed to Postgres as our main DB 2 years ago or so, and time is now freeing up to start work with TimescaleDB. Zabbix is supposed to be great with it.

ople · on July 10, 2023

I have pretty much exactly the same experience.

However, I do feel that they are trying to really do the right thing with the new 3.0 architecture, addressing the deficiencies (most importantly performance and full-fledged SQL) while keeping the stuff that works (InfluxQL for simple and legacy queries). Also leveraging open-source projects and contributing to their upstream is a plus. Thus I’m hoping for them to succeed delivering on that promise.

physicles · on July 11, 2023

Agreed, embracing battle-hardened open source tech will be a win for them and for customers.

However, once your storage layer is parquet and your query layer is SQL, well... DuckDB is also basically parquet+SQL, and it won't be long before there's a nice Postgres wire protocol adapter in front of it. What's the advantage of continuing to use InfluxDB if you don't need clustering or HA?

valyala · on July 10, 2023

It's unusual to read that InfluxDB is fast and efficient. Did you try VictoriaMetrics? It usually needs 10x less RAM than InfluxDB for the same workload, especially when the number of active time series is high. It also uses less CPU and disk space on the same production workload. [1]

[1] https://valyala.medium.com/insert-benchmarks-with-inch-influ...

RedShift1 · on July 10, 2023

I've heard of VictoriaMetrics before but haven't had time to play with it. InfluxDB is also now ingrained in a production system, so replacing it is not straightforward. The query language is also different, meaning everything that uses it will need to be updated too, and coming from a mainly SQL background, PromQL/MetricsQL looks weird.

valyala · on July 10, 2023

Agreed that PromQL and MetricsQL have limited querying capabilities compared to SQL or InfluxQL. But they cover most use cases for analyzing time series measurements, and allow writing much simpler queries than InfluxQL or Flux for these particular cases [1].

[1] https://valyala.medium.com/promql-tutorial-for-beginners-9ab...