InfluxData is killing InfluxDB with their changes. Their v1->v2->v3 changes are beyond insane. They revealed that Flux, their main selling point for v2, is deprecated in v3. In the database domain you want stability, not a breaking change every 2-3 years. See https://news.ycombinator.com/item?id=37206194
Don't forget that in v1->v2 they deprecated InfluxQL, their half-baked implementation of a SQL-like query language with half the semantics of SQL and plenty of footguns. And now, after telling everyone how much better Flux was going to be, they deprecated it and brought back InfluxQL...
And before that they were thrashing around on which versions clustering features would appear in. That was an initial deal breaker for us because "Oops we don't know how to stay in business without changing terms" is a strong signal.
So we went to C* but revisited Influx years later. If the stability had been there in the first place, we would have been an early enterprise customer.
With v3 we're bringing the v1 API into it so that people will be able to interact with it as if it were a v1 database. There will be data migration tools from both v1 and v2 into v3. We would have loved to bring the v2 API along, but ultimately the API and the Flux language surface area were far too large.
We tried to do too much all at once with v2. Our goal with v3 is to bring the focus back onto the core of the database.
I said a bit of this in our community Slack channel earlier today, but I think it's worth repeating here:
I really would have loved for us to have more time developing Flux as a language, but we spent all of our time trying to optimize its performance. As a result, we weren't able to make changes to it that I think would have helped people understand and use it.
We pushed on it for 4 years and were told again and again by new users and people that chose other tools that they didn't want to adopt Flux and found it difficult to use. I realize that this wasn't universal and we had some real fans. But we still weren't able to bring the performance that most of our users required.
At the end of the day it was just too much to try to do in addition to creating the core database and all the managed, hosted service around it.
I was told once quite some time ago that while Flux made the impossible things possible (i.e. you could do things you actually couldn't do in other query languages), it made the easy things hard. I think this was a correct assessment and that it's what caused the getting started experience to be so poor.
We knew this and wanted to improve it. I brainstormed with Nathaniel (primary Flux author & creator) around 3.5 years ago to make changes to the language. We did a lot of work and documented the whole thing: https://github.com/influxdata/flux/blob/master/docs/new_flux...
But we couldn't ever get to that work. We shelved it because we were constantly fighting performance and reliability fires.
With version 3 we built around an existing query engine (Apache DataFusion) and it's getting developed not only by us, but developers around the world. We think this will have huge benefits over time that lead to a faster query engine that has more features and is more reliable.
InfluxData are masters at pivots in search of product-market fit, though the lack of a clear migration path for users of old versions must be very concerning for its customers.
Also, after reading this announcement it is not clear to me what will actually be in the open source version. In the past, the lack of open source high availability made the open source version usable only for toy workloads.
V3 will support the V1 API and there will be data migration tools for both v1 and v2 into v3. For our customers, we continue to support v1, v2 and v3 and will for quite some time.
I was just evaluating InfluxDB yesterday for a commercial data streaming pipeline, but there are so many different names and versions that it is VERY unclear what's what, and what is actually available.
I'm sorry to say I don't have much good to say about InfluxDB. I first considered them around 2015. I loaded about a week of logs from our OpenStack systems into it. It seemed to work so I included it in our set of internally supported systems and started going about the task of adding it to our backup schedule.
Well, perhaps things are different now but at the time the only way to do a full backup was to save the output from a command like "select * from *". Before adding that to a cron task I tried it from the admin console and quickly caused the server to OOM and die.
It was immediately clear to me that this wasn't a company that cared much about reliability, since I was able to trivially crash it doing the only thing I could do at the time to back up my system. Obviously nobody at Influx cared about pesky things like backups or workload management, so I quietly decommissioned the system without incident and moved on to other things.
Some years later I watched Andy Pavlo interview their founder on one of the regular CMU database sessions. Andy was very critical at many points, effectively telling him "we told you this wouldn't work". I felt much the same way.
Influx looks very much like a company that may eventually build the right thing after building everything else first.
"Turns out, stream processing was only really just getting started. In the last 12-18 months a veritable plethora of stream processing projects and companies have shown up, demonstrating that there is not only demand for it but also different approaches to take too." https://rmoff.net/2023/09/21/an-itch-that-just-has-to-be-scr...
Feel free to check out OSS Proton: github.com/timeplus-io/proton, and let me know if any documentation or design needs to be updated to make it a bit clearer.
Influx 1.8 was and still is great. With 2.x, and with InfluxData shoving ridiculously slow Flux down people's throats at every step, they managed to drive away much of their user base.
I just hope 3.0 will bring back more focus on query performance and simplicity.
I think they lost a lot of goodwill in 2.x by pushing Flux so hard and then abandoning it. They also now have 4-5 different names for the same piece of technology in its various incarnations.
Honestly someone should just fork the InfluxDB v1.x line and focus on stability and performance. The pre-1.0 to 1.0 release was painful, but what's there now is pretty solid as well as MIT licensed.
It blows my mind that they still have not learned their lesson about world-breaking changes. How can you build anything on a database that is constantly reinventing itself, chasing some new hotness?
I found Flux to be awesome in that it allowed me to express queries that wouldn’t be expressible at all with promQL or SQL. Yes, you could write down a query that was extremely slow, but I rarely hit those cases.
Not that I’m going to miss it very much; in fact, their going back to SQL in IOx immediately put the brakes on my effort to learn the language further and killed any desire to write more dashboards. It left me feeling I had wasted a bunch of time learning as much of it as I had up to that point. I just hope they don’t end up going back to Flux again in 4.0.
Not that it matters, but I would be happiest if you just picked up InfluxDB 1.8.x, Kapacitor, and Chronograf again.
Really painful to see you guys abandon a great product lineup for the "we made our own programming language too!!!" hype train that was the hot thing 5-6 hype cycles ago. In the future, just stay off the hype wagon and stick to the core product.
The TICK stack was one of the coolest pieces of software developed and it is just unfortunate it was abandoned.
Telegraf is still very much alive and well. With v3, we realized the scope of what we were trying to do in v2 was simply too large. We're putting our focus back on the core of the database. Having an embedded VM (it'll be either Python or Javascript) will eliminate the need for Kapacitor in the stack, which had its own programming language and execution engine.
Ultimately, we're trying to narrow the scope of what we're developing and letting other tools that people already know and use have first class placement in the InfluxDB stack.
We'll have to test Chronograf against 3.0, but I think it should just work. Unfortunately, we don't have the resources to continue developing it, but it's all available under a permissive MIT license here: https://github.com/influxdata/chronograf
That is unfortunate. You guys need to stop chasing shiny objects and pick up the 1.x stack. You had a lot going and you’re about to make the same mistakes again.
One thing that isn’t clear to me at all is: what time scales is Edge expected to be suitable for? If I’m tracking my sensors at home, storing the data for a year and then progressively downsampling it for successive years (i.e. by no means "Big Data"), is Edge suitable for me or do I need to look for an alternative? I don’t think the blog post explained this well, perhaps exactly because they're knowingly alienating this sort of user base?
I think Edge would be suitable for this use case. It's not that longer time period data won't be stored and queryable, it's just that Edge isn't optimized to make that longer period data queryable very fast (because it's not rewriting the files to reorganize them for that). Ultimately, we'll have to test once it's ready.
It sounds like Edge won't do the downsampling so you'd have to code that yourself. Their overall strategy may be "open core in name only"; if they don't promote or explain the open source version at all then it won't threaten sales of the commercial versions.
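For what it's worth, coding the downsampling yourself does not have to be much work for a home sensor setup. Here is a minimal sketch in plain Python, assuming the readings have already been pulled out of the database as (unix_timestamp, value) pairs. The `downsample` function and the sample readings are hypothetical names for illustration, not part of any InfluxDB API, and how you would read from and write back to Edge is left out because that is not known yet.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Hypothetical helper: `points` is any iterable of (unix_ts, value)
    pairs, and the result is a list of (bucket_start, average) pairs
    sorted by bucket start time.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp down to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Made-up readings taken every 30 seconds, rolled up into 60-second averages.
readings = [(0, 20.0), (30, 22.0), (60, 21.0), (90, 23.0)]
per_minute = downsample(readings, 60)
```

Progressive downsampling for older years would just mean running the same function again over the already averaged points with a larger `bucket_seconds` from a cron job.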