A real issue with microservices is that you can no longer use the database system for transactional integrity. Microservices are fine for things you never have to back out. But if something can go wrong that requires a clean cancel of an entire transaction, now you need all the logic for that in the microservice calls. Which is very hard to get right for the cases where either side fails.
(Today's interaction with phone support: Convincing Smart and Final that when their third-party delivery company had their systems go down, they didn't actually deliver the order, even though they generated a receipt for it.)
Jimmy Bogard's talk "Six Little Lines of Fail"[0] is an interesting case study of all the ways things get so much more complicated when you can't just roll back a transaction if something goes wrong.
Yeah, and while things like the Saga pattern[1] are interesting, it hasn't impressed me with its elegance or cleverness :[ (also, there seem to be many real-life cases where it would not work)
I personally don't like the saga pattern. For example, when some operation fails, it should be reverted on every service it went through. But what do you do when that revert itself fails? It's just a matter of time before the system ends up in an inconsistent state.
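A minimal sketch of that failure mode (hypothetical step/compensation names, not any particular saga library): each step pairs an action with a compensating action, and when a compensation itself throws, there is nothing left to fall back on.

```python
# Sketch of a saga runner: each step is an (action, compensate) pair.
# If an action fails, completed steps are compensated in reverse order.
# The problem case: a compensation itself fails, leaving the system
# half-reverted with no automatic way out.

def run_saga(steps):
    """Run (action, compensate) pairs; on failure, compensate in reverse."""
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for comp in reversed(done):
                try:
                    comp()
                except Exception:
                    # Compensation failed: state is now inconsistent and
                    # needs retries or manual intervention.
                    raise RuntimeError("compensation failed; state inconsistent")
            return False  # saga aborted, but cleanly compensated
    return True
```

In practice saga implementations retry compensations with backoff and park permanently failing ones in a dead-letter queue for a human, which is exactly the "matter of time until inconsistent" problem above.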
Is there a better option? And what is it? (genuinely curious, I've never had to deal with scale beyond where a single RDBMS could handle atomicity for me, and I'm curious about the best practice for handling this in bigger systems)
Disclosure: It's only my opinion, that I developed through trial and error during building ecommerce applications in a small team. Problems that I solved may be different than problems you will encounter.
I aim for a system that will fix itself. So for example, when a new product comes to the product service, the product service stores it in a strictly consistent manner without relying on other services. Other services that want to know about that new product poll the product service at a regular interval and get the new products. You also need to make the endpoints idempotent.
The good thing is that when something bad happens, the system data will eventually converge. Sometimes there will be bugs, so you fix the bug and then wait until the system fixes its data.
The bad thing is that the system is only eventually consistent. You will need to track the delay and keep it short.
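The poll-and-converge approach above can be sketched like this (hypothetical names; `fetch_since` stands in for a call to the product service's "products newer than cursor" endpoint):

```python
# Sketch of a self-healing consumer: it polls the product service for
# anything newer than its last-seen cursor and upserts the results.
# Because the upsert is idempotent, re-polling the same products after
# a crash or a bug fix converges back to a consistent local copy.

class ProductConsumer:
    def __init__(self, fetch_since):
        self.fetch_since = fetch_since  # callable: cursor -> list of products
        self.local = {}                 # local copy, keyed by product id
        self.cursor = 0                 # highest product version seen so far

    def poll_once(self):
        for product in self.fetch_since(self.cursor):
            # Idempotent upsert: applying the same product twice is harmless.
            self.local[product["id"]] = product
            self.cursor = max(self.cursor, product["version"])
```

The eventual-consistency delay the parent mentions is the polling interval plus processing time, which is why you want to monitor it.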
As adrianmsmith mentioned, it's usually better to create a monolith when possible.
Two-phase commit is the traditional answer, though only heavyweight "enterprise" environments support it, like JTA transaction managers in Java land.
But 2PC is slow.
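The slowness is inherent to the protocol shape. A sketch of a 2PC coordinator (hypothetical participant interface, not JTA's actual API): every participant must hold its locks from `prepare()` all the way through `commit()`/`rollback()`, i.e. across two full network round trips.

```python
# Sketch of a two-phase commit coordinator. Phase 1 collects votes;
# phase 2 commits only if everyone voted yes. Participants hold locks
# for the whole protocol, which is why 2PC is slow (and blocks if the
# coordinator dies between the phases).

def two_phase_commit(participants):
    # Phase 1: ask every participant to prepare (vote).
    prepared = []
    for p in participants:
        if p.prepare():
            prepared.append(p)
        else:
            # A single "no" vote aborts the whole transaction.
            for q in prepared:
                q.rollback()
            return False
    # Phase 2: unanimous yes -- commit everywhere.
    for p in participants:
        p.commit()
    return True
```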
The modern answer is to move scalability into your DB layer and run X instances of your monolith. Some NewSQL databases like CockroachDB and TiDB are reasonable choices as long as you avoid complex SQL.
I'm sorry, but this is exactly what tons of "old" businesses have done for ages. Yet they are told over and over that that isn't "web scale".
In my old life we just scaled via the DB and were able to deploy as many copies of our monolithic services as we needed to scale. We had a few monolithic services (no, not microservices, really nothing micro about them at all) that even shared some core code through a library.
In my new life that is no longer wanted and we need microservices and multiple databases. Except that it's all just logically split and in reality ends up in the same DB instance. But you know, we could scale it to a new DB instance if we ever needed to. Not that we will if you ask me because we run maaaany customers on the same database instance too. So it's all fake and we could've just used the same old boring working fine approach as any old business. Go figure.
Yeah you're right to an extent. But in the past you could hit a point where one DB could not handle the traffic. There's tons of businesses out there doing one DB instance per customer because of this.
With Cockroach and TiDB you can host unlimited traffic on the same db as long as you're careful about queries. I guess you could with Mongo or whatever too if you're okay with data corruption
In my experience being careful with the queries is harder and/or more expensive. Developer time is either very expensive or you just don't have the right developers.
It also depends a lot on what kind of money you can throw around. Are you a big fat insurance company? Music business? Bank? You're just gonna throw money at Oracle and your hardware vendor and it's gonna scale (thinking 10-ish years back into my past here as an example). I think we came from like 4GB of RAM on the DB and a couple of CPUs with single-path I/O.
Could have spent ages optimizing all the different workloads. And this was already using read only replicas heavily for around the globe acceleration. This internal app was used 24/7 from different places in the world tho heaviest usage was Europe/US timezone. Instead they got 128GB of RAM, 16 cores IIRC and 4-path i/o (still spinning disk w/ SCSI at the time). Sure it cost a lot I'm sure. But I doubt it cost more than a year's salary of _one_ of us developers. And optimization would've required the application developers, the backend batch job developers and the (shared resource) DB admins to work on a multitude of workloads and use cases to analyze and optimize.
FF to my current job with lots and lots of (but mainly smaller) customers. One DB instance per customer would be a lot of overhead. We do one schema per customer on PostgreSQL right now, with sharding.
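In a schema-per-customer Postgres setup, the usual trick is to point `search_path` at the customer's schema at the start of each connection/transaction, so application queries stay tenant-agnostic. A sketch (the `tenant_<id>` naming convention is my assumption, not from the thread):

```python
import re

# Sketch of schema-per-tenant routing for PostgreSQL. The application
# runs one "SET search_path" per connection checkout and then issues
# plain, schema-unqualified queries.

def tenant_schema(customer_id):
    # Schema names can't be bound as query parameters, so validate
    # strictly before interpolating to avoid SQL injection.
    schema = f"tenant_{customer_id}"
    if not re.fullmatch(r"[a-z0-9_]+", schema):
        raise ValueError(f"invalid tenant schema: {schema!r}")
    return schema

def set_search_path_sql(customer_id):
    # "public" stays as a fallback for shared tables/extensions.
    return f'SET search_path TO "{tenant_schema(customer_id)}", public'
```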
>> A real issue with microservices is that you can no longer use the database system for transactional integrity.
> Is there a better option? And what is it?
If you truly have a distributed system, then you have to pick from a number of not-ideal options.
But, there are certainly many systems where the developers and architects have a choice as to whether to create a microservices or a monolith. If you're facing that sort of choice, the "better option" (in terms of being a solution to distributed transactions) is to use a monolith, and not have to deal with distributed transactions at all. You get the facility to just rollback the entire transaction.
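Concretely, the monolith version of a multi-step operation is just one transaction (sketch with hypothetical `inventory`/`orders` tables, using sqlite3 to stand in for any RDBMS): a failure at any step rolls back every step, with no sagas or compensations.

```python
import sqlite3

# Sketch: in a monolith, one DB transaction covers the whole operation.
# If anything fails partway through, the database rolls everything back
# atomically -- the facility you give up with microservices.

def place_order(conn, product_id, qty):
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE id = ?",
                (qty, product_id))
            (stock,) = conn.execute(
                "SELECT stock FROM inventory WHERE id = ?",
                (product_id,)).fetchone()
            if stock < 0:
                raise ValueError("insufficient stock")
            conn.execute(
                "INSERT INTO orders (product_id, qty) VALUES (?, ?)",
                (product_id, qty))
        return True
    except ValueError:
        return False  # decrement and insert both rolled back together
```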