Vitess is amazing. It lets you horizontally scale a MySQL database at the database level and not the application level.
With vanilla MySQL there are limits to how big a database can get. If your application/dataset is continually growing, eventually you will need to shard the database. With vanilla MySQL that entails spinning up a second primary db and teaching the application which db to route queries to. Or putting it another way, the sharding logic lives in the application.
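To make "the sharding logic lives in the application" concrete, here is a minimal sketch of what that routing tends to look like. The DSNs, the `user_id` sharding key, and the function names are all made up for illustration, and the hash choice is just one stable option, not a recommendation.

```python
import zlib

# Hypothetical DSNs for two primaries; hosts and schema name are placeholders.
SHARD_DSNS = [
    "mysql://primary-0.example.internal/app",
    "mysql://primary-1.example.internal/app",
]


def shard_for(user_id: int, shard_count: int = len(SHARD_DSNS)) -> int:
    """Map a sharding key (assumed here to be user_id) to a shard index."""
    # crc32 gives the same answer in every process, which routing needs.
    return zlib.crc32(str(user_id).encode("utf-8")) % shard_count


def dsn_for(user_id: int) -> str:
    """Pick the primary the application should send this user's queries to."""
    return SHARD_DSNS[shard_for(user_id)]
```

Every code path that touches a sharded table has to go through something like `dsn_for`, and changing the number of shards means changing (and re-testing) the application.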
With Vitess the sharding logic lives “in the database” so your application doesn’t need to be updated.
It comes with the overhead of some middleware, but in my experience that overhead is well worth keeping the sharding complexity out of the application.
Keep in mind I’m not talking about a database with hundreds of tables getting too big but rather a small set of tables that are large enough and have enough traffic that scaling the underlying instance becomes problematic.
> With vitess the sharding logic lives “in the database” so your application doesn’t need to be updated.
That is unfortunately not true in the majority of cases. Queries of medium complexity or higher will often just fail to run on Vitess. You also still need a sharding key if you want queries to be efficient. Essentially you don't want the vtgates stitching tables from different shards together.
As you scale, scatter queries become more and more problematic. You need to have carefully considered lookups, and often alter your queries to include sharding keys everywhere.
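A toy model of why the sharding key matters, in case it helps. This is not how Vitess routes queries internally; the shard count, the `user_id` key, and `target_shards` are assumptions for illustration only.

```python
import zlib

SHARD_COUNT = 4            # assumed cluster size
SHARDING_KEY = "user_id"   # assumed sharding key column


def target_shards(filters: dict) -> list[int]:
    """Return the shards a query must visit, given its equality filters."""
    if SHARDING_KEY in filters:
        # The key is present, so the query can be routed to exactly one shard.
        key = str(filters[SHARDING_KEY]).encode("utf-8")
        return [zlib.crc32(key) % SHARD_COUNT]
    # No sharding key: the query has to scatter to every shard.
    return list(range(SHARD_COUNT))
```

In this model `target_shards({"user_id": 42})` visits a single shard, while `target_shards({"email": "a@example.com"})` visits all of them, and the scatter cost grows with every shard you add.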
Cross-shard writes and two-phase commits are also a recipe for deadlocks.
Finally, you need to keep in mind that every shard you connect to uses up a connection in its pool. If you have relatively long lived sessions (like... request length) that touch a ton of shards in the same query, you can exhaust your connection pool with 'idle' connections.
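Back-of-the-envelope version of the pool problem, assuming each session holds one connection on every shard it has touched until the session ends, and that sessions spread evenly across shards. The function name and the numbers are hypothetical.

```python
def expected_connections_per_shard(
    concurrent_sessions: int, shards_touched: int, shard_count: int
) -> float:
    """Average connections held on each shard's pool under the assumptions above."""
    return concurrent_sessions * shards_touched / shard_count


# 200 concurrent sessions on an 8-shard cluster (made-up numbers).
targeted = expected_connections_per_shard(200, shards_touched=1, shard_count=8)
scatter = expected_connections_per_shard(200, shards_touched=8, shard_count=8)
```

With single-shard sessions each pool only sees a fraction of the load; when every session scatters, every pool has to hold a connection for every concurrent session, most of them sitting idle.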
Vitess _is_ amazing. I am grateful for it every day. It is not a panacea. Neither are the alternatives. Switching to Vitess is a beautiful problem to have, because you have a problem of scale. The switch is harrowing.
Vitess is a database solution for deploying, scaling and managing large clusters of open-source database instances. It currently supports MySQL, Percona and MariaDB. It’s architected to run as effectively in a public or private cloud architecture as it does on dedicated hardware. It combines and extends many important SQL features with the scalability of a NoSQL database.
Really glad to see them using inclusive naming, and remove the term "master" from their jargon. Watching movies like Django Unchained has given me an appreciation for how simple words like that can be triggering, so I work hard to avoid it in my engineering.